From 5e79e620cdceabed36f560f760adab0692bb44d3 Mon Sep 17 00:00:00 2001
From: ca693b8bd1eb5f2789da9233f56f5d25
Date: Tue, 26 Mar 2024 16:18:41 +0000
Subject: [PATCH] Add some notes on module 2.

---
 journal/logbook.md | 70 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 69 insertions(+), 1 deletion(-)

diff --git a/journal/logbook.md b/journal/logbook.md
index 0359725..7185c84 100644
--- a/journal/logbook.md
+++ b/journal/logbook.md
@@ -1,5 +1,14 @@
+---
+title: MOOC, Reproducible Research
+author: Thomas Rushton
+---
+
 # MOOC: Reproducible Research
 
+- [22/03/2024](#22/03/2024)
+- [26/03/2024](#26/03/2024)
+
+
 # 22/03/2024
 
 I started this MOOC a few days ago (18/03) and this is my first day tackling
@@ -25,7 +34,9 @@ perspectives on otherwise familiar tools.
   - [Apparently](https://stackoverflow.com/a/4829998) Pandoc prefers the
     triple-hyphen;
   - I thought markdown comments had to be some awful thing like
-    `[//] # (this is a comment)`.
+    `[//] # (this is a comment)`;
+  - Anyway, here's an example: ``;
+  - And now there's a(n invisible) tag against which to search this document.
 
 ## Interesting links
 
@@ -93,5 +104,62 @@ painful.
 ### ExifTool
 
 Perhaps I *should* be adding metadata to my images and audio files.
+
+
+```shell
+exiftool -[comment|notes]=":mylabel:" img.jpg
+```
+
 Worth remembering that, in addition to EXIF,
 [XMP](https://en.wikipedia.org/wiki/Extensible_Metadata_Platform) exists.
+
+# 26/03/2024
+
+Working my way through Module 2.
+
+## Reproducibility Problems
+
+- [Reinhart & Rogoff](https://en.wikipedia.org/wiki/Growth_in_a_Time_of_Debt):
+  Growth in a Time of Debt
+
+In short, the basis for the economic orthodoxy with regard to _austerity_ in the
+wake of the 2008/9 financial crisis.
+Ultimately based on the unsubstantiable assertion that national debt exceeding
+90% of GDP has "[dramatic consequences for growth]".
+By the time their dubious data-handling and slipshod statistical practice had
+been discovered, their conclusions were already in the hands of conservative
+economic policymakers.
+
+- Chang et al. and the database column-swap
+  [debacle](https://people.ligo-wa.caltech.edu/~michael.landry/calibration/S5/getsignright.pdf)
+
+Papers had to be retracted.
+Sure, methodological problems, but driven by _sociological/cultural_ ones;
+high productivist pressure: _publish or perish_.
+
+The real problem is the risk of a lack of rigour and transparency...
+
+## Why is reproducibility difficult?
+
+- Lack of info leading to inability to replicate decisions made by original
+  researchers.
+- Profusion of errors caused by the use of computers;
+  - computers permit us to go further and faster, but also to make errors more
+    readily and rapidly;
+  - there's also the black-box effect of proprietary software, and the daftness
+    of opinionated design decisions being confused for helpful ones, e.g.
+    "MARCH1" and "2310009E13" being interpreted by Excel as a date and a very
+    large number respectively.
+- Lack of rigour and organisation;
+  - no VCS, manual file-naming conventions, etc.;
+  - no code-review or continuous integration.
+- And, as ever, cultural/social issues;
+  - an article can be (uncharitably) described as an advert for the _real_ work
+    of research and result-gathering, but why?;
+  - well, perhaps we feel like we could have been more rigorous, so we take a
+    few liberties with documenting our work, we get selective with our results,
+    etc.;
+  - ultimately we don't wish to suffer embarrassment or humiliation, or (worse
+    still!) miss an opportunity to publish;
+  - the irony being that we'd be better-placed to publish if we were open,
+    transparent, etc.; but we're far from alone in all this.
-- 
2.18.1