Add some notes on module 2.

parent 259ca862
---
title: MOOC, Reproducible Research
author: Thomas Rushton
---
# MOOC: Reproducible Research
- [22/03/2024](#22/03/2024)
- [26/03/2024](#26/03/2024)
# 22/03/2024
I started this MOOC a few days ago (18/03) and this is my first day tackling
@@ -25,7 +34,9 @@ perspectives on otherwise familiar tools.
- [Apparently](https://stackoverflow.com/a/4829998) Pandoc prefers the
triple-hyphen;
- I thought markdown comments had to be some awful thing like
`[//]: # (this is a comment)`;
- Anyway, here's an example: `<!--- :mylabel: --->`;
- And now there's a(n invisible) tag against which to search this document.
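A minimal sketch of searching for such a tag with `grep`. The temporary file and its three lines are stand-ins for this document (an assumption for illustration); only the `:mylabel:` tag comes from the notes above.

```shell
# Hypothetical stand-in for this notes file, written to a temporary path.
notes="$(mktemp)"
printf '%s\n' 'Some prose.' '<!--- :mylabel: --->' 'More prose.' > "$notes"

# -n prints the line number; -F treats the pattern as a fixed string.
grep -nF ':mylabel:' "$notes"
# prints: 2:<!--- :mylabel: --->
```

The comment is invisible in the rendered output, but a plain-text search of the source still finds it.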
## Interesting links
@@ -93,5 +104,62 @@ painful.
### ExifTool
Perhaps I *should* be adding metadata to my images and audio files.
```shell
# Original shorthand was -[comment|notes]; choose one tag name (-comment shown).
exiftool -comment=":mylabel:" img.jpg
```
Worth remembering that, in addition to EXIF,
[XMP](https://en.wikipedia.org/wiki/Extensible_Metadata_Platform) exists.
# 26/03/2024
Working my way through Module 2.
## Reproducibility Problems
- [Reinhart & Rogoff](https://en.wikipedia.org/wiki/Growth_in_a_Time_of_Debt):
Growth in a Time of Debt
In short, the basis for the economic orthodoxy with regard to _austerity_ in the
wake of the 2008/9 financial crisis.
Ultimately based on the unsubstantiated assertion that national debt exceeding 90%
of GDP has "[dramatic consequences for growth]".
By the time their dubious data-handling and slipshod statistical practice had
been discovered, their conclusions were already in the hands of conservative
economic policymakers.
- Chang et al. and the data column-swap
[debacle](https://people.ligo-wa.caltech.edu/~michael.landry/calibration/S5/getsignright.pdf)
Papers had to be retracted.
Sure, methodological problems, but driven by _sociological/cultural_ ones;
high productivist pressure &mdash; _publish or perish_.
The real problem is the risk of a lack of rigour and transparency...
## Why is reproducibility difficult?
- Lack of info leading to inability to replicate decisions made by original
  researchers.
- Profusion of errors caused by the use of computers;
  - computers permit us to go further and faster, but also to make errors more
    readily and rapidly;
  - there's also the black-box effect of proprietary software, and the daftness
    of opinionated design decisions being mistaken for helpful ones, e.g. the
    gene identifiers "MARCH1" and "2310009E13" being interpreted by Excel as a
    date and a very large number respectively.
- Lack of rigour and organisation;
  - no VCS, manual file-naming conventions, etc.;
  - no code-review or continuous integration.
- And, as ever, cultural/social issues;
  - an article can be (uncharitably) described as an advert for the _real_ work
    of research and result-gathering, but why?;
  - well, perhaps we feel like we could have been more rigorous, so we take a
    few liberties with documenting our work, we get selective with our results,
    etc.;
  - ultimately we don't wish to suffer embarrassment or humiliation, or (worse
    still!) miss an opportunity to publish;
  - the irony being that we'd be better-placed to publish if we were open,
    transparent, etc.; but we're far from alone in all this.
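
On the Excel point above, a rough sketch of flagging identifiers that a spreadsheet might silently reinterpret, before they ever get opened in one. The regular expression is a heuristic of my own (an assumption, not Excel's actual parsing rules), and the identifier list is illustrative: only `MARCH1` and `2310009E13` come from these notes.

```shell
# Hypothetical identifier list, written to a temporary file.
ids="$(mktemp)"
printf '%s\n' 'MARCH1' '2310009E13' 'TP53' 'SEPT2' 'BRCA1' > "$ids"

# Heuristic patterns (assumed, not Excel's real rules):
#   month name or abbreviation followed by 1-2 digits -> could be read as a date
#   digits, optional decimals, 'E', digits            -> could be read as a float
risky='^(JAN|FEB|MAR|MARCH|APR|MAY|JUN|JUL|AUG|SEP|SEPT|OCT|NOV|DEC)[0-9]{1,2}$|^[0-9]+(\.[0-9]+)?E[+-]?[0-9]+$'

# -i ignores case, -n prints line numbers, -E enables extended regexes.
grep -inE "$risky" "$ids"
# prints:
# 1:MARCH1
# 2:2310009E13
# 4:SEPT2
```

A check like this could sit in a pre-commit hook or CI step, which also speaks to the "no code-review or continuous integration" item.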