- Access all the calculations underlying an analysis
*Which environment(s) are presented to you in this MOOC?*
- Rstudio
- Emacs/OrgMode
- Jupyter
*Which environment is recommended if your preferred language is Python?*
Jupyter
*Which environment is recommended if your preferred language is the R language?*
Rstudio
*Which environment is used daily by the three authors of this MOOC?*
Emacs/OrgMode
**Quiz 7**
*In the studies we have presented to you, what prevents, sometimes for several years, the debate on the relevance of a study?*
- Unpublished computation procedures
- Data used in the study was not released
*In the various examples presented (economics, MRI, crystallography), what are the main causes of errors?*
- Data acquisition (bias, machine calibration, etc.)
- Computation errors
- Inadequate data processing or statistics
*What are the consequences of lack of transparency?*
- It's difficult to rely on the work of others
- Articles contain less information (no details on calculations, experimental protocols, data analysis, etc.) and are therefore easier to read
- It is difficult to verify and reproduce the analyses presented in the articles
- Two articles may present results that seem to contradict each other, but are both perfectly correct, as the lack of detail prevents the exact conditions of application from being determined
**Quiz 8**
*What are the main technical causes behind the difficulties in reproducing someone else's work?*
- Lack of documentation on the choices made
- Interactive graphical software that hides computation details
- Computation errors
- Data loss (no backup, or a format that is no longer readable)
*Which solutions are mentioned?*
- Using a laboratory notebook
- Code review and continuous integration
- Using version control systems and several backup mechanisms
*What are the most legitimate/valid fears associated with the systematic disclosure of data (open data)?*
- Some information may be sensitive and its disclosure may hurt people
- My resources are limited. If I systematically host all this data on the web page provided by my employer, I am likely to quickly exceed my quota
**Quiz 9**
*What is commonly found in a computational document?*
- Commentaries
- Code
- An overview of data
- Computational results
- Hypertext links
- Images
*What does a computational document allow?*
- Inspect the computations
- Easily re-run the computations if the original environment is available
- Document the code
- Explain why a particular computation is made based on the data analysis so far
- Use multiple languages to perform computations (even if it may require some work)
**QuizP 01**
*What does an environment like Jupyter provide in comparison to working in the Python console or running R scripts directly?*
- It provides a well-structured history of the analyses performed
- It allows you to inspect data, keep a history of this inspection, and explain the transformations you perform as you go along
- It saves intermediate results, whether textual or graphical
- It allows you to generate documents in HTML or PDF
- It allows you to ensure that a figure is the result of the computation described in the document
*In Jupyter, what features are provided for the Python language but not available for the R language?*
There are the same features for both languages
*What allows you to be effective in an environment like Jupyter?*
- The export functions and the ability to easily re-run the code from the beginning
- Autocompletion
- Learning keyboard shortcuts
- Reading the documentation and cheat sheets
**Quiz 10**
*What should I prepare for if I use a computational document?*
- By letting my co-authors easily access and modify my code, they may break everything
- My collaborators may realize how bad I am at writing code and how often I fiddle with my results (assuming that this is the case...;)
- I will have no more excuses for not rereading and checking the code of my collaborators
- My co-authors and I will have to make sure that it works on each of our machines and it will take us time
- We'll have to install a lot of complicated stuff when our machines are not up to date
*What are the benefits of using and publishing a computational document?*
- These tools are relatively easy to learn, which allows as many people as possible to use them and better understand my work
- These tools make it possible to combine in a single document (1) an overview of the data, (2) the code, (3) the computation results, and especially (4) explanations of how these three types of objects relate to each other
- This makes it possible to be transparent about how we reach a particular conclusion
- This makes it easier for others to reuse all or part of our computation procedures
*How to make your computational document available in a sustainable way?*
- GitLab, GitHub, …
- An open archive (HAL, figshare, Zenodo)
**Quiz 11**
*What makes the three environments Jupyter, Rstudio and OrgMode different?*
- Ease of installation and learning curve
- The year of creation and the underlying language
- The ability to write documents in a specific style for submission to a journal or conference
- The underlying file format
*What justifies using one of these three environments (Jupyter, Rstudio and OrgMode) rather than another?*
- The type of document (tutorial sheet, laboratory notebook, article...) that you wish to write
- The audience who will contribute to or use this document
*A computational document facilitates the collection and sharing of information on data and on a computation. But for this information to be exploitable it is important to:*
- Manage your document using a version manager
- Structure the document to make it as readable as possible
- Explain in natural language the general idea behind a computation and why a particular computation decision is taken
- Use keywords to facilitate indexing and navigation
- Think about who the information is intended for (yourself, a colleague, your advisor, ...)