# Reproducible research course logbook

## Module 1: Laboratory Notebook

### Quiz 01 (4/4)

1. Why has a European project recently used the logbooks of the Portuguese, Spanish, Dutch and English East India companies?

*Answer*

* a. To try to reconstruct the climate of the oceans criss-crossed by the Western navies

2. What note media are illustrated in the course video "Note-taking concerns everyone" by Christophe Pouzat?

*Answer*

* b. Notes in the margins of books and manuscripts
* d. Notes on cards and paper slips

3. Why did Leibniz order the construction of a closet?

*Answer*

* b. To store and order notes written on paper slips

4. For the curious: visit the Darwin Online website, go to the notebooks, and describe how Darwin took his notes.

*Answer*

* c. First in notebooks, then on cards and paper sheets stored in folders

### Quiz 02 (2/3)

1. What is the origin of the codex?

*Answer*

* a. Initially, codices were made of parchment, a much stronger (and therefore more durable) material than the papyrus used for scrolls

***correction***

* c. The Egyptian production of papyrus was not large enough to meet the demand of writers

2. What aspect of Eusebius's work is presented in this sequence?

*Answer*

* b. His canon tables (cross-references between the Gospel books)

3. In which line should the keyword "Analysis" go in John Locke's index?

*Answer*

* c. « Aa »

### Quiz 03 (3/3)

1. What is a text file?

*Answer*

* b. A file made up of (stored as) UTF-8 characters

2. What is a tag?

*Answer*

* c. A character, or series of characters, used to structure a document that will be invisible to the final reader.

3. Markdown is a markup language:

*Answer*

* a. "Light"

### Exercise 01 (1st part): Getting familiar with GitLab

#### Exercise 01-1: GitLab Search

#### Exercise 01-1: GitLab History (2/2 points)

### Exercise 01 (2nd part): Getting familiar with Markdown

#### Exercise 01-2 (0.86/1 point)

# Module 2: the showcase behind the scenes: the computational document

## Quiz 07 (3/3 marks)

1. In the studies we have presented to you, what prevents, sometimes for several years, the debate on the relevance of a study?

*Answer*

* [Correct] a. Unpublished computational procedures
* [False] b. Pressure from government or industry lobbyists who wish to build on the study in question
* [Correct] c. Data used in the study was not released
* [False] d. Experts disagree

2. In the various examples presented (economics, MRI, crystallography), what are the main causes of errors?

*Answer*

* [Correct] a. Data acquisition (bias, machine calibration, etc.)
* [Correct] b. Computation errors
* [Correct] c. Inadequate data processing or statistics
* [False] d. It's all about interpretation; error is inevitable.

3. What are the consequences of lack of transparency?

*Answer*

* [Correct] a. It's difficult to rely on the work of others
* [Correct] b. Articles contain less information (no details on calculations, experimental protocols, data analysis, etc.) and are therefore easier to read
* [Correct] c. It is difficult to verify and reproduce the analyses presented in the articles
* [Correct] d. Two articles may present results that seem to contradict each other, but are both perfectly correct, as the lack of detail prevents the exact conditions of application from being determined
* [False] e. By hiding the data used in the studies, we "guarantee" a certain anonymity to the people who may appear in these studies

## Quiz 08 (2/3 points)

1. What are the main technical causes behind the difficulties in reproducing someone else's work?

*Answer*

* [Correct] a. The use of open source software
* [Correct] b. Lack of documentation on the choices made
* [False] c. Interactive graphical software that hides computation details
* [False] d. The overload of information because it is very difficult to find what you are looking for
* [Correct] e. Computation errors
* [Correct] f. Data loss (no backup, or a format that is no longer readable)

***correction***

* [False] a. The use of open source software
* [Correct] b. Lack of documentation on the choices made
* [Correct] c. Interactive graphical software that hides computation details
* [False] d. The overload of information because it is very difficult to find what you are looking for
* [Correct] e. Computation errors
* [Correct] f. Data loss (no backup, or a format that is no longer readable)

2. Which solutions are mentioned?

*Answer*

* [Correct] a. Using a laboratory notebook
* [False] b. Using spreadsheets
* [Correct] c. Code review and continuous integration
* [Correct] d. Using version control systems and several backup mechanisms

3. What are the most legitimate/valid fears associated with the systematic disclosure of data (open data)?

*Answer*

* [False] a. Being seen as a fool because others may then find your mistakes.
* [False] b. Someone could benefit from my work.
* [Correct] c. Some information may be sensitive and its disclosure may hurt people.
* [Correct] d. My resources are limited. If I systematically host all this data on the web page provided by my employer, I am likely to quickly exceed my quota.

## Quiz 09 (1/2 points)

1. What is commonly found in a computational document?

*Answer*

* a. Commentaries
* b. Code
* c. An overview of data
* e. Computational results
* f. Hypertext links
* g. Images

2. What does a computational document allow?

*Answer*

* a. Inspect the computations
* b. Easily re-run the computations if the original environment is available
* c. Document the code
* d. Explain why a particular computation is made based on the data analysis so far

***correction***

* a. Inspect the computations
* b. Easily re-run the computations if the original environment is available
* c. Document the code
* d. Explain why a particular computation is made based on the data analysis so far
* e. Perform non-regression tests in a systematic way

## 2.4. Familiarization with the tools

### 2.4A Get started with the Jupyter tool

## QuizP 01 (3/3 points)

1. What does an environment like Jupyter provide in comparison to working in the Python console or running R scripts directly?

*Answer*

* a. It provides a well-structured history of the analyses performed.
* b. It allows you to inspect data, keep a history of this inspection, and explain the transformations you perform as you go along.
* c. It saves intermediate results, whether textual or graphical.
* e. It allows you to generate documents in HTML or PDF.
* f. It allows you to ensure that a figure is the result of the computation described in the document.

2. In Jupyter, what features are provided for the Python language but not available for the R language?

*Answer*

* f. There are the same features for both languages

3. What allows you to be effective in an environment like Jupyter?

*Answer*

* a. The export functions and the ability to easily re-run the code from the beginning
* c. Autocompletion
* e. Learning keyboard shortcuts
* f. Reading the documentation and cheat sheets

Jupyter installation and configuration

# Module 3

## Quiz 12 (2/2)

1. What distinguishes a replicable data analysis from a traditional analysis?

*Answer*

* b. The code for all computations is included

2. What are the advantages of a replicable analysis?

*Answer*

* a. It is easier to prepare
* b. It is easier to modify
* d. It is easier to verify

## Quiz 13 (4/4)

1. Where do the data on the incidence of influenza-like illness come from?

*Answer*

* a. From the “réseau Sentinelles”, a network of general practitioners

2. In which format are the data available?

*Answer*

* b. CSV format

3. What is the sampling frequency of the incidence data?

*Answer*

* c. One value per week

4. Why do we advise against removing the missing data line from the downloaded data file?

*Answer*

* b. It would leave no visible trace of the manipulation

Data import-Jupyter
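
The last answer of Quiz 13 (do not delete the missing data line by hand) can be illustrated with a small sketch. It leaves the raw CSV text untouched and sets the incomplete row aside in code, so the exclusion stays visible in the notebook. This is only an illustration under stated assumptions: the column names `week` and `inc`, the sample values, and the helper `load_incidence` are hypothetical and are not taken from the actual downloaded file.

```python
import csv
import io

# Hypothetical excerpt standing in for the downloaded file: the column names
# and the values are made up for illustration, not real incidence data.
RAW_CSV = """week,inc
202001,1200
202002,
202003,1500
"""


def load_incidence(text):
    """Split rows into complete and skipped ones without altering the raw text."""
    complete, skipped = [], []
    for row in csv.DictReader(io.StringIO(text)):
        if (row["inc"] or "").strip() == "":
            # Keep a visible trace of what was excluded instead of editing the file.
            skipped.append(row)
        else:
            complete.append({"week": row["week"], "inc": int(row["inc"])})
    return complete, skipped


complete, skipped = load_incidence(RAW_CSV)
print("kept weeks:", [r["week"] for r in complete])
print("skipped weeks:", [r["week"] for r in skipped])
```

Because the filtering happens in a notebook cell, anyone re-running the document sees which weeks were dropped and why, which is the trace that a manual edit of the file would erase.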