diff --git a/journal/DS logbook.md b/journal/DS logbook.md
index e7a09072219d1b9537635c91c684b6ba4e49794e..d7bb5877008fef59fd1b98a059dcb07f93f8bb37 100644
--- a/journal/DS logbook.md
+++ b/journal/DS logbook.md
@@ -64,3 +64,111 @@
 ### Exercice 01 (2nd part): Getting familiar with Markdown
 
 #### Exercice 01-2 (0.86/1 point)
+# Module 2: the showcase behind the scenes: the computational document
+
+## Quiz 07 (3/3 marks)
+1. In the studies presented, what can prevent debate on the relevance of a study, sometimes for several years?
+ *Answer*
+* [Correct] a. Unpublished computational procedures
+* [False] b. Pressure from government or industry lobbyists who wish to build on the study in question
+* [Correct] c. Data used in the study was not released
+* [False] d. Experts disagree
+
+2. In the various examples presented (economics, MRI, crystallography), what are the main causes of errors?
+ *Answer*
+* [Correct] a. Data acquisition (bias, machine calibration, etc.)
+* [Correct] b. Computation errors
+* [Correct] c. Inadequate data processing or statistics
+* [False] d. It's all about interpretation; error is inevitable.
+
+3. What are the consequences of a lack of transparency?
+ *Answer*
+* [Correct] a. It's difficult to rely on the work of others
+* [Correct] b. Articles contain less information (no details on calculations, experimental protocols, data analysis, etc.) and are therefore easier to read
+* [Correct] c. It is difficult to verify and reproduce the analyses presented in the articles
+* [Correct] d. Two articles may present results that seem to contradict each other, but are both perfectly correct, as the lack of detail prevents the exact conditions of application from being determined
+* [False] e. By hiding the data used in the studies, we "guarantee" a certain anonymity to the people who may appear in these studies
+
+## Quiz 08 (2/3 points)
+
+1. What are the main technical causes behind the difficulties in reproducing someone else's work?
+ *Answer*
+* [Correct] a. The use of open source software
+* [Correct] b. Lack of documentation on the choices made
+* [False] c. Interactive graphical software that hides computation details
+* [False] d. The overload of information, because it is very difficult to find what you are looking for
+* [Correct] e. Computation errors
+* [Correct] f. Data loss (no backup, or a format that is no longer readable)
+
+ ***correction***
+* [False] a. The use of open source software
+* [Correct] b. Lack of documentation on the choices made
+* [Correct] c. Interactive graphical software that hides computation details
+* [False] d. The overload of information, because it is very difficult to find what you are looking for
+* [Correct] e. Computation errors
+* [Correct] f. Data loss (no backup, or a format that is no longer readable)
+
+2. Which solutions are mentioned?
+ *Answer*
+* [Correct] a. Using a laboratory notebook
+* [False] b. Using spreadsheets
+* [Correct] c. Code review and continuous integration
+* [Correct] d. Using version control systems and several backup mechanisms
+
+3. What are the most legitimate/valid fears associated with the systematic disclosure of data (open data)?
+ *Answer*
+* [False] a. Being seen as a fool because others may then find your mistakes.
+* [False] b. Someone could benefit from my work.
+* [Correct] c. Some information may be sensitive and its disclosure may hurt people.
+* [Correct] d. My resources are limited. If I systematically host all this data on the web page provided by my employer, I am likely to quickly exceed my quota.
+
+## Quiz 09 (1/2 points)
+
+1. What is commonly found in a computational document?
+ *Answer*
+* a. Commentaries
+* b. Code
+* c. An overview of data
+* e. Computational results
+* f. Hypertext links
+* g. Images
+
+2. What does a computational document allow?
+ *Answer*
+* a. Inspect the computations
+* b. Easily re-run the computations if the original environment is available
+* c. Document the code
+* d. Explain why a particular computation is made based on the data analysis so far
+
+ ***correction***
+
+* a. Inspect the computations
+* b. Easily re-run the computations if the original environment is available
+* c. Document the code
+* d. Explain why a particular computation is made based on the data analysis so far
+* e. Perform non-regression tests in a systematic way
+
+## 2.4. Familiarization with the tools
+### 2.4A Getting started with the Jupyter tool
+
+## QuizP 01 (3/3 points)
+1. What does an environment like Jupyter provide in comparison to working in the Python console or running R scripts directly?
+ *Answer*
+* a. It provides a well-structured history of the analyses performed.
+* b. It allows you to inspect data, keep a history of this inspection, and explain the transformations you perform as you go along.
+* c. It saves intermediate results, whether textual or graphical.
+* e. It allows you to generate documents in HTML or PDF.
+* f. It allows you to ensure that a figure is the result of the computation described in the document.
+
+2. In Jupyter, what features are provided for the Python language but not available for the R language?
+ *Answer*
+* f. There are the same features for both languages
+
+3. What allows you to be effective in an environment like Jupyter?
+ *Answer*
+* a. The export functions and the ability to easily re-run the code from the beginning
+* c. Autocompletion
+* e. Learning keyboard shortcuts
+* f. Reading the documentation and cheat sheets
+
+Jupyter installation and configuration