# Journal de bord du Mooc / Mooc's logbook

## Module 1:
**Exercise 01-1:**
*Which two files contain the character string "LE MOOC RECHERCHE REPRODUCTIBLE C'EST GENIAL" ?*

module1/exo1/aebef6b0a5.txt

module1/exo1/f683bbad4b.txt

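The logbook does not record how the two files were found. One possible approach, sketched below, is to scan the exercise directory for the string. The function name `files_containing` is hypothetical, and the sketch assumes the files are plain-text `.txt` files with the string on a single line.

```python
from pathlib import Path


def files_containing(root, needle, encoding="utf-8"):
    """Return the sorted paths of the .txt files under `root` that contain `needle`."""
    matches = []
    for path in Path(root).rglob("*.txt"):
        if not path.is_file():
            continue
        # errors="replace" keeps the search going if a file is not valid UTF-8
        text = path.read_text(encoding=encoding, errors="replace")
        if needle in text:
            matches.append(path.as_posix())
    return sorted(matches)


if __name__ == "__main__":
    # Directory and search string taken from the exercise statement above
    for name in files_containing("module1/exo1",
                                 "LE MOOC RECHERCHE REPRODUCTIBLE C'EST GENIAL"):
        print(name)
```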
**Quiz 01**

*Why has a European project recently used the logbooks of the Portuguese, Spanish, Dutch and English Indian Companies?*

To try to reconstruct the climate of the oceans criss-crossed by the Western navies

*Which note-taking media are illustrated in the course video "Note-taking concerns everyone" by Christophe Pouzat?*
- Notes in the margins of books and manuscripts
- Notes in field books 
- Notes on cards and paper slips 

*Why did Leibniz order the construction of a closet ?*

To store and order notes written on paper slips

*For the curious: visit the Darwin Online website, go to the notebooks, and describe how Darwin took his notes.*

First in notebooks, then on cards and paper sheets stored in folders

**Quiz 02**

*What is the origin of the codex?*

The Egyptian production of papyrus was not large enough to meet the demand of writers 

*What aspect of Eusebius's work is presented in this sequence?*

His canon tables (cross-references between the Gospel books) 

*In which line should the keyword "Analysis" go in John Locke's index ?*

« Aa »

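The answer above is consistent with the indexing rule usually attributed to Locke's commonplace-book method: a keyword is filed under its first letter followed by the first vowel that comes after it. A minimal sketch, assuming that rule; the function name `locke_index_line` is hypothetical, and the fallback for keywords with no later vowel is a guess, not something stated in the course.

```python
def locke_index_line(keyword):
    """Return the index line for `keyword`: first letter + first following vowel."""
    word = keyword.strip().lower()
    if not word:
        raise ValueError("keyword must not be empty")
    first = word[0].upper()
    for ch in word[1:]:
        if ch in "aeiou":
            return first + ch
    # Assumption: with no later vowel, fall back to the first letter alone
    return first


print(locke_index_line("Analysis"))  # prints Aa
```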
**Quiz 03**

**Quiz 04**

**Quiz 05**


## Module 2
### Toy project
#### Asking the maths library
My computer tells me that π is approximately


```python
from math import pi  # import only what is used instead of the whole namespace
print(pi)
```

    3.141592653589793


#### Buffon’s needle
Applying the method of Buffon’s needle, we get the approximation


```python
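import numpy as np
np.random.seed(seed=42)
N = 10000
# x: position of one end of the needle between two lines; theta: angle of the needle
x = np.random.uniform(size=N, low=0, high=1)
theta = np.random.uniform(size=N, low=0, high=pi/2)
# The needle crosses a line when x + sin(theta) > 1; pi is about 2 / (crossing frequency)
2/(sum((x+np.sin(theta))>1)/N)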
```


    3.128911138923655

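One way to see why the expression above estimates π, assuming (as in the code) a needle of length 1, lines spaced 1 apart, and $x \sim U(0, 1)$ and $\theta \sim U(0, \pi/2)$ drawn independently: the needle crosses a line when $x + \sin\theta > 1$, and

$$
P[x + \sin\theta > 1] = \frac{2}{\pi}\int_0^{\pi/2} P[x > 1 - \sin\theta]\,d\theta = \frac{2}{\pi}\int_0^{\pi/2} \sin\theta\,d\theta = \frac{2}{\pi}.
$$

Dividing 2 by the observed crossing frequency therefore gives an estimate of π.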

#### Using a surface fraction argument
A method that is easier to understand and does not make use of the sine function is based on the
fact that if $X \sim U(0, 1)$ and $Y \sim U(0, 1)$, then $P[X^2 + Y^2 \le 1] = \pi/4$ (see "Monte Carlo method"
on Wikipedia). The following code uses this approach:


```python
%matplotlib inline
import matplotlib.pyplot as plt
np.random.seed(seed=42)
N = 1000
x = np.random.uniform(size=N, low=0, high=1)
y = np.random.uniform(size=N, low=0, high=1)
accept = (x*x+y*y) <= 1
reject = np.logical_not(accept)
fig, ax = plt.subplots(1)
ax.scatter(x[accept], y[accept], c='b', alpha=0.2, edgecolor=None)
ax.scatter(x[reject], y[reject], c='r', alpha=0.2, edgecolor=None)
ax.set_aspect('equal')
```


It is then straightforward to obtain a (not very good) approximation of π by counting how
often, on average, $X^2 + Y^2$ is smaller than 1:


```python
4*np.mean(accept)
```


    3.112

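The estimate is rough because only N = 1000 points are used. The sketch below repeats the same surface fraction computation for larger sample sizes and prints the standard error expected from the binomial variance of the acceptance frequency. The function name `estimate_pi` is hypothetical, and the sketch uses NumPy's `default_rng` generator rather than `np.random.seed`, so it does not reproduce the value above digit for digit.

```python
import numpy as np


def estimate_pi(n, seed=42):
    """Estimate pi from n uniform points in the unit square (quarter-disc method)."""
    rng = np.random.default_rng(seed)  # assumption: Generator API instead of np.random.seed
    x = rng.uniform(0.0, 1.0, size=n)
    y = rng.uniform(0.0, 1.0, size=n)
    return 4 * np.mean(x * x + y * y <= 1)


for n in (1_000, 100_000, 1_000_000):
    est = estimate_pi(n)
    # Standard error of 4 * (sample proportion) with true proportion p = pi/4
    p = np.pi / 4
    std_err = 4 * np.sqrt(p * (1 - p) / n)
    print(f"N = {n:>9,d}  estimate = {est:.5f}  expected std. error = {std_err:.5f}")
```

The standard error shrinks like 1/√N, so each extra correct digit costs roughly 100 times more points.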

**Exercise 02-2**
*What is the average ?*

14.11

*What is the minimum ?*

2.8

*What is the maximum ?*

23.4

*What is the median ?*

14.5

*What is the standard deviation ?*

4.33

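The data set behind these answers is not reproduced in this logbook. As a sketch of how such summary statistics can be computed with NumPy, the block below uses hypothetical placeholder values (not the exercise data) and a hypothetical helper `summarize`. It reports both standard deviation conventions, since the logbook does not say which one produced 4.33.

```python
import numpy as np


def summarize(values):
    """Return the usual summary statistics of a one-dimensional data set."""
    data = np.asarray(values, dtype=float)
    return {
        "mean": float(np.mean(data)),
        "min": float(np.min(data)),
        "max": float(np.max(data)),
        "median": float(np.median(data)),
        "std_population": float(np.std(data)),      # divides by n (ddof=0)
        "std_sample": float(np.std(data, ddof=1)),  # divides by n - 1
    }


# Hypothetical placeholder values, NOT the data set of the exercise
placeholder = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
for name, value in summarize(placeholder).items():
    print(f"{name:>15}: {value:.2f}")
```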
**Quiz 06**

*A computational document allows you to:*

- Improve the traceability of a calculation
- Easily present your work to colleagues
- Access all the calculations underlying an analysis 

*Which environment(s) are presented to you in this MOOC?*

- Rstudio
- Emacs/OrgMode
- Jupyter

*Which environment is recommended if your preferred language is Python?*

Jupyter 

*Which environment is recommended if your preferred language is the R language?*

Rstudio

*Which environment is used daily by the three authors of this MOOC?*

Emacs/OrgMode

**Quiz 07**

*In the studies we have presented to you, what prevents, sometimes for several years, the debate on the relevance of a study?*

- Unpublished computation procedures
- Data used in the study was not released

*In the various examples presented (economics, MRI, crystallography), what are the main causes of errors ?*

- Data acquisition (bias, machine calibration, etc.)
- Computation errors
- Inadequate data processing or statistics

*What are the consequences of lack of transparency?*

- It's difficult to rely on the work of others
- Articles contain less information (no details on calculations, experimental protocols, data analysis, etc.) and are therefore easier to read
- It is difficult to verify and reproduce the analyses presented in the articles
- Two articles may present results that seem to contradict each other, but are both perfectly correct, as the lack of detail prevents the exact conditions of application from being determined

**Quiz 08**

*What are the main technical causes behind the difficulties in reproducing someone else's work?*

- Lack of documentation on the choices made
- Interactive graphical software that hides computation details
- Computation errors
- Data loss (no backup, or a format that is no longer readable)

*Which solutions are mentioned?*

- Using a laboratory notebook
- Code review and continuous integration
- Using version control systems and several backup mechanisms

*What are the most legitimate/valid fears associated with the systematic disclosure of data (open data)?*

- Some information may be sensitive and its disclosure may hurt people
- My resources are limited. If I systematically host all this data on the web page provided by my employer, I am likely to quickly exceed my quota