Commit 6d8f56b1 authored by dc97a6904245e0d07c1302ee90fceec3
parents 2ecef28a 4de5158a
@@ -28,7 +28,51 @@ In this exercise, write for the first time a **markdown file** whose rendered ve
You can find the markdown file written by me at: [https://app-learninglab.inria.fr/](https://app-learninglab.inria.fr/moocrr/gitlab/dc97a6904245e0d07c1302ee90fceec3/mooc-rr/blob/master/module1/exo2/fichier-markdown.md).
# 18/06/2025 - Update of the journal
## Introduction: Using Jupyter and GitLab
In this MOOC on reproducible research, I am learning how to use **Jupyter Notebooks** in combination with **GitLab** for scientific programming and collaborative version control.
- **Jupyter** is an interactive environment that lets you write and run code in small cells, see outputs instantly, and combine code with visualizations and markdown explanations — all in one place. It's ideal for data exploration, visualization, and documentation.
- **GitLab** is a web-based platform for **version control** and **collaboration**. It allows us to track changes in our code or notebooks, share with others, and maintain a reproducible research history.
During the course, GitLab hosts the exercises, data, and instructions, while Jupyter is used to interactively write code and analyze data.
## Data analysis with Python (from module 1 exercises)
## Exercise 02-2: Descriptive statistics
In this exercise, I learned how to calculate basic descriptive statistics using Python and the NumPy library. The dataset consisted of 100+ numerical values.
Using `np.mean`, `np.std` (with `ddof=1`, which gives the sample standard deviation), `np.min`, `np.median`, and `np.max`, I computed:
- **Mean**: ~14.11
- **Standard deviation**: ~4.33
- **Minimum**: 2.8
- **Median**: 14.5
- **Maximum**: 23.4
This helped me understand how to summarize and describe the distribution of a dataset.
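As a minimal sketch of the computation described above, the snippet below applies the same NumPy functions to a small array. The values in `data` are hypothetical placeholders chosen for illustration, not the exercise dataset, and the `summary` dictionary is just one convenient way to collect the results.

```python
import numpy as np

# Hypothetical sample values; the actual exercise dataset is not reproduced here.
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

summary = {
    "mean": np.mean(data),
    "std": np.std(data, ddof=1),  # ddof=1 divides by n - 1 (sample standard deviation)
    "min": np.min(data),
    "median": np.median(data),
    "max": np.max(data),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Note that `np.std` defaults to `ddof=0`, which divides by `n` rather than `n - 1`, so omitting the argument gives a slightly smaller value on the same data.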
## Exercise 02-03: Visualizing the dataset
Next, I practiced using **Matplotlib** to create two types of plots:
1. **Sequence plot**: A line graph showing each data point in the order it appears.
2. **Histogram**: A graphical representation of the distribution of the dataset.
Code used (Python):
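```python
import numpy as np
import matplotlib.pyplot as plt

# `data` is the list of values defined earlier in the notebook

# Sequence plot: each value in the order it appears
plt.figure()
plt.plot(data, 'b')  # blue line
plt.show()

# Histogram: distribution of the values
plt.figure()
plt.hist(data, color='blue', edgecolor='black')
plt.show()
```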