{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Subject 1: CO2 concentration in the atmosphere since 1958\n", "\n", "Charles David Keeling launched a campaign to measure the CO2 concentration in the atmosphere. He installed his instruments at the Mauna Loa Observatory, Hawaii, United States. Data have been collected continuously since 1958.\n", "\n", "The initial study aimed to examine seasonal variations in the concentration, but with global warming its focus has shifted to the growth of the concentration.\n", "\n", "Using the weekly data available on the [Scripps Institute website](https://www.scripps.edu/), we want to reproduce the analysis of the evolution of the atmospheric CO2 concentration in order to build a predictive model." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading and inspecting the data" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import statsmodels.api as sm\n", "import os\n", "import urllib.request" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If the dataset is not available locally, we download the complete weekly dataset, as of 6 April 2020, from the following link: https://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/in_situ_co2/weekly/weekly_in_situ_co2_mlo.csv" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "data_url = \"https://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/in_situ_co2/weekly/weekly_in_situ_co2_mlo.csv\"\n", "data_filename = \"weekly_in_situ_co2_mlo.csv\"\n", "# If the data are not available locally\n", "if not os.path.exists(data_filename):\n", "    # then download them from the official site\n", "
    urllib.request.urlretrieve(data_url, data_filename)\n", "# Check that the file is not empty\n", "assert os.path.getsize(data_filename) > 0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Opening the file for a visual check reveals no problems. However, this CSV file has a sizeable header that must be skipped when reading the data. To do so, we automatically locate the first data line (to guard against any change in the file format by the Scripps Institute)." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "def find_num_first_dataline(filename):\n", "    # Return the 0-based index of the first line that does not start with a double quote\n", "    with open(filename, \"r\") as f:\n", "        lignes = f.readlines()\n", "    for i, ligne in enumerate(lignes):\n", "        if not ligne.startswith('\"'):\n", "            return i\n", "    raise Exception(\"No data\")\n", "\n", "# Call the function\n", "first_line = find_num_first_dataline(data_filename)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After inspecting the file and its header, we can see that:\n", "* The first column holds the acquisition dates (12:00 each day)\n", "* The second column holds the measured concentrations (daily average)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|  | Date | Concentration |
|---|---|---|
| 0 | 1958-03-29 | 316.19 |
| 1 | 1958-04-05 | 317.31 |
| 2 | 1958-04-12 | 317.69 |
| 3 | 1958-04-19 | 317.58 |
| 4 | 1958-04-26 | 316.48 |
| 5 | 1958-05-03 | 316.95 |
| 6 | 1958-05-17 | 317.56 |
| 7 | 1958-05-24 | 317.99 |
| 8 | 1958-07-05 | 315.85 |
| 9 | 1958-07-12 | 315.85 |
| 10 | 1958-07-19 | 315.46 |
| 11 | 1958-07-26 | 315.59 |
| 12 | 1958-08-02 | 315.64 |
| 13 | 1958-08-09 | 315.10 |
| 14 | 1958-08-16 | 315.09 |
| 15 | 1958-08-30 | 314.14 |
| 16 | 1958-09-06 | 313.54 |
| 17 | 1958-11-08 | 313.05 |
| 18 | 1958-11-15 | 313.26 |
| 19 | 1958-11-22 | 313.57 |
| 20 | 1958-11-29 | 314.01 |
| 21 | 1958-12-06 | 314.56 |
| 22 | 1958-12-13 | 314.41 |
| 23 | 1958-12-20 | 314.77 |
| 24 | 1958-12-27 | 315.21 |
| 25 | 1959-01-03 | 315.24 |
| 26 | 1959-01-10 | 315.50 |
| 27 | 1959-01-17 | 315.69 |
| 28 | 1959-01-24 | 315.86 |
| 29 | 1959-01-31 | 315.42 |
| ... | ... | ... |
| 3126 | 2019-07-06 | 412.69 |
| 3127 | 2019-07-13 | 412.30 |
| 3128 | 2019-07-20 | 411.76 |
| 3129 | 2019-07-27 | 410.32 |
| 3130 | 2019-08-03 | 410.50 |
| 3131 | 2019-08-10 | 410.48 |
| 3132 | 2019-08-17 | 410.05 |
| 3133 | 2019-08-24 | 409.52 |
| 3134 | 2019-08-31 | 409.32 |
| 3135 | 2019-09-07 | 408.80 |
| 3136 | 2019-09-14 | 408.61 |
| 3137 | 2019-09-21 | 408.50 |
| 3138 | 2019-09-28 | 408.28 |
| 3139 | 2019-10-05 | 407.99 |
| 3140 | 2019-10-12 | 408.61 |
| 3141 | 2019-10-19 | 408.77 |
| 3142 | 2019-10-26 | 408.68 |
| 3143 | 2019-11-02 | 409.86 |
| 3144 | 2019-11-09 | 410.15 |
| 3145 | 2019-11-16 | 410.22 |
| 3146 | 2019-11-23 | 410.48 |
| 3147 | 2019-11-30 | 410.92 |
| 3148 | 2019-12-07 | 411.27 |
| 3149 | 2019-12-14 | 411.67 |
| 3150 | 2019-12-21 | 412.30 |
| 3151 | 2019-12-28 | 412.59 |
| 3152 | 2020-01-04 | 413.19 |
| 3153 | 2020-01-11 | 413.39 |
| 3154 | 2020-01-25 | 413.36 |
| 3155 | 2020-02-01 | 413.99 |
3156 rows × 2 columns
\n", "