{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Study of atmospheric CO2 concentration since 1958" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import the libraries needed for the analysis:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "import pandas as pd\n", "import isoweek" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Importing the data\n", "\n", "The data are available on the website of the [Scripps Institution](https://scrippsco2.ucsd.edu/data/atmospheric_co2/primary_mlo_co2_record.html). They were downloaded on 2 June 2021. If no local copy of the data file exists, it is downloaded." ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "data_url = 'https://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/in_situ_co2/monthly/monthly_in_situ_co2_mlo.csv'\n", "\n", "data_file = \"C02-atmosphere.csv\"\n", "import os\n", "import urllib.request\n", "# Download the file only if there is no local copy yet\n", "if not os.path.exists(data_file):\n", "    urllib.request.urlretrieve(data_url, data_file)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first 54 lines of the file describe the data and give some usage notes; they are skipped when building the DataFrame used for the analysis. 
\n", "The structure of the data, as described in the CSV file, is as follows:\n", "\n", "| Column name | Column description |\n", "|----------------|----------------------------------------------------------------|\n", "| Year | Year in which the measurement was made |\n", "| Month | Month in which the measurement was made |\n", "| Date | Measurement date in Excel format |\n", "| Date | Measurement date as a decimal year (e.g. 1958.0411) |\n", "| CO2 (ppm) | CO2 concentration in micromoles per mole (ppm) |\n", "| CO2 adjusted (ppm) | CO2 concentration with the seasonal variations removed |\n", "| CO2 smoothed (ppm) | Smoothed (fitted) CO2 concentration |\n", "| CO2 smoothed and adjusted (ppm) | Smoothed (fitted) CO2 concentration with the seasonal variations removed |\n", "| CO2 completed (ppm) | Same as column 5, with missing values filled in from column 7 |\n", "| CO2 adjusted completed (ppm) | Same as column 6, with missing values filled in from column 8 |" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>Yr</th><th>Mn</th><th>Date</th><th>Date</th><th>CO2</th><th>seasonally</th><th>fit</th><th>seasonally</th><th>CO2</th><th>seasonally</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td></td><td></td><td></td><td></td><td></td><td>adjusted</td><td></td><td>adjusted fit</td><td>filled</td><td>adjusted filled</td></tr>\n",
"    <tr><th>1</th><td></td><td></td><td>Excel</td><td></td><td>[ppm]</td><td>[ppm]</td><td>[ppm]</td><td>[ppm]</td><td>[ppm]</td><td>[ppm]</td></tr>\n",
"    <tr><th>2</th><td>1958</td><td>01</td><td>21200</td><td>1958.0411</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td></tr>\n",
"    <tr><th>3</th><td>1958</td><td>02</td><td>21231</td><td>1958.1260</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td></tr>\n",
"    <tr><th>4</th><td>1958</td><td>03</td><td>21259</td><td>1958.2027</td><td>315.70</td><td>314.44</td><td>316.19</td><td>314.91</td><td>315.70</td><td>314.44</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " Yr Mn Date Date CO2 seasonally fit \\\n", "0 adjusted \n", "1 Excel [ppm] [ppm] [ppm] \n", "2 1958 01 21200 1958.0411 -99.99 -99.99 -99.99 \n", "3 1958 02 21231 1958.1260 -99.99 -99.99 -99.99 \n", "4 1958 03 21259 1958.2027 315.70 314.44 316.19 \n", "\n", " seasonally CO2 seasonally \n", "0 adjusted fit filled adjusted filled \n", "1 [ppm] [ppm] [ppm] \n", "2 -99.99 -99.99 -99.99 \n", "3 -99.99 -99.99 -99.99 \n", "4 314.91 315.70 314.44 " ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Skip the 54 descriptive lines at the top of the file\n", "raw_data = pd.read_csv(data_file, skiprows=54)\n", "raw_data.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Rows 0 and 1 only hold the rest of the column headings and the units. They would get in the way of the analysis, so we drop them:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>Yr</th><th>Mn</th><th>Date</th><th>Date</th><th>CO2</th><th>seasonally</th><th>fit</th><th>seasonally</th><th>CO2</th><th>seasonally</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>2</th><td>1958</td><td>01</td><td>21200</td><td>1958.0411</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td></tr>\n",
"    <tr><th>3</th><td>1958</td><td>02</td><td>21231</td><td>1958.1260</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td></tr>\n",
"    <tr><th>4</th><td>1958</td><td>03</td><td>21259</td><td>1958.2027</td><td>315.70</td><td>314.44</td><td>316.19</td><td>314.91</td><td>315.70</td><td>314.44</td></tr>\n",
"    <tr><th>5</th><td>1958</td><td>04</td><td>21290</td><td>1958.2877</td><td>317.45</td><td>315.16</td><td>317.30</td><td>314.99</td><td>317.45</td><td>315.16</td></tr>\n",
"    <tr><th>6</th><td>1958</td><td>05</td><td>21320</td><td>1958.3699</td><td>317.51</td><td>314.70</td><td>317.87</td><td>315.07</td><td>317.51</td><td>314.70</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " Yr Mn Date Date CO2 seasonally fit \\\n", "2 1958 01 21200 1958.0411 -99.99 -99.99 -99.99 \n", "3 1958 02 21231 1958.1260 -99.99 -99.99 -99.99 \n", "4 1958 03 21259 1958.2027 315.70 314.44 316.19 \n", "5 1958 04 21290 1958.2877 317.45 315.16 317.30 \n", "6 1958 05 21320 1958.3699 317.51 314.70 317.87 \n", "\n", " seasonally CO2 seasonally \n", "2 -99.99 -99.99 -99.99 \n", "3 -99.99 -99.99 -99.99 \n", "4 314.91 315.70 314.44 \n", "5 314.99 317.45 315.16 \n", "6 315.07 317.51 314.70 " ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data = raw_data.drop(labels=[0,1], axis=0).copy()\n", "data.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The column names contain spaces, which makes them awkward to reference. We therefore rename the columns as follows:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>year</th><th>month</th><th>Date1</th><th>Date2</th><th>CO2</th><th>CO2 overall</th><th>C02_3</th><th>CO2_4</th><th>C02_5</th><th>CO2_6</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>2</th><td>1958</td><td>01</td><td>21200</td><td>1958.0411</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td></tr>\n",
"    <tr><th>3</th><td>1958</td><td>02</td><td>21231</td><td>1958.1260</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td></tr>\n",
"    <tr><th>4</th><td>1958</td><td>03</td><td>21259</td><td>1958.2027</td><td>315.70</td><td>314.44</td><td>316.19</td><td>314.91</td><td>315.70</td><td>314.44</td></tr>\n",
"    <tr><th>5</th><td>1958</td><td>04</td><td>21290</td><td>1958.2877</td><td>317.45</td><td>315.16</td><td>317.30</td><td>314.99</td><td>317.45</td><td>315.16</td></tr>\n",
"    <tr><th>6</th><td>1958</td><td>05</td><td>21320</td><td>1958.3699</td><td>317.51</td><td>314.70</td><td>317.87</td><td>315.07</td><td>317.51</td><td>314.70</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " year month Date1 Date2 CO2 CO2 overall C02_3 \\\n", "2 1958 01 21200 1958.0411 -99.99 -99.99 -99.99 \n", "3 1958 02 21231 1958.1260 -99.99 -99.99 -99.99 \n", "4 1958 03 21259 1958.2027 315.70 314.44 316.19 \n", "5 1958 04 21290 1958.2877 317.45 315.16 317.30 \n", "6 1958 05 21320 1958.3699 317.51 314.70 317.87 \n", "\n", " CO2_4 C02_5 CO2_6 \n", "2 -99.99 -99.99 -99.99 \n", "3 -99.99 -99.99 -99.99 \n", "4 314.91 315.70 314.44 \n", "5 314.99 317.45 315.16 \n", "6 315.07 317.51 314.70 " ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "col_list = data.columns\n", "data.rename(columns={col_list[0]: 'year', col_list[1]: 'month', col_list[2]: 'Date1',\n", "                     col_list[3]: 'Date2', col_list[4]: 'CO2', col_list[5]: 'CO2 overall',\n", "                     col_list[6]: 'C02_3', col_list[7]: 'CO2_4', col_list[8]: 'C02_5',\n", "                     col_list[9]: 'CO2_6'}, inplace=True)\n", "data.head()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We convert the 'year' and 'month' columns into a monthly period, which we then set as the index:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>year</th><th>month</th><th>Date1</th><th>Date2</th><th>CO2</th><th>CO2 overall</th><th>C02_3</th><th>CO2_4</th><th>C02_5</th><th>CO2_6</th></tr>\n",
"    <tr><th>period</th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1958-01</th><td>1958</td><td>01</td><td>21200</td><td>1958.0411</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td></tr>\n",
"    <tr><th>1958-02</th><td>1958</td><td>02</td><td>21231</td><td>1958.1260</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td><td>-99.99</td></tr>\n",
"    <tr><th>1958-03</th><td>1958</td><td>03</td><td>21259</td><td>1958.2027</td><td>315.70</td><td>314.44</td><td>316.19</td><td>314.91</td><td>315.70</td><td>314.44</td></tr>\n",
"    <tr><th>1958-04</th><td>1958</td><td>04</td><td>21290</td><td>1958.2877</td><td>317.45</td><td>315.16</td><td>317.30</td><td>314.99</td><td>317.45</td><td>315.16</td></tr>\n",
"    <tr><th>1958-05</th><td>1958</td><td>05</td><td>21320</td><td>1958.3699</td><td>317.51</td><td>314.70</td><td>317.87</td><td>315.07</td><td>317.51</td><td>314.70</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " year month Date1 Date2 CO2 CO2 overall C02_3 \\\n", "period \n", "1958-01 1958 01 21200 1958.0411 -99.99 -99.99 -99.99 \n", "1958-02 1958 02 21231 1958.1260 -99.99 -99.99 -99.99 \n", "1958-03 1958 03 21259 1958.2027 315.70 314.44 316.19 \n", "1958-04 1958 04 21290 1958.2877 317.45 315.16 317.30 \n", "1958-05 1958 05 21320 1958.3699 317.51 314.70 317.87 \n", "\n", " CO2_4 C02_5 CO2_6 \n", "period \n", "1958-01 -99.99 -99.99 -99.99 \n", "1958-02 -99.99 -99.99 -99.99 \n", "1958-03 314.91 315.70 314.44 \n", "1958-04 314.99 317.45 315.16 \n", "1958-05 315.07 317.51 314.70 " ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def convert_month(year, month):\n", "    return pd.Period(year=int(year), month=int(month), freq='M')\n", "\n", "data['period'] = [convert_month(year, month) for year, month in zip(data['year'], data['month'])]\n", "data = data.set_index('period')\n", "data.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We check that there are no gaps between consecutive periods:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "periods = data.index\n", "# Print every pair of consecutive periods separated by more than one second\n", "for p1, p2 in zip(periods[:-1], periods[1:]):\n", "    delta = p2.to_timestamp() - p1.end_time\n", "    if delta > pd.Timedelta('1s'):\n", "        print(p1, p2)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "-99.99" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, 
"nbformat": 4, "nbformat_minor": 2 }