{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Etude de la concentration de CO2 dans l'atmosphère depuis 1958"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Importation des librairies nécessaires à l'analyse:"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline\n",
"import matplotlib.pyplot as plt\n",
"import pandas as pd\n",
"import isoweek"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Importation des données\n",
"\n",
"Les données sont accessible sur le site de [l'institut Scripps](https://scrippsco2.ucsd.edu/data/atmospheric_co2/primary_mlo_co2_record.html). Elles sont téléchargée en date du 02/06/2021. Si le fichier de données n'a plus de version locale, il sera téléchargé."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"data_url = 'https://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/in_situ_co2/monthly/monthly_in_situ_co2_mlo.csv'\n",
"\n",
"data_file = \"C02-atmosphere.csv\"\n",
"import os\n",
"import urllib.request\n",
"if not os.path.exists(data_file):\n",
" urllib.request.urlretrieve(data_url, data_file)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Les 54 premières lignes du fichiers sont une description des données et quelques indications, elle ne seront pas prise en compte lors de la création du dataFrame de l'analyse. \n",
"La structure des données, comme indiquée dans le fichier CSV est la suivante: \n",
"\n",
"| Nom de colonne | Libellé de colonne |\n",
"|----------------|-----------------------------------------------------------------------------------------------------------------------------------|\n",
"| Year | Année aucours de laquelle la mesure a été faite |\n",
"| Month | Mois au cours duquel la mesure a été faite\n",
" |\n",
"| Date | Date de la mesure au format Excel\n",
" |\n",
"| Date | Date de la mesure au format ISO\n",
" |\n",
"| CO2 (ppm) | Taux de CO2 en micro mole par mole (ppm)\n",
" |\n",
"| CO2 adjusted (ppm)| Taux de CO2 auquel on a retiré les variations saisonnières\n",
" |\n",
"| CO2 smoothed(ppm) | Taux de CO2 ajusté\n",
" |\n",
"| CO2 smoothed and adjusted (ppm) | Taux de CO2 auquel on a retiré les variations saisonnières et ajusté\n",
" |\n",
"| CO2 completed (ppm) | Identique à la colonne 5, les valeurs manquantes sont prises dans la colonne 7\n",
" |\n",
"| CO2 adjusted completed (ppm) | Identique à la colonne 6, les valeurs manquantes sont prises dans la colonne 8\n",
" | \n",
" \n",
" "
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Yr | \n",
" Mn | \n",
" Date | \n",
" Date | \n",
" CO2 | \n",
" seasonally | \n",
" fit | \n",
" seasonally | \n",
" CO2 | \n",
" seasonally | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" adjusted | \n",
" | \n",
" adjusted fit | \n",
" filled | \n",
" adjusted filled | \n",
"
\n",
" \n",
" 1 | \n",
" | \n",
" | \n",
" Excel | \n",
" | \n",
" [ppm] | \n",
" [ppm] | \n",
" [ppm] | \n",
" [ppm] | \n",
" [ppm] | \n",
" [ppm] | \n",
"
\n",
" \n",
" 2 | \n",
" 1958 | \n",
" 01 | \n",
" 21200 | \n",
" 1958.0411 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
"
\n",
" \n",
" 3 | \n",
" 1958 | \n",
" 02 | \n",
" 21231 | \n",
" 1958.1260 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
"
\n",
" \n",
" 4 | \n",
" 1958 | \n",
" 03 | \n",
" 21259 | \n",
" 1958.2027 | \n",
" 315.70 | \n",
" 314.44 | \n",
" 316.19 | \n",
" 314.91 | \n",
" 315.70 | \n",
" 314.44 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Yr Mn Date Date CO2 seasonally fit \\\n",
"0 adjusted \n",
"1 Excel [ppm] [ppm] [ppm] \n",
"2 1958 01 21200 1958.0411 -99.99 -99.99 -99.99 \n",
"3 1958 02 21231 1958.1260 -99.99 -99.99 -99.99 \n",
"4 1958 03 21259 1958.2027 315.70 314.44 316.19 \n",
"\n",
" seasonally CO2 seasonally \n",
"0 adjusted fit filled adjusted filled \n",
"1 [ppm] [ppm] [ppm] \n",
"2 -99.99 -99.99 -99.99 \n",
"3 -99.99 -99.99 -99.99 \n",
"4 314.91 315.70 314.44 "
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"raw_data = pd.read_csv(data_file, skiprows=54)\n",
"raw_data.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Les lignes 0 et 1 vont gêner l'analyse et ne contiennent que des indications sur les données. Nous les retirons:"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Yr | \n",
" Mn | \n",
" Date | \n",
" Date | \n",
" CO2 | \n",
" seasonally | \n",
" fit | \n",
" seasonally | \n",
" CO2 | \n",
" seasonally | \n",
"
\n",
" \n",
" \n",
" \n",
" 2 | \n",
" 1958 | \n",
" 01 | \n",
" 21200 | \n",
" 1958.0411 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
"
\n",
" \n",
" 3 | \n",
" 1958 | \n",
" 02 | \n",
" 21231 | \n",
" 1958.1260 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
"
\n",
" \n",
" 4 | \n",
" 1958 | \n",
" 03 | \n",
" 21259 | \n",
" 1958.2027 | \n",
" 315.70 | \n",
" 314.44 | \n",
" 316.19 | \n",
" 314.91 | \n",
" 315.70 | \n",
" 314.44 | \n",
"
\n",
" \n",
" 5 | \n",
" 1958 | \n",
" 04 | \n",
" 21290 | \n",
" 1958.2877 | \n",
" 317.45 | \n",
" 315.16 | \n",
" 317.30 | \n",
" 314.99 | \n",
" 317.45 | \n",
" 315.16 | \n",
"
\n",
" \n",
" 6 | \n",
" 1958 | \n",
" 05 | \n",
" 21320 | \n",
" 1958.3699 | \n",
" 317.51 | \n",
" 314.70 | \n",
" 317.87 | \n",
" 315.07 | \n",
" 317.51 | \n",
" 314.70 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Yr Mn Date Date CO2 seasonally fit \\\n",
"2 1958 01 21200 1958.0411 -99.99 -99.99 -99.99 \n",
"3 1958 02 21231 1958.1260 -99.99 -99.99 -99.99 \n",
"4 1958 03 21259 1958.2027 315.70 314.44 316.19 \n",
"5 1958 04 21290 1958.2877 317.45 315.16 317.30 \n",
"6 1958 05 21320 1958.3699 317.51 314.70 317.87 \n",
"\n",
" seasonally CO2 seasonally \n",
"2 -99.99 -99.99 -99.99 \n",
"3 -99.99 -99.99 -99.99 \n",
"4 314.91 315.70 314.44 \n",
"5 314.99 317.45 315.16 \n",
"6 315.07 317.51 314.70 "
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data = raw_data.drop(labels=[0,1], axis=0).copy()\n",
"data.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Les titres de colonnes contiennent des espaces qui gêne leur appel. Nous renommons donc les colonnes comme suit:"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" year | \n",
" month | \n",
" Date1 | \n",
" Date2 | \n",
" CO2 | \n",
" CO2 overall | \n",
" C02_3 | \n",
" CO2_4 | \n",
" C02_5 | \n",
" CO2_6 | \n",
"
\n",
" \n",
" \n",
" \n",
" 2 | \n",
" 1958 | \n",
" 01 | \n",
" 21200 | \n",
" 1958.0411 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
"
\n",
" \n",
" 3 | \n",
" 1958 | \n",
" 02 | \n",
" 21231 | \n",
" 1958.1260 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
"
\n",
" \n",
" 4 | \n",
" 1958 | \n",
" 03 | \n",
" 21259 | \n",
" 1958.2027 | \n",
" 315.70 | \n",
" 314.44 | \n",
" 316.19 | \n",
" 314.91 | \n",
" 315.70 | \n",
" 314.44 | \n",
"
\n",
" \n",
" 5 | \n",
" 1958 | \n",
" 04 | \n",
" 21290 | \n",
" 1958.2877 | \n",
" 317.45 | \n",
" 315.16 | \n",
" 317.30 | \n",
" 314.99 | \n",
" 317.45 | \n",
" 315.16 | \n",
"
\n",
" \n",
" 6 | \n",
" 1958 | \n",
" 05 | \n",
" 21320 | \n",
" 1958.3699 | \n",
" 317.51 | \n",
" 314.70 | \n",
" 317.87 | \n",
" 315.07 | \n",
" 317.51 | \n",
" 314.70 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" year month Date1 Date2 CO2 CO2 overall C02_3 \\\n",
"2 1958 01 21200 1958.0411 -99.99 -99.99 -99.99 \n",
"3 1958 02 21231 1958.1260 -99.99 -99.99 -99.99 \n",
"4 1958 03 21259 1958.2027 315.70 314.44 316.19 \n",
"5 1958 04 21290 1958.2877 317.45 315.16 317.30 \n",
"6 1958 05 21320 1958.3699 317.51 314.70 317.87 \n",
"\n",
" CO2_4 C02_5 CO2_6 \n",
"2 -99.99 -99.99 -99.99 \n",
"3 -99.99 -99.99 -99.99 \n",
"4 314.91 315.70 314.44 \n",
"5 314.99 317.45 315.16 \n",
"6 315.07 317.51 314.70 "
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"col_list = data.columns\n",
"data.rename(columns={col_list[0]: 'year', col_list[1]: 'month', col_list[2]: 'Date1',\n",
" col_list[3]: 'Date2', col_list[4]: 'CO2', col_list[5]: 'CO2 overall',\n",
" col_list[6]: 'C02_3', col_list[7]: 'CO2_4', col_list[8]: 'C02_5',\n",
" col_list[9]: 'CO2_6'}, inplace=True)\n",
"data.head()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"On converti les colonnes 'year' et 'month' en période que l'on défini ensuite comme index"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" year | \n",
" month | \n",
" Date1 | \n",
" Date2 | \n",
" CO2 | \n",
" CO2 overall | \n",
" C02_3 | \n",
" CO2_4 | \n",
" C02_5 | \n",
" CO2_6 | \n",
"
\n",
" \n",
" period | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" 1958-01 | \n",
" 1958 | \n",
" 01 | \n",
" 21200 | \n",
" 1958.0411 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
"
\n",
" \n",
" 1958-02 | \n",
" 1958 | \n",
" 02 | \n",
" 21231 | \n",
" 1958.1260 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
" -99.99 | \n",
"
\n",
" \n",
" 1958-03 | \n",
" 1958 | \n",
" 03 | \n",
" 21259 | \n",
" 1958.2027 | \n",
" 315.70 | \n",
" 314.44 | \n",
" 316.19 | \n",
" 314.91 | \n",
" 315.70 | \n",
" 314.44 | \n",
"
\n",
" \n",
" 1958-04 | \n",
" 1958 | \n",
" 04 | \n",
" 21290 | \n",
" 1958.2877 | \n",
" 317.45 | \n",
" 315.16 | \n",
" 317.30 | \n",
" 314.99 | \n",
" 317.45 | \n",
" 315.16 | \n",
"
\n",
" \n",
" 1958-05 | \n",
" 1958 | \n",
" 05 | \n",
" 21320 | \n",
" 1958.3699 | \n",
" 317.51 | \n",
" 314.70 | \n",
" 317.87 | \n",
" 315.07 | \n",
" 317.51 | \n",
" 314.70 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" year month Date1 Date2 CO2 CO2 overall C02_3 \\\n",
"period \n",
"1958-01 1958 01 21200 1958.0411 -99.99 -99.99 -99.99 \n",
"1958-02 1958 02 21231 1958.1260 -99.99 -99.99 -99.99 \n",
"1958-03 1958 03 21259 1958.2027 315.70 314.44 316.19 \n",
"1958-04 1958 04 21290 1958.2877 317.45 315.16 317.30 \n",
"1958-05 1958 05 21320 1958.3699 317.51 314.70 317.87 \n",
"\n",
" CO2_4 C02_5 CO2_6 \n",
"period \n",
"1958-01 -99.99 -99.99 -99.99 \n",
"1958-02 -99.99 -99.99 -99.99 \n",
"1958-03 314.91 315.70 314.44 \n",
"1958-04 314.99 317.45 315.16 \n",
"1958-05 315.07 317.51 314.70 "
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def convert_month(year,month):\n",
" return pd.Period(year = int(year), month = int(month), freq='M')\n",
"\n",
"data['period'] = [convert_month(year,month) for year,month in zip(data['year'],data['month'])]\n",
"data = data.set_index('period')\n",
"data.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"On vérifie qu'il n'y a pas de trou dans les périodes:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"periods = data.index\n",
"for p1, p2 in zip(periods[:-1], periods[1:]):\n",
" delta = p2.to_timestamp() - p1.end_time\n",
" if delta > pd.Timedelta('1s'):\n",
" print(p1, p2)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}