{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Subject 1: CO2 concentration in the atmosphere since 1958" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Charles David Keeling launched a campaign to measure the CO2 concentration in the atmosphere. He installed his instruments at the Mauna Loa Observatory, Hawaii, United States. Measurements have been ongoing since 1958.\n", "\n", "The initial study was meant to examine the seasonal variations of the concentration, but with global warming the focus has shifted to the growth of the concentration.\n", "\n", "Starting from the weekly data available on the [Scripps institute website](https://scrippsco2.ucsd.edu/data/atmospheric_co2/primary_mlo_co2_record.html), we want to reproduce the analysis of the evolution of the CO2 concentration in the atmosphere in order to build a predictive model." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Working environment" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We define a few functions that make it easy to display the version numbers of our system and of our modules." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "hidePrompt": false }, "outputs": [], "source": [ "def print_imported_modules():\n", "    import sys\n", "    print(\"Imported modules\")\n", "    for name, val in sorted(sys.modules.items()):\n", "        if hasattr(val, '__version__'):\n", "            print(\"\\t\",val.__name__, val.__version__)\n", "\n", "def print_sys_info():\n", "    import sys\n", "    import platform\n", "    print(\"System Info\")\n", "    print(\"\\t\",sys.version)\n", "    print(\"\\t\",platform.uname())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We use the usual Python 3 data-processing modules as of *6 April 2020*: numpy, pandas, seaborn, matplotlib, statsmodels, ...
" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import statsmodels.api as sm\n", "import seaborn as sns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Ci-aprés un aperçu de notre environnement d'execution pour les personnes qui souhaiteraient reproduire ces travaux sur leur machine." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "System Info\n", "\t 3.6.4 |Anaconda, Inc.| (default, Mar 13 2018, 01:15:57) \n", "[GCC 7.2.0]\n", "\t uname_result(system='Linux', node='dc160b48d9b7', release='4.4.0-164-generic', version='#192-Ubuntu SMP Fri Sep 13 12:02:50 UTC 2019', machine='x86_64', processor='x86_64')\n", "Imported modules\n", "\t IPython 7.12.0\n", "\t IPython.core.release 7.12.0\n", "\t PIL 7.0.0\n", "\t PIL.Image 7.0.0\n", "\t PIL._version 7.0.0\n", "\t _csv 1.0\n", "\t _ctypes 1.1.0\n", "\t _curses b'2.2'\n", "\t decimal 1.70\n", "\t argparse 1.1\n", "\t backcall 0.1.0\n", "\t cffi 1.13.2\n", "\t csv 1.0\n", "\t ctypes 1.1.0\n", "\t cycler 0.10.0\n", "\t dateutil 2.8.1\n", "\t decimal 1.70\n", "\t decorator 4.4.1\n", "\t distutils 3.6.4\n", "\t ipaddress 1.0\n", "\t ipykernel 5.1.4\n", "\t ipykernel._version 5.1.4\n", "\t ipython_genutils 0.2.0\n", "\t ipython_genutils._version 0.2.0\n", "\t ipywidgets 7.2.1\n", "\t ipywidgets._version 7.2.1\n", "\t jedi 0.16.0\n", "\t json 2.0.9\n", "\t jupyter_client 6.0.0\n", "\t jupyter_client._version 6.0.0\n", "\t jupyter_core 4.6.3\n", "\t jupyter_core.version 4.6.3\n", "\t kiwisolver 1.1.0\n", "\t logging 0.5.1.2\n", "\t matplotlib 2.2.3\n", "\t matplotlib.backends.backend_agg 2.2.3\n", "\t numpy 1.15.2\n", "\t numpy.core 1.15.2\n", "\t numpy.core.multiarray 3.1\n", "\t numpy.lib 1.15.2\n", "\t numpy.linalg._umath_linalg b'0.1.5'\n", "\t numpy.matlib 1.15.2\n", "\t 
optparse 1.5.3\n", "\t pandas 0.22.0\n", "\t _libjson 1.33\n", "\t parso 0.6.0\n", "\t patsy 0.5.1\n", "\t patsy.version 0.5.1\n", "\t pexpect 4.8.0\n", "\t pickleshare 0.7.5\n", "\t platform 1.0.8\n", "\t prompt_toolkit 3.0.3\n", "\t ptyprocess 0.6.0\n", "\t pygments 2.5.2\n", "\t pyparsing 2.4.6\n", "\t pytz 2019.3\n", "\t re 2.2.1\n", "\t scipy 1.1.0\n", "\t scipy._lib.decorator 4.0.5\n", "\t scipy._lib.six 1.2.0\n", "\t scipy.fftpack._fftpack b'$Revision: $'\n", "\t scipy.fftpack.convolve b'$Revision: $'\n", "\t scipy.integrate._dop b'$Revision: $'\n", "\t scipy.integrate._ode $Id$\n", "\t scipy.integrate._odepack 1.9 \n", "\t scipy.integrate._quadpack 1.13 \n", "\t scipy.integrate.lsoda b'$Revision: $'\n", "\t scipy.integrate.vode b'$Revision: $'\n", "\t scipy.interpolate._fitpack 1.7 \n", "\t scipy.interpolate.dfitpack b'$Revision: $'\n", "\t scipy.linalg 0.4.9\n", "\t scipy.linalg._fblas b'$Revision: $'\n", "\t scipy.linalg._flapack b'$Revision: $'\n", "\t scipy.linalg._flinalg b'$Revision: $'\n", "\t scipy.ndimage 2.0\n", "\t scipy.optimize._cobyla b'$Revision: $'\n", "\t scipy.optimize._lbfgsb b'$Revision: $'\n", "\t scipy.optimize._minpack 1.10 \n", "\t scipy.optimize._nnls b'$Revision: $'\n", "\t scipy.optimize._slsqp b'$Revision: $'\n", "\t scipy.optimize.minpack2 b'$Revision: $'\n", "\t scipy.signal.spline 0.2\n", "\t scipy.sparse.linalg.eigen.arpack._arpack b'$Revision: $'\n", "\t scipy.sparse.linalg.isolve._iterative b'$Revision: $'\n", "\t scipy.special.specfun b'$Revision: $'\n", "\t scipy.stats.mvn b'$Revision: $'\n", "\t scipy.stats.statlib b'$Revision: $'\n", "\t seaborn 0.8.1\n", "\t seaborn.external.husl 2.1.0\n", "\t seaborn.external.six 1.10.0\n", "\t six 1.14.0\n", "\t statsmodels 0.9.0\n", "\t statsmodels.__init__ 0.9.0\n", "\t traitlets 4.3.3\n", "\t traitlets._version 4.3.3\n", "\t urllib.request 3.6\n", "\t zlib 1.0\n", "\t zmq 17.1.2\n", "\t zmq.sugar 17.1.2\n", "\t zmq.sugar.version 17.1.2\n" ] } ], "source": [ "print_sys_info()\n", 
"print_imported_modules()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading and inspecting the data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We retrieved the weekly data on *6 April 2020* from the following link: [https://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/in_situ_co2/weekly/weekly_in_situ_co2_mlo.csv](https://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/in_situ_co2/weekly/weekly_in_situ_co2_mlo.csv)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "filename = \"./weekly_in_situ_co2_mlo.csv\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We display the first lines of the file to spot any lines that should be skipped." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Ligne 0 : \"-------------------------------------------------------------------------------------------\"\n", "Ligne 1 : \" Atmospheric CO2 concentrations (ppm) derived from in situ air measurements \"\n", "Ligne 2 : \" at Mauna Loa, Observatory, Hawaii: Latitude 19.5°N Longitude 155.6°W Elevation 3397m \"\n", "Ligne 3 : \" \"\n", "Ligne 4 : \" Source: R. F. Keeling, S. J. Walker, S. C. Piper and A. F. Bollenbacher \"\n" ] } ], "source": [ "def head(filename,n):\n", "    # Print the first n lines of the file, prefixed with their index\n", "    with open(filename,\"r\") as f:\n", "        lignes = f.readlines()\n", "    n = min(n,len(lignes))\n", "    for i,ligne in enumerate(lignes[:n]):\n", "        print(\"Ligne\",i,\":\",ligne,end=\"\")\n", "head(filename,5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The file appears to be correctly formatted:\n", "* Comment/metadata lines start with \"\n", "* Data lines do not start with \"\n", "\n", "Let us therefore find the first data line." 
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "44" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def find_num_first_dataline(filename):\n", "    # Return the index of the first line that does not start with a double quote\n", "    with open(filename,\"r\") as f:\n", "        lignes = f.readlines()\n", "    for i,ligne in enumerate(lignes):\n", "        if ligne[0] != '\"':\n", "            return i\n", "    raise Exception(\"No dataline found\")\n", "find_num_first_dataline(filename)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After a visual inspection, we also found that the data start at line 44.\n", "During this inspection, we noted the following:\n", "1. The first column holds the acquisition dates\n", "2. The data are centered on 12:00 each day\n", "3. The second column holds the measured concentrations\n", "4. The concentration is the mean CO2 concentration in the atmosphere over the day" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\"><th></th><th>Date</th><th>Concentration</th></tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr><th>0</th><td>1958-03-29</td><td>316.19</td></tr>\n",
"    <tr><th>1</th><td>1958-04-05</td><td>317.31</td></tr>\n",
"    <tr><th>2</th><td>1958-04-12</td><td>317.69</td></tr>\n",
"    <tr><th>3</th><td>1958-04-19</td><td>317.58</td></tr>\n",
"    <tr><th>4</th><td>1958-04-26</td><td>316.48</td></tr>\n",
"    <tr><th>5</th><td>1958-05-03</td><td>316.95</td></tr>\n",
"    <tr><th>6</th><td>1958-05-17</td><td>317.56</td></tr>\n",
"    <tr><th>7</th><td>1958-05-24</td><td>317.99</td></tr>\n",
"    <tr><th>8</th><td>1958-07-05</td><td>315.85</td></tr>\n",
"    <tr><th>9</th><td>1958-07-12</td><td>315.85</td></tr>\n",
"    <tr><th>10</th><td>1958-07-19</td><td>315.46</td></tr>\n",
"    <tr><th>11</th><td>1958-07-26</td><td>315.59</td></tr>\n",
"    <tr><th>12</th><td>1958-08-02</td><td>315.64</td></tr>\n",
"    <tr><th>13</th><td>1958-08-09</td><td>315.10</td></tr>\n",
"    <tr><th>14</th><td>1958-08-16</td><td>315.09</td></tr>\n",
"    <tr><th>15</th><td>1958-08-30</td><td>314.14</td></tr>\n",
"    <tr><th>16</th><td>1958-09-06</td><td>313.54</td></tr>\n",
"    <tr><th>17</th><td>1958-11-08</td><td>313.05</td></tr>\n",
"    <tr><th>18</th><td>1958-11-15</td><td>313.26</td></tr>\n",
"    <tr><th>19</th><td>1958-11-22</td><td>313.57</td></tr>\n",
"    <tr><th>20</th><td>1958-11-29</td><td>314.01</td></tr>\n",
"    <tr><th>21</th><td>1958-12-06</td><td>314.56</td></tr>\n",
"    <tr><th>22</th><td>1958-12-13</td><td>314.41</td></tr>\n",
"    <tr><th>23</th><td>1958-12-20</td><td>314.77</td></tr>\n",
"    <tr><th>24</th><td>1958-12-27</td><td>315.21</td></tr>\n",
"    <tr><th>25</th><td>1959-01-03</td><td>315.24</td></tr>\n",
"    <tr><th>26</th><td>1959-01-10</td><td>315.50</td></tr>\n",
"    <tr><th>27</th><td>1959-01-17</td><td>315.69</td></tr>\n",
"    <tr><th>28</th><td>1959-01-24</td><td>315.86</td></tr>\n",
"    <tr><th>29</th><td>1959-01-31</td><td>315.42</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td></tr>\n",
"    <tr><th>3126</th><td>2019-07-06</td><td>412.69</td></tr>\n",
"    <tr><th>3127</th><td>2019-07-13</td><td>412.30</td></tr>\n",
"    <tr><th>3128</th><td>2019-07-20</td><td>411.76</td></tr>\n",
"    <tr><th>3129</th><td>2019-07-27</td><td>410.32</td></tr>\n",
"    <tr><th>3130</th><td>2019-08-03</td><td>410.50</td></tr>\n",
"    <tr><th>3131</th><td>2019-08-10</td><td>410.48</td></tr>\n",
"    <tr><th>3132</th><td>2019-08-17</td><td>410.05</td></tr>\n",
"    <tr><th>3133</th><td>2019-08-24</td><td>409.52</td></tr>\n",
"    <tr><th>3134</th><td>2019-08-31</td><td>409.32</td></tr>\n",
"    <tr><th>3135</th><td>2019-09-07</td><td>408.80</td></tr>\n",
"    <tr><th>3136</th><td>2019-09-14</td><td>408.61</td></tr>\n",
"    <tr><th>3137</th><td>2019-09-21</td><td>408.50</td></tr>\n",
"    <tr><th>3138</th><td>2019-09-28</td><td>408.28</td></tr>\n",
"    <tr><th>3139</th><td>2019-10-05</td><td>407.99</td></tr>\n",
"    <tr><th>3140</th><td>2019-10-12</td><td>408.61</td></tr>\n",
"    <tr><th>3141</th><td>2019-10-19</td><td>408.77</td></tr>\n",
"    <tr><th>3142</th><td>2019-10-26</td><td>408.68</td></tr>\n",
"    <tr><th>3143</th><td>2019-11-02</td><td>409.86</td></tr>\n",
"    <tr><th>3144</th><td>2019-11-09</td><td>410.15</td></tr>\n",
"    <tr><th>3145</th><td>2019-11-16</td><td>410.22</td></tr>\n",
"    <tr><th>3146</th><td>2019-11-23</td><td>410.48</td></tr>\n",
"    <tr><th>3147</th><td>2019-11-30</td><td>410.92</td></tr>\n",
"    <tr><th>3148</th><td>2019-12-07</td><td>411.27</td></tr>\n",
"    <tr><th>3149</th><td>2019-12-14</td><td>411.67</td></tr>\n",
"    <tr><th>3150</th><td>2019-12-21</td><td>412.30</td></tr>\n",
"    <tr><th>3151</th><td>2019-12-28</td><td>412.59</td></tr>\n",
"    <tr><th>3152</th><td>2020-01-04</td><td>413.19</td></tr>\n",
"    <tr><th>3153</th><td>2020-01-11</td><td>413.39</td></tr>\n",
"    <tr><th>3154</th><td>2020-01-25</td><td>413.36</td></tr>\n",
"    <tr><th>3155</th><td>2020-02-01</td><td>413.99</td></tr>\n",
"  </tbody>\n", "</table>\n", "<p>3156 rows × 2 columns</p>\n", "</div>
" ], "text/plain": [ " Date Concentration\n", "0 1958-03-29 316.19\n", "1 1958-04-05 317.31\n", "2 1958-04-12 317.69\n", "3 1958-04-19 317.58\n", "4 1958-04-26 316.48\n", "5 1958-05-03 316.95\n", "6 1958-05-17 317.56\n", "7 1958-05-24 317.99\n", "8 1958-07-05 315.85\n", "9 1958-07-12 315.85\n", "10 1958-07-19 315.46\n", "11 1958-07-26 315.59\n", "12 1958-08-02 315.64\n", "13 1958-08-09 315.10\n", "14 1958-08-16 315.09\n", "15 1958-08-30 314.14\n", "16 1958-09-06 313.54\n", "17 1958-11-08 313.05\n", "18 1958-11-15 313.26\n", "19 1958-11-22 313.57\n", "20 1958-11-29 314.01\n", "21 1958-12-06 314.56\n", "22 1958-12-13 314.41\n", "23 1958-12-20 314.77\n", "24 1958-12-27 315.21\n", "25 1959-01-03 315.24\n", "26 1959-01-10 315.50\n", "27 1959-01-17 315.69\n", "28 1959-01-24 315.86\n", "29 1959-01-31 315.42\n", "... ... ...\n", "3126 2019-07-06 412.69\n", "3127 2019-07-13 412.30\n", "3128 2019-07-20 411.76\n", "3129 2019-07-27 410.32\n", "3130 2019-08-03 410.50\n", "3131 2019-08-10 410.48\n", "3132 2019-08-17 410.05\n", "3133 2019-08-24 409.52\n", "3134 2019-08-31 409.32\n", "3135 2019-09-07 408.80\n", "3136 2019-09-14 408.61\n", "3137 2019-09-21 408.50\n", "3138 2019-09-28 408.28\n", "3139 2019-10-05 407.99\n", "3140 2019-10-12 408.61\n", "3141 2019-10-19 408.77\n", "3142 2019-10-26 408.68\n", "3143 2019-11-02 409.86\n", "3144 2019-11-09 410.15\n", "3145 2019-11-16 410.22\n", "3146 2019-11-23 410.48\n", "3147 2019-11-30 410.92\n", "3148 2019-12-07 411.27\n", "3149 2019-12-14 411.67\n", "3150 2019-12-21 412.30\n", "3151 2019-12-28 412.59\n", "3152 2020-01-04 413.19\n", "3153 2020-01-11 413.39\n", "3154 2020-01-25 413.36\n", "3155 2020-02-01 413.99\n", "\n", "[3156 rows x 2 columns]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data = pd.read_csv(filename,skiprows=44,header=None,names=[\"Date\",\"Concentration\"])\n", "data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Nous y somme presque. 
All that remains is to convert the dates to pandas Period objects." ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [], "source": [ "def convert_date(year_month_day):\n", "    # Parse a 'YYYY-MM-DD' string and center the measurement on 12:00\n", "    year_month_day_str = str(year_month_day)\n", "    year = int(year_month_day_str[:4])\n", "    month = int(year_month_day_str[5:7])\n", "    day = int(year_month_day_str[8:])\n", "    return pd.Timestamp(year=year,month=month,day=day,hour=12,minute=0,second=0)\n", "\n", "data[\"Datetime\"] = [convert_date(ymd) for ymd in data[\"Date\"]]\n", "# Nanoseconds since the Unix epoch, divided by 10**8\n", "data[\"Timestamp\"] = [date.value/10**8 for date in data[\"Datetime\"]]\n", "data[\"Period\"] = [date.to_period('W') for date in data[\"Datetime\"]]\n", "data.set_index(\"Period\");" ] }, { "cell_type": "markdown", "metadata": { "hideCode": false }, "source": [ "Keep the following columns in mind:\n", "\n", "| Name          | Meaning                                                      |\n", "|:-------------:|:-------------------------------------------------------------|\n", "|**Date**       | The date as given in the raw file                            |\n", "|**Datetime**   | The date in ISO-8601 format                                  |\n", "|**Timestamp**  | Unix time (nanoseconds since the epoch, divided by 10**8)    |\n", "|**Period**     | Time span                                                    |\n", "\n", "And a small plot to guide our work." ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "data.plot(x=\"Period\",y=\"Concentration\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Décomposition de la concentration" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sur le graphique, nous aperçevons une courbe croissante avec de faible oscilliations. Cette courbe se prête bien à une décompostion en saisonière." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "from statsmodels.tsa.seasonal import seasonal_decompose\n", "# freq = 52 car 52 semaines par an\n", "result = seasonal_decompose(data['Concentration'],model=\"additive\", freq=52)\n", "result.plot()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "La tendance globale (Trend) de la concentration moyenne de C02 dans l'atmosphère est croissante.\n", "\n", "Les oscillations semblent bien être saisonières (Seasonal).\n", "\n", "**Nous noterons cependant que le bruit résiduel (Residual) est du même ordre que les oscillations saisonnières**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Modélisation du phénomène" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Au regard de la décomposition précédente, nous modéliserons la variation de la concentration comme la somme d'une fonction affine croissante et d'une fonction périodique." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Estimation de la fonction linéaire" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pour estimer notre fonction affine, nous allons utiliser un modlèle linéaire génralisé : Concentration ~ 1 + Temps" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "\n", " \n", "\n", "
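<table class=\"simpletable\">\n", "<caption>Generalized Linear Model Regression Results</caption>\n",
"<tr><th>Dep. Variable:</th><td>Concentration</td><th>No. Observations:</th><td>3156</td></tr>\n",
"<tr><th>Model:</th><td>GLM</td><th>Df Residuals:</th><td>3154</td></tr>\n",
"<tr><th>Model Family:</th><td>Gaussian</td><th>Df Model:</th><td>1</td></tr>\n",
"<tr><th>Link Function:</th><td>identity</td><th>Scale:</th><td>18.461</td></tr>\n",
"<tr><th>Method:</th><td>IRLS</td><th>Log-Likelihood:</th><td>-9078.1</td></tr>\n",
"<tr><th>Date:</th><td>Tue, 07 Apr 2020</td><th>Deviance:</th><td>58226.</td></tr>\n",
"<tr><th>Time:</th><td>13:14:18</td><th>Pearson chi2:</th><td>5.82e+04</td></tr>\n",
"<tr><th>No. Iterations:</th><td>3</td><th>Covariance Type:</th><td>nonrobust</td></tr>\n",
"</table>\n", "<table class=\"simpletable\">\n",
"<tr><td></td><th>coef</th><th>std err</th><th>z</th><th>P&gt;|z|</th><th>[0.025</th><th>0.975]</th></tr>\n",
"<tr><th>Intercept</th><td>324.5375</td><td>0.114</td><td>2845.850</td><td>0.000</td><td>324.314</td><td>324.761</td></tr>\n",
"<tr><th>Timestamp</th><td>5e-09</td><td>1.37e-11</td><td>364.652</td><td>0.000</td><td>4.97e-09</td><td>5.03e-09</td></tr>\n",
"</table>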
" ], "text/plain": [ "\n", "\"\"\"\n", " Generalized Linear Model Regression Results \n", "==============================================================================\n", "Dep. Variable: Concentration No. Observations: 3156\n", "Model: GLM Df Residuals: 3154\n", "Model Family: Gaussian Df Model: 1\n", "Link Function: identity Scale: 18.461\n", "Method: IRLS Log-Likelihood: -9078.1\n", "Date: Tue, 07 Apr 2020 Deviance: 58226.\n", "Time: 13:14:18 Pearson chi2: 5.82e+04\n", "No. Iterations: 3 Covariance Type: nonrobust\n", "==============================================================================\n", " coef std err z P>|z| [0.025 0.975]\n", "------------------------------------------------------------------------------\n", "Intercept 324.5375 0.114 2845.850 0.000 324.314 324.761\n", "Timestamp 5e-09 1.37e-11 364.652 0.000 4.97e-09 5.03e-09\n", "==============================================================================\n", "\"\"\"" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data[\"Intercept\"]=1\n", "linmodel = sm.GLM(data['Concentration'],data[['Intercept','Timestamp']]).fit()\n", "linmodel.summary()" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "%matplotlib inline\n", "data_pred = pd.DataFrame({'Timestamp': np.linspace(start=data[\"Timestamp\"][0], stop=data[\"Timestamp\"][len(data[\"Timestamp\"])-1], num=50), 'Intercept': 1})\n", "data_pred['Concentration'] = linmodel.predict(data_pred)\n", "data_pred.plot(x=\"Timestamp\",y=\"Concentration\",kind=\"line\")\n", "plt.scatter(x=data[\"Timestamp\"],y=data[\"Concentration\"],color=\"red\")\n", "plt.grid(True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prédiction en 2050" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Nous allons contruire une fonction pour prédire la concentration à une date donnée." ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2050-06-04 12:00:00 1.269077e+10\n", "dtype: float64" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def predict(year=2050,month=6,day=4,hour=12,minute=0,second=0):\n", " datetime = pd.Timestamp(year=year,month=month,day=day,hour=hour,minute=minute,second=second)\n", " timestamp = datetime.value\n", " date_pred = pd.DataFrame({'Timestamp':[timestamp],'Intercept':1},index=[datetime])\n", " return linmodel.predict(date_pred)\n", "predict()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "La concentration de C02 de l'atmosphère en 2050 sera de 10 Gppm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conclusion" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "La droite que nous avons pris pour faire nos prédictions se trouve sous les mesures réelles. Cela signifie que la réalité pourrait être encore pire dans le future. Cependant, notre prédiction présentée ici est plein de biais : le plus important reste sans doute qu'il s'appuie sur un modèle linéaire." 
] } ], "metadata": { "celltoolbar": "Aucun(e)", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }