{ "cells": [ { "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "# Atmospheric CO2 concentration since 1958" ] }, { "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "## Libraries" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "hideCode": false, "hidePrompt": false }, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "import pandas as pd\n", "import isoweek\n", "import numpy as np" ] }, { "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "## Presentation of the data" ] }, { "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "The data on the evolution of the atmospheric CO2 concentration are available from the website of the [Scripps Institution of Oceanography](https://scrippsco2.ucsd.edu/data/atmospheric_co2/primary_mlo_co2_record.html). We retrieve them as a CSV file. The file contains 10 columns. Columns 1-4 give the date in several formats. Column 5 gives the CO2 concentration at Mauna Loa in micro-mol per mol (ppm), reported on the 2008A SIO scale. The values in the table refer to midnight (24:00) on the 15th of each month, between 1958 and 2020. Column 6 gives the same data as column 5 after a seasonal adjustment that removes the quasi-regular seasonal cycle (a 4-harmonic fit with a linear gain factor). \n", "Column 7 is a smoothed version of the data in column 5, obtained with a cubic spline plus a 4-harmonic function with a linear gain factor. Column 8 gives the data of column 7 with the seasonal cycle removed. \n", "Missing values are marked with \"-99.99\". Column 9 is identical to column 5, with missing values replaced by the values of column 7. Column 10 is identical to column 6, with missing values replaced by the values of column 8.\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "hideCode": false, "hidePrompt": false }, "outputs": [], "source": [ "data_url = \"https://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/in_situ_co2/monthly/monthly_in_situ_co2_mlo.csv\"" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "hideCode": false, "hidePrompt": false }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Yr Mn Date Date CO2 seasonally fit \\\n", "0 adjusted \n", "1 Excel [ppm] [ppm] [ppm] \n", "2 1958 01 21200 1958.0411 -99.99 -99.99 -99.99 \n", "3 1958 02 21231 1958.1260 -99.99 -99.99 -99.99 \n", "4 1958 03 21259 1958.2027 315.70 314.44 316.18 \n", "5 1958 04 21290 1958.2877 317.45 315.16 317.29 \n", "6 1958 05 21320 1958.3699 317.51 314.71 317.86 \n", "7 1958 06 21351 1958.4548 -99.99 -99.99 317.24 \n", "8 1958 07 21381 1958.5370 315.86 315.19 315.86 \n", "9 1958 08 21412 1958.6219 314.93 316.19 313.99 \n", "10 1958 09 21443 1958.7068 313.21 316.08 312.45 \n", "11 1958 10 21473 1958.7890 -99.99 -99.99 312.43 \n", "12 1958 11 21504 1958.8740 313.33 315.20 313.61 \n", "13 1958 12 21534 1958.9562 314.67 315.43 314.76 \n", "14 1959 01 21565 1959.0411 315.58 315.54 315.62 \n", "15 1959 02 21596 1959.1260 316.49 315.86 316.26 \n", "16 1959 03 21624 1959.2027 316.65 315.38 316.97 \n", "17 1959 04 21655 1959.2877 317.72 315.42 318.08 \n", "18 1959 05 21685 1959.3699 318.29 315.49 318.65 \n", "19 1959 06 21716 1959.4548 318.15 316.03 318.04 \n", "20 1959 07 21746 1959.5370 316.54 315.86 316.67 \n", "21 1959 08 21777 1959.6219 314.80 316.06 314.82 \n", "22 1959 09 21808 1959.7068 313.84 316.73 313.31 \n", "23 1959 10 21838 1959.7890 313.33 316.33 313.32 \n", "24 1959 11 21869 1959.8740 314.81 316.68 314.54 \n", "25 1959 12 21899 1959.9562 315.58 316.35 315.72 \n", "26 1960 01 21930 1960.0410 316.43 316.39 316.61 \n", "27 1960 02 21961 1960.1257 316.98 316.35 317.27 \n", "28 1960 03 21990 1960.2049 317.58 316.28 318.02 \n", "29 1960 04 22021 1960.2896 319.03 316.70 319.14 \n", ".. ... ... ... ... ... ... ... 
\n", "728 2018 07 43296 2018.5370 408.90 408.08 409.43 \n", "729 2018 08 43327 2018.6219 407.10 408.63 407.33 \n", "730 2018 09 43358 2018.7068 405.59 409.08 405.66 \n", "731 2018 10 43388 2018.7890 405.99 409.61 405.84 \n", "732 2018 11 43419 2018.8740 408.12 410.38 407.48 \n", "733 2018 12 43449 2018.9562 409.23 410.15 409.07 \n", "734 2019 01 43480 2019.0411 410.92 410.87 410.30 \n", "735 2019 02 43511 2019.1260 411.66 410.90 411.25 \n", "736 2019 03 43539 2019.2027 412.00 410.46 412.25 \n", "737 2019 04 43570 2019.2877 413.52 410.72 413.73 \n", "738 2019 05 43600 2019.3699 414.83 411.42 414.54 \n", "739 2019 06 43631 2019.4548 413.96 411.38 413.91 \n", "740 2019 07 43661 2019.5370 411.85 411.03 412.36 \n", "741 2019 08 43692 2019.6219 410.08 411.62 410.22 \n", "742 2019 09 43723 2019.7068 408.55 412.06 408.49 \n", "743 2019 10 43753 2019.7890 408.43 412.06 408.62 \n", "744 2019 11 43784 2019.8740 410.29 412.56 410.21 \n", "745 2019 12 43814 2019.9562 411.85 412.78 411.76 \n", "746 2020 01 43845 2020.0410 413.37 413.32 412.95 \n", "747 2020 02 43876 2020.1257 414.09 413.33 413.87 \n", "748 2020 03 43905 2020.2049 414.51 412.94 414.89 \n", "749 2020 04 43936 2020.2896 416.18 413.35 416.35 \n", "750 2020 05 43966 2020.3716 417.16 413.75 -99.99 \n", "751 2020 06 43997 2020.4563 -99.99 -99.99 -99.99 \n", "752 2020 07 44027 2020.5383 -99.99 -99.99 -99.99 \n", "753 2020 08 44058 2020.6230 -99.99 -99.99 -99.99 \n", "754 2020 09 44089 2020.7077 -99.99 -99.99 -99.99 \n", "755 2020 10 44119 2020.7896 -99.99 -99.99 -99.99 \n", "756 2020 11 44150 2020.8743 -99.99 -99.99 -99.99 \n", "757 2020 12 44180 2020.9563 -99.99 -99.99 -99.99 \n", "\n", " seasonally CO2 seasonally \n", "0 adjusted fit filled adjusted filled \n", "1 [ppm] [ppm] [ppm] \n", "2 -99.99 -99.99 -99.99 \n", "3 -99.99 -99.99 -99.99 \n", "4 314.90 315.70 314.44 \n", "5 314.98 317.45 315.16 \n", "6 315.06 317.51 314.71 \n", "7 315.14 317.24 315.14 \n", "8 315.21 315.86 315.19 \n", "9 315.28 314.93 316.19 \n", "10 
315.35 313.21 316.08 \n", "11 315.40 312.43 315.40 \n", "12 315.46 313.33 315.20 \n", "13 315.51 314.67 315.43 \n", "14 315.57 315.58 315.54 \n", "15 315.63 316.49 315.86 \n", "16 315.69 316.65 315.38 \n", "17 315.76 317.72 315.42 \n", "18 315.84 318.29 315.49 \n", "19 315.93 318.15 316.03 \n", "20 316.02 316.54 315.86 \n", "21 316.12 314.80 316.06 \n", "22 316.21 313.84 316.73 \n", "23 316.30 313.33 316.33 \n", "24 316.39 314.81 316.68 \n", "25 316.47 315.58 316.35 \n", "26 316.55 316.43 316.39 \n", "27 316.64 316.98 316.35 \n", "28 316.71 317.58 316.28 \n", "29 316.79 319.03 316.70 \n", ".. ... ... ... \n", "728 408.65 408.90 408.08 \n", "729 408.90 407.10 408.63 \n", "730 409.18 405.59 409.08 \n", "731 409.44 405.99 409.61 \n", "732 409.72 408.12 410.38 \n", "733 409.98 409.23 410.15 \n", "734 410.24 410.92 410.87 \n", "735 410.48 411.66 410.90 \n", "736 410.69 412.00 410.46 \n", "737 410.92 413.52 410.72 \n", "738 411.14 414.83 411.42 \n", "739 411.36 413.96 411.38 \n", "740 411.57 411.85 411.03 \n", "741 411.79 410.08 411.62 \n", "742 412.02 408.55 412.06 \n", "743 412.23 408.43 412.06 \n", "744 412.46 410.29 412.56 \n", "745 412.67 411.85 412.78 \n", "746 412.89 413.37 413.32 \n", "747 413.10 414.09 413.33 \n", "748 413.30 414.51 412.94 \n", "749 413.50 416.18 413.35 \n", "750 -99.99 417.16 413.75 \n", "751 -99.99 -99.99 -99.99 \n", "752 -99.99 -99.99 -99.99 \n", "753 -99.99 -99.99 -99.99 \n", "754 -99.99 -99.99 -99.99 \n", "755 -99.99 -99.99 -99.99 \n", "756 -99.99 -99.99 -99.99 \n", "757 -99.99 -99.99 -99.99 \n", "\n", "[758 rows x 10 columns]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "raw_data = pd.read_csv(data_url, skiprows=54)\n", "raw_data" ] }, { "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "## Data processing" ] }, { "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "We display the column names." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": { "hideCode": false, "hidePrompt": false }, "outputs": [ { "data": { "text/plain": [ "[' Yr',\n", " ' Mn',\n", " ' Date',\n", " ' Date',\n", " ' CO2',\n", " 'seasonally',\n", " ' fit',\n", " ' seasonally',\n", " ' CO2',\n", " ' seasonally']" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(raw_data.columns) " ] }, { "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "We rename the columns to tidy up the table." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "hideCode": false, "hidePrompt": false }, "outputs": [ { "data": { "text/plain": [ "['year',\n", " 'Month',\n", " 'data1',\n", " 'data2',\n", " 'CO2',\n", " 'seasonally_adjusted',\n", " 'fit',\n", " 'seasonally_adjusted_fit',\n", " 'CO2_filled',\n", " 'seasonally_adjusted_filled']" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Assign the new names by position: several raw header names look identical once whitespace is collapsed,\n", "# so a name-based rename would be ambiguous.\n", "raw_data_new = raw_data.copy()\n", "raw_data_new.columns = ['year', 'Month', 'data1', 'data2', 'CO2', 'seasonally_adjusted', 'fit', 'seasonally_adjusted_fit', 'CO2_filled', 'seasonally_adjusted_filled']\n", "list(raw_data_new.columns) " ] }, { "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "We drop the first four rows. The first two continue the header (sub-labels and units), and the next two (January and February 1958) contain no measurements." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "hideCode": false, "hidePrompt": false }, "outputs": [], "source": [ "raw_data_new = raw_data_new.drop([0, 1, 2, 3])" ] }, { "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "We drop the date columns 'data1' and 'data2', which are not needed for our analysis." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": { "hideCode": false, "hidePrompt": false }, "outputs": [], "source": [ "raw_data_new = raw_data_new.drop(columns=['data1', 'data2'])" ] }, { "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "We check that the table contains no null values." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "hideCode": false, "hidePrompt": false }, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "Empty DataFrame\n", "Columns: [year, Month, CO2, seasonally_adjusted, fit, seasonally_adjusted_fit, CO2_filled, seasonally_adjusted_filled]\n", "Index: []" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "raw_data_new[raw_data_new.isnull().any(axis=1)]" ] }, { "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "There are no null values. Next we check the data types:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "hideCode": false, "hidePrompt": false }, "outputs": [ { "data": { "text/plain": [ "year object\n", "Month object\n", "CO2 object\n", "seasonally_adjusted object\n", "fit object\n", "seasonally_adjusted_fit object\n", "CO2_filled object\n", "seasonally_adjusted_filled object\n", "dtype: object" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "raw_data_new.dtypes" ] }, { "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "Every column has dtype 'object', so we convert them to numeric values." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": { "hideCode": false, "hidePrompt": false }, "outputs": [], "source": [ "raw_data_new['year'] = raw_data_new['year'].astype(int)\n", "raw_data_new['Month'] = raw_data_new['Month'].astype(int)\n", "\n", "# Convert every measurement column to float; entries that cannot be parsed become 0.\n", "for column in ['CO2', 'seasonally_adjusted', 'fit', 'seasonally_adjusted_fit', 'CO2_filled', 'seasonally_adjusted_filled']:\n", "    raw_data_new[column] = pd.to_numeric(raw_data_new[column], errors='coerce').fillna(0)\n" ] }, { "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "We check the conversion:\n" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "hideCode": false, "hidePrompt": false }, "outputs": [ { "data": { "text/plain": [ "year int64\n", "Month int64\n", "CO2 float64\n", "seasonally_adjusted float64\n", "fit float64\n", "seasonally_adjusted_fit float64\n", "CO2_filled float64\n", "seasonally_adjusted_filled float64\n", "dtype: object" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "raw_data_new.dtypes" ] }, { "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "The last 7 rows (indices 751 to 757) contain no data, so we remove them." 
] }, { "cell_type": "code", "execution_count": 12, "metadata": { "hideCode": false, "hidePrompt": false }, "outputs": [], "source": [ "raw_data_new = raw_data_new.drop([751, 752, 753, 754, 755, 756, 757])" ] }, { "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "We also remove the rows whose CO2 value is missing (-99.99)." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "hideCode": false, "hidePrompt": false }, "outputs": [], "source": [ "raw_data_new = raw_data_new.drop(raw_data_new[raw_data_new.CO2 < 0].index)" ] }, { "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "We reset the index of the table." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "hideCode": false, "hidePrompt": false }, "outputs": [], "source": [ "raw_data_new = raw_data_new.reset_index(drop=True)" ] }, { "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "## Point 1 - Annual evolution of CO2\n", "The following plot shows a periodic oscillation superimposed on a slower systematic trend." ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "hideCode": false, "hidePrompt": false }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Decimal year at mid-month, so that each monthly sample gets its own x position\n", "x = raw_data_new['year'] + (raw_data_new['Month'] - 0.5) / 12\n", "y1 = raw_data_new['CO2_filled']\n", "y2 = raw_data_new['seasonally_adjusted_filled']\n", "\n", "fig, ax = plt.subplots(figsize=(12, 8))\n", "ax.plot(x, y1, '-b', label='CO2 concentration, monthly (ppm)')\n", "ax.plot(x, y2, '--r', label='CO2 concentration, seasonally adjusted (ppm)')\n", "ax.legend()\n", "plt.show()" ] }, { "attachments": { "Screenshot%202020-07-19%20at%2009.15.53.png": { "image/png": "" } }, "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "## Point 2 - Forecast up to 2025\n", "\n", "\n", "In this section we build a model to forecast the CO2 concentration up to 2025 from the historical data available. \n", "\n", "We can use the least-squares method to identify the linear trend of the curve shown in red in the previous plot. The least-squares method finds the straight line that best fits the data points. This line has the form:\n", "\n", "\n", "\\begin{align}\n", "y=ax+b\n", "\\end{align}\n", "\n", "Least-squares theory gives the coefficients a and b as:\n", "\n", "\\begin{equation}\n", "a=\\frac{N\\sum(xy)-\\sum(x)\\sum(y)}{N\\sum(x^2)-(\\sum x)^2}\n", "\\end{equation}\n", "\n", "and\n", "\n", "\\begin{equation}\n", "b=\\frac{\\sum(y)- a\\sum(x)}{N}\n", "\\end{equation}\n", "\n", "The following page covers this in detail: https://www.mathsisfun.com/data/least-squares-regression.html\n", "\n", "\n", "These expressions can be simplified by centring the time axis on the middle of the historical series (making it 'barycentric'), as shown in the following image:\n", "\n", "\n", "![Screenshot%202020-07-19%20at%2009.15.53.png](attachment:Screenshot%202020-07-19%20at%2009.15.53.png)\n", "\n", "This reduces the complexity of the terms 'a' and 'b', because the sums\n", "\n", "\\begin{equation}\n", "\\sum x\n", "\\end{equation}\n", "\n", "and\n", "\n", "\\begin{equation}\n", "(\\sum x)^2\n", "\\end{equation}\n", "\n", "become zero. We can therefore compute a and b as:\n", "\n", "\\begin{equation}\n", "a=\\frac{\\sum(xy)}{\\sum(x^2)}\n", "\\end{equation}\n", "\n", "and\n", "\n", "\\begin{equation}\n", "b=\\frac{\\sum(y)}{N}\n", "\\end{equation}\n", "\n", "We start with the term 'a'. To do so, we build an array of normalized periods: each month is one normalized period, so there are nominally 744 (12*62) periods, and the array has one entry per row of 'raw_data_new'. \n", "\n", "We now define all the quantities needed to compute a." 
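, "\n", "\n", "As a brief sketch of why centring helps (the index $i$ and the explicit centring formula below are assumptions added for illustration, not taken from the original analysis): numbering the months $i = 0, \\dots, N-1$ and setting\n", "\n", "\\begin{equation}\n", "x_i = i - \\frac{N-1}{2}\n", "\\end{equation}\n", "\n", "gives\n", "\n", "\\begin{equation}\n", "\\sum x = \\frac{N(N-1)}{2} - N\\,\\frac{N-1}{2} = 0\n", "\\end{equation}\n", "\n", "so the product $\\sum(x)\\sum(y)$ in the numerator of a, the term $(\\sum x)^2$ in its denominator and the term $a\\sum(x)$ in b all vanish, leaving\n", "\n", "\\begin{equation}\n", "a=\\frac{N\\sum(xy)}{N\\sum(x^2)}=\\frac{\\sum(xy)}{\\sum(x^2)}, \\qquad b=\\frac{\\sum(y)}{N}\n", "\\end{equation}"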
] }, { "cell_type": "code", "execution_count": 16, "metadata": { "hideCode": false, "hidePrompt": false }, "outputs": [ { "data": { "text/plain": [ "0.1511674880564176" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "N = len(raw_data_new)\n", "\n", "# Month index starting at -368 (offset kept from the original analysis).\n", "# The simplified formulas assume sum(x) == 0, which holds exactly only when N == 737.\n", "x = np.arange(N, dtype=float) - 368\n", "y = raw_data_new['seasonally_adjusted_filled'].values\n", "\n", "sumxy = np.sum(x * y)\n", "sumx2 = np.sum(x * x)  # sum of x^2\n", "\n", "# Slope of the regression line\n", "a = sumxy / sumx2\n", "a" ] }, { "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "The value of a is 0.1511674880564176; it is the slope of the straight line we are fitting. We now compute b." 
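, "\n", "\n", "As a rough check on this slope (a worked conversion added here for illustration, assuming x counts months as in the cell above), the monthly slope can be expressed per year:\n", "\n", "\\begin{equation}\n", "12 \\times 0.1511674880564176 \\approx 1.81 \\text{ ppm per year}\n", "\\end{equation}"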
] }, { "cell_type": "code", "execution_count": 17, "metadata": { "hideCode": false, "hidePrompt": false }, "outputs": [], "source": [ "# Intercept: with a centred x axis it reduces to the mean of y\n", "sumy = np.sum(y)\n", "b = sumy / N\n", "b" ] }, { "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "The fitted line is therefore:\n", "\n", "\\begin{equation}\n", " y=0.1511674880564176 \\times x+355.3829380053908\n", "\\end{equation}\n", "\n", "Here x is a time unit of one month. The 742 months of data extend to 2020. To estimate the CO2 concentration in 2025 we therefore need 5*12 more months, i.e. 60 normalized time units. Adding these 60 units to the 371 that correspond to the CO2 level in 2020 gives 431 normalized time units. At the end of 2025, the CO2 concentration will be:\n", "\n", "\\begin{equation}\n", " y=0.1511674880564176 \\times 431+355.3829380053908\n", "\\end{equation}\n", "\n", "That is, " ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "hideCode": false, "hidePrompt": false }, "outputs": [ { "data": { "text/plain": [ "420.53612535770685" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y2025=(0.1511674880564176*431)+355.3829380053908\n", "\n", "y2025" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "hideCode": false, "hidePrompt": false }, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Append the 2025 forecast to the series and mark it in red\n", "xnew = np.append(x, 431)\n", "ynew = np.append(y, y2025)\n", "plt.plot(xnew, ynew)\n", "plt.plot(431, y2025, marker='o', markersize=3, color=\"red\")" ] }, { "cell_type": "markdown", "metadata": { "hideCode": false, "hidePrompt": false }, "source": [ "The red point marks the CO2 concentration in ppm at the end of 2025." ] } ], "metadata": { "hide_code_all_hidden": false, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }