{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Atmospheric CO2 concentration since 1958" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "import pandas as pd\n", "import isoweek\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "raw_data = pd.read_csv(\"https://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/in_situ_co2/monthly/monthly_in_situ_co2_mlo.csv\", skiprows = 54, sep=r'\\s*,\\s*', engine='python')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The data were downloaded on 11 May 2020. \n", "The first 54 lines of the file are free text (references to cite, a description of the data layout, and so on). We skip them so that pandas can read the rest as a table. " ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Yr Mn Date Date.1 CO2 seasonally fit seasonally.1 \\\n", "0 NaN NaN NaN NaN NaN adjusted NaN adjusted fit \n", "1 NaN NaN Excel NaN [ppm] [ppm] [ppm] [ppm] \n", "2 1958.0 1.0 21200 1958.0411 -99.99 -99.99 -99.99 -99.99 \n", "3 1958.0 2.0 21231 1958.1260 -99.99 -99.99 -99.99 -99.99 \n", "4 1958.0 3.0 21259 1958.2027 315.70 314.44 316.18 314.90 \n", "5 1958.0 4.0 21290 1958.2877 317.46 315.16 317.29 314.98 \n", "6 1958.0 5.0 21320 1958.3699 317.51 314.71 317.86 315.06 \n", "7 1958.0 6.0 21351 1958.4548 -99.99 -99.99 317.24 315.14 \n", "8 1958.0 7.0 21381 1958.5370 315.86 315.19 315.86 315.21 \n", "9 1958.0 8.0 21412 1958.6219 314.93 316.19 313.99 315.28 \n", "10 1958.0 9.0 21443 1958.7068 313.21 316.08 312.45 315.35 \n", "11 1958.0 10.0 21473 1958.7890 -99.99 -99.99 312.43 315.40 \n", "12 1958.0 11.0 21504 1958.8740 313.33 315.20 313.61 315.46 \n", "13 1958.0 12.0 21534 1958.9562 314.67 315.43 314.76 315.51 \n", "14 1959.0 1.0 21565 1959.0411 315.58 315.54 315.62 315.57 \n", "15 1959.0 2.0 21596 1959.1260 316.49 315.86 316.27 315.63 \n", "16 1959.0 3.0 21624 1959.2027 316.65 315.38 316.98 315.69 \n", "17 1959.0 4.0 21655 1959.2877 317.72 315.42 318.09 315.77 \n", "18 1959.0 5.0 21685 1959.3699 318.29 315.49 318.65 315.85 \n", "19 1959.0 6.0 21716 1959.4548 318.15 316.03 318.04 315.94 \n", "20 1959.0 7.0 21746 1959.5370 316.54 315.86 316.67 316.03 \n", "21 1959.0 8.0 21777 1959.6219 314.80 316.06 314.82 316.12 \n", "22 1959.0 9.0 21808 1959.7068 313.84 316.73 313.31 316.22 \n", "23 1959.0 10.0 21838 1959.7890 313.33 316.33 313.32 316.30 \n", "24 1959.0 11.0 21869 1959.8740 314.81 316.68 314.54 316.39 \n", "25 1959.0 12.0 21899 1959.9562 315.58 316.35 315.72 316.47 \n", "26 1960.0 1.0 21930 1960.0410 316.43 316.39 316.61 316.56 \n", "27 1960.0 2.0 21961 1960.1257 316.98 316.35 317.27 316.64 \n", "28 1960.0 3.0 21990 1960.2049 317.58 316.28 318.03 316.71 \n", "29 1960.0 4.0 22021 1960.2896 319.03 316.70 319.14 316.79 \n", ".. ... ... ... ... ... 
... ... ... \n", "728 2018.0 7.0 43296 2018.5370 408.90 408.08 409.44 408.65 \n", "729 2018.0 8.0 43327 2018.6219 407.10 408.63 407.34 408.91 \n", "730 2018.0 9.0 43358 2018.7068 405.59 409.08 405.67 409.19 \n", "731 2018.0 10.0 43388 2018.7890 405.99 409.61 405.85 409.45 \n", "732 2018.0 11.0 43419 2018.8740 408.12 410.38 407.49 409.73 \n", "733 2018.0 12.0 43449 2018.9562 409.23 410.15 409.08 409.99 \n", "734 2019.0 1.0 43480 2019.0411 410.92 410.87 410.31 410.25 \n", "735 2019.0 2.0 43511 2019.1260 411.66 410.90 411.26 410.49 \n", "736 2019.0 3.0 43539 2019.2027 412.00 410.46 412.26 410.70 \n", "737 2019.0 4.0 43570 2019.2877 413.52 410.72 413.75 410.93 \n", "738 2019.0 5.0 43600 2019.3699 414.83 411.42 414.55 411.15 \n", "739 2019.0 6.0 43631 2019.4548 413.96 411.38 413.92 411.37 \n", "740 2019.0 7.0 43661 2019.5370 411.85 411.03 412.37 411.58 \n", "741 2019.0 8.0 43692 2019.6219 410.08 411.62 410.23 411.80 \n", "742 2019.0 9.0 43723 2019.7068 408.55 412.06 408.50 412.03 \n", "743 2019.0 10.0 43753 2019.7890 408.43 412.06 408.63 412.24 \n", "744 2019.0 11.0 43784 2019.8740 410.29 412.56 410.22 412.47 \n", "745 2019.0 12.0 43814 2019.9562 411.85 412.78 411.77 412.68 \n", "746 2020.0 1.0 43845 2020.0410 413.37 413.32 412.96 412.89 \n", "747 2020.0 2.0 43876 2020.1257 414.09 413.33 413.87 413.10 \n", "748 2020.0 3.0 43905 2020.2049 414.51 412.94 414.88 413.29 \n", "749 2020.0 4.0 43936 2020.2896 416.18 413.35 -99.99 -99.99 \n", "750 2020.0 5.0 43966 2020.3716 -99.99 -99.99 -99.99 -99.99 \n", "751 2020.0 6.0 43997 2020.4563 -99.99 -99.99 -99.99 -99.99 \n", "752 2020.0 7.0 44027 2020.5383 -99.99 -99.99 -99.99 -99.99 \n", "753 2020.0 8.0 44058 2020.6230 -99.99 -99.99 -99.99 -99.99 \n", "754 2020.0 9.0 44089 2020.7077 -99.99 -99.99 -99.99 -99.99 \n", "755 2020.0 10.0 44119 2020.7896 -99.99 -99.99 -99.99 -99.99 \n", "756 2020.0 11.0 44150 2020.8743 -99.99 -99.99 -99.99 -99.99 \n", "757 2020.0 12.0 44180 2020.9563 -99.99 -99.99 -99.99 -99.99 \n", "\n", " CO2.1 
seasonally.2 \n", "0 filled adjusted filled \n", "1 [ppm] [ppm] \n", "2 -99.99 -99.99 \n", "3 -99.99 -99.99 \n", "4 315.70 314.44 \n", "5 317.46 315.16 \n", "6 317.51 314.71 \n", "7 317.24 315.14 \n", "8 315.86 315.19 \n", "9 314.93 316.19 \n", "10 313.21 316.08 \n", "11 312.43 315.40 \n", "12 313.33 315.20 \n", "13 314.67 315.43 \n", "14 315.58 315.54 \n", "15 316.49 315.86 \n", "16 316.65 315.38 \n", "17 317.72 315.42 \n", "18 318.29 315.49 \n", "19 318.15 316.03 \n", "20 316.54 315.86 \n", "21 314.80 316.06 \n", "22 313.84 316.73 \n", "23 313.33 316.33 \n", "24 314.81 316.68 \n", "25 315.58 316.35 \n", "26 316.43 316.39 \n", "27 316.98 316.35 \n", "28 317.58 316.28 \n", "29 319.03 316.70 \n", ".. ... ... \n", "728 408.90 408.08 \n", "729 407.10 408.63 \n", "730 405.59 409.08 \n", "731 405.99 409.61 \n", "732 408.12 410.38 \n", "733 409.23 410.15 \n", "734 410.92 410.87 \n", "735 411.66 410.90 \n", "736 412.00 410.46 \n", "737 413.52 410.72 \n", "738 414.83 411.42 \n", "739 413.96 411.38 \n", "740 411.85 411.03 \n", "741 410.08 411.62 \n", "742 408.55 412.06 \n", "743 408.43 412.06 \n", "744 410.29 412.56 \n", "745 411.85 412.78 \n", "746 413.37 413.32 \n", "747 414.09 413.33 \n", "748 414.51 412.94 \n", "749 416.18 413.35 \n", "750 -99.99 -99.99 \n", "751 -99.99 -99.99 \n", "752 -99.99 -99.99 \n", "753 -99.99 -99.99 \n", "754 -99.99 -99.99 \n", "755 -99.99 -99.99 \n", "756 -99.99 -99.99 \n", "757 -99.99 -99.99 \n", "\n", "[758 rows x 10 columns]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "raw_data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first two rows hold column sub-labels and units rather than values, so we drop them from the table for now."
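, "

As a minimal sketch of this step, assuming a small hand-built table (`raw_sample` and its values are illustrative stand-ins, not the real file), dropping the two label rows by position and then converting the text column might look like this:

```python
import pandas as pd

# Hypothetical miniature of raw_data: two label/unit rows, then two data rows.
# Because of the '[ppm]' entry, the CO2 column holds text rather than numbers.
raw_sample = pd.DataFrame({'Yr': [None, None, 1958.0, 1958.0],
                           'Mn': [None, None, 2.0, 3.0],
                           'CO2': [None, '[ppm]', '-99.99', '315.70']})

# Drop the two non-data rows by position; copy so later assignments are safe.
sample = raw_sample.iloc[2:].copy()

# Convert the remaining text values to floats.
sample['CO2'] = pd.to_numeric(sample['CO2'])

print(sample)
```

The original index labels (2 and 3 here) are kept after the slice, which is why the real table starts at label 2 in the later outputs.
"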
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "data = raw_data.iloc[2:]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this dataset the first 4 columns are dates, and only column 5 contains raw measurements. We keep only the year, the month, and the raw measured value." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# Keep columns 0 (Yr), 1 (Mn) and 4 (CO2); copy so later assignments do not act on a view of data\n", "useful_data = data.iloc[:, [0, 1, 4]].copy()\n", "#useful_data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We check that the data have an appropriate type." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 1958.0\n", " 2.0\n", " -99.99\n" ] } ], "source": [ "print(type(useful_data['Yr'][3]), useful_data['Yr'][3])\n", "print(type(useful_data['Mn'][3]), useful_data['Mn'][3])\n", "print(type(useful_data['CO2'][3]), useful_data['CO2'][3])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The third column is not interpreted correctly: its values are read as text rather than numbers. The likely cause is the units row (`[ppm]`) at the top of the column rather than the '-' sign. We convert the column explicitly." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "useful_data['CO2'] = useful_data['CO2'].astype(float)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The notes shipped with the file state that missing values are encoded as -99.99. We therefore want to drop every row containing this value."
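, "

A minimal sketch of two common ways to drop sentinel rows, assuming a small hypothetical table (`sample` and its values are illustrative, not the real data):

```python
import numpy as np
import pandas as pd

# Hypothetical sample using the file's -99.99 code for missing measurements.
sample = pd.DataFrame({'Yr': [1958.0, 1958.0, 1958.0, 1958.0],
                       'Mn': [2.0, 3.0, 4.0, 6.0],
                       'CO2': [-99.99, 315.70, 317.46, -99.99]})

# Option 1: keep only the rows whose CO2 value is not the sentinel.
kept = sample[sample['CO2'] != -99.99]

# Option 2: turn the sentinel into NaN first, then drop the incomplete rows.
cleaned = sample.replace(-99.99, np.nan).dropna(subset=['CO2'])

print(kept.index.tolist())
```

Both options leave the same two rows. The second keeps the missing values visible as NaN until the final `dropna`, which can help if gaps need to be counted or interpolated instead of removed.
"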
] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[2, 3, 7, 11, 75, 76, 77, 750, 751, 752, 753, 754, 755, 756, 757]\n" ] } ], "source": [ "# Index labels of the rows whose CO2 value is the -99.99 missing-value sentinel\n", "liste = useful_data.index[useful_data['CO2'] == -99.99].tolist()\n", "print(liste)\n", "useful_data.drop(liste, inplace=True)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Yr Mn CO2\n", "4 1958.0 3.0 315.70\n", "5 1958.0 4.0 317.46\n", "6 1958.0 5.0 317.51\n", "8 1958.0 7.0 315.86\n", "9 1958.0 8.0 314.93\n", "10 1958.0 9.0 313.21\n", "12 1958.0 11.0 313.33\n", "13 1958.0 12.0 314.67\n", "14 1959.0 1.0 315.58\n", "15 1959.0 2.0 316.49\n", "16 1959.0 3.0 316.65\n", "17 1959.0 4.0 317.72\n", "18 1959.0 5.0 318.29\n", "19 1959.0 6.0 318.15\n", "20 1959.0 7.0 316.54\n", "21 1959.0 8.0 314.80\n", "22 1959.0 9.0 313.84\n", "23 1959.0 10.0 313.33\n", "24 1959.0 11.0 314.81\n", "25 1959.0 12.0 315.58\n", "26 1960.0 1.0 316.43\n", "27 1960.0 2.0 316.98\n", "28 1960.0 3.0 317.58\n", "29 1960.0 4.0 319.03\n", "30 1960.0 5.0 320.04\n", "31 1960.0 6.0 319.58\n", "32 1960.0 7.0 318.18\n", "33 1960.0 8.0 315.90\n", "34 1960.0 9.0 314.17\n", "35 1960.0 10.0 313.83\n", ".. ... ... ...\n", "720 2017.0 11.0 405.17\n", "721 2017.0 12.0 406.75\n", "722 2018.0 1.0 408.05\n", "723 2018.0 2.0 408.34\n", "724 2018.0 3.0 409.25\n", "725 2018.0 4.0 410.30\n", "726 2018.0 5.0 411.30\n", "727 2018.0 6.0 410.88\n", "728 2018.0 7.0 408.90\n", "729 2018.0 8.0 407.10\n", "730 2018.0 9.0 405.59\n", "731 2018.0 10.0 405.99\n", "732 2018.0 11.0 408.12\n", "733 2018.0 12.0 409.23\n", "734 2019.0 1.0 410.92\n", "735 2019.0 2.0 411.66\n", "736 2019.0 3.0 412.00\n", "737 2019.0 4.0 413.52\n", "738 2019.0 5.0 414.83\n", "739 2019.0 6.0 413.96\n", "740 2019.0 7.0 411.85\n", "741 2019.0 8.0 410.08\n", "742 2019.0 9.0 408.55\n", "743 2019.0 10.0 408.43\n", "744 2019.0 11.0 410.29\n", "745 2019.0 12.0 411.85\n", "746 2020.0 1.0 413.37\n", "747 2020.0 2.0 414.09\n", "748 2020.0 3.0 414.51\n", "749 2020.0 4.0 416.18\n", "\n", "[741 rows x 3 columns]" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "useful_data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now want to convert the year and month into a format better suited to pandas and use it as the index. 
One possible approach is shown here: combine the two values into a single integer (year*100 + month), then apply a function that converts it to a pandas `Period`." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "useful_data['period'] = useful_data['Yr']*100 + useful_data['Mn']" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "useful_data['period'] = useful_data['period'].astype(int)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "# Keep only columns 2 (CO2) and 3 (period)\n", "useful_data = useful_data.iloc[:, [2, 3]].copy()" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " CO2\n", "period \n", "1958-03 315.70\n", "1958-04 317.46\n", "1958-05 317.51\n", "1958-07 315.86\n", "1958-08 314.93\n", "1958-09 313.21\n", "1958-11 313.33\n", "1958-12 314.67\n", "1959-01 315.58\n", "1959-02 316.49\n", "1959-03 316.65\n", "1959-04 317.72\n", "1959-05 318.29\n", "1959-06 318.15\n", "1959-07 316.54\n", "1959-08 314.80\n", "1959-09 313.84\n", "1959-10 313.33\n", "1959-11 314.81\n", "1959-12 315.58\n", "1960-01 316.43\n", "1960-02 316.98\n", "1960-03 317.58\n", "1960-04 319.03\n", "1960-05 320.04\n", "1960-06 319.58\n", "1960-07 318.18\n", "1960-08 315.90\n", "1960-09 314.17\n", "1960-10 313.83\n", "... ...\n", "2017-11 405.17\n", "2017-12 406.75\n", "2018-01 408.05\n", "2018-02 408.34\n", "2018-03 409.25\n", "2018-04 410.30\n", "2018-05 411.30\n", "2018-06 410.88\n", "2018-07 408.90\n", "2018-08 407.10\n", "2018-09 405.59\n", "2018-10 405.99\n", "2018-11 408.12\n", "2018-12 409.23\n", "2019-01 410.92\n", "2019-02 411.66\n", "2019-03 412.00\n", "2019-04 413.52\n", "2019-05 414.83\n", "2019-06 413.96\n", "2019-07 411.85\n", "2019-08 410.08\n", "2019-09 408.55\n", "2019-10 408.43\n", "2019-11 410.29\n", "2019-12 411.85\n", "2020-01 413.37\n", "2020-02 414.09\n", "2020-03 414.51\n", "2020-04 416.18\n", "\n", "[741 rows x 1 columns]" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def convertIntoPeriod(anneeEtMois):\n", "    # anneeEtMois is an integer of the form YYYYMM\n", "    y = int(anneeEtMois // 100)\n", "    m = int(anneeEtMois % 100)\n", "    return pd.Period(pd.Timestamp(y, m, 1), 'M')\n", "useful_data['period'] = [convertIntoPeriod(date) for date in useful_data['period']]\n", "# set_index returns a new DataFrame, which is only displayed here;\n", "# useful_data itself keeps its integer index and its 'period' column\n", "useful_data.set_index('period')" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "useful_data['CO2'].plot()" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "useful_data['CO2'][-60:].plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "At first glance we see an overall increase, with fairly regular oscillations: local minima in September/October and local maxima in May/June." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To characterize the overall growth of the atmospheric CO2 concentration, we overlay linear and exponential trend curves on the plot and see which one fits better." ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Position of each measurement in the table (0, 1, 2, ...), used as the x variable\n", "x = np.arange(len(useful_data.index))\n", "# Linear fit: CO2 ~ a1*x + b1\n", "a1, b1 = np.polyfit(x, useful_data['CO2'], 1)\n", "# Exponential fit via a linear fit on log(CO2): CO2 ~ exp(b2)*exp(a2*x)\n", "a2, b2 = np.polyfit(x, np.log(useful_data['CO2']), 1)\n", "fit_data = a1*x + b1\n", "fit_dataExp = np.exp(b2)*np.exp(a2*x)\n", "useful_data['CO2'].plot()\n", "plt.plot(x, fit_data)\n", "plt.plot(x, fit_dataExp)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Neither trend curve is satisfactory: they do not follow the data well. We try a degree-2 polynomial trend instead." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "x = np.arange(len(useful_data.index))\n", "# Quadratic fit: CO2 ~ a3*x**2 + b3*x + c3\n", "a3, b3, c3 = np.polyfit(x, useful_data['CO2'], 2)\n", "fit_dataCarre = a3*x*x + b3*x + c3\n", "useful_data['CO2'].plot()\n", "plt.plot(x, fit_dataCarre)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This trend curve looks better suited to giving sensible average values. We now want to extrapolate up to 2025. Rather than monthly values, yearly averages are more relevant here.\n", "To get them, we integrate the fitted polynomial (coefficients a3, b3, c3) over the interval of interest and divide by the length of that interval. " ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "749 2020-04\n", "Name: period, dtype: object\n" ] } ], "source": [ "# Mean value for 2020\n", "borne1 = useful_data['period'][-1:]\n", "print(borne1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }