{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Smoking and mortality: a 20-year follow-up of women" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The data for this study were downloaded in CSV format on 8 January 2020 via [the MOOC link](https://gitlab.inria.fr/learninglab/mooc-rr/mooc-rr-ressources/blob/master/module3/Practical_session/Subject6_smoking.csv).\n", "Each row records whether the woman smoked, whether she was alive or dead at the time of the second survey, and her age at the first survey." ] }, { "cell_type": "code", "execution_count": 87, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pandas as pd\n" ] }, { "cell_type": "code", "execution_count": 88, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Smoker Status Age\n", "0 Yes Alive 21.0\n", "1 Yes Alive 19.3\n", "2 No Dead 57.5\n", "3 No Alive 47.1\n", "4 Yes Alive 81.4\n", "5 No Alive 36.8\n", "6 No Alive 23.8\n", "7 Yes Dead 57.5\n", "8 Yes Alive 24.8\n", "9 Yes Alive 49.5\n", "10 Yes Alive 30.0\n", "11 No Dead 66.0\n", "12 Yes Alive 49.2\n", "13 No Alive 58.4\n", "14 No Dead 60.6\n", "15 No Alive 25.1\n", "16 No Alive 43.5\n", "17 No Alive 27.1\n", "18 No Alive 58.3\n", "19 Yes Alive 65.7\n", "20 No Dead 73.2\n", "21 Yes Alive 38.3\n", "22 No Alive 33.4\n", "23 Yes Dead 62.3\n", "24 No Alive 18.0\n", "25 No Alive 56.2\n", "26 Yes Alive 59.2\n", "27 No Alive 25.8\n", "28 No Dead 36.9\n", "29 No Alive 20.2\n", "... ... ... ...\n", "1284 Yes Dead 36.0\n", "1285 Yes Alive 48.3\n", "1286 No Alive 63.1\n", "1287 No Alive 60.8\n", "1288 Yes Dead 39.3\n", "1289 No Alive 36.7\n", "1290 No Alive 63.8\n", "1291 No Dead 71.3\n", "1292 No Alive 57.7\n", "1293 No Alive 63.2\n", "1294 No Alive 46.6\n", "1295 Yes Dead 82.4\n", "1296 Yes Alive 38.3\n", "1297 Yes Alive 32.7\n", "1298 No Alive 39.7\n", "1299 Yes Dead 60.0\n", "1300 No Dead 71.0\n", "1301 No Alive 20.5\n", "1302 No Alive 44.4\n", "1303 Yes Alive 31.2\n", "1304 Yes Alive 47.8\n", "1305 Yes Alive 60.9\n", "1306 No Dead 61.4\n", "1307 Yes Alive 43.0\n", "1308 No Alive 42.1\n", "1309 Yes Alive 35.9\n", "1310 No Alive 22.3\n", "1311 Yes Dead 62.1\n", "1312 No Dead 88.6\n", "1313 No Alive 39.1\n", "\n", "[1314 rows x 3 columns]" ] }, "execution_count": 88, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ini_data = pd.read_csv('Subject6_smoking.csv')\n", "ini_data" ] }, { "cell_type": "code", "execution_count": 89, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Empty DataFrame\n", "Columns: [Smoker, Status, Age]\n", "Index: []" ] }, "execution_count": 89, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ini_data[ini_data.isnull().any(axis=1)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Are there any missing values in this dataset? No: selecting the rows that contain at least one null value returns an empty DataFrame." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The table below gives the number of women alive and dead at the end of the period, by smoking status. For each group (smokers / non-smokers), the mortality rate (the number of women who died in the group divided by the total number of women in that group) is also computed, in percent." ] }, { "cell_type": "code", "execution_count": 90, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Status Alive Dead taux_mortalité\n", "Smoker \n", "No 502 230 31.420765\n", "Yes 443 139 23.883162" ] }, "execution_count": 90, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Contingency table: smoking status (rows) against vital status (columns)\n", "raw_data = pd.crosstab(index=ini_data['Smoker'], columns=ini_data['Status'])\n", "\n", "# Mortality rate in percent: deaths divided by group size\n", "raw_data['taux_mortalité'] = 100 * raw_data['Dead'] / (raw_data['Dead'] + raw_data['Alive'])\n", "raw_data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These results are surprising: more of the women who did not smoke at the first survey had died by the second (230 non-smokers) than of those who smoked (139 smokers). After 20 years, the mortality rate is also higher among non-smokers (31%) than among smokers (24%). So does not smoking kill?\n", "\n", "To refine the analysis, the counts and mortality rates are recomputed with an additional breakdown by age group. The age groups considered are 18-34, 35-54, 55-64, and over 64." ] }, { "cell_type": "code", "execution_count": 91, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Smoker Status Age\n", "0 Yes Alive 21.0\n", "1 Yes Alive 19.3\n", "6 No Alive 23.8\n", "8 Yes Alive 24.8\n", "10 Yes Alive 30.0\n", "15 No Alive 25.1\n", "17 No Alive 27.1\n", "22 No Alive 33.4\n", "24 No Alive 18.0\n", "27 No Alive 25.8\n", "29 No Alive 20.2\n", "33 No Alive 19.4\n", "37 Yes Alive 29.5\n", "38 Yes Dead 33.0\n", "44 No Alive 25.3\n", "47 No Alive 18.5\n", "49 Yes Alive 22.1\n", "54 No Alive 28.4\n", "58 No Alive 22.9\n", "65 Yes Alive 33.0\n", "67 Yes Alive 27.9\n", "71 Yes Alive 26.2\n", "76 No Alive 27.6\n", "77 Yes Alive 31.4\n", "79 No Alive 18.9\n", "81 Yes Alive 25.4\n", "84 No Alive 27.3\n", "86 No Alive 32.8\n", "91 No Alive 18.3\n", "92 Yes Alive 20.2\n", "... ... ... ...\n", "1205 No Alive 23.2\n", "1207 Yes Alive 31.4\n", "1208 Yes Alive 30.0\n", "1213 No Alive 21.4\n", "1216 Yes Alive 27.9\n", "1217 Yes Alive 29.5\n", "1219 Yes Alive 27.0\n", "1223 Yes Alive 28.3\n", "1226 Yes Alive 31.0\n", "1232 No Alive 28.3\n", "1240 Yes Alive 29.7\n", "1247 No Alive 26.0\n", "1250 No Alive 19.8\n", "1251 Yes Alive 27.8\n", "1253 Yes Alive 27.8\n", "1255 No Dead 28.5\n", "1256 No Alive 26.7\n", "1260 Yes Alive 20.4\n", "1263 Yes Alive 20.9\n", "1265 No Alive 26.7\n", "1267 No Alive 33.7\n", "1271 Yes Alive 24.9\n", "1272 No Alive 33.0\n", "1274 No Alive 25.7\n", "1275 No Alive 19.5\n", "1277 No Alive 23.4\n", "1297 Yes Alive 32.7\n", "1301 No Alive 20.5\n", "1303 Yes Alive 31.2\n", "1310 No Alive 22.3\n", "\n", "[400 rows x 3 columns]" ] }, "execution_count": 91, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Keep only women aged 18 to 34: drop rows with Age > 34\n", "dfini_18_34 = ini_data.drop(ini_data[ini_data.Age > 34].index)\n", "# Keep only women aged 35 to 54: drop rows with Age <= 34 or Age > 54\n", "dfini_35_54 = ini_data.drop(ini_data[(ini_data.Age <= 34) | (ini_data.Age > 54)].index)\n", "# Keep only women aged 55 to 64: drop rows with Age <= 54 or Age > 64\n", "dfini_55_64 = ini_data.drop(ini_data[(ini_data.Age <= 54) | (ini_data.Age > 64)].index)\n", "# Keep only women over 64: drop rows with Age <= 64\n", "dfini_65 = ini_data.drop(ini_data[ini_data.Age <= 64].index)\n", "\n", "dfini_18_34" ] }, { "cell_type": "code", "execution_count": 92, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Status Alive Dead taux_mortalité\n", "Smoker \n", "No 213 6 2.739726\n", "Yes 176 5 2.762431" ] }, "execution_count": 92, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Contingency table and mortality rate (in percent) for the 18-34 age group\n", "df_18_34 = pd.crosstab(index=dfini_18_34['Smoker'], columns=dfini_18_34['Status'])\n", "df_18_34['taux_mortalité'] = 100 * df_18_34['Dead'] / (df_18_34['Dead'] + df_18_34['Alive'])\n", "\n", "df_18_34" ] }, { "cell_type": "code", "execution_count": 93, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Status Alive Dead taux_mortalité\n", "Smoker \n", "No 180 19 9.547739\n", "Yes 196 41 17.299578" ] }, "execution_count": 93, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Contingency table and mortality rate (in percent) for the 35-54 age group\n", "df_35_54 = pd.crosstab(index=dfini_35_54['Smoker'], columns=dfini_35_54['Status'])\n", "df_35_54['taux_mortalité'] = 100 * df_35_54['Dead'] / (df_35_54['Dead'] + df_35_54['Alive'])\n", "\n", "df_35_54" ] }, { "cell_type": "code", "execution_count": 94, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Status Alive Dead taux_mortalité\n", "Smoker \n", "No 81 40 33.057851\n", "Yes 64 51 44.347826" ] }, "execution_count": 94, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Contingency table and mortality rate (in percent) for the 55-64 age group\n", "df_55_64 = pd.crosstab(index=dfini_55_64['Smoker'], columns=dfini_55_64['Status'])\n", "df_55_64['taux_mortalité'] = 100 * df_55_64['Dead'] / (df_55_64['Dead'] + df_55_64['Alive'])\n", "\n", "df_55_64" ] }, { "cell_type": "code", "execution_count": 95, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Status Alive Dead taux_mortalité\n", "Smoker \n", "No 28 165 85.492228\n", "Yes 7 42 85.714286" ] }, "execution_count": 95, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Contingency table and mortality rate (in percent) for women over 64\n", "df_65 = pd.crosstab(index=dfini_65['Smoker'], columns=dfini_65['Status'])\n", "df_65['taux_mortalité'] = 100 * df_65['Dead'] / (df_65['Dead'] + df_65['Alive'])\n", "\n", "df_65" ] }, { "cell_type": "code", "execution_count": 96, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }, { "data": { "text/plain": [ "" ] }, "execution_count": 96, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Overall mortality rate for each smoking status\n", "plt.title(\"Mortality rate after 20 years - Total\")\n", "x_tot = [\"Non-smokers\", \"Smokers\"]\n", "y_tot = [raw_data['taux_mortalité'].iloc[0], raw_data['taux_mortalité'].iloc[1]]\n", "plt.ylim([0, 100])\n", "plt.bar(x_tot, y_tot, color=[\"green\", \"red\"])\n", "plt.ylabel(\"Mortality rate (%)\")\n", "plt.show()\n", "\n", "# Mortality rate by age group: row 0 is non-smokers, row 1 is smokers\n", "labels = ['18-34', '35-54', '55-64', '65+']\n", "tables = [df_18_34, df_35_54, df_55_64, df_65]\n", "y_nf = [t['taux_mortalité'].iloc[0] for t in tables]\n", "y_f = [t['taux_mortalité'].iloc[1] for t in tables]\n", "\n", "x = np.arange(len(labels))  # the label locations\n", "width = 0.35  # the width of the bars\n", "\n", "fig, ax = plt.subplots()\n", "rects1 = ax.bar(x - width/2, y_nf, width, color=\"green\", label='Non-smokers')\n", "rects2 = ax.bar(x + width/2, y_f, width, color=\"red\", label='Smokers')\n", "\n", "ax.set_ylabel('Mortality rate (%)')\n", "ax.set_title(\"Mortality rate by age group\")\n", "ax.set_xticks(x)\n", "ax.set_ylim([0, 100])\n", "ax.set_xticklabels(labels)\n", "ax.legend()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "According to the plots above, when we reason by age group, mortality among smokers was higher than among non-smokers in every group (only marginally so in the 18-34 and 65+ groups). That is reassuring, but how can the figures reverse when everyone is pooled together?\n", "\n", "In fact, the initial population contained more older women among the non-smokers than among the smokers. Even though non-smokers die less within each age group, this effect is outweighed by the fact that the oldest age group is over-represented among non-smokers, who therefore die more on average!" 
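, "\n", "\n", "The reversal can be checked directly from the counts in the tables above. What follows is a minimal sketch rather than part of the original analysis: the names counts, by_age, age_share and overall are illustrative, and the figures are copied by hand from the per-age-group outputs of this notebook. It shows that each overall rate is the average of the per-age-group rates, weighted by the share of each age group among smokers or among non-smokers.

```python
import pandas as pd

# Counts copied by hand from the per-age-group tables of this notebook
# (alive / dead for each smoking status). Column names are illustrative.
counts = pd.DataFrame(
    [
        ('18-34', 'No', 213, 6),
        ('18-34', 'Yes', 176, 5),
        ('35-54', 'No', 180, 19),
        ('35-54', 'Yes', 196, 41),
        ('55-64', 'No', 81, 40),
        ('55-64', 'Yes', 64, 51),
        ('65+', 'No', 28, 165),
        ('65+', 'Yes', 7, 42),
    ],
    columns=['age_group', 'smoker', 'alive', 'dead'],
)
counts['total'] = counts['alive'] + counts['dead']
counts['rate'] = 100 * counts['dead'] / counts['total']

# Weight of each age group within its own smoking status.
counts['weight'] = counts['total'] / counts.groupby('smoker')['total'].transform('sum')

# Per-age-group mortality rates and age structure, one column per status.
by_age = counts.pivot(index='age_group', columns='smoker', values='rate')
age_share = counts.pivot(index='age_group', columns='smoker', values='weight')

# Overall rate = sum over age groups of weight * per-group rate.
overall = (counts['weight'] * counts['rate']).groupby(counts['smoker']).sum()

print(by_age.round(1))
print(age_share.round(3))
print(overall.round(1))
```

With these counts, about 26% of the non-smokers are over 64, against about 8% of the smokers, which is why the pooled non-smoker rate is pulled towards the very high mortality of the oldest group."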
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }