From 0b9d5da99f4e1b7dc0e3cf72403a4018712969ad Mon Sep 17 00:00:00 2001 From: 264af2e9a1e4e844f861df089a5604e3 <264af2e9a1e4e844f861df089a5604e3@app-learninglab.inria.fr> Date: Wed, 30 Oct 2024 14:31:39 +0000 Subject: [PATCH] no commit message --- module3/exo2/exerciceTabac.ipynb | 453 +++++++++++++++++-------------- 1 file changed, 247 insertions(+), 206 deletions(-) diff --git a/module3/exo2/exerciceTabac.ipynb b/module3/exo2/exerciceTabac.ipynb index b665ffe..2a3553f 100644 --- a/module3/exo2/exerciceTabac.ipynb +++ b/module3/exo2/exerciceTabac.ipynb @@ -42,6 +42,13 @@ "3. Etablir une régression logistique en introduisant un variable Death valant 1 ou 0 si la personne est morte ou pas au cours des 20 années entre les 2 sondages. Conclure." ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Etape 1 :" + ] + }, { "cell_type": "markdown", "metadata": { @@ -49,7 +56,7 @@ "hidePrompt": true }, "source": [ - "Tout d'abord, il faut commencer par inclure les bibliothèques dont on aura besoin." + "Tout d'abord, il faut commencer par inclure les bibliothèques dont nous aurons besoin." 
] }, { @@ -565,7 +572,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - " Création de 2 tableaux à partir du contenu du fichier csv :\n", + " Création de 2 \"tableaux\" à partir du contenu du fichier csv :\n", " *nonFumeuses* contient les données des personnes qui ne fument pas (qui ont \"No\" dans la colonne \"Smoker\")\n", " et *fumeuses* contient les données des personnes qui fument (qui ont \"Yes\" dans la colonne \"Smoker\")" ] @@ -585,7 +592,7 @@ }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 21, "metadata": {}, "outputs": [ { @@ -616,184 +623,184 @@ " \n", " \n", " \n", - " 915\n", + " 0\n", " Yes\n", " Alive\n", - " 47.4\n", + " 21.0\n", " \n", " \n", - " 1095\n", + " 1\n", " Yes\n", " Alive\n", - " 30.2\n", + " 19.3\n", " \n", " \n", - " 941\n", + " 4\n", " Yes\n", " Alive\n", - " 30.4\n", + " 81.4\n", " \n", " \n", - " 1311\n", + " 7\n", " Yes\n", " Dead\n", - " 62.1\n", + " 57.5\n", " \n", " \n", - " 1091\n", + " 8\n", " Yes\n", " Alive\n", - " 60.0\n", + " 24.8\n", " \n", " \n", - " 945\n", + " 9\n", " Yes\n", " Alive\n", - " 60.2\n", + " 49.5\n", " \n", " \n", - " 1148\n", + " 10\n", " Yes\n", " Alive\n", - " 50.6\n", + " 30.0\n", " \n", " \n", - " 1307\n", + " 12\n", " Yes\n", " Alive\n", - " 43.0\n", - " \n", - " \n", - " 913\n", - " Yes\n", - " Dead\n", - " 84.4\n", + " 49.2\n", " \n", " \n", - " 946\n", + " 19\n", " Yes\n", " Alive\n", - " 25.0\n", + " 65.7\n", " \n", " \n", - " 1276\n", + " 21\n", " Yes\n", " Alive\n", - " 58.5\n", + " 38.3\n", " \n", " \n", - " 950\n", + " 23\n", " Yes\n", " Dead\n", - " 43.3\n", + " 62.3\n", " \n", " \n", - " 1309\n", + " 26\n", " Yes\n", " Alive\n", - " 35.9\n", + " 59.2\n", " \n", " \n", - " 947\n", + " 30\n", " Yes\n", - " Dead\n", - " 37.1\n", + " Alive\n", + " 34.6\n", " \n", " \n", - " 1125\n", + " 31\n", " Yes\n", " Alive\n", - " 57.2\n", + " 51.9\n", " \n", " \n", - " 948\n", + " 32\n", " Yes\n", " Alive\n", - " 47.7\n", + " 49.9\n", " \n", " \n", - " 1093\n", + " 35\n", 
" Yes\n", - " Dead\n", - " 84.3\n", + " Alive\n", + " 46.7\n", " \n", " \n", - " 911\n", + " 36\n", " Yes\n", " Alive\n", - " 38.6\n", + " 44.4\n", " \n", " \n", - " 1273\n", + " 37\n", " Yes\n", " Alive\n", - " 55.7\n", + " 29.5\n", " \n", " \n", - " 1149\n", + " 38\n", " Yes\n", - " Alive\n", - " 21.5\n", + " Dead\n", + " 33.0\n", " \n", " \n", - " 1111\n", + " 39\n", " Yes\n", " Alive\n", - " 41.9\n", + " 35.6\n", " \n", " \n", - " 1127\n", + " 40\n", " Yes\n", " Alive\n", - " 32.5\n", + " 39.1\n", " \n", " \n", - " 1115\n", + " 42\n", " Yes\n", - " Dead\n", - " 63.3\n", + " Alive\n", + " 35.7\n", " \n", " \n", - " 1114\n", + " 46\n", " Yes\n", " Dead\n", - " 31.3\n", + " 44.3\n", " \n", " \n", - " 1143\n", + " 48\n", " Yes\n", " Alive\n", - " 26.6\n", + " 37.5\n", " \n", " \n", - " 1102\n", + " 49\n", " Yes\n", " Alive\n", - " 29.7\n", + " 22.1\n", " \n", " \n", - " 1142\n", + " 53\n", " Yes\n", - " Dead\n", - " 71.0\n", + " Alive\n", + " 39.0\n", " \n", " \n", - " 1140\n", + " 56\n", " Yes\n", " Alive\n", - " 42.3\n", + " 40.1\n", " \n", " \n", - " 1288\n", + " 60\n", " Yes\n", - " Dead\n", - " 39.3\n", + " Alive\n", + " 58.1\n", " \n", " \n", - " 1132\n", + " 61\n", " Yes\n", " Alive\n", - " 18.0\n", + " 37.3\n", + " \n", + " \n", + " 63\n", + " Yes\n", + " Dead\n", + " 36.3\n", " \n", " \n", " ...\n", @@ -802,184 +809,184 @@ " ...\n", " \n", " \n", - " 649\n", + " 1240\n", " Yes\n", " Alive\n", - " 36.9\n", + " 29.7\n", " \n", " \n", - " 650\n", + " 1243\n", " Yes\n", - " Dead\n", - " 81.8\n", + " Alive\n", + " 40.1\n", " \n", " \n", - " 611\n", + " 1251\n", " Yes\n", " Alive\n", - " 43.4\n", + " 27.8\n", " \n", " \n", - " 608\n", + " 1252\n", " Yes\n", " Alive\n", - " 23.5\n", + " 52.4\n", " \n", " \n", - " 605\n", + " 1253\n", " Yes\n", " Alive\n", - " 59.0\n", + " 27.8\n", " \n", " \n", - " 604\n", + " 1254\n", " Yes\n", " Alive\n", - " 43.8\n", + " 41.0\n", " \n", " \n", - " 554\n", + " 1259\n", " Yes\n", " Alive\n", - " 21.3\n", + " 40.8\n", " \n", " 
\n", - " 555\n", + " 1260\n", " Yes\n", - " Dead\n", - " 76.9\n", + " Alive\n", + " 20.4\n", " \n", " \n", - " 558\n", + " 1263\n", " Yes\n", - " Dead\n", - " 75.2\n", + " Alive\n", + " 20.9\n", " \n", " \n", - " 560\n", + " 1264\n", " Yes\n", " Alive\n", - " 53.0\n", + " 45.5\n", " \n", " \n", - " 562\n", + " 1269\n", " Yes\n", " Alive\n", - " 43.7\n", + " 38.8\n", " \n", " \n", - " 563\n", + " 1270\n", " Yes\n", " Alive\n", - " 50.9\n", + " 55.5\n", " \n", " \n", - " 565\n", + " 1271\n", " Yes\n", " Alive\n", - " 32.8\n", + " 24.9\n", " \n", " \n", - " 566\n", + " 1273\n", " Yes\n", " Alive\n", - " 50.7\n", + " 55.7\n", " \n", " \n", - " 567\n", + " 1276\n", " Yes\n", - " Dead\n", - " 66.1\n", + " Alive\n", + " 58.5\n", " \n", " \n", - " 569\n", + " 1278\n", " Yes\n", " Alive\n", - " 27.2\n", + " 43.7\n", " \n", " \n", - " 548\n", + " 1282\n", " Yes\n", " Alive\n", - " 62.1\n", + " 51.2\n", + " \n", + " \n", + " 1284\n", + " Yes\n", + " Dead\n", + " 36.0\n", " \n", " \n", - " 571\n", + " 1285\n", " Yes\n", " Alive\n", - " 38.1\n", + " 48.3\n", " \n", " \n", - " 573\n", + " 1288\n", " Yes\n", " Dead\n", - " 55.2\n", + " 39.3\n", " \n", " \n", - " 575\n", + " 1295\n", " Yes\n", - " Alive\n", - " 50.9\n", + " Dead\n", + " 82.4\n", " \n", " \n", - " 580\n", + " 1296\n", " Yes\n", " Alive\n", - " 42.5\n", + " 38.3\n", " \n", " \n", - " 583\n", + " 1297\n", " Yes\n", " Alive\n", - " 26.6\n", + " 32.7\n", " \n", " \n", - " 584\n", + " 1299\n", " Yes\n", - " Alive\n", - " 23.3\n", + " Dead\n", + " 60.0\n", " \n", " \n", - " 587\n", + " 1303\n", " Yes\n", " Alive\n", - " 34.8\n", + " 31.2\n", " \n", " \n", - " 589\n", + " 1304\n", " Yes\n", " Alive\n", - " 28.2\n", + " 47.8\n", " \n", " \n", - " 591\n", + " 1305\n", " Yes\n", " Alive\n", - " 38.5\n", + " 60.9\n", " \n", " \n", - " 592\n", + " 1307\n", " Yes\n", " Alive\n", - " 41.0\n", + " 43.0\n", " \n", " \n", - " 595\n", + " 1309\n", " Yes\n", " Alive\n", - " 25.7\n", + " 35.9\n", " \n", " \n", - " 572\n", + " 1311\n", 
" Yes\n", " Dead\n", - " 66.8\n", - " \n", - " \n", - " 656\n", - " Yes\n", - " Alive\n", - " 43.0\n", + " 62.1\n", " \n", " \n", "\n", @@ -988,83 +995,84 @@ ], "text/plain": [ " Smoker Status Age\n", - "915 Yes Alive 47.4\n", - "1095 Yes Alive 30.2\n", - "941 Yes Alive 30.4\n", - "1311 Yes Dead 62.1\n", - "1091 Yes Alive 60.0\n", - "945 Yes Alive 60.2\n", - "1148 Yes Alive 50.6\n", - "1307 Yes Alive 43.0\n", - "913 Yes Dead 84.4\n", - "946 Yes Alive 25.0\n", - "1276 Yes Alive 58.5\n", - "950 Yes Dead 43.3\n", - "1309 Yes Alive 35.9\n", - "947 Yes Dead 37.1\n", - "1125 Yes Alive 57.2\n", - "948 Yes Alive 47.7\n", - "1093 Yes Dead 84.3\n", - "911 Yes Alive 38.6\n", + "0 Yes Alive 21.0\n", + "1 Yes Alive 19.3\n", + "4 Yes Alive 81.4\n", + "7 Yes Dead 57.5\n", + "8 Yes Alive 24.8\n", + "9 Yes Alive 49.5\n", + "10 Yes Alive 30.0\n", + "12 Yes Alive 49.2\n", + "19 Yes Alive 65.7\n", + "21 Yes Alive 38.3\n", + "23 Yes Dead 62.3\n", + "26 Yes Alive 59.2\n", + "30 Yes Alive 34.6\n", + "31 Yes Alive 51.9\n", + "32 Yes Alive 49.9\n", + "35 Yes Alive 46.7\n", + "36 Yes Alive 44.4\n", + "37 Yes Alive 29.5\n", + "38 Yes Dead 33.0\n", + "39 Yes Alive 35.6\n", + "40 Yes Alive 39.1\n", + "42 Yes Alive 35.7\n", + "46 Yes Dead 44.3\n", + "48 Yes Alive 37.5\n", + "49 Yes Alive 22.1\n", + "53 Yes Alive 39.0\n", + "56 Yes Alive 40.1\n", + "60 Yes Alive 58.1\n", + "61 Yes Alive 37.3\n", + "63 Yes Dead 36.3\n", + "... ... ... 
...\n", + "1240 Yes Alive 29.7\n", + "1243 Yes Alive 40.1\n", + "1251 Yes Alive 27.8\n", + "1252 Yes Alive 52.4\n", + "1253 Yes Alive 27.8\n", + "1254 Yes Alive 41.0\n", + "1259 Yes Alive 40.8\n", + "1260 Yes Alive 20.4\n", + "1263 Yes Alive 20.9\n", + "1264 Yes Alive 45.5\n", + "1269 Yes Alive 38.8\n", + "1270 Yes Alive 55.5\n", + "1271 Yes Alive 24.9\n", "1273 Yes Alive 55.7\n", - "1149 Yes Alive 21.5\n", - "1111 Yes Alive 41.9\n", - "1127 Yes Alive 32.5\n", - "1115 Yes Dead 63.3\n", - "1114 Yes Dead 31.3\n", - "1143 Yes Alive 26.6\n", - "1102 Yes Alive 29.7\n", - "1142 Yes Dead 71.0\n", - "1140 Yes Alive 42.3\n", + "1276 Yes Alive 58.5\n", + "1278 Yes Alive 43.7\n", + "1282 Yes Alive 51.2\n", + "1284 Yes Dead 36.0\n", + "1285 Yes Alive 48.3\n", "1288 Yes Dead 39.3\n", - "1132 Yes Alive 18.0\n", - "... ... ... ...\n", - "649 Yes Alive 36.9\n", - "650 Yes Dead 81.8\n", - "611 Yes Alive 43.4\n", - "608 Yes Alive 23.5\n", - "605 Yes Alive 59.0\n", - "604 Yes Alive 43.8\n", - "554 Yes Alive 21.3\n", - "555 Yes Dead 76.9\n", - "558 Yes Dead 75.2\n", - "560 Yes Alive 53.0\n", - "562 Yes Alive 43.7\n", - "563 Yes Alive 50.9\n", - "565 Yes Alive 32.8\n", - "566 Yes Alive 50.7\n", - "567 Yes Dead 66.1\n", - "569 Yes Alive 27.2\n", - "548 Yes Alive 62.1\n", - "571 Yes Alive 38.1\n", - "573 Yes Dead 55.2\n", - "575 Yes Alive 50.9\n", - "580 Yes Alive 42.5\n", - "583 Yes Alive 26.6\n", - "584 Yes Alive 23.3\n", - "587 Yes Alive 34.8\n", - "589 Yes Alive 28.2\n", - "591 Yes Alive 38.5\n", - "592 Yes Alive 41.0\n", - "595 Yes Alive 25.7\n", - "572 Yes Dead 66.8\n", - "656 Yes Alive 43.0\n", + "1295 Yes Dead 82.4\n", + "1296 Yes Alive 38.3\n", + "1297 Yes Alive 32.7\n", + "1299 Yes Dead 60.0\n", + "1303 Yes Alive 31.2\n", + "1304 Yes Alive 47.8\n", + "1305 Yes Alive 60.9\n", + "1307 Yes Alive 43.0\n", + "1309 Yes Alive 35.9\n", + "1311 Yes Dead 62.1\n", "\n", "[582 rows x 3 columns]" ] }, - "execution_count": 5, + "execution_count": 21, "metadata": {}, "output_type": 
"execute_result" } ], "source": [ + "#Affichage\n", "fumeuses" ] }, { "cell_type": "code", - "execution_count": 6, + "execution_count": 22, "metadata": {}, "outputs": [ { @@ -1532,12 +1540,13 @@ "[732 rows x 3 columns]" ] }, - "execution_count": 6, + "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ + "#Affichage\n", "nonFumeuses" ] }, @@ -1763,7 +1772,8 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "On obtient des résultats assez surprenants dans le sens où étant donné que l'on nous a souvent répété que fumer est mauvais pour la santé, nous nous attendions à retrouver ce fait dans cette étude." + "On obtient des résultats assez surprenants dans le sens où, étant donné que l'on nous a souvent répété que fumer est mauvais pour la santé, nous nous attendions à retrouver ce fait dans cette étude.\n", + "Or, nous pouvons observer que le résultat des calculs effectués nous montre l'inverse de ce à quoi nous nous attendions : le groupe de femmes qui ne fumaient pas a un taux de mortalité supérieur à celui composé de femmes qui fumaient." 
] }, { @@ -1773,6 +1783,37 @@ "# Etape 2" ] }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": {}, + "outputs": [], + "source": [ + "nb18_34F = len(fumeuses.loc[fumeuses[\"Age\"]<=34]) - len(fumeuses.loc[fumeuses[\"Age\"]<18])\n", + "nb18_34NF = len(nonFumeuses.loc[nonFumeuses[\"Age\"]<=34]) - len(nonFumeuses.loc[nonFumeuses[\"Age\"]<18])" + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "5 fumeuses ayant entre 18 et 34 ans lors du premier sondage sont décédées durant la période de 20 ans\n" + ] + } + ], + "source": [ + "test = fumeuses.loc[fumeuses[\"Age\"]<=34]\n", + "t2 = test.loc[test[\"Age\"]>=18]\n", + "\n", + "nbDecedees18_34F = len(t2.loc[t2[\"Status\"]==\"Dead\"])\n", + "print(nbDecedees18_34F, \"fumeuses ayant entre 18 et 34 ans lors du premier sondage sont décédées durant la période de 20 ans\")" + ] + }, { "cell_type": "code", "execution_count": null, -- 2.18.1
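
The age-bracket counts in Etape 2 can also be computed with a single boolean mask per group, which keeps the group size and the death count on the same bounds. The following is a minimal sketch, not the notebook's own code: the helper `death_rate`, the toy DataFrame, and the choice to treat the 18 to 34 bracket as `18 <= Age < 35` (so that an age such as 34.6 is included) are all assumptions made for illustration. Only the column names `Smoker`, `Status` and `Age` and the values `Yes`, `No`, `Alive`, `Dead` come from the notebook.

```python
import pandas as pd

# Toy data with the same columns as the notebook's CSV (Smoker, Status, Age).
# The values are made up for illustration only.
data = pd.DataFrame({
    "Smoker": ["Yes", "Yes", "Yes", "No", "No", "No"],
    "Status": ["Alive", "Dead", "Alive", "Dead", "Alive", "Alive"],
    "Age": [18.0, 33.0, 34.6, 20.4, 27.8, 62.1],
})


def death_rate(df, age_min, age_max):
    """Return (deaths, group size, rate) for age_min <= Age < age_max.

    Hypothetical helper: the half-open interval is an assumption, chosen so
    that the same bounds are used for the group size and the death count.
    """
    group = df[(df["Age"] >= age_min) & (df["Age"] < age_max)]
    deaths = int((group["Status"] == "Dead").sum())
    size = len(group)
    rate = deaths / size if size else float("nan")
    return deaths, size, rate


# Same split as the notebook's `fumeuses` / `nonFumeuses` tables.
smokers = data[data["Smoker"] == "Yes"]
non_smokers = data[data["Smoker"] == "No"]

smoker_result = death_rate(smokers, 18, 35)
non_smoker_result = death_rate(non_smokers, 18, 35)
print("smokers 18-34:", smoker_result)
print("non-smokers 18-34:", non_smoker_result)
```

Calling the same helper with other bounds would give the remaining age brackets, so the comparison between smokers and non-smokers can be repeated per bracket without duplicating the filtering logic.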