{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# A Study of Simpson's Paradox: The Effect of Smoking on Survival Among Women in Whickham" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction\n", "In 1972-1974, a survey was conducted on the health of women in Whickham, England. Its aim was to assess the relationship between smoking and long-term survival. For simplicity, we restrict our attention to women, and among them to the 1314 who were categorized as __current smokers__ or __never having smoked__. We will analyze these data to explore Simpson's Paradox." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 1: Data Preparation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import seaborn as sns\n", "import matplotlib.pyplot as plt\n", "import statsmodels.api as sm\n", "import statsmodels.formula.api as smf" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|  | Smoker | Status | Age |
|---|---|---|---|
| 0 | Yes | Alive | 21.0 |
| 1 | Yes | Alive | 19.3 |
| 2 | No | Dead | 57.5 |
| 3 | No | Alive | 47.1 |
| 4 | Yes | Alive | 81.4 |
| 5 | No | Alive | 36.8 |
| 6 | No | Alive | 23.8 |
| 7 | Yes | Dead | 57.5 |
| 8 | Yes | Alive | 24.8 |
| 9 | Yes | Alive | 49.5 |
| 10 | Yes | Alive | 30.0 |
| 11 | No | Dead | 66.0 |
| 12 | Yes | Alive | 49.2 |
| 13 | No | Alive | 58.4 |
| 14 | No | Dead | 60.6 |
| 15 | No | Alive | 25.1 |
| 16 | No | Alive | 43.5 |
| 17 | No | Alive | 27.1 |
| 18 | No | Alive | 58.3 |
| 19 | Yes | Alive | 65.7 |
| 20 | No | Dead | 73.2 |
| 21 | Yes | Alive | 38.3 |
| 22 | No | Alive | 33.4 |
| 23 | Yes | Dead | 62.3 |
| 24 | No | Alive | 18.0 |
| 25 | No | Alive | 56.2 |
| 26 | Yes | Alive | 59.2 |
| 27 | No | Alive | 25.8 |
| 28 | No | Dead | 36.9 |
| 29 | No | Alive | 20.2 |
| ... | ... | ... | ... |
| 1284 | Yes | Dead | 36.0 |
| 1285 | Yes | Alive | 48.3 |
| 1286 | No | Alive | 63.1 |
| 1287 | No | Alive | 60.8 |
| 1288 | Yes | Dead | 39.3 |
| 1289 | No | Alive | 36.7 |
| 1290 | No | Alive | 63.8 |
| 1291 | No | Dead | 71.3 |
| 1292 | No | Alive | 57.7 |
| 1293 | No | Alive | 63.2 |
| 1294 | No | Alive | 46.6 |
| 1295 | Yes | Dead | 82.4 |
| 1296 | Yes | Alive | 38.3 |
| 1297 | Yes | Alive | 32.7 |
| 1298 | No | Alive | 39.7 |
| 1299 | Yes | Dead | 60.0 |
| 1300 | No | Dead | 71.0 |
| 1301 | No | Alive | 20.5 |
| 1302 | No | Alive | 44.4 |
| 1303 | Yes | Alive | 31.2 |
| 1304 | Yes | Alive | 47.8 |
| 1305 | Yes | Alive | 60.9 |
| 1306 | No | Dead | 61.4 |
| 1307 | Yes | Alive | 43.0 |
| 1308 | No | Alive | 42.1 |
| 1309 | Yes | Alive | 35.9 |
| 1310 | No | Alive | 22.3 |
| 1311 | Yes | Dead | 62.1 |
| 1312 | No | Dead | 88.6 |
| 1313 | No | Alive | 39.1 |
1314 rows × 3 columns
\n", "