{ "cells": [ { "cell_type": "markdown", "id": "5065fde3", "metadata": {}, "source": [ "\n", "# Analysis of data on smoking among women over a 20-year interval" ] }, { "cell_type": "markdown", "id": "7b06ba8b", "metadata": {}, "source": [ "### The data come from work on thyroid and heart disease (Tunbridge et al. 1977), and a follow-up to this study was conducted twenty years later. Some of the results concerned smoking and whether the individuals were still alive at the time of the second study. Twenty-year survival was determined for all the women in the first survey. \n", "\n" ] }, { "cell_type": "markdown", "id": "d64ae68d", "metadata": {}, "source": [ "Importing the libraries" ] }, { "cell_type": "code", "execution_count": 13, "id": "bb493378", "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as pl\n", "import numpy as np\n", "import pandas as pd\n", "import csv" ] }, { "cell_type": "markdown", "id": "06f16a9f", "metadata": {}, "source": [ "Importing the data from the Inria site." ] }, { "cell_type": "code", "execution_count": 14, "id": "391a6abc", "metadata": {}, "outputs": [], "source": [ "data=pd.read_csv('https://gitlab.inria.fr/learninglab/mooc-rr/mooc-rr-ressources/-/raw/master/module3/Practical_session/Subject6_smoking.csv?inline=false')#, skiprows=1)" ] }, { "cell_type": "markdown", "id": "481c65ef", "metadata": {}, "source": [ "The first line of the CSV file is the header row (Smoker, Status, Age), not a comment, so no rows are skipped and the skiprows=1 option stays commented out." ] }, { "cell_type": "markdown", "id": "e7e340db", "metadata": {}, "source": [ "Displaying the data" ] }, { "cell_type": "code", "execution_count": 117, "id": "279fc0bd", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | Smoker | Status | Age |
|---|---|---|---|
| 0 | Yes | Alive | 21.0 |
| 1 | Yes | Alive | 19.3 |
| 2 | No | Dead | 57.5 |
| 3 | No | Alive | 47.1 |
| 4 | Yes | Alive | 81.4 |
| ... | ... | ... | ... |
| 1309 | Yes | Alive | 35.9 |
| 1310 | No | Alive | 22.3 |
| 1311 | Yes | Dead | 62.1 |
| 1312 | No | Dead | 88.6 |
| 1313 | No | Alive | 39.1 |

1314 rows × 3 columns
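
The two tables that follow keep only the Status and Smoker columns, one for the women still alive and one for those who had died. The cells that produced them are not visible here, so the snippet below is only a minimal sketch of one way to obtain such a split with boolean indexing. The helper name `split_by_status` and the ten-row `sample` frame (copied from the rows displayed above) are assumptions for illustration, not the notebook's own code.

```python
import pandas as pd

# Ten rows copied from the table displayed above (first five and last five).
# The full dataset has 1314 rows; this sample is only for illustration.
sample = pd.DataFrame(
    {
        'Smoker': ['Yes', 'Yes', 'No', 'No', 'Yes', 'Yes', 'No', 'Yes', 'No', 'No'],
        'Status': ['Alive', 'Alive', 'Dead', 'Alive', 'Alive',
                   'Alive', 'Alive', 'Dead', 'Dead', 'Alive'],
        'Age': [21.0, 19.3, 57.5, 47.1, 81.4, 35.9, 22.3, 62.1, 88.6, 39.1],
    },
    index=[0, 1, 2, 3, 4, 1309, 1310, 1311, 1312, 1313],
)


def split_by_status(df):
    # Hypothetical helper: return (alive, dead) subsets with two columns each.
    cols = ['Status', 'Smoker']
    alive = df.loc[df['Status'] == 'Alive', cols]
    dead = df.loc[df['Status'] == 'Dead', cols]
    return alive, dead


alive, dead = split_by_status(sample)

# The two subsets should partition the original rows.
assert len(alive) + len(dead) == len(sample)

print(alive)
print(dead)
```

Applied to the full `data` frame, the same check is consistent with the row counts reported in this notebook: 945 + 369 = 1314.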

| | Status | Smoker |
|---|---|---|
| 0 | Alive | Yes |
| 1 | Alive | Yes |
| 3 | Alive | No |
| 4 | Alive | Yes |
| 5 | Alive | No |
| ... | ... | ... |
| 1307 | Alive | Yes |
| 1308 | Alive | No |
| 1309 | Alive | Yes |
| 1310 | Alive | No |
| 1313 | Alive | No |

945 rows × 2 columns

| | Status | Smoker |
|---|---|---|
| 2 | Dead | No |
| 7 | Dead | Yes |
| 11 | Dead | No |
| 14 | Dead | No |
| 20 | Dead | No |
| ... | ... | ... |
| 1299 | Dead | Yes |
| 1300 | Dead | No |
| 1306 | Dead | No |
| 1311 | Dead | Yes |
| 1312 | Dead | No |

369 rows × 2 columns
\n", "