no commit message

parent d13160be
...@@ -32,7 +32,9 @@ ...@@ -32,7 +32,9 @@
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 3, "execution_count": 3,
"metadata": {}, "metadata": {
"scrolled": true
},
"outputs": [ "outputs": [
{ {
"data": { "data": {
...@@ -211,7 +213,7 @@ ...@@ -211,7 +213,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"We count the dead." "We count the dead, the living, the smokers and the non-smokers. Among other things, this gives a quick check of data integrity. "
] ]
}, },
{ {
...@@ -227,7 +229,9 @@ ...@@ -227,7 +229,9 @@
"number dead = 369\n", "number dead = 369\n",
"total number = 1314\n", "total number = 1314\n",
"number smoker = 582\n", "number smoker = 582\n",
"number non smoker = 732\n" "number non smoker = 732\n",
"total number = 1314\n",
"Number of data : 1314\n"
] ]
} }
], ],
...@@ -240,7 +244,10 @@ ...@@ -240,7 +244,10 @@
"smoker = raw_data['Smoker'].value_counts()['Yes']\n", "smoker = raw_data['Smoker'].value_counts()['Yes']\n",
"non_smoker = raw_data['Smoker'].value_counts()['No']\n", "non_smoker = raw_data['Smoker'].value_counts()['No']\n",
"print(f'number smoker = {smoker}')\n", "print(f'number smoker = {smoker}')\n",
"print(f'number non smoker = {non_smoker}')\n" "print(f'number non smoker = {non_smoker}')\n",
"print(f'total number = {smoker + non_smoker}')\n",
"\n",
"print(f'Number of data : {len(raw_data)}')"
] ]
}, },
{ {
...@@ -346,7 +353,7 @@ ...@@ -346,7 +353,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Well, that's a shame: people who don't smoke die more than people who smoke... awkward." "That's a shame: it would seem that people who don't smoke die more than people who smoke... awkward."
] ]
}, },
{ {
...@@ -825,15 +832,183 @@ ...@@ -825,15 +832,183 @@
"**Analysis:** \n", "**Analysis:** \n",
"- In group 1: among the youngest (18-34), smoking has little effect on the mortality rate. \n", "- In group 1: among the youngest (18-34), smoking has little effect on the mortality rate. \n",
"- In group 2: smoking kills, the mortality rate is almost twice as high among smokers. \n", "- In group 2: smoking kills, the mortality rate is almost twice as high among smokers. \n",
"- In group 3: everyone dies. But a little more often among smokers. \n", "- In group 3: here, smoking seems to raise the mortality rate slightly. \n",
"- In group 4: it's catastrophic (but normal, they are old), everyone dies. " "- In group 4: it's catastrophic, everyone dies. Smokers have the same mortality rate as non-smokers, which is expected since this is the category with the highest age. "
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"So it is the number of members in category 4 that biases the overall data: there are many of them, a large share of them non-smokers, whereas in the other categories the proportions of smokers and non-smokers are almost equal. Since many old people die, tobacco use does not seem to change the rate, which biases the overall ratio." "So it is category 4 that seems to bias the overall data."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Question 3"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"raw_data.loc[(raw_data.Status =='Dead'),'Death'] = 1\n",
"raw_data.loc[(raw_data.Status =='Alive'),'Death'] = 0"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Smoker</th>\n",
" <th>Status</th>\n",
" <th>Age</th>\n",
" <th>AgeGroup</th>\n",
" <th>Death</th>\n",
" <th>Intercept</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>21.0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>19.3</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>No</td>\n",
" <td>Dead</td>\n",
" <td>57.5</td>\n",
" <td>3</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>47.1</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>81.4</td>\n",
" <td>4</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>36.8</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>23.8</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>Yes</td>\n",
" <td>Dead</td>\n",
" <td>57.5</td>\n",
" <td>3</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>24.8</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>49.5</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Smoker Status Age AgeGroup Death Intercept\n",
"0 Yes Alive 21.0 1 0 1\n",
"1 Yes Alive 19.3 1 0 1\n",
"2 No Dead 57.5 3 1 1\n",
"3 No Alive 47.1 2 0 1\n",
"4 Yes Alive 81.4 4 0 1\n",
"5 No Alive 36.8 2 0 1\n",
"6 No Alive 23.8 1 0 1\n",
"7 Yes Dead 57.5 3 1 1\n",
"8 Yes Alive 24.8 1 0 1\n",
"9 Yes Alive 49.5 2 0 1"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"display(raw_data[:10])"
] ]
} }
], ],
......
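The markdown cells in this diff describe an overall mortality rate that looks worse for non-smokers, while the per-age-group rates point the other way. The sketch below is not taken from the notebook: it is a minimal, standard-library illustration of that effect, assuming records shaped like the `Smoker`, `Status` and `AgeGroup` columns shown above. The helper `death_rates` and the counts in `toy_counts` are hypothetical and made up for illustration; they are not the notebook's data.

```python
from collections import defaultdict


def death_rates(records, by_group=False):
    """Return the death rate per smoker status, optionally split by age group.

    Each record is a (smoker, status, age_group) tuple, mirroring the
    Smoker / Status / AgeGroup columns of the notebook's table.
    """
    dead = defaultdict(int)
    total = defaultdict(int)
    for smoker, status, age_group in records:
        key = (age_group, smoker) if by_group else smoker
        total[key] += 1
        dead[key] += status == "Dead"  # True counts as 1, False as 0
    return {key: dead[key] / total[key] for key in total}


# Made-up toy counts (NOT the notebook's data):
# (smoker, age_group, number dead, number alive)
toy_counts = [
    ("Yes", 1, 2, 38),  # young smokers
    ("No", 1, 0, 10),   # young non-smokers
    ("Yes", 4, 9, 1),   # old smokers
    ("No", 4, 32, 8),   # old non-smokers
]

# Expand the counts into one record per person.
records = []
for smoker, group, n_dead, n_alive in toy_counts:
    records += [(smoker, "Dead", group)] * n_dead
    records += [(smoker, "Alive", group)] * n_alive

overall = death_rates(records)
per_group = death_rates(records, by_group=True)

print("overall:", overall)
for key in sorted(per_group):
    print("group, smoker =", key, "rate =", per_group[key])
```

With these toy counts the overall rate is higher for non-smokers, yet within each age group it is higher for smokers, because the non-smokers are concentrated in the oldest group. The same grouping idea would apply to the real data, for instance with a pandas `groupby` on `AgeGroup` and `Smoker`.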