diff --git a/module3/exo3/exercice.ipynb b/module3/exo3/exercice.ipynb index 0bbbe371b01e359e381e43239412d77bf53fb1fb..23a6af10c748d851421d1ff66c43ef8cde6355fd 100644 --- a/module3/exo3/exercice.ipynb +++ b/module3/exo3/exercice.ipynb @@ -1,5 +1,930 @@ { - "cells": [], + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Around Simpson's Paradox" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "%matplotlib inline\n", + "import matplotlib.pyplot as plt\n", + "import pandas as pd\n", + "import numpy as np\n", + "import isoweek\n", + "import os\n", + "import requests" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In 1972-1974, in Whickham, a town in the north-east of England, located approximately 6.5 kilometres south-west of Newcastle upon Tyne, a survey of one-sixth of the electorate was conducted in order to inform work on thyroid and heart disease (Tunbridge et al. 1977). A continuation of this study was carried out twenty years later (Vanderpump et al. 1995). Some of the results were related to smoking and whether individuals were still alive at the time of the second study. For the purpose of simplicity, we will restrict the data to women and among these to the 1314 who were categorized as \"smoking currently\" or \"never smoked\". There were relatively few women in the initial survey who smoked but had since quit (162) and very few for whom information was not available (18). Survival at 20 years was determined for all women of the first survey.\n", + "\n", + "All these data are available in this [CSV file](https://gitlab.inria.fr/learninglab/mooc-rr/mooc-rr-ressources/blob/master/module3/Practical_session/Subject6_smoking.csv). Each line indicates whether the person smokes, whether she was alive or dead at the time of the second study, and her age at the time of the first survey." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "__The mission is to:__\n", + "\n", + "1. Tabulate the total number of women alive and dead over the period according to their smoking habits. Calculate in each group (smoking/non-smoking) the mortality rate (the ratio of the number of women who died in a group to the total number of women in that group).\n", + "2. Go back to question 1 (numbers and mortality rates) and add a new category related to the age group.\n", + "3. In order to avoid a bias induced by arbitrary and irregular age groupings, try performing a logistic regression." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 1. Smoking influence" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We check whether a local copy of the CSV file exists and download it if it does not." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "data_file = \"smoking.csv\"\n", + "data_url = \"https://gitlab.inria.fr/learninglab/mooc-rr/mooc-rr-ressources/-/raw/master/module3/Practical_session/Subject6_smoking.csv?inline=false\"\n", + "if not os.path.exists(data_file):\n", + "    with open(data_file, \"wb\") as file:\n", + "        file.write(requests.get(data_url).content)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The dataset contains information about smoking habits, age and survival. The first column of the displayed table is the row index. Here is a description of the other columns:\n", + "\n", + "`Smoker` contains *Yes* or *No* and shows whether the person smoked.\n", + "\n", + "`Status` contains *Alive* or *Dead* and shows whether the person was alive or dead at the time of the second study.\n", + "\n", + "`Age` contains a float value and gives the age of the person at the time of the first survey." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "
" + ], + "text/plain": [ + " Smoker Status Age\n", + "0 Yes Alive 21.0\n", + "1 Yes Alive 19.3\n", + "2 No Dead 57.5\n", + "3 No Alive 47.1\n", + "4 Yes Alive 81.4\n", + "5 No Alive 36.8\n", + "6 No Alive 23.8\n", + "7 Yes Dead 57.5\n", + "8 Yes Alive 24.8\n", + "9 Yes Alive 49.5\n", + "10 Yes Alive 30.0\n", + "11 No Dead 66.0\n", + "12 Yes Alive 49.2\n", + "13 No Alive 58.4\n", + "14 No Dead 60.6\n", + "15 No Alive 25.1\n", + "16 No Alive 43.5\n", + "17 No Alive 27.1\n", + "18 No Alive 58.3\n", + "19 Yes Alive 65.7\n", + "20 No Dead 73.2\n", + "21 Yes Alive 38.3\n", + "22 No Alive 33.4\n", + "23 Yes Dead 62.3\n", + "24 No Alive 18.0\n", + "25 No Alive 56.2\n", + "26 Yes Alive 59.2\n", + "27 No Alive 25.8\n", + "28 No Dead 36.9\n", + "29 No Alive 20.2\n", + "... ... ... ...\n", + "1284 Yes Dead 36.0\n", + "1285 Yes Alive 48.3\n", + "1286 No Alive 63.1\n", + "1287 No Alive 60.8\n", + "1288 Yes Dead 39.3\n", + "1289 No Alive 36.7\n", + "1290 No Alive 63.8\n", + "1291 No Dead 71.3\n", + "1292 No Alive 57.7\n", + "1293 No Alive 63.2\n", + "1294 No Alive 46.6\n", + "1295 Yes Dead 82.4\n", + "1296 Yes Alive 38.3\n", + "1297 Yes Alive 32.7\n", + "1298 No Alive 39.7\n", + "1299 Yes Dead 60.0\n", + "1300 No Dead 71.0\n", + "1301 No Alive 20.5\n", + "1302 No Alive 44.4\n", + "1303 Yes Alive 31.2\n", + "1304 Yes Alive 47.8\n", + "1305 Yes Alive 60.9\n", + "1306 No Dead 61.4\n", + "1307 Yes Alive 43.0\n", + "1308 No Alive 42.1\n", + "1309 Yes Alive 35.9\n", + "1310 No Alive 22.3\n", + "1311 Yes Dead 62.1\n", + "1312 No Dead 88.6\n", + "1313 No Alive 39.1\n", + "\n", + "[1314 rows x 3 columns]" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data = pd.read_csv(data_file)\n", + "data" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We divide the dataset into two groups depending on the `Status` value." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 83, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Our dataset contains 945 alive and 369 dead persons.\n" + ] + } + ], + "source": [ + "alive, dead = data[data[\"Status\"] == \"Alive\"], data[data[\"Status\"] == \"Dead\"]\n", + "print(\"Our dataset contains\", alive.shape[0], \"alive and\", dead.shape[0], \"dead persons.\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We calculate the mortality rate in both smoking and non-smoking groups." + ] + }, + { + "cell_type": "code", + "execution_count": 84, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "The mortality rate is 0.23883161512027493 in smoking group and 0.31420765027322406 in non-smoking group.\n" + ] + } + ], + "source": [ + "smokers, non_smokers = data[data[\"Smoker\"] == \"Yes\"], data[data[\"Smoker\"] == \"No\"]\n", + "print(\"The mortality rate is\", smokers[smokers[\"Status\"] == \"Dead\"].shape[0] / smokers.shape[0] , \"in smoking group and\", non_smokers[non_smokers[\"Status\"] == \"Dead\"].shape[0] / non_smokers.shape[0] , \"in non-smoking group.\")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We will graph these data." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 80, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:8: UserWarning: Boolean Series key will be reindexed to match DataFrame index.\n", + " \n", + "/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:9: UserWarning: Boolean Series key will be reindexed to match DataFrame index.\n", + " if __name__ == '__main__':\n" + ] + }, + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYUAAAD4CAYAAAAD6PrjAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAAZRUlEQVR4nO3de5RV5Z3m8e8jkKBGJWLBMhZaaGM0GZRgYWtrjNGOmgTBO3Yco442miUZO3QbbWO8TOKYTGJMMB2VHrrVdAc0ZAC1TUeDoiZZKoUSwBCV0NAUrVCigqggl9/8sd/aHrGq2Fz2OXV5PmudVXu/+z27fqdW7Xrq3VdFBGZmZgC71LoAMzPrPBwKZmaWcyiYmVnOoWBmZjmHgpmZ5XrXuoAdsc8++0RDQ0OtyzAz61LmzJnzakTUtbWsS4dCQ0MDTU1NtS7DzKxLkbS0vWXefWRmZjmHgpmZ5RwKZmaWcyiYmVnOoWBmZjmHgpmZ5UoNBUlLJM2XNFdSU2rbW9Ijkl5KXz+a2iVpgqRFkuZJGl5mbWZm9kHVGCl8NiKGRURjmr8amBkRQ4CZaR7g88CQ9BoL3F6F2szMrEItdh+NBu5O03cDp1W03xOZp4B+kvatQX1mZj1W2Vc0B/CwpADujIiJwMCIeDktfwUYmKb3A5ZVvLc5tb2MmXVJulG1LqGwuN4PHIPyQ+HYiFguaQDwiKQ/Vi6MiEiBUZiksWS7l9h///13XqVmZlbu7qOIWJ6+rgSmAUcCK1p3C6WvK1P35cCgirfXp7Yt1zkxIhojorGurs37OZmZ2XYqLRQk7S5pj9Zp4CRgAXA/cEHqdgEwI03fD3w5nYV0FLC6YjeTmZlVQZm7jwYC0yS1fp+fRcS/S5oN3CfpYmApcE7q/xDwBWAR8DZwUYm1mZlZG0oLhYhYDBzeRvsq4MQ22gO4vKx6zMxs63xFs5mZ5RwKZmaWcyiYmVnOoWBmZjmHgpmZ5RwKZmaWcyiYmVnOoWBmZjmHgpmZ5RwKZmaWcyiYmVnOoWBmZjmHgpmZ5RwKZmaWcyiYmVnOoWBmZjmHgpmZ5RwKZmaWK/MZzZ2ablStS9gmcX3UugQz6wE8UjAzs5xDwczMcg4FMzPLORTMzCznUDAzs5xDwczMcg4FMzPLORTMzCznUDAzs5xDwczMcg4FMzPLORTMzCznUDAzs5xDwczMcqWHgqRekp6T9GCaHyzpaUmLJN0r6UOp/cNpflFa3lB2bWZm9n7VGClcASysmP8ucGtE/BnwOnBxar8YeD2135r6mZlZFZUaCpLqgS8C/zfNCzgBmJq63A2clqZHp3nS8hNTfzMzq5KyRwo/BL4ObE7z/YE3ImJjmm8G9kvT+wHLANLy1an/+0gaK6lJUlNLS0uJpZuZ9TylhYKkkcDKiJizM9cbERMjojEiGuvq6nbmqs3Merw
yn9F8DDBK0heAvsCewI+AfpJ6p9FAPbA89V8ODAKaJfUG9gJWlVifmZltobSRQkT8fUTUR0QDcC7waEScBzwGnJW6XQDMSNP3p3nS8kcjwk+rNzOrolpcp3AVMF7SIrJjBpNS+ySgf2ofD1xdg9rMzHq0Mncf5SJiFjArTS8Gjmyjzzrg7GrUY2ZmbfMVzWZmlnMomJlZzqFgZmY5h4KZmeUcCmZmlnMomJlZzqFgZmY5h4KZmeW2GgqSDpY0U9KCNH+YpGvLL83MzKqtyEjhH4G/BzYARMQ8snsZmZlZN1MkFHaLiGe2aNvYZk8zM+vSioTCq5IOAgJA0lnAy6VWZWZmNVHkhniXAxOBQyQtB/4D+O+lVmVmZjWx1VBIdzX9S0m7A7tExJvll2VmZrXQbihIGt9OOwAR8YOSajIzsxrpaKSwR9WqMDOzTqHdUIiIG6tZiJmZ1V6Ri9cOlPSApBZJKyXNkHRgNYozM7PqKnJK6s+A+4B9gY8BPwcml1mUmZnVRtGL134aERvT61+AvmUXZmZm1VfkOoVfSroamEJ2AdsY4CFJewNExGsl1mdmZlVUJBTOSV8v3aL9XLKQ8PEFM7NuosjFa4OrUYiZmdXeVkNBUh/gK8BxqWkWcGdEbCixLjMzq4Eiu49uB/oAP0nz56e2S8oqyszMaqNIKIyIiMMr5h+V9PuyCjIzs9opckrqpnTrbCC7mA3YVF5JZmZWK0VGClcCj0laDAg4ALio1KrMzKwmipx9NFPSEODjqemFiFhfbllmZlYLRe59tBvZaOGr6fnM+0saWXplZmZWdUWOKfwz8C5wdJpfDny7tIrMzKxmioTCQRHxf4ANABHxNtmxBTMz62aKhMK7knYlu6UF6UykrR5TkNRX0jOSfi/peUk3pvbBkp6WtEjSvZI+lNo/nOYXpeUN2/+xzMxsexQJheuBfwcGSfpXYCbw9QLvWw+ckK5xGAacIuko4LvArRHxZ8DrwMWp/8XA66n91tTPzMyqaKuhEBGPAGcAF5I9R6ExImYVeF9ExNo02ye9AjgBmJra7wZOS9Oj0zxp+YlqfSC0mZlVRZGRAsBngBOBzwKfLrpySb0kzQVWAo8AfwLeiIiNqUszsF+a3g9YBpCWrwb6t7HOsZKaJDW1tLQULcXMzAoockrqT4DLgPnAAuBSSf9QZOURsSkihgH1wJHAIdtfar7OiRHRGBGNdXV1O7o6MzOrUOSK5hOAQyOi9UDz3cDz2/JNIuINSY+RndbaT1LvNBqoJzvFlfR1ENAsqTewF7BqW76PmZntmCK7jxYB+1fMD0ptHZJUJ6lfmt4V+BywEHgMOCt1uwCYkabvT/Ok5Y+2BpGZmVVHkZHCHsBCSc+QHSg+EmiSdD9ARIxq5337AndL6kUWPvdFxIOS/gBMkfRt4DlgUuo/CfippEXAa2RPdjMzsyoqEgrXbc+K0y0xPtVG+2KyYNmyfR1w9vZ8LzMz2zmK3BDv8WoUYmZmtVf0lFQzM+sBHApmZpZrNxQkzUxffbsJM7MeoqNjCvtK+gtglKQpbHFn1Ih4ttTKzMys6joKheuAb5JdYPaDLZa13sPIzMy6kXZDISKmAlMlfTMivlXFmszMrEaKnJL6LUmjgONS06yIeLDcsszMrBaK3BDvZuAK4A/pdYWk/112YWZmVn1Frmj+IjAsIjZDfkO854BryizMzMyqr+h1Cv0qpvcqoQ4zM+sEiowUbgaeS7e+FtmxhatLrcrMzGqiyIHmyZJmASNS01UR8UqpVZmZWU0UGSkQES+TPe/AzMy6Md/7yMzMcg4FMzPLdRgKknpJ+mO1ijEzs9rqMBQiYhPwgqT9O+pnZmbdQ5EDzR8Fnk/PaH6rtbGDZzObmVkXVSQUvll6FWZm1ikUekazpAOAIRHxa0m7Ab3KL83MzKqtyA3x/hqYCtyZmvYDppdYk5mZ1UiRU1IvB44B1gBExEvAgDKLMjOz2ihyTGF9RLwrZU/jlNS
b7MlrZmbdh7T1Pp1JlPNnuMhI4XFJ1wC7Svoc8HPggVKqMTOzmioSClcDLcB84FLgIeDaMosyM7PaKHL20eb0YJ2nyXYbvRBR0rjFzMxqaquhIOmLwB3An8iepzBY0qUR8cuyizMzs+oqcqD5FuCzEbEIQNJBwL8BDgUzs26myDGFN1sDIVkMvFlSPWZmVkPtjhQknZEmmyQ9BNxHdkzhbGB2FWozM7Mq62j30akV0yuAz6TpFmDX0ioyM7OaaTcUIuKiHVmxpEHAPcBAshHGxIj4kaS9gXuBBmAJcE5EvK7s6rgfAV8A3gYujIhnd6QGMzPbNkXOPhoMfJXsj3jev8CtszcCfxsRz0raA5gj6RHgQmBmRHxH0tVk10FcBXweGJJefw7cnr6amVmVFDn7aDowiewq5s1FVxwRLwMvp+k3JS0ku5neaOD41O1uYBZZKIwG7knXQDwlqZ+kfdN6zMysCoqEwrqImLAj30RSA/ApsgvgBlb8oX+FbPcSZIGxrOJtzantfaEgaSwwFmD//f1AODOznanIKak/knS9pKMlDW99Ff0Gkj4C/AL4m4hYU7ksjQq26eroiJgYEY0R0VhXV7ctbzUzs60oMlIYCpwPnMB7u48izXdIUh+yQPjXiPh/qXlF624hSfsCK1P7cmBQxdvrU5uZmVVJkVA4GzgwIt7dlhWns4kmAQsj4gcVi+4HLgC+k77OqGgfJ2kK2QHm1T6eYGZWXUVCYQHQj/f+oy/qGLIRxnxJc1PbNWRhcJ+ki4GlwDlp2UNkp6MuIjsldYdOiTUzs21XJBT6AX+UNBtY39q4tVNSI+I3ZDfQa8uJbfQPsqe8mZlZjRQJhetLr8LMzDqFIs9TeLwahZiZWe0VuaL5Td47bfRDQB/grYjYs8zCzMys+oqMFPZonU5nFI0GjiqzKDMzq40iF6/lIjMdOLmccszMrJaK7D46o2J2F6ARWFdaRWZmVjNFzj6qfK7CRrLbXY8upRozM6upIscUfBGZmVkP0dHjOK/r4H0REd8qoR4zM6uhjkYKb7XRtjtwMdAfcCiYmXUzHT2O85bW6fTktCvI7kc0BbilvfeZmVnX1eExhfQ85fHAeWRPSRseEa9XozAzM6u+jo4pfA84A5gIDI2ItVWryszMaqKji9f+FvgYcC3wX5LWpNebktZ08D4zM+uiOjqmsE1XO5uZWdfnP/xmZpZzKJiZWc6hYGZmOYeCmZnlitwQzzoDtfe4604oYut9zKxT8kjBzMxyDgUzM8s5FMzMLOdQMDOznEPBzMxyDgUzM8s5FMzMLOdQMDOznEPBzMxyDgUzM8s5FMzMLOdQMDOzXGmhIOmfJK2UtKCibW9Jj0h6KX39aGqXpAmSFkmaJ2l4WXWZmVn7yhwp3AWcskXb1cDMiBgCzEzzAJ8HhqTXWOD2EusyM7N2lBYKEfEE8NoWzaOBu9P03cBpFe33ROYpoJ+kfcuqzczM2lbtYwoDI+LlNP0KMDBN7wcsq+jXnNrMzKyKanagOSIC2OansUgaK6lJUlNLS0sJlZmZ9VzVDoUVrbuF0teVqX05MKiiX31q+4CImBgRjRHRWFdXV2qxZmY9TbVD4X7ggjR9ATCjov3L6Syko4DVFbuZzMysSkp7RrOkycDxwD6SmoHrge8A90m6GFgKnJO6PwR8AVgEvA1cVFZd1nNs2LCB5uZm1q1bV+tSqqpv377U19fTp0+fWpdiXVBpoRARf9XOohPb6BvA5WXVYj1Tc3Mze+yxBw0NDUiqdTlVERGsWrWK5uZmBg8eXOtyrAvyFc3Wba1bt47+/fv3mEAAkET//v173OjIdh6HgnVrPSkQWvXEz2w7j0PBzMxypR1TMOtsdOPO/Q86ri92mc306dM5/fTTWbhwIYcccghLlixh5MiRLFiwgKamJu655x4mTJiwU2sz214eKZiVbPLkyRx77LFMnjz5A8saGxsdCNapOBTMSrR27Vp+85vfMGnSJKZMmfK
B5bNmzWLkyJFs3ryZhoYG3njjjXzZkCFDWLFiBS0tLZx55pmMGDGCESNG8Nvf/raKn8B6GoeCWYlmzJjBKaecwsEHH0z//v2ZM2dOm/122WUXRo8ezbRp0wB4+umnOeCAAxg4cCBXXHEFX/va15g9eza/+MUvuOSSS6r5EayHcSiYlWjy5Mmce+65AJx77rlt7kJqNWbMGO69914ApkyZwpgxYwD49a9/zbhx4xg2bBijRo1izZo1rF27tvzirUfygWazkrz22ms8+uijzJ8/H0ls2rQJSVx+edvXaR599NEsWrSIlpYWpk+fzrXXXgvA5s2beeqpp+jbt281y7ceyiMFs5JMnTqV888/n6VLl7JkyRKWLVvG4MGDWbZsWZv9JXH66aczfvx4Dj30UPr37w/ASSedxG233Zb3mzt3bjXKtx7KIwXrMYqeQrqzTJ48mauuuup9bWeeeSY333xzu+8ZM2YMI0aM4K677srbJkyYwOWXX85hhx3Gxo0bOe6447jjjjvKKtt6OGW3HeqaGhsbo6mpabveu7PPWS9b3FDrCrZBJ/mdWrhwIYceemity6iJzvLZu9J21qW2Mdih7UzSnIhobGuZdx+ZmVnOoWBmZjmHgpmZ5RwKZmaWcyiYmVnOoWBmZjmHgvUc0s59FdCrVy+GDRvGJz/5SQ4//HBuueUWNm/evFM+zg033MD3v//9nbIus1a+eM2sRLvuumt+BfLKlSv50pe+xJo1a7jxxhtrW5hZOzxSMKuSAQMGMHHiRH784x8TEWzatIkrr7ySESNGcNhhh3HnnXcC2e22TzzxRIYPH87QoUOZMWNGvo6bbrqJgw8+mGOPPZYXXnihVh/FujGPFMyq6MADD2TTpk2sXLmSGTNmsNdeezF79mzWr1/PMcccw0knncSgQYOYNm0ae+65J6+++ipHHXUUo0aN4tlnn2XKlCnMnTuXjRs3Mnz4cI444ohafyTrZhwKZjXy8MMPM2/ePKZOnQrA6tWreemll6ivr+eaa67hiSeeYJdddmH58uWsWLGCJ598ktNPP53ddtsNgFGjRtWyfOumHApmVbR48WJ69erFgAEDiAhuu+02Tj755Pf1ueuuu2hpaWHOnDn06dOHhoYG1q1bV6OKrafxMQWzKmlpaeGyyy5j3LhxSOLkk0/m9ttvZ8OGDQC8+OKLvPXWW6xevZoBAwbQp08fHnvsMZYuXQrAcccdx/Tp03nnnXd48803eeCBB2r5cayb8kjBeo4a3L31nXfeYdiwYWzYsIHevXtz/vnnM378eAAuueQSlixZwvDhw4kI6urqmD59Oueddx6nnnoqQ4cOpbGxkUMOOQSA4cOHM2bMGA4//HAGDBjAiBEjqv55rPvzrbO7iC51W99O8jvVWW4fXQud5bN3pe2sS21j4Ftnm5lZ+RwKZmaWcyhYt9aVd49ur574mW3ncShYt9W3b19WrVrVo/5IRgSrVq2ib9++tS7FuiiffWTdVn19Pc3NzbS0tNS6lKrq27cv9fX1tS7DuiiHgnVbffr0YfDgwbUuw6xL6VS7jySdIukFSYskXV3reszMeppOEwqSegH/AHwe+ATwV5I+UduqzMx6lk4TCsCRwKKIWBwR7wJTgNE1rsnMrEfpTMcU9gOWVcw3A3++ZSdJY4GxaXatpB5xU/kSrwvdB3h1p66x4FPJzDqTLrWNwY5uZwe0t6AzhUIhETERmFjrOroLSU3tXe5uZjuuq21jnWn30XJgUMV8fWozM7Mq6UyhMBsYImmwpA8B5wL317gmM7MepdPsPoqIjZLGAb8CegH/FBHP17isnsC74szK1aW2sS5962wzM9u5OtPuIzMzqzGHgpmZ5RwKXYykb0h6XtI8SXMlfeBajm1c3/GSHtxZ9Zl1NpJC0i0V838n6YYaltQhSbMk1ewUVodCFyLpaGAkMDwiDgP+kvdf8FftejrNiQpmHVgPnCFpn1oXUqadtT06FLqWfYFXI2I9QES
8GhH/JWmJpJvTyKFJ0nBJv5L0J0mXASjzPUkLJM2XNGbLlUsaIek5SQdJOkLS45LmpHXtm/rMkvRDSU3AFZLOTuv8vaQnqvnDMCtoI9kZQF/bcoGkBkmPppH3TEn7p/a7JE2Q9DtJiyWd1daK2/r9l3ShpOmSHknb5jhJ49O29ZSkvVO/YWl+nqRpkj66xbp3SXV8W1KvtP3OTv0vTX2Ol/SkpPuBP0jaXdK/pXoWtLWdb1VE+NVFXsBHgLnAi8BPgM+k9iXAV9L0rcA8YA+gDliR2s8EHiE73Xcg8J9kIXM88CDwF8AcYH+gD/A7oC69dwzZKcIAs4CfVNQ0H9gvTfer9c/IL7+2fAFrgT3TdrIX8HfADWnZA8AFafp/ANPT9F3Az8n+cf4E2X3Z2lr3B37/gQuBRRXb4GrgsrTsVuBv0vS8im34fwE/TNOzgKOAycA3UttY4No0/WGgCRictt+3gMFp2ZnAP1bUt9e2/rw8UuhCImItcATZL0gLcK+kC9Pi1gv95gNPR8SbEdECrJfUDzgWmBwRmyJiBfA4MCK951Cy/6ROjYj/BD4O/DfgEUlzgWvJrjBvdW/F9G+BuyT9NVngmHU6EbEGuAf4n1ssOhr4WZr+Kdl20mp6RGyOiD+Q/SPVlvZ+/x+r2AZXk4UPZNtng6S9yELk8dR+N3BcxfvvBBZExE1p/iTgy2l7fBroDwxJy56JiP+oWP/nJH1X0qcjYnU7dbfLodDFpD/qsyLiemAc2X8GkO03BdhcMd06v7V9jS8D64BPpXkBz0fEsPQaGhEnVfR/q6Key8hCYxAwR1L/7flcZlXwQ+BiYPeC/Su3IwFIuintpp0LHf7+b7kNVm6fRfb9/w74rKTW56oK+GrFNjk4Ih5Oyyq3xxeB4WTh8G1J1xX7qO9xKHQhkj4uaUhF0zBgacG3PwmMSfsm68j+K3kmLXsD+CJws6TjgReAunRgG0l9JH2ynZoOioinI+I6stHLoLb6mdVaRLwG3EcWDK1+R3ZLHYDzyLaTjtbxjdY/zLD9v//pP/jXJX06NZ1PNnpvNQl4CLgvHUD+FfAVSX3S9z1Y0gfCTdLHgLcj4l+A75EFxDbx2SNdy0eA29LuoI1k+y3Hkp2RtDXTyIbKvwcC+HpEvCLpEICIWCFpJPBLsn2rZwET0jC3N9l/WW3dduR7KagEzEzrN+usbiEbYbf6KvDPkq4k+6N+0Taur63f/2EF33sBcIek3YDFW37viPhB2v5+ShZYDcCzkpRqPa2NdQ5NNW0GNgBf2cbP49tcmJnZe7z7yMzMcg4FMzPLORTMzCznUDAzs5xDwczMcg4FMzPLORTMzCz3/wHkrtFkRYGHZwAAAABJRU5ErkJggg==\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "smokers_alive = data[(data[\"Smoker\"] == \"Yes\") & (data[\"Status\"] == \"Alive\")].shape[0]\n", + "non_smokers_alive = data[(data[\"Smoker\"] == \"No\") & (data[\"Status\"] == \"Alive\")].shape[0]\n", + "smokers_dead = data[data[\"Smoker\"] == \"Yes\"].shape[0] - smokers_alive\n", + "non_smokers_dead = data[data[\"Smoker\"] == \"No\"].shape[0] - non_smokers_alive\n", + "\n", + "x = np.arange(2)\n", + "width = 0.2\n", + "plt.bar(x - width, height=[smokers_alive, non_smokers_alive], width=width, color='green')\n", + "plt.bar(x, [smokers_dead, non_smokers_dead], width, color='red')\n", + "plt.xticks(x, ['Smokers', 'Non-smokers'])\n", + "plt.ylabel(\"Number of people\")\n", + "plt.legend([\"Alive\", \"Dead\"])\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "According to our data, smoking does not appear to increase mortality: the mortality rate is even somewhat higher in the non-smoking group (about 31%) than in the smoking group (about 24%). Perhaps we did not consider all the relevant factors. Therefore, we will examine the influence of age. We distinguish four age groups:\n", + "- __young:__ under 34 years old,\n", + "- __middle-aged:__ from 34 to under 55 years old,\n", + "- __elder adults:__ from 55 to 65 years old,\n", + "- __seniors:__ above 65 years old." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 2. 
Age influence" + ] + }, + { + "cell_type": "code", + "execution_count": 105, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "The mortality rate is\n", + " 0.027932960893854747 in young smoking group,\n", + " 0.9726027397260274 in young non-smoking group,\n", + " 0.17154811715481172 in middle-aged smoking group,\n", + " 0.9045226130653267 in middle-aged non-smoking group,\n", + " 0.4434782608695652 in elder adults smoking group,\n", + " 0.6721311475409836 in elder adults non-smoking group,\n", + " 0.8571428571428571 in senior smoking group and\n", + " 0.140625 in senior non-smoking group,\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:3: UserWarning: Boolean Series key will be reindexed to match DataFrame index.\n", + " This is separate from the ipykernel package so we can avoid doing imports until\n", + "/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:6: UserWarning: Boolean Series key will be reindexed to match DataFrame index.\n", + " \n", + "/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:7: UserWarning: Boolean Series key will be reindexed to match DataFrame index.\n", + " import sys\n", + "/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:8: UserWarning: Boolean Series key will be reindexed to match DataFrame index.\n", + " \n", + "/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:9: UserWarning: Boolean Series key will be reindexed to match DataFrame index.\n", + " if __name__ == '__main__':\n", + "/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:10: UserWarning: Boolean Series key will be reindexed to match DataFrame index.\n", + " # Remove the CWD from sys.path while we load stuff.\n", + "/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:11: UserWarning: Boolean Series key will be reindexed to match DataFrame index.\n", + " # 
This is added back by InteractiveShellApp.init_path()\n", + "/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:12: UserWarning: Boolean Series key will be reindexed to match DataFrame index.\n", + " if sys.path[0] == '':\n", + "/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:13: UserWarning: Boolean Series key will be reindexed to match DataFrame index.\n", + " del sys.path[0]\n", + "/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:15: UserWarning: Boolean Series key will be reindexed to match DataFrame index.\n", + " from ipykernel import kernelapp as app\n", + "/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:16: UserWarning: Boolean Series key will be reindexed to match DataFrame index.\n", + " app.launch_new_instance()\n", + "/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:17: UserWarning: Boolean Series key will be reindexed to match DataFrame index.\n", + "/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:18: UserWarning: Boolean Series key will be reindexed to match DataFrame index.\n", + "/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:19: UserWarning: Boolean Series key will be reindexed to match DataFrame index.\n", + "/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:20: UserWarning: Boolean Series key will be reindexed to match DataFrame index.\n", + "/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:21: UserWarning: Boolean Series key will be reindexed to match DataFrame index.\n", + "/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:22: UserWarning: Boolean Series key will be reindexed to match DataFrame index.\n" + ] + } + ], + "source": [ + "young, middle_aged, elder_adults, seniors = data[data.Age < 34], data[(34 <= data.Age) & (data.Age < 55)], data[(55 <= data.Age) & (data.Age <= 65)], data[data.Age > 65]\n", + "\n", + "smokers_alive = data[data[\"Smoker\"] == \"Yes\"][data[\"Status\"] == \"Alive\"].shape[0]\n", + "\n", + "\n", + 
"young_smokers = young[young[\"Smoker\"] == \"Yes\"]\n", + "young_non_smokers = young[young[\"Smoker\"] == \"No\"]\n", + "middle_aged_smokers = middle_aged[middle_aged[\"Smoker\"] == \"Yes\"]\n", + "middle_aged_non_smokers = middle_aged[middle_aged[\"Smoker\"] == \"No\"]\n", + "elder_adults_smokers = elder_adults[elder_adults[\"Smoker\"] == \"Yes\"]\n", + "elder_adults_non_smokers = elder_adults[elder_adults[\"Smoker\"] == \"No\"]\n", + "seniors_smokers = seniors[seniors[\"Smoker\"] == \"Yes\"]\n", + "seniors_non_smokers = seniors[seniors[\"Smoker\"] == \"No\"]\n", + "\n", + "young_smokers_alive = young_smokers[young_smokers[\"Status\"] == \"Alive\"]\n", + "young_non_smokers_alive = young_non_smokers[young_non_smokers[\"Status\"] == \"Alive\"]\n", + "middle_aged_smokers_alive = middle_aged_smokers[middle_aged_smokers[\"Status\"] == \"Alive\"]\n", + "middle_aged_non_smokers_alive = middle_aged_non_smokers[middle_aged_non_smokers[\"Status\"] == \"Alive\"]\n", + "elder_adults_smokers_alive = elder_adults_smokers[elder_adults_smokers[\"Status\"] == \"Alive\"]\n", + "elder_adults_non_smokers_alive = elder_adults_non_smokers[elder_adults_non_smokers[\"Status\"] == \"Alive\"]\n", + "seniors_smokers_alive = seniors_smokers[seniors_smokers[\"Status\"] == \"Alive\"]\n", + "seniors_non_smokers_alive = seniors_non_smokers[seniors_non_smokers[\"Status\"] == \"Alive\"]\n", + "\n", + "print(\"The mortality rate is\\n\",\n", + " (young_smokers.shape[0] - young_smokers_alive.shape[0]) / young_smokers.shape[0],\n", + " \"in young smoking group,\\n\",\n", + " (young_non_smokers.shape[0] - young_non_smokers_alive.shape[0]) / young_non_smokers.shape[0],\n", + " \"in young non-smoking group,\\n\",\n", + " (middle_aged_smokers.shape[0] - middle_aged_smokers_alive.shape[0]) / middle_aged_smokers.shape[0],\n", + " \"in middle-aged smoking group,\\n\",\n", + " (middle_aged_non_smokers.shape[0] - middle_aged_non_smokers_alive.shape[0]) / middle_aged_non_smokers.shape[0],\n", + " \"in middle-aged non-smoking group,\\n\",\n", + " (elder_adults_smokers.shape[0] - elder_adults_smokers_alive.shape[0]) / elder_adults_smokers.shape[0],\n", + " \"in elder adults 
smoking group,\\n\",\n", + " (elder_adults_non_smokers.shape[0]- elder_adults_non_smokers_alive.shape[0]) / elder_adults_non_smokers.shape[0],\n", + " \"in elder adults non-smoking group,\\n\",\n", + " (seniors_smokers.shape[0]- seniors_smokers_alive.shape[0]) / seniors_smokers.shape[0],\n", + " \"in senior smoking group and\\n\",\n", + " (seniors_non_smokers.shape[0]- seniors_non_smokers_alive.shape[0]) / seniors_non_smokers.shape[0],\n", + " \"in senior non-smoking group,\",)\n" + ] + }, + { + "cell_type": "code", + "execution_count": 124, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYUAAAEbCAYAAAA1T5h7AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAAiIElEQVR4nO3de5gcZZn38e+PEDZBziHhzZLgBN4gIMEYJgqbCCoKiBg2ogZeNxyEDbiwsrC6IrIcPCy6rLoCigZhUZEJCiZBRZeIhNMryARiDoZDiGFJjGEAgQQh5nDvH/VMpzP0zNQcuqtn5ve5rr6m+uk63JmqzN1V9Tx3KSIwMzMD2K7oAMzMrH44KZiZWYmTgpmZlTgpmJlZiZOCmZmVOCmYmVnJ9kUH0BN77rlnNDQ0FB2GmVmfsmDBguciYnilz/p0UmhoaKC5ubnoMMzM+hRJT7f3mS8fmZlZiZOCmZmVOCmYmVlJn76nYGaW18aNG1m1ahWvvfZa0aHUzJAhQxg1ahSDBw/OvYyTgpkNCKtWrWLnnXemoaEBSUWHU3URwfPPP8+qVasYM2ZM7uV8+cjMBoTXXnuNYcOGDYiEACCJYcOGdfnMyEnBzAaMgZIQWnXn3+ukYGZWQ3PmzEESjz32GAArV67k4IMPBqC5uZlPfOITRYbnewpmfUJPv+H6YVqvo8t796whLs33O25qamLy5Mk0NTVx+eWXb/NZY2MjjY2NvRpXV/lMwcysRtavX8/999/P9ddfz6xZs173+fz58zn++OPZsmULDQ0NvPjii6XPxo4dy9q1a2lpaeHEE09k4sSJTJw4kQceeKBXY3RSMDOrkblz53Lsscey//77M2zYMBYsWFBxvu22244TTjiB2bNnA/DQQw/xxje+kb322ovzzjuP888/n4cffpjbbruNM888s1djdFIwM6uRpqYmTjrpJABOOukkmpqa2p132rRp3HLLLQDMmjWLadOmAfDLX/6Sc889l/HjxzNlyhRefvll1q9f32sx+p6CmVkNvPDCC/zqV79i8eLFSGLz5s1I4pxzzqk4/+GHH87y5ctpaWlhzpw5XHzxxQBs2bKFBx98kCFDhlQlTp8pmJnVwK233sr06dN5+umnWblyJc888wxjxozhmWeeqTi/JKZOncoFF1zAgQceyLBhwwA4+uijufrqq0vzLVy4sFfjdFIwM6uBpqYmpk6duk3biSeeyBVXXNHuMtOmTeOmm24qXToCuOqqq2hubuaQQw7hoIMO4lvf+lavxqnow13VGhsbw89TsAHBXVJ7bNmyZRx44IFFh1Fzlf7dkhZERMW+rz5TMDOzkqolBUmjJd0t6XeSlko6L7XvIWmepCfTz91TuyRdJWm5pEWSJlQrNjM
zq6yaZwqbgH+OiIOAw4BzJB0EXAjcFRFjgbvSe4D3AWPTawZwbRVjMzOzCqqWFCJiTUQ8kqbXAcuAvYETgO+m2b4L/G2aPgH4XmQeBHaTNLJa8ZmZ2evV5J6CpAbgrcBDwF4RsSZ99EdgrzS9N1DeN2tVajMzsxqpelKQtBNwG/BPEfFy+WeRdX3qUrcISTMkNUtqbmlp6cVIzcysqiOaJQ0mSwg/iIgfp+a1kkZGxJp0eejZ1L4aGF22+KjUto2ImAnMhKxLatWCL1hPKjjmrdZoZrU1aNAgxo0bx8aNG9l+++055ZRTOP/889luu55/P7/sssvYaaed+OQnP9mj9VSz95GA64FlEfHVso9uB05N06cCc8vaT0m9kA4DXiq7zGRm1ruk3n3lMHToUBYuXMjSpUuZN28eP//5z19XPrto1bx8NAmYDrxb0sL0Og74EvBeSU8C70nvAe4AVgDLgeuAf6hibGZmhRoxYgQzZ87kmmuuISLYvHkzn/rUp5g4cSKHHHII3/72t4Gs3PZRRx3FhAkTGDduHHPnzi2t44tf/CL7778/kydP5vHHH++VuKp2+Sgi7gfaS59HVZg/gMqVoczM+qF9992XzZs38+yzzzJ37lx23XVXHn74YTZs2MCkSZM4+uijGT16NLNnz2aXXXbhueee47DDDmPKlCk88sgjzJo1i4ULF7Jp0yYmTJjAoYce2uOYXCXVzKwO3HnnnSxatIhbb70VgJdeeoknn3ySUaNGcdFFF3Hvvfey3XbbsXr1atauXct9993H1KlT2XHHHQGYMmVKr8ThpGBmVpAVK1YwaNAgRowYQURw9dVXc8wxx2wzz4033khLSwsLFixg8ODBNDQ08Nprr1UtJtc+MjMrQEtLC2effTbnnnsukjjmmGO49tpr2bhxIwBPPPEEr7zyCi+99BIjRoxg8ODB3H333Tz99NMAHHHEEcyZM4dXX32VdevW8ZOf/KRX4vKZgplZjbz66quMHz++1CV1+vTpXHDBBQCceeaZrFy5kgkTJhARDB8+nDlz5vDRj36UD3zgA4wbN47GxkYOOOAAACZMmMC0adN4y1vewogRI5g4cWKvxOjS2XXK4xRsGy6d3WMunb2VS2ebmVkuTgpmZlbipGBmZiVOCmY2YPTle6jd0Z1/r5OCmQ0IQ4YM4fnnnx8wiSEieP755xkyZEiXlnOXVDMbEEaNGsWqVasYSCX3hwwZwqhRo7q0jJOCmQ0IgwcPZsyYMUWHUfd8+cjMzEqcFMzMrMSXj8ysZzzaul+p5pPXbpD0rKQlZW23lD1wZ6Wkham9QdKrZZ99q1pxmZlZ+6p5pnAjcA3wvdaGiJjWOi3pK8BLZfM/FRHjqxiPmZl1oppPXrtXUkOlz9Lzmz8CvLta2zczs64r6kbzO4C1EfFkWdsYSY9KukfSOwqKy8xsQCvqRvPJQFPZ+zXAPhHxvKRDgTmS3hwRL7ddUNIMYAbAPvvsU5NgzcwGiponBUnbAx8ESk+YjogNwIY0vUDSU8D+wOselhARM4GZkD1PoUex+JkFZmbbKOLy0XuAxyJiVWuDpOGSBqXpfYGxwIoCYjMzG9Cq2SW1Cfg18CZJqySdkT46iW0vHQEcASxKXVRvBc6OiBeqFZuZmVVWzd5HJ7fTflqFttuA26oVi5mZ5eMyF2ZmVuKkYGZmJU4KZmZW4qRgZmYlTgpmZlbi0tlmZt3UHwfA+kzBzMxKnBTMzKzEScHMzEqcFMzMrMRJwczMSpwUzMysxEnBzMxKnBTMzKyk06QgaX9Jd0lakt4fIuni6odmZma1ludM4TrgM8BGgIhYRPagHDMz62fyJIUdI+I3bdo2dbaQpBskPdt6hpHaLpO0WtLC9Dqu7LPPSFou6XFJx+T/J5iZWW/JkxSek7QfEACSPgSsybHcjcCxFdq/FhHj0+uOtM6DyM4+3pyW+WbrM5vNzKx28hTEOweYCRwgaTXwe+DvOlsoIu6V1JAzjhO
AWRGxAfi9pOXA28ie8WxmZjXS6ZlCRKyIiPcAw4EDImJyRKzswTbPlbQoXV7aPbXtDTxTNs+q1PY6kmZIapbU3NLS0oMwzMysrXbPFCRd0E47ABHx1W5s71rg82SXoj4PfAX4WFdWEBEzyc5caGxsrM/as2ZmfVRHl4927u2NRcTa1mlJ1wE/TW9XA6PLZh2V2szMrIbaTQoRcXlvb0zSyIhovUk9FWjtmXQ7cLOkrwJ/DYwF2vZ4MjOzKuv0RrOkfYGvA4eRXfb5NXB+RKzoZLkm4J3AnpJWAZcC75Q0Pq1nJXAWQEQslfRD4Hdk3V3PiYjN3fsnmZlZd+XpfXQz8A2yb/aQdR1tAt7e0UIRcXKF5us7mP+LwBdzxGNmZlWSd/Da9yNiU3rdBAypdmBmZlZ7ec4Ufi7pQmAW2WWfacAdkvYAiIgXqhifmZnVUJ6k8JH086w27SeRJYl9ezUiMzMrTKdJISLG1CIQMzMrXp7eR4OBjwNHpKb5wLcjYmMV4zIzswLkuXx0LTAY+GZ6Pz21nVmtoMzMrBh5ksLEiHhL2ftfSfpttQIyM7Pi5OmSujmVzgZKg9k8sMzMrB/Kc6bwKeBuSSsAAW8ETq9qVGZmVog8vY/ukjQWeFNqejw998DMzPqZTi8fSdqR7GzhH9PzmfeRdHzVIzMzs5rLc0/hv4C/AIen96uBL1QtIjMzK0yepLBfRPw7sBEgIv5Mdm/BzMz6mTxJ4S+ShpKVtCD1RPI9BTOzfihP76NLgV8AoyX9AJgEnFbNoMzMrBh5eh/Nk/QI2UN2BJwXEc91tpykG4DjgWcj4uDUdiXwAbJ7FE8Bp0fEi5IagGXA42nxByPi7G78e8zMrAfyXD4COBI4CngX8I6cy9wIHNumbR5wcEQcAjwBfKbss6ciYnx6OSGYmRUgT5fUbwJnA4vJnql8lqRvdLZcRNwLvNCm7c6I2JTePgiM6nLEZmZWNXnuKbwbODAiWm80fxdY2gvb/hhwS9n7MZIeBV4GLo6I+yotJGkGMANgn3326YUwzMysVZ7LR8uB8r++o1Nbt0n6LLAJ+EFqWgPsExFvBS4Abpa0S6VlI2JmRDRGROPw4cN7EoaZmbWR50xhZ2CZpN+QdUt9G9As6XaAiJjSlQ1KOo3sBvRRrWcfqWzGhjS9QNJTwP5Ac1fWbWZmPZMnKVzSWxuTdCzwL8CRaRBca/tw4IWI2JyqsI4FVvTWds3MLJ88XVLv6c6KJTUB7wT2lLSKbLzDZ4C/AuZJgq1dT48APidpI7AFODsiXqi4YjMzq5o8ZwrdEhEnV2i+vp15bwNuq1YsZmaWT95xCmZmNgC0mxQk3ZV+frl24ZiZWZE6unw0UtLfAFMkzaJNZdSIeKSqkZmZWc11lBQuAf6VbNTxV9t8FmSD2szMrB9pNylExK3ArZL+NSI+X8OYzMysIHm6pH5e0hSybqMA8yPip9UNy8zMipCnIN4VwHnA79LrPEn/Vu3AzMys9vKMU3g/MD4itkCpIN6jwEXVDMzMzGov7ziF3cqmd61CHGZmVgfynClcATwq6W6ybqlHABdWNSozMytEnhvNTZLmAxNT06cj4o9VjcrMzAqRq/ZRRKwBbq9yLGZmVjDXPjIzsxInBTMzK+kwKUgaJOmxWgVjZmbF6jApRMRm4HFJ+3Q0X3sk3SDpWUlLytr2kDRP0pPp5+6pXZKukrRc0iJJE7qzTTMz6748l492B5ZKukvS7a2vnOu/ETi2TduFwF0RMRa4i63dW99H9hjOscAM4Nqc2zAzs16Sp/fRv3Z35RFxr6SGNs0nkD2mE+C7wHzg06n9exERwIOSdpM0MvV8MjOzGsj1jGZJbwTGRsQvJe0IDOrBNvcq+0P/R2CvNL038EzZfKtSm5OCmVmN5CmI9/fArcC3U9PewJze2Hg6K4iuLCNphqRmSc0tLS29EYaZmSV57imcA0wCXgaIiCeBET3Y5lpJIwHSz2dT+2pgdNl8o1L
bNiJiZkQ0RkTj8OHDexCGmZm1lScpbIiIv7S+kbQ9Xfx238btwKlp+lRgbln7KakX0mHAS76fYGZWW3luNN8j6SJgqKT3Av8A/CTPyiU1kd1U3lPSKuBS4EvADyWdATwNfCTNfgdwHLAc+DNwehf+HWZm1gvyJIULgTOAxcBZZH+8v5Nn5RFxcjsfHVVh3iC7VGVmZgXJ0/toS3qwzkNkl40eT3/Azcysn+k0KUh6P/At4Cmy5ymMkXRWRPy82sGZmVlt5bl89BXgXRGxHEDSfsDPACcFM7N+Jk/vo3WtCSFZAayrUjxmZlagds8UJH0wTTZLugP4Idk9hQ8DD9cgNjMzq7GOLh99oGx6LXBkmm4BhlYtIjMzK0y7SSEiPE7AzGyAydP7aAzwj0BD+fwRMaV6YZmZWRHy9D6aA1xPNop5S1WjMTOzQuVJCq9FxFVVj8TMzAqXJyl8XdKlwJ3AhtbGiHikalGZmVkh8iSFccB04N1svXwU6b2ZmfUjeZLCh4F9y8tnm5n1CVLPlh+AZd7yjGheAuxW5TjMzKwO5DlT2A14TNLDbHtPwV1Szcz6mTxJ4dKqR2FmZnUhz/MU7unNDUp6E3BLWdO+wCVkZyR/T1ZGA+CiiLijN7dtZmYdyzOieR1bn8m8AzAYeCUidunOBiPicWB8WvcgYDUwm+zxm1+LiP/oznrNzKzn8pwp7Nw6LUnACcBhvbT9o4CnIuJp9bSXgJmZ9Vie3kclkZkDHNNL2z8JaCp7f66kRZJukLR7pQUkzZDULKm5paWl0ixmZtZNnSYFSR8se31I0peA13q6YUk7AFOAH6Wma4H9yC4trSF74tvrRMTMiGiMiMbhw4f3NAwzMyuTp/dR+XMVNgEryS4h9dT7gEciYi1A608ASdcBP+2FbZiZWRfkuadQrecqnEzZpSNJIyNiTXo7lWzQnJmZ1VBHj+O8pIPlIiI+392NSnoD8F7grLLmf5c0nqyn08o2n5mZWQ10dKbwSoW2NwBnAMOAbieFiHglraO8bXp312dmZr2jo8dxlm70StoZOI9sLMEs2rkJbGZmfVuH9xQk7QFcAHwU+C4wISL+VIvAzMys9jq6p3Al8EFgJjAuItbXLCozMytER+MU/hn4a+Bi4A+SXk6vdZJerk14ZmZWSx3dU+jSaGczM+v78gxeM9uGLu9+naq4dOA9ycqsL/HZgJmZlTgpmJlZiZOCmZmVOCmYmVmJk4KZmZU4KZiZWYmTgpmZlTgpmJlZiZOCmZmVFDaiWdJKYB2wGdgUEY2pKustQAPZg3Y+4qqsZma1U/SZwrsiYnxENKb3FwJ3RcRY4K703szMaqTopNDWCWTPbSD9/NviQrGqkHr2MrOqKjIpBHCnpAWSZqS2vSJiTZr+I7BX24UkzZDULKm5paWlVrGamQ0IRVZJnRwRqyWNAOZJeqz8w4gISa8rqRkRM8ke/ENjY6NLbpqZ9aLCzhQiYnX6+SwwG3gbsFbSSID089mi4jMzG4gKSQqS3iBp59Zp4GhgCXA7cGqa7VRgbhHxmZkNVEVdPtoLmK3sxuH2wM0R8QtJDwM/lHQG8DTwkYLiMzMbkApJChGxAnhLhfbngaNqH5GZmUH9dUk1M7MCOSmYmVlJkV1SzepLTwfHhXtIW9/npNAf+Y+bmXWTk4KZ1TVd3v0vOf5603W+p2BmZiVOCmZmVuKkYGZmJU4KZmZW4qRgZmYlTgpmZlbipGBmZiVOCmZmVuKkYGZmJU4KZmZWUvOkIGm0pLsl/U7SUknnpfbLJK2WtDC9jqt1bGZmA10RtY82Af8cEY+kR3IukDQvffa1iPiPAmIyMzMKSAoRsQZYk6bXSVoG7F3rOMzMClWn1YwLvacgqQF4K/BQajpX0iJJN0javbjIzMwGpsKSgqSdgNuAf4qIl4Frgf2A8WRnEl9pZ7kZkpolNbe0tNQqXDOzAaGQpCBpMFlC+EFE/BggItZGxOaI2AJcB7yt0rIRMTM
iGiOicfjw4bUL2sxsACii95GA64FlEfHVsvaRZbNNBZbUOjYzs4GuiN5Hk4DpwGJJC1PbRcDJksaTPSxpJXBWAbGZmQ1oRfQ+uh+odNv9jlrHYmZm2/KIZjMzKyni8pGZ1Rld3v0+89XpLW9F8ZmCmZmVOCmYmVmJk4KZmZU4KZiZWYmTgpmZlTgpmJlZiZOCmZmVOCmYmVmJk4KZmZU4KZiZWYnLXFi/4nINZj3jpGBWI05Y1hc4KXRXnT5028ysJ3xPwczMSuouKUg6VtLjkpZLurDoeMzMBpK6SgqSBgHfAN4HHET2iM6Dio3KzGzgqKukALwNWB4RKyLiL8As4ISCYzIzGzDq7Ubz3sAzZe9XAW8vn0HSDGBGerte0uM1im0bOW4z7wk81/4KenijugOOrXscW/c4tu4pOLY3tvdBvSWFTkXETGBm0XF0RlJzRDQWHUcljq17HFv3OLbuKSq2ert8tBoYXfZ+VGozM7MaqLek8DAwVtIYSTsAJwG3FxyTmdmAUVeXjyJik6Rzgf8GBgE3RMTSgsPqrnq+xOXYusexdY9j655CYlN4ZK2ZmSX1dvnIzMwK5KRgZmYlTgodUOZ+Se8ra/uwpF8UGFNIuqns/faSWiT9NL2f0l55EEnr22m/UdKH0vR8Sb3WDU7SZkkLy14XdrQdSadJuqa3tl+23s9KWippUYrj7Z0v9bp1NEq6qrdjK1u/j7eex+vjrYfq6kZzvYmIkHQ28CNJd5P9vv4NOLbAsF4BDpY0NCJeBd5LWbfdiLid+uqx9WpEjK/WyiVtHxGbOpnncOB4YEJEbJC0J7BDV7cVEc1Ac2/G1mb9Pt56zsdbD/lMoRMRsQT4CfBp4BLgJuAr6RvAg5IOAZB0maRPti4naYmkhvRaJum69M3hTklD0zwTy75JXClpSc6w7gDen6ZPBprKtlv65pO69v5a0mJJXyibR5KuUVZ48JfAiEobkXR0Wv4RST+StFPO+LpE0umSnpD0G2BSWftwSbdJeji9JqX2yyR9X9IDwPdzbGIk8FxEbACIiOci4g+SDpV0j6QFkv5b0si0/vmSvizpNymud6T2d5Z9Q95D0px2joNSbJLenNazMM07tqNAfbz5eKvl8VaJk0I+lwP/j6xQ3/8BHo2IQ4CLgO/lWH4s8I2IeDPwInBiav8v4Kz0zWZzF+KZBZwkaQhwCPBQO/N9Hbg2IsYBa8rapwJvIis6eArwN20XVPbt5mLgPRExgewbywVdiLHVUG17Oj+tzXZGkv1+JwGTU0zl8X8tIiaS/c6+U/bZQSm2k3PEcCcwOv2H+6akIyUNBq4GPhQRhwI3AF8sW2b7iHgb8E/ApRXWeTntHwflsZ0NfD3t40ay0i2d8fHm462tah5v2/Dloxwi4hVJtwDryb4pnZjafyVpmKRdOlnF7yNiYZpeADRI2g3YOSJ+ndpvJjvlzBPPIkkNKZY7Oph1Elv/IHwf+HKaPgJoiojNwB8k/arCsoeRHWwPKKuxsgPw6wrzdaaz0/m3A/MjogUg/Z73T5+9BzhIW2u87FL27fH2dDmjUxGxXtKhwDuAdwG3AF8ADgbmpfUPYts/ZD9OPxcADRVWO5n2j4Py2H4NfFbSKODHEfFkjnh9vPl4a6tqx1tbTgr5bUmv9mxi2zOvIWXTG8qmNwNDeyGe24H/AN4JDOtgvu4ORBEwr+03I2U3zL6d3l6SrilXy3bAYRHxWpsYILvWnVv6gzQfmC9pMXAOsDQiDm9nkdZ9tpmu/z8pxRYRN0t6iOzyyx2SzoqISn8U2/Lxho+3nHrjeCvx5aOuuw/4KGTX/MiuHb4MrAQmpPYJwJiOVhIRLwLrtLVXwkldjOMG4PKIWNzBPA+UrfejZe33AtMkDUqn0++qsOyDwCRJ/xdA0hsk7R8RD0XE+PTqjf+gDwFHpm8+g4EPl312J/CPrW8kje/OBiS9qc211fHAMmC4spuCSBos6c1dWG17x0Hbbe8LrIi
Iq4C5ZJdfusLHm483qN3x5jOFbrgMuEHSIuDPwKmp/TbgFElLyQ68J3Ks6wzgOklbgHuAl/IGERGrgM66q50H3Czp02QHSKvZwLuB3wH/Q4XT9IhokXQa0CTpr1LzxeT7d5UbKmlh2ftfRESpC2NErJF0WYrhRaB83k8A30i/6+3J/ric3cXtA+wEXJ0uoWwClpOVX58JXCVp17T+/wTyllW5jMrHQVsfAaZL2gj8kaw3UVe0tx0fb5X5eOvZ8eYyF0WStFNErE/TFwIjI+K8gsOyfsrHm+XhM4VivV/SZ8j2w9PAacWGY/2cjzfrlM8UzMysxDeaq0RlA0/6KkkrU//xaqy70/ICygZiLUnT4yUdV41Y8uoP+7Se+XirD04KdUiSL+u93nigz/4n9T7tc8bTh4+3nnBSSFIXuJ9J+q2ykgHT0jeXK5SNjGyWNEHZ8PSnlNWoaR3Cf2VaZrHajKBM80yU9Kik/dTxUPf/lNQMnKesENqSFM+9OeKfk9a5VNKM1HaG0nB+ZWUPWssRtDecf5iysghLJX0H8jxbvEvxnK7K5QVuVCqQlt6vb7OuHYDPkXVrXJj2zZHaOmr1UUk7V4ihT+/TtI6KZSvSN9kHlZUymC1p97Jtvq5kQm/z8fb6460LsdblPi2JCL+y+yonAteVvd+VrC/4x9P7rwGLgJ2B4cDasuXmkY1Q3Iusy91IskE+PyUb0r8A2AcYDPx/YHhadhrZ0+UgG+jyzbLtLwb2TtO75Yh/j/RzKLAE2DvFv0fa7n3ANWmem4HJaXofYFmavopsgBBkg18C2LObv89K8fxP+t3tQNanvTWeG8mG/7cuuz79bACWpOnTWudP738CTErTO5GVCehX+7Tsd7AJGJ/e/xD4uxT3kantc8B/lm3zK2n6OOCXVfr/4uOt+7+7utynrS+f0m61mKzw2JeBn0bEfcpGMt5e9vlOEbGObBDQBmX9kCezdQj/Wkn3ABOBl4EDyfomHx1ZQayD6Xio+y1l0w8AN0r6IVuHwHfkE5KmpunRwHTgnoh4AUDSj+h8OP8RwAcBIuJnkv6UY7tdiWd+VC4v0B0PAF+V9AOy4fyVarz09X3a6vexbdmK/ciSyj2p7bvAj8rm76xkQm/w8dYz9bhPAXdJLYmIJ5SNDD0O+IKku9JHrcPPt7Bt+YAtdP77W0NWfuCtwB/ITo87GupePlz9bGWjT98PLJB0aEQ8X2khZSMc3wMcHhF/ljQfeIzsD1glHQ3n77EO4jmonUVKJRskbUeOMsMR8SVJPyPbXw9IOiYiHmszT5/dp220LVuxW875SyUTJP1Xa8wR0aNr5T7eKh9vXVRX+7Sc7ykkkv4a+HNE3ARcSSohkMN9bB3CP5zs289v0mcvkv0BuCIduI+Tc6i7pP0iG+J/CdBC9u2nPbsCf0r/IQ4gKy72BrLh/Lsru8l5Ytn87Q3nv5esOifKHvSye67fQL54htJ+eYGVwKFpegrZ5Ye21pFd5mmNeb+IWBwRXwYeBg5ou0Af36cdeQn4U9m15elkI5TbFRGnR1Yqojf+ePh4q3C89VDR+7TEZwpbjQOuVFYCYCPwceDWHMvNBg4Hfkt2TfRfIuKP6eAkItZKOh74OfAx4EPkG+p+pbL6KQLuSutvzy+AsyUtI/sj9SDZg1D+jeyP2Qtk35xayxq0N5z/crIyA0vJrpP/T45/f9541pAN1a9UXuA6YK6k36ZlKxUfuxu4UFkJgyuAyZLeRfbtfinZ77etvrxPO3Mq8C1JOwIrgNN7sK6u8vFW+XjrqSL3aYkHr/VjSmUN0je32WQ3QGcXHZf1Tz7e+gdfPurfLkvfdJYAvwfmFBqN9Xc+3voBnymYmVmJzxTMzKzESWEAUh3X8Knn2OqZf2/9T1H71EnBukR1XMOnnmOrZ/699T892adOCn2A6ruGz/11HFuu+kJFqPN92qfrMhUVW3/Yp4BrH/WFF/Vdw2d6Hce2W9H7ro/u01y/N+q
4hk8RsfWHfRrh2kd9RT3X8Pkt8Pk6ja0r9YVqrZ73aV+vy1RUbP1inzop9AFR3zV8PgscRTYCuN5i60p9oZqq833aZ+syFRlbf9mnvqfQB6i+a/i8CAyr09h6Ul+oqup8n/blukyFxdZf9qnPFPqGeq7h8yTwnTqNraf1haqpnvdpX67L1JlqxtYv9qlHNJuZWYkvH5mZWYmTgpmZlTgpmFlVqY5LcNRzbEVxUjCzuqU6LsFRz7H1hJOCmW1D9V2uwWVVqq1aw8z98suvvvmivss1uKxKlV/98vTHzHqknss1uKxKlTkpmNk2or7LNbisSpX5noKZbUP1Xa7hRVxWpap8pmBmbdVzuQaXVakyl7kwM7MSXz4yM7MSJwUzMytxUjAzsxInBTMzK3FSMDOzEicFMzMrcVIwM7MSJwUzMyv5X4T5gObNYbRwAAAAAElFTkSuQmCC\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "x = np.arange(8)\n", + "width = 0.4\n", + "plt.bar(x-width, height=[young_smokers_alive.shape[0], middle_aged_smokers_alive.shape[0], elder_adults_smokers_alive.shape[0], seniors_smokers_alive.shape[0], young_non_smokers_alive.shape[0], middle_aged_non_smokers_alive.shape[0], elder_adults_non_smokers_alive.shape[0], seniors_non_smokers_alive.shape[0]],width=width,color='green')\n", + "plt.bar(x, [young_smokers.shape[0]-young_smokers_alive.shape[0], middle_aged_smokers.shape[0]-middle_aged_smokers_alive.shape[0], elder_adults_smokers.shape[0]-elder_adults_smokers_alive.shape[0], seniors_smokers.shape[0]-seniors_smokers_alive.shape[0], young_non_smokers.shape[0]-young_non_smokers_alive.shape[0], middle_aged_non_smokers.shape[0]-middle_aged_non_smokers_alive.shape[0], elder_adults_non_smokers.shape[0]-elder_adults_non_smokers_alive.shape[0], seniors_non_smokers.shape[0]-seniors_non_smokers_alive.shape[0]], width, color='red')\n", + "plt.xticks(x, ['Young\\nsmokers', 'Middle-\\naged\\nsmokers', 'Elder\\nadults\\nsmokers', 'Seniors\\nsmokers', 'Young\\nnon-\\nsmokers', 'Middle-\\naged\\nnon-\\nsmokers', 'Elder\\nadults\\nnon-\\nsmokers', 'Seniors\\nnon-\\nsmokers'])\n", + "plt.ylabel(\"Number of people\")\n", + "plt.legend([\"Alive\", \"Dead\"])\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "According to our data, the mortality rate is high in all the non-smoking groups except seniors as well as in the senior smoking group. Smoking affects seniors health condition because they are more vulnerable to all deseases types than others. However, in other groups we could see the opposite effect. 
Of course, smoking does not improve health, but other causes of death, such as cardiac diseases and accidents, can be more dangerous than smoking.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 3. Logistic regression analysis" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As proposed, we will check our hypothesis with a logistic regression." + ] + }, + { + "cell_type": "code", + "execution_count": 129, + "metadata": {}, + "outputs": [], + "source": [ + "from sklearn.linear_model import LogisticRegression\n", + "from sklearn.model_selection import train_test_split\n", + "from sklearn.metrics import classification_report" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We first need to convert our data to numerical values." + ] + }, + { + "cell_type": "code", + "execution_count": 126, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "Smoker int64\n", + "Status int64\n", + "Age float64\n", + "dtype: object" + ] + }, + "execution_count": 126, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "numeric_data = data.copy()\n", + "numeric_data.loc[numeric_data[\"Smoker\"] == \"Yes\", \"Smoker\"] = 1\n", + "numeric_data.loc[numeric_data[\"Smoker\"] == \"No\", \"Smoker\"] = 0\n", + "numeric_data.loc[numeric_data[\"Status\"] == \"Alive\", \"Status\"] = 1\n", + "numeric_data.loc[numeric_data[\"Status\"] == \"Dead\", \"Status\"] = 0\n", + "numeric_data.dtypes" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We take 90% of our data to train the model and 10% to test it." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 152, + "metadata": { + "scrolled": false + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " precision recall f1-score support\n", + "\n", + " 0 0.71 0.67 0.69 33\n", + " 1 0.89 0.91 0.90 99\n", + "\n", + "avg / total 0.85 0.85 0.85 132\n", + "\n" + ] + } + ], + "source": [ + "X_train, X_test, y_train, y_test = train_test_split(numeric_data.drop('Status',axis=1), numeric_data['Status'], test_size=0.10, random_state=1)\n", + "model = LogisticRegression()\n", + "model.fit(X_train,y_train)\n", + "predictions = model.predict(X_test)\n", + "print(classification_report(y_test,predictions))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We obtained satisfactory results, which could support our hypothesis. However, this is not sufficient to conclude that there is a correlation between smoking and the mortality rate. Further analysis is needed." + ] + } + ], "metadata": { "kernelspec": { "display_name": "Python 3", @@ -16,10 +941,9 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.6.3" + "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 } - diff --git a/module3/exo3/smoking.csv b/module3/exo3/smoking.csv new file mode 100644 index 0000000000000000000000000000000000000000..6c47065bc7aa53c0c92eef463cf16384eb028aaa --- /dev/null +++ b/module3/exo3/smoking.csv @@ -0,0 +1,1315 @@ +"Smoker","Status","Age" +"Yes","Alive",21 +"Yes","Alive",19.3 +"No","Dead",57.5 +"No","Alive",47.1 +"Yes","Alive",81.4 +"No","Alive",36.8 +"No","Alive",23.8 +"Yes","Dead",57.5 +"Yes","Alive",24.8 +"Yes","Alive",49.5 +"Yes","Alive",30 +"No","Dead",66 +"Yes","Alive",49.2 +"No","Alive",58.4 +"No","Dead",60.6 +"No","Alive",25.1 +"No","Alive",43.5 +"No","Alive",27.1 +"No","Alive",58.3 +"Yes","Alive",65.7 +"No","Dead",73.2 +"Yes","Alive",38.3 +"No","Alive",33.4 +"Yes","Dead",62.3 +"No","Alive",18 +"No","Alive",56.2 
+"Yes","Alive",59.2 +"No","Alive",25.8 +"No","Dead",36.9 +"No","Alive",20.2 +"Yes","Alive",34.6 +"Yes","Alive",51.9 +"Yes","Alive",49.9 +"No","Alive",19.4 +"No","Alive",56.9 +"Yes","Alive",46.7 +"Yes","Alive",44.4 +"Yes","Alive",29.5 +"Yes","Dead",33 +"Yes","Alive",35.6 +"Yes","Alive",39.1 +"No","Dead",69.7 +"Yes","Alive",35.7 +"No","Dead",75.8 +"No","Alive",25.3 +"No","Dead",83 +"Yes","Dead",44.3 +"No","Alive",18.5 +"Yes","Alive",37.5 +"Yes","Alive",22.1 +"No","Alive",82.8 +"No","Alive",45 +"No","Dead",73.3 +"Yes","Alive",39 +"No","Alive",28.4 +"No","Dead",73.7 +"Yes","Alive",40.1 +"No","Alive",51.2 +"No","Alive",22.9 +"No","Alive",41.9 +"Yes","Alive",58.1 +"Yes","Alive",37.3 +"No","Alive",41.7 +"Yes","Dead",36.3 +"Yes","Dead",80.7 +"Yes","Alive",33 +"Yes","Alive",38.6 +"Yes","Alive",27.9 +"No","Alive",47.6 +"No","Dead",77.6 +"No","Dead",58.1 +"Yes","Alive",26.2 +"No","Alive",45.4 +"No","Alive",62.4 +"No","Alive",62.5 +"No","Alive",39.5 +"No","Alive",27.6 +"Yes","Alive",31.4 +"No","Dead",85 +"No","Alive",18.9 +"No","Alive",35.3 +"Yes","Alive",25.4 +"No","Dead",72.8 +"Yes","Alive",58.3 +"No","Alive",27.3 +"No","Dead",55.9 +"No","Alive",32.8 +"Yes","Dead",53.6 +"No","Alive",55.9 +"Yes","Alive",48 +"Yes","Alive",56.1 +"No","Alive",18.3 +"Yes","Alive",20.2 +"No","Alive",62.8 +"Yes","Alive",18.6 +"No","Alive",46.3 +"No","Alive",36 +"Yes","Dead",55.5 +"Yes","Alive",18.6 +"No","Dead",65.7 +"No","Dead",76.5 +"Yes","Dead",61 +"No","Alive",26.8 +"Yes","Alive",47.6 +"No","Dead",70.5 +"No","Dead",81.8 +"Yes","Alive",32.5 +"No","Alive",23 +"No","Alive",83.7 +"Yes","Dead",62.8 +"Yes","Alive",45.9 +"No","Alive",59.9 +"Yes","Dead",66.5 +"No","Alive",47.5 +"No","Dead",89.3 +"No","Alive",57.2 +"Yes","Alive",21.3 +"Yes","Alive",34 +"No","Alive",59.5 +"Yes","Alive",50.1 +"No","Alive",56.1 +"Yes","Alive",30.6 +"Yes","Dead",63.8 +"Yes","Alive",27.4 +"Yes","Alive",32.5 +"No","Alive",22.5 +"Yes","Alive",24.2 +"No","Alive",56.8 +"Yes","Alive",28.9 +"Yes","Dead",87.8 +"Yes","Alive",19.4 
+"No","Dead",88.4 +"Yes","Dead",35.7 +"No","Alive",33.7 +"No","Dead",62.3 +"Yes","Alive",63.6 +"Yes","Dead",71.7 +"No","Alive",48.5 +"No","Alive",82 +"Yes","Dead",40.8 +"Yes","Alive",31.3 +"No","Alive",25.1 +"No","Alive",24.4 +"No","Alive",32.2 +"No","Alive",53.1 +"No","Alive",47.5 +"No","Dead",26.3 +"No","Dead",66 +"Yes","Alive",41 +"No","Dead",58.3 +"No","Dead",86.8 +"Yes","Alive",49.7 +"No","Alive",21 +"No","Dead",75.8 +"No","Alive",50.5 +"No","Dead",63.5 +"Yes","Alive",33.1 +"No","Alive",30.7 +"Yes","Dead",59.4 +"No","Alive",67.2 +"Yes","Alive",20.7 +"No","Alive",41.6 +"No","Alive",37.7 +"No","Dead",89.7 +"No","Alive",45.2 +"Yes","Dead",59.9 +"No","Alive",55.4 +"Yes","Alive",44.8 +"No","Alive",18.3 +"No","Dead",76.7 +"Yes","Dead",62.3 +"Yes","Dead",48.4 +"No","Alive",82.7 +"No","Alive",27 +"No","Alive",30.9 +"No","Dead",82.9 +"No","Alive",40.5 +"Yes","Alive",56.8 +"No","Alive",26.5 +"Yes","Alive",41.9 +"Yes","Alive",29.9 +"No","Dead",75 +"No","Dead",66.3 +"No","Dead",87 +"No","Dead",79.5 +"Yes","Alive",51.7 +"No","Alive",30 +"No","Alive",78.2 +"No","Alive",80 +"Yes","Alive",62.9 +"Yes","Dead",78.3 +"Yes","Alive",49.8 +"Yes","Alive",36.5 +"No","Dead",60.1 +"Yes","Alive",62 +"No","Alive",19 +"No","Dead",87.6 +"No","Alive",24.3 +"No","Alive",22.2 +"Yes","Dead",68.4 +"No","Alive",32.5 +"Yes","Dead",89.2 +"Yes","Alive",44.5 +"Yes","Alive",43.3 +"Yes","Dead",57.4 +"Yes","Alive",45.6 +"Yes","Alive",18.1 +"Yes","Dead",63.4 +"No","Alive",67 +"Yes","Alive",55.6 +"Yes","Alive",51.9 +"No","Alive",23.3 +"Yes","Dead",57.6 +"No","Alive",38.4 +"No","Dead",35.2 +"Yes","Alive",26.2 +"No","Alive",60.3 +"Yes","Alive",48.7 +"Yes","Alive",23.7 +"Yes","Alive",46.9 +"Yes","Alive",18 +"No","Dead",65.8 +"Yes","Alive",33 +"Yes","Dead",83.1 +"No","Dead",58.4 +"Yes","Alive",23.2 +"No","Alive",66.7 +"No","Alive",58.8 +"Yes","Alive",56.7 +"Yes","Alive",21.5 +"No","Dead",78.3 +"No","Alive",51.5 +"Yes","Alive",63.5 +"Yes","Alive",57.8 +"Yes","Alive",59.5 +"Yes","Dead",47.8 +"Yes","Alive",53.9 
+"Yes","Alive",45.5 +"Yes","Alive",24.2 +"No","Alive",63.9 +"Yes","Alive",37.5 +"No","Alive",20.6 +"No","Alive",22.9 +"No","Alive",46.1 +"No","Alive",49.6 +"No","Alive",31.4 +"No","Alive",25.9 +"Yes","Alive",46.8 +"Yes","Dead",81 +"No","Dead",84.3 +"No","Alive",30.8 +"Yes","Alive",52.4 +"No","Alive",20.1 +"Yes","Dead",58.9 +"Yes","Alive",72.1 +"No","Alive",19.6 +"No","Alive",52.6 +"Yes","Alive",35 +"Yes","Dead",35.4 +"No","Dead",55.1 +"Yes","Alive",23.7 +"No","Alive",49.1 +"Yes","Alive",39.7 +"Yes","Alive",33.7 +"No","Dead",66.4 +"No","Alive",24.2 +"No","Dead",67.2 +"No","Alive",19.4 +"No","Alive",52.4 +"No","Dead",58.6 +"Yes","Alive",36.2 +"Yes","Alive",38.8 +"Yes","Alive",47.9 +"No","Alive",36.5 +"Yes","Alive",24.3 +"No","Alive",38.8 +"No","Alive",38.4 +"No","Alive",55.3 +"No","Dead",87.7 +"Yes","Dead",56.7 +"No","Alive",74.1 +"Yes","Alive",62.3 +"No","Alive",18.5 +"Yes","Dead",59.3 +"No","Alive",39.8 +"Yes","Dead",55 +"No","Alive",42.8 +"No","Alive",34.2 +"Yes","Alive",33.7 +"No","Alive",30.6 +"No","Dead",81.7 +"Yes","Alive",62 +"No","Alive",61.3 +"Yes","Alive",58.5 +"No","Alive",41.6 +"Yes","Alive",60.6 +"No","Alive",52.9 +"Yes","Alive",34 +"No","Dead",52.4 +"No","Alive",38.5 +"No","Alive",23.7 +"Yes","Alive",38.7 +"No","Alive",49.3 +"No","Alive",59.5 +"Yes","Alive",26.2 +"Yes","Dead",65.8 +"Yes","Alive",44.3 +"No","Alive",31.9 +"No","Dead",47.9 +"Yes","Alive",57.7 +"Yes","Dead",36.5 +"Yes","Alive",36.3 +"No","Dead",56.1 +"No","Alive",21.1 +"Yes","Alive",22.7 +"No","Alive",19.7 +"Yes","Dead",60.1 +"Yes","Dead",77.6 +"No","Dead",67.6 +"No","Alive",49.3 +"Yes","Alive",37 +"No","Dead",79.9 +"No","Dead",56.3 +"Yes","Alive",20.2 +"No","Alive",31.1 +"Yes","Alive",40.9 +"Yes","Dead",35.2 +"No","Alive",24.5 +"Yes","Alive",35 +"Yes","Alive",36.3 +"Yes","Dead",34.3 +"Yes","Alive",20.5 +"Yes","Alive",29 +"Yes","Dead",74.1 +"Yes","Alive",43.6 +"Yes","Alive",33 +"Yes","Dead",42.3 +"No","Dead",63.2 +"No","Alive",53.2 +"Yes","Alive",53.7 +"No","Alive",62.7 +"Yes","Alive",39 
+"Yes","Alive",39.3 +"No","Dead",47 +"No","Alive",35.8 +"No","Alive",49.4 +"No","Alive",20.7 +"No","Dead",76.7 +"Yes","Alive",31.3 +"No","Alive",20.1 +"No","Alive",56.3 +"No","Alive",51.3 +"No","Dead",85.2 +"Yes","Alive",25.2 +"Yes","Alive",20.2 +"No","Alive",58.1 +"No","Alive",49.9 +"No","Dead",79.4 +"Yes","Alive",31.6 +"No","Alive",31.6 +"No","Alive",55.4 +"No","Alive",41.6 +"No","Dead",74.6 +"No","Dead",81.3 +"No","Dead",71.4 +"Yes","Alive",56.4 +"Yes","Alive",39.7 +"Yes","Alive",59.1 +"No","Alive",20.7 +"No","Alive",89.7 +"Yes","Alive",61.8 +"Yes","Alive",26.8 +"Yes","Dead",44.3 +"Yes","Alive",36.1 +"Yes","Alive",22.1 +"No","Alive",33.5 +"Yes","Alive",44.5 +"Yes","Alive",24.1 +"No","Dead",72.5 +"Yes","Dead",57.7 +"Yes","Alive",58.7 +"Yes","Alive",40.7 +"No","Dead",75.6 +"No","Dead",74.1 +"No","Alive",37 +"Yes","Alive",31.6 +"Yes","Alive",34.7 +"Yes","Dead",38.5 +"No","Alive",22 +"Yes","Alive",39.3 +"No","Dead",61.2 +"No","Alive",37.2 +"No","Alive",25.7 +"No","Dead",88.8 +"No","Dead",65.6 +"Yes","Dead",58 +"No","Alive",20.4 +"Yes","Alive",44.4 +"No","Alive",46.2 +"No","Alive",29.7 +"Yes","Alive",51.9 +"Yes","Dead",43.7 +"Yes","Dead",61.1 +"No","Dead",78 +"No","Alive",26.8 +"No","Alive",63 +"No","Dead",82.3 +"No","Alive",32.2 +"No","Dead",57.2 +"No","Alive",36.7 +"Yes","Dead",67.5 +"No","Alive",59.2 +"No","Dead",55.6 +"No","Dead",86.2 +"Yes","Dead",79.1 +"No","Dead",75.1 +"No","Alive",52 +"No","Alive",55.3 +"Yes","Alive",40.3 +"Yes","Dead",56.9 +"Yes","Alive",53.9 +"No","Alive",26.5 +"Yes","Alive",33 +"No","Alive",19.7 +"No","Alive",23.3 +"Yes","Dead",75.9 +"Yes","Dead",35.5 +"No","Alive",31.9 +"No","Alive",34.2 +"No","Dead",76.2 +"Yes","Alive",21.7 +"Yes","Alive",50.6 +"No","Alive",25.7 +"No","Alive",24.2 +"No","Alive",42.2 +"Yes","Alive",49.2 +"Yes","Alive",33.6 +"Yes","Alive",49.5 +"Yes","Alive",61.6 +"No","Alive",23 +"No","Alive",24.3 +"No","Alive",23.6 +"No","Alive",57.1 +"Yes","Alive",32.5 +"No","Dead",83.1 +"Yes","Alive",21.8 +"Yes","Alive",43.2 
+"Yes","Alive",26.6 +"Yes","Alive",45.7 +"Yes","Alive",18.1 +"Yes","Dead",45.6 +"Yes","Alive",29.7 +"Yes","Dead",73.9 +"No","Alive",56.4 +"No","Alive",55.6 +"Yes","Alive",55.1 +"No","Alive",80.8 +"Yes","Alive",29.7 +"No","Alive",25.7 +"No","Alive",52.8 +"No","Dead",81.3 +"Yes","Dead",80.5 +"Yes","Alive",34.3 +"No","Dead",59 +"No","Alive",42.5 +"No","Alive",76.9 +"Yes","Alive",33.3 +"No","Alive",20.6 +"Yes","Dead",86.8 +"No","Alive",33.1 +"No","Dead",80.2 +"Yes","Alive",30.5 +"No","Alive",31.9 +"No","Alive",19.8 +"No","Dead",84.5 +"No","Alive",56 +"No","Alive",50.3 +"No","Alive",56.8 +"Yes","Dead",60.7 +"Yes","Alive",27.6 +"Yes","Alive",32.9 +"No","Alive",56.2 +"Yes","Dead",63.4 +"No","Alive",86.9 +"No","Dead",79.9 +"No","Alive",41.5 +"Yes","Alive",45.3 +"Yes","Alive",63 +"No","Dead",77.2 +"No","Dead",69.4 +"No","Alive",49 +"No","Alive",44.7 +"Yes","Alive",27.7 +"Yes","Dead",62.3 +"No","Dead",70.7 +"No","Alive",38 +"Yes","Alive",44.3 +"No","Alive",32.3 +"Yes","Alive",56.1 +"Yes","Alive",58 +"No","Dead",82.9 +"Yes","Alive",44.4 +"No","Alive",24.9 +"Yes","Alive",63.1 +"No","Alive",35.9 +"Yes","Alive",31.1 +"No","Alive",24 +"No","Dead",88.5 +"Yes","Alive",39.5 +"No","Alive",35.6 +"No","Dead",82.4 +"No","Dead",63.8 +"No","Alive",87.4 +"No","Alive",37.2 +"No","Dead",69.5 +"No","Dead",25.3 +"Yes","Alive",59.6 +"Yes","Dead",35.7 +"Yes","Dead",56.6 +"Yes","Alive",34.5 +"Yes","Alive",58.6 +"Yes","Dead",78.2 +"Yes","Alive",48.3 +"Yes","Alive",25.4 +"Yes","Alive",74.1 +"Yes","Dead",88.7 +"No","Alive",68.4 +"No","Alive",33.4 +"No","Alive",36.5 +"No","Alive",25.5 +"Yes","Alive",21.2 +"Yes","Dead",61.8 +"Yes","Alive",38 +"No","Alive",35.1 +"No","Alive",38 +"Yes","Dead",36.2 +"Yes","Dead",87.9 +"No","Dead",76.1 +"No","Alive",59.4 +"No","Alive",18.9 +"Yes","Alive",53.3 +"Yes","Dead",82.6 +"Yes","Alive",45.3 +"No","Dead",86.3 +"Yes","Dead",63.2 +"No","Dead",88.1 +"Yes","Alive",36.1 +"No","Dead",71 +"Yes","Alive",62.1 +"Yes","Dead",55.3 +"No","Alive",52.2 +"No","Alive",25.6 
+"No","Alive",33 +"No","Dead",75.3 +"Yes","Alive",21.3 +"Yes","Dead",76.9 +"No","Alive",30 +"No","Dead",77.5 +"Yes","Dead",75.2 +"No","Dead",83.9 +"Yes","Alive",53 +"No","Alive",62.4 +"Yes","Alive",43.7 +"Yes","Alive",50.9 +"No","Dead",29.8 +"Yes","Alive",32.8 +"Yes","Alive",50.7 +"Yes","Dead",66.1 +"No","Alive",33.5 +"Yes","Alive",27.2 +"No","Dead",56.2 +"Yes","Alive",38.1 +"Yes","Dead",66.8 +"Yes","Dead",55.2 +"No","Alive",51.6 +"Yes","Alive",50.9 +"No","Alive",41.4 +"No","Dead",65.4 +"No","Dead",67.7 +"No","Alive",37.8 +"Yes","Alive",42.5 +"No","Alive",23.9 +"No","Alive",60.1 +"Yes","Alive",26.6 +"Yes","Alive",23.3 +"No","Dead",75.6 +"No","Dead",72.1 +"Yes","Alive",34.8 +"No","Dead",55.3 +"Yes","Alive",28.2 +"No","Dead",79.3 +"Yes","Alive",38.5 +"Yes","Alive",41 +"No","Alive",60.7 +"No","Alive",51.8 +"Yes","Alive",25.7 +"No","Dead",62.7 +"No","Alive",23.7 +"No","Alive",23.4 +"No","Alive",56.5 +"No","Alive",28.4 +"No","Alive",42.8 +"No","Dead",83.5 +"No","Alive",36.8 +"Yes","Alive",43.8 +"Yes","Alive",59 +"No","Alive",25.5 +"No","Dead",47.2 +"Yes","Alive",23.5 +"No","Alive",19.4 +"No","Dead",68.5 +"Yes","Alive",43.4 +"No","Alive",19.5 +"Yes","Alive",62.2 +"Yes","Alive",31.1 +"No","Alive",19.2 +"No","Dead",61.9 +"No","Alive",27.6 +"Yes","Alive",30.2 +"Yes","Alive",59 +"Yes","Alive",49.2 +"No","Alive",40.3 +"Yes","Alive",45.4 +"No","Alive",29.4 +"No","Alive",36.8 +"No","Alive",29.6 +"Yes","Dead",58.6 +"No","Dead",29.3 +"No","Alive",40 +"Yes","Alive",21.7 +"Yes","Alive",40.3 +"No","Dead",81.6 +"Yes","Alive",22.9 +"Yes","Alive",42.7 +"Yes","Alive",40.4 +"No","Dead",85.7 +"Yes","Alive",32.2 +"Yes","Alive",19.3 +"Yes","Alive",24.3 +"No","Alive",55.8 +"No","Alive",28.8 +"No","Alive",55.1 +"No","Alive",31.1 +"Yes","Alive",31.8 +"No","Alive",46.7 +"Yes","Alive",36.3 +"No","Alive",23.3 +"Yes","Alive",50.7 +"Yes","Alive",40.8 +"Yes","Alive",36.9 +"Yes","Dead",81.8 +"No","Alive",55.2 +"Yes","Dead",62.4 +"No","Dead",78.4 +"Yes","Alive",18 +"No","Dead",85.7 +"Yes","Alive",43 
+"Yes","Dead",88.3 +"Yes","Alive",26.2 +"Yes","Alive",52.7 +"No","Dead",81.9 +"No","Alive",44.4 +"No","Alive",71.8 +"No","Alive",35.4 +"No","Dead",71.4 +"No","Dead",55.9 +"Yes","Dead",46.6 +"No","Dead",65.6 +"No","Alive",56.2 +"No","Dead",57.9 +"Yes","Alive",43.5 +"No","Alive",22.6 +"No","Alive",27.2 +"No","Alive",27.2 +"No","Dead",20.2 +"Yes","Dead",60.2 +"Yes","Alive",55.2 +"No","Alive",39.6 +"No","Alive",24.5 +"No","Alive",36.7 +"Yes","Alive",24.2 +"No","Dead",73.3 +"No","Alive",26.6 +"No","Alive",41.7 +"No","Dead",42.6 +"No","Alive",18.6 +"Yes","Alive",31.3 +"No","Alive",51.6 +"No","Alive",19 +"No","Dead",72.6 +"No","Alive",35.7 +"No","Alive",44.1 +"No","Alive",58.3 +"Yes","Dead",65.6 +"No","Alive",62.3 +"Yes","Alive",57.4 +"No","Alive",26 +"No","Dead",85.7 +"No","Dead",47.3 +"Yes","Dead",62.1 +"Yes","Dead",66.1 +"Yes","Alive",18.5 +"Yes","Alive",24.6 +"Yes","Alive",48.3 +"Yes","Alive",28.8 +"No","Alive",52.2 +"No","Dead",85.5 +"No","Dead",58.4 +"Yes","Alive",38.1 +"Yes","Alive",27.7 +"No","Alive",42.1 +"Yes","Alive",47.9 +"No","Dead",67.4 +"No","Alive",29 +"No","Alive",29.4 +"No","Alive",21.4 +"No","Alive",41.5 +"No","Alive",74 +"No","Alive",42 +"No","Dead",68.1 +"Yes","Alive",21.5 +"No","Dead",58.5 +"No","Alive",32.8 +"Yes","Alive",37.7 +"No","Alive",55.5 +"No","Dead",78.7 +"No","Alive",31 +"Yes","Dead",51.6 +"No","Dead",66.6 +"No","Dead",40 +"Yes","Alive",52.1 +"Yes","Alive",30.4 +"No","Alive",38.1 +"Yes","Alive",23.1 +"Yes","Dead",57.9 +"Yes","Alive",25.2 +"No","Dead",76.2 +"No","Alive",63.4 +"No","Alive",21 +"Yes","Alive",45.5 +"No","Alive",46.5 +"No","Alive",48.1 +"No","Alive",32.4 +"Yes","Alive",40.1 +"No","Alive",23.4 +"Yes","Alive",62.1 +"No","Alive",45.1 +"Yes","Dead",53.6 +"No","Dead",60.6 +"No","Alive",83 +"No","Alive",55.5 +"No","Alive",41.8 +"No","Dead",40.1 +"Yes","Alive",24.4 +"Yes","Dead",62.7 +"Yes","Alive",23.7 +"No","Alive",84.9 +"Yes","Dead",50.2 +"No","Alive",40 +"Yes","Alive",27.3 +"Yes","Dead",67.2 +"Yes","Alive",48.4 +"Yes","Alive",32.7 
+"No","Alive",56 +"Yes","Dead",63.4 +"No","Alive",22.5 +"No","Alive",59.8 +"Yes","Alive",22.3 +"No","Alive",38 +"No","Alive",62.3 +"Yes","Alive",43.5 +"No","Alive",47.7 +"No","Alive",34.6 +"Yes","Alive",37 +"Yes","Alive",18.7 +"Yes","Alive",35.5 +"No","Dead",62.1 +"Yes","Alive",42.5 +"Yes","Dead",61.1 +"No","Alive",45.7 +"Yes","Alive",35 +"Yes","Alive",30.3 +"Yes","Alive",27.3 +"No","Alive",43.1 +"No","Alive",20.5 +"Yes","Dead",59.7 +"Yes","Alive",35.9 +"Yes","Dead",36.9 +"No","Alive",18.8 +"No","Dead",66.4 +"Yes","Alive",27.5 +"No","Dead",67.7 +"Yes","Alive",43.6 +"No","Alive",62.2 +"No","Dead",86 +"No","Dead",85.8 +"No","Alive",29.8 +"Yes","Alive",28.7 +"Yes","Alive",61.4 +"No","Alive",73.2 +"No","Alive",57.6 +"No","Alive",29.5 +"Yes","Dead",56.5 +"Yes","Alive",19.9 +"No","Alive",55.1 +"Yes","Dead",58.9 +"No","Alive",32.6 +"No","Dead",82.5 +"No","Alive",38.4 +"Yes","Alive",47.3 +"No","Dead",76.8 +"Yes","Alive",28.8 +"No","Alive",26.4 +"No","Dead",67.5 +"Yes","Alive",40.5 +"Yes","Alive",50.8 +"No","Alive",25.8 +"Yes","Alive",52.1 +"Yes","Alive",50.2 +"No","Alive",38.5 +"No","Dead",56.1 +"Yes","Alive",29.8 +"No","Alive",58.1 +"No","Dead",87.6 +"No","Dead",69.6 +"Yes","Alive",33.5 +"No","Dead",86 +"Yes","Dead",22.6 +"No","Alive",53.3 +"No","Alive",21.9 +"No","Alive",26 +"Yes","Alive",49.4 +"Yes","Alive",18 +"No","Alive",44.1 +"No","Dead",79.1 +"Yes","Alive",38.5 +"No","Alive",26.3 +"Yes","Alive",49.4 +"No","Alive",33.9 +"No","Dead",63.2 +"No","Alive",77.1 +"No","Dead",63.8 +"Yes","Alive",37.1 +"Yes","Alive",29.6 +"Yes","Alive",22.2 +"No","Dead",83.4 +"No","Dead",61.8 +"No","Alive",32.7 +"No","Alive",32.8 +"No","Alive",24.7 +"No","Alive",60.6 +"Yes","Alive",38.2 +"No","Dead",43.7 +"Yes","Alive",29.9 +"No","Alive",50.8 +"No","Alive",19.5 +"No","Alive",55.4 +"Yes","Alive",55.1 +"No","Dead",41.9 +"Yes","Alive",30.7 +"No","Alive",27.9 +"No","Alive",26 +"No","Dead",77.8 +"No","Alive",48.3 +"Yes","Dead",42.4 +"No","Alive",20.5 +"No","Dead",77.5 +"No","Alive",58.5 
+"No","Alive",44.7 +"No","Alive",28.5 +"Yes","Alive",21 +"No","Alive",50.5 +"Yes","Alive",71.5 +"No","Alive",37.8 +"No","Alive",23.1 +"Yes","Dead",44.9 +"Yes","Alive",55.6 +"Yes","Alive",20.2 +"No","Alive",48.1 +"Yes","Alive",42.8 +"Yes","Alive",44.5 +"No","Dead",85.8 +"No","Alive",44.7 +"No","Alive",87.6 +"No","Alive",27 +"No","Alive",52 +"Yes","Alive",58.7 +"No","Alive",34.3 +"Yes","Alive",19.9 +"No","Alive",19.7 +"Yes","Alive",33.8 +"No","Alive",53 +"Yes","Alive",20.7 +"Yes","Alive",59.4 +"Yes","Alive",44.4 +"Yes","Alive",20.4 +"No","Dead",69 +"Yes","Alive",60.5 +"No","Alive",44 +"Yes","Alive",33.1 +"No","Alive",42.2 +"No","Alive",50.5 +"No","Alive",30.5 +"Yes","Alive",26.6 +"Yes","Alive",21 +"Yes","Alive",36.6 +"Yes","Alive",28.9 +"No","Alive",47.8 +"No","Dead",73.3 +"No","Alive",49.6 +"No","Alive",44.8 +"Yes","Alive",38.6 +"No","Dead",79.9 +"Yes","Dead",84.4 +"No","Dead",39.1 +"Yes","Alive",47.4 +"No","Alive",57.8 +"No","Alive",41.5 +"No","Alive",20.3 +"Yes","Alive",38.1 +"Yes","Alive",44.6 +"Yes","Alive",39.3 +"Yes","Alive",18.1 +"No","Alive",51.5 +"No","Alive",23.1 +"No","Alive",22.7 +"Yes","Alive",36.8 +"No","Alive",57.4 +"Yes","Alive",57.1 +"No","Alive",19.2 +"No","Dead",84.8 +"No","Alive",26.9 +"No","Dead",88.4 +"No","Dead",77.4 +"No","Dead",41.3 +"No","Alive",53.4 +"Yes","Alive",58.9 +"Yes","Dead",38.8 +"No","Dead",82.2 +"No","Alive",46.9 +"Yes","Alive",24.6 +"Yes","Alive",30.4 +"No","Alive",42.4 +"No","Dead",64 +"No","Alive",33.3 +"Yes","Alive",60.2 +"Yes","Alive",25 +"Yes","Dead",37.1 +"Yes","Alive",47.7 +"No","Dead",66.5 +"Yes","Dead",43.3 +"No","Alive",19.1 +"No","Alive",52.4 +"No","Alive",33.9 +"No","Alive",40 +"No","Alive",29.9 +"Yes","Alive",58.4 +"Yes","Alive",48.7 +"Yes","Alive",52.3 +"No","Dead",59.9 +"No","Alive",63.5 +"Yes","Alive",48.3 +"Yes","Alive",51.1 +"Yes","Dead",34.5 +"Yes","Alive",37.5 +"Yes","Alive",73.8 +"Yes","Alive",24.6 +"No","Dead",65.3 +"No","Alive",34.2 +"No","Alive",71.8 +"No","Dead",47.5 +"No","Alive",31.3 
+"Yes","Dead",28.3 +"Yes","Dead",61.9 +"Yes","Dead",74.8 +"Yes","Alive",51 +"Yes","Dead",42.5 +"Yes","Alive",38.1 +"No","Alive",47.4 +"No","Alive",32.1 +"No","Dead",86.2 +"Yes","Alive",55.7 +"Yes","Alive",43.6 +"Yes","Dead",58.2 +"No","Alive",25.1 +"No","Dead",75 +"No","Alive",23.2 +"No","Alive",20.6 +"No","Dead",66.4 +"Yes","Alive",44 +"Yes","Alive",19.4 +"Yes","Dead",61 +"Yes","Alive",29.5 +"No","Alive",22.5 +"No","Alive",60 +"Yes","Alive",84.7 +"No","Dead",85.1 +"Yes","Dead",82 +"Yes","Alive",33.4 +"No","Alive",21.3 +"No","Dead",65.2 +"Yes","Dead",83.6 +"Yes","Alive",52.4 +"Yes","Alive",38.9 +"No","Alive",32.9 +"Yes","Alive",53.6 +"No","Alive",35.7 +"Yes","Alive",19.8 +"No","Dead",65.7 +"No","Alive",40.6 +"Yes","Alive",25.7 +"Yes","Dead",44.3 +"No","Alive",68.4 +"No","Alive",33.7 +"No","Alive",26.5 +"Yes","Alive",43.6 +"Yes","Dead",32.6 +"No","Alive",21 +"No","Alive",25.2 +"No","Dead",81.4 +"No","Dead",50.2 +"No","Alive",85 +"No","Alive",45.9 +"Yes","Dead",56.5 +"No","Alive",33.4 +"Yes","Alive",61.1 +"No","Alive",35.1 +"No","Alive",22.7 +"Yes","Alive",40.4 +"No","Alive",48.6 +"No","Dead",82.9 +"No","Dead",78.1 +"No","Alive",36.9 +"Yes","Alive",61.8 +"Yes","Alive",29.5 +"No","Alive",38.9 +"No","Dead",71.3 +"No","Alive",36.5 +"No","Dead",81.8 +"No","Alive",21.7 +"Yes","Dead",78.3 +"Yes","Alive",30.5 +"No","Alive",61.5 +"Yes","Alive",33.1 +"No","Alive",32.2 +"No","Alive",48.5 +"No","Alive",20.3 +"No","Alive",62.6 +"No","Alive",28.5 +"Yes","Alive",52.4 +"Yes","Dead",55.7 +"No","Alive",53.8 +"No","Alive",20.7 +"Yes","Alive",33.4 +"No","Alive",43.8 +"Yes","Alive",53.1 +"Yes","Alive",51.5 +"No","Alive",31.8 +"No","Dead",73.2 +"Yes","Alive",41.1 +"No","Dead",82 +"Yes","Alive",27 +"Yes","Alive",44.3 +"Yes","Dead",42.9 +"Yes","Dead",56.1 +"No","Dead",60.2 +"Yes","Alive",55.8 +"Yes","Alive",29.1 +"No","Alive",49.4 +"Yes","Dead",44.9 +"No","Dead",80.9 +"No","Alive",25.8 +"No","Alive",31.5 +"No","Dead",82.6 +"No","Alive",27.3 +"No","Alive",18.8 +"No","Alive",33.2 
+"No","Alive",29.7 +"Yes","Dead",52.6 +"No","Dead",81.1 +"Yes","Dead",88.6 +"No","Alive",35 +"No","Dead",75.2 +"Yes","Alive",37.3 +"Yes","Alive",52.1 +"No","Dead",84.7 +"No","Dead",85 +"No","Alive",27 +"No","Dead",85 +"No","Alive",20.2 +"No","Alive",46.3 +"Yes","Alive",60 +"No","Dead",63.5 +"Yes","Dead",84.3 +"No","Alive",66.4 +"Yes","Alive",30.2 +"Yes","Alive",23.1 +"No","Alive",61.5 +"No","Alive",40.7 +"Yes","Alive",27.1 +"Yes","Alive",36.7 +"No","Alive",58.2 +"Yes","Alive",29.7 +"No","Alive",48.9 +"No","Alive",52.9 +"No","Alive",41.7 +"No","Alive",23 +"No","Alive",18.3 +"No","Dead",89.9 +"No","Alive",60.6 +"No","Alive",30.1 +"Yes","Alive",41.9 +"Yes","Alive",47 +"No","Alive",23.8 +"Yes","Dead",31.3 +"Yes","Dead",63.3 +"No","Alive",52.4 +"No","Alive",65 +"No","Dead",74.8 +"No","Alive",32.9 +"Yes","Dead",49.6 +"No","Alive",59.9 +"No","Alive",30.8 +"No","Alive",30.1 +"No","Alive",52 +"Yes","Alive",57.2 +"No","Dead",89.5 +"Yes","Alive",32.5 +"No","Alive",19.1 +"Yes","Alive",44 +"Yes","Dead",39.2 +"No","Alive",22.9 +"Yes","Alive",18 +"No","Alive",20.1 +"Yes","Alive",28 +"No","Alive",53 +"Yes","Alive",46.7 +"No","Alive",44.6 +"No","Alive",18.7 +"No","Dead",71.1 +"Yes","Alive",42.3 +"No","Alive",64 +"Yes","Dead",71 +"Yes","Alive",26.6 +"Yes","Alive",50.8 +"No","Alive",25.5 +"Yes","Alive",24 +"No","Alive",48.1 +"Yes","Alive",50.6 +"Yes","Alive",21.5 +"No","Alive",61.2 +"No","Dead",75.9 +"No","Dead",88 +"No","Dead",66.8 +"No","Alive",50.8 +"No","Alive",34.9 +"No","Dead",83.8 +"No","Alive",25 +"Yes","Dead",41.7 +"No","Alive",42.3 +"No","Alive",62.4 +"Yes","Alive",38.1 +"Yes","Alive",23.3 +"Yes","Alive",25.6 +"No","Dead",51.1 +"Yes","Alive",21.2 +"No","Dead",56.9 +"No","Alive",35 +"Yes","Dead",45 +"Yes","Alive",25.2 +"Yes","Alive",43.7 +"No","Dead",86.7 +"No","Alive",20.2 +"No","Dead",71.6 +"No","Dead",78.3 +"No","Alive",23.1 +"No","Dead",84.8 +"Yes","Alive",58.1 +"Yes","Alive",53.9 +"No","Alive",53.3 +"No","Alive",30.9 +"Yes","Alive",60.6 +"Yes","Dead",85.2 
+"No","Alive",57.5 +"No","Alive",46.5 +"No","Dead",73.8 +"No","Alive",62.6 +"No","Alive",43.5 +"No","Alive",52.5 +"Yes","Alive",34.1 +"No","Alive",38.7 +"No","Alive",22.6 +"No","Alive",20 +"No","Alive",59.9 +"No","Dead",83.3 +"Yes","Alive",52.2 +"No","Dead",76.2 +"Yes","Alive",28 +"Yes","Alive",56.6 +"No","Dead",67.8 +"No","Alive",21.2 +"No","Alive",27.9 +"Yes","Alive",29.8 +"Yes","Alive",28.1 +"Yes","Alive",53.2 +"No","Alive",23.2 +"No","Alive",39.5 +"Yes","Alive",31.4 +"Yes","Alive",30 +"Yes","Alive",37.8 +"Yes","Alive",46.9 +"Yes","Alive",43.8 +"Yes","Alive",63.1 +"No","Alive",21.4 +"No","Dead",62.5 +"No","Alive",45.5 +"Yes","Alive",27.9 +"Yes","Alive",29.5 +"Yes","Alive",61 +"Yes","Alive",27 +"Yes","Alive",61.5 +"Yes","Dead",56.2 +"Yes","Dead",87.9 +"Yes","Alive",28.3 +"No","Dead",75.1 +"No","Dead",87.9 +"Yes","Alive",31 +"Yes","Alive",55.3 +"No","Alive",40.8 +"Yes","Alive",46.2 +"No","Alive",52.3 +"Yes","Alive",51.9 +"No","Alive",28.3 +"Yes","Alive",44.4 +"Yes","Dead",63.3 +"Yes","Alive",41 +"Yes","Alive",50.2 +"No","Alive",55.4 +"No","Dead",43.3 +"No","Alive",60.1 +"Yes","Alive",29.7 +"No","Dead",79 +"No","Dead",65.1 +"Yes","Alive",40.1 +"No","Alive",46 +"No","Alive",40.2 +"No","Dead",89.2 +"No","Alive",26 +"No","Alive",43.4 +"No","Alive",48.8 +"No","Alive",19.8 +"Yes","Alive",27.8 +"Yes","Alive",52.4 +"Yes","Alive",27.8 +"Yes","Alive",41 +"No","Dead",28.5 +"No","Alive",26.7 +"No","Alive",36 +"No","Dead",74.4 +"Yes","Alive",40.8 +"Yes","Alive",20.4 +"No","Dead",42.1 +"No","Alive",41.2 +"Yes","Alive",20.9 +"Yes","Alive",45.5 +"No","Alive",26.7 +"No","Alive",41.8 +"No","Alive",33.7 +"No","Alive",56.5 +"Yes","Alive",38.8 +"Yes","Alive",55.5 +"Yes","Alive",24.9 +"No","Alive",33 +"Yes","Alive",55.7 +"No","Alive",25.7 +"No","Alive",19.5 +"Yes","Alive",58.5 +"No","Alive",23.4 +"Yes","Alive",43.7 +"No","Alive",34.4 +"No","Dead",83.9 +"No","Alive",34.9 +"Yes","Alive",51.2 +"No","Dead",86.3 +"Yes","Dead",36 +"Yes","Alive",48.3 +"No","Alive",63.1 +"No","Alive",60.8 
+"Yes","Dead",39.3 +"No","Alive",36.7 +"No","Alive",63.8 +"No","Dead",71.3 +"No","Alive",57.7 +"No","Alive",63.2 +"No","Alive",46.6 +"Yes","Dead",82.4 +"Yes","Alive",38.3 +"Yes","Alive",32.7 +"No","Alive",39.7 +"Yes","Dead",60 +"No","Dead",71 +"No","Alive",20.5 +"No","Alive",44.4 +"Yes","Alive",31.2 +"Yes","Alive",47.8 +"Yes","Alive",60.9 +"No","Dead",61.4 +"Yes","Alive",43 +"No","Alive",42.1 +"Yes","Alive",35.9 +"No","Alive",22.3 +"Yes","Dead",62.1 +"No","Dead",88.6 +"No","Alive",39.1