"As first step, I've downloaded the file and I've put it in the GitLab. I've modified the file .csv removing the header.\n",
"Then, I've download the file via Python and printed every row of the dataset. Moreover, I've created a dataframe in order to organize better the information. \n",
"The measure extracted from the .csv are converted in Float."
"### Computation of Mortality of Smocker and Not Smocker Women"
]
},
{
"cell_type": "markdown",
"metadata": {
"hideCode": false,
"hidePrompt": false
},
"source": [
"I'm going to tabulate the total number of women alive and dead over the period according to their smoking habits."
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {
"hideCode": false,
"hidePrompt": false
},
"outputs": [],
"source": [
"smockerWomen = df[df.Smocker == 'Yes']\n",
"noSmockerWomen = df[df.Smocker == 'No']"
]
},
{
"cell_type": "markdown",
"metadata": {
"hideCode": false,
"hidePrompt": false
},
"source": [
"First, I count the number of women in each group."
]
},
{
"cell_type": "code",
"execution_count": 79,
"metadata": {
"hideCode": false,
"hidePrompt": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Number of Smoker Women: 582\n",
"Number of Non-Smoker Women: 732\n"
]
}
],
"source": [
"print(\"Number of Smoker Women: \" + str(len(smockerWomen)))\n",
"print(\"Number of Non-Smoker Women: \" + str(len(noSmockerWomen)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"hideCode": false,
"hidePrompt": false
},
"source": [
"Next, I compute the mortality rate of each group. The computation is simple: I count the women with status \"*Dead*\" in each group and divide by the total number of women in that group. "
]
},
{
"cell_type": "code",
"execution_count": 101,
"metadata": {
"hideCode": false,
"hidePrompt": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Mortality rate of Smoker Women: 0.23883161512027493\n",
"Mortality rate of Non-Smoker Women: 0.31420765027322406\n"
]
}
],
"source": [
"# Mortality rate = number of women with Status 'Dead' divided by the group size\n",
"print(\"Mortality rate of Smoker Women: \" + str((smockerWomen.Status == \"Dead\").sum() / len(smockerWomen)))\n",
"print(\"Mortality rate of Non-Smoker Women: \" + str((noSmockerWomen.Status == \"Dead\").sum() / len(noSmockerWomen)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"hideCode": false,
"hidePrompt": false
},
"source": [
"The mortality rate of the smokers (about 23.9%) is lower than that of the non-smokers (about 31.4%), although the smoker group is also smaller (582 women versus 732)."
]
},
{
"cell_type": "markdown",
"metadata": {
"hideCode": false,
"hidePrompt": false
},
"source": [
"### Mortality Rate by Age Group"
]
},
{
"cell_type": "markdown",
"metadata": {
"hideCode": false,
"hidePrompt": false
},
"source": [
"Next, I compute the mortality rate by age group. First, I split the smoker and non-smoker groups by age."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"hideCode": false,
"hidePrompt": false
},
"outputs": [],
"source": [
"print(\"Mortality rate of Smoker Women 18-34: \" + str(smockerWomen1834Mort))\n",
"print(\"Mortality rate of Smoker Women 35-54: \" + str(smockerWomen3454Mort))\n",
"print(\"Mortality rate of Smoker Women 55-64: \" + str(smockerWomen5564Mort))\n",
"print(\"Mortality rate of Smoker Women 65+: \" + str(smockerWomen65Mort))\n",
"print()\n",
"print(\"Mortality rate of Non-Smoker Women 18-34: \" + str(noSmockerWomen1834Mort))\n",
"print(\"Mortality rate of Non-Smoker Women 35-54: \" + str(noSmockerWomen3454Mort))\n",
"print(\"Mortality rate of Non-Smoker Women 55-64: \" + str(noSmockerWomen5564Mort))\n",
"print(\"Mortality rate of Non-Smoker Women 65+: \" + str(noSmockerWomen65Mort))"
]
},
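{
"cell_type": "markdown",
"metadata": {
"hideCode": false,
"hidePrompt": false
},
"source": [
"The split by age described above can be sketched as follows. This is only a minimal illustration on toy data, not the code used for the results in this notebook: the `Age` column name, the toy rows, and the bin edges (18-34, 35-54, 55-64, 65 and over) are assumptions. `pd.cut` assigns each row to an age group, and `groupby` then gives the share of women with status *Dead* per age group and smoking habit."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"hideCode": false,
"hidePrompt": false
},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Toy rows for illustration only; this is NOT the study dataset.\n",
"# The 'Age' column name is an assumption.\n",
"toy = pd.DataFrame({\n",
"    'Smocker': ['Yes', 'No', 'Yes', 'No', 'Yes', 'No'],\n",
"    'Status': ['Alive', 'Dead', 'Dead', 'Alive', 'Dead', 'Dead'],\n",
"    'Age': [25.0, 30.0, 40.0, 60.0, 70.0, 80.0],\n",
"})\n",
"\n",
"# Assumed left-closed bins: [18, 35), [35, 55), [55, 65), [65, inf)\n",
"bins = [18, 35, 55, 65, float('inf')]\n",
"labels = ['18-34', '35-54', '55-64', '65+']\n",
"toy['AgeGroup'] = pd.cut(toy['Age'], bins=bins, labels=labels, right=False)\n",
"\n",
"# Mortality rate = share of rows with Status == 'Dead' in each (age group, smoking habit) pair\n",
"rates = (\n",
"    toy.assign(Dead=toy['Status'].eq('Dead'))\n",
"       .groupby(['AgeGroup', 'Smocker'], observed=True)['Dead']\n",
"       .mean()\n",
")\n",
"print(rates)"
]
},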
{
"cell_type": "markdown",
"metadata": {
"hideCode": false,
"hidePrompt": false
},
"source": [
"The graph below shows the difference in mortality rate for each age group. The difference is most notable in the middle age groups (35-64), where the mortality rate is higher among smokers. "