86d2379a8cd828206f6e8576c862739f
mooc-rr
Commits
878dcd14
Commit
878dcd14
authored
Aug 28, 2020
by
86d2379a8cd828206f6e8576c862739f
no commit message
parent
836ac157
Changes
2
Showing
2 changed files
with
1854 additions
and
6 deletions
exercice.ipynb
module3/exo2/exercice.ipynb
+1118
-3
exercice.ipynb
module3/exo3/exercice.ipynb
+736
-3
module3/exo2/exercice.ipynb
{
{
"cells": [],
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Chickenpox incidence"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline\n",
"import matplotlib.pyplot as plt\n",
"import pandas as pd\n",
"import isoweek"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We import the packages needed for the analysis."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"data_url = \"https://www.sentiweb.fr/datasets/incidence-PAY-7.csv\""
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>week</th>\n",
" <th>indicator</th>\n",
" <th>inc</th>\n",
" <th>inc_low</th>\n",
" <th>inc_up</th>\n",
" <th>inc100</th>\n",
" <th>inc100_low</th>\n",
" <th>inc100_up</th>\n",
" <th>geo_insee</th>\n",
" <th>geo_name</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>202026</td>\n",
" <td>7</td>\n",
" <td>783</td>\n",
" <td>0</td>\n",
" <td>1740</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>202025</td>\n",
" <td>7</td>\n",
" <td>230</td>\n",
" <td>0</td>\n",
" <td>602</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>202024</td>\n",
" <td>7</td>\n",
" <td>388</td>\n",
" <td>0</td>\n",
" <td>959</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>202023</td>\n",
" <td>7</td>\n",
" <td>558</td>\n",
" <td>1</td>\n",
" <td>1115</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>202022</td>\n",
" <td>7</td>\n",
" <td>277</td>\n",
" <td>0</td>\n",
" <td>633</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>202021</td>\n",
" <td>7</td>\n",
" <td>602</td>\n",
" <td>36</td>\n",
" <td>1168</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>202020</td>\n",
" <td>7</td>\n",
" <td>824</td>\n",
" <td>20</td>\n",
" <td>1628</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>202019</td>\n",
" <td>7</td>\n",
" <td>310</td>\n",
" <td>0</td>\n",
" <td>753</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>202018</td>\n",
" <td>7</td>\n",
" <td>849</td>\n",
" <td>98</td>\n",
" <td>1600</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>202017</td>\n",
" <td>7</td>\n",
" <td>272</td>\n",
" <td>0</td>\n",
" <td>658</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>202016</td>\n",
" <td>7</td>\n",
" <td>758</td>\n",
" <td>78</td>\n",
" <td>1438</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>202015</td>\n",
" <td>7</td>\n",
" <td>1918</td>\n",
" <td>675</td>\n",
" <td>3161</td>\n",
" <td>3</td>\n",
" <td>1</td>\n",
" <td>5</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>202014</td>\n",
" <td>7</td>\n",
" <td>3879</td>\n",
" <td>2227</td>\n",
" <td>5531</td>\n",
" <td>6</td>\n",
" <td>3</td>\n",
" <td>9</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>202013</td>\n",
" <td>7</td>\n",
" <td>7326</td>\n",
" <td>5236</td>\n",
" <td>9416</td>\n",
" <td>11</td>\n",
" <td>8</td>\n",
" <td>14</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>202012</td>\n",
" <td>7</td>\n",
" <td>8123</td>\n",
" <td>5790</td>\n",
" <td>10456</td>\n",
" <td>12</td>\n",
" <td>8</td>\n",
" <td>16</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>202011</td>\n",
" <td>7</td>\n",
" <td>10198</td>\n",
" <td>7568</td>\n",
" <td>12828</td>\n",
" <td>15</td>\n",
" <td>11</td>\n",
" <td>19</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>202010</td>\n",
" <td>7</td>\n",
" <td>9011</td>\n",
" <td>6691</td>\n",
" <td>11331</td>\n",
" <td>14</td>\n",
" <td>10</td>\n",
" <td>18</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>202009</td>\n",
" <td>7</td>\n",
" <td>13631</td>\n",
" <td>10544</td>\n",
" <td>16718</td>\n",
" <td>21</td>\n",
" <td>16</td>\n",
" <td>26</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>202008</td>\n",
" <td>7</td>\n",
" <td>10424</td>\n",
" <td>7708</td>\n",
" <td>13140</td>\n",
" <td>16</td>\n",
" <td>12</td>\n",
" <td>20</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>202007</td>\n",
" <td>7</td>\n",
" <td>8959</td>\n",
" <td>6574</td>\n",
" <td>11344</td>\n",
" <td>14</td>\n",
" <td>10</td>\n",
" <td>18</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>20</th>\n",
" <td>202006</td>\n",
" <td>7</td>\n",
" <td>9264</td>\n",
" <td>6925</td>\n",
" <td>11603</td>\n",
" <td>14</td>\n",
" <td>10</td>\n",
" <td>18</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21</th>\n",
" <td>202005</td>\n",
" <td>7</td>\n",
" <td>8505</td>\n",
" <td>6314</td>\n",
" <td>10696</td>\n",
" <td>13</td>\n",
" <td>10</td>\n",
" <td>16</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>22</th>\n",
" <td>202004</td>\n",
" <td>7</td>\n",
" <td>7991</td>\n",
" <td>5831</td>\n",
" <td>10151</td>\n",
" <td>12</td>\n",
" <td>9</td>\n",
" <td>15</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>23</th>\n",
" <td>202003</td>\n",
" <td>7</td>\n",
" <td>5968</td>\n",
" <td>4100</td>\n",
" <td>7836</td>\n",
" <td>9</td>\n",
" <td>6</td>\n",
" <td>12</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>24</th>\n",
" <td>202002</td>\n",
" <td>7</td>\n",
" <td>6534</td>\n",
" <td>4530</td>\n",
" <td>8538</td>\n",
" <td>10</td>\n",
" <td>7</td>\n",
" <td>13</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25</th>\n",
" <td>202001</td>\n",
" <td>7</td>\n",
" <td>9835</td>\n",
" <td>7019</td>\n",
" <td>12651</td>\n",
" <td>15</td>\n",
" <td>11</td>\n",
" <td>19</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>26</th>\n",
" <td>201952</td>\n",
" <td>7</td>\n",
" <td>7941</td>\n",
" <td>5246</td>\n",
" <td>10636</td>\n",
" <td>12</td>\n",
" <td>8</td>\n",
" <td>16</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>27</th>\n",
" <td>201951</td>\n",
" <td>7</td>\n",
" <td>5823</td>\n",
" <td>3675</td>\n",
" <td>7971</td>\n",
" <td>9</td>\n",
" <td>6</td>\n",
" <td>12</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>28</th>\n",
" <td>201950</td>\n",
" <td>7</td>\n",
" <td>6424</td>\n",
" <td>4276</td>\n",
" <td>8572</td>\n",
" <td>10</td>\n",
" <td>7</td>\n",
" <td>13</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>29</th>\n",
" <td>201949</td>\n",
" <td>7</td>\n",
" <td>6621</td>\n",
" <td>4540</td>\n",
" <td>8702</td>\n",
" <td>10</td>\n",
" <td>7</td>\n",
" <td>13</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1513</th>\n",
" <td>199126</td>\n",
" <td>7</td>\n",
" <td>17608</td>\n",
" <td>11304</td>\n",
" <td>23912</td>\n",
" <td>31</td>\n",
" <td>20</td>\n",
" <td>42</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1514</th>\n",
" <td>199125</td>\n",
" <td>7</td>\n",
" <td>16169</td>\n",
" <td>10700</td>\n",
" <td>21638</td>\n",
" <td>28</td>\n",
" <td>18</td>\n",
" <td>38</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1515</th>\n",
" <td>199124</td>\n",
" <td>7</td>\n",
" <td>16171</td>\n",
" <td>10071</td>\n",
" <td>22271</td>\n",
" <td>28</td>\n",
" <td>17</td>\n",
" <td>39</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1516</th>\n",
" <td>199123</td>\n",
" <td>7</td>\n",
" <td>11947</td>\n",
" <td>7671</td>\n",
" <td>16223</td>\n",
" <td>21</td>\n",
" <td>13</td>\n",
" <td>29</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1517</th>\n",
" <td>199122</td>\n",
" <td>7</td>\n",
" <td>15452</td>\n",
" <td>9953</td>\n",
" <td>20951</td>\n",
" <td>27</td>\n",
" <td>17</td>\n",
" <td>37</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1518</th>\n",
" <td>199121</td>\n",
" <td>7</td>\n",
" <td>14903</td>\n",
" <td>8975</td>\n",
" <td>20831</td>\n",
" <td>26</td>\n",
" <td>16</td>\n",
" <td>36</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1519</th>\n",
" <td>199120</td>\n",
" <td>7</td>\n",
" <td>19053</td>\n",
" <td>12742</td>\n",
" <td>25364</td>\n",
" <td>34</td>\n",
" <td>23</td>\n",
" <td>45</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1520</th>\n",
" <td>199119</td>\n",
" <td>7</td>\n",
" <td>16739</td>\n",
" <td>11246</td>\n",
" <td>22232</td>\n",
" <td>29</td>\n",
" <td>19</td>\n",
" <td>39</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1521</th>\n",
" <td>199118</td>\n",
" <td>7</td>\n",
" <td>21385</td>\n",
" <td>13882</td>\n",
" <td>28888</td>\n",
" <td>38</td>\n",
" <td>25</td>\n",
" <td>51</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1522</th>\n",
" <td>199117</td>\n",
" <td>7</td>\n",
" <td>13462</td>\n",
" <td>8877</td>\n",
" <td>18047</td>\n",
" <td>24</td>\n",
" <td>16</td>\n",
" <td>32</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1523</th>\n",
" <td>199116</td>\n",
" <td>7</td>\n",
" <td>14857</td>\n",
" <td>10068</td>\n",
" <td>19646</td>\n",
" <td>26</td>\n",
" <td>18</td>\n",
" <td>34</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1524</th>\n",
" <td>199115</td>\n",
" <td>7</td>\n",
" <td>13975</td>\n",
" <td>9781</td>\n",
" <td>18169</td>\n",
" <td>25</td>\n",
" <td>18</td>\n",
" <td>32</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1525</th>\n",
" <td>199114</td>\n",
" <td>7</td>\n",
" <td>12265</td>\n",
" <td>7684</td>\n",
" <td>16846</td>\n",
" <td>22</td>\n",
" <td>14</td>\n",
" <td>30</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1526</th>\n",
" <td>199113</td>\n",
" <td>7</td>\n",
" <td>9567</td>\n",
" <td>6041</td>\n",
" <td>13093</td>\n",
" <td>17</td>\n",
" <td>11</td>\n",
" <td>23</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1527</th>\n",
" <td>199112</td>\n",
" <td>7</td>\n",
" <td>10864</td>\n",
" <td>7331</td>\n",
" <td>14397</td>\n",
" <td>19</td>\n",
" <td>13</td>\n",
" <td>25</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1528</th>\n",
" <td>199111</td>\n",
" <td>7</td>\n",
" <td>15574</td>\n",
" <td>11184</td>\n",
" <td>19964</td>\n",
" <td>27</td>\n",
" <td>19</td>\n",
" <td>35</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1529</th>\n",
" <td>199110</td>\n",
" <td>7</td>\n",
" <td>16643</td>\n",
" <td>11372</td>\n",
" <td>21914</td>\n",
" <td>29</td>\n",
" <td>20</td>\n",
" <td>38</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1530</th>\n",
" <td>199109</td>\n",
" <td>7</td>\n",
" <td>13741</td>\n",
" <td>8780</td>\n",
" <td>18702</td>\n",
" <td>24</td>\n",
" <td>15</td>\n",
" <td>33</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1531</th>\n",
" <td>199108</td>\n",
" <td>7</td>\n",
" <td>13289</td>\n",
" <td>8813</td>\n",
" <td>17765</td>\n",
" <td>23</td>\n",
" <td>15</td>\n",
" <td>31</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1532</th>\n",
" <td>199107</td>\n",
" <td>7</td>\n",
" <td>12337</td>\n",
" <td>8077</td>\n",
" <td>16597</td>\n",
" <td>22</td>\n",
" <td>15</td>\n",
" <td>29</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1533</th>\n",
" <td>199106</td>\n",
" <td>7</td>\n",
" <td>10877</td>\n",
" <td>7013</td>\n",
" <td>14741</td>\n",
" <td>19</td>\n",
" <td>12</td>\n",
" <td>26</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1534</th>\n",
" <td>199105</td>\n",
" <td>7</td>\n",
" <td>10442</td>\n",
" <td>6544</td>\n",
" <td>14340</td>\n",
" <td>18</td>\n",
" <td>11</td>\n",
" <td>25</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1535</th>\n",
" <td>199104</td>\n",
" <td>7</td>\n",
" <td>7913</td>\n",
" <td>4563</td>\n",
" <td>11263</td>\n",
" <td>14</td>\n",
" <td>8</td>\n",
" <td>20</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1536</th>\n",
" <td>199103</td>\n",
" <td>7</td>\n",
" <td>15387</td>\n",
" <td>10484</td>\n",
" <td>20290</td>\n",
" <td>27</td>\n",
" <td>18</td>\n",
" <td>36</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1537</th>\n",
" <td>199102</td>\n",
" <td>7</td>\n",
" <td>16277</td>\n",
" <td>11046</td>\n",
" <td>21508</td>\n",
" <td>29</td>\n",
" <td>20</td>\n",
" <td>38</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1538</th>\n",
" <td>199101</td>\n",
" <td>7</td>\n",
" <td>15565</td>\n",
" <td>10271</td>\n",
" <td>20859</td>\n",
" <td>27</td>\n",
" <td>18</td>\n",
" <td>36</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1539</th>\n",
" <td>199052</td>\n",
" <td>7</td>\n",
" <td>19375</td>\n",
" <td>13295</td>\n",
" <td>25455</td>\n",
" <td>34</td>\n",
" <td>23</td>\n",
" <td>45</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1540</th>\n",
" <td>199051</td>\n",
" <td>7</td>\n",
" <td>19080</td>\n",
" <td>13807</td>\n",
" <td>24353</td>\n",
" <td>34</td>\n",
" <td>25</td>\n",
" <td>43</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1541</th>\n",
" <td>199050</td>\n",
" <td>7</td>\n",
" <td>11079</td>\n",
" <td>6660</td>\n",
" <td>15498</td>\n",
" <td>20</td>\n",
" <td>12</td>\n",
" <td>28</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1542</th>\n",
" <td>199049</td>\n",
" <td>7</td>\n",
" <td>1143</td>\n",
" <td>0</td>\n",
" <td>2610</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>5</td>\n",
" <td>FR</td>\n",
" <td>France</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>1543 rows × 10 columns</p>\n",
"</div>"
],
"text/plain": [
" week indicator inc inc_low inc_up inc100 inc100_low \\\n",
"0 202026 7 783 0 1740 1 0 \n",
"1 202025 7 230 0 602 0 0 \n",
"2 202024 7 388 0 959 1 0 \n",
"3 202023 7 558 1 1115 1 0 \n",
"4 202022 7 277 0 633 0 0 \n",
"5 202021 7 602 36 1168 1 0 \n",
"6 202020 7 824 20 1628 1 0 \n",
"7 202019 7 310 0 753 0 0 \n",
"8 202018 7 849 98 1600 1 0 \n",
"9 202017 7 272 0 658 0 0 \n",
"10 202016 7 758 78 1438 1 0 \n",
"11 202015 7 1918 675 3161 3 1 \n",
"12 202014 7 3879 2227 5531 6 3 \n",
"13 202013 7 7326 5236 9416 11 8 \n",
"14 202012 7 8123 5790 10456 12 8 \n",
"15 202011 7 10198 7568 12828 15 11 \n",
"16 202010 7 9011 6691 11331 14 10 \n",
"17 202009 7 13631 10544 16718 21 16 \n",
"18 202008 7 10424 7708 13140 16 12 \n",
"19 202007 7 8959 6574 11344 14 10 \n",
"20 202006 7 9264 6925 11603 14 10 \n",
"21 202005 7 8505 6314 10696 13 10 \n",
"22 202004 7 7991 5831 10151 12 9 \n",
"23 202003 7 5968 4100 7836 9 6 \n",
"24 202002 7 6534 4530 8538 10 7 \n",
"25 202001 7 9835 7019 12651 15 11 \n",
"26 201952 7 7941 5246 10636 12 8 \n",
"27 201951 7 5823 3675 7971 9 6 \n",
"28 201950 7 6424 4276 8572 10 7 \n",
"29 201949 7 6621 4540 8702 10 7 \n",
"... ... ... ... ... ... ... ... \n",
"1513 199126 7 17608 11304 23912 31 20 \n",
"1514 199125 7 16169 10700 21638 28 18 \n",
"1515 199124 7 16171 10071 22271 28 17 \n",
"1516 199123 7 11947 7671 16223 21 13 \n",
"1517 199122 7 15452 9953 20951 27 17 \n",
"1518 199121 7 14903 8975 20831 26 16 \n",
"1519 199120 7 19053 12742 25364 34 23 \n",
"1520 199119 7 16739 11246 22232 29 19 \n",
"1521 199118 7 21385 13882 28888 38 25 \n",
"1522 199117 7 13462 8877 18047 24 16 \n",
"1523 199116 7 14857 10068 19646 26 18 \n",
"1524 199115 7 13975 9781 18169 25 18 \n",
"1525 199114 7 12265 7684 16846 22 14 \n",
"1526 199113 7 9567 6041 13093 17 11 \n",
"1527 199112 7 10864 7331 14397 19 13 \n",
"1528 199111 7 15574 11184 19964 27 19 \n",
"1529 199110 7 16643 11372 21914 29 20 \n",
"1530 199109 7 13741 8780 18702 24 15 \n",
"1531 199108 7 13289 8813 17765 23 15 \n",
"1532 199107 7 12337 8077 16597 22 15 \n",
"1533 199106 7 10877 7013 14741 19 12 \n",
"1534 199105 7 10442 6544 14340 18 11 \n",
"1535 199104 7 7913 4563 11263 14 8 \n",
"1536 199103 7 15387 10484 20290 27 18 \n",
"1537 199102 7 16277 11046 21508 29 20 \n",
"1538 199101 7 15565 10271 20859 27 18 \n",
"1539 199052 7 19375 13295 25455 34 23 \n",
"1540 199051 7 19080 13807 24353 34 25 \n",
"1541 199050 7 11079 6660 15498 20 12 \n",
"1542 199049 7 1143 0 2610 2 0 \n",
"\n",
" inc100_up geo_insee geo_name \n",
"0 2 FR France \n",
"1 1 FR France \n",
"2 2 FR France \n",
"3 2 FR France \n",
"4 1 FR France \n",
"5 2 FR France \n",
"6 2 FR France \n",
"7 1 FR France \n",
"8 2 FR France \n",
"9 1 FR France \n",
"10 2 FR France \n",
"11 5 FR France \n",
"12 9 FR France \n",
"13 14 FR France \n",
"14 16 FR France \n",
"15 19 FR France \n",
"16 18 FR France \n",
"17 26 FR France \n",
"18 20 FR France \n",
"19 18 FR France \n",
"20 18 FR France \n",
"21 16 FR France \n",
"22 15 FR France \n",
"23 12 FR France \n",
"24 13 FR France \n",
"25 19 FR France \n",
"26 16 FR France \n",
"27 12 FR France \n",
"28 13 FR France \n",
"29 13 FR France \n",
"... ... ... ... \n",
"1513 42 FR France \n",
"1514 38 FR France \n",
"1515 39 FR France \n",
"1516 29 FR France \n",
"1517 37 FR France \n",
"1518 36 FR France \n",
"1519 45 FR France \n",
"1520 39 FR France \n",
"1521 51 FR France \n",
"1522 32 FR France \n",
"1523 34 FR France \n",
"1524 32 FR France \n",
"1525 30 FR France \n",
"1526 23 FR France \n",
"1527 25 FR France \n",
"1528 35 FR France \n",
"1529 38 FR France \n",
"1530 33 FR France \n",
"1531 31 FR France \n",
"1532 29 FR France \n",
"1533 26 FR France \n",
"1534 25 FR France \n",
"1535 20 FR France \n",
"1536 36 FR France \n",
"1537 38 FR France \n",
"1538 36 FR France \n",
"1539 45 FR France \n",
"1540 43 FR France \n",
"1541 28 FR France \n",
"1542 5 FR France \n",
"\n",
"[1543 rows x 10 columns]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"raw_data = pd.read_csv(data_url, skiprows=1)\n",
"raw_data"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>week</th>\n",
" <th>indicator</th>\n",
" <th>inc</th>\n",
" <th>inc_low</th>\n",
" <th>inc_up</th>\n",
" <th>inc100</th>\n",
" <th>inc100_low</th>\n",
" <th>inc100_up</th>\n",
" <th>geo_insee</th>\n",
" <th>geo_name</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"Empty DataFrame\n",
"Columns: [week, indicator, inc, inc_low, inc_up, inc100, inc100_low, inc100_up, geo_insee, geo_name]\n",
"Index: []"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"raw_data[raw_data.isnull().any(axis=1)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There are no missing data points, so the data are kept as is."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We now convert the week format using the isoweek package."
]
},
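{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"def convert_week(number_week_int):\n",
"    # Split a YYYYWW integer (e.g. 202026) into ISO year and week number\n",
"    total_str = str(number_week_int)\n",
"    year = int(total_str[:4])\n",
"    week = int(total_str[4:])\n",
"    w = isoweek.Week(year, week)\n",
"    # w.day(0) is the Monday of that week; build a weekly pandas Period from it\n",
"    return pd.Period(w.day(0), 'W')\n"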
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"# Work on a copy so that raw_data itself is left unmodified\n",
"data = raw_data.copy()\n",
"data['Period'] = [convert_week(i) for i in data['week']]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"metadata": {
"kernelspec": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3",
...
@@ -16,10 +1132,9 @@
...
@@ -16,10 +1132,9 @@
"name": "python",
"name": "python",
"nbconvert_exporter": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"pygments_lexer": "ipython3",
"version": "3.6.3"
"version": "3.6.4"
}
}
},
},
"nbformat": 4,
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 2
}
}
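The week conversion in the notebook above relies on the third-party `isoweek` package and on `pandas.Period`. As a minimal sketch of the same idea using only the standard library (assuming Python 3.8 or later for `date.fromisocalendar`), the snippet below turns a `YYYYWW` integer into the Monday of that ISO week. The names `week_to_monday` and `gaps` are hypothetical helpers introduced for illustration and are not part of the notebook; `gaps` is an optional sanity check that reports consecutive weeks that are not exactly seven days apart.

```python
from datetime import date, timedelta


def week_to_monday(yearweek: int) -> date:
    """Return the Monday of the ISO week encoded as YYYYWW (e.g. 202026)."""
    year, week = divmod(yearweek, 100)
    # ISO weekday 1 is Monday
    return date.fromisocalendar(year, week, 1)


def gaps(yearweeks):
    """Return pairs of chronologically adjacent Mondays that are not 7 days apart."""
    mondays = sorted(week_to_monday(w) for w in yearweeks)
    return [(a, b) for a, b in zip(mondays, mondays[1:])
            if b - a != timedelta(days=7)]


if __name__ == "__main__":
    # A few week codes taken from the table shown in the notebook output
    sample = [202026, 202025, 202024, 202001, 201952]
    print(week_to_monday(202026))
    print(gaps(sample))
```

With a pandas DataFrame, the same function could be applied to the `week` column in place of `convert_week`, at the cost of getting plain dates rather than weekly `Period` objects.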
module3/exo3/exercice.ipynb
{
{
"cells": [],
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# On Simpson's paradox"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"blabla intro"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Importing and checking the data"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline\n",
"import matplotlib.pyplot as plt\n",
"import pandas as pd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We retrieve the data in CSV format from the MOOC's GitLab."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"data_url = \"https://gitlab.inria.fr/learninglab/mooc-rr/mooc-rr-ressources/-/raw/master/module3/Practical_session/Subject6_smoking.csv?inline=false\""
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"raw_data = pd.read_csv(data_url)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's take a look at the dataset."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Smoker</th>\n",
" <th>Status</th>\n",
" <th>Age</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>21.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>19.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>No</td>\n",
" <td>Dead</td>\n",
" <td>57.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>47.1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>81.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>36.8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>23.8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>Yes</td>\n",
" <td>Dead</td>\n",
" <td>57.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>24.8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>49.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>30.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>No</td>\n",
" <td>Dead</td>\n",
" <td>66.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>49.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>58.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>No</td>\n",
" <td>Dead</td>\n",
" <td>60.6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>25.1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>43.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>27.1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>58.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>65.7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>20</th>\n",
" <td>No</td>\n",
" <td>Dead</td>\n",
" <td>73.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>38.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>22</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>33.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>23</th>\n",
" <td>Yes</td>\n",
" <td>Dead</td>\n",
" <td>62.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>24</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>18.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>56.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>26</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>59.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>27</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>25.8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>28</th>\n",
" <td>No</td>\n",
" <td>Dead</td>\n",
" <td>36.9</td>\n",
" </tr>\n",
" <tr>\n",
" <th>29</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>20.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1284</th>\n",
" <td>Yes</td>\n",
" <td>Dead</td>\n",
" <td>36.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1285</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>48.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1286</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>63.1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1287</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>60.8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1288</th>\n",
" <td>Yes</td>\n",
" <td>Dead</td>\n",
" <td>39.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1289</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>36.7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1290</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>63.8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1291</th>\n",
" <td>No</td>\n",
" <td>Dead</td>\n",
" <td>71.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1292</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>57.7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1293</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>63.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1294</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>46.6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1295</th>\n",
" <td>Yes</td>\n",
" <td>Dead</td>\n",
" <td>82.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1296</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>38.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1297</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>32.7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1298</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>39.7</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1299</th>\n",
" <td>Yes</td>\n",
" <td>Dead</td>\n",
" <td>60.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1300</th>\n",
" <td>No</td>\n",
" <td>Dead</td>\n",
" <td>71.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1301</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>20.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1302</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>44.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1303</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>31.2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1304</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>47.8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1305</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>60.9</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1306</th>\n",
" <td>No</td>\n",
" <td>Dead</td>\n",
" <td>61.4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1307</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>43.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1308</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>42.1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1309</th>\n",
" <td>Yes</td>\n",
" <td>Alive</td>\n",
" <td>35.9</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1310</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>22.3</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1311</th>\n",
" <td>Yes</td>\n",
" <td>Dead</td>\n",
" <td>62.1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1312</th>\n",
" <td>No</td>\n",
" <td>Dead</td>\n",
" <td>88.6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1313</th>\n",
" <td>No</td>\n",
" <td>Alive</td>\n",
" <td>39.1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>1314 rows × 3 columns</p>\n",
"</div>"
],
"text/plain": [
" Smoker Status Age\n",
"0 Yes Alive 21.0\n",
"1 Yes Alive 19.3\n",
"2 No Dead 57.5\n",
"3 No Alive 47.1\n",
"4 Yes Alive 81.4\n",
"5 No Alive 36.8\n",
"6 No Alive 23.8\n",
"7 Yes Dead 57.5\n",
"8 Yes Alive 24.8\n",
"9 Yes Alive 49.5\n",
"10 Yes Alive 30.0\n",
"11 No Dead 66.0\n",
"12 Yes Alive 49.2\n",
"13 No Alive 58.4\n",
"14 No Dead 60.6\n",
"15 No Alive 25.1\n",
"16 No Alive 43.5\n",
"17 No Alive 27.1\n",
"18 No Alive 58.3\n",
"19 Yes Alive 65.7\n",
"20 No Dead 73.2\n",
"21 Yes Alive 38.3\n",
"22 No Alive 33.4\n",
"23 Yes Dead 62.3\n",
"24 No Alive 18.0\n",
"25 No Alive 56.2\n",
"26 Yes Alive 59.2\n",
"27 No Alive 25.8\n",
"28 No Dead 36.9\n",
"29 No Alive 20.2\n",
"... ... ... ...\n",
"1284 Yes Dead 36.0\n",
"1285 Yes Alive 48.3\n",
"1286 No Alive 63.1\n",
"1287 No Alive 60.8\n",
"1288 Yes Dead 39.3\n",
"1289 No Alive 36.7\n",
"1290 No Alive 63.8\n",
"1291 No Dead 71.3\n",
"1292 No Alive 57.7\n",
"1293 No Alive 63.2\n",
"1294 No Alive 46.6\n",
"1295 Yes Dead 82.4\n",
"1296 Yes Alive 38.3\n",
"1297 Yes Alive 32.7\n",
"1298 No Alive 39.7\n",
"1299 Yes Dead 60.0\n",
"1300 No Dead 71.0\n",
"1301 No Alive 20.5\n",
"1302 No Alive 44.4\n",
"1303 Yes Alive 31.2\n",
"1304 Yes Alive 47.8\n",
"1305 Yes Alive 60.9\n",
"1306 No Dead 61.4\n",
"1307 Yes Alive 43.0\n",
"1308 No Alive 42.1\n",
"1309 Yes Alive 35.9\n",
"1310 No Alive 22.3\n",
"1311 Yes Dead 62.1\n",
"1312 No Dead 88.6\n",
"1313 No Alive 39.1\n",
"\n",
"[1314 rows x 3 columns]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"raw_data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We now check that no data is missing and that the rows are consistent with one another."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Smoker</th>\n",
" <th>Status</th>\n",
" <th>Age</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"Empty DataFrame\n",
"Columns: [Smoker, Status, Age]\n",
"Index: []"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"raw_data[raw_data.isnull().any(axis=1)]"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Smoker</th>\n",
" <th>Status</th>\n",
" <th>Age</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"Empty DataFrame\n",
"Columns: [Smoker, Status, Age]\n",
"Index: []"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"raw_data[(raw_data['Smoker'] != \"Yes\") & (raw_data['Smoker'] != \"No\")]"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Smoker</th>\n",
" <th>Status</th>\n",
" <th>Age</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"Empty DataFrame\n",
"Columns: [Smoker, Status, Age]\n",
"Index: []"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"raw_data[(raw_data['Status'] != \"Alive\") & (raw_data['Status'] != \"Dead\")]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"All three checks return an empty DataFrame: the dataset has no missing values and no unexpected `Smoker` or `Status` labels."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"data = raw_data.copy()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Mortality rate (question 1)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
...
@@ -16,10 +750,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.6.3"
+ "version": "3.6.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
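The notebook ends with an empty code cell under the mortality-rate heading. The sketch below shows one possible way to combine the three validity checks into a single filter and then compute the share of `Dead` rows per `Smoker` group. It assumes only the column names and labels visible in the outputs above (`Smoker`, `Status`, `Age`, `Yes`/`No`, `Alive`/`Dead`); the helper names `find_invalid_rows` and `mortality_rate_by_smoker` and the toy rows are hypothetical and are not taken from the author's notebook or from the real dataset.

```python
import pandas as pd


def find_invalid_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Return rows with a missing value or an unexpected category label."""
    has_missing = df.isnull().any(axis=1)
    bad_smoker = ~df["Smoker"].isin(["Yes", "No"])
    bad_status = ~df["Status"].isin(["Alive", "Dead"])
    return df[has_missing | bad_smoker | bad_status]


def mortality_rate_by_smoker(df: pd.DataFrame) -> pd.Series:
    """Fraction of rows with Status == 'Dead' within each Smoker group."""
    is_dead = df["Status"] == "Dead"
    # The mean of a boolean Series is the proportion of True values.
    return is_dead.groupby(df["Smoker"]).mean()


# Hypothetical toy rows for illustration only, NOT the real dataset.
sample = pd.DataFrame(
    {
        "Smoker": ["Yes", "Yes", "No", "No", "No"],
        "Status": ["Alive", "Dead", "Dead", "Alive", "Alive"],
        "Age": [21.0, 57.5, 57.5, 47.1, 36.8],
    }
)

invalid = find_invalid_rows(sample)      # empty when the data is clean
rates = mortality_rate_by_smoker(sample)  # indexed by "No" and "Yes"
print(rates)
```

In the notebook itself, the same functions could be applied to `data` instead of `sample`. Grouping a boolean Series avoids counting `Dead` and `Alive` rows separately, but it gives a crude rate that ignores `Age`, so it should be read as a starting point rather than a final answer to question 1.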