mooc-rr
Commit c778e2af authored Apr 22, 2020 by 26c634550904aba62520384fc8aa7dec
no commit message
parent 29d5c5f0
Showing 1 changed file with 201 additions and 409 deletions
module3/exo3/exercice.ipynb
...
@@ -158,13 +158,17 @@
"cell_type": "markdown",
"cell_type": "markdown",
"metadata": {},
"metadata": {},
"source": [
"source": [
"Loading from the Git data does not work\n",
"Loading from the Git data does not work  \n",
"To look at later, probably a problem in the link!!"
"To look at later, probably a problem in the link!!\n",
"\n",
"Looking at the top of the table, we see that the first column is useless, so we can drop it later.\n",
"The Year column will need to become the index (after checking that Python recognizes the date format, in case that proves useful).\n",
"The last 2 columns are already numeric, so no transformation is needed"
]
]
},
},
{
{
"cell_type": "code",
"cell_type": "code",
"execution_count": 10,
"execution_count": 14,
"metadata": {},
"metadata": {},
"outputs": [
"outputs": [
{
{
...
@@ -196,355 +200,126 @@
" </thead>\n",
" </thead>\n",
" <tbody>\n",
" <tbody>\n",
" <tr>\n",
" <tr>\n",
" <th>0</th>\n",
" <th>count</th>\n",
" <td>1</td>\n",
" <td>53.000000</td>\n",
" <td>1565</td>\n",
" <td>53.000000</td>\n",
" <td>41.0</td>\n",
" <td>53.000000</td>\n",
" <td>5.00</td>\n",
" <td>50.000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2</td>\n",
" <td>1570</td>\n",
" <td>45.0</td>\n",
" <td>5.05</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3</td>\n",
" <td>1575</td>\n",
" <td>42.0</td>\n",
" <td>5.08</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>4</td>\n",
" <td>1580</td>\n",
" <td>49.0</td>\n",
" <td>5.12</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>5</td>\n",
" <td>1585</td>\n",
" <td>41.5</td>\n",
" <td>5.15</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>6</td>\n",
" <td>1590</td>\n",
" <td>47.0</td>\n",
" <td>5.25</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>7</td>\n",
" <td>1595</td>\n",
" <td>64.0</td>\n",
" <td>5.54</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>8</td>\n",
" <td>1600</td>\n",
" <td>27.0</td>\n",
" <td>5.61</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>9</td>\n",
" <td>1605</td>\n",
" <td>33.0</td>\n",
" <td>5.69</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>10</td>\n",
" <td>1610</td>\n",
" <td>32.0</td>\n",
" <td>5.78</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>11</td>\n",
" <td>1615</td>\n",
" <td>33.0</td>\n",
" <td>5.94</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>12</td>\n",
" <td>1620</td>\n",
" <td>35.0</td>\n",
" <td>6.01</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>13</td>\n",
" <td>1625</td>\n",
" <td>33.0</td>\n",
" <td>6.12</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>14</td>\n",
" <td>1630</td>\n",
" <td>45.0</td>\n",
" <td>6.22</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>15</td>\n",
" <td>1635</td>\n",
" <td>33.0</td>\n",
" <td>6.30</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>16</td>\n",
" <td>1640</td>\n",
" <td>39.0</td>\n",
" <td>6.37</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>17</td>\n",
" <td>1645</td>\n",
" <td>53.0</td>\n",
" <td>6.45</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>18</td>\n",
" <td>1650</td>\n",
" <td>42.0</td>\n",
" <td>6.50</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>19</td>\n",
" <td>1655</td>\n",
" <td>40.5</td>\n",
" <td>6.60</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>20</td>\n",
" <td>1660</td>\n",
" <td>46.5</td>\n",
" <td>6.75</td>\n",
" </tr>\n",
" <tr>\n",
" <th>20</th>\n",
" <td>21</td>\n",
" <td>1665</td>\n",
" <td>32.0</td>\n",
" <td>6.80</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21</th>\n",
" <td>22</td>\n",
" <td>1670</td>\n",
" <td>37.0</td>\n",
" <td>6.90</td>\n",
" </tr>\n",
" <tr>\n",
" <th>22</th>\n",
" <td>23</td>\n",
" <td>1675</td>\n",
" <td>43.0</td>\n",
" <td>7.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>23</th>\n",
" <td>24</td>\n",
" <td>1680</td>\n",
" <td>35.0</td>\n",
" <td>7.30</td>\n",
" </tr>\n",
" <tr>\n",
" <th>24</th>\n",
" <td>25</td>\n",
" <td>1685</td>\n",
" <td>27.0</td>\n",
" <td>7.60</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25</th>\n",
" <td>26</td>\n",
" <td>1690</td>\n",
" <td>40.0</td>\n",
" <td>8.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>26</th>\n",
" <td>27</td>\n",
" <td>1695</td>\n",
" <td>50.0</td>\n",
" <td>8.50</td>\n",
" </tr>\n",
" <tr>\n",
" <th>27</th>\n",
" <td>28</td>\n",
" <td>1700</td>\n",
" <td>30.0</td>\n",
" <td>9.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>28</th>\n",
" <td>29</td>\n",
" <td>1705</td>\n",
" <td>32.0</td>\n",
" <td>10.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>29</th>\n",
" <td>30</td>\n",
" <td>1710</td>\n",
" <td>44.0</td>\n",
" <td>11.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>30</th>\n",
" <td>31</td>\n",
" <td>1715</td>\n",
" <td>33.0</td>\n",
" <td>11.75</td>\n",
" </tr>\n",
" <tr>\n",
" <th>31</th>\n",
" <td>32</td>\n",
" <td>1720</td>\n",
" <td>29.0</td>\n",
" <td>12.50</td>\n",
" </tr>\n",
" <tr>\n",
" <th>32</th>\n",
" <td>33</td>\n",
" <td>1725</td>\n",
" <td>39.0</td>\n",
" <td>13.00</td>\n",
" </tr>\n",
" </tr>\n",
" <tr>\n",
" <tr>\n",
" <th>33</th>\n",
" <th>mean</th>\n",
" <td>34</td>\n",
" <td>27.000000</td>\n",
" <td>1730</td>\n",
" <td>1694.924528</td>\n",
" <td>26.0</td>\n",
" <td>43.264151</td>\n",
" <td>13.30</td>\n",
" <td>11.581600</td>\n",
" </tr>\n",
" </tr>\n",
" <tr>\n",
" <tr>\n",
" <th>34</th>\n",
" <th>std</th>\n",
" <td>35</td>\n",
" <td>15.443445</td>\n",
" <td>1735</td>\n",
" <td>77.089571</td>\n",
" <td>32.0</td>\n",
" <td>15.410287</td>\n",
" <td>13.60</td>\n",
" <td>7.336287</td>\n",
" </tr>\n",
" </tr>\n",
" <tr>\n",
" <tr>\n",
" <th>35</th>\n",
" <th>min</th>\n",
" <td>36</td>\n",
" <td>1.000000</td>\n",
" <td>1740</td>\n",
" <td>1565.000000</td>\n",
" <td>27.0</td>\n",
" <td>26.000000</td>\n",
" <td>14.00</td>\n",
" <td>5.000000</td>\n",
" </tr>\n",
" </tr>\n",
" <tr>\n",
" <tr>\n",
" <th>36</th>\n",
" <th>25%</th>\n",
" <td>37</td>\n",
" <td>14.000000</td>\n",
" <td>1745</td>\n",
" <td>1630.000000</td>\n",
" <td>27.5</td>\n",
" <td>33.000000</td>\n",
" <td>14.50</td>\n",
" <td>6.145000</td>\n",
" </tr>\n",
" </tr>\n",
" <tr>\n",
" <tr>\n",
" <th>37</th>\n",
" <th>50%</th>\n",
" <td>38</td>\n",
" <td>27.000000</td>\n",
" <td>1750</td>\n",
" <td>1695.000000</td>\n",
" <td>31.0</td>\n",
" <td>41.000000</td>\n",
" <td>15.00</td>\n",
" <td>7.800000</td>\n",
" </tr>\n",
" </tr>\n",
" <tr>\n",
" <tr>\n",
" <th>38</th>\n",
" <th>75%</th>\n",
" <td>39</td>\n",
" <td>40.000000</td>\n",
" <td>1755</td>\n",
" <td>1760.000000</td>\n",
" <td>35.5</td>\n",
" <td>47.000000</td>\n",
" <td>15.70</td>\n",
" <td>14.875000</td>\n",
" </tr>\n",
" </tr>\n",
" <tr>\n",
" <tr>\n",
" <th>39</th>\n",
" <th>max</th>\n",
" <td>40</td>\n",
" <td>53.000000</td>\n",
" <td>1760</td>\n",
" <td>1821.000000</td>\n",
" <td>31.0</td>\n",
" <td>99.000000</td>\n",
" <td>16.50</td>\n",
" <td>30.000000</td>\n",
" </tr>\n",
" </tr>\n",
" <tr>\n",
" </tbody>\n",
" <th>40</th>\n",
"</table>\n",
" <td>41</td>\n",
"</div>"
" <td>1765</td>\n",
],
" <td>43.0</td>\n",
"text/plain": [
" <td>17.60</td>\n",
" Unnamed: 0 Year Wheat Wages\n",
" </tr>\n",
"count 53.000000 53.000000 53.000000 50.000000\n",
" <tr>\n",
"mean 27.000000 1694.924528 43.264151 11.581600\n",
" <th>41</th>\n",
"std 15.443445 77.089571 15.410287 7.336287\n",
" <td>42</td>\n",
"min 1.000000 1565.000000 26.000000 5.000000\n",
" <td>1770</td>\n",
"25% 14.000000 1630.000000 33.000000 6.145000\n",
" <td>47.0</td>\n",
"50% 27.000000 1695.000000 41.000000 7.800000\n",
" <td>18.50</td>\n",
"75% 40.000000 1760.000000 47.000000 14.875000\n",
" </tr>\n",
"max 53.000000 1821.000000 99.000000 30.000000"
" <tr>\n",
]
" <th>42</th>\n",
},
" <td>43</td>\n",
"execution_count": 14,
" <td>1775</td>\n",
"metadata": {},
" <td>44.0</td>\n",
"output_type": "execute_result"
" <td>19.50</td>\n",
}
" </tr>\n",
],
" <tr>\n",
"source": [
" <th>43</th>\n",
"raw_data.describe()\n"
" <td>44</td>\n",
]
" <td>1780</td>\n",
},
" <td>46.0</td>\n",
{
" <td>21.00</td>\n",
"cell_type": "markdown",
" </tr>\n",
"metadata": {},
" <tr>\n",
"source": [
" <th>44</th>\n",
"We will now check whether there are any missing values."
" <td>45</td>\n",
]
" <td>1785</td>\n",
},
" <td>42.0</td>\n",
{
" <td>23.00</td>\n",
"cell_type": "code",
" </tr>\n",
"execution_count": 15,
" <tr>\n",
"metadata": {},
" <th>45</th>\n",
"outputs": [
" <td>46</td>\n",
{
" <td>1790</td>\n",
"data": {
" <td>47.5</td>\n",
"text/html": [
" <td>25.50</td>\n",
"<div>\n",
" </tr>\n",
"<style scoped>\n",
" <tr>\n",
" .dataframe tbody tr th:only-of-type {\n",
" <th>46</th>\n",
" vertical-align: middle;\n",
" <td>47</td>\n",
" }\n",
" <td>1795</td>\n",
"\n",
" <td>76.0</td>\n",
" .dataframe tbody tr th {\n",
" <td>27.50</td>\n",
" vertical-align: top;\n",
" </tr>\n",
" }\n",
" <tr>\n",
"\n",
" <th>47</th>\n",
" .dataframe thead th {\n",
" <td>48</td>\n",
" text-align: right;\n",
" <td>1800</td>\n",
" }\n",
" <td>79.0</td>\n",
"</style>\n",
" <td>28.50</td>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" </tr>\n",
" <thead>\n",
" <tr>\n",
" <tr style=\"text-align: right;\">\n",
" <th>48</th>\n",
" <th></th>\n",
" <td>49</td>\n",
" <th>Unnamed: 0</th>\n",
" <td>1805</td>\n",
" <th>Year</th>\n",
" <td>81.0</td>\n",
" <th>Wheat</th>\n",
" <td>29.50</td>\n",
" <th>Wages</th>\n",
" </tr>\n",
" <tr>\n",
" <th>49</th>\n",
" <td>50</td>\n",
" <td>1810</td>\n",
" <td>99.0</td>\n",
" <td>30.00</td>\n",
" </tr>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <tr>\n",
" <th>50</th>\n",
" <th>50</th>\n",
" <td>51</td>\n",
" <td>51</td>\n",
...
@@ -572,97 +347,114 @@
],
],
"text/plain": [
"text/plain": [
" Unnamed: 0 Year Wheat Wages\n",
" Unnamed: 0 Year Wheat Wages\n",
"0 1 1565 41.0 5.00\n",
"1 2 1570 45.0 5.05\n",
"2 3 1575 42.0 5.08\n",
"3 4 1580 49.0 5.12\n",
"4 5 1585 41.5 5.15\n",
"5 6 1590 47.0 5.25\n",
"6 7 1595 64.0 5.54\n",
"7 8 1600 27.0 5.61\n",
"8 9 1605 33.0 5.69\n",
"9 10 1610 32.0 5.78\n",
"10 11 1615 33.0 5.94\n",
"11 12 1620 35.0 6.01\n",
"12 13 1625 33.0 6.12\n",
"13 14 1630 45.0 6.22\n",
"14 15 1635 33.0 6.30\n",
"15 16 1640 39.0 6.37\n",
"16 17 1645 53.0 6.45\n",
"17 18 1650 42.0 6.50\n",
"18 19 1655 40.5 6.60\n",
"19 20 1660 46.5 6.75\n",
"20 21 1665 32.0 6.80\n",
"21 22 1670 37.0 6.90\n",
"22 23 1675 43.0 7.00\n",
"23 24 1680 35.0 7.30\n",
"24 25 1685 27.0 7.60\n",
"25 26 1690 40.0 8.00\n",
"26 27 1695 50.0 8.50\n",
"27 28 1700 30.0 9.00\n",
"28 29 1705 32.0 10.00\n",
"29 30 1710 44.0 11.00\n",
"30 31 1715 33.0 11.75\n",
"31 32 1720 29.0 12.50\n",
"32 33 1725 39.0 13.00\n",
"33 34 1730 26.0 13.30\n",
"34 35 1735 32.0 13.60\n",
"35 36 1740 27.0 14.00\n",
"36 37 1745 27.5 14.50\n",
"37 38 1750 31.0 15.00\n",
"38 39 1755 35.5 15.70\n",
"39 40 1760 31.0 16.50\n",
"40 41 1765 43.0 17.60\n",
"41 42 1770 47.0 18.50\n",
"42 43 1775 44.0 19.50\n",
"43 44 1780 46.0 21.00\n",
"44 45 1785 42.0 23.00\n",
"45 46 1790 47.5 25.50\n",
"46 47 1795 76.0 27.50\n",
"47 48 1800 79.0 28.50\n",
"48 49 1805 81.0 29.50\n",
"49 50 1810 99.0 30.00\n",
"50 51 1815 78.0 NaN\n",
"50 51 1815 78.0 NaN\n",
"51 52 1820 54.0 NaN\n",
"51 52 1820 54.0 NaN\n",
"52 53 1821 54.0 NaN"
"52 53 1821 54.0 NaN"
]
]
},
},
"execution_count": 10,
"execution_count": 15,
"metadata": {},
"metadata": {},
"output_type": "execute_result"
"output_type": "execute_result"
}
}
],
],
"source": [
"source": [
"raw_data\n"
"raw_data[raw_data.isnull().any(axis=1)]"
]
]
},
},
{
{
"cell_type": "code",
"cell_type": "markdown",
"execution_count": null,
"metadata": {},
"metadata": {},
"outputs": [],
"source": [
"source": []
"There are 3 rows with missing data, and only in the wages. We will therefore keep them for now.\n",
"We will then drop the first column and set the Year column as the index."
]
},
},
{
{
"cell_type": "code",
"cell_type": "code",
"execution_count": null,
"execution_count": 37,
"metadata": {},
"metadata": {},
"outputs": [],
"outputs": [
"source": []
},
{
{
"cell_type": "code",
"data": {
"execution_count": null,
"text/html": [
"metadata": {},
"<div>\n",
"outputs": [],
"<style scoped>\n",
"source": []
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Wheat</th>\n",
" <th>Wages</th>\n",
" </tr>\n",
" <tr>\n",
" <th>Year</th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1565</th>\n",
" <td>41.0</td>\n",
" <td>5.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1570</th>\n",
" <td>45.0</td>\n",
" <td>5.05</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1575</th>\n",
" <td>42.0</td>\n",
" <td>5.08</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1580</th>\n",
" <td>49.0</td>\n",
" <td>5.12</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1585</th>\n",
" <td>41.5</td>\n",
" <td>5.15</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Wheat Wages\n",
"Year \n",
"1565 41.0 5.00\n",
"1570 45.0 5.05\n",
"1575 42.0 5.08\n",
"1580 49.0 5.12\n",
"1585 41.5 5.15"
]
},
},
{
"execution_count": 37,
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {},
"outputs": [],
"output_type": "execute_result"
"source": []
}
],
"source": [
"colonne0=list(raw_data)[0]\n",
"sorted_data = raw_data.set_index('Year').sort_index().drop(colonne0,axis=1) \n",
"# it does not work if I try to combine the 2 lines into one !!\n",
"sorted_data.head()"
]
}
}
],
],
"metadata": {
"metadata": {
...
...
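The cleanup steps described in the new markdown cells (list the rows with missing values, drop the unnamed first column, index by `Year`) can be sketched as below. This is a minimal sketch, not the notebook's actual code: `CSV_TEXT` is a hypothetical stand-in holding a few rows copied from the output shown in the diff, and `missing_rows` and `first_column` are illustrative names. The diff does not show why combining the two lines failed for the author, so the chained form here is only one way it might be written.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real data file: a few rows copied from the
# notebook output, including the unnamed row-number column and missing Wages.
CSV_TEXT = """,Year,Wheat,Wages
1,1565,41.0,5.00
2,1570,45.0,5.05
50,1810,99.0,30.00
51,1815,78.0,
52,1820,54.0,
53,1821,54.0,
"""

raw_data = pd.read_csv(io.StringIO(CSV_TEXT))

# Rows with at least one missing value (same expression as in the notebook).
missing_rows = raw_data[raw_data.isnull().any(axis=1)]

# Look up the first column by position, then drop it and index by Year
# in a single chained expression.
first_column = raw_data.columns[0]
sorted_data = (
    raw_data
    .drop(columns=first_column)
    .set_index("Year")
    .sort_index()
)

print(missing_rows)
print(sorted_data.head())
```

On the note about checking the date format: the years here start in 1565, which is earlier than pandas' default nanosecond-resolution timestamps can represent (they begin in 1677), so keeping `Year` as a plain integer index is probably the simpler choice.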