"# Le pouvoir d'achat des ouvriers anglais du XVIe au XIX siècle"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[William Playfair](https://fr.wikipedia.org/wiki/William_Playfair) was one of the pioneers of the graphical presentation of data. He is notably credited with inventing the bar chart.\n",
"One of his famous charts, taken from his book [\"A Letter on Our Agricultural Distresses, Their Causes and Remedies\"](https://books.google.fr/books/about/A_Letter_on_Our_Agricultural_Distresses.html?id=aQZGAQAAMAAJ), shows [the evolution of the price of wheat and of average wages between 1565 and 1821](https://fr.wikipedia.org/wiki/William_Playfair#/media/File:Chart_Showing_at_One_View_the_Price_of_the_Quarter_of_Wheat,_and_Wages_of_Labour_by_the_Week,_from_1565_to_1821.png). Playfair did not publish the raw numerical data he used, because in his time replicability was not yet considered essential. [Values obtained by digitizing the chart](https://vincentarelbundock.github.io/Rdatasets/doc/HistData/Wheat.html) can be downloaded today, the [CSV version](https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/HistData/Wheat.csv) being the most convenient.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A few remarks to help understand the data:\n",
"* Until 1971, the pound sterling was divided into 20 shillings, and a shilling into 12 pence.\n",
"* The price of wheat is given in shillings per quarter bushel of wheat. A quarter bushel is equivalent to 15 British pounds (weight), or 6.8 kg.\n",
"* Wages are given in shillings per week."
]
},
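{
"cell_type": "markdown",
"metadata": {},
"source": [
"The pre-decimal arithmetic above (1 pound = 20 shillings, 1 shilling = 12 pence) can be sketched in a few lines. This is only an illustration: the helper `shillings_to_lsd` is a hypothetical name introduced here, not part of the original analysis, and it assumes amounts are given as a possibly fractional number of shillings, as in the `Wheat` and `Wages` columns."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical helper, for illustration only (not part of the original analysis).\n",
"SHILLINGS_PER_POUND = 20\n",
"PENCE_PER_SHILLING = 12\n",
"\n",
"def shillings_to_lsd(shillings):\n",
"    '''Split an amount in shillings into (pounds, shillings, pence).'''\n",
"    total_pence = round(shillings * PENCE_PER_SHILLING)\n",
"    pounds, rest = divmod(total_pence, SHILLINGS_PER_POUND * PENCE_PER_SHILLING)\n",
"    whole_shillings, pence = divmod(rest, PENCE_PER_SHILLING)\n",
"    return pounds, whole_shillings, pence\n",
"\n",
"# 41.5 shillings (the 1585 wheat price in the dataset) is 2 pounds, 1 shilling and 6 pence.\n",
"shillings_to_lsd(41.5)"
]
},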
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Reproducing Playfair's chart from the numerical data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The task is to:\n",
"* Represent, as Playfair did, the price of wheat with bars and wages with a blue area bounded by a red curve.\n",
"* Superimpose the two in a single chart, in the same way.\n",
"\n",
"The style of the chart may differ from the original, but the overall impression should be the same."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To create a chart showing the price of wheat as bars and wages as a blue area bounded by a red curve, you can use the `pandas` library to manipulate the data and `matplotlib` to draw the chart.\n",
"\n",
"Are there missing points in this dataset? Yes: the years 1815, 1820 and 1821 have no associated wage values."
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>rownames</th>\n",
" <th>Year</th>\n",
" <th>Wheat</th>\n",
" <th>Wages</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>50</th>\n",
" <td>51</td>\n",
" <td>1815</td>\n",
" <td>78.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>51</th>\n",
" <td>52</td>\n",
" <td>1820</td>\n",
" <td>54.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>52</th>\n",
" <td>53</td>\n",
" <td>1821</td>\n",
" <td>54.0</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" rownames Year Wheat Wages\n",
"50 51 1815 78.0 NaN\n",
"51 52 1820 54.0 NaN\n",
"52 53 1821 54.0 NaN"
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"raw_data[raw_data.isnull().any(axis=1)]"
]
},
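{
"cell_type": "markdown",
"metadata": {},
"source": [
"We drop these three rows, which has little impact on our analysis: only the wage values for the last three dates (1815, 1820 and 1821) are missing, and the 50 remaining observations cover 1565 to 1810."
]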
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>rownames</th>\n",
" <th>Year</th>\n",
" <th>Wheat</th>\n",
" <th>Wages</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>1565</td>\n",
" <td>41.0</td>\n",
" <td>5.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2</td>\n",
" <td>1570</td>\n",
" <td>45.0</td>\n",
" <td>5.05</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3</td>\n",
" <td>1575</td>\n",
" <td>42.0</td>\n",
" <td>5.08</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>4</td>\n",
" <td>1580</td>\n",
" <td>49.0</td>\n",
" <td>5.12</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>5</td>\n",
" <td>1585</td>\n",
" <td>41.5</td>\n",
" <td>5.15</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>6</td>\n",
" <td>1590</td>\n",
" <td>47.0</td>\n",
" <td>5.25</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>7</td>\n",
" <td>1595</td>\n",
" <td>64.0</td>\n",
" <td>5.54</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>8</td>\n",
" <td>1600</td>\n",
" <td>27.0</td>\n",
" <td>5.61</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>9</td>\n",
" <td>1605</td>\n",
" <td>33.0</td>\n",
" <td>5.69</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>10</td>\n",
" <td>1610</td>\n",
" <td>32.0</td>\n",
" <td>5.78</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>11</td>\n",
" <td>1615</td>\n",
" <td>33.0</td>\n",
" <td>5.94</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>12</td>\n",
" <td>1620</td>\n",
" <td>35.0</td>\n",
" <td>6.01</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>13</td>\n",
" <td>1625</td>\n",
" <td>33.0</td>\n",
" <td>6.12</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>14</td>\n",
" <td>1630</td>\n",
" <td>45.0</td>\n",
" <td>6.22</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>15</td>\n",
" <td>1635</td>\n",
" <td>33.0</td>\n",
" <td>6.30</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>16</td>\n",
" <td>1640</td>\n",
" <td>39.0</td>\n",
" <td>6.37</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>17</td>\n",
" <td>1645</td>\n",
" <td>53.0</td>\n",
" <td>6.45</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>18</td>\n",
" <td>1650</td>\n",
" <td>42.0</td>\n",
" <td>6.50</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>19</td>\n",
" <td>1655</td>\n",
" <td>40.5</td>\n",
" <td>6.60</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>20</td>\n",
" <td>1660</td>\n",
" <td>46.5</td>\n",
" <td>6.75</td>\n",
" </tr>\n",
" <tr>\n",
" <th>20</th>\n",
" <td>21</td>\n",
" <td>1665</td>\n",
" <td>32.0</td>\n",
" <td>6.80</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21</th>\n",
" <td>22</td>\n",
" <td>1670</td>\n",
" <td>37.0</td>\n",
" <td>6.90</td>\n",
" </tr>\n",
" <tr>\n",
" <th>22</th>\n",
" <td>23</td>\n",
" <td>1675</td>\n",
" <td>43.0</td>\n",
" <td>7.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>23</th>\n",
" <td>24</td>\n",
" <td>1680</td>\n",
" <td>35.0</td>\n",
" <td>7.30</td>\n",
" </tr>\n",
" <tr>\n",
" <th>24</th>\n",
" <td>25</td>\n",
" <td>1685</td>\n",
" <td>27.0</td>\n",
" <td>7.60</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25</th>\n",
" <td>26</td>\n",
" <td>1690</td>\n",
" <td>40.0</td>\n",
" <td>8.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>26</th>\n",
" <td>27</td>\n",
" <td>1695</td>\n",
" <td>50.0</td>\n",
" <td>8.50</td>\n",
" </tr>\n",
" <tr>\n",
" <th>27</th>\n",
" <td>28</td>\n",
" <td>1700</td>\n",
" <td>30.0</td>\n",
" <td>9.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>28</th>\n",
" <td>29</td>\n",
" <td>1705</td>\n",
" <td>32.0</td>\n",
" <td>10.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>29</th>\n",
" <td>30</td>\n",
" <td>1710</td>\n",
" <td>44.0</td>\n",
" <td>11.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>30</th>\n",
" <td>31</td>\n",
" <td>1715</td>\n",
" <td>33.0</td>\n",
" <td>11.75</td>\n",
" </tr>\n",
" <tr>\n",
" <th>31</th>\n",
" <td>32</td>\n",
" <td>1720</td>\n",
" <td>29.0</td>\n",
" <td>12.50</td>\n",
" </tr>\n",
" <tr>\n",
" <th>32</th>\n",
" <td>33</td>\n",
" <td>1725</td>\n",
" <td>39.0</td>\n",
" <td>13.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>33</th>\n",
" <td>34</td>\n",
" <td>1730</td>\n",
" <td>26.0</td>\n",
" <td>13.30</td>\n",
" </tr>\n",
" <tr>\n",
" <th>34</th>\n",
" <td>35</td>\n",
" <td>1735</td>\n",
" <td>32.0</td>\n",
" <td>13.60</td>\n",
" </tr>\n",
" <tr>\n",
" <th>35</th>\n",
" <td>36</td>\n",
" <td>1740</td>\n",
" <td>27.0</td>\n",
" <td>14.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36</th>\n",
" <td>37</td>\n",
" <td>1745</td>\n",
" <td>27.5</td>\n",
" <td>14.50</td>\n",
" </tr>\n",
" <tr>\n",
" <th>37</th>\n",
" <td>38</td>\n",
" <td>1750</td>\n",
" <td>31.0</td>\n",
" <td>15.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>38</th>\n",
" <td>39</td>\n",
" <td>1755</td>\n",
" <td>35.5</td>\n",
" <td>15.70</td>\n",
" </tr>\n",
" <tr>\n",
" <th>39</th>\n",
" <td>40</td>\n",
" <td>1760</td>\n",
" <td>31.0</td>\n",
" <td>16.50</td>\n",
" </tr>\n",
" <tr>\n",
" <th>40</th>\n",
" <td>41</td>\n",
" <td>1765</td>\n",
" <td>43.0</td>\n",
" <td>17.60</td>\n",
" </tr>\n",
" <tr>\n",
" <th>41</th>\n",
" <td>42</td>\n",
" <td>1770</td>\n",
" <td>47.0</td>\n",
" <td>18.50</td>\n",
" </tr>\n",
" <tr>\n",
" <th>42</th>\n",
" <td>43</td>\n",
" <td>1775</td>\n",
" <td>44.0</td>\n",
" <td>19.50</td>\n",
" </tr>\n",
" <tr>\n",
" <th>43</th>\n",
" <td>44</td>\n",
" <td>1780</td>\n",
" <td>46.0</td>\n",
" <td>21.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>44</th>\n",
" <td>45</td>\n",
" <td>1785</td>\n",
" <td>42.0</td>\n",
" <td>23.00</td>\n",
" </tr>\n",
" <tr>\n",
" <th>45</th>\n",
" <td>46</td>\n",
" <td>1790</td>\n",
" <td>47.5</td>\n",
" <td>25.50</td>\n",
" </tr>\n",
" <tr>\n",
" <th>46</th>\n",
" <td>47</td>\n",
" <td>1795</td>\n",
" <td>76.0</td>\n",
" <td>27.50</td>\n",
" </tr>\n",
" <tr>\n",
" <th>47</th>\n",
" <td>48</td>\n",
" <td>1800</td>\n",
" <td>79.0</td>\n",
" <td>28.50</td>\n",
" </tr>\n",
" <tr>\n",
" <th>48</th>\n",
" <td>49</td>\n",
" <td>1805</td>\n",
" <td>81.0</td>\n",
" <td>29.50</td>\n",
" </tr>\n",
" <tr>\n",
" <th>49</th>\n",
" <td>50</td>\n",
" <td>1810</td>\n",
" <td>99.0</td>\n",
" <td>30.00</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" rownames Year Wheat Wages\n",
"0 1 1565 41.0 5.00\n",
"1 2 1570 45.0 5.05\n",
"2 3 1575 42.0 5.08\n",
"3 4 1580 49.0 5.12\n",
"4 5 1585 41.5 5.15\n",
"5 6 1590 47.0 5.25\n",
"6 7 1595 64.0 5.54\n",
"7 8 1600 27.0 5.61\n",
"8 9 1605 33.0 5.69\n",
"9 10 1610 32.0 5.78\n",
"10 11 1615 33.0 5.94\n",
"11 12 1620 35.0 6.01\n",
"12 13 1625 33.0 6.12\n",
"13 14 1630 45.0 6.22\n",
"14 15 1635 33.0 6.30\n",
"15 16 1640 39.0 6.37\n",
"16 17 1645 53.0 6.45\n",
"17 18 1650 42.0 6.50\n",
"18 19 1655 40.5 6.60\n",
"19 20 1660 46.5 6.75\n",
"20 21 1665 32.0 6.80\n",
"21 22 1670 37.0 6.90\n",
"22 23 1675 43.0 7.00\n",
"23 24 1680 35.0 7.30\n",
"24 25 1685 27.0 7.60\n",
"25 26 1690 40.0 8.00\n",
"26 27 1695 50.0 8.50\n",
"27 28 1700 30.0 9.00\n",
"28 29 1705 32.0 10.00\n",
"29 30 1710 44.0 11.00\n",
"30 31 1715 33.0 11.75\n",
"31 32 1720 29.0 12.50\n",
"32 33 1725 39.0 13.00\n",
"33 34 1730 26.0 13.30\n",
"34 35 1735 32.0 13.60\n",
"35 36 1740 27.0 14.00\n",
"36 37 1745 27.5 14.50\n",
"37 38 1750 31.0 15.00\n",
"38 39 1755 35.5 15.70\n",
"39 40 1760 31.0 16.50\n",
"40 41 1765 43.0 17.60\n",
"41 42 1770 47.0 18.50\n",
"42 43 1775 44.0 19.50\n",
"43 44 1780 46.0 21.00\n",
"44 45 1785 42.0 23.00\n",
"45 46 1790 47.5 25.50\n",
"46 47 1795 76.0 27.50\n",
"47 48 1800 79.0 28.50\n",
"48 49 1805 81.0 29.50\n",
"49 50 1810 99.0 30.00"
]
},
"execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data = raw_data.dropna().copy()\n",
"data"
]
},
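{
"cell_type": "markdown",
"metadata": {},
"source": [
"With complete rows in hand, the chart described earlier (wheat price as bars, wages as a blue area bounded by a red curve) can be sketched with `matplotlib`. This is a minimal sketch, not Playfair's styling: the frame `sample` is an illustrative name and holds only the first eight rows copied from the table above, and the colours, the bar width and the shared shillings axis are assumptions. Passing the full cleaned table instead of `sample` should give the complete picture."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"\n",
"# First eight rows copied from the table above; the name `sample` is illustrative.\n",
"sample = pd.DataFrame({\n",
"    'Year': [1565, 1570, 1575, 1580, 1585, 1590, 1595, 1600],\n",
"    'Wheat': [41.0, 45.0, 42.0, 49.0, 41.5, 47.0, 64.0, 27.0],\n",
"    'Wages': [5.00, 5.05, 5.08, 5.12, 5.15, 5.25, 5.54, 5.61],\n",
"})\n",
"\n",
"fig, ax = plt.subplots(figsize=(10, 5))\n",
"# Wheat price: one bar per observation; width=5 assumes 5-year spacing.\n",
"ax.bar(sample['Year'], sample['Wheat'], width=5, align='edge',\n",
"       color='lightgrey', edgecolor='black', label='Wheat price')\n",
"# Wages: blue area down to zero, bounded above by a red line.\n",
"ax.fill_between(sample['Year'], sample['Wages'], color='tab:blue', alpha=0.6)\n",
"ax.plot(sample['Year'], sample['Wages'], color='red', label='Weekly wages')\n",
"ax.set_xlabel('Year')\n",
"ax.set_ylabel('Shillings')\n",
"ax.legend()\n",
"plt.show()"
]
},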
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We set the observation years as the new index of our dataset. This turns it into a time series, which will be convenient later on. We then sort the points by period, in chronological order."