---
title: "Subject 2: Purchasing power of English workers from the 16th to the 19th century"
author: "Oussama Oulkaid"
date: "November 28, 2021"
output:
  pdf_document: default
  html_document:
    df_print: paged
urlcolor: blue
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
options(warn = -1)
```

## Preamble

The aim of this activity is to produce clear visualizations of data describing the evolution of wages and wheat prices for English workers from the 16th to the 19th century. The dataset at our disposal is a csv file available at: <https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/HistData/Wheat.csv>.

The following chart, made by William Playfair, shows at one view the price of the quarter of wheat and the weekly wages of labour, from 1565 to 1821.

![Evolution of the wheat price and average salaries from 1565 to 1821 *(source: [Wikimedia](https://commons.wikimedia.org/wiki/File:Chart_Showing_at_One_View_the_Price_of_the_Quarter_of_Wheat,_and_Wages_of_Labour_by_the_Week,_from_1565_to_1821.png))*](img/playfair-chart.png)

In this document, we first try to reproduce the same chart using R. Then, we propose some improvements to the visualization. Finally, we try to make the message behind Playfair's chart stand out better.

\newpage

## Preliminary steps

1. Importing the following R libraries:

```{r, results=FALSE, message=FALSE}
library(tidyverse)
library(ggplot2)
```

2. Building the data frame:

From the [link](https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/HistData/Wheat.csv) cited above, we have downloaded the csv file containing the data. It is located in the `data/` folder. We build the data frame as follows, and print its first two rows to have a look at its structure:

```{r, message=FALSE}
# Read the csv file; its first line holds the column names:
df <- read.csv("data/Wheat.csv", header = TRUE)
# Show the first two rows:
df[c(1, 2), ]
```

We observe that the first column is simply a row identifier for each observation.
It carries no useful information, so we simply drop it:

```{r, message=FALSE}
# Only keep columns 2 to 4 (column 1 is omitted):
df <- df[, 2:4]
df[c(1, 2), ]
```

## Reproducing Playfair's graph

```{r, message=FALSE}
# Create a list of custom colors:
my_colors <- list(
  blue = "#3399e6",
  red  = "#ff3333",
  dark = "#1f1f1f"
)
```

```{r, message=FALSE}
# Start with a usual ggplot2 call:
ggplot(df, aes(x = Year)) +
  # Plot the wheat price as bars, divided by 1.5 to fit the wages axis:
  geom_col(
    aes(y = Wheat / 1.5),
    width = 4.15, alpha = 1,
    color = my_colors[["dark"]], fill = my_colors[["dark"]]
  ) +
  # Plot wages as a blue filled area with a red outline:
  geom_area(
    aes(y = Wages),
    size = 1, alpha = 0.7,
    color = my_colors[["red"]], fill = my_colors[["blue"]]
  ) +
  # Customize the Y scales (the secondary axis undoes the 1.5 scaling):
  scale_y_continuous(
    name = "Wages (in Shillings per week)",
    sec.axis = sec_axis(
      trans = ~ . * 1.5,
      name = "Wheat Price (in Shillings per quarter)"
    )
  ) +
  labs(title = "Evolution of wages and wheat price for English workers (16th to 19th century)") +
  theme(
    plot.title = element_text(hjust = 0.5),
    axis.title.y.left = element_text(colour = my_colors[["red"]]),
    axis.title.y.right = element_text(colour = my_colors[["dark"]])
  )
```

- \underline{Note:} The last three values of Wages are missing from the dataset.
- \underline{Comment:} Wages increased over the whole period represented in this chart, and the pace of the increase became noticeably faster from around 1700. The wheat price, on the other hand, shows no consistent trend, and at times changed abruptly from one period to the next.

## Alternative representation

First, we represent the data as simple dots for both wages and wheat price. As it is difficult to see a pattern in the evolution of the wheat values, we use the `stat_smooth()` function, which draws a smoothed mean of the wheat price together with a confidence band (here at the 70% level).
```{r, message=FALSE}
# Set color parameters (the last rgb() argument is the opacity):
wages_color <- "#ff5733"
wheat_color <- rgb(0.2, 0.6, 0.9, 1)
wheat_color_trans <- rgb(0.2, 0.6, 0.9, 0.5)
```

```{r, message=FALSE}
# Start with a usual ggplot2 call:
ggplot(df, aes(x = Year)) +
  # Dots for wages and for the wheat price (divided by 1.5):
  geom_point(aes(y = Wages), size = 0.7, color = wages_color) +
  geom_point(aes(y = Wheat / 1.5), size = 0.7, color = wheat_color) +
  # Thin dashed lines connecting the dots:
  geom_line(aes(y = Wages), size = 0.3, color = wages_color, linetype = "dashed") +
  geom_line(aes(y = Wheat / 1.5), size = 0.3, color = wheat_color, linetype = "dashed") +
  # Smoothed wheat price with a 70% confidence band:
  stat_smooth(aes(y = Wheat / 1.5), level = 0.7, size = 0.6, color = wheat_color_trans) +
  # Customize the Y scales (the secondary axis undoes the 1.5 scaling):
  scale_y_continuous(
    name = "Wages (in Shillings per week)",
    sec.axis = sec_axis(
      trans = ~ . * 1.5,
      name = "Wheat (in Shillings per quarter)"
    )
  ) +
  labs(title = "Evolution of wages and wheat price for English workers (16th to 19th century)") +
  theme(
    plot.title = element_text(hjust = 0.5),
    axis.title.y.left = element_text(colour = wages_color),
    axis.title.y.right = element_text(colour = wheat_color)
  )
```

## Another representation without an explicit time axis

```{r, message=FALSE}
# Work on a copy of the full data frame:
df_sub <- df[, ]
ggplot(df_sub, aes(x = Wages, y = Wheat)) +
  geom_area(
    size = 0.5, alpha = 0.5, linetype = "dashed",
    color = "#ff3333", fill = "#3399e6"
  ) +
  geom_point(size = 0.7, shape = 4) +
  # Limit the X axis to the observed wages, ignoring missing values:
  xlim(min(df_sub$Wages, na.rm = TRUE), max(df_sub$Wages, na.rm = TRUE)) +
  labs(title = "Evolution of wheat price with respect to the wages of English workers (16th to 19th century)") +
  theme(plot.title = element_text(hjust = 0.5))
```
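
To make the purchasing-power message of the chart more explicit, one possible sketch is to plot the quantity of wheat that one week of wages could buy. The `purchasing_power()` helper, the `wheat_data` data frame and its `Power` column are hypothetical names introduced here for illustration; they are not part of the analysis above. The ratio is expressed in quarters of wheat per week, simply because wages are given in shillings per week and wheat prices in shillings per quarter.

```{r, message=FALSE}
library(ggplot2)

# Hypothetical helper (an assumption, not part of the original analysis):
# quarters of wheat that one week of wages can buy.
purchasing_power <- function(wages, wheat) {
  wages / wheat
}

# Re-read the data so that this chunk stands on its own
# (assumes the csv file is in data/, as in the preliminary steps):
wheat_data <- read.csv("data/Wheat.csv", header = TRUE)
wheat_data$Power <- purchasing_power(wheat_data$Wages, wheat_data$Wheat)

# Rows with missing wages are dropped before plotting:
ggplot(na.omit(wheat_data), aes(x = Year, y = Power)) +
  geom_line(color = "#3399e6") +
  geom_point(size = 0.7) +
  labs(
    title = "Wheat affordable with one week of wages",
    y = "Quarters of wheat per week of wages"
  ) +
  theme(plot.title = element_text(hjust = 0.5))
```

With this view, periods where the curve rises are periods where wages grew faster than the wheat price, which is one way of reading the message of Playfair's chart directly from a single series.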