The aim of this activity is to produce a convenient visualization of data describing the evolution of wages and wheat prices for English workers from the 16th to the 19th century.
The dataset at our disposal is a CSV file available at: <https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/HistData/Wheat.csv>
The following is the chart as made by William Playfair, showing at one view the price of both the quarter of wheat and wages of labour by the Week, from 1565 to 1821.
![*Playfair's chart*](img/playfair-chart.png)
In this document, we first try to reproduce the same chart using R. Then, we propose some enhancements to the visualization. Finally, we try to make the message behind Playfair's chart stand out better.
<!--
library(reshape2)
library(Hmisc)
-->
<!-- describe(df) -->
\newpage
## Preliminary steps
1. Importing the following R libraries:
```{r, results=FALSE, message=FALSE}
# The environment
library(tidyverse)
library(ggplot2)
```
2. Building the data frame:
From the [link](https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/HistData/Wheat.csv) cited above, we downloaded the CSV file containing the data and placed it in the `data/` folder.
We build the data frame as follows and print its first two rows to inspect its structure:
```{r, message=FALSE}
df <- read.csv("data/Wheat.csv", header = TRUE)
df[c(1, 2), ]
```
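The download step described above was done by hand. As a minimal sketch, it could also be scripted so that the file is fetched only when it is missing. The helper name `load_wheat()` and its two arguments are assumptions made for illustration; they are not part of the original workflow.

```r
# Hypothetical helper (not part of the original workflow): download the CSV
# only if it is not already on disk, then read it into a data frame.
load_wheat <- function(path = "data/Wheat.csv",
                       url = "https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/HistData/Wheat.csv") {
  if (!file.exists(path)) {
    # Create the target folder if needed, then fetch the file.
    dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)
    download.file(url, destfile = path, mode = "wb")
  }
  read.csv(path, header = TRUE)
}
```

Calling `load_wheat()` with its defaults would then return the same data frame as the `read.csv()` call above, provided the file is reachable.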
## Clean the data frame
We observe that the first column is merely a row identifier for each data sample. It carries no useful information, so we can simply omit it:
```{r, message=FALSE}
# Only keep columns 2 to 4 (column 1 is omitted):
df <- df[c(2:4)]
df[c(1, 2), ]
```
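Selecting columns by position works here, but it silently breaks if the column order of the file ever changes. As a hedged alternative, the sketch below selects the columns by name on a small toy data frame. The identifier column name `id` and all numeric values are placeholders chosen for illustration, not the real data.

```r
# Toy stand-in for the imported CSV; the values are placeholders, not real data,
# and the identifier column name `id` is an assumption.
toy <- data.frame(
  id    = 1:3,
  Year  = c(1565, 1570, 1575),
  Wheat = c(40, 45, 42),
  Wages = c(5.0, 5.1, 5.2)
)

# Select the columns of interest by name instead of by position.
keep <- c("Year", "Wheat", "Wages")
toy_clean <- toy[, keep]

print(names(toy_clean))
```

The same `keep` vector could be applied to the real data frame, assuming its columns carry these names.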
## Reproducing Playfair's graph
<!-- **!!TODO : perform required transformations in terms of wheat-price & salary** -->
- \underline{Note:} The last three values of Wages are missing from the dataset.
- \underline{Comment:} Wages increased throughout the period represented in this chart, and rose noticeably faster from around 1700. Wheat prices, on the other hand, show no consistent trend and at times changed rapidly.
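The note about missing wages can be checked programmatically. The sketch below shows one way to count missing values per column and to list the affected years. It runs on a toy data frame whose values are placeholders, with the last three wages set to `NA` only to mimic the situation described in the note.

```r
# Toy stand-in for the cleaned data frame; placeholder values, with the last
# three wages set to NA to mimic the situation described in the note.
toy <- data.frame(
  Year  = c(1800, 1805, 1810, 1815, 1820, 1821),
  Wheat = c(60, 65, 70, 75, 55, 50),
  Wages = c(28, 29, 30, NA, NA, NA)
)

# Count missing values per column.
na_counts <- colSums(is.na(toy))
print(na_counts)

# List the years for which wages are missing.
missing_years <- toy$Year[is.na(toy$Wages)]
print(missing_years)
```

Running the same two calls on `df` would show which rows a plot of Wages has to leave out.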
## Alternative representation
First, we represent the data as simple dots for both wages and wheat price.
As the pattern in the evolution of wheat prices is difficult to see, we use the `stat_smooth()` function, which shows a smoothed mean (with a confidence level of 70%).
```{r, message=FALSE}
# Set color parameters:
wages_color <- "#ff5733"
wheat_color <- rgb(0.2, 0.6, 0.9, 1)
wheat_color_trans <- rgb(0.2, 0.6, 0.9, 0.5)
```
```{r, message=FALSE}
# Start with a usual ggplot2 call:
ggplot(df, aes(x = Year)) +
  geom_point(aes(y = Wages), size = 0.7, color = wages_color) +