---
title: "Subject 2: Purchasing power of English workers from the 16th to the 19th century"
author: "Oussama Oulkaid"
date: "November 28, 2021"
output:
  pdf_document: default
  html_document:
    df_print: paged
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
options(warn = -1) 
```

## Preamble
The aim of this activity is to visualize data describing the evolution of wages and wheat prices for English workers from the 16th to the 19th century.

<!-- 
library(reshape2)
library(Hmisc)
-->
We will need to use the following libraries:
```{r, results=FALSE, message=FALSE}
# Load the tidyverse; it already attaches ggplot2, which is also loaded explicitly for clarity
library(tidyverse)
library(ggplot2)
```

## Build the data frame
We downloaded the data as a CSV file from the following link and saved it in the **data/** folder: <https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/HistData/Wheat.csv>

<!-- describe(df) -->
We build the data frame as follows and print the first two rows to look at its structure:
```{r, message=FALSE}
df <- read.csv("data/Wheat.csv", header = TRUE)
df[c(1,2),]
```

## Clean the data frame
We observe that the first column is only a row identifier for each observation. It carries no information for our analysis, so we simply drop it:
```{r, message=FALSE}
# keep only columns 2 to 4 (column 1, the row identifier, is dropped)
df <- df[2:4]
df[c(1,2),]
```
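
Dropping the column by position works for this file, but it silently depends on the column order. As a minimal sketch, assuming the columns of interest are named `Year`, `Wheat` and `Wages` (the names used in the plots of this document), the same cleaning step can select columns by name. The toy data frame `raw`, its values, and the identifier column name `X` are made up for illustration.

```r
# Toy data frame standing in for the raw CSV (illustrative values only)
raw <- data.frame(
  X     = 1:3,                  # row identifier (column name is an assumption)
  Year  = c(1565, 1570, 1575),
  Wheat = c(41, 45, 42),
  Wages = c(5, 5.05, 5.08)
)

# Keep the columns of interest by name instead of by position
keep <- c("Year", "Wheat", "Wages")
cleaned <- raw[keep]

names(cleaned)
```

Selecting by name keeps the step valid even if the identifier column is renamed or moved in a later version of the file.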

## Reproducing Playfair's graph
**TODO: perform the required transformations on the wheat price and wages.**
```{r, message=FALSE}
# create a list of colors
my_colors <- list( blue = "#3399e6",
                   red  = "#ff3333",
                   dark = "#1f1f1f")

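# Start with a usual ggplot2 call.
# Wheat is divided by 1.5 so that it fits on the wages axis; the secondary axis undoes this scaling.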
ggplot(df, aes(x=Year)) + 
  geom_col( aes(y=Wheat/1.5), width = 4.15, alpha=1,
            color = my_colors[["dark"]], 
            fill = my_colors[["dark"]] ) +
  geom_area( aes(y=Wages), size = 1, alpha=0.7,
             color = my_colors[["red"]], 
             fill=my_colors[["blue"]] ) + 
  
  # Customize the y scales:
  scale_y_continuous(
    # Features of the first axis
    name = "Wages",
    # Add a second axis and specify its features
    sec.axis = sec_axis( trans=~.*1.5, name="Wheat Price") 
  ) +
  labs(title = "Evolution of wages and wheat price for English workers (16th to 19th century)") +
  theme(plot.title = element_text(hjust = 0.5), 
        axis.title.y.left = element_text(colour = my_colors[["red"]]),
        axis.title.y.right = element_text(colour = my_colors[["dark"]]))
```
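
ggplot2 draws both series on a single y scale, so the chunk above plots the wheat price divided by 1.5 and lets `sec_axis()` multiply the axis labels by 1.5 to show the original units again. The following sketch checks that round trip on a few hypothetical values; `scale_factor` and the numbers are introduced here for illustration only.

```r
# Scaling factor shared by the plotted series and the secondary axis
scale_factor <- 1.5

# Hypothetical wheat prices
wheat <- c(41, 45, 42, 49)

# What is actually drawn on the (wages) y scale
wheat_plotted <- wheat / scale_factor

# What the secondary axis reports for those positions
wheat_labels <- wheat_plotted * scale_factor

# The labels recover the original values
all.equal(wheat_labels, wheat)
```

The two transformations must stay inverse to each other: if the divisor in `aes()` changes, the multiplier in `sec_axis()` has to change with it.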

## Alternative representation
First, we plot both wages and wheat price as small points joined by thin dashed lines.

Because the wheat price fluctuates strongly, its overall trend is hard to read from the points alone, so we add a smoothed curve with the `stat_smooth()` function.
```{r, message=FALSE}
# Set color parameters
wages_color <- "#ff5733"
wheat_color <- rgb(0.2, 0.6, 0.9, 1)
wheat_color_trans <- rgb(0.2, 0.6, 0.9, 0.5)

# Start with a usual ggplot2 call:
ggplot(df, aes(x=Year)) + 
  geom_point( aes(y=Wages), size = 0.7, color = wages_color) + 
  geom_point( aes(y=Wheat/1.5), size = 0.7, color = wheat_color) +
  geom_line( aes(y=Wages), size = 0.3, color = wages_color, linetype="dashed") +
  geom_line( aes(y=Wheat/1.5), size = 0.3, color = wheat_color, linetype="dashed") +
  # se = FALSE draws the smoothed curve without a confidence band
  stat_smooth(aes(y=Wheat/1.5), se = FALSE, size=0.6, color=wheat_color_trans) +

  # Customize the y scales:
  scale_y_continuous(
    # Features of the first axis
    name = "Wages",
    
    # Add a second axis and specify its features
    sec.axis = sec_axis( trans=~.*1.5, name="Wheat Price") 
  ) +
  labs(title = "Evolution of wages and wheat price for English workers (16th to 19th century)") +
  theme(plot.title = element_text(hjust = 0.5), 
        axis.title.y.left = element_text(colour = wages_color),
        axis.title.y.right = element_text(colour = wheat_color))
```
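
For data sets with fewer than 1000 observations, `stat_smooth()` fits a loess curve by default. As a rough sketch of what that smoothing computes, the example below applies base R's `loess()` with its default settings to a synthetic series. The data frame `toy` and its values are invented for illustration and are not the wheat data.

```r
# Synthetic series: a slow upward trend plus an oscillation (illustrative only)
toy <- data.frame(Year = seq(1565, 1660, by = 5))
toy$Wheat <- 40 + 0.1 * (toy$Year - 1565) + 5 * sin(seq_along(toy$Year))

# Local regression of Wheat on Year with loess() defaults
fit <- loess(Wheat ~ Year, data = toy)

# Smoothed value at each observed year
toy$Smoothed <- predict(fit)

head(toy, 3)
```

The `Smoothed` column plays the role of the curve drawn by `stat_smooth()`: it follows the trend while damping the year-to-year fluctuations.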