Replace exercice_en.Rmd

parent 90697597
......@@ -21,7 +21,7 @@ The dataset at our disposal is a CSV file available at:
<https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/HistData/Wheat.csv>.
The following is the chart as made by William Playfair, showing at one view the price of both the quarter of wheat and wages of labour by the Week, from 1565 to 1821.
The following is the chart as made by William Playfair, showing at one view the price of both the quarter of wheat (a grain measure of 8 bushels) and wages of labor by the Week, from 1565 to 1821.
![Evolution of the wheat price and average salaries from 1565 to 1821 *(source: [Wikimedia](https://commons.wikimedia.org/wiki/File:Chart_Showing_at_One_View_the_Price_of_the_Quarter_of_Wheat,_and_Wages_of_Labour_by_the_Week,_from_1565_to_1821.png))*](img/playfair-chart.png)
......@@ -38,6 +38,7 @@ library(Hmisc)
```{r, results=FALSE, message=FALSE}
library(tidyverse)
library(ggplot2)
library(dplyr)
```
2. Building the data frame:
......@@ -70,18 +71,17 @@ my_colors <- list( blue = "#3399e6",
ggplot(df, aes(x=Year)) +
# Plot Wheat price as bars (divided by 1.5 to share the Wages axis):
geom_col( aes(y=Wheat/1.5), width = 4.15, alpha=1,
color = my_colors[["dark"]],
fill = my_colors[["dark"]] ) +
color = my_colors[["dark"]], fill = my_colors[["dark"]] ) +
# Plot Wages as a blue filled area with a red delimiter:
geom_area( aes(y=Wages), size = 1, alpha=0.7,
color = my_colors[["red"]],
fill=my_colors[["blue"]] ) +
color = my_colors[["red"]], fill=my_colors[["blue"]] ) +
# Customize the Y scales:
scale_y_continuous( name = "Wages (in Shillings per week)",
sec.axis = sec_axis( trans=~.*1.5, name="Wheat Price (in Shillings per quarter)" )
) +
labs(title = "Evolution of wages and wheat price for English workers (16th to 19th century)") +
theme_light() +
theme(plot.title = element_text(hjust = 0.5),
axis.title.y.left = element_text(colour = my_colors[["red"]]),
axis.title.y.right = element_text(colour = my_colors[["dark"]]))
......@@ -93,7 +93,9 @@ ggplot(df, aes(x=Year)) +
## Alternative representation
First, we represent the data as simple dots for both wages and wheat price.
As it is difficult to see the pattern for the wheat values evolution, we use the `stat_smooth()` function that shows a smoothed mean (with a confidence level of 70%).
Because the overall trend in wheat prices is hard to see from the dots alone, we use the `stat_smooth()` function, which draws a smoothed mean (with a confidence level of 70%).
We try to make the two curves occupy as much space as possible without intersecting, so that the plot is easier to read. For that purpose, we scale the Wheat price by 2/3 (that is, we divide it by 1.5).
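The effect of this rescaling can be checked with a few lines of base R. The sketch below is only illustrative: the sample prices and the names `scale_factor`, `wheat_plotted` and `wheat_on_axis` are assumptions, not part of the exercise. It shows that dividing by 1.5 is the same as multiplying by 2/3, and that a secondary axis multiplying by 1.5 reads back the original prices.

```r
# Illustrative wheat prices in shillings per quarter (made-up sample values)
wheat <- c(41, 45, 42, 49)

# Assumed scale factor, matching the Wheat/1.5 used for the plot
scale_factor <- 1.5

# Values as drawn on the primary (Wages) axis
wheat_plotted <- wheat / scale_factor

# Dividing by 1.5 is the same as multiplying by 2/3
same_scaling <- isTRUE(all.equal(wheat_plotted, wheat * 2 / 3))

# A secondary axis that multiplies by 1.5 recovers the original prices
wheat_on_axis <- wheat_plotted * scale_factor
round_trip_ok <- isTRUE(all.equal(wheat_on_axis, wheat))

print(c(same_scaling = same_scaling, round_trip_ok = round_trip_ok))
```

A larger factor would flatten the wheat curve further, while a smaller one would make it cross the wages curve, so the factor is a readability choice rather than a property of the data.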
```{r, message=FALSE}
# Set color parameters:
wages_color <- "#ff5733"
......@@ -109,43 +111,58 @@ ggplot(df, aes(x=Year)) +
geom_line( aes(y=Wheat/1.5), size = 0.3, color = wheat_color, linetype="dashed") +
stat_smooth(aes(y=Wheat/1.5), level = 0.7, size=0.6, color=wheat_color_trans) +
# Customize the Y scales:
scale_y_continuous( name = "Wages (in Shillings per week)",
sec.axis = sec_axis( trans=~.*1.5, name="Wheat (in Shillings per quarter)")
) +
labs(title = "Evolution of wages and wheat price for English workers (16th to 19th century)") +
theme_light() +
theme(plot.title = element_text(hjust = 0.5),
axis.title.y.left = element_text(colour = wages_color),
axis.title.y.right = element_text(colour = wheat_color))
```
## Another representation without an explicit time axis
When simply plotting `Wheat = f(Wages)`, most samples are clustered around the lower values of Wages, which makes the graph very hard to read. As a solution, we propose the following representation, in which we define two domains of wage values, each with its own x-axis scale.
Moreover, the passage of time (Year) is now represented by a blue color gradient.
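One possible way to build the two wage domains is base R's `cut()`. This is a sketch under assumptions: the sample wages and the variable names are hypothetical, and the bounds and labels simply follow the `[5,8[` and `[8,30]` notation.

```r
# Hypothetical weekly wages in shillings (not the real dataset)
wages <- c(5, 6.5, 7.9, 8, 15, 30)

# right = FALSE makes intervals closed on the left: [5,8) and [8,30);
# include.lowest = TRUE then closes the last interval on the right: [8,30]
wages_interval <- cut(
  wages,
  breaks = c(5, 8, 30),
  labels = c("Wages in [5,8[", "Wages in [8,30]"),
  right = FALSE,
  include.lowest = TRUE
)

# Count how many samples fall in each domain
print(table(wages_interval))
```

`cut()` returns a factor whose levels are already in break order, so no separate `factor()` call is needed before faceting on it.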
```{r, message=FALSE}
df_sub <- df[,]
ggplot(df_sub, aes(x=Wages, y=Wheat)) +
geom_area( size = 0.5, alpha=0.5,
linetype="dashed",
color = "#ff3333",
fill = "#3399e6" ) +
geom_point( size = .7, shape = 4 ) +
#xlim(min((na.omit(df_sub))$Wages), max((na.omit(df_sub))$Wages)) +
labs(title = "Evolution of wheat price with respect to salary of English workers (16th to 19th century)") +
theme(plot.title = element_text(hjust = 0.5))
# Omit rows with NA values
data <- na.omit(df)
# Split data into two intervals, defined by a new variable
data$wages_interval[data$Wages < 8] <- "Wages in [5,8["
data$wages_interval[data$Wages >= 8] <- "Wages in [8,30]"
data$wages_interval = factor(data$wages_interval, levels=c('Wages in [5,8[','Wages in [8,30]'))
# Plot
ggplot(data, aes(x=Wages, y=Wheat)) +
geom_segment( aes(x=Wages, xend=Wages, y=0, yend=Wheat, colour=Year), size=.8, alpha=1) +
facet_grid(~wages_interval, scales='free') +
theme_light() +
theme(
legend.position = "right",
legend.key.size = unit(0.35, 'cm'),
panel.border = element_blank(),
) +
xlab("Wages (in Shillings per week)") +
ylab("Wheat Price (in Shillings per quarter)") +
labs(title = "Evolution of wheat price with respect to salary of English workers")
```
## From another angle
## Making the message behind Playfair's chart stand out better!
An interesting quantity to look at is the ratio `Wages/Wheat`, which represents the amount of wheat (in quarters) a worker can buy with his weekly salary.
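As a worked example of this ratio, the following sketch uses made-up values (the numbers and variable names are hypothetical, not taken from the dataset). Since wages are in shillings per week and wheat is in shillings per quarter, the ratio is expressed in quarters per week, and its inverse is the number of weeks of work needed to buy one quarter.

```r
# Hypothetical values: weekly wages and wheat price, both in shillings
wages <- c(5, 10, 30)    # shillings per week
wheat <- c(40, 50, 60)   # shillings per quarter

# Quarters of wheat that one week of wages can buy
quarters_per_week <- wages / wheat

# Equivalent reading: weeks of work needed to buy one quarter
weeks_per_quarter <- wheat / wages

print(data.frame(wages, wheat, quarters_per_week, weeks_per_quarter))
```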
```{r, message=FALSE}
df_sub2 <- df[,]
ggplot(df_sub2, aes(x=Year, y=Wages/Wheat)) +
geom_line( size = 0.5, alpha=0.5,
linetype="dashed" ) +
ggplot(df, aes(x=Year, y=Wages/Wheat)) +
geom_line( size = 0.5, alpha=0.5, linetype="dashed" ) +
geom_point( size = 1 ) +
labs(title = "How many wheat quaters a worker can buy with his weekly salary ?") +
theme(plot.title = element_text(hjust = 0.5))
labs(title = "How many quarters of wheat can a worker buy with his weekly salary?") +
theme_light() + theme(plot.title = element_text(hjust = 0.5))
```
......