Update exercice_en.Rmd

ad9b3784 · Jhouben · d4425395 · ad9b3784
Commit ad9b3784 authored Jun 07, 2021 by Jhouben

Showing with 18 additions and 11 deletions

exercice_en.Rmd module3/exo2/exercice_en.Rmd +18 -11

--- a/module3/exo2/exercice_en.Rmd
+++ b/module3/exo2/exercice_en.Rmd
 ---
-title: "Incidence of influenza-like illness in France"
+title: "Incidence of Chickenpox in France"
 author: "Jhouben Cuesta Ramirez"
 date: "07/06/2021"
 output:
-  pdf_document:
-    toc: true
  html_document:
    toc: true
    theme: journal
+  pdf_document:
+    toc: true
 documentclass: article
 classoption: a4paper
 header-includes:
@@ -22,15 +22,16 @@ knitr::opts_chunk$set(echo = TRUE)
 ## Data preprocessing
-The data on the incidence of influenza-like illness are available from the Web site of the [Réseau Sentinelles](http://www.sentiweb.fr/). We download them as a file in CSV format, in which each line corresponds to a week in the observation period. Only the complete dataset, starting in 1984 and ending with a recent week, is available for download. The URL is:
+The data on the incidence of chickenpox are available from the Web site of the [Réseau Sentinelles](http://www.sentiweb.fr/). We download them as a file in CSV format, in which each line corresponds to a week in the observation period. Only the complete dataset, starting in 1991 and ending with a recent week, is available for download. The URL is:
 ```{r}
-data_url = "http://www.sentiweb.fr/datasets/incidence-PAY-3.csv"
+data_url = "https://www.sentiweb.fr/datasets/incidence-PAY-7.csv"
 ```
+To preserve the reproducibility of this report, we keep a local copy of the original data, downloaded on 07/06/2021, without adding or deleting any information.
 ```{r}
 # The idea is to have a local backup of the file, in case the website is down or ceases to exist.
-data_csv = "grippal.csv"	
+data_csv = "chickenpox.csv"	
 if (!file.exists(data_csv)) {	
    download.file(data_url, data_csv, method="auto")	
 }	
@@ -131,15 +132,15 @@ with(tail(data, 200), plot(date, inc, type="l", xlab="Date", ylab="Weekly incide
 ### Computation
-Since the peaks of the epidemic happen in winter, near the transition between calendar years, we define the reference period for the annual incidence from August 1st of year $N$ to August 1st of year $N+1$. We label this period as year $N+1$ because the peak is always located in year $N+1$. The very low incidence in summer ensures that the arbitrariness of the choice of reference period has no impact on our conclusions.
+As requested in the exercise, we define the reference period for the annual incidence from September 1st of year $N$ to September 1st of year $N+1$.
-The argument `na.rm=True` in the sum indicates that missing data points are removed. This is a reasonable choice since there is only one missing point, whose impact cannot be very strong.
+Since this dataset has no missing data points, the argument `na.rm=TRUE` previously passed to the sum has been removed.
 ```{r}
 yearly_peak = function(year) {
-      debut = paste0(year-1,"-08-01")
+      debut = paste0(year-1,"-09-01")
-      fin = paste0(year,"-08-01")
+      fin = paste0(year,"-09-01")
      semaines = data$date > debut & data$date <= fin
-      sum(data$inc[semaines], na.rm=TRUE)
+      sum(data$inc[semaines])
      }
 ```
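Reviewer note: the September-to-September window used by `yearly_peak` can be checked on synthetic data. The sketch below is only an illustration under stated assumptions: the data frame `toy`, its constant incidence values, and the helper name `yearly_sum` are hypothetical and are not part of the commit. It converts the boundaries explicitly with `as.Date` and uses the same half-open interval (`>` start, `<=` end), so each week is counted in exactly one reference year.

```r
# Synthetic weekly series (hypothetical values, NOT Sentinelles data):
# one observation every Monday from 2019-09-02 to 2021-08-30.
weeks <- seq(as.Date("2019-09-02"), as.Date("2021-08-30"), by = "week")
toy <- data.frame(date = weeks, inc = rep(100, length(weeks)))

# Hypothetical helper mirroring yearly_peak: sums the incidence over
# the window (September 1st of year-1, September 1st of year].
yearly_sum <- function(df, year) {
  start <- as.Date(paste0(year - 1, "-09-01"))
  end   <- as.Date(paste0(year, "-09-01"))
  in_window <- df$date > start & df$date <= end
  sum(df$inc[in_window])
}

total_2020 <- yearly_sum(toy, 2020)
total_2021 <- yearly_sum(toy, 2021)

# The two windows partition the series: no week is dropped or double-counted.
stopifnot(total_2020 + total_2021 == sum(toy$inc))
```

Because the interval is open at the start and closed at the end, a week dated exactly September 1st would be assigned to the period that ends on that date rather than the one that begins on it.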
@@ -169,6 +170,12 @@ A list sorted by decreasing annual incidence makes it easy to find the most impo
 head(annnual_inc[order(-annnual_inc$incidence),])
 ```
+### Identification of the weakest epidemics
+A list sorted by increasing annual incidence makes it easy to find the least important ones:
+```{r}
+head(annnual_inc[order(annnual_inc$incidence),])
+```
 Finally, a histogram clearly shows the few very strong epidemics, which affect about 10% of the French population, but are rare: there were three of them in the course of 35 years. The typical epidemic affects only half as many people.
 ```{r}
 hist(annnual_inc$incidence, breaks=10, xlab="Annual incidence", ylab="Number of observations", main="")