Commit ad9b3784 authored by Jhouben

Update exercice_en.Rmd

parent d4425395
--- ---
title: "Incidence of influenza-like illness in France" title: "Incidence of Chickenpox in France"
author: "Jhouben Cuesta Ramirez" author: "Jhouben Cuesta Ramirez"
date: "07/06/2021" date: "07/06/2021"
output: output:
pdf_document:
toc: true
html_document: html_document:
toc: true toc: true
theme: journal theme: journal
pdf_document:
toc: true
documentclass: article documentclass: article
classoption: a4paper classoption: a4paper
header-includes: header-includes:
...@@ -22,15 +22,16 @@ knitr::opts_chunk$set(echo = TRUE) ...@@ -22,15 +22,16 @@ knitr::opts_chunk$set(echo = TRUE)
## Data preprocessing ## Data preprocessing
The data on the incidence of influenza-like illness are available from the Web site of the [Réseau Sentinelles](http://www.sentiweb.fr/). We download them as a file in CSV format, in which each line corresponds to a week in the observation period. Only the complete dataset, starting in 1984 and ending with a recent week, is available for download. The URL is: The data on the incidence of chickenpox are available from the Web site of the [Réseau Sentinelles](http://www.sentiweb.fr/). We download them as a file in CSV format, in which each line corresponds to a week in the observation period. Only the complete dataset, starting in 1991 and ending with a recent week, is available for download. The URL is:
```{r} ```{r}
data_url = "http://www.sentiweb.fr/datasets/incidence-PAY-3.csv" data_url = "https://www.sentiweb.fr/datasets/incidence-PAY-7.csv"
``` ```
To preserve the reproducibility of this report, we made a local copy of the original data on 07/06/2021, without adding or deleting any information.
```{r} ```{r}
# The idea is to keep a local backup of the file in case the website is down or ceases to exist. # The idea is to keep a local backup of the file in case the website is down or ceases to exist.
data_csv = "grippal.csv" data_csv = "chickenpox.csv"
if (!file.exists(data_csv)) { if (!file.exists(data_csv)) {
download.file(data_url, data_csv, method="auto") download.file(data_url, data_csv, method="auto")
} }
...@@ -131,15 +132,15 @@ with(tail(data, 200), plot(date, inc, type="l", xlab="Date", ylab="Weekly incide ...@@ -131,15 +132,15 @@ with(tail(data, 200), plot(date, inc, type="l", xlab="Date", ylab="Weekly incide
### Computation ### Computation
Since the peaks of the epidemic happen in winter, near the transition between calendar years, we define the reference period for the annual incidence from August 1st of year $N$ to August 1st of year $N+1$. We label this period as year $N+1$ because the peak is always located in year $N+1$. The very low incidence in summer ensures that the arbitrariness of the choice of reference period has no impact on our conclusions. As requested in the exercise, we define the reference period for the annual incidence from September 1st of year $N$ to September 1st of year $N+1$.
The argument `na.rm=TRUE` in the sum indicates that missing data points are removed. This is a reasonable choice since there is only one missing point, whose impact cannot be very strong. Since the data contain no missing points, the argument `na.rm=TRUE` was removed from the sum.
```{r} ```{r}
yearly_peak = function(year) { yearly_peak = function(year) {
debut = paste0(year-1,"-08-01") debut = paste0(year-1,"-09-01")
fin = paste0(year,"-08-01") fin = paste0(year,"-09-01")
semaines = data$date > debut & data$date <= fin semaines = data$date > debut & data$date <= fin
sum(data$inc[semaines], na.rm=TRUE) sum(data$inc[semaines])
} }
``` ```
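As a rough illustration of how the September-to-September window behaves, the sketch below applies the same rule to synthetic weekly data. The names `toy`, `toy_na`, and `yearly_sum` are hypothetical and not part of the report, and the values are invented. The sketch also shows what dropping `na.rm=TRUE` implies: a single missing week makes the whole annual sum `NA`.

```r
# Synthetic weekly data (hypothetical values, NOT the Sentinelles dataset):
# 104 consecutive weeks starting on 2000-09-04, each with incidence 100.
toy = data.frame(
  date = seq(as.Date("2000-09-04"), by = "week", length.out = 104),
  inc  = rep(100, 104)
)

# Same windowing rule as yearly_peak, but the data frame is passed explicitly
# and the bounds are converted to Date before comparing (an assumption made
# here for clarity; the report compares against character strings).
yearly_sum = function(df, year) {
  debut = as.Date(paste0(year - 1, "-09-01"))
  fin   = as.Date(paste0(year, "-09-01"))
  # Strictly after the start, up to and including the end of the window.
  semaines = df$date > debut & df$date <= fin
  sum(df$inc[semaines])
}

# Each reference period covers 52 of the synthetic weeks.
total_2001 = yearly_sum(toy, 2001)
total_2002 = yearly_sum(toy, 2002)

# Without na.rm=TRUE, one missing week propagates to the annual total.
toy_na = toy
toy_na$inc[10] = NA
total_with_gap = yearly_sum(toy_na, 2001)
```

Here `total_2001` and `total_2002` both sum 52 weeks, while `total_with_gap` is `NA`. Omitting `na.rm=TRUE` is therefore only safe as long as the downloaded file really contains no missing incidence values.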
...@@ -169,6 +170,12 @@ A list sorted by decreasing annual incidence makes it easy to find the most impo ...@@ -169,6 +170,12 @@ A list sorted by decreasing annual incidence makes it easy to find the most impo
head(annnual_inc[order(-annnual_inc$incidence),]) head(annnual_inc[order(-annnual_inc$incidence),])
``` ```
### Identification of the weakest epidemics
A list sorted by increasing annual incidence makes it easy to find the least important ones:
```{r}
head(annnual_inc[order(annnual_inc$incidence),])
```
Finally, a histogram clearly shows the few very strong epidemics, which affect about 10% of the French population, but are rare: there were three of them in the course of 35 years. The typical epidemic affects only half as many people. Finally, a histogram clearly shows the few very strong epidemics, which affect about 10% of the French population, but are rare: there were three of them in the course of 35 years. The typical epidemic affects only half as many people.
```{r} ```{r}
hist(annnual_inc$incidence, breaks=10, xlab="Annual incidence", ylab="Number of observations", main="") hist(annnual_inc$incidence, breaks=10, xlab="Annual incidence", ylab="Number of observations", main="")
......