Commit 55d4ba2e authored by Julie Gullstrand

exercise 2 finished

parent 1c090dba
@@ -23,7 +23,7 @@ knitr::opts_chunk$set(echo = TRUE)
 The data on the incidence of influenza-like illness are available from the Web site of the [Réseau Sentinelles](http://www.sentiweb.fr/). We download them as a file in CSV format, in which each line corresponds to a week in the observation period. Only the complete dataset, starting in 1984 and ending with a recent week, is available for download. The URL is:
 ```{r}
-data_url = "http://www.sentiweb.fr/datasets/incidence-PAY-3.csv"
+data_url = "http://www.sentiweb.fr/datasets/incidence-PAY-7.csv"
 ```
 This is the documentation of the data from [the download site](https://ns.sentiweb.fr/incidence/csv-schema-v1.json):
@@ -121,21 +121,21 @@ with(tail(data, 200), plot(date, inc, type="l", xlab="Date", ylab="Weekly incide
 ### Computation
-Since the peaks of the epidemic happen in winter, near the transition between calendar years, we define the reference period for the annual incidence from August 1st of year $N$ to August 1st of year $N+1$. We label this period as year $N+1$ because the peak is always located in year $N+1$. The very low incidence in summer ensures that the arbitrariness of the choice of reference period has no impact on our conclusions.
+Since the peaks of the epidemic happen in winter, near the transition between calendar years, we define the reference period for the annual incidence from september 1st of year $N$ to September 1st of year $N+1$. We label this period as year $N+1$ because the peak is always located in year $N+1$. The very low incidence in summer ensures that the arbitrariness of the choice of reference period has no impact on our conclusions.
 The argument `na.rm=True` in the sum indicates that missing data points are removed. This is a reasonable choice since there is only one missing point, whose impact cannot be very strong.
 ```{r}
 yearly_peak = function(year) {
-debut = paste0(year-1,"-08-01")
-fin = paste0(year,"-08-01")
+debut = paste0(year-1,"-09-01")
+fin = paste0(year,"-09-01")
 semaines = data$date > debut & data$date <= fin
 sum(data$inc[semaines], na.rm=TRUE)
 }
 ```
-We must also be careful with the first and last years of the dataset. The data start in October 1984, meaning that we don't have all the data for the peak attributed to the year 1985. We therefore exclude it from the analysis. For the same reason, we define 2018 as the final year. We can increase this value to 2019 only when all data up to July 2019 is available.
+We must also be careful with the first and last years of the dataset. The data start in October 1984, meaning that we don't have all the data for the peak attributed to the year 1985. We therefore exclude it from the analysis. For the same reason, we define 2019 as the final year. We can increase this value to 2019 only when all data up to July 2019 is available.
 ```{r}
-years = 1986:2018
+years = 1991:2019
 ```
 We make a new data frame for the annual incidence, applying the function `yearly_peak` to each year:
......
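As an illustration of the windowed sum changed in this commit, here is a minimal sketch on synthetic data. It is not the code elided from the diff: the weekly dates, the constant incidence of 10, the year range `2001:2003`, and the name `annual_inc` are all assumptions made for the example. The sketch also converts the window bounds with `as.Date` explicitly, whereas the committed function compares dates against character strings.

```r
# Hypothetical weekly data: constant incidence of 10, with one missing week.
data <- data.frame(
  date = seq(as.Date("2000-01-03"), as.Date("2003-12-29"), by = "week")
)
data$inc <- 10
# Blank out the first week after 2000-12-01, which lies inside the 2001 window.
data$inc[which(data$date > as.Date("2000-12-01"))[1]] <- NA

# Sum the weekly incidence from September 1st of year N-1 (exclusive)
# to September 1st of year N (inclusive), skipping missing values.
yearly_peak <- function(year) {
  debut <- as.Date(paste0(year - 1, "-09-01"))
  fin <- as.Date(paste0(year, "-09-01"))
  semaines <- data$date > debut & data$date <= fin
  sum(data$inc[semaines], na.rm = TRUE)
}

# Only years whose full September-to-September window is covered by the data.
years <- 2001:2003
annual_inc <- data.frame(year = years, incidence = sapply(years, yearly_peak))
print(annual_inc)
```

With `na.rm = TRUE`, the 2001 total is simply short by one week's value instead of becoming `NA`, which is the behavior the text relies on when it treats the single missing point as negligible.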