diff --git a/module3/exo1/influenza-like-illness-analysis.Rmd b/module3/exo1/influenza-like-illness-analysis.Rmd
index 8047fa03d3d3cc003416749dd44be6adbb072d04..cd788faea660bb389b1edaac07ec432e44eb15b1 100644
--- a/module3/exo1/influenza-like-illness-analysis.Rmd
+++ b/module3/exo1/influenza-like-illness-analysis.Rmd
@@ -26,6 +26,15 @@ The data on the incidence of influenza-like illness are available from the Web s
 data_url = "http://www.sentiweb.fr/datasets/incidence-PAY-3.csv"
 ```
 
+In order to protect us in case the Réseau Sentinelles Web server disappears or is modified, we make a local copy of this dataset that we store together with our analysis. It is unnecessary and even risky to download the data at each execution, because in case of a malfunction we might be replacing our file by a corrupted version. Therefore we download the data only if no local copy exists.
+
+```{r}
+data_file = "syndrome-grippal.csv"
+if (!file.exists(data_file)) {
+  download.file(data_url, data_file, method="auto")
+}
+```
+
 This is the documentation of the data from [the download site](https://ns.sentiweb.fr/incidence/csv-schema-v1.json):
 
 | Column name | Description |
@@ -41,11 +50,11 @@ This is the documentation of the data from [the download site](https://ns.sentiw
 | `geo_insee` | Identifier of the geographic area, from INSEE https://www.insee.fr |
 | `geo_name` | Geographic label of the area, corresponding to INSEE code. This label is not an id and is only provided for human reading |
 
-### Download
+### Reading the data
 
 The first line of the CSV file is a comment, which we ignore with `skip=1`.
 ```{r}
-data = read.csv(data_url, skip=1)
+data = read.csv(data_file, skip=1)
 ```
 
 Let's have a look at what we got: