This is the documentation of the data from [[https://ns.sentiweb.fr/incidence/csv-schema-v1.json][the download site]]:
| Column name | Description |
...
...
@@ -65,10 +68,12 @@ The [[https://en.wikipedia.org/wiki/ISO_8601][ISO-8601]] format is popular in Eu
** Download
After downloading the raw data, we extract the part we are interested in. We first split the file into lines, of which we discard the first one that contains a comment. We then split the remaining lines into columns.
@@ -79,6 +84,8 @@ Let's have a look at what we have so far:
table[:5]
#+END_SRC
#+RESULTS:
** Checking for missing data
Unfortunately there are many ways to indicate the absence of a data value in a dataset. Here we check for a common one: empty fields. For completeness, we should also look for non-numerical data in numerical columns. We don't do this here, but checks in later processing steps would catch such anomalies.