Unfortunately there are many ways to indicate the absence of a data value in a dataset. Here we check for a common one: empty fields. For completeness, we should also look for non-numerical data in numerical columns. We don't do this here, but checks in later processing steps would catch such anomalies.
We make a new dataset without the lines that contain empty fields. We print those lines to preserve a trace of their contents.
#+BEGIN_SRC python :results output :exports both
...
@@ -135,6 +136,7 @@ for week, inc in data:
#+END_SRC
No problem - fine!
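As an illustration of the empty-field check described in this section, here is a minimal sketch. It is not the code used in this analysis: the sample rows and the names ~split_rows~, ~is_integer~ and ~suspicious_rows~ are hypothetical. It also shows one possible form of the non-numerical check that we chose to skip.

#+BEGIN_SRC python :results output :exports both
# Hypothetical sample rows (week, incidence), all fields read as strings.
rows = [
    ["202001", "1500"],
    ["202002", ""],      # empty field: should be dropped and printed
    ["202003", "n/a"],   # not empty, but not a number either
    ["202004", "1720"],
]

def split_rows(rows):
    """Return (valid, rejected); a row is rejected if any field is empty."""
    valid, rejected = [], []
    for row in rows:
        if any(field.strip() == "" for field in row):
            rejected.append(row)
        else:
            valid.append(row)
    return valid, rejected

def is_integer(text):
    """True if int() accepts the text, False otherwise."""
    try:
        int(text)
    except ValueError:
        return False
    return True

valid_rows, rejected_rows = split_rows(rows)

# Print the dropped rows to preserve a trace of their contents.
for row in rejected_rows:
    print("dropped:", row)

# Optional extra check: non-numerical values in the incidence column.
suspicious_rows = [row for row in valid_rows if not is_integer(row[1])]
for row in suspicious_rows:
    print("non-numerical incidence:", row)
#+END_SRC

Note that the empty-field filter alone lets the ~"n/a"~ row through, which is why a later conversion to integers would still be needed to catch it.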
** Date conversion
To simplify the subsequent processing, we replace the ISO week numbers with the date of each week's Monday. This is also a good occasion to sort the lines by increasing date and to convert the incidences from strings to integers.
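The conversion can be sketched with the standard library alone. This is only an illustration under stated assumptions: weeks are assumed to be encoded as six-character strings of the form ~YYYYWW~, the sample values are made up, and the helper name ~monday_of_iso_week~ is hypothetical. It relies on ~datetime.date.fromisocalendar~, which requires Python 3.8 or later.

#+BEGIN_SRC python :results output :exports both
import datetime

# Hypothetical (week, incidence) pairs as strings, deliberately unsorted.
sample = [("202002", "1720"), ("201952", "980"), ("202001", "1500")]

def monday_of_iso_week(year_and_week):
    """Return the date of the Monday of an ISO week given as 'YYYYWW'."""
    year = int(year_and_week[:4])
    week = int(year_and_week[4:])
    # ISO weekday 1 is Monday.
    return datetime.date.fromisocalendar(year, week, 1)

# Convert both fields, then sort by increasing date.
converted = sorted(
    ((monday_of_iso_week(week), int(inc)) for week, inc in sample),
    key=lambda record: record[0],
)

for day, inc in converted:
    print(day, inc)
#+END_SRC

The Monday of ISO week 1 can fall in the preceding calendar year, as it does for 2020 in this sample, so sorting on the converted dates is safer than sorting on the original strings only if the week encoding is trusted to be consistent.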