From 78b7488edb025532788f18b713a409f3129e5a1b Mon Sep 17 00:00:00 2001
From: Dorinel Bastide <dorinel.bastide@gmail.com>
Date: Wed, 15 Jul 2020 22:21:01 +0200
Subject: [PATCH] module3/exo1: start corrections, read data from a local CSV

---
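Note for reviewers: the download block in this patch replaces the HTTP
fetch with a local file read. The steps described in the notebook (read the
raw bytes, decode, drop the first comment line, split into columns) could be
sketched as below. This is only an illustration under stated assumptions:
the helper name `load_table` is hypothetical and does not exist in the
notebook, the file is assumed to be latin-1 encoded, and its first line is
assumed to be a comment, as the notebook text says.

```python
import os


def load_table(path, encoding="latin-1"):
    """Read a local CSV copy and return its rows as lists of strings.

    Hypothetical helper, not part of the notebook. The first line is
    assumed to be a comment and is skipped.
    """
    # Expand "~" and drop stray whitespace that may surround the path.
    full_path = os.path.expanduser(path.strip())
    # Read raw bytes so the same decode step works as for a download.
    with open(full_path, "rb") as csv_file:
        data = csv_file.read()
    lines = data.decode(encoding).strip().split("\n")
    # Skip the leading comment line, then split each line into columns.
    return [line.split(",") for line in lines[1:]]
```

Reading in binary mode keeps `data` as `bytes`, so the existing
`data.decode('latin-1')` line in the notebook would not need to change.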
 module3/exo1/influenza-like-illness-analysis.org | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/module3/exo1/influenza-like-illness-analysis.org b/module3/exo1/influenza-like-illness-analysis.org
index 6c8b47a..0102ba8 100644
--- a/module3/exo1/influenza-like-illness-analysis.org
+++ b/module3/exo1/influenza-like-illness-analysis.org
@@ -45,6 +45,9 @@ The data on the incidence of influenza-like illness are available from the Web s
 #+NAME: data-url
 http://www.sentiweb.fr/datasets/incidence-PAY-3.csv
 
+#+NAME: data-csv
+~/org/incidence-PAY-3.csv
+
 This is the documentation of the data from [[https://ns.sentiweb.fr/incidence/csv-schema-v1.json][the download site]]:
 
 | Column name  | Description                                                                                                               |
@@ -65,10 +68,12 @@ The [[https://en.wikipedia.org/wiki/ISO_8601][ISO-8601]] format is popular in Eu
 ** Download
 After downloading the raw data, we extract the part we are interested in. We first split the file into lines, of which we discard the first one that contains a comment. We then split the remaining lines into columns.
 
-#+BEGIN_SRC python :results silent :var data_url=data-url
+#+BEGIN_SRC python :results silent :var data_csv=data-csv
 from urllib.request import urlopen
-
-data = urlopen(data_url).read()
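+import os
+# data = urlopen(data_url).read()  # download disabled; read the local copy instead
+with open(os.path.expanduser(data_csv.strip()), 'rb') as csv_file:
+    data = csv_file.read()  # raw bytes, decoded on the next line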
 lines = data.decode('latin-1').strip().split('\n')
 data_lines = lines[1:]
 table = [line.split(',') for line in data_lines]
@@ -79,6 +84,8 @@ Let's have a look at what we have so far:
 table[:5]
 #+END_SRC
 
+#+RESULTS:
+
 ** Checking for missing data
 Unfortunately there are many ways to indicate the absence of a data value in a dataset. Here we check for a common one: empty fields. For completeness, we should also look for non-numerical data in numerical columns. We don't do this here, but checks in later processing steps would catch such anomalies.
 
-- 
2.18.1