{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Incidence of chickenpox" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "import pandas as pd\n", "import isoweek" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The chickenpox incidence data are available from the website of the [Réseau Sentinelles](http://www.sentiweb.fr/). We retrieve them as a CSV file in which each row corresponds to one week of the requested period. We always download the complete dataset, which for chickenpox starts in week 49 of 1990 and ends with a recent week." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "data_url = 'https://www.sentiweb.fr/datasets/incidence-PAY-7.csv'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here is the description of the columns, as given [on the source website](https://ns.sentiweb.fr/incidence/csv-schema-v1.json):\n", "\n", "```json\n", "{\n", "\t\"profile\": \"tabular-data-resource\",\n", "\t\"name\": \"sentiweb-incidence-{$id}\",\n", "\t\"path\": \"http://www.sentiweb.fr/datasets/{$file}\",\n", "\t\"title\": \"Sentiweb Incidence Data file\",\n", "\t\"description\": \"\",\n", "\t\"format\": \"csv\",\n", "\t\"mediatype\": \"text/csv\",\n", "\t\"encoding\": \"iso-8859-1\",\n", "\t\"schema\": {\n", "\t\t\"fields\": [\n", "\n", "\t\t\t{\n", "\t\t\t\t\"name\": \"week\",\n", "\t\t\t\t\"type\": \"integer\",\n", "\t\t\t\t\"description\": \"ISO8601 Yearweek number as numeric (year*100 + week number)\"\n", "\t\t\t},\n", "\t\t\t{\n", "\t\t\t\t\"name\": \"geo_insee\",\n", "\t\t\t\t\"type\": \"string\",\n", "\t\t\t\t\"title\": \"Geographic area\",\n", "\t\t\t\t\"description\": \"Identifier of the geographic area, from INSEE https://www.insee.fr\"\n", "\t\t\t},\n", "\t\t\t{\n", "\t\t\t\t\"name\": \"geo_name\",\n", "\t\t\t\t\"type\": 
\"string\",\n", "\t\t\t\t\"title\": \"Geographic area label\",\n", "\t\t\t\t\"description\": \"Geographic label of the area, corresponding to INSEE code. This label is not an id and is only provided for human reading\"\n", "\t\t\t},\n", "\t\t\t{\n", "\t\t\t\t\"name\": \"indicator\",\n", "\t\t\t\t\"type\": \"integer\",\n", "\t\t\t\t\"title\": \"Indicator id\",\n", "\t\t\t\t\"description\": \"Unique identifier of the indicator, see metadata document https://www.sentiweb.fr/meta.json\"\n", "\t\t\t},\n", "\t\t\t{\n", "\t\t\t\t\"name\": \"inc\",\n", "\t\t\t\t\"type\": \"integer\",\n", "\t\t\t\t\"title\": \"Estimated incidence\",\n", "\t\t\t\t\"description\": \"Estimated incidence value for the time step, in the geographic level\"\n", "\t\t\t},\n", "\t\t\t{\n", "\t\t\t\t\"name\": \"inc_low\",\n", "\t\t\t\t\"type\": \"integer\",\n", "\t\t\t\t\"title\": \"Lower bound of Estimated incidence 95% CI\",\n", "\t\t\t\t\"description\": \"Lower bound of the estimated incidence 95% Confidence Interval\"\n", "\t\t\t},\n", "\t\t\t{\n", "\t\t\t\t\"name\": \"inc_up\",\n", "\t\t\t\t\"type\": \"integer\",\n", "\t\t\t\t\"title\": \"Upper bound of Estimated incidence 95% CI\",\n", "\t\t\t\t\"description\": \"Upper bound of the estimated incidence 95% Confidence Interval\"\n", "\t\t\t},\n", "\t\t\t{\n", "\t\t\t\t\"name\": \"inc100\",\n", "\t\t\t\t\"type\": \"integer\",\n", "\t\t\t\t\"title\": \"Estimated rate incidence\",\n", "\t\t\t\t\"description\": \"Estimated rate incidence per 100,000 inhabitants\"\n", "\t\t\t},\n", "\t\t\t{\n", "\t\t\t\t\"name\": \"inc100_low\",\n", "\t\t\t\t\"type\": \"integer\",\n", "\t\t\t\t\"title\": \"Lower bound of estimated rate incidence 95% CI\",\n", "\t\t\t\t\"description\": \"Lower bound of the estimated incidence 95% Confidence Interval\"\n", "\t\t\t},\n", "\t\t\t{\n", "\t\t\t\t\"name\": \"inc100_up\",\n", "\t\t\t\t\"type\": \"integer\",\n", "\t\t\t\t\"title\": \"Upper bound of rate incidence 95% CI\",\n", "\t\t\t\t\"description\": \"Upper bound of the 
estimated rate incidence 95% Confidence Interval\"\n", "\t\t\t}\n", "\n", "\t\t],\n", "\t\t\"primaryKey\": [\n", "\n", "\t\t\t\"week\",\n", "\t\t\t\"indicator\",\n", "\t\t\t\"geo_insee\"\n", "\n", "\t\t],\n", "\n", "\t\t\"missingValues\": [\"-\"]\n", "\t},\n", "\t\"dialect\": {\n", "\t\t\"csvddfVersion\": \"1.0\",\n", "\t\t\"delimiter\": \",\",\n", "\t\t\"doubleQuote\": false,\n", "\t\t\"lineTerminator\": \"\\r\\n\",\n", "\t\t\"quoteChar\": \"\\\"\",\n", "\t\t\"skipInitialSpace\": true,\n", "\t\t\"header\": true,\n", "\n", "\t\t\"commentChar\": \"#\"\n", "\t}\n", "}\n", "```\n", "\n", "The first line of the CSV file is a comment, which we skip by specifying `skiprows=1`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Save the source data locally so that we still have a copy even if the URL becomes unavailable:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "File 'varicelle_data.csv' does not exist, downloading...\n", "File 'varicelle_data.csv' downloaded on 2020-08-22.\n" ] } ], "source": [ "import os\n", "import urllib.request\n", "from datetime import date\n", "\n", "# Local copy of the dataset; downloaded only if it is missing\n", "filename = 'varicelle_data.csv'\n", "\n", "if os.path.isfile(filename):\n", "    print(\"File '{}' exists\".format(filename))\n", "else:\n", "    print(\"File '{}' does not exist, downloading...\".format(filename))\n", "    urllib.request.urlretrieve(data_url, filename)\n", "    # Record the download date for traceability\n", "    download_time = date.today()\n", "    print(\"File '{}' downloaded on {}.\".format(filename, download_time))\n" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead>\n",
"<tr><th></th><th>week</th><th>indicator</th><th>inc</th><th>inc_low</th><th>inc_up</th><th>inc100</th><th>inc100_low</th><th>inc100_up</th><th>geo_insee</th><th>geo_name</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>202033</td><td>7</td><td>888</td><td>0</td><td>1841</td><td>1</td><td>0</td><td>2</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1</th><td>202032</td><td>7</td><td>2559</td><td>624</td><td>4494</td><td>4</td><td>1</td><td>7</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>2</th><td>202031</td><td>7</td><td>1303</td><td>100</td><td>2506</td><td>2</td><td>0</td><td>4</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>3</th><td>202030</td><td>7</td><td>1385</td><td>75</td><td>2695</td><td>2</td><td>0</td><td>4</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>4</th><td>202029</td><td>7</td><td>841</td><td>10</td><td>1672</td><td>1</td><td>0</td><td>2</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>5</th><td>202028</td><td>7</td><td>728</td><td>0</td><td>1515</td><td>1</td><td>0</td><td>2</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>6</th><td>202027</td><td>7</td><td>986</td><td>149</td><td>1823</td><td>1</td><td>0</td><td>2</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>7</th><td>202026</td><td>7</td><td>694</td><td>0</td><td>1454</td><td>1</td><td>0</td><td>2</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>8</th><td>202025</td><td>7</td><td>228</td><td>0</td><td>597</td><td>0</td><td>0</td><td>1</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>9</th><td>202024</td><td>7</td><td>388</td><td>0</td><td>959</td><td>1</td><td>0</td><td>2</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>10</th><td>202023</td><td>7</td><td>558</td><td>1</td><td>1115</td><td>1</td><td>0</td><td>2</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>11</th><td>202022</td><td>7</td><td>277</td><td>0</td><td>633</td><td>0</td><td>0</td><td>1</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>12</th><td>202021</td><td>7</td><td>602</td><td>36</td><td>1168</td><td>1</td><td>0</td><td>2</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>13</th><td>202020</td><td>7</td><td>824</td><td>20</td><td>1628</td><td>1</td><td>0</td><td>2</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>14</th><td>202019</td><td>7</td><td>310</td><td>0</td><td>753</td><td>0</td><td>0</td><td>1</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>15</th><td>202018</td><td>7</td><td>849</td><td>98</td><td>1600</td><td>1</td><td>0</td><td>2</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>16</th><td>202017</td><td>7</td><td>272</td><td>0</td><td>658</td><td>0</td><td>0</td><td>1</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>17</th><td>202016</td><td>7</td><td>758</td><td>78</td><td>1438</td><td>1</td><td>0</td><td>2</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>18</th><td>202015</td><td>7</td><td>1918</td><td>675</td><td>3161</td><td>3</td><td>1</td><td>5</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>19</th><td>202014</td><td>7</td><td>3879</td><td>2227</td><td>5531</td><td>6</td><td>3</td><td>9</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>20</th><td>202013</td><td>7</td><td>7326</td><td>5236</td><td>9416</td><td>11</td><td>8</td><td>14</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>21</th><td>202012</td><td>7</td><td>8123</td><td>5790</td><td>10456</td><td>12</td><td>8</td><td>16</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>22</th><td>202011</td><td>7</td><td>10198</td><td>7568</td><td>12828</td><td>15</td><td>11</td><td>19</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>23</th><td>202010</td><td>7</td><td>9011</td><td>6691</td><td>11331</td><td>14</td><td>10</td><td>18</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>24</th><td>202009</td><td>7</td><td>13631</td><td>10544</td><td>16718</td><td>21</td><td>16</td><td>26</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>25</th><td>202008</td><td>7</td><td>10424</td><td>7708</td><td>13140</td><td>16</td><td>12</td><td>20</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>26</th><td>202007</td><td>7</td><td>8959</td><td>6574</td><td>11344</td><td>14</td><td>10</td><td>18</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>27</th><td>202006</td><td>7</td><td>9264</td><td>6925</td><td>11603</td><td>14</td><td>10</td><td>18</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>28</th><td>202005</td><td>7</td><td>8505</td><td>6314</td><td>10696</td><td>13</td><td>10</td><td>16</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>29</th><td>202004</td><td>7</td><td>7991</td><td>5831</td><td>10151</td><td>12</td><td>9</td><td>15</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><th>1520</th><td>199126</td><td>7</td><td>17608</td><td>11304</td><td>23912</td><td>31</td><td>20</td><td>42</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1521</th><td>199125</td><td>7</td><td>16169</td><td>10700</td><td>21638</td><td>28</td><td>18</td><td>38</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1522</th><td>199124</td><td>7</td><td>16171</td><td>10071</td><td>22271</td><td>28</td><td>17</td><td>39</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1523</th><td>199123</td><td>7</td><td>11947</td><td>7671</td><td>16223</td><td>21</td><td>13</td><td>29</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1524</th><td>199122</td><td>7</td><td>15452</td><td>9953</td><td>20951</td><td>27</td><td>17</td><td>37</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1525</th><td>199121</td><td>7</td><td>14903</td><td>8975</td><td>20831</td><td>26</td><td>16</td><td>36</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1526</th><td>199120</td><td>7</td><td>19053</td><td>12742</td><td>25364</td><td>34</td><td>23</td><td>45</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1527</th><td>199119</td><td>7</td><td>16739</td><td>11246</td><td>22232</td><td>29</td><td>19</td><td>39</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1528</th><td>199118</td><td>7</td><td>21385</td><td>13882</td><td>28888</td><td>38</td><td>25</td><td>51</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1529</th><td>199117</td><td>7</td><td>13462</td><td>8877</td><td>18047</td><td>24</td><td>16</td><td>32</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1530</th><td>199116</td><td>7</td><td>14857</td><td>10068</td><td>19646</td><td>26</td><td>18</td><td>34</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1531</th><td>199115</td><td>7</td><td>13975</td><td>9781</td><td>18169</td><td>25</td><td>18</td><td>32</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1532</th><td>199114</td><td>7</td><td>12265</td><td>7684</td><td>16846</td><td>22</td><td>14</td><td>30</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1533</th><td>199113</td><td>7</td><td>9567</td><td>6041</td><td>13093</td><td>17</td><td>11</td><td>23</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1534</th><td>199112</td><td>7</td><td>10864</td><td>7331</td><td>14397</td><td>19</td><td>13</td><td>25</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1535</th><td>199111</td><td>7</td><td>15574</td><td>11184</td><td>19964</td><td>27</td><td>19</td><td>35</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1536</th><td>199110</td><td>7</td><td>16643</td><td>11372</td><td>21914</td><td>29</td><td>20</td><td>38</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1537</th><td>199109</td><td>7</td><td>13741</td><td>8780</td><td>18702</td><td>24</td><td>15</td><td>33</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1538</th><td>199108</td><td>7</td><td>13289</td><td>8813</td><td>17765</td><td>23</td><td>15</td><td>31</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1539</th><td>199107</td><td>7</td><td>12337</td><td>8077</td><td>16597</td><td>22</td><td>15</td><td>29</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1540</th><td>199106</td><td>7</td><td>10877</td><td>7013</td><td>14741</td><td>19</td><td>12</td><td>26</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1541</th><td>199105</td><td>7</td><td>10442</td><td>6544</td><td>14340</td><td>18</td><td>11</td><td>25</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1542</th><td>199104</td><td>7</td><td>7913</td><td>4563</td><td>11263</td><td>14</td><td>8</td><td>20</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1543</th><td>199103</td><td>7</td><td>15387</td><td>10484</td><td>20290</td><td>27</td><td>18</td><td>36</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1544</th><td>199102</td><td>7</td><td>16277</td><td>11046</td><td>21508</td><td>29</td><td>20</td><td>38</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1545</th><td>199101</td><td>7</td><td>15565</td><td>10271</td><td>20859</td><td>27</td><td>18</td><td>36</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1546</th><td>199052</td><td>7</td><td>19375</td><td>13295</td><td>25455</td><td>34</td><td>23</td><td>45</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1547</th><td>199051</td><td>7</td><td>19080</td><td>13807</td><td>24353</td><td>34</td><td>25</td><td>43</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1548</th><td>199050</td><td>7</td><td>11079</td><td>6660</td><td>15498</td><td>20</td><td>12</td><td>28</td><td>FR</td><td>France</td></tr>\n",
"<tr><th>1549</th><td>199049</td><td>7</td><td>1143</td><td>0</td><td>2610</td><td>2</td><td>0</td><td>5</td><td>FR</td><td>France</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>1550 rows × 10 columns</p>\n", "