{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Incidence of chickenpox" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "import pandas as pd\n", "import isoweek\n", "import pathlib" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The chickenpox incidence data are available from the website of the [Réseau Sentinelles](http://www.sentiweb.fr/). We retrieve them as a CSV file in which each row corresponds to one week of the requested period. We always download the complete dataset, which starts in 1990 and ends with a recent week." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "data_url = \"http://www.sentiweb.fr/datasets/incidence-PAY-7.csv\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here is the explanation of the columns given [on the original site](https://ns.sentiweb.fr/incidence/csv-schema-v1.json):\n", "\n", "| Column name | Column label |\n", "|-------------|--------------|\n", "| week | Calendar week (ISO 8601) |\n", "| indicator | Code of the surveillance indicator |\n", "| inc | Estimated incidence of consultations, in number of cases |\n", "| inc_low | Lower bound of the 95% CI of the estimated number of consultation cases |\n", "| inc_up | Upper bound of the 95% CI of the estimated number of consultation cases |\n", "| inc100 | Estimated incidence rate of consultation cases (cases per 100,000 inhabitants) |\n", "| inc100_low | Lower bound of the 95% CI of the estimated incidence rate of consultation cases (cases per 100,000 inhabitants) |\n", "| inc100_up | Upper bound of the 95% CI of the estimated incidence rate of consultation cases (cases per 100,000 inhabitants) |\n", "| geo_insee | Code of the geographic area concerned (INSEE code) http://www.insee.fr/fr/methodes/nomenclatures/cog/ |\n", "| geo_name | Label of the geographic area (this label may change without notice) |\n", "\n", "The first line of the CSV file is a comment, which we skip by passing `skiprows=1`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For durability, the downloaded raw data are kept in the repository so that they can be reused without downloading them every time. This means downloading them and adding them to the repository only if they are not already there." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "cached_file = \"cached_raw_data.csv\"\n", "if pathlib.Path(cached_file).is_file():\n", "    # Reuse the local copy kept in the repository\n", "    raw_data = pd.read_csv(cached_file)\n", "else:\n", "    # First run: download the data and cache it locally\n", "    raw_data = pd.read_csv(data_url, skiprows=1)\n", "    raw_data.to_csv(cached_file)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The dataset is complete: there are no missing points." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Empty DataFrame\n", "Columns: [Unnamed: 0, week, indicator, inc, inc_low, inc_up, inc100, inc100_low, inc100_up, geo_insee, geo_name]\n", "Index: []" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "raw_data[raw_data.isnull().any(axis=1)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since no row contains a missing value, nothing needs to be removed; we simply work on a copy of the raw data." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Unnamed: 0 week indicator inc inc_low inc_up inc100 \\\n", "0 0 202027 7 1483 221 2745 2 \n", "1 1 202026 7 707 0 1481 1 \n", "2 2 202025 7 228 0 597 0 \n", "3 3 202024 7 388 0 959 1 \n", "4 4 202023 7 558 1 1115 1 \n", "5 5 202022 7 277 0 633 0 \n", "6 6 202021 7 602 36 1168 1 \n", "7 7 202020 7 824 20 1628 1 \n", "8 8 202019 7 310 0 753 0 \n", "9 9 202018 7 849 98 1600 1 \n", "10 10 202017 7 272 0 658 0 \n", "11 11 202016 7 758 78 1438 1 \n", "12 12 202015 7 1918 675 3161 3 \n", "13 13 202014 7 3879 2227 5531 6 \n", "14 14 202013 7 7326 5236 9416 11 \n", "15 15 202012 7 8123 5790 10456 12 \n", "16 16 202011 7 10198 7568 12828 15 \n", "17 17 202010 7 9011 6691 11331 14 \n", "18 18 202009 7 13631 10544 16718 21 \n", "19 19 202008 7 10424 7708 13140 16 \n", "20 20 202007 7 8959 6574 11344 14 \n", "21 21 202006 7 9264 6925 11603 14 \n", "22 22 202005 7 8505 6314 10696 13 \n", "23 23 202004 7 7991 5831 10151 12 \n", "24 24 202003 7 5968 4100 7836 9 \n", "25 25 202002 7 6534 4530 8538 10 \n", "26 26 202001 7 9835 7019 12651 15 \n", "27 27 201952 7 7941 5246 10636 12 \n", "28 28 201951 7 5823 3675 7971 9 \n", "29 29 201950 7 6424 4276 8572 10 \n", "... ... ... ... ... ... ... ... 
\n", "1514 1514 199126 7 17608 11304 23912 31 \n", "1515 1515 199125 7 16169 10700 21638 28 \n", "1516 1516 199124 7 16171 10071 22271 28 \n", "1517 1517 199123 7 11947 7671 16223 21 \n", "1518 1518 199122 7 15452 9953 20951 27 \n", "1519 1519 199121 7 14903 8975 20831 26 \n", "1520 1520 199120 7 19053 12742 25364 34 \n", "1521 1521 199119 7 16739 11246 22232 29 \n", "1522 1522 199118 7 21385 13882 28888 38 \n", "1523 1523 199117 7 13462 8877 18047 24 \n", "1524 1524 199116 7 14857 10068 19646 26 \n", "1525 1525 199115 7 13975 9781 18169 25 \n", "1526 1526 199114 7 12265 7684 16846 22 \n", "1527 1527 199113 7 9567 6041 13093 17 \n", "1528 1528 199112 7 10864 7331 14397 19 \n", "1529 1529 199111 7 15574 11184 19964 27 \n", "1530 1530 199110 7 16643 11372 21914 29 \n", "1531 1531 199109 7 13741 8780 18702 24 \n", "1532 1532 199108 7 13289 8813 17765 23 \n", "1533 1533 199107 7 12337 8077 16597 22 \n", "1534 1534 199106 7 10877 7013 14741 19 \n", "1535 1535 199105 7 10442 6544 14340 18 \n", "1536 1536 199104 7 7913 4563 11263 14 \n", "1537 1537 199103 7 15387 10484 20290 27 \n", "1538 1538 199102 7 16277 11046 21508 29 \n", "1539 1539 199101 7 15565 10271 20859 27 \n", "1540 1540 199052 7 19375 13295 25455 34 \n", "1541 1541 199051 7 19080 13807 24353 34 \n", "1542 1542 199050 7 11079 6660 15498 20 \n", "1543 1543 199049 7 1143 0 2610 2 \n", "\n", " inc100_low inc100_up geo_insee geo_name \n", "0 0 4 FR France \n", "1 0 2 FR France \n", "2 0 1 FR France \n", "3 0 2 FR France \n", "4 0 2 FR France \n", "5 0 1 FR France \n", "6 0 2 FR France \n", "7 0 2 FR France \n", "8 0 1 FR France \n", "9 0 2 FR France \n", "10 0 1 FR France \n", "11 0 2 FR France \n", "12 1 5 FR France \n", "13 3 9 FR France \n", "14 8 14 FR France \n", "15 8 16 FR France \n", "16 11 19 FR France \n", "17 10 18 FR France \n", "18 16 26 FR France \n", "19 12 20 FR France \n", "20 10 18 FR France \n", "21 10 18 FR France \n", "22 10 16 FR France \n", "23 9 15 FR France \n", "24 6 12 FR France \n", 
"25 7 13 FR France \n", "26 11 19 FR France \n", "27 8 16 FR France \n", "28 6 12 FR France \n", "29 7 13 FR France \n", "... ... ... ... ... \n", "1514 20 42 FR France \n", "1515 18 38 FR France \n", "1516 17 39 FR France \n", "1517 13 29 FR France \n", "1518 17 37 FR France \n", "1519 16 36 FR France \n", "1520 23 45 FR France \n", "1521 19 39 FR France \n", "1522 25 51 FR France \n", "1523 16 32 FR France \n", "1524 18 34 FR France \n", "1525 18 32 FR France \n", "1526 14 30 FR France \n", "1527 11 23 FR France \n", "1528 13 25 FR France \n", "1529 19 35 FR France \n", "1530 20 38 FR France \n", "1531 15 33 FR France \n", "1532 15 31 FR France \n", "1533 15 29 FR France \n", "1534 12 26 FR France \n", "1535 11 25 FR France \n", "1536 8 20 FR France \n", "1537 18 36 FR France \n", "1538 20 38 FR France \n", "1539 18 36 FR France \n", "1540 23 45 FR France \n", "1541 25 43 FR France \n", "1542 12 28 FR France \n", "1543 0 5 FR France \n", "\n", "[1544 rows x 11 columns]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data = raw_data.copy()\n", "data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Our data use an unusual convention: the week number is appended to\n", "the year, so that each value looks like an integer (for example,\n", "202027 denotes week 27 of 2020). That is how pandas interprets it.\n", "\n", "A second problem is that pandas does not understand week numbers.\n", "It needs the start and end dates of each week, which we obtain\n", "with the `isoweek` library.\n", "\n", "Since the week conversion has become fairly involved, we write a\n", "small Python function for it and then apply it to every point in\n", "our data. The results go into a new column, 'period'."
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "def convert_week(year_and_week):\n", "    # Split e.g. 202027 into year 2020 and ISO week 27\n", "    year = year_and_week // 100\n", "    week = year_and_week % 100\n", "    # day(0) is the Monday of that ISO week\n", "    return pd.Period(isoweek.Week(year, week).day(0), 'W')\n", "\n", "data['period'] = list(map(convert_week, data['week']))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Two small changes remain.\n", "\n", "First, we set the observation periods as the new index of our\n", "dataset. This turns it into a time series, which will be\n", "convenient later on.\n", "\n", "Second, we sort the points by period, in chronological order." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "sorted_data = data.set_index('period').sort_index()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We check the consistency of the data. Between the end of one period\n", "and the start of the next, the time difference should be zero, or at\n", "least very small. We allow a one-second \"margin of error\"." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "periods = sorted_data.index\n", "for p1, p2 in zip(periods[:-1], periods[1:]):\n", "    delta = p2.to_timestamp() - p1.end_time\n", "    if delta > pd.Timedelta('1s'):\n", "        print(p1, p2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A first look at the data!" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "sorted_data['inc'].plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Zooming in on the last few years (the final 150 weeks) shows more clearly that the peaks occur in spring. The incidence minimum falls in autumn." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "sorted_data['inc'][-150:].plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Study of the annual incidence" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For technical reasons related to the automatic validation of the MOOC results, we define the reference period as running between two incidence minima, from 1 September of year $N$ to 1 September of year $N+1$.\n", "\n", "Our task is complicated slightly by the fact that a year does not contain a whole number of weeks. We therefore adjust the reference periods a little: instead of 1 September of each year, we use the first day of the week that contains 1 September.\n", "\n", "Since the incidence of chickenpox is very low at that time of year, this adjustment is unlikely to distort our conclusions.\n", "\n", "One more detail: the data start in December 1990, which makes the first year incomplete. We therefore start the analysis in 1991." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "starting_year = 1991\n", "first_septembre_week = [pd.Period(pd.Timestamp(y, 9, 1), 'W')\n", "                        for y in range(starting_year,\n", "                                       sorted_data.index[-1].year)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Starting from this list of weeks that each contain a 1 September, we obtain our intervals of roughly one year as the periods between two adjacent weeks in the list. We compute the sum of the weekly incidences for each of these periods.\n", "\n", "We also check that each period contains 52 weeks, give or take one, to guard against possible errors in our code."
] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "year = []\n", "yearly_incidence = []\n", "for week1, week2 in zip(first_septembre_week[:-1],\n", "                        first_septembre_week[1:]):\n", "    # week2-1 is the last week before the next period starts\n", "    one_year = sorted_data['inc'][week1:week2-1]\n", "    assert abs(len(one_year)-52) < 2\n", "    yearly_incidence.append(one_year.sum())\n", "    year.append(week2.year)\n", "yearly_incidence = pd.Series(data=yearly_incidence, index=year)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here are the annual incidences." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "yearly_incidence.plot(style='*')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A sorted list makes it easier to spot the highest values (at the end)." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2002 516689\n", "2018 542312\n", "2017 551041\n", "1996 564901\n", "2019 584066\n", "2015 604382\n", "2000 617597\n", "2001 619041\n", "2012 624573\n", "2005 628464\n", "2006 632833\n", "2011 642368\n", "1993 643387\n", "1995 652478\n", "1994 661409\n", "1998 677775\n", "1997 683434\n", "2014 685769\n", "2013 698332\n", "2007 717352\n", "2008 749478\n", "1999 756456\n", "2003 758363\n", "2004 777388\n", "2016 782114\n", "2010 829911\n", "1992 832939\n", "2009 842373\n", "dtype: int64" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "yearly_incidence.sort_values()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, a histogram shows that the strongest epidemics affect only about 1% of the population and are half as frequent as the most common ones."
] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "yearly_incidence.hist(xrot=20)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 1 }