{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Analysis of the incidence of chickenpox" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Imports" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "import pandas as pd\n", "import isoweek\n", "from os import path\n", "from datetime import date" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The chickenpox incidence data are available from the [Réseau Sentinelles website](https://www.sentiweb.fr/france/fr/). We retrieve them as a CSV file in which each line corresponds to one week of the requested period. We always download the complete dataset, which starts in December 1990 and ends with a recent week." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "data_varicelle_url = 'https://www.sentiweb.fr/datasets/incidence-PAY-7.csv'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here is the explanation of the columns given [on the original site](https://ns.sentiweb.fr/incidence/csv-schema-v1.json):\n", "\n", "| Column name | Column description |\n", "|----------------|-----------------------------------------------------------------------------------------------------------------------------------|\n", "| week | Calendar week (ISO 8601) |\n", "| indicator | Code of the surveillance indicator |\n", "| inc | Estimated incidence of consultations, in number of cases |\n", "| inc_low | Estimated lower bound of the 95% CI of the number of consultation cases |\n", "| inc_up | Estimated upper bound of the 95% CI of the number of consultation cases |\n", "| inc100 | Estimated incidence rate of consultation cases (in cases per 100,000 inhabitants) |\n", "| inc100_low | Estimated lower bound of the 95% CI of the incidence rate of consultation cases (in cases per 100,000 inhabitants) |\n", 
"| inc100_up | Estimated upper bound of the 95% CI of the incidence rate of consultation cases (in cases per 100,000 inhabitants) |\n", "| geo_insee | Code of the geographic area concerned (INSEE code) http://www.insee.fr/fr/methodes/nomenclatures/cog/ |\n", "| geo_name | Label of the geographic area (this label may be changed without notice) |\n", "\n", "The first line of the CSV file is a comment, which we ignore by specifying `skiprows=1`." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "# Import the data from Sentinelles.\n", "raw_data_varicelle = pd.read_csv(data_varicelle_url, skiprows=1)\n", "\n", "# Name of the local file in which we store the data.\n", "local_file_name = \"incidence-PAY-7_local_copy.csv\"\n", "\n", "# Check whether this file already exists.\n", "if path.exists(local_file_name):\n", "    # If so, compare the local copy with today's download.\n", "    # The copy is written with sep=';' and without the index, so read it back the same way.\n", "    local_data_varicelle = pd.read_csv(local_file_name, sep=';')\n", "\n", "    # Rewrite the local file only if the two dataframes differ.\n", "    if not local_data_varicelle.equals(raw_data_varicelle):\n", "        raw_data_varicelle.to_csv(local_file_name, sep=';', index=False)\n", "else:\n", "    raw_data_varicelle.to_csv(local_file_name, sep=';', index=False)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " week indicator inc inc_low inc_up inc100 inc100_low \\\n", "0 202105 7 12379 9107 15651 19 14 \n", "1 202104 7 12026 8826 15226 18 13 \n", "2 202103 7 8913 6375 11451 13 9 \n", "3 202102 7 7795 5430 10160 12 8 \n", "4 202101 7 10525 7750 13300 16 12 \n", "5 202053 7 11978 8406 15550 18 13 \n", "6 202052 7 12012 8285 15739 18 12 \n", "7 202051 7 10564 7574 13554 16 11 \n", "8 202050 7 7063 4744 9382 11 7 \n", "9 202049 7 5026 3145 6907 8 5 \n", "10 202048 7 6683 4312 9054 10 6 \n", "11 202047 7 4999 2963 7035 8 5 \n", "12 202046 7 3752 1963 5541 6 3 \n", "13 202045 7 3696 2016 5376 6 3 \n", "14 202044 7 4391 2375 6407 7 4 \n", "15 202043 7 4376 2505 6247 7 4 \n", "16 202042 7 4000 1979 6021 6 3 \n", "17 202041 7 3961 2099 5823 6 3 \n", "18 202040 7 2078 675 3481 3 1 \n", "19 202039 7 1049 237 1861 2 1 \n", "20 202038 7 2253 782 3724 3 1 \n", "21 202037 7 1584 405 2763 2 0 \n", "22 202036 7 919 100 1738 1 0 \n", "23 202035 7 828 0 1694 1 0 \n", "24 202034 7 2272 371 4173 3 0 \n", "25 202033 7 1284 177 2391 2 0 \n", "26 202032 7 2650 689 4611 4 1 \n", "27 202031 7 1303 100 2506 2 0 \n", "28 202030 7 1385 75 2695 2 0 \n", "29 202029 7 841 10 1672 1 0 \n", "... ... ... ... ... ... ... ... 
\n", "1545 199126 7 17608 11304 23912 31 20 \n", "1546 199125 7 16169 10700 21638 28 18 \n", "1547 199124 7 16171 10071 22271 28 17 \n", "1548 199123 7 11947 7671 16223 21 13 \n", "1549 199122 7 15452 9953 20951 27 17 \n", "1550 199121 7 14903 8975 20831 26 16 \n", "1551 199120 7 19053 12742 25364 34 23 \n", "1552 199119 7 16739 11246 22232 29 19 \n", "1553 199118 7 21385 13882 28888 38 25 \n", "1554 199117 7 13462 8877 18047 24 16 \n", "1555 199116 7 14857 10068 19646 26 18 \n", "1556 199115 7 13975 9781 18169 25 18 \n", "1557 199114 7 12265 7684 16846 22 14 \n", "1558 199113 7 9567 6041 13093 17 11 \n", "1559 199112 7 10864 7331 14397 19 13 \n", "1560 199111 7 15574 11184 19964 27 19 \n", "1561 199110 7 16643 11372 21914 29 20 \n", "1562 199109 7 13741 8780 18702 24 15 \n", "1563 199108 7 13289 8813 17765 23 15 \n", "1564 199107 7 12337 8077 16597 22 15 \n", "1565 199106 7 10877 7013 14741 19 12 \n", "1566 199105 7 10442 6544 14340 18 11 \n", "1567 199104 7 7913 4563 11263 14 8 \n", "1568 199103 7 15387 10484 20290 27 18 \n", "1569 199102 7 16277 11046 21508 29 20 \n", "1570 199101 7 15565 10271 20859 27 18 \n", "1571 199052 7 19375 13295 25455 34 23 \n", "1572 199051 7 19080 13807 24353 34 25 \n", "1573 199050 7 11079 6660 15498 20 12 \n", "1574 199049 7 1143 0 2610 2 0 \n", "\n", " inc100_up geo_insee geo_name \n", "0 24 FR France \n", "1 23 FR France \n", "2 17 FR France \n", "3 16 FR France \n", "4 20 FR France \n", "5 23 FR France \n", "6 24 FR France \n", "7 21 FR France \n", "8 15 FR France \n", "9 11 FR France \n", "10 14 FR France \n", "11 11 FR France \n", "12 9 FR France \n", "13 9 FR France \n", "14 10 FR France \n", "15 10 FR France \n", "16 9 FR France \n", "17 9 FR France \n", "18 5 FR France \n", "19 3 FR France \n", "20 5 FR France \n", "21 4 FR France \n", "22 2 FR France \n", "23 2 FR France \n", "24 6 FR France \n", "25 4 FR France \n", "26 7 FR France \n", "27 4 FR France \n", "28 4 FR France \n", "29 2 FR France \n", "... ... ... ... 
\n", "1545 42 FR France \n", "1546 38 FR France \n", "1547 39 FR France \n", "1548 29 FR France \n", "1549 37 FR France \n", "1550 36 FR France \n", "1551 45 FR France \n", "1552 39 FR France \n", "1553 51 FR France \n", "1554 32 FR France \n", "1555 34 FR France \n", "1556 32 FR France \n", "1557 30 FR France \n", "1558 23 FR France \n", "1559 25 FR France \n", "1560 35 FR France \n", "1561 38 FR France \n", "1562 33 FR France \n", "1563 31 FR France \n", "1564 29 FR France \n", "1565 26 FR France \n", "1566 25 FR France \n", "1567 20 FR France \n", "1568 36 FR France \n", "1569 38 FR France \n", "1570 36 FR France \n", "1571 45 FR France \n", "1572 43 FR France \n", "1573 28 FR France \n", "1574 5 FR France \n", "\n", "[1575 rows x 10 columns]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "raw_data_varicelle" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Are there any missing points in this dataset? No." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Empty DataFrame\n", "Columns: [week, indicator, inc, inc_low, inc_up, inc100, inc100_low, inc100_up, geo_insee, geo_name]\n", "Index: []" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "raw_data_varicelle[raw_data_varicelle.isnull().any(axis=1)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Our data use an unusual convention: the week number is\n", "appended to the year, which makes the value look like\n", "an integer. That is how Pandas interprets it.\n", "\n", "A second problem is that Pandas does not understand week\n", "numbers. We have to give it the start and end dates of each\n", "week. We use the `isoweek` library for this.\n", "\n", "Since converting the weeks has become fairly complex, we\n", "write a small Python function for it. We then apply it\n", "to every point in our data. The results go\n", "into a new column, 'period'." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "def convert_week(year_and_week_int):\n", "    # Split an integer such as 202105 into the year (2021) and the ISO week (5).\n", "    year_and_week_str = str(year_and_week_int)\n", "    year = int(year_and_week_str[:4])\n", "    week = int(year_and_week_str[4:])\n", "    w = isoweek.Week(year, week)\n", "    # w.day(0) is the Monday of that week; build the weekly period that contains it.\n", "    return pd.Period(w.day(0), 'W')\n", "\n", "raw_data_varicelle['period'] = [convert_week(yw) for yw in raw_data_varicelle['week']]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Two small changes remain.\n", "\n", "First, we make the observation periods the new index\n", "of our dataset. This turns it into a time series,\n", "which will be convenient later on.\n", "\n", "Second, we sort the points by period, in\n", "chronological order."
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "sorted_v_data = raw_data_varicelle.set_index('period').sort_index()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We check the consistency of the data. Between the end of one period and\n", "the start of the next, the time difference must be\n", "zero, or at least very small. We allow a \"margin of error\"\n", "of one second." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "periods = sorted_v_data.index\n", "for p1, p2 in zip(periods[:-1], periods[1:]):\n", "    # Gap between the end of one week and the start of the next.\n", "    delta = p2.to_timestamp() - p1.end_time\n", "    if delta > pd.Timedelta('1s'):\n", "        print(p1, p2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are no gaps in our periods: we did not remove any weeks from our analysis because we did not encounter any missing data." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "sorted_v_data['inc'].plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us zoom in on the last few years to see the periodicity." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "sorted_v_data['inc'][-200:].plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Study of the annual incidence" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since the peak of the epidemic occurs in winter, straddling\n", "two calendar years, we define the reference period\n", "between two minima of the incidence, from September 1 of year $N$ to\n", "September 1 of year $N+1$.\n", "\n", "Our task is complicated slightly by the fact that a year does not contain\n", "a whole number of weeks. We therefore adjust our reference periods\n", "a little: instead of September 1 of each year, we use the\n", "first day of the week that contains September 1.\n", "\n", "Since the incidence of chickenpox is relatively low in summer, this\n", "adjustment is unlikely to distort our conclusions.\n", "\n", "One more detail: the data start in December 1990, which\n", "makes the first year incomplete. We therefore start the analysis in 1991." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "first_sept_week = [pd.Period(pd.Timestamp(y, 9, 1), 'W')\n", "                   for y in range(1991,\n", "                                  sorted_v_data.index[-1].year)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Starting from this list of the weeks that contain a September 1, we obtain our intervals of about one year as the periods between two adjacent weeks in the list. We compute the sum of the weekly incidences for each of these periods.\n", "\n", "We also check that these periods contain between 51 and 53 weeks, to guard against possible errors in our code."
] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "year = []\n", "yearly_incidence = []\n", "for week1, week2 in zip(first_sept_week[:-1],\n", "                        first_sept_week[1:]):\n", "    # Weekly incidences from week1 up to the week just before week2.\n", "    one_year = sorted_v_data['inc'][week1:week2-1]\n", "    assert abs(len(one_year)-52) < 2\n", "    yearly_incidence.append(one_year.sum())\n", "    year.append(week2.year)\n", "yearly_incidence = pd.Series(data=yearly_incidence, index=year)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here are the annual incidences." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "yearly_incidence.plot(style='*')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A sorted list makes it easier to spot the highest values, which appear at the end." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2020 221186\n", "2002 516689\n", "2018 542312\n", "2017 551041\n", "1996 564901\n", "2019 584066\n", "2015 604382\n", "2000 617597\n", "2001 619041\n", "2012 624573\n", "2005 628464\n", "2006 632833\n", "2011 642368\n", "1993 643387\n", "1995 652478\n", "1994 661409\n", "1998 677775\n", "1997 683434\n", "2014 685769\n", "2013 698332\n", "2007 717352\n", "2008 749478\n", "1999 756456\n", "2003 758363\n", "2004 777388\n", "2016 782114\n", "2010 829911\n", "1992 832939\n", "2009 842373\n", "dtype: int64" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "yearly_incidence.sort_values()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, a histogram shows that chickenpox does not cause strong epidemics of the kind that affect about 10% of the French population. The strongest epidemics affect between 1% and 2% of the French population."
] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAW8AAAEICAYAAACQzXX2AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAEaNJREFUeJzt3XuQZHV5h/HnZRd0YWBFF4YKoEPQGImrImPwUsFZsCwV1BJNvCCBlGZNeSNmUxZJqWgSlIRgaSxIakWUEuOoYEoB46Wio+IFmVXLFQlqBBGIRGJAl6Cw8uaPc4bMrjs7Zy49fd6t51M1Nae7T/d53z7d33P616e7IzORJNWy17ALkCQtnOEtSQUZ3pJUkOEtSQUZ3pJUkOEtSQUZ3pJUkOEtSQUZ3pJU0OpB3fC6detybGxsh/Puuusu9ttvv0EtckXYQz/YQz/Yw/LbsmXL7Zl50HzzDSy8x8bGmJ6e3uG8qakpJiYmBrXIFWEP/WAP/WAPyy8ifthlPodNJKkgw1uSCjK8Jakgw1uSCjK8Jakgw1uSCjK8Jakgw1uSChrYh3Qk/bqxM6+8f3rT+u2cPuv0IN14zokrshytHPe8Jakgw1uSCjK8Jakgw1uSCjK8Jakgw1uSCjK8Jakgw1uSCjK8Jakgw1uSCjK8Jakgw1uSCjK8Jakgw1uSCjK8Jakgw1uSCjK8Jakgw1uSCjK8JamgzuEdEa+LiGsj4tsR8cGIeOAgC5Mkza1TeEfEocBrgfHMfDSwCnjRIAuTJM1tIcMmq4E1EbEa2Be4dTAlSZLmE5nZbcaIM4CzgbuBT2fmKbuYZyOwEWB0dPSYycnJHS7ftm0bIyMjS615qOyhH6r2sPWWO++fHl0Dt929Mstdf+jagdxu1fUwW9962LBhw5bMHJ9vvk7hHREHApcBLwTuAD4CXJqZl8x1nfHx8Zyent7hvKmpKSYmJuZdXp/ZQz9U7WHszCvvn960fjvnbV29Isu98ZwTB3K7VdfDbH3rISI6hXfXYZOnATdk5k8y817go8CTl1KgJGnxuob3TcATI2LfiAjgBOC6wZUlSdqdTuGdmVcDlwJfB7a219s8wLokSbvRecAtM88CzhpgLZKkjvyEpSQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQV1Dm8I+JBEXFpRPx7RFwXEU8aZGGSpLmtXsC87wQ+mZkviIh9gH0HVJMkaR6dwjsiDgCOA04HyMx7gHsGV5YkaXciM+efKeJxwGbgO8BjgS3AGZl5107zbQQ2AoyOjh4zOTm5w+1s27aNkZGR5al8SOyhH6r2sPWWO++fHl0Dt909xGKWQZce1h+6dmWKWaS+PZY2bNiwJTPH55uva3iPA18FnpKZV0fEO4GfZeYb57rO+Ph4Tk9P73De1NQUExMT8y6vz+yhH6r2MHbmlfdPb1q/nfO2LmTksn+69HDjOSeuUDWL07fHUkR0Cu+ub1jeDNycmVe3py8FHr/Y4iRJS9MpvDPzx8CPIuKR7Vkn0AyhSJKGYCGv2V4DfKA90uQHwB8NpiRJ0nw6h3dmfhOYdxxGkjR4fsJSkgoy
vCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgpaUHhHxKqI+EZEXDGogiRJ81vonvcZwHWDKESS1F3n8I6Iw4ATgQsHV44kqYvIzG4zRlwKvA3YH/jzzDxpF/NsBDYCjI6OHjM5ObnD5du2bWNkZGSpNQ+VPfTDUnrYesudy1zN4oyugdvuHnYVS2MPu7b+0LWLvu6GDRu2ZOb4fPOt7nJjEXES8F+ZuSUiJuaaLzM3A5sBxsfHc2Jix1mnpqbY+bxq7KEfltLD6WdeubzFLNKm9ds5b2unp2Bv2cOu3XjKxLLe3q50HTZ5CvCciLgRmASOj4hLBlaVJGm3OoV3Zv5FZh6WmWPAi4DPZuZLB1qZJGlOHuctSQUteKAnM6eAqWWvRJLUmXveklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklRQp/COiMMj4nMRcV1EXBsRZwy6MEnS3FZ3nG87sCkzvx4R+wNbIuIzmfmdAdYmSZpDpz3vzPzPzPx6O/1z4Drg0EEWJkmaW2Tmwq4QMQZ8AXh0Zv5sp8s2AhsBRkdHj5mcnNzhutu2bWNkZGQJ5Q6fPfTDUnrYesudy1zN4oyugdvuHnYVS2MPu7b+0LWLvu6GDRu2ZOb4fPMtKLwjYgT4PHB2Zn50d/OOj4/n9PT0DudNTU0xMTHReXl9ZA/9sJQexs68cnmLWaRN67dz3tauI5f9ZA+7duM5Jy76uhHRKbw7H20SEXsDlwEfmC+4JUmD1fVokwDeA1yXmW8fbEmSpPl03fN+CnAqcHxEfLP9e9YA65Ik7UangZ7MvAqIAdciSerIT1hKUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkG9/NnnYf6691J+9VkLs5T1vGn9dk7vya/AS8PgnrckFWR4S1JBhrckFWR4S1JBhrckFWR4S1JBhrckFWR4S1JBhrckFWR4S1JBhrckFWR4S1JBhrckFWR4S1JBhrckFWR4S1JBhrckFWR4S1JBhrckFdQ5vCPiGRFxfUR8PyLOHGRRkqTd6xTeEbEKOB94JnAU8OKIOGqQhUmS5tZ1z/t3ge9n5g8y8x5gEnju4MqSJO1OZOb8M0W8AHhGZr68PX0qcGxmvnqn+TYCG9uTjwSu3+mm1gG3L7XoIbOHfrCHfrCH5fewzDxovplWd7yx2MV5v5b6mbkZ2DznjURMZ+Z4x2X2kj30gz30gz0MT9dhk5uBw2edPgy4dfnLkSR10TW8rwEeERFHRMQ+wIuAjw+uLEnS7nQaNsnM7RHxauBTwCrgosy8dhHLm3NIpRB76Ad76Ad7GJJOb1hKkvrFT1hKUkGGtyQVZHhLUkG9D++IOD4ijhh2HUtRvYfq9c/YE/qwh37oQw+9fcOy/e6USeAO4D7grMz8/HCrWpjqPVSvf8ae0Ic99EOfeujNnndEHBYRB8w664XAZZl5HM2d9eKIeNJwquumeg8LqT8idvWp216ovh7AHvqizz0MPbwj4lER8QngKuCvImLmC69+AezbTn+Y5rsHntjH0Kjew2Lqzx6+ZKu+HsAe+qJCD0MJ74jYb9bJxwE3Z+YY
8Fng79vzfwr8MiL2z8yfAt8FRoGxFSx1TtV72E39n+PX6x9p6/8ePal/RvX1APaAPSzKioV3RBwYEe+LiGuAcyLioHZr9RjgS+3e3MeBOyLiRJo7ZX9gfXsT36P59q97VqrmnVXvYTf1r59V/8d2Uf9j2pv47jDrn1F9PYA9tDdhD0uwknvexwHbgWfRfEvhXwIHtDUcMutl+MXAS4CvAT+n+QEIMvMrwPHAz1aw5p1V72Gu+ldRo/4Z1dcD2IM9LFVmLusfTRC8Avg8zXd7r2vP/zDw2nb6COCc9vIn0IwrrWovGwF+0t7OocB1wKuB9wIXAPsud817Wg/V69+T+rAHexjU3yD2vE8CngO8BXgS8Hft+Z8BntxO/wj4IvDMzLyGZou3ASAztwFXA0/IzFuAU2nGlH4MvCEz/3cANe9pPVSvf0/qwx7sYSC6/hjDDmaONoiIJ9C8lPgicGVm/hL4LeAHmfnZiLgBODcing5sAZ4XEesy8/aI+B5wV0Q8FHgX8NKIOJjmu8L/m+blCZk5DUwvsc89rofq9e9pfbS92IPP6RWz4D3vWXfQccBFNIfOPA14WzvLfcB3I2JNZt5A8/LiMTTjRLfSHCcJ8CualyB7AZfRfC3jKcAxwObMvG/RXc3fw6q2h6fSvOwp1UNbV0bEBEXXQdvHAdX7iIiHtP+fDLyvaA8HR8RDImKcZmy3Yg/7VH5OL0qXsRWa4xr/BPhn4I+BvYE/BV7VXn4g8C3gaJo74RxgrL3sJJo7YF07vRVYSzPI/wlgn1nL2WtQ40PAfsDLaVbIJpo3Jar1sD9wJc33qQO8rlL9sx5LpwH/RvNhh1J98P+fSv59mjHRKZrfa632WNoPOJ1m2OBO4MSCPewNvBK4HPhH4OHAGZV6WMrfvHveEXEIcAUwAbyfZjD/ZJpxo+0Amfk/wMeA17YPhoOBR7U38QXgqcA9mXkF8B7gUuB8mq38vTPLysHtqe5HExbHA+8Gng48n+ZNifsq9NBaAzwAODIi1gFH0uwllKg/IvYGrgVeAJybmc9vLzp61vJ73UdmZkSsBf4AeEdmTmTm9TR7ZiV6iIiH0wwpnAC8AbgFuInmDbsyjyfgVTTP6XfQ/Kbuye3pXxXqYfE6bN3W0PxS/Mzp02kG+08Dvjbr/N8Abm2nX0Xz0dED2+tfDjx01rzrVnorBTxo1vTraVboKcV6OA04F3gj8DLg2cA1Vepvl/tR4JSdznshcHWVPmj29v66nZ7ZEz+5Sg80Af2AWacvotmgPrdKD+0yLwf+sJ1+GfCa9rFU5jm9lL8uY96/AL7WHrgOzcuQYzPzYpo9wEMAMvNW4NqIODYzzwe+D3yI5seLr8rMm2ZuMDNv77DcZZWZd0TEARHxPpphk3U0K+/IiBjtcw+z7vu9gP+g2Xs9PjMvB47oe/07uQg4KyLOi4ipiHgT8FWa30g9uK2t733cDvxeRJwCbImIi2n29n67fUXU6x4y81fZvIk3M2YfNF+0dDnNeijxnAb+FTgtIj4CvBl4LPBtmh4Oauvqew+Lt4Ct3MwexsXAGe30+4G/bacfDFxIuyWjGY96NPDAYW+hdurjlTQvDzfTjIF/GXgTzQO41z0AH6HZa1pLc7zqG2gerG8stg4+RfNhiMNpnkhnAF8ptB4e0db7DzR7cS8B3k5zHPDraTayve5hp36+AZzcTl9S6TndPocvotkZezNwFnB9+/gqtR4W+tf5aJPMzIg4DDiEZotHe2cREVfQjCmvynZLlpn3Zua3M/MXXZexEjLzgmzebb6AZpzyn2heQn2cHvcQESM0e3zvBj5J8075scCLgQMj4nJ6XP9OnpeZb83MHwFvpTmM610UWA+tm2g+Dr06m3HVK9rzrqB5Q7ZCD0TEzPP/Kpo3+wDObi4q83g6CpjKZq/5vTQ7Nx+kzmNp8Ra4lXs2zRe07E2zxXsGzZ31EuDxw94SLbCXw4FPAw9pT78UOHrYde2m3gfSvOq5
kOaNpgng07Mu73X9u+nrYTRvJj24Uh80H4/e0k4/iOYVxNGVemhr3Zdmw3nyTuef0vce2uz5M+DC9vQ6mh3LI6qth8X8LejHGCLiS8BvAjfSHBv5lsz8VucbGLL2KIETaDY2R9EMnZyfmffu9oo91H6I4GRgMjN/POx6FiIiHkCz4T8V+B2aw7wuyMztQy1sgSLibJrnw9E0OwJnZbMnXkpEXA+8KTM/NPM5jmHX1FVEHEnzPL6HZl38C/A32Xwico/WObzbw7zOAm4ALsn2DY9KImI1zfcb/JKmh3IvnSJiFXBfpSfYrkTEK2gO03x/xfUwIyIeCfywYg+zPnD3OJo3wbdXfFy1OzKPAL6cmXcPu56V0tufQZMkzW3ov6QjSVo4w1uSCjK8Jakgw1uSCjK8Jakgw1uSCjK8Jamg/wNI6CKPSu0/QAAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "yearly_incidence.hist(xrot=20)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }