{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Incidence de la varicelle" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "import pandas as pd\n", "import isoweek\n", "from os import path as pth" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Les données de l'incidence du syndrome grippal sont disponibles du site Web du [Réseau Sentinelles](http://www.sentiweb.fr/). Par soucis d'efficacité et de pérennité, nous utilisons une copie locale de ces données. Dans le cas où ce fichier n'existerait pas ou plus, nous les récupérons sous forme d'un fichier en format CSV dont chaque ligne correspond à une semaine de la période demandée. Ces données sont ensuite sauvegardées à l'emplacement défini pour pouvoir être réutilisées par la suite." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "data_url = \"http://www.sentiweb.fr/datasets/incidence-PAY-7.csv\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Voici l'explication des colonnes données [sur le site d'origine](https://ns.sentiweb.fr/incidence/csv-schema-v1.json):\n", "\n", "| Nom de colonne | Libellé de colonne |\n", "|----------------|-----------------------------------------------------------------------------------------------------------------------------------|\n", "| week | Semaine calendaire (ISO 8601) |\n", "| indicator | Code de l'indicateur de surveillance |\n", "| inc | Estimation de l'incidence de consultations en nombre de cas |\n", "| inc_low | Estimation de la borne inférieure de l'IC95% du nombre de cas de consultation |\n", "| inc_up | Estimation de la borne supérieure de l'IC95% du nombre de cas de consultation |\n", "| inc100 | Estimation du taux d'incidence du nombre de cas de consultation (en cas pour 100,000 habitants) |\n", "| inc100_low | Estimation de la borne inférieure de l'IC95% du taux 
d'incidence du nombre de cas de consultation (en cas pour 100,000 habitants) |\n", "| inc100_up | Estimation de la borne supérieure de l'IC95% du taux d'incidence du nombre de cas de consultation (en cas pour 100,000 habitants) |\n", "| geo_insee | Code de la zone géographique concernée (Code INSEE) http://www.insee.fr/fr/methodes/nomenclatures/cog/ |\n", "| geo_name | Libellé de la zone géographique (ce libellé peut être modifié sans préavis) |\n", "\n", "La première ligne du fichier CSV est un commentaire, que nous ignorons en précisant `skiprows=1`." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
weekindicatorincinc_lowinc_upinc100inc100_lowinc100_upgeo_inseegeo_name
02021207800345901141612717FRFrance
1202119766544370893810713FRFrance
22021187391221105714639FRFrance
320211774686287864947410FRFrance
420211674780289166697410FRFrance
5202115711215762714803171222FRFrance
6202114711197799414400171222FRFrance
720211379714628913139151020FRFrance
8202112711520841514625171222FRFrance
920211179386667812094141018FRFrance
1020211079056645211660141018FRFrance
11202109710988793814038171222FRFrance
12202108711281836114201171321FRFrance
132021077135611031516807211626FRFrance
14202106713401981016992201525FRFrance
15202105712210898815432181323FRFrance
16202104712026882615226181323FRFrance
172021037891363751145113917FRFrance
182021027779554301016012816FRFrance
19202101710525775013300161220FRFrance
20202053711978840615550181323FRFrance
21202052712012828515739181224FRFrance
22202051710564757413554161121FRFrance
23202050770634744938211715FRFrance
2420204975026314569078511FRFrance
25202048766834312905410614FRFrance
2620204774999296370358511FRFrance
272020467375219635541639FRFrance
282020457369620165376639FRFrance
2920204474391237564077410FRFrance
.................................
15601991267176081130423912312042FRFrance
15611991257161691070021638281838FRFrance
15621991247161711007122271281739FRFrance
1563199123711947767116223211329FRFrance
1564199122715452995320951271737FRFrance
1565199121714903897520831261636FRFrance
15661991207190531274225364342345FRFrance
15671991197167391124622232291939FRFrance
15681991187213851388228888382551FRFrance
1569199117713462887718047241632FRFrance
15701991167148571006819646261834FRFrance
1571199115713975978118169251832FRFrance
1572199114712265768416846221430FRFrance
157319911379567604113093171123FRFrance
1574199112710864733114397191325FRFrance
15751991117155741118419964271935FRFrance
15761991107166431137221914292038FRFrance
1577199109713741878018702241533FRFrance
1578199108713289881317765231531FRFrance
1579199107712337807716597221529FRFrance
1580199106710877701314741191226FRFrance
1581199105710442654414340181125FRFrance
15821991047791345631126314820FRFrance
15831991037153871048420290271836FRFrance
15841991027162771104621508292038FRFrance
15851991017155651027120859271836FRFrance
15861990527193751329525455342345FRFrance
15871990517190801380724353342543FRFrance
1588199050711079666015498201228FRFrance
15891990497114302610205FRFrance
\n", "

1590 rows × 10 columns

\n", "
" ], "text/plain": [ " week indicator inc inc_low inc_up inc100 inc100_low \\\n", "0 202120 7 8003 4590 11416 12 7 \n", "1 202119 7 6654 4370 8938 10 7 \n", "2 202118 7 3912 2110 5714 6 3 \n", "3 202117 7 4686 2878 6494 7 4 \n", "4 202116 7 4780 2891 6669 7 4 \n", "5 202115 7 11215 7627 14803 17 12 \n", "6 202114 7 11197 7994 14400 17 12 \n", "7 202113 7 9714 6289 13139 15 10 \n", "8 202112 7 11520 8415 14625 17 12 \n", "9 202111 7 9386 6678 12094 14 10 \n", "10 202110 7 9056 6452 11660 14 10 \n", "11 202109 7 10988 7938 14038 17 12 \n", "12 202108 7 11281 8361 14201 17 13 \n", "13 202107 7 13561 10315 16807 21 16 \n", "14 202106 7 13401 9810 16992 20 15 \n", "15 202105 7 12210 8988 15432 18 13 \n", "16 202104 7 12026 8826 15226 18 13 \n", "17 202103 7 8913 6375 11451 13 9 \n", "18 202102 7 7795 5430 10160 12 8 \n", "19 202101 7 10525 7750 13300 16 12 \n", "20 202053 7 11978 8406 15550 18 13 \n", "21 202052 7 12012 8285 15739 18 12 \n", "22 202051 7 10564 7574 13554 16 11 \n", "23 202050 7 7063 4744 9382 11 7 \n", "24 202049 7 5026 3145 6907 8 5 \n", "25 202048 7 6683 4312 9054 10 6 \n", "26 202047 7 4999 2963 7035 8 5 \n", "27 202046 7 3752 1963 5541 6 3 \n", "28 202045 7 3696 2016 5376 6 3 \n", "29 202044 7 4391 2375 6407 7 4 \n", "... ... ... ... ... ... ... ... 
\n", "1560 199126 7 17608 11304 23912 31 20 \n", "1561 199125 7 16169 10700 21638 28 18 \n", "1562 199124 7 16171 10071 22271 28 17 \n", "1563 199123 7 11947 7671 16223 21 13 \n", "1564 199122 7 15452 9953 20951 27 17 \n", "1565 199121 7 14903 8975 20831 26 16 \n", "1566 199120 7 19053 12742 25364 34 23 \n", "1567 199119 7 16739 11246 22232 29 19 \n", "1568 199118 7 21385 13882 28888 38 25 \n", "1569 199117 7 13462 8877 18047 24 16 \n", "1570 199116 7 14857 10068 19646 26 18 \n", "1571 199115 7 13975 9781 18169 25 18 \n", "1572 199114 7 12265 7684 16846 22 14 \n", "1573 199113 7 9567 6041 13093 17 11 \n", "1574 199112 7 10864 7331 14397 19 13 \n", "1575 199111 7 15574 11184 19964 27 19 \n", "1576 199110 7 16643 11372 21914 29 20 \n", "1577 199109 7 13741 8780 18702 24 15 \n", "1578 199108 7 13289 8813 17765 23 15 \n", "1579 199107 7 12337 8077 16597 22 15 \n", "1580 199106 7 10877 7013 14741 19 12 \n", "1581 199105 7 10442 6544 14340 18 11 \n", "1582 199104 7 7913 4563 11263 14 8 \n", "1583 199103 7 15387 10484 20290 27 18 \n", "1584 199102 7 16277 11046 21508 29 20 \n", "1585 199101 7 15565 10271 20859 27 18 \n", "1586 199052 7 19375 13295 25455 34 23 \n", "1587 199051 7 19080 13807 24353 34 25 \n", "1588 199050 7 11079 6660 15498 20 12 \n", "1589 199049 7 1143 0 2610 2 0 \n", "\n", " inc100_up geo_insee geo_name \n", "0 17 FR France \n", "1 13 FR France \n", "2 9 FR France \n", "3 10 FR France \n", "4 10 FR France \n", "5 22 FR France \n", "6 22 FR France \n", "7 20 FR France \n", "8 22 FR France \n", "9 18 FR France \n", "10 18 FR France \n", "11 22 FR France \n", "12 21 FR France \n", "13 26 FR France \n", "14 25 FR France \n", "15 23 FR France \n", "16 23 FR France \n", "17 17 FR France \n", "18 16 FR France \n", "19 20 FR France \n", "20 23 FR France \n", "21 24 FR France \n", "22 21 FR France \n", "23 15 FR France \n", "24 11 FR France \n", "25 14 FR France \n", "26 11 FR France \n", "27 9 FR France \n", "28 9 FR France \n", "29 10 FR France \n", "... ... 
... ... \n", "1560 42 FR France \n", "1561 38 FR France \n", "1562 39 FR France \n", "1563 29 FR France \n", "1564 37 FR France \n", "1565 36 FR France \n", "1566 45 FR France \n", "1567 39 FR France \n", "1568 51 FR France \n", "1569 32 FR France \n", "1570 34 FR France \n", "1571 32 FR France \n", "1572 30 FR France \n", "1573 23 FR France \n", "1574 25 FR France \n", "1575 35 FR France \n", "1576 38 FR France \n", "1577 33 FR France \n", "1578 31 FR France \n", "1579 29 FR France \n", "1580 26 FR France \n", "1581 25 FR France \n", "1582 20 FR France \n", "1583 36 FR France \n", "1584 38 FR France \n", "1585 36 FR France \n", "1586 45 FR France \n", "1587 43 FR France \n", "1588 28 FR France \n", "1589 5 FR France \n", "\n", "[1590 rows x 10 columns]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "file_path = 'DonneesGrippeMOOC.csv'\n", "\n", "# Download the data only when no local copy exists yet.\n", "if not pth.isfile(file_path):\n", "    raw_data = pd.read_csv(data_url, skiprows=1)\n", "    df = pd.DataFrame(raw_data, columns=['week', 'indicator', 'inc', 'inc_low', 'inc_up', 'inc100', 'inc100_low', 'inc100_up', 'geo_insee', 'geo_name'])\n", "    df.to_csv(file_path, index=False)\n", "\n", "# Always read from the local copy.\n", "raw_data = pd.read_csv(file_path)\n", "\n", "raw_data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Are there any missing data points in this dataset? No." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
weekindicatorincinc_lowinc_upinc100inc100_lowinc100_upgeo_inseegeo_name
\n", "
" ], "text/plain": [ "Empty DataFrame\n", "Columns: [week, indicator, inc, inc_low, inc_up, inc100, inc100_low, inc100_up, geo_insee, geo_name]\n", "Index: []" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "raw_data[raw_data.isnull().any(axis=1)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Nous utiliserons donc l'ensemble des données disponibles pour la suite de l'étude" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "data=raw_data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Nos données utilisent une convention inhabituelle: le numéro de\n", "semaine est collé à l'année, donnant l'impression qu'il s'agit\n", "de nombre entier. C'est comme ça que Pandas les interprète.\n", " \n", "Un deuxième problème est que Pandas ne comprend pas les numéros de\n", "semaine. Il faut lui fournir les dates de début et de fin de\n", "semaine. Nous utilisons pour cela la bibliothèque `isoweek`.\n", "\n", "Comme la conversion des semaines est devenu assez complexe, nous\n", "écrivons une petite fonction Python pour cela. Ensuite, nous\n", "l'appliquons à tous les points de nos donnés. Les résultats vont\n", "dans une nouvelle colonne 'period'." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "def convert_week(year_and_week_int):\n", " year_and_week_str = str(year_and_week_int)\n", " year = int(year_and_week_str[:4])\n", " week = int(year_and_week_str[4:])\n", " w = isoweek.Week(year, week)\n", " return pd.Period(w.day(0), 'W')\n", "\n", "data['period'] = [convert_week(yw) for yw in data['week']]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Il restent deux petites modifications à faire.\n", "\n", "Premièrement, nous définissons les périodes d'observation\n", "comme nouvel index de notre jeux de données. 
Ceci en fait\n", "une suite chronologique, ce qui sera pratique par la suite.\n", "\n", "Deuxièmement, nous trions les points par période, dans\n", "le sens chronologique." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "sorted_data = data.set_index('period').sort_index()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Nous vérifions la cohérence des données. Entre la fin d'une période et\n", "le début de la période qui suit, la différence temporelle doit être\n", "zéro, ou au moins très faible. Nous laissons une \"marge d'erreur\"\n", "d'une seconde.\n", "\n", "Ceci s'avère tout à fait juste pour l'ensemble des données." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "periods = sorted_data.index\n", "for p1, p2 in zip(periods[:-1], periods[1:]):\n", " delta = p2.to_timestamp() - p1.end_time\n", " if delta > pd.Timedelta('1s'):\n", " print(p1, p2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Un premier regard sur les données !" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "sorted_data['inc'].plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Un zoom sur les dernières années montre mieux la situation des creux en été. Une réduction de l'amplitude et de la largeur du pic de 2020 et 2021 pourraient être attribués aux effets des confinements liés à la COVID-19." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "sorted_data['inc'][-200:].plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Etude de l'incidence annuelle" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Etant donné que le pic de l'épidémie se situe en hiver, à cheval\n", "entre deux années civiles, nous définissons la période de référence\n", "entre deux minima de l'incidence, du 1er septembre de l'année $N$ au\n", "1er septembre de l'année $N+1$.\n", "\n", "Notre tâche est un peu compliquée par le fait que l'année ne comporte\n", "pas un nombre entier de semaines. Nous modifions donc un peu nos périodes\n", "de référence: à la place du 1er septembre de chaque année, nous utilisons le\n", "premier jour de la semaine qui contient le 1er septembre.\n", "\n", "Encore un petit détail: les données commencent en décembre 1990, ce qui\n", "rend la première année incomplète. Nous commençons donc l'analyse en 1991." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "first_september_week = [pd.Period(pd.Timestamp(y, 9, 1), 'W')\n", " for y in range(1991,\n", " sorted_data.index[-1].year)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "En partant de cette liste des semaines qui contiennent un 1er septembre, nous obtenons nos intervalles d'environ un an comme les périodes entre deux semaines adjacentes dans cette liste. Nous calculons les sommes des incidences hebdomadaires pour toutes ces périodes.\n", "\n", "Nous vérifions également que ces périodes contiennent entre 51 et 52 semaines, pour nous protéger contre des éventuelles erreurs dans notre code." 
] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "year = []\n", "yearly_incidence = []\n", "for week1, week2 in zip(first_september_week[:-1],\n", "                        first_september_week[1:]):\n", "    # Weekly incidences from week1 up to the week before week2.\n", "    one_year = sorted_data['inc'][week1:week2-1]\n", "    assert abs(len(one_year)-52) < 2\n", "    yearly_incidence.append(one_year.sum())\n", "    year.append(week2.year)\n", "yearly_incidence = pd.Series(data=yearly_incidence, index=year)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here are the annual incidences." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "yearly_incidence.plot(style='*')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Une liste triée permet de plus facilement répérer les valeurs les plus élevées (à la fin)." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2020 221186\n", "2002 516689\n", "2018 542312\n", "2017 551041\n", "1996 564901\n", "2019 584066\n", "2015 604382\n", "2000 617597\n", "2001 619041\n", "2012 624573\n", "2005 628464\n", "2006 632833\n", "2011 642368\n", "1993 643387\n", "1995 652478\n", "1994 661409\n", "1998 677775\n", "1997 683434\n", "2014 685769\n", "2013 698332\n", "2007 717352\n", "2008 749478\n", "1999 756456\n", "2003 758363\n", "2004 777388\n", "2016 782114\n", "2010 829911\n", "1992 832939\n", "2009 842373\n", "dtype: int64" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "yearly_incidence.sort_values()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Enfin, un histogramme montre bien que la varicelle est une maladie touchant la population de manière chronique, sans vraiment présenter de forte épidémies se démarquant de la tendance moyenne. Les effets du confinement liés à la COVID-19 sont bien visibles puisque l'année 2020 voit l'incidence de la varicelle être réduite d'environ un facteur 3 vis à vis de l'incidence moyenne mesurée sur les 30 dernières années. 
" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAW8AAAEICAYAAACQzXX2AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAEaNJREFUeJzt3XuQZHV5h/HnZRd0YWBFF4YKoEPQGImrImPwUsFZsCwV1BJNvCCBlGZNeSNmUxZJqWgSlIRgaSxIakWUEuOoYEoB46Wio+IFmVXLFQlqBBGIRGJAl6Cw8uaPc4bMrjs7Zy49fd6t51M1Nae7T/d53z7d33P616e7IzORJNWy17ALkCQtnOEtSQUZ3pJUkOEtSQUZ3pJUkOEtSQUZ3pJUkOEtSQUZ3pJU0OpB3fC6detybGxsh/Puuusu9ttvv0EtckXYQz/YQz/Yw/LbsmXL7Zl50HzzDSy8x8bGmJ6e3uG8qakpJiYmBrXIFWEP/WAP/WAPyy8ifthlPodNJKkgw1uSCjK8Jakgw1uSCjK8Jakgw1uSCjK8Jakgw1uSChrYh3Qk/bqxM6+8f3rT+u2cPuv0IN14zokrshytHPe8Jakgw1uSCjK8Jakgw1uSCjK8Jakgw1uSCjK8Jakgw1uSCjK8Jakgw1uSCjK8Jakgw1uSCjK8Jakgw1uSCjK8Jakgw1uSCjK8Jakgw1uSCjK8JamgzuEdEa+LiGsj4tsR8cGIeOAgC5Mkza1TeEfEocBrgfHMfDSwCnjRIAuTJM1tIcMmq4E1EbEa2Be4dTAlSZLmE5nZbcaIM4CzgbuBT2fmKbuYZyOwEWB0dPSYycnJHS7ftm0bIyMjS615qOyhH6r2sPWWO++fHl0Dt929Mstdf+jagdxu1fUwW9962LBhw5bMHJ9vvk7hHREHApcBLwTuAD4CXJqZl8x1nfHx8Zyent7hvKmpKSYmJuZdXp/ZQz9U7WHszCvvn960fjvnbV29Isu98ZwTB3K7VdfDbH3rISI6hXfXYZOnATdk5k8y817go8CTl1KgJGnxuob3TcATI2LfiAjgBOC6wZUlSdqdTuGdmVcDlwJfB7a219s8wLokSbvRecAtM88CzhpgLZKkjvyEpSQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQV1Dm8I+JBEXFpRPx7RFwXEU8aZGGSpLmtXsC87wQ+mZkviIh9gH0HVJMkaR6dwjsiDgCOA04HyMx7gHsGV5YkaXciM+efKeJxwGbgO8BjgS3AGZl5107zbQQ2AoyOjh4zOTm5w+1s27aNkZGR5al8SOyhH6r2sPWWO++fHl0Dt909xGKWQZce1h+6dmWKWaS+PZY2bNiwJTPH55uva3iPA18FnpKZV0fEO4GfZeYb57rO+Ph4Tk9P73De1NQUExMT8y6vz+yhH6r2MHbmlfdPb1q/nfO2LmTksn+69HDjOSeuUDWL07fHUkR0Cu+ub1jeDNycmVe3py8FHr/Y4iRJS9MpvDPzx8CPIuKR7Vkn0AyhSJKGYCGv2V4DfKA90uQHwB8NpiRJ0nw6h3dmfhOYdxxGkjR4fsJSkg
oyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgoyvCWpIMNbkgpaUHhHxKqI+EZEXDGogiRJ81vonvcZwHWDKESS1F3n8I6Iw4ATgQsHV44kqYvIzG4zRlwKvA3YH/jzzDxpF/NsBDYCjI6OHjM5ObnD5du2bWNkZGSpNQ+VPfTDUnrYesudy1zN4oyugdvuHnYVS2MPu7b+0LWLvu6GDRu2ZOb4fPOt7nJjEXES8F+ZuSUiJuaaLzM3A5sBxsfHc2Jix1mnpqbY+bxq7KEfltLD6WdeubzFLNKm9ds5b2unp2Bv2cOu3XjKxLLe3q50HTZ5CvCciLgRmASOj4hLBlaVJGm3OoV3Zv5FZh6WmWPAi4DPZuZLB1qZJGlOHuctSQUteKAnM6eAqWWvRJLUmXveklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklSQ4S1JBRneklRQp/COiMMj4nMRcV1EXBsRZwy6MEnS3FZ3nG87sCkzvx4R+wNbIuIzmfmdAdYmSZpDpz3vzPzPzPx6O/1z4Drg0EEWJkmaW2Tmwq4QMQZ8AXh0Zv5sp8s2AhsBRkdHj5mcnNzhutu2bWNkZGQJ5Q6fPfTDUnrYesudy1zN4oyugdvuHnYVS2MPu7b+0LWLvu6GDRu2ZOb4fPMtKLwjYgT4PHB2Zn50d/OOj4/n9PT0DudNTU0xMTHReXl9ZA/9sJQexs68cnmLWaRN67dz3tauI5f9ZA+7duM5Jy76uhHRKbw7H20SEXsDlwEfmC+4JUmD1fVokwDeA1yXmW8fbEmSpPl03fN+CnAqcHxEfLP9e9YA65Ik7UangZ7MvAqIAdciSerIT1hKUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkGGtyQVZHhLUkG9/NnnYf6691J+9VkLs5T1vGn9dk7vya/AS8PgnrckFWR4S1JBhrckFWR4S1JBhrckFWR4S1JBhrckFWR4S1JBhrckFWR4S1JBhrckFWR4S1JBhrckFWR4S1JBhrckFWR4S1JBhrckFWR4S1JBhrckFdQ5vCPiGRFxfUR8PyLOHGRRkqTd6xTeEbEKOB94JnAU8OKIOGqQhUmS5tZ1z/t3ge9n5g8y8x5gEnju4MqSJO1OZOb8M0W8AHhGZr68PX0qcGxmvnqn+TYCG9uTjwSu3+mm1gG3L7XoIbOHfrCHfrCH5fewzDxovplWd7yx2MV5v5b6mbkZ2DznjURMZ+Z4x2X2kj30gz30gz0MT9dhk5uBw2edPgy4dfnLkSR10TW8rwEeERFHRMQ+wIuAjw+uLEnS7nQaNsnM7RHxauBTwCrgosy8dhHLm3NIpRB76Ad76Ad7GJJOb1hKkvrFT1hKUkGGtyQVZHhLUkG9D++IOD4ijhh2HUtRvYfq9c/YE/qwh37oQw+9fcOy/e6USeAO4D7grMz8/HCrWpjqPVSvf8ae0Ic99EOfeujNnndEHBYRB8w664XAZZl5HM2d9eKIeNJwquumeg8LqT8idvWp216ovh7AHvqizz0MPbwj4lER8QngKuCvImLmC69+AezbTn+Y5rsHntjH0Kjew2Lqzx6+ZKu+HsAe+qJCD0MJ74jYb9bJxwE3Z+
YY8Fng79vzfwr8MiL2z8yfAt8FRoGxFSx1TtV72E39n+PX6x9p6/8ePal/RvX1APaAPSzKioV3RBwYEe+LiGuAcyLioHZr9RjgS+3e3MeBOyLiRJo7ZX9gfXsT36P59q97VqrmnVXvYTf1r59V/8d2Uf9j2pv47jDrn1F9PYA9tDdhD0uwknvexwHbgWfRfEvhXwIHtDUcMutl+MXAS4CvAT+n+QEIMvMrwPHAz1aw5p1V72Gu+ldRo/4Z1dcD2IM9LFVmLusfTRC8Avg8zXd7r2vP/zDw2nb6COCc9vIn0IwrrWovGwF+0t7OocB1wKuB9wIXAPsud817Wg/V69+T+rAHexjU3yD2vE8CngO8BXgS8Hft+Z8BntxO/wj4IvDMzLyGZou3ASAztwFXA0/IzFuAU2nGlH4MvCEz/3cANe9pPVSvf0/qwx7sYSC6/hjDDmaONoiIJ9C8lPgicGVm/hL4LeAHmfnZiLgBODcing5sAZ4XEesy8/aI+B5wV0Q8FHgX8NKIOJjmu8L/m+blCZk5DUwvsc89rofq9e9pfbS92IPP6RWz4D3vWXfQccBFNIfOPA14WzvLfcB3I2JNZt5A8/LiMTTjRLfSHCcJ8CualyB7AZfRfC3jKcAxwObMvG/RXc3fw6q2h6fSvOwp1UNbV0bEBEXXQdvHAdX7iIiHtP+fDLyvaA8HR8RDImKcZmy3Yg/7VH5OL0qXsRWa4xr/BPhn4I+BvYE/BV7VXn4g8C3gaJo74RxgrL3sJJo7YF07vRVYSzPI/wlgn1nL2WtQ40PAfsDLaVbIJpo3Jar1sD9wJc33qQO8rlL9sx5LpwH/RvNhh1J98P+fSv59mjHRKZrfa632WNoPOJ1m2OBO4MSCPewNvBK4HPhH4OHAGZV6WMrfvHveEXEIcAUwAbyfZjD/ZJpxo+0Amfk/wMeA17YPhoOBR7U38QXgqcA9mXkF8B7gUuB8mq38vTPLysHtqe5HExbHA+8Gng48n+ZNifsq9NBaAzwAODIi1gFH0uwllKg/IvYGrgVeAJybmc9vLzp61vJ73UdmZkSsBf4AeEdmTmTm9TR7ZiV6iIiH0wwpnAC8AbgFuInmDbsyjyfgVTTP6XfQ/Kbuye3pXxXqYfE6bN3W0PxS/Mzp02kG+08Dvjbr/N8Abm2nX0Xz0dED2+tfDjx01rzrVnorBTxo1vTraVboKcV6OA04F3gj8DLg2cA1Vepvl/tR4JSdznshcHWVPmj29v66nZ7ZEz+5Sg80Af2AWacvotmgPrdKD+0yLwf+sJ1+GfCa9rFU5jm9lL8uY96/AL7WHrgOzcuQYzPzYpo9wEMAMvNW4NqIODYzzwe+D3yI5seLr8rMm2ZuMDNv77DcZZWZd0TEARHxPpphk3U0K+/IiBjtcw+z7vu9gP+g2Xs9PjMvB47oe/07uQg4KyLOi4ipiHgT8FWa30g9uK2t733cDvxeRJwCbImIi2n29n67fUXU6x4y81fZvIk3M2YfNF+0dDnNeijxnAb+FTgtIj4CvBl4LPBtmh4Oauvqew+Lt4Ct3MwexsXAGe30+4G/bacfDFxIuyWjGY96NPDAYW+hdurjlTQvDzfTjIF/GXgTzQO41z0AH6HZa1pLc7zqG2gerG8stg4+RfNhiMNpnkhnAF8ptB4e0db7DzR7cS8B3k5zHPDraTayve5hp36+AZzcTl9S6TndPocvotkZezNwFnB9+/gqtR4W+tf5aJPMzIg4DDiEZotHe2cREVfQjCmvynZLlpn3Zua3M/MXXZexEjLzgmzebb6AZpzyn2heQn2cHvcQESM0e3zvBj5J8075scCLgQMj4nJ6XP9OnpeZb83MHwFvpTmM610UWA+tm2g+Dr06m3HVK9rzrqB5Q7ZCD0TEzPP/Kpo3+wDObi4q83g6CpjKZq/5vTQ7Nx+kzmNp8Ra4lXs2zRe07E2zxXsGzZ31EuDxw94SLbCXw4FPAw9pT78UOHrYde2m3gfSvO
q5kOaNpgng07Mu73X9u+nrYTRvJj24Uh80H4/e0k4/iOYVxNGVemhr3Zdmw3nyTuef0vce2uz5M+DC9vQ6mh3LI6qth8X8LejHGCLiS8BvAjfSHBv5lsz8VucbGLL2KIETaDY2R9EMnZyfmffu9oo91H6I4GRgMjN/POx6FiIiHkCz4T8V+B2aw7wuyMztQy1sgSLibJrnw9E0OwJnZbMnXkpEXA+8KTM/NPM5jmHX1FVEHEnzPL6HZl38C/A32Xwico/WObzbw7zOAm4ALsn2DY9KImI1zfcb/JKmh3IvnSJiFXBfpSfYrkTEK2gO03x/xfUwIyIeCfywYg+zPnD3OJo3wbdXfFy1OzKPAL6cmXcPu56V0tufQZMkzW3ov6QjSVo4w1uSCjK8Jakgw1uSCjK8Jakgw1uSCjK8Jamg/wNI6CKPSu0/QAAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "yearly_incidence.hist(xrot=20)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }