{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# About SARS-CoV-2 (Covid-19)\n", "\n", "The goal here is to reproduce graphs similar to those published by the [South China Morning Post](https://www.scmp.com/) (SCMP) on its page [The Coronavirus Pandemic](https://www.scmp.com/coronavirus?src=homepage_covid_widget), which show, for several countries, the cumulative number of people (that is, the total number of cases since the start of the epidemic) affected by [coronavirus disease 2019](https://fr.wikipedia.org/wiki/Maladie_%C3%A0_coronavirus_2019).\n", "\n", "The data we use at first are compiled by the [Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE)](https://systems.jhu.edu/) and made available on [GitHub](https://github.com/CSSEGISandData/COVID-19). We focus specifically on the file `time_series_covid19_confirmed_global.csv` (time series in CSV format), available [here](https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Downloading and processing the data\n", "\n", "The downloaded data are stored in a local file. The copy used here is dated 22 June 2021."
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'time_series_covid19_confirmed_global.csv'" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import os\n", "import urllib.request\n", "\n", "data_file = \"time_series_covid19_confirmed_global.csv\"\n", "data_url = \"https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv\"\n", "\n", "# Download the file only if no local copy exists yet\n", "if not os.path.exists(data_file):\n", "    urllib.request.urlretrieve(data_url, data_file)\n", "data_file" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "\n", "raw_data = pd.read_csv(data_file, sep=',')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We focus here on the data for **Belgium**, **China** (treating **Hong Kong** separately), **metropolitan France**, **Germany**, **Iran**, **Italy**, **Japan**, **South Korea**, the **Netherlands** (*excluding overseas territories*), **Portugal**, **Spain**, the **United Kingdom** (*excluding overseas territories*) and the **United States**.\n", "\n", "[//]: # \"Initially, the assignment also asks us to include **China** (treating **Hong Kong** separately). However, as can be seen just above, the `.csv` file lists each of the 34 Chinese provinces separately, with no overall figure for China. Several choices are open to us: rebuild a *global* row for the country by merging **all** its provinces, do the same while keeping **Hong Kong** aside to match the assignment, or make life simpler by setting the Chinese data aside.\"\n", "\n", "[//]: # \"I choose this last option for several reasons. The first and most obvious is ease: I do not think I can merge all the Chinese provinces efficiently or elegantly, and I am almost certain to make a mistake trying. Moreover, the statement of this exercise reads:\"\n", "\n", "[//]: # \"> The data for China are given by province, and we have separated Hong Kong, not to take sides in the differences between this province and the Chinese state, but because this is how the data appear on the SCMP site.\"\n", "\n", "[//]: # \"This suggests that the difficulty was not originally anticipated, and that the assignment was worded so that no merging of rows would be needed. For all these reasons, leaving out the data for **China** seems to me both far more sensible in terms of time and closer to the original intent of the assignment.\"" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Names as spelled in the file's Country/Region column (note 'Korea, South', with a space)\n", "selectedCountries = ['Belgium', 'France', 'China', 'Germany', 'Iran', 'Italy',\n", "                     'Japan', 'Korea, South', 'Netherlands', 'Portugal', 'Spain',\n", "                     'United Kingdom', 'US']\n", "\n", "selectedData = raw_data[raw_data['Country/Region'].isin(selectedCountries)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For every country except China, the row covering the country itself (excluding provinces and overseas territories) has `NaN` in its `Province/State` column. We can therefore extract all the Chinese rows on one side and the country-level rows of the other countries on the other."
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# All Chinese rows (one per province)\n", "dataChina = selectedData[selectedData['Country/Region'] == 'China']" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# Country-level rows: those whose Province/State is NaN\n", "dataOther = selectedData[selectedData['Province/State'].isna()]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we concatenate these two data sets." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<p>45 rows × 521 columns</p>
" ], "text/plain": [ " Province/State Country/Region Lat Long 1/22/20 1/23/20 \\\n", "23 NaN Belgium 50.833300 4.469936 0 0 \n", "58 Anhui China 31.825700 117.226400 1 9 \n", "59 Beijing China 40.182400 116.414200 14 22 \n", "60 Chongqing China 30.057200 107.874000 6 9 \n", "61 Fujian China 26.078900 117.987400 1 5 \n", "62 Gansu China 35.751800 104.286100 0 2 \n", "63 Guangdong China 23.341700 113.424400 26 32 \n", "64 Guangxi China 23.829800 108.788100 2 5 \n", "65 Guizhou China 26.815400 106.874800 1 3 \n", "66 Hainan China 19.195900 109.745300 4 5 \n", "67 Hebei China 39.549000 116.130600 1 1 \n", "68 Heilongjiang China 47.862000 127.761500 0 2 \n", "69 Henan China 37.895700 114.904200 5 5 \n", "70 Hong Kong China 22.300000 114.200000 0 2 \n", "71 Hubei China 30.975600 112.270700 444 444 \n", "72 Hunan China 27.610400 111.708800 4 9 \n", "73 Inner Mongolia China 44.093500 113.944800 0 0 \n", "74 Jiangsu China 32.971100 119.455000 1 5 \n", "75 Jiangxi China 27.614000 115.722100 2 7 \n", "76 Jilin China 43.666100 126.192300 0 1 \n", "77 Liaoning China 41.295600 122.608500 2 3 \n", "78 Macau China 22.166700 113.550000 1 2 \n", "79 Ningxia China 37.269200 106.165500 1 1 \n", "80 Qinghai China 35.745200 95.995600 0 0 \n", "81 Shaanxi China 35.191700 108.870100 0 3 \n", "82 Shandong China 36.342700 118.149800 2 6 \n", "83 Shanghai China 31.202000 121.449100 9 16 \n", "84 Shanxi China 37.577700 112.292200 1 1 \n", "85 Sichuan China 30.617100 102.710300 5 8 \n", "86 Tianjin China 39.305400 117.323000 4 4 \n", "87 Tibet China 31.692700 88.092400 0 0 \n", "88 Unknown China NaN NaN 0 0 \n", "89 Xinjiang China 41.112900 85.240100 0 2 \n", "90 Yunnan China 24.974000 101.487000 1 2 \n", "91 Zhejiang China 29.183200 120.093400 10 27 \n", "130 NaN France 46.227600 2.213700 0 0 \n", "134 NaN Germany 51.165691 10.451526 0 0 \n", "149 NaN Iran 32.427908 53.688046 0 0 \n", "153 NaN Italy 41.871940 12.567380 0 0 \n", "155 NaN Japan 36.204824 138.252924 2 2 \n", "197 NaN Netherlands 
52.132600 5.291300 0 0 \n", "214 NaN Portugal 39.399900 -8.224500 0 0 \n", "237 NaN Spain 40.463667 -3.749220 0 0 \n", "253 NaN US 40.000000 -100.000000 1 1 \n", "268 NaN United Kingdom 55.378100 -3.436000 0 0 \n", "\n", " 1/24/20 1/25/20 1/26/20 1/27/20 ... 6/12/21 6/13/21 \\\n", "23 0 0 0 0 ... 1075765 1076338 \n", "58 15 39 60 70 ... 1004 1004 \n", "59 36 41 68 80 ... 1069 1070 \n", "60 27 57 75 110 ... 598 598 \n", "61 10 18 35 59 ... 637 637 \n", "62 2 4 7 14 ... 194 194 \n", "63 53 78 111 151 ... 2618 2625 \n", "64 23 23 36 46 ... 275 275 \n", "65 3 4 5 7 ... 147 147 \n", "66 8 19 22 33 ... 188 188 \n", "67 2 8 13 18 ... 1317 1317 \n", "68 4 9 15 21 ... 1612 1612 \n", "69 9 32 83 128 ... 1316 1316 \n", "70 2 5 8 8 ... 11877 11877 \n", "71 549 761 1058 1423 ... 68159 68159 \n", "72 24 43 69 100 ... 1051 1051 \n", "73 1 7 7 11 ... 390 393 \n", "74 9 18 33 47 ... 735 736 \n", "75 18 18 36 72 ... 937 937 \n", "76 3 4 4 6 ... 573 573 \n", "77 4 17 21 27 ... 426 426 \n", "78 2 2 5 6 ... 52 52 \n", "79 2 3 4 7 ... 76 76 \n", "80 0 1 1 6 ... 18 18 \n", "81 5 15 22 35 ... 622 622 \n", "82 15 27 46 75 ... 883 883 \n", "83 20 33 40 53 ... 2155 2160 \n", "84 1 6 9 13 ... 253 253 \n", "85 15 28 44 69 ... 1050 1054 \n", "86 8 10 14 23 ... 398 398 \n", "87 0 0 0 0 ... 1 1 \n", "88 0 0 0 0 ... 0 0 \n", "89 2 3 4 5 ... 980 980 \n", "90 5 11 16 26 ... 374 376 \n", "91 43 62 104 128 ... 1372 1372 \n", "130 2 3 3 3 ... 5675604 5678209 \n", "134 0 0 0 1 ... 3722295 3723295 \n", "149 0 0 0 0 ... 3020522 3028717 \n", "153 0 0 0 0 ... 4243482 4244872 \n", "155 2 2 4 4 ... 774240 775624 \n", "197 0 0 0 0 ... 1671703 1672744 \n", "214 0 0 0 0 ... 856740 857447 \n", "237 0 0 0 0 ... 3733600 3733600 \n", "253 2 2 5 5 ... 33457228 33462003 \n", "268 0 0 0 0 ... 
4558494 4565813 \n", "\n", " 6/14/21 6/15/21 6/16/21 6/17/21 6/18/21 6/19/21 6/20/21 \\\n", "23 1076579 1077087 1077758 1078251 1078251 1079084 1079415 \n", "58 1004 1004 1004 1004 1004 1004 1004 \n", "59 1071 1071 1072 1072 1073 1073 1075 \n", "60 598 598 598 598 598 598 598 \n", "61 638 638 641 646 650 651 652 \n", "62 194 194 194 194 194 194 194 \n", "63 2635 2650 2657 2666 2680 2692 2699 \n", "64 275 275 275 275 275 275 275 \n", "65 147 147 147 147 147 147 147 \n", "66 188 188 188 188 188 188 188 \n", "67 1317 1317 1317 1317 1317 1317 1317 \n", "68 1612 1612 1612 1612 1612 1612 1612 \n", "69 1316 1316 1316 1316 1316 1316 1317 \n", "70 11878 11880 11881 11881 11884 11885 11886 \n", "71 68159 68159 68160 68160 68160 68160 68160 \n", "72 1051 1051 1051 1051 1051 1051 1051 \n", "73 393 393 393 393 393 393 393 \n", "74 736 738 739 739 739 739 740 \n", "75 937 937 937 937 937 937 937 \n", "76 573 573 573 573 573 573 573 \n", "77 426 426 426 426 426 426 426 \n", "78 52 52 52 52 53 53 53 \n", "79 76 76 76 76 76 76 76 \n", "80 18 18 18 18 18 18 18 \n", "81 622 622 622 622 624 624 624 \n", "82 883 883 883 883 883 883 883 \n", "83 2165 2168 2170 2173 2179 2182 2183 \n", "84 253 253 253 253 253 253 253 \n", "85 1055 1056 1057 1057 1057 1058 1059 \n", "86 398 398 398 399 399 399 399 \n", "87 1 1 1 1 1 1 1 \n", "88 0 0 0 0 0 0 0 \n", "89 980 980 980 980 980 980 980 \n", "90 377 377 380 382 384 388 391 \n", "91 1373 1373 1373 1376 1377 1379 1379 \n", "130 5678893 5681846 5683536 5685387 5688557 5691181 5692996 \n", "134 3724168 3725328 3726767 3727668 3728601 3729597 3730126 \n", "149 3039432 3049648 3060135 3070426 3080526 3086974 3095135 \n", "153 4245779 4247032 4248432 4249755 4250902 4252095 4252976 \n", "155 776565 777979 779696 781241 782877 784384 785702 \n", "197 1673596 1674628 1675644 1676708 1677596 1678282 1678983 \n", "214 858072 859045 860395 861628 862926 864109 865050 \n", "237 3741767 3745199 3749031 3753228 3757442 3757442 3757442 \n", "253 33474734 
33486038 33498468 33508867 33529475 33537995 33541887 \n", "268 4573419 4581006 4589814 4600623 4610893 4620968 4630040 \n", "\n", " 6/21/21 \n", "23 1079640 \n", "58 1004 \n", "59 1075 \n", "60 598 \n", "61 659 \n", "62 194 \n", "63 2706 \n", "64 275 \n", "65 147 \n", "66 188 \n", "67 1317 \n", "68 1612 \n", "69 1317 \n", "70 11889 \n", "71 68160 \n", "72 1051 \n", "73 394 \n", "74 740 \n", "75 937 \n", "76 573 \n", "77 426 \n", "78 53 \n", "79 76 \n", "80 18 \n", "81 624 \n", "82 883 \n", "83 2184 \n", "84 253 \n", "85 1064 \n", "86 399 \n", "87 1 \n", "88 0 \n", "89 980 \n", "90 391 \n", "91 1383 \n", "130 5692968 \n", "134 3730619 \n", "149 3105620 \n", "153 4253460 \n", "155 786566 \n", "197 1679542 \n", "214 865806 \n", "237 3764651 \n", "253 33554275 \n", "268 4640507 \n", "\n", "[45 rows x 521 columns]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data = pd.concat([dataOther, dataChina]).sort_index()\n", "data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that China has one empty row (index `88`, province `Unknown`) that is of no use for the rest of our work, since it contains only *zeros*; we drop it." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "data = data.drop([88])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualising the data\n", "\n", "We want to visualise the cumulative number of cases at each available date.\n", "Let us first plot the data as they are."
] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "\n", "# Extract the dates, the places and the cumulative case counts\n", "dates = data.columns[4:]  # Dates (column labels)\n", "place = data.iloc[:, :2].values  # [Province, Country]\n", "cases = data.iloc[:, 4:].values  # Cumulative cases" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us reformat the country/region names for readability, going from the format\n", " `[Province, Country]`\n", "to the format\n", " `[Country (-Province?)]`\n", "where the province is included only when needed." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "def convert_location_names(place):\n", "    places = []\n", "    for i in range(len(place)):\n", "        name = place[i, 1]\n", "        # Append the province when one is given (not NaN)\n", "        if pd.notna(place[i, 0]):\n", "            name = name + \" - \" + place[i, 0]\n", "        places.append(name)\n", "    return places\n", "\n", "place = convert_location_names(place)  # Careful: do not run this cell more than once" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "for i, p in enumerate(place):\n", "    plt.plot(dates, cases[i, :], label=p)\n", "# Label only one date in 60 to keep the x axis readable\n", "plt.xticks(range(0, len(dates), 60), dates[::60], rotation=45)\n", "plt.legend(fontsize='xx-small', ncol=3)\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }