{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Subject 7: Around SARS-CoV-2 (Covid-19)\n", "\n", "\n", "## Instructions\n", "\n", "### Prerequisites\n", "\n", "Graphical presentation techniques. This exercise can be done in either R or Python.\n", "\n", "### Task\n", "\n", "The goal here is to reproduce graphs similar to those of the South China Morning Post (SCMP), on the page The Coronavirus Pandemic, which show, for various countries, the cumulative number (that is, the total number of cases since the start of the epidemic) of people with coronavirus disease 2019.\n", "\n", "The data we will use at first are compiled by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) and are made available on GitHub. We will focus on the file time_series_covid19_confirmed_global.csv (time series in CSV format), available at https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv.\n", "\n", "You will start by downloading the data to create a graph showing how the cumulative number of cases evolves over time for the following countries: Belgium (Belgium), China, all provinces except Hong Kong (China), Hong Kong (China, Hong-Kong), metropolitan France (France), Germany (Germany), Iran (Iran), Italy (Italy), Japan (Japan), South Korea (Korea, South), the Netherlands without the colonies (Netherlands), Portugal (Portugal), Spain (Spain), the United Kingdom without the colonies (United Kingdom), the United States (US).\n", "\n", "The name in parentheses is the name of the “country” as it appears in the file time_series_covid19_confirmed_global.csv. The data for China appear by province, and we have separated Hong Kong, not to take sides in the differences between this province and the Chinese state, but because that is how the data appear on the SCMP site. The data for France, the Netherlands and the United Kingdom exclude overseas territories and other “colonial leftovers”.\n", "\n", "Next, you will make a graph with the date on the x-axis and the cumulative number of cases at that date on the y-axis. We suggest making two versions of this graph, one with a linear scale and one with a logarithmic scale.\n", "\n", "### Additional question, for once we are out of this “mess”\n", "\n", "You can also use the death data (time_series_covid19_deaths_global.csv) and redo the curves, but here too, be careful with the interpretation. These curves, even if they look frightening, must be compared with “normal” mortality. For France, data are available on the INSEE site: https://www.insee.fr/fr/information/4470857, as well as in the weekly mortality surveillance bulletins (“Points hebdomadaires”) published by Santé publique France, such as the one for week 12 (since the site is very poorly designed for anyone looking for a specific piece of information, the simplest approach is to go through a general-purpose search engine…).\n", "\n", "To reduce the effects of counting methods, etc., you can, once the epidemic is over, take the total number of deaths and normalize it per 1000 inhabitants of the country concerned. You will then look up the number of hospital beds per 1000 inhabitants on the OECD site and correlate the two (that is, make a graph with the number of beds on the x-axis and the number of deaths on the y-axis)…\n", "\n", "---\n", "\n", "# Getting started\n", "\n", "We will first import the tools needed for this work.\n", "\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "# to check whether the data is already present\n", "import os\n", "# to download the data\n", "import urllib.request\n", "\n", "# to display the plots\n", "import matplotlib.pyplot as plt\n", "# to process the data\n", "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now download the [data](https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv) if it has not already been downloaded." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

| | Province/State | Country/Region | Lat | Long | 1/22/20 | 1/23/20 | 1/24/20 | 1/25/20 | 1/26/20 | 1/27/20 | ... | 2/23/21 | 2/24/21 | 2/25/21 | 2/26/21 | 2/27/21 | 2/28/21 | 3/1/21 | 3/2/21 | 3/3/21 | 3/4/21 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | NaN | Afghanistan | 33.93911 | 67.709953 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 55646 | 55664 | 55680 | 55696 | 55707 | 55714 | 55733 | 55759 | 55770 | 55775 |
| 1 | NaN | Albania | 41.15330 | 20.168300 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 102306 | 103327 | 104313 | 105229 | 106215 | 107167 | 107931 | 108823 | 109674 | 110521 |
| 2 | NaN | Algeria | 28.03390 | 1.659600 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 112279 | 112461 | 112622 | 112805 | 112960 | 113092 | 113255 | 113430 | 113593 | 113761 |
| 3 | NaN | Andorra | 42.50630 | 1.521800 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 10739 | 10775 | 10799 | 10822 | 10849 | 10866 | 10889 | 10908 | 10948 | 10976 |
| 4 | NaN | Angola | -11.20270 | 17.873900 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 20584 | 20640 | 20695 | 20759 | 20782 | 20807 | 20854 | 20882 | 20923 | 20981 |

5 rows × 412 columns
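
The table above is the kind of output `head()` gives on the downloaded file. As a minimal sketch of the "download only if absent" step (not necessarily the notebook's original cell), something like the following could be used. The local file name `DATA_FILE` and the helper `load_cases` are hypothetical names chosen for illustration; only the URL comes from the text.

```python
import os
import urllib.request

import pandas as pd

# URL given in the instructions above.
DATA_URL = (
    "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/"
    "csse_covid_19_data/csse_covid_19_time_series/"
    "time_series_covid19_confirmed_global.csv"
)
# Hypothetical local file name (an assumption, not taken from the notebook).
DATA_FILE = "time_series_covid19_confirmed_global.csv"


def load_cases(url=DATA_URL, path=DATA_FILE):
    """Return the time series as a DataFrame, downloading the CSV only if absent."""
    if not os.path.exists(path):
        # Fetch the file once; later runs reuse the local copy.
        urllib.request.urlretrieve(url, path)
    return pd.read_csv(path)
```

Calling `load_cases().head()` would then display the first five rows. Keeping a local copy makes the analysis reproducible even if the remote file changes later.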

| | Province/State | Country/Region | Lat | Long | 1/22/20 | 1/23/20 | 1/24/20 | 1/25/20 | 1/26/20 | 1/27/20 | ... | 2/23/21 | 2/24/21 | 2/25/21 | 2/26/21 | 2/27/21 | 2/28/21 | 3/1/21 | 3/2/21 | 3/3/21 | 3/4/21 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

0 rows × 412 columns

| Localisation | 1/22/20 | 1/23/20 | 1/24/20 | 1/25/20 | 1/26/20 | 1/27/20 | 1/28/20 | 1/29/20 | 1/30/20 | 1/31/20 | ... | 2/24/21 | 2/25/21 | 2/26/21 | 2/27/21 | 2/28/21 | 3/1/21 | 3/2/21 | 3/3/21 | 3/4/21 | Sum |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Hong-Kong | 0 | 2 | 2 | 5 | 8 | 8 | 8 | 10 | 10 | 12 | ... | 10913 | 10926 | 10950 | 10983 | 11005 | 11019 | 11032 | 11046 | 11055 | 1659701 |
| Korea, South | 1 | 1 | 2 | 2 | 3 | 4 | 4 | 4 | 4 | 11 | ... | 88516 | 88922 | 89321 | 89676 | 90031 | 90372 | 90816 | 91240 | 91638 | 10964482 |
| China | 548 | 641 | 918 | 1401 | 2067 | 2869 | 5501 | 6077 | 8131 | 9790 | ... | 89919 | 89925 | 89935 | 89941 | 89960 | 89971 | 89981 | 89991 | 90000 | 33057311 |
| Japan | 2 | 2 | 2 | 2 | 4 | 4 | 7 | 7 | 11 | 15 | ... | 427732 | 428816 | 429873 | 431093 | 432090 | 432778 | 433700 | 434944 | 436093 | 41743265 |
| Portugal | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 800586 | 801746 | 802773 | 803844 | 804562 | 804956 | 805647 | 806626 | 807456 | 70635994 |
| Belgium | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 760809 | 763885 | 766654 | 769414 | 771511 | 772294 | 774344 | 777608 | 780251 | 98653796 |
| Netherlands | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 1068960 | 1073971 | 1079084 | 1084021 | 1088690 | 1092452 | 1096433 | 1101430 | 1105544 | 111703914 |
| Iran | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 1598875 | 1607081 | 1615184 | 1623159 | 1631169 | 1639679 | 1648174 | 1656699 | 1665103 | 208249840 |
| Germany | 0 | 0 | 0 | 0 | 0 | 1 | 4 | 4 | 4 | 5 | ... | 2416037 | 2427069 | 2436506 | 2444177 | 2450295 | 2455569 | 2462061 | 2472913 | 2484306 | 255393641 |
| Italy | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | ... | 2848564 | 2868435 | 2888923 | 2907825 | 2925265 | 2938371 | 2955434 | 2976274 | 2999119 | 312517063 |
| Spain | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 3170644 | 3180212 | 3188553 | 3188553 | 3188553 | 3204531 | 3130184 | 3136321 | 3142358 | 355880364 |
| France | 0 | 0 | 2 | 3 | 3 | 3 | 4 | 5 | 5 | 5 | ... | 3639501 | 3664050 | 3689034 | 3712474 | 3732426 | 3736390 | 3759247 | 3785326 | 3810605 | 404492379 |
| United Kingdom | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | ... | 4144577 | 4154562 | 4163085 | 4170519 | 4176554 | 4182009 | 4188400 | 4194785 | 4201358 | 404528634 |
| US | 1 | 1 | 2 | 2 | 5 | 5 | 5 | 6 | 6 | 8 | ... | 28309085 | 28386492 | 28463190 | 28527344 | 28578548 | 28637313 | 28694071 | 28759980 | 28827144 | 3343145027 |

14 rows × 409 columns
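
The code that produced the table above is not shown here, so the following is only a sketch of one way to obtain a table of that shape and then draw the two requested graphs. It rests on several assumptions: that mainland rows have an empty `Province/State`, that Hong Kong is listed under `China` with the province name `Hong Kong`, and that the `Sum` column is the row total over all dates, used only to order the rows. The names `build_location_table` and `plot_cases` are hypothetical.

```python
import pandas as pd

# "Country/Region" values listed in the instructions.
COUNTRIES = [
    "Belgium", "China", "France", "Germany", "Iran", "Italy", "Japan",
    "Korea, South", "Netherlands", "Portugal", "Spain", "United Kingdom", "US",
]
META_COLUMNS = ["Province/State", "Country/Region", "Lat", "Long"]


def build_location_table(raw, countries=COUNTRIES):
    """Return one row per location, one column per date, plus a "Sum" column."""
    data = raw[raw["Country/Region"].isin(countries)]
    # Assumption: mainland rows have no province; China is given by province only.
    # This drops the overseas territories of France, the Netherlands and the UK.
    keep = data["Province/State"].isna() | (data["Country/Region"] == "China")
    data = data[keep].copy()
    # Assumption: Hong Kong appears as the province "Hong Kong" of China.
    data["Localisation"] = data["Country/Region"].where(
        data["Province/State"] != "Hong Kong", "Hong-Kong"
    )
    date_columns = [
        c for c in data.columns if c not in META_COLUMNS + ["Localisation"]
    ]
    # Summing per location merges the remaining Chinese provinces into one row.
    table = data.groupby("Localisation")[date_columns].sum()
    table["Sum"] = table.sum(axis=1)
    return table.sort_values("Sum")


def plot_cases(table, logy=False):
    """Plot one curve per location: date on x, cumulative cases on y."""
    series = table.drop(columns="Sum").T
    # Column labels look like "1/22/20" (month/day/two-digit year).
    series.index = pd.to_datetime(series.index, format="%m/%d/%y")
    ax = series.plot(logy=logy, figsize=(12, 6))
    ax.set_xlabel("Date")
    ax.set_ylabel("Cumulative confirmed cases")
    return ax
```

With `table = build_location_table(raw)`, calling `plot_cases(table)` would give the linear-scale version and `plot_cases(table, logy=True)` the logarithmic one. The plotting relies on pandas' built-in matplotlib backend, so matplotlib must be installed.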
\n", "