Day 1: loading the data

parent 1ebfa6cf
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Subject 7: Around SARS-CoV-2 (Covid-19) *(Peer evaluation)*\n",
"\n",
"## Summary of the assignment\n",
"\n",
"The goal here is to reproduce graphs similar to those of the South China Morning Post (SCMP) on the page [The Coronavirus Pandemic](https://www.scmp.com/coronavirus?src=homepage_covid_widget), which show, for various countries, the cumulative number of people with coronavirus disease 2019 (that is, the total number of cases since the start of the epidemic). The data are available at https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv.\n",
"We will create a graph showing the cumulative number of cases over time for the following countries (14 in total; the dataset label is given in parentheses): \n",
"+ Belgium (Belgium), \n",
"+ China, all provinces except Hong Kong (China), \n",
"+ Hong Kong (China, Hong Kong), \n",
"+ metropolitan France (France), \n",
"+ Germany (Germany), \n",
"+ Iran (Iran), \n",
"+ Italy (Italy), \n",
"+ Japan (Japan), \n",
"+ South Korea (Korea, South), \n",
"+ the Netherlands without its overseas territories (Netherlands), \n",
"+ Portugal (Portugal), \n",
"+ Spain (Spain), \n",
"+ the United Kingdom without its overseas territories (United Kingdom), \n",
"+ the United States (US).\n",
"\n",
"The graphs will have the date on the x-axis and the cumulative number of cases at that date on the y-axis. We will produce two versions of this graph, one with a linear scale and one with a logarithmic scale."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Loading the data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We first import the libraries needed for the data analysis."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline\n",
"import matplotlib.pyplot as plt\n",
"import pandas as pd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We then load the data into a Python variable (a *pandas* DataFrame)."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"data_url = \"https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv\"\n",
"raw_data = pd.read_csv(data_url)"
]
},
{
"cell_type": "code",
"execution_count": 98,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.axes._subplots.AxesSubplot at 0x7fb744d815f8>"
]
},
"execution_count": 98,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
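"# Metropolitan France is the row without a Province/State; columns 4 onward hold the daily counts\n",
"france = raw_data.loc[(raw_data[\"Country/Region\"]==\"France\") & raw_data[\"Province/State\"].isna()].iloc[:,4:]\n",
"# Fail loudly if any daily count is missing, instead of discarding the result of the check\n",
"assert not france.isnull().values.any()\n",
"france.transpose().plot()\n"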
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We note the start date and the end date. \n",
"We then load the data for the 14 countries.\n",
"We check that the data for these countries contain no erroneous entries."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"data_fr = raw_data.loc[(raw_data[\"Country/Region\"]==\"France\") & raw_data[\"Province/State\"].isna()]\n",
"data_be = raw_data.loc[(raw_data[\"Country/Region\"]==\"Belgium\") & raw_data[\"Province/State\"].isna()]\n",
"data_de = raw_data.loc[(raw_data[\"Country/Region\"]==\"Germany\") & raw_data[\"Province/State\"].isna()]\n",
"data_ir = raw_data.loc[(raw_data[\"Country/Region\"]==\"Iran\") & raw_data[\"Province/State\"].isna()]\n",
"data_it = raw_data.loc[(raw_data[\"Country/Region\"]==\"Italy\") & raw_data[\"Province/State\"].isna()]\n",
"data_jp = raw_data.loc[(raw_data[\"Country/Region\"]==\"Japan\") & raw_data[\"Province/State\"].isna()]\n",
"data_kr = raw_data.loc[(raw_data[\"Country/Region\"]==\"Korea, South\") & raw_data[\"Province/State\"].isna()]\n",
"data_nl = raw_data.loc[(raw_data[\"Country/Region\"]==\"Netherlands\") & raw_data[\"Province/State\"].isna()]\n",
"data_pt = raw_data.loc[(raw_data[\"Country/Region\"]==\"Portugal\") & raw_data[\"Province/State\"].isna()]\n",
"data_es = raw_data.loc[(raw_data[\"Country/Region\"]==\"Spain\") & raw_data[\"Province/State\"].isna()]\n",
"data_uk = raw_data.loc[(raw_data[\"Country/Region\"]==\"United Kingdom\") & raw_data[\"Province/State\"].isna()]\n",
"data_us = raw_data.loc[(raw_data[\"Country/Region\"]==\"US\") & raw_data[\"Province/State\"].isna()]\n",
"\n",
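"# Hong Kong is listed as a province of China and is kept as a separate series\n",
"data_hg = raw_data.loc[(raw_data[\"Country/Region\"]==\"China\") & (raw_data[\"Province/State\"]==\"Hong Kong\")]\n",
"# All other Chinese provinces are summed into a single row; Lat/Long and the text columns are summed too,\n",
"# so only the date columns of this row are meaningful\n",
"data_ch = pd.DataFrame([raw_data.loc[(raw_data[\"Country/Region\"]==\"China\") & (raw_data[\"Province/State\"]!=\"Hong Kong\")].sum(axis=0)])"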
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
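Review note on the last code cell: the twelve near-identical selections plus the two China cases could be folded into one helper. The sketch below is only a suggestion, not the notebook's method. `country_series`, `N_META` and the `toy` frame are hypothetical names introduced here, and the toy values are made up for illustration (they are not real case counts). It assumes the four leading metadata columns seen in the notebook (`Province/State`, `Country/Region`, `Lat`, `Long`) and date column labels of the form `1/22/20`.

```python
import pandas as pd

# Assumption: the first four columns are Province/State, Country/Region, Lat, Long
N_META = 4


def country_series(raw, country, province=None, exclude_province=None):
    """Return cumulative cases per date (a Series) for one plotted entity.

    - province given: keep only that province (e.g. Hong Kong)
    - exclude_province given: sum every row of the country except that province
    - neither: keep the row with no Province/State (the country without
      its separately listed territories)
    """
    rows = raw[raw["Country/Region"] == country]
    if province is not None:
        rows = rows[rows["Province/State"] == province]
    elif exclude_province is not None:
        rows = rows[rows["Province/State"] != exclude_province]
    else:
        rows = rows[rows["Province/State"].isna()]
    # Drop the metadata columns before summing, so that only case counts are added
    cases = rows.iloc[:, N_META:].sum(axis=0)
    # Assumption: column labels look like 1/22/20 (month/day/two-digit year)
    cases.index = pd.to_datetime(cases.index, format="%m/%d/%y")
    return cases


# Toy frame with made-up numbers, mimicking the layout of the CSV
toy = pd.DataFrame({
    "Province/State": [None, "Hong Kong", "Hubei", "Beijing", "Reunion"],
    "Country/Region": ["France", "China", "China", "China", "France"],
    "Lat": [46.2, 22.3, 31.0, 40.2, -21.1],
    "Long": [2.2, 114.2, 112.3, 116.4, 55.5],
    "1/22/20": [0, 1, 10, 2, 0],
    "1/23/20": [2, 3, 15, 4, 1],
})

france = country_series(toy, "France")
hong_kong = country_series(toy, "China", province="Hong Kong")
china = country_series(toy, "China", exclude_province="Hong Kong")

# Start and end dates come straight from the parsed index
start_date, end_date = france.index.min(), france.index.max()
```

With the real data, `raw_data` from the notebook would take the place of `toy`. Each returned Series is indexed by date, so the start and end dates and both plot scales can be derived from it without transposing.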