{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Analysis of the 1854 cholera outbreak in London"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Historical context"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In 1854, the Soho district of London suffered [one of the worst cholera outbreaks in the United Kingdom](https://fr.wikipedia.org/wiki/%C3%89pid%C3%A9mie_de_chol%C3%A9ra_de_Broad_Street), which killed 616 people. The outbreak became famous for the detailed analysis of its causes carried out by the physician [John Snow](https://johnsnowsociety.org/). In particular, he showed that cholera is transmitted through water rather than through the air, contrary to the dominant theory of the time."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Reproducible analysis"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Retrieving the digital data\n",
"First, let us retrieve the digital data made available on this [blog](http://blog.rtwilson.com/john-snows-cholera-data-in-more-formats/). The site offers two archives, **SnowGIS_SHP.zip** and **SnowGIS_KML.zip**, which contain different file formats. It also used to provide Google links giving direct access to the data, but unfortunately those links no longer work. We will therefore use the data from one of the archives, downloaded and imported here, to guarantee that the data remain available and accessible.\n",
"\n",
"The archives give us four pieces of information:\n",
"- Locations of the cholera deaths (vector), with the number of deaths at each point\n",
"- Locations of the pumps (vector)\n",
"- John Snow's original map, georeferenced to the Ordnance Survey National Grid (raster)\n",
"- Current Ordnance Survey maps of the area (derived from those published under OS OpenData; contains Ordnance Survey data © Crown copyright and database right 2013; raster)\n",
"\n",
"Since we will use the [folium](https://python-visualization.github.io/folium/) library to display the map, only the first two will be useful to us. If the library is not present in the environment, install it with `pip install folium`.\n",
"\n",
"The first archive, **SnowGIS_SHP**, contains SHP files (shapefiles), which the [geopandas](https://geopandas.org/index.html) library can read. We will not need the other archive with the KML files.\n",
"\n",
"We will use geopandas to import the data in `.shp` format.\n",
"If geopandas is not present in the environment, install it with `pip install geopandas`."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: geopandas in /opt/conda/lib/python3.6/site-packages (0.8.1)\n",
"Requirement already satisfied: pandas>=0.23.0 in /opt/conda/lib/python3.6/site-packages (from geopandas) (1.1.1)\n",
"Requirement already satisfied: pyproj>=2.2.0 in /opt/conda/lib/python3.6/site-packages (from geopandas) (2.6.1.post1)\n",
"Requirement already satisfied: fiona in /opt/conda/lib/python3.6/site-packages (from geopandas) (1.8.13.post1)\n",
"Requirement already satisfied: shapely in /opt/conda/lib/python3.6/site-packages (from geopandas) (1.7.1)\n",
"Requirement already satisfied: python-dateutil>=2.7.3 in /opt/conda/lib/python3.6/site-packages (from pandas>=0.23.0->geopandas) (2.8.1)\n",
"Requirement already satisfied: numpy>=1.15.4 in /opt/conda/lib/python3.6/site-packages (from pandas>=0.23.0->geopandas) (1.19.1)\n",
"Requirement already satisfied: pytz>=2017.2 in /opt/conda/lib/python3.6/site-packages (from pandas>=0.23.0->geopandas) (2019.3)\n",
"Requirement already satisfied: cligj>=0.5 in /opt/conda/lib/python3.6/site-packages (from fiona->geopandas) (0.5.0)\n",
"Requirement already satisfied: click-plugins>=1.0 in /opt/conda/lib/python3.6/site-packages (from fiona->geopandas) (1.1.1)\n",
"Requirement already satisfied: click<8,>=4.0 in /opt/conda/lib/python3.6/site-packages (from fiona->geopandas) (7.1.2)\n",
"Requirement already satisfied: six>=1.7 in /opt/conda/lib/python3.6/site-packages (from fiona->geopandas) (1.14.0)\n",
"Requirement already satisfied: munch in /opt/conda/lib/python3.6/site-packages (from fiona->geopandas) (2.5.0)\n",
"Requirement already satisfied: attrs>=17 in /opt/conda/lib/python3.6/site-packages (from fiona->geopandas) (19.3.0)\n",
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"pip install geopandas"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
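"import geopandas as gpd\n",
"\n",
"# Each shapefile carries its own projection definition, so no crs argument is passed;\n",
"# the coordinates are projected X/Y values at this stage, not latitude/longitude.\n",
"death_cholera_location = gpd.read_file(\"Cholera_Deaths.shp\")\n",
"pumps_location = gpd.read_file(\"Pumps.shp\")"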
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Visualising and processing the data\n",
"Let us look at the data we have on the cholera deaths. As stated above, we have the number (*Count*) and the location (*geometry*) of the deaths."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
" Id Count geometry\n",
"0 0 3 POINT (529308.741 181031.352)\n",
"1 0 2 POINT (529312.164 181025.172)\n",
"2 0 1 POINT (529314.382 181020.294)\n",
"3 0 1 POINT (529317.380 181014.259)\n",
"4 0 4 POINT (529320.675 181007.872)\n",
".. .. ... ...\n",
"245 0 3 POINT (529362.665 181156.058)\n",
"246 0 2 POINT (529365.152 181176.129)\n",
"247 0 1 POINT (529274.165 180907.313)\n",
"248 0 1 POINT (529299.361 180873.185)\n",
"249 0 1 POINT (529324.815 180857.949)\n",
"\n",
"[250 rows x 3 columns]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"death_cholera_location"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let us first check that there are no missing values."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"death_cholera_location.isna().any().any()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Good news: there are no missing values.\n",
"\n",
"We can already see that the points are X/Y coordinates rather than latitude/longitude, because a latitude must lie between -90° and +90° and a longitude between -180° and +180°. The **folium** library, however, positions points by latitude and longitude, so a conversion is needed."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"death_cholera_location = death_cholera_location.to_crs(epsg=4326)  # Reproject to latitude/longitude (WGS 84)"
]
},
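{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough illustration of the range argument, the sketch below tests whether a coordinate pair could be a longitude/latitude expressed in degrees. The helper name `could_be_lon_lat` is hypothetical and is not part of geopandas or folium; the two sample pairs are the first death location before and after reprojection. Passing the test does not prove that a pair is a longitude/latitude, it only rules out values that cannot be one."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def could_be_lon_lat(x, y):\n",
"    # Hypothetical helper: True if (x, y) fits the valid longitude/latitude ranges.\n",
"    return -180.0 <= x <= 180.0 and -90.0 <= y <= 90.0\n",
"\n",
"# First death location as stored in the shapefile (projected X/Y values).\n",
"print(could_be_lon_lat(529308.741, 181031.352))  # False\n",
"# The same location after to_crs(epsg=4326).\n",
"print(could_be_lon_lat(-0.13793, 51.51342))  # True"
]
},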
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let us check that the conversion worked."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" Id Count geometry\n",
"0 0 3 POINT (-0.13793 51.51342)\n",
"1 0 2 POINT (-0.13788 51.51336)\n",
"2 0 1 POINT (-0.13785 51.51332)\n",
"3 0 1 POINT (-0.13781 51.51326)\n",
"4 0 4 POINT (-0.13777 51.51320)\n",
".. .. ... ...\n",
"245 0 3 POINT (-0.13711 51.51453)\n",
"246 0 2 POINT (-0.13706 51.51471)\n",
"247 0 1 POINT (-0.13847 51.51231)\n",
"248 0 1 POINT (-0.13812 51.51200)\n",
"249 0 1 POINT (-0.13776 51.51186)\n",
"\n",
"[250 rows x 3 columns]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"death_cholera_location"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The coordinates now look much closer to what we need. Let us run a quick check on the map, using the first coordinates."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: folium in /opt/conda/lib/python3.6/site-packages (0.11.0)\n",
"Requirement already satisfied: requests in /opt/conda/lib/python3.6/site-packages (from folium) (2.23.0)\n",
"Requirement already satisfied: branca>=0.3.0 in /opt/conda/lib/python3.6/site-packages (from folium) (0.4.1)\n",
"Requirement already satisfied: jinja2>=2.9 in /opt/conda/lib/python3.6/site-packages (from folium) (2.11.0)\n",
"Requirement already satisfied: numpy in /opt/conda/lib/python3.6/site-packages (from folium) (1.19.1)\n",
"Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /opt/conda/lib/python3.6/site-packages (from requests->folium) (1.25.7)\n",
"Requirement already satisfied: idna<3,>=2.5 in /opt/conda/lib/python3.6/site-packages (from requests->folium) (2.9)\n",
"Requirement already satisfied: chardet<4,>=3.0.2 in /opt/conda/lib/python3.6/site-packages (from requests->folium) (3.0.4)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.6/site-packages (from requests->folium) (2020.4.5.1)\n",
"Requirement already satisfied: MarkupSafe>=0.23 in /opt/conda/lib/python3.6/site-packages (from jinja2>=2.9->folium) (1.1.1)\n",
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"pip install folium"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import folium as fl\n",
"# Values passed in POINT order (x, y), to see how folium interprets them.\n",
"m = fl.Map(location=[-0.13667, 51.51334], zoom_start=5)\n",
"m"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The coordinates appear to be wrong. According to the **geopandas** documentation, a POINT is (longitude, latitude), whereas folium expects `location=[latitude, longitude]`.\n",
"\n",
"Let us swap the values and check again."
]
},
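{
"cell_type": "markdown",
"metadata": {},
"source": [
"The swap can be captured in a small helper, sketched below. The name `to_folium_location` is hypothetical (it belongs to neither folium nor geopandas), and the sample point is the first death location shown earlier. It relies only on the `x` and `y` attributes of a shapely `Point`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from shapely.geometry import Point\n",
"\n",
"def to_folium_location(point):\n",
"    # A shapely point stores (x, y) = (longitude, latitude);\n",
"    # folium expects [latitude, longitude].\n",
"    return [point.y, point.x]\n",
"\n",
"first_death = Point(-0.13793, 51.51342)\n",
"print(to_folium_location(first_death))  # [51.51342, -0.13793]"
]
},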
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"m = fl.Map(location=[51.51342, -0.13793],zoom_start=10)\n",
"m"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We are indeed in London, so the coordinates appear to be correct.\n",
"\n",
"Now that we have checked the data, let us do the same for the list of pumps:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" Id geometry\n",
"0 0 POINT (-0.13667 51.51334)\n",
"1 0 POINT (-0.13959 51.51388)\n",
"2 0 POINT (-0.13967 51.51491)\n",
"3 0 POINT (-0.13163 51.51235)\n",
"4 0 POINT (-0.13359 51.51214)\n",
"5 0 POINT (-0.13592 51.51154)\n",
"6 0 POINT (-0.13396 51.51002)\n",
"7 0 POINT (-0.13820 51.51130)"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pumps_location = pumps_location.to_crs(epsg=4326)  # Reproject to latitude/longitude (WGS 84)\n",
"pumps_location"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Plotting the data on the map\n",
"Let us now draw circles whose radius is proportional to the number of deaths, and markers for the locations of the pumps.\n",
"\n",
"We reuse the previous map with a higher zoom level."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"m = fl.Map(location=[51.51334, -0.13667], zoom_start=17)\n",
"for index, death_row in death_cholera_location.iterrows():\n",
"    death_count = int(death_row['Count'])\n",
"    point = death_row['geometry']\n",
"    fl.Circle(\n",
"        radius=death_count,  # radius in metres: one metre per death\n",
"        location=[point.y, point.x],  # folium expects [latitude, longitude]\n",
"        tooltip=str(death_count),\n",
"        color='crimson',\n",
"        fill=True,\n",
"    ).add_to(m)\n",
"m"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"That covers the locations of the deaths; the radius of each circle corresponds to the number of deaths.\n",
"\n",
"Now let us add the pumps."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"for index, pump_row in pumps_location.iterrows():\n",
"    point = pump_row['geometry']\n",
"    # folium expects [latitude, longitude], i.e. [y, x]\n",
"    fl.Marker([point.y, point.x], popup='Pump ' + str(index)).add_to(m)\n",
"m"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can see that one pump sits at the centre of these deaths, close to an area with a large number of deaths. It is the pump on **Broadwick Street**: pump 0, the first in the list provided.\n",
"\n",
"### Another approach around the Broadwick Street pump\n",
"#### Number of deaths by distance from the pumps\n",
"\n",
"To highlight the role of the Broadwick Street pump in the cholera outbreak, we can also compare the number of deaths around each pump within a given radius. To do so, we will use the `haversine` library to compute the distance between two coordinates."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: haversine in /opt/conda/lib/python3.6/site-packages (2.2.0)\n",
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"pip install haversine"
]
},
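{
"cell_type": "markdown",
"metadata": {},
"source": [
"For readers who want to see what such a distance involves, here is a minimal sketch of the haversine (great-circle) formula in plain Python. The function name `haversine_m` and the Earth radius constant are assumptions made for this illustration, so the result may differ slightly from what the `haversine` package returns. The two points are pumps 0 and 1 from the table above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import math\n",
"\n",
"EARTH_RADIUS_M = 6371008.8  # assumed mean Earth radius, in metres\n",
"\n",
"def haversine_m(lat1, lon1, lat2, lon2):\n",
"    # Great-circle distance in metres between two (latitude, longitude) points in degrees.\n",
"    phi1, phi2 = math.radians(lat1), math.radians(lat2)\n",
"    dphi = phi2 - phi1\n",
"    dlam = math.radians(lon2 - lon1)\n",
"    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2\n",
"    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))\n",
"\n",
"# Distance between pump 0 and pump 1, rounded to the nearest metre.\n",
"print(round(haversine_m(51.51334, -0.13667, 51.51388, -0.13959)))"
]
},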
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To set a common radius around each pump, let us compute the mean distance between the pumps."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"347.1091983707074"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import haversine as hs\n",
"from haversine import Unit\n",
"import numpy as np\n",
"\n",
"list_range_mean = []\n",
"for index_pump, pump_row in pumps_location.iterrows():\n",
"    pump_point = pump_row['geometry']\n",
"    list_relative_distances = []\n",
"    # For each pump, compute the distance to every other pump.\n",
"    for index, other_row in pumps_location.iterrows():\n",
"        if index != index_pump:  # Exclude the reference pump itself.\n",
"            other_point = other_row['geometry']\n",
"            distance = hs.haversine(\n",
"                (other_point.y, other_point.x),\n",
"                (pump_point.y, pump_point.x),\n",
"                unit=Unit.METERS,\n",
"            )\n",
"            list_relative_distances.append(distance)\n",
"    # Mean distance from this pump to the others.\n",
"    list_range_mean.append(np.mean(list_relative_distances))\n",
"# The common radius is the mean of these per-pump means.\n",
"max_range = np.mean(list_range_mean)\n",
"# Display the result\n",
"max_range"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The pumps are therefore, on average, 347 metres apart.\n",
"\n",
"Let us count the deaths within a 347-metre radius of each pump."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" Pompe Index Latitude Longitude Nombre de décès dans un rayon de 300m\n",
"0 0.0 51.513341 -0.136668 489.0\n",
"1 1.0 51.513876 -0.139586 404.0\n",
"2 2.0 51.514906 -0.139671 339.0\n",
"3 3.0 51.512354 -0.131630 227.0\n",
"4 4.0 51.512139 -0.133594 404.0\n",
"5 5.0 51.511542 -0.135919 453.0\n",
"6 6.0 51.510019 -0.133962 81.0\n",
"7 7.0 51.511295 -0.138199 381.0"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas as pd\n",
"\n",
"# The last column label says 300 m, but the radius actually used is max_range (about 347 m).\n",
"columns = ['Pompe Index', 'Latitude', 'Longitude', 'Nombre de décès dans un rayon de 300m']\n",
"rows = []\n",
"for index_pump, pump_row in pumps_location.iterrows():\n",
"    death_count = 0\n",
"    pump_point = pump_row['geometry']\n",
"    for index, death_row in death_cholera_location.iterrows():\n",
"        death_point = death_row['geometry']\n",
"        distance = hs.haversine(\n",
"            (death_point.y, death_point.x),\n",
"            (pump_point.y, pump_point.x),\n",
"            unit=Unit.METERS,\n",
"        )\n",
"        if distance <= max_range:\n",
"            death_count += death_row['Count']\n",
"    rows.append([index_pump, pump_point.y, pump_point.x, death_count])\n",
"# Build the frame in one go (DataFrame.append was removed in pandas 2.0);\n",
"# dtype=float keeps every column numeric, as in the output shown below.\n",
"df_pump_death = pd.DataFrame(rows, columns=columns, dtype=float)\n",
"df_pump_death"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can see straight away that the pump with the most deaths within the 347-metre radius is the one at index 0, which is the Broadwick Street pump.\n",
"\n",
"Let us visualise the result on the map, varying the opacity with the number of deaths."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"m = fl.Map(location=[51.51334, -0.13667], zoom_start=15)\n",
"max_dead = np.max(df_pump_death['Nombre de décès dans un rayon de 300m'])\n",
"for index, pump_row in df_pump_death.iterrows():\n",
"    pump_deaths = pump_row['Nombre de décès dans un rayon de 300m']\n",
"    fl.Circle(\n",
"        radius=max_range,\n",
"        stroke=True,\n",
"        weight=5,\n",
"        location=[pump_row['Latitude'], pump_row['Longitude']],  # [latitude, longitude]\n",
"        tooltip=str(int(pump_deaths)),\n",
"        color='crimson',\n",
"        # Outline opacity scales with the death count; the fill keeps folium's\n",
"        # default opacity so that the map stays readable.\n",
"        opacity=pump_deaths / max_dead,\n",
"        fill=True,\n",
"    ).add_to(m)\n",
"    fl.Marker([pump_row['Latitude'], pump_row['Longitude']], popup='Pump ' + str(index)).add_to(m)\n",
"m"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let us display only the radius around the Broadwick Street pump, together with the locations of the deaths."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"m = fl.Map(location=[51.51334, -0.13667], zoom_start=16)\n",
"# Locations of the deaths\n",
"for index, death_row in death_cholera_location.iterrows():\n",
"    death_count = int(death_row['Count'])\n",
"    point = death_row['geometry']\n",
"    fl.Circle(\n",
"        radius=death_count,\n",
"        location=[point.y, point.x],  # folium expects [latitude, longitude]\n",
"        tooltip=str(death_count),\n",
"        color='crimson',\n",
"        fill=True,\n",
"    ).add_to(m)\n",
"# Broadwick Street pump (row 0); the columns are index, latitude, longitude, death count.\n",
"_, broadwick_lat, broadwick_lon, broadwick_deaths = df_pump_death.values[0]\n",
"fl.Circle(\n",
"    radius=max_range,\n",
"    stroke=True,\n",
"    weight=5,\n",
"    location=[broadwick_lat, broadwick_lon],\n",
"    tooltip=str(int(broadwick_deaths)),\n",
"    color='orange',\n",
"    # The fill keeps folium's default opacity so the death circles stay visible.\n",
"    opacity=broadwick_deaths / max_dead,\n",
"    fill=True,\n",
").add_to(m)\n",
"fl.Marker([broadwick_lat, broadwick_lon], popup='Broadwick Street pump').add_to(m)\n",
"m"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can see that the Broadwick Street pump is indeed at the centre of this outbreak.\n",
"\n",
"#### Bonus: displaying the deaths as a heatmap"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from folium.plugins import HeatMap\n",
"\n",
"# One row per death location, built directly from the geometry column.\n",
"df_death = pd.DataFrame({\n",
"    'Latitude': death_cholera_location.geometry.y,\n",
"    'Longitude': death_cholera_location.geometry.x,\n",
"    'Nb décès': death_cholera_location['Count'],\n",
"})\n",
"# Sum the deaths recorded at identical coordinates; each entry is [latitude, longitude, weight].\n",
"heat_data = df_death.groupby(['Latitude', 'Longitude'], as_index=False).sum().values.tolist()\n",
"\n",
"m = fl.Map(location=[51.51334, -0.13667], zoom_start=16)\n",
"HeatMap(data=heat_data, radius=20).add_to(m)\n",
"fl.Marker([51.51334, -0.13667], popup='Broadwick Street pump').add_to(m)  # [latitude, longitude]\n",
"m"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}