{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Analysis of chickenpox incidence" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import isoweek" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Phase 1: loading the data\n", "The data is downloaded from [here](https://www.sentiweb.fr/france/fr/?page=table).\n", "I chose to use only the Sentinelles data (without IQVIA), at the national level, and to rename the file \"inc-7-PAY.csv\" to \"varicelle.csv\"." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "url = 'https://www.sentiweb.fr/datasets/all/inc-7-PAY.csv'" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>week</th><th>indicator</th><th>inc</th><th>inc_low</th><th>inc_up</th><th>inc100</th><th>inc100_low</th><th>inc100_up</th><th>geo_insee</th><th>geo_name</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>202452</td><td>7</td><td>4952</td><td>1940</td><td>7964</td><td>7</td><td>2</td><td>12</td><td>FR</td><td>France</td></tr>\n",
"    <tr><th>1</th><td>202451</td><td>7</td><td>4705</td><td>2265</td><td>7145</td><td>7</td><td>3</td><td>11</td><td>FR</td><td>France</td></tr>\n",
"    <tr><th>2</th><td>202450</td><td>7</td><td>7363</td><td>4438</td><td>10288</td><td>11</td><td>7</td><td>15</td><td>FR</td><td>France</td></tr>\n",
"    <tr><th>3</th><td>202449</td><td>7</td><td>6077</td><td>3631</td><td>8523</td><td>9</td><td>5</td><td>13</td><td>FR</td><td>France</td></tr>\n",
"    <tr><th>4</th><td>202448</td><td>7</td><td>4189</td><td>1454</td><td>6924</td><td>6</td><td>2</td><td>10</td><td>FR</td><td>France</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " week indicator inc inc_low inc_up inc100 inc100_low inc100_up \\\n", "0 202452 7 4952 1940 7964 7 2 12 \n", "1 202451 7 4705 2265 7145 7 3 11 \n", "2 202450 7 7363 4438 10288 11 7 15 \n", "3 202449 7 6077 3631 8523 9 5 13 \n", "4 202448 7 4189 1454 6924 6 2 10 \n", "\n", " geo_insee geo_name \n", "0 FR France \n", "1 FR France \n", "2 FR France \n", "3 FR France \n", "4 FR France " ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "try:\n", " raw_data = pd.read_csv('./varicelle.csv',index=True)\n", "except:\n", " raw_data = pd.read_csv(url, skiprows=1)\n", " raw_data.to_csv('./varicelle.csv')\n", "raw_data.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pour la signification des colonnes, il faut vérifier le schéma csv, [ici](https://ns.sentiweb.fr/incidence/csv-schema-v1.json).\n", "À retenir : les incertitudes se font à 95%.\n", "La colonne \"inc100\" représente les incidences pour 100 000 habitants." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "On vérifie s'il y a des données manquantes : visiblement non." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "raw_data.isnull().any(axis=1).sum()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Pré-traitement des données\n", "Il faut adapter le format des semaines, qui n'est pas lisible en l'état." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "def convert_week(ynw_int):\n", "    '''Take an integer encoding the year and the week number (e.g. 202452)\n", "    and return the corresponding weekly pandas Period.'''\n", "    ynw_str = str(ynw_int)\n", "    y = int(ynw_str[:4])  # first four digits: year\n", "    w = int(ynw_str[4:])  # remaining digits: week number\n", "    week = isoweek.Week(y, w)\n", "    # week.day(0) is the Monday of that ISO week\n", "    return pd.Period(week.day(0), 'W')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }