" # Use the requests module to fetch the data online\n",
" archive = requests.get(url)\n",
" # The file is a .gz archive; decompress it with the gzip module\n",
" content = gzip.decompress(archive.content)\n",
" \n",
" open(filename,'wb').write(content)\n",
" print(f\"Downloaded {url} and extracted it to {filename}.\")\n",
" else:\n",
...
...
@@ -108,150 +119,551 @@
" download_archive(filename, url)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Reading the data\n",
"We now load the output of the `ping` tool into a `pandas` DataFrame.\n",
"\n",
"Because the format is fairly simple, this can be done using only Python's basic string methods.\n",
"\n",
"Each line has the following form:\n",
"```\n",
"[1421761682.052172] 665 bytes from lig-publig.imag.fr (129.88.11.7): icmp_seq=1 ttl=60 time=22.5 ms\n",
"```\n",
"We extract only the fields of interest:\n",
"\n",
" * the measurement date (in seconds since January 1, 1970), from the 2nd to the 18th character\n",
" * the message size (in bytes), which is followed by the substring `\" bytes\"`\n",
" * the response time (in milliseconds), which is preceded by `\"time=\"` and followed by `\" ms\"`"
]
},
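The three extraction rules described in the cell above can be sketched with plain string methods. This is a minimal illustration, not the notebook's actual parsing cell: the function name `parse_ping_line` and the `incomplete` test line are hypothetical, and returning `None` for lines that lack a field is an assumption about how incomplete measurements should be handled.

```python
def parse_ping_line(line):
    """Return (timestamp, size_bytes, time_ms), or None if a field is missing.

    Hypothetical helper: the name and the None convention are assumptions.
    """
    try:
        # Date: 2nd to 18th character (1-indexed), i.e. the slice [1:18]
        timestamp = float(line[1:18])

        # Size: the token immediately before the substring " bytes"
        size_end = line.index(" bytes")
        size_start = line.rindex(" ", 0, size_end) + 1
        size = int(line[size_start:size_end])

        # Response time: the text between "time=" and " ms"
        time_start = line.index("time=") + len("time=")
        time_end = line.index(" ms", time_start)
        time_ms = float(line[time_start:time_end])
    except ValueError:
        # str.index/rindex, int() and float() all raise ValueError on failure
        return None
    return timestamp, size, time_ms


# Example line taken from the text above
sample = ("[1421761682.052172] 665 bytes from lig-publig.imag.fr "
          "(129.88.11.7): icmp_seq=1 ttl=60 time=22.5 ms")
# Made-up line without a "time=" field, for illustration only
incomplete = ("[1421761682.052172] 665 bytes from lig-publig.imag.fr "
              "(129.88.11.7): icmp_seq=1 ttl=60")

parsed = parse_ping_line(sample)
skipped = parse_ping_line(incomplete)
```

The resulting tuples can then be collected into a list and passed to `pandas.DataFrame` with a `columns=` argument to obtain the table. Slicing by fixed position for the date relies on every timestamp having the same width as in the sample line.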
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 59,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Help on Response in module requests.models object:\n",
"\n",
"class Response(builtins.object)\n",
" | The :class:`Response <Response>` object, which contains a\n",
" | server's response to an HTTP request.\n",
" | \n",
" | Methods defined here:\n",
" | \n",
" | __bool__(self)\n",
" | Returns True if :attr:`status_code` is less than 400.\n",
" | \n",
" | This attribute checks if the status code of the response is between\n",
" | 400 and 600 to see if there was a client error or a server error. If\n",
" | the status code, is between 200 and 400, this will return True. This\n",
" | is **not** a check to see if the response code is ``200 OK``.\n",
" | \n",
" | __enter__(self)\n",
" | \n",
" | __exit__(self, *args)\n",
" | \n",
" | __getstate__(self)\n",
" | \n",
" | __init__(self)\n",
" | Initialize self. See help(type(self)) for accurate signature.\n",
" | \n",
" | __iter__(self)\n",
" | Allows you to use a response as an iterator.\n",
" | \n",
" | __nonzero__(self)\n",
" | Returns True if :attr:`status_code` is less than 400.\n",
" | \n",
" | This attribute checks if the status code of the response is between\n",
" | 400 and 600 to see if there was a client error or a server error. If\n",
" | the status code, is between 200 and 400, this will return True. This\n",
" | is **not** a check to see if the response code is ``200 OK``.\n",
" | \n",
" | __repr__(self)\n",
" | Return repr(self).\n",
" | \n",
" | __setstate__(self, state)\n",
" | \n",
" | close(self)\n",
" | Releases the connection back to the pool. Once this method has been\n",
" | called the underlying ``raw`` object must not be accessed again.\n",
" | \n",
" | *Note: Should not normally need to be called explicitly.*\n",