question 1 complete

Commit c422a5fa authored May 31, 2023 by e436212dbb67ba8b493d344beb7f7acb

Showing with 302 additions and 6 deletions

module3/exo3/exercice.ipynb +302 -6
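
The last markdown cell in this diff describes a chart in which every scene is a row of equal length, the row height is the total number of words spoken in the scene, and each rectangle's width is the share of the scene a character occupies. A minimal sketch of that geometry, in plain Python and independent of any plotting library, might look like the following. The `scene` labels, the sample rows and the `scene_layout` helper are hypothetical; only `personnage` and `nombre_de_mots` come from the notebook, and one row per (scene, character) pair is assumed.

```python
from collections import defaultdict


def scene_layout(rows):
    """Compute one row of rectangles per scene.

    rows: iterable of (scene, personnage, nombre_de_mots) tuples,
          assumed to hold one entry per character and scene.
    Returns {scene: [rect, ...]} where every row spans x in [0, 1],
    each rect's width is the character's share of the scene's words,
    and each rect's height is the scene's total word count.
    """
    per_scene = defaultdict(list)
    totals = defaultdict(int)
    for scene, personnage, mots in rows:
        if mots > 0:  # silent characters get no rectangle
            per_scene[scene].append((personnage, mots))
            totals[scene] += mots

    layout = {}
    for scene, parts in per_scene.items():
        total = totals[scene]
        x = 0.0
        rects = []
        for personnage, mots in parts:
            width = mots / total  # fraction of the scene
            rects.append(
                {"personnage": personnage, "x": x, "width": width, "height": total}
            )
            x += width
        layout[scene] = rects
    return layout


# Hypothetical sample data, not taken from the play.
rows = [
    ("I.1", "VALERE", 300),
    ("I.1", "ELISE", 100),
    ("I.2", "CLEANTE", 50),
    ("I.2", "DAME CLAUDE", 0),
]
layout = scene_layout(rows)
```

With the notebook's DataFrame, the input could be built from the relevant columns (for example with `itertuples(index=False)`), assuming a column that identifies the scene exists. Each rectangle could then be drawn with `matplotlib.patches.Rectangle`, stacking rows vertically by accumulating their heights.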

--- a/module3/exo3/exercice.ipynb
+++ b/module3/exo3/exercice.ipynb
@@ -967,19 +967,315 @@
    "## Analyser les données"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let us rank the characters by how much they speak, by parsing the text (scenes / lines / words). In particular, who speaks the most? Who does not speak at all?"
+   ]
+  },
  {
   "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 98,
   "metadata": {},
-   "outputs": [],
+   "outputs": [
-   "source": []
+    {
+     "data": {
+      "text/plain": [
+       "HARPAGON          22\n",
+       "FROSINE           14\n",
+       "CLEANTE           14\n",
+       "ELISE             13\n",
+       "MARIANE           11\n",
+       "MAITRE JACQUES     8\n",
+       "VALERE             8\n",
+       "LA FLECHE          5\n",
+       "LE COMMISSAIRE     5\n",
+       "SON CLERC          5\n",
+       "BRINDAVOINE        2\n",
+       "LA MERLUCHE        2\n",
+       "DAME CLAUDE        1\n",
+       "MAITRE SIMON       1\n",
+       "ANSELME            1\n",
+       "Name: personnage, dtype: int64"
+      ]
+     },
+     "execution_count": 98,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df.personnage.value_counts()"
+   ]
  },
  {
   "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 103,
   "metadata": {},
-   "outputs": [],
+   "outputs": [
-   "source": []
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>nombre_de_mots</th>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>personnage</th>\n",
+       "      <th></th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>DAME CLAUDE</th>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>MAITRE JACQUES</th>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>MAITRE SIMON</th>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>SON CLERC</th>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>BRINDAVOINE</th>\n",
+       "      <td>38</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>LA MERLUCHE</th>\n",
+       "      <td>49</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>LE COMMISSAIRE</th>\n",
+       "      <td>258</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>ANSELME</th>\n",
+       "      <td>383</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>MARIANE</th>\n",
+       "      <td>819</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>ELISE</th>\n",
+       "      <td>893</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>LA FLECHE</th>\n",
+       "      <td>1419</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>FROSINE</th>\n",
+       "      <td>2033</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>VALERE</th>\n",
+       "      <td>2532</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>CLEANTE</th>\n",
+       "      <td>3046</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>HARPAGON</th>\n",
+       "      <td>5092</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "                nombre_de_mots\n",
+       "personnage                    \n",
+       "DAME CLAUDE                  0\n",
+       "MAITRE JACQUES               0\n",
+       "MAITRE SIMON                 0\n",
+       "SON CLERC                    0\n",
+       "BRINDAVOINE                 38\n",
+       "LA MERLUCHE                 49\n",
+       "LE COMMISSAIRE             258\n",
+       "ANSELME                    383\n",
+       "MARIANE                    819\n",
+       "ELISE                      893\n",
+       "LA FLECHE                 1419\n",
+       "FROSINE                   2033\n",
+       "VALERE                    2532\n",
+       "CLEANTE                   3046\n",
+       "HARPAGON                  5092"
+      ]
+     },
+     "execution_count": 103,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df[['nombre_de_mots','personnage']].groupby('personnage').sum().sort_values(by=['nombre_de_mots'])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 104,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>nombre_de_repliques</th>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>personnage</th>\n",
+       "      <th></th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>DAME CLAUDE</th>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>MAITRE JACQUES</th>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>MAITRE SIMON</th>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>SON CLERC</th>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>BRINDAVOINE</th>\n",
+       "      <td>3</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>LA MERLUCHE</th>\n",
+       "      <td>5</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>ANSELME</th>\n",
+       "      <td>14</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>LE COMMISSAIRE</th>\n",
+       "      <td>15</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>MARIANE</th>\n",
+       "      <td>26</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>ELISE</th>\n",
+       "      <td>50</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>FROSINE</th>\n",
+       "      <td>59</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>LA FLECHE</th>\n",
+       "      <td>64</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>VALERE</th>\n",
+       "      <td>99</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>CLEANTE</th>\n",
+       "      <td>156</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>HARPAGON</th>\n",
+       "      <td>334</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "                nombre_de_repliques\n",
+       "personnage                         \n",
+       "DAME CLAUDE                       0\n",
+       "MAITRE JACQUES                    0\n",
+       "MAITRE SIMON                      0\n",
+       "SON CLERC                         0\n",
+       "BRINDAVOINE                       3\n",
+       "LA MERLUCHE                       5\n",
+       "ANSELME                          14\n",
+       "LE COMMISSAIRE                   15\n",
+       "MARIANE                          26\n",
+       "ELISE                            50\n",
+       "FROSINE                          59\n",
+       "LA FLECHE                        64\n",
+       "VALERE                           99\n",
+       "CLEANTE                         156\n",
+       "HARPAGON                        334"
+      ]
+     },
+     "execution_count": 104,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df[['nombre_de_repliques','personnage']].groupby('personnage').sum().sort_values(by=['nombre_de_repliques'])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "These analyses show that Harpagon takes part in the largest number of scenes (22 out of 31). In terms of words spoken, Harpagon also speaks the most (5092 words), whereas Dame Claude, Maitre Jacques, Maitre Simon and the clerk have no words recorded at all. Harpagon also has the largest number of lines (334)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Produce a chart showing the number of words each actor speaks in each scene. For this, you can take inspiration from the OBVIL study of Molière's L'Avare (left-hand graph). In that chart, the rows are of equal length and the height represents the total number of words spoken in the scene. The width of each rectangle indicates the percentage of the scene that an actor occupies."
+   ]
  }
 ],
 "metadata": {