Commit 0b524ce9 authored by Emile Siboulet

fonctionnement

parent 370df9fa
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="fr" xml:lang="fr">
<head>
<!-- 2023-10-09 lun. 19:49 -->
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Topic 5: Analysis of the Dialogue in Molière's L'Avare</title>
<meta name="author" content="Émile Siboulet" />
<meta name="generator" content="Org Mode" />
<style>
#content { max-width: 60em; margin: auto; }
.title { text-align: center;
margin-bottom: .2em; }
.subtitle { text-align: center;
font-size: medium;
font-weight: bold;
margin-top:0; }
.todo { font-family: monospace; color: red; }
.done { font-family: monospace; color: green; }
.priority { font-family: monospace; color: orange; }
.tag { background-color: #eee; font-family: monospace;
padding: 2px; font-size: 80%; font-weight: normal; }
.timestamp { color: #bebebe; }
.timestamp-kwd { color: #5f9ea0; }
.org-right { margin-left: auto; margin-right: 0px; text-align: right; }
.org-left { margin-left: 0px; margin-right: auto; text-align: left; }
.org-center { margin-left: auto; margin-right: auto; text-align: center; }
.underline { text-decoration: underline; }
#postamble p, #preamble p { font-size: 90%; margin: .2em; }
p.verse { margin-left: 3%; }
pre {
border: 1px solid #e6e6e6;
border-radius: 3px;
background-color: #f2f2f2;
padding: 8pt;
font-family: monospace;
overflow: auto;
margin: 1.2em;
}
pre.src {
position: relative;
overflow: auto;
}
pre.src:before {
display: none;
position: absolute;
top: -8px;
right: 12px;
padding: 3px;
color: #555;
background-color: #f2f2f299;
}
pre.src:hover:before { display: inline; margin-top: 14px;}
/* Languages per Org manual */
pre.src-asymptote:before { content: 'Asymptote'; }
pre.src-awk:before { content: 'Awk'; }
pre.src-authinfo::before { content: 'Authinfo'; }
pre.src-C:before { content: 'C'; }
/* pre.src-C++ doesn't work in CSS */
pre.src-clojure:before { content: 'Clojure'; }
pre.src-css:before { content: 'CSS'; }
pre.src-D:before { content: 'D'; }
pre.src-ditaa:before { content: 'ditaa'; }
pre.src-dot:before { content: 'Graphviz'; }
pre.src-calc:before { content: 'Emacs Calc'; }
pre.src-emacs-lisp:before { content: 'Emacs Lisp'; }
pre.src-fortran:before { content: 'Fortran'; }
pre.src-gnuplot:before { content: 'gnuplot'; }
pre.src-haskell:before { content: 'Haskell'; }
pre.src-hledger:before { content: 'hledger'; }
pre.src-java:before { content: 'Java'; }
pre.src-js:before { content: 'Javascript'; }
pre.src-latex:before { content: 'LaTeX'; }
pre.src-ledger:before { content: 'Ledger'; }
pre.src-lisp:before { content: 'Lisp'; }
pre.src-lilypond:before { content: 'Lilypond'; }
pre.src-lua:before { content: 'Lua'; }
pre.src-matlab:before { content: 'MATLAB'; }
pre.src-mscgen:before { content: 'Mscgen'; }
pre.src-ocaml:before { content: 'Objective Caml'; }
pre.src-octave:before { content: 'Octave'; }
pre.src-org:before { content: 'Org mode'; }
pre.src-oz:before { content: 'OZ'; }
pre.src-plantuml:before { content: 'Plantuml'; }
pre.src-processing:before { content: 'Processing.js'; }
pre.src-python:before { content: 'Python'; }
pre.src-R:before { content: 'R'; }
pre.src-ruby:before { content: 'Ruby'; }
pre.src-sass:before { content: 'Sass'; }
pre.src-scheme:before { content: 'Scheme'; }
pre.src-screen:before { content: 'Gnu Screen'; }
pre.src-sed:before { content: 'Sed'; }
pre.src-sh:before { content: 'shell'; }
pre.src-sql:before { content: 'SQL'; }
pre.src-sqlite:before { content: 'SQLite'; }
/* additional languages in org.el's org-babel-load-languages alist */
pre.src-forth:before { content: 'Forth'; }
pre.src-io:before { content: 'IO'; }
pre.src-J:before { content: 'J'; }
pre.src-makefile:before { content: 'Makefile'; }
pre.src-maxima:before { content: 'Maxima'; }
pre.src-perl:before { content: 'Perl'; }
pre.src-picolisp:before { content: 'Pico Lisp'; }
pre.src-scala:before { content: 'Scala'; }
pre.src-shell:before { content: 'Shell Script'; }
pre.src-ebnf2ps:before { content: 'ebfn2ps'; }
/* additional language identifiers per "defun org-babel-execute"
in ob-*.el */
pre.src-cpp:before { content: 'C++'; }
pre.src-abc:before { content: 'ABC'; }
pre.src-coq:before { content: 'Coq'; }
pre.src-groovy:before { content: 'Groovy'; }
/* additional language identifiers from org-babel-shell-names in
ob-shell.el: ob-shell is the only babel language using a lambda to put
the execution function name together. */
pre.src-bash:before { content: 'bash'; }
pre.src-csh:before { content: 'csh'; }
pre.src-ash:before { content: 'ash'; }
pre.src-dash:before { content: 'dash'; }
pre.src-ksh:before { content: 'ksh'; }
pre.src-mksh:before { content: 'mksh'; }
pre.src-posh:before { content: 'posh'; }
/* Additional Emacs modes also supported by the LaTeX listings package */
pre.src-ada:before { content: 'Ada'; }
pre.src-asm:before { content: 'Assembler'; }
pre.src-caml:before { content: 'Caml'; }
pre.src-delphi:before { content: 'Delphi'; }
pre.src-html:before { content: 'HTML'; }
pre.src-idl:before { content: 'IDL'; }
pre.src-mercury:before { content: 'Mercury'; }
pre.src-metapost:before { content: 'MetaPost'; }
pre.src-modula-2:before { content: 'Modula-2'; }
pre.src-pascal:before { content: 'Pascal'; }
pre.src-ps:before { content: 'PostScript'; }
pre.src-prolog:before { content: 'Prolog'; }
pre.src-simula:before { content: 'Simula'; }
pre.src-tcl:before { content: 'tcl'; }
pre.src-tex:before { content: 'TeX'; }
pre.src-plain-tex:before { content: 'Plain TeX'; }
pre.src-verilog:before { content: 'Verilog'; }
pre.src-vhdl:before { content: 'VHDL'; }
pre.src-xml:before { content: 'XML'; }
pre.src-nxml:before { content: 'XML'; }
/* add a generic configuration mode; LaTeX export needs an additional
(add-to-list 'org-latex-listings-langs '(conf " ")) in .emacs */
pre.src-conf:before { content: 'Configuration File'; }
table { border-collapse:collapse; }
caption.t-above { caption-side: top; }
caption.t-bottom { caption-side: bottom; }
td, th { vertical-align:top; }
th.org-right { text-align: center; }
th.org-left { text-align: center; }
th.org-center { text-align: center; }
td.org-right { text-align: right; }
td.org-left { text-align: left; }
td.org-center { text-align: center; }
dt { font-weight: bold; }
.footpara { display: inline; }
.footdef { margin-bottom: 1em; }
.figure { padding: 1em; }
.figure p { text-align: center; }
.equation-container {
display: table;
text-align: center;
width: 100%;
}
.equation {
vertical-align: middle;
}
.equation-label {
display: table-cell;
text-align: right;
vertical-align: middle;
}
.inlinetask {
padding: 10px;
border: 2px solid gray;
margin: 10px;
background: #ffffcc;
}
#org-div-home-and-up
{ text-align: right; font-size: 70%; white-space: nowrap; }
textarea { overflow-x: auto; }
.linenr { font-size: smaller }
.code-highlighted { background-color: #ffff00; }
.org-info-js_info-navigation { border-style: none; }
#org-info-js_console-label
{ font-size: 10px; font-weight: bold; white-space: nowrap; }
.org-info-js_search-highlight
{ background-color: #ffff00; color: #000000; font-weight: bold; }
.org-svg { }
</style>
<link rel="stylesheet" type="text/css" href="http://www.pirilampo.org/styles/readtheorg/css/htmlize.css"/>
<link rel="stylesheet" type="text/css" href="http://www.pirilampo.org/styles/readtheorg/css/readtheorg.css"/>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.3/jquery.min.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.4/js/bootstrap.min.js"></script>
<script type="text/javascript" src="http://www.pirilampo.org/styles/lib/js/jquery.stickytableheaders.js"></script>
<script type="text/javascript" src="http://www.pirilampo.org/styles/readtheorg/js/readtheorg.js"></script>
</head>
<body>
<div id="content" class="content">
<h1 class="title">Topic 5: Analysis of the Dialogue in Molière&rsquo;s L&rsquo;Avare</h1>
<div id="table-of-contents" role="doc-toc">
<h2>Table of Contents</h2>
<div id="text-table-of-contents" role="doc-toc">
<ul>
<li><a href="#org3cfd910">1. Data acquisition</a></li>
<li><a href="#org85b6b34">2. Splitting the text by character</a></li>
<li><a href="#orgf54370c">3. Statistical analysis</a></li>
<li><a href="#org48101cb">4. Graphical export of the results</a></li>
</ul>
</div>
</div>
<div id="outline-container-org3cfd910" class="outline-2">
<h2 id="org3cfd910"><span class="section-number-2">1.</span> Data acquisition</h2>
<div class="outline-text-2" id="text-1">
<p>
First, we download the text of Molière&rsquo;s play. We use the Markdown version, which is sufficient for this study and contains no data irrelevant to our analysis.
</p>
<div class="org-src-container">
<pre class="src src-shell"> wget -O data.md http://dramacode.github.io/markdown/moliere_avare.txt
</pre>
</div>
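<p>
The same download can also be done from Python instead of the shell. The block below is only a minimal sketch using the standard library: the helper name <code>fetch_text</code> is hypothetical and not part of the original workflow, and it skips the download when the destination file already exists.
</p>
<div class="org-src-container">
<pre class="src src-python">
```python
import urllib.request
from pathlib import Path

# URL taken from the wget command above.
URL = "http://dramacode.github.io/markdown/moliere_avare.txt"


def fetch_text(url, dest):
    """Download url to dest unless dest already exists; return dest as a Path."""
    dest = Path(dest)
    if not dest.exists():
        urllib.request.urlretrieve(url, dest)
    return dest
```
</pre>
</div>
<p>
Calling <code>fetch_text(URL, "data.md")</code> would produce the same <code>data.md</code> file as the <code>wget</code> command.
</p>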
</div>
</div>
<div id="outline-container-org85b6b34" class="outline-2">
<h2 id="org85b6b34"><span class="section-number-2">2.</span> Splitting the text by character</h2>
<div class="outline-text-2" id="text-2">
<p>
We start by opening the downloaded file and printing an excerpt, which lets us check that the data is intact.
</p>
<div class="org-src-container">
<pre class="src src-python"><span style="color: #ff98a4;">f</span> = <span style="color: #c099ff;">open</span>(<span style="color: #c3e88d;">"data.md"</span>, <span style="color: #c3e88d;">"r"</span>)
<span style="color: #ff98a4;">texte</span> = f.read()
f.close()
<span style="color: #c099ff;">print</span>(texte[<span style="color: #ff995e; font-weight: bold;">1324</span>:<span style="color: #ff995e; font-weight: bold;">2290</span>])
</pre>
</div>
<pre class="example" id="org3107061">
ÉLISE.
Non, Valère, je ne puis pas me repentir de tout ce que je fais pour vous. Je m'y sens entraîner par une trop douce puissance, et je n'ai pas même la force de souhaiter que les choses ne fussent pas. Mais, à vous dire vrai, le succès me donne de l'inquiétude ; et je crains fort de vous aimer un peu plus que je ne devrais.
VALÈRE.
Hé que pouvez-vous craindre, Élise, dans les bontés que vous avez pour moi ?
ÉLISE.
Hélas ! cent choses à la fois : L'emportement d'un Père ; les reproches d'une Famille ; les censures du monde ; mais plus que tout, Valère, le changement de votre cœur ; et cette froideur criminelle dont ceux de votre Sexe payent le plus souvent les témoignages trop ardents d'une innocente amour.
VALÈRE.
Ah ! ne me faites pas ce tort, de juger de moi par les autres.Soupçonnez-moi de tout, Élise, plutôt que de manquer à ce que je vous dois.Je vous aime trop pour cela ; et mon amour pour vous, durera autant que ma vie.
</pre>
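<p>
A slightly more defensive way to read the file is to use a context manager and an explicit encoding, so that the file is closed even if reading fails and accented characters are decoded the same way on every platform. This is only a sketch: the helper name <code>read_excerpt</code> is hypothetical, and UTF-8 is an assumption about the downloaded file.
</p>
<div class="org-src-container">
<pre class="src src-python">
```python
def read_excerpt(path, start, stop):
    """Return characters start..stop of the file at path, decoded as UTF-8."""
    # The with-block closes the file automatically, even if read() raises.
    with open(path, "r", encoding="utf-8") as f:
        texte = f.read()
    return texte[start:stop]
```
</pre>
</div>
<p>
Under that assumption, <code>print(read_excerpt("data.md", 1324, 2290))</code> would display the same excerpt as above.
</p>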
<p>
Next, we store the data in a list with one element per scene. Each element is itself a list of (character, list of words) tuples, one tuple per line of dialogue.
</p>
<p>
Three regular expressions split the text as needed: <code>reScene</code> isolates the body of each scene, <code>reReplique</code> extracts each line of dialogue together with its speaker, and <code>reMot</code> extracts the individual words.
</p>
<div class="org-src-container">
<pre class="src src-python"><span style="color: #c099ff;">import</span> re <span style="color: #7a88cf;"># </span><span style="color: #7a88cf;">standard-library regex module</span>
<span style="color: #ff98a4;">texte</span> = texte.replace(<span style="color: #c3e88d;">"\t"</span>, <span style="color: #c3e88d;">"@"</span>) <span style="color: #7a88cf;"># </span><span style="color: #7a88cf;">replace tabs with @ to simplify parsing</span>
<span style="color: #ff98a4;">f</span> = <span style="color: #c099ff;">open</span>(<span style="color: #c3e88d;">"data2.md"</span>, <span style="color: #c3e88d;">"w"</span>)
f.write(texte)
f.close()
<span style="color: #7a88cf;"># </span><span style="color: #7a88cf;">group 1: body of a scene, from the first period after "###" up to the next "#"</span>
<span style="color: #ff98a4;">reScene</span> = re.<span style="color: #c099ff;">compile</span>(r<span style="color: #c3e88d;">'\#\#\#[^\.]+([^\#]+)'</span>)
<span style="color: #7a88cf;"># </span><span style="color: #7a88cf;">group 1: speaker name, group 3: spoken text up to the next @</span>
<span style="color: #ff98a4;">reReplique</span> = re.<span style="color: #c099ff;">compile</span>(r<span style="color: #c3e88d;">'@([^\.^\,^\*]+)(\.| \*[^\.]+)([^\@]+)'</span>)
<span style="color: #ff98a4;">reMot</span> = re.<span style="color: #c099ff;">compile</span>(r<span style="color: #c3e88d;">'\w+'</span>)
<span style="color: #ff98a4;">Scenes</span> = []
<span style="color: #c099ff;">for</span> scene <span style="color: #c099ff;">in</span> reScene.finditer(texte) :
    <span style="color: #ff98a4;">Repliques</span> = []
    <span style="color: #c099ff;">for</span> replique <span style="color: #c099ff;">in</span> reReplique.finditer(scene.group(<span style="color: #ff995e; font-weight: bold;">1</span>)) :
        <span style="color: #ff98a4;">Mots</span> = []
        <span style="color: #c099ff;">for</span> mot <span style="color: #c099ff;">in</span> reMot.finditer(replique.group(<span style="color: #ff995e; font-weight: bold;">3</span>)) :
            Mots.append(mot.group().lower())
        Repliques.append((replique.group(<span style="color: #ff995e; font-weight: bold;">1</span>), Mots))
    Scenes.append(Repliques)
<span style="color: #c099ff;">print</span>(Scenes[<span style="color: #ff995e; font-weight: bold;">16</span>][<span style="color: #ff995e; font-weight: bold;">3</span>]) <span style="color: #7a88cf;"># </span><span style="color: #7a88cf;">visual check of the data</span>
</pre>
</div>
<pre class="example">
('HARPAGON', ['voilà', 'un', 'compliment', 'bien', 'impertinent', 'quelle', 'belle', 'confession', 'à', 'lui', 'faire'])
</pre>
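<p>
To see what each pattern captures, it can help to run the same three regular expressions on a tiny input. The sketch below is an illustration only: the sample text is made up, the helper name <code>decouper</code> is hypothetical, and the layout (scene headings starting with <code>###</code>, speaker lines indented with a tab) is an assumption about the format of <code>data.md</code>.
</p>
<div class="org-src-container">
<pre class="src src-python">
```python
import re

# Same three patterns as in the block above.
reScene = re.compile(r'\#\#\#[^\.]+([^\#]+)')
reReplique = re.compile(r'@([^\.^\,^\*]+)(\.| \*[^\.]+)([^\@]+)')
reMot = re.compile(r'\w+')


def decouper(texte):
    """Return a list of scenes, each a list of (speaker, words) tuples."""
    # Mark speaker lines with @ (assumes they are tab-indented).
    texte = texte.replace("\t", "@")
    scenes = []
    for scene in reScene.finditer(texte):
        repliques = []
        for replique in reReplique.finditer(scene.group(1)):
            mots = [m.group().lower() for m in reMot.finditer(replique.group(3))]
            repliques.append((replique.group(1), mots))
        scenes.append(repliques)
    return scenes


# Made-up miniature input, not taken from data.md.
sample = (
    "### Scène I.\n"
    "\tÉLISE.\n"
    "Non, Valère.\n"
    "\tVALÈRE.\n"
    "Hé que pouvez-vous craindre ?\n"
    "### Scène II.\n"
    "\tCLÉANTE.\n"
    "Je suis bien aise.\n"
)
for numero, scene in enumerate(decouper(sample)):
    print(numero, scene)
```
</pre>
</div>
<p>
Each scene comes back as its own list, and punctuation disappears because <code>\w+</code> only matches runs of word characters.
</p>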
</div>
</div>
<div id="outline-container-orgf54370c" class="outline-2">
<h2 id="orgf54370c"><span class="section-number-2">3.</span> Statistical analysis</h2>
<div class="outline-text-2" id="text-3">
<p>
All that remains is to count the words spoken by each character in each scene.
</p>
<div class="org-src-container">
<pre class="src src-python"><span style="color: #ff98a4;">CompteurSceneMots</span> = []
<span style="color: #ff98a4;">set_nom</span> = <span style="color: #c099ff;">set</span>()
<span style="color: #c099ff;">for</span> scene <span style="color: #c099ff;">in</span> <span style="color: #ff98a4;">Scenes</span> :
    CompteurMots = {} <span style="color: #7a88cf;"># </span><span style="color: #7a88cf;">word total per speaker for this scene</span>
    <span style="color: #c099ff;">for</span> replique <span style="color: #c099ff;">in</span> <span style="color: #ff98a4;">scene</span> :
        <span style="color: #ff98a4;">nom</span>, <span style="color: #ff98a4;">n_mots</span> = replique[<span style="color: #ff995e; font-weight: bold;">0</span>], <span style="color: #c099ff;">len</span>(replique[<span style="color: #ff995e; font-weight: bold;">1</span>])
        <span style="color: #c099ff;">if</span> nom <span style="color: #c099ff;">not</span> <span style="color: #c099ff;">in</span> CompteurMots.keys() :
            <span style="color: #ff98a4;">CompteurMots</span>[<span style="color: #ff98a4;">nom</span>] = n_mots
            set_nom.add(nom)
        <span style="color: #c099ff;">else</span> :
            CompteurMots[<span style="color: #ff98a4;">nom</span>] += n_mots
    <span style="color: #7a88cf;"># </span><span style="color: #7a88cf;">turn the dict into (speaker, total) pairs, largest total first</span>
    <span style="color: #ff98a4;">CompteurMots</span> = [(key, value) <span style="color: #c099ff;">for</span> key, value <span style="color: #c099ff;">in</span> CompteurMots.items()]
    CompteurMots.sort(key=<span style="color: #c099ff;">lambda</span> x: x[<span style="color: #ff995e; font-weight: bold;">1</span>], reverse=<span style="color: #ff995e;">True</span>)
    CompteurSceneMots.append(CompteurMots)
<span style="color: #ff98a4;">ListePersonnages</span> = <span style="color: #c099ff;">list</span>(set_nom)
<span style="color: #ff98a4;">ListeScenes</span> = [f<span style="color: #c3e88d;">"Scene </span>{i}<span style="color: #c3e88d;">"</span> <span style="color: #c099ff;">for</span> i <span style="color: #c099ff;">in</span> <span style="color: #c099ff;">range</span>(<span style="color: #c099ff;">len</span>(CompteurSceneMots))]
<span style="color: #c099ff;">print</span>(<span style="color: #c3e88d;">"\nListe des personnages :"</span>, ListePersonnages)
<span style="color: #c099ff;">print</span>(<span style="color: #c3e88d;">'\nListe des scenes :'</span>, ListeScenes)
</pre>
</div>
<pre class="example">
Liste des personnages : ['ANSELME', 'MARIANE', 'LA FLÈCHE', 'VALÈRE', 'HARPAGON', 'BRINDAVOINE', 'MAÎTRE SIMON', 'CLÉANTE', 'LA MERLUCHE', 'LE COMMISSAIRE', 'ÉLISE', 'MAÎTRE JACQUES', 'FROSINE']
Liste des scenes : ['Scene 0', 'Scene 1', 'Scene 2', 'Scene 3', 'Scene 4', 'Scene 5', 'Scene 6', 'Scene 7', 'Scene 8', 'Scene 9', 'Scene 10', 'Scene 11', 'Scene 12', 'Scene 13', 'Scene 14', 'Scene 15', 'Scene 16', 'Scene 17', 'Scene 18', 'Scene 19', 'Scene 20', 'Scene 21', 'Scene 22', 'Scene 23', 'Scene 24', 'Scene 25', 'Scene 26', 'Scene 27', 'Scene 28', 'Scene 29', 'Scene 30', 'Scene 31']
</pre>
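<p>
The same per-scene totals can be computed more compactly with <code>collections.Counter</code> from the standard library. The block below is a sketch of that alternative, not the code used for the results above; the function name <code>compter_mots</code> is hypothetical and the input is a made-up list with the same shape as <code>Scenes</code>.
</p>
<div class="org-src-container">
<pre class="src src-python">
```python
from collections import Counter


def compter_mots(scenes):
    """For each scene, return (speaker, word count) pairs, largest count first."""
    resultat = []
    for scene in scenes:
        compteur = Counter()
        for nom, mots in scene:
            compteur[nom] += len(mots)
        # most_common() returns the pairs sorted by decreasing count.
        resultat.append(compteur.most_common())
    return resultat


# Made-up miniature input with the same shape as Scenes.
exemple = [
    [("ÉLISE", ["non", "valère"]), ("VALÈRE", ["hé", "que"]), ("ÉLISE", ["hélas"])],
    [("HARPAGON", ["hors", "d", "ici"])],
]
print(compter_mots(exemple))
```
</pre>
</div>
<p>
A <code>Counter</code> starts every key at zero, which removes the need for the explicit <code>if nom not in ...</code> test.
</p>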
<p>
We now place these counts in a pandas DataFrame, with one row per scene and one column per character.
</p>
</div>
</div>
<div id="outline-container-org48101cb" class="outline-2">
<h2 id="org48101cb"><span class="section-number-2">4.</span> Graphical export of the results</h2>
<div class="outline-text-2" id="text-4">
<div class="org-src-container">
<pre class="src src-python"><span style="color: #c099ff;">import</span> pandas <span style="color: #c099ff;">as</span> pd
<span style="color: #7a88cf;"># </span><span style="color: #7a88cf;">start from a table of zeros: one row per scene, one column per character</span>
<span style="color: #ff98a4;">df</span> = pd.DataFrame(<span style="color: #ff995e; font-weight: bold;">0</span>, index=ListeScenes, columns=ListePersonnages)
<span style="color: #c099ff;">for</span> i, scene <span style="color: #c099ff;">in</span> <span style="color: #c099ff;">enumerate</span>(CompteurSceneMots) :
    <span style="color: #c099ff;">for</span> personnage, n_mots <span style="color: #c099ff;">in</span> <span style="color: #ff98a4;">scene</span> :
        <span style="color: #7a88cf;"># </span><span style="color: #7a88cf;">.loc avoids chained assignment, which may write to a temporary copy</span>
        df.loc[f<span style="color: #c3e88d;">"Scene </span>{i}<span style="color: #c3e88d;">"</span>, personnage] = n_mots
<span style="color: #c099ff;">print</span>(df)
</pre>
</div>
<pre class="example" id="org88bea8e">
          ANSELME  MARIANE  LA FLÈCHE  VALÈRE  HARPAGON  BRINDAVOINE  ...  CLÉANTE  LA MERLUCHE  LE COMMISSAIRE  ÉLISE  MAÎTRE JACQUES  FROSINE
Scene 0         0        0          0     630         0            0  ...        0            0               0    491               0        0
Scene 1         0        0          0       0         0            0  ...      762            0               0    154               0        0
Scene 2         0        0        258       0       492            0  ...        0            0               0      0               0        0
Scene 3         0        0          0       0      1160            0  ...      216            0               0    166               0        0
Scene 4         0        0          0     710       282            0  ...        0            0               0     36               0        0
Scene 5         0        0        903       0         0            0  ...      379            0               0      0               0        0
Scene 6         0        0         12       0       171            0  ...      127            0               0      0               0        0
Scene 7         0        0          0       0        21            0  ...        0            0               0      0               0        1
Scene 8         0        0        302       0         0            0  ...        0            0               0      0               0      130
Scene 9         0        0          0       0       555            0  ...        0            0               0      0               0     1510
Scene 10        0        0          0     272       777           23  ...       76           26               0      3             794        0
Scene 11        0        0          0     109         0            0  ...        0            0               0      0             198        0
Scene 12        0        0          0       0         0            0  ...        0            0               0      0              11       19
Scene 13        0      185          0       0         0            0  ...        0            0               0      0               0      191
Scene 14        0        0          0       0       105            0  ...        0            0               0      0               0       26
Scene 15        0       35          0       0        70            0  ...        0            0               0     17               0        9
Scene 16        0      235          0       5       178            0  ...      607            0               0      0               0       41
Scene 17        0        0          0       0        23           20  ...        0            0               0      0               0        0
Scene 18        0        0          0       7        77            0  ...       40           29               0      0               0        0
Scene 19        0      236          0       0         0            0  ...      245            0               0     58               0      436
Scene 20        0        0          0       0        54            0  ...       14            0               0      3               0        0
Scene 21        0        0          0       0       393            0  ...      418            0               0      0               0        0
Scene 22        0        0          0       0       162            0  ...      170            0               0      0             318        0
Scene 23        0        0          0       0       129            0  ...      163            0               0      0               0        0
Scene 24        0        0         47       0         0            0  ...       17            0               0      0               0        0
Scene 25        0        0          0       0       429            0  ...        0            0               0      0               0        0
Scene 26        0        0          0       0        89            0  ...        0            0             109      0               0        0
Scene 27        0        0          0       0       182            0  ...        0            0             159      0             306        0
Scene 28        0        0          0     641       441            0  ...        0            0               0      0              11        0
Scene 29        0        0          0      22       124            0  ...        0            0               0     11               7        4
Scene 30      403      192          0     354       258            0  ...        0            0               0      0               7        0
Scene 31      114       36          0       0        90            0  ...      130            0              26      0              23        0

[32 rows x 13 columns]
</pre>
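<p>
The table is simply the per-scene counts laid out on a fixed list of characters, with 0 wherever a character does not speak. As an illustration only, the same layout can be sketched with plain lists; the function name <code>tableau</code> is hypothetical, and the sample values are a two-scene excerpt from the visible columns of the table above.
</p>
<div class="org-src-container">
<pre class="src src-python">
```python
def tableau(compteur_scene_mots, personnages):
    """Return one row per scene with a word count for every character (0 if absent)."""
    lignes = []
    for scene in compteur_scene_mots:
        comptes = dict(scene)  # (name, count) pairs turned into a lookup table
        lignes.append([comptes.get(nom, 0) for nom in personnages])
    return lignes


# Two-scene excerpt with the same shape as CompteurSceneMots.
exemple = [[("VALÈRE", 630), ("ÉLISE", 491)], [("CLÉANTE", 762), ("ÉLISE", 154)]]
personnages = ["VALÈRE", "ÉLISE", "CLÉANTE"]
for numero, ligne in enumerate(tableau(exemple, personnages)):
    print(f"Scene {numero}", ligne)
```
</pre>
</div>
<p>
Passing an explicitly ordered list of characters (for example <code>sorted(set_nom)</code>) would also make the column order reproducible, since the iteration order of a set of strings can change between runs.
</p>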
<p>
We now plot the results as a stacked bar chart with Matplotlib.
</p>
<div class="org-src-container">
<pre class="src src-python"><span style="color: #c099ff;">import</span> matplotlib.pyplot <span style="color: #c099ff;">as</span> plt
<span style="color: #c099ff;">import</span> numpy <span style="color: #c099ff;">as</span> np
<span style="color: #ff98a4;">fig</span>, <span style="color: #ff98a4;">ax</span> = plt.subplots()
<span style="color: #ff98a4;">bottom</span> = np.zeros(<span style="color: #c099ff;">len</span>(ListeScenes))
<span style="color: #c099ff;">for</span> personnage <span style="color: #c099ff;">in</span> <span style="color: #ff98a4;">ListePersonnages</span> :
    p = ax.bar(ListeScenes, df[personnage].values, <span style="color: #ff995e; font-weight: bold;">0.7</span>, label=personnage, bottom=bottom)
    <span style="color: #ff98a4;">bottom</span> += np.array(df[personnage].values) <span style="color: #7a88cf;"># </span><span style="color: #7a88cf;">the next character is stacked on top</span>
<span style="color: #ff98a4;">figure</span> = plt.gcf()
ax.tick_params(axis=<span style="color: #c3e88d;">"x"</span>, labelrotation=<span style="color: #ff995e; font-weight: bold;">45</span>) <span style="color: #7a88cf;"># </span><span style="color: #7a88cf;">rotate the scene labels without replacing them</span>
figure.set_size_inches(<span style="color: #ff995e; font-weight: bold;">20</span>, <span style="color: #ff995e; font-weight: bold;">10</span>)
plt.legend()
plt.savefig(<span style="color: #c3e88d;">"fig.svg"</span>)
</pre>
</div>
<p>
This produces the following chart, which shows the number of words spoken by each character in each scene.
</p>
<div id="orgab2805c" class="figure">
<p><img src="./fig.svg" alt="fig.svg" class="org-svg" />
</p>
</div>
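<p>
The stacking relies on a running total: each character&rsquo;s bar starts at the height reached by the characters drawn before it. The block below is a minimal sketch of that bookkeeping with plain lists, independent of Matplotlib; the generator name <code>empiler</code> is hypothetical and the word counts are illustrative.
</p>
<div class="org-src-container">
<pre class="src src-python">
```python
def empiler(series):
    """Yield (label, bottoms, heights) so that bars stack without overlapping."""
    bottoms = None
    for label, heights in series:
        if bottoms is None:
            bottoms = [0] * len(heights)
        yield label, list(bottoms), list(heights)
        # The next bar starts where this one ends.
        bottoms = [b + h for b, h in zip(bottoms, heights)]


# Illustrative word counts for two scenes.
series = [("HARPAGON", [492, 1160]), ("CLÉANTE", [0, 216]), ("ÉLISE", [0, 166])]
for label, bottoms, heights in empiler(series):
    print(label, bottoms, heights)
```
</pre>
</div>
<p>
The <code>bottoms</code> list plays the same role as the <code>bottom</code> array passed to <code>ax.bar</code> in the plotting code.
</p>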
</div>
</div>
</div>
<div id="postamble" class="status">
<p class="date">Date: October 9</p>
<p class="author">Author: Émile Siboulet</p>
<p class="date">Created: 2023-10-09 lun. 19:49</p>
</div>
</body>
</html>
\ No newline at end of file
...@@ -52,9 +52,6 @@ Les différentes expressions régulières permettent le découpage du texte comm ...@@ -52,9 +52,6 @@ Les différentes expressions régulières permettent le découpage du texte comm
import re # librarie standard d'analyse regex import re # librarie standard d'analyse regex
texte = texte.replace(" ", "@") # Pour remplacer les tabulation par des @ pour simplifier le parçage texte = texte.replace(" ", "@") # Pour remplacer les tabulation par des @ pour simplifier le parçage
f = open("data2.md", "w")
f.write(texte)
f.close()
reScene = re.compile(r'\#\#\#[^\.]+([^\#]+)') reScene = re.compile(r'\#\#\#[^\.]+([^\#]+)')
reReplique = re.compile(r'@([^\.^\,^\*]+)(\.| \*[^\.]+)([^\@]+)') reReplique = re.compile(r'@([^\.^\,^\*]+)(\.| \*[^\.]+)([^\@]+)')
reMot = re.compile(r'\w+') reMot = re.compile(r'\w+')
...@@ -75,6 +72,8 @@ print(Scenes[16][3]) # vérification visuel des données ...@@ -75,6 +72,8 @@ print(Scenes[16][3]) # vérification visuel des données
#+RESULTS: #+RESULTS:
: ('HARPAGON', ['voilà', 'un', 'compliment', 'bien', 'impertinent', 'quelle', 'belle', 'confession', 'à', 'lui', 'faire']) : ('HARPAGON', ['voilà', 'un', 'compliment', 'bien', 'impertinent', 'quelle', 'belle', 'confession', 'à', 'lui', 'faire'])
* Étude statistique
Il ne reste plus qu'à compter les différents mots pour avoir le résultat attendu. Il ne reste plus qu'à compter les différents mots pour avoir le résultat attendu.
#+begin_src python :results output :session :exports both #+begin_src python :results output :session :exports both
...@@ -108,6 +107,8 @@ print('\nListe des scenes :', ListeScenes) ...@@ -108,6 +107,8 @@ print('\nListe des scenes :', ListeScenes)
: Liste des scenes : ['Scene 0', 'Scene 1', 'Scene 2', 'Scene 3', 'Scene 4', 'Scene 5', 'Scene 6', 'Scene 7', 'Scene 8', 'Scene 9', 'Scene 10', 'Scene 11', 'Scene 12', 'Scene 13', 'Scene 14', 'Scene 15', 'Scene 16', 'Scene 17', 'Scene 18', 'Scene 19', 'Scene 20', 'Scene 21', 'Scene 22', 'Scene 23', 'Scene 24', 'Scene 25', 'Scene 26', 'Scene 27', 'Scene 28', 'Scene 29', 'Scene 30', 'Scene 31'] : Liste des scenes : ['Scene 0', 'Scene 1', 'Scene 2', 'Scene 3', 'Scene 4', 'Scene 5', 'Scene 6', 'Scene 7', 'Scene 8', 'Scene 9', 'Scene 10', 'Scene 11', 'Scene 12', 'Scene 13', 'Scene 14', 'Scene 15', 'Scene 16', 'Scene 17', 'Scene 18', 'Scene 19', 'Scene 20', 'Scene 21', 'Scene 22', 'Scene 23', 'Scene 24', 'Scene 25', 'Scene 26', 'Scene 27', 'Scene 28', 'Scene 29', 'Scene 30', 'Scene 31']
Nous allons maintenant placer ces différentes informations dans un tableau pandas Nous allons maintenant placer ces différentes informations dans un tableau pandas
* Export graphique des résultats
#+begin_src python :results output :session :exports both #+begin_src python :results output :session :exports both
import pandas as pd import pandas as pd
df = pd.DataFrame(columns=ListePersonnages, index=ListeScenes).fillna(0) df = pd.DataFrame(columns=ListePersonnages, index=ListeScenes).fillna(0)
...@@ -158,7 +159,7 @@ Scene 31 0 114 0 90 0 23 ...@@ -158,7 +159,7 @@ Scene 31 0 114 0 90 0 23
Nous allons maintenant afficher les différents résultats avec MatPlotLib Nous allons maintenant afficher les différents résultats avec MatPlotLib
#+begin_src python :session #+begin_src python :results output :session :exports both
import matplotlib.pyplot as plt import matplotlib.pyplot as plt
import numpy as np import numpy as np
...@@ -176,6 +177,7 @@ plt.savefig("fig.svg") ...@@ -176,6 +177,7 @@ plt.savefig("fig.svg")
#+end_src #+end_src
#+RESULTS: #+RESULTS:
: None
Ce qui nous donne le graphique suivant. Il représente le nombre de mot en fonction de la scène des différents personnages
[[file:./fig.svg]] [[file:./fig.svg]]