From fd308c88c3614278ddbf0b114ecd8f307a1c2d86 Mon Sep 17 00:00:00 2001
From: Arnaud Legrand
Date: Wed, 7 Nov 2018 15:02:14 +0100
Subject: [PATCH] Org-mode examples

---
 module2/ressources/video_examples/README.org    |    49 +
 module2/ressources/video_examples/journal.org   |  2136 ++
 .../video_examples/labbook_several.org          | 29230 ++++++++++++++++
 .../video_examples/labbook_single.org           |  6906 ++++
 module2/ressources/video_examples/paper.org     |  2634 ++
 .../video_examples/technical_report.org         |   477 +
 6 files changed, 41432 insertions(+)
 create mode 100644 module2/ressources/video_examples/README.org
 create mode 100644 module2/ressources/video_examples/journal.org
 create mode 100644 module2/ressources/video_examples/labbook_several.org
 create mode 100644 module2/ressources/video_examples/labbook_single.org
 create mode 100644 module2/ressources/video_examples/paper.org
 create mode 100644 module2/ressources/video_examples/technical_report.org

diff --git a/module2/ressources/video_examples/README.org b/module2/ressources/video_examples/README.org
new file mode 100644
index 0000000..961538d
--- /dev/null
+++ b/module2/ressources/video_examples/README.org
@@ -0,0 +1,49 @@
+# -*- coding: utf-8 -*-
+# -*- mode: org -*-
+
+#+TITLE: Org document examples
+#+AUTHOR: Arnaud Legrand
+#+STARTUP: overview indent inlineimages logdrawer
+#+LANGUAGE: en
+
+In the MOOC video, I quickly demo how org-mode can be used in various
+contexts. Here are the (sometimes trimmed) corresponding
+org-files. These documents depend on many other external data files
+and are not meant to be reproducible documents themselves, but they
+will give you an idea of how such files can be organized:
+
+1. [[file:journal.org][journal.org]]: an excerpt (I've only left a few code samples and links
+   to some resources on R, Stats, ...) from my own journal. This is a
+   personal document where everything (meeting notes, hacking, random
+   thoughts, ...) goes by default. Entries are created with the =C-c c=
+   shortcut.
+2.
[[file:labbook_single.org][labbook_single.org]]: an excerpt from the laboratory notebook
+   [[https://cornebize.net/][Tom Cornebize]] wrote during his Master's thesis internship under my
+   supervision. This is a personal labbook. I consider this notebook
+   excellent: it had the ideal level of detail for us to communicate
+   without any ambiguity and for him to move forward with confidence.
+3. [[file:paper.org][paper.org]]: an ongoing paper based on the previous labbook of
+   Tom Cornebize. As such it is not reproducible, since there are hardcoded
+   paths and uncleaned dependencies, but writing it from the labbook was
+   very easy as we just had to cut and paste the parts we
+   needed. What may be interesting is the organization and the org
+   tricks used to export to the right LaTeX style.
+4. [[file:labbook_several.org][labbook_several.org]]: a labbook for a specific project shared
+   by several people. As a consequence, it starts with information
+   about installation and common scripts, has a section with notes about all
+   our meetings, a section with information about experiments, and
+   another one about analysis. Entries could have been labeled with who
+   wrote them, but there were only a few of us and this information was
+   available in git, so we did not bother. In such a labbook, it is common
+   to find annotations indicating that a given experiment was :FLAWED: because
+   it had some issues.
+5. [[file:technical_report.org][technical_report.org]]: a short technical document I wrote
+   after a colleague sent me a PDF describing an experiment he was
+   conducting and asked me how reproducible I felt it was. It
+   turned out I had to cut and paste the C code from the PDF, then
+   remove all the line numbers, fix the syntax, etc. Obviously I got
+   quite different performance results, but writing everything in
+   org-mode made it very easy to generate both HTML and PDF and to
+   explain explicitly how the measurements were done.
+ + diff --git a/module2/ressources/video_examples/journal.org b/module2/ressources/video_examples/journal.org new file mode 100644 index 0000000..7022387 --- /dev/null +++ b/module2/ressources/video_examples/journal.org @@ -0,0 +1,2136 @@ +# -*- coding: utf-8 -*- +#+TITLE: Blog +#+AUTHOR: Arnaud Legrand +#+HTML_HEAD: +#+STARTUP: overview indent inlineimages logdrawer +#+LANGUAGE: en +#+TAGS: Seminar(s) +#+TAGS: SG(s) WP1(1) WP2(2) WP3(3) WP4(4) WP5(5) WP6(6) WP7(7) WP8(8) WP0(0) Argonne(A) +#+TAGS: POLARIS(P) LIG(L) INRIA (I) HOME(H) Europe(E) +#+TAGS: twitter(t) +#+TAGS: Workload(w) BOINC(b) Blog noexport(n) Stats(S) +#+TAGS: BULL(B) +#+TAGS: autotuning(a) +#+TAGS: Epistemology(E) Vulgarization(V) Teaching(T) +#+TAGS: R(R) Python(p) OrgMode(O) HACSPECIS(h) +#+PROPERTY: header-args :eval never-export +#+EXPORT_SELECT_TAGS: Blog +#+OPTIONS: H:3 num:t toc:t \n:nil @:t ::t |:t ^:t -:t f:t *:t <:t +#+OPTIONS: TeX:t LaTeX:nil skip:nil d:nil todo:t pri:nil tags:not-in-toc +#+LATEX_HEADER: %\usepackage{palatino,a4wide,eurosym,graphicx}\usepackage[francais]{babel} +#+INFOJS_OPT: view:nil toc:nil ltoc:t mouse:underline buttons:0 path:http://orgmode.org/org-info.js +#+EXPORT_SELECT_TAGS: export +#+EXPORT_EXCLUDE_TAGS: noexport +#+EPRESENT_FRAME_LEVEL: 2 +#+COLUMNS: %25ITEM %TODO %3PRIORITY %TAGS +#+SEQ_TODO: TODO(t!) STARTED(s!) WAITING(w!) APPT(a!) | DONE(d!) CANCELLED(c!) DEFERRED(f!) DELEGATED(D!) 
+
+
+* 2011
+** 2011-02 February
+*** 2011-02-08 Tuesday :R:
+**** To learn:
+  - For beginners:
+    http://wiki.stdout.org/rcookbook/
+    http://www.r-bloggers.com/
+    http://rstudio.org/ but emacs is just great too once ess is installed
+  - Essentials:
+    + http://had.co.nz/ggplot2/
+    + http://plyr.had.co.nz/ and the demonstration by example
+      http://plyr.had.co.nz/09-user/
+  - A rather well-made intro:
+    - http://bioconnector.github.io/workshops/lessons/intro-r-lifesci/01-intro-r/
+    - Even more interactive: http://swirlstats.com/
+  - More advanced:
+    http://cran.r-project.org/doc/contrib/Paradis-rdebuts_fr.pdf
+  - For those who want to go further and code:
+    http://zoonek2.free.fr/UNIX/48_R/all.html
+  - Much more advanced, for fans of semantics and crazy tricks, by
+    Hadley Wickham:
+    http://adv-r.had.co.nz/Computing-on-the-language.html
+  - An [[http://ww2.coastal.edu/kingw/statistics/R-tutorials/dataframes.html][excellent tutorial on data frames]] (=attach=, =with=, =rownames=,
+    =dimnames=, notions of scope...)
+**** R 101 :Blog:
+[[file:public_html/blog/2012/09/12/R101.org][Moved to the blog]]
+[[file:~/Work/SimGrid/infra-songs/slides/140422-compas-R101/R101.org][Compas tutorial]]
+**** R tricks
+***** Reshaping
+http://www.statmethods.net/management/reshape.html
+#+begin_src R :results output :session :exports both
+# example of the melt function (mydata is assumed to have id and time columns)
+library(reshape)
+mdata <- melt(mydata, id=c("id","time"))
+#+end_src
+***** sum of elements over a sliding window
+#+begin_src R
+filter(x, rep(1,4))
+#+end_src
+***** sorting a data frame
+#+BEGIN_SRC R
+  dd[with(dd, order(-z, b)), ]
+#+END_SRC
+***** Capture output
+#+begin_src R
+sink("myfile.txt", append=TRUE, split=TRUE)
+#+end_src
+When redirecting output, use the cat() function to annotate the
+output.
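For comparison, the sliding-window sum that =filter(x, rep(1,4))= computes in R can be sketched in plain Python. =rolling_sum= is a name I made up for this note; also beware that R's =filter= centers the window (=sides=2=) and pads with NA, whereas this sketch only returns the complete windows:

```python
# Sliding-window sum of width 4, the analogue of R's filter(x, rep(1, 4)).
# Only complete windows are returned (no NA padding, unlike R's filter).
def rolling_sum(xs, width):
    """Sum each run of `width` consecutive elements of xs."""
    return [sum(xs[i:i + width]) for i in range(len(xs) - width + 1)]

print(rolling_sum([1, 2, 3, 4, 5], 4))  # [10, 14]
```

With five inputs and width 4 there are exactly two windows, 1+2+3+4 and 2+3+4+5.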
+***** Batch processing
+#+begin_src sh
+R CMD BATCH [options] my_script.R [outfile]
+#+end_src
+***** Convenient commands
+ - describe
+ - structure
+ - ddply
+ - cbind/rbind
+***** Labels and Factors
+ http://stackoverflow.com/questions/12075037/ggplot-legends-change-labels-order-and-title
+#+begin_src R
+dtt$model <- factor(dtt$model, levels=c("mb", "ma", "mc"), labels=c("MBB", "MAA", "MCC"))
+#+end_src
+ Here is another way of reordering factors:
+#+begin_src R
+dtt$model <- relevel(dtt$model, ref="MBB")
+#+end_src
+This puts the factor level given by ref first.
+***** "parallel" Prefix
+#+BEGIN_SRC
+cumsum
+#+END_SRC
+***** knitr preamble
+check out "Tools for making a paper" in R-bloggers:
+#+BEGIN_SRC
+<>=
+opts_knit$set(stop_on_error=2L)
+@
+<>=
+suppressMessages(require(memisc))
+@
+#+END_SRC
+***** Annotate in facet_wrap/facet_grid
+http://www.ansci.wisc.edu/morota/R/ggplot2/ggplot2.html
+***** Interactive plotting
+ http://rstudio.org/docs/advanced/manipulate
+ [[file:~/Work/SimGrid/infra-songs/WP4/R/Sweep3D_analysis/analyze.Rnw]]
+#+begin_src R
+GC <- function(df,start,end) {
+  ggplot(
+    df[(df$Start>=start & df$Start<=end) | (df$End>=start & df$End<=end) |
+       (df$Start<=start & df$End>=end),],
+    aes(xmin=Start, xmax=End, ymin=ResourceId, ymax=ResourceId+1,
+        fill=Value)) +
+    theme_bw() + geom_rect() + coord_cartesian(xlim = c(start, end))
+}
+GC(df_tau,1.1,1.2)
+animate(GC(df_tau, start, end),start=slider...)
+#+end_src
+***** scoping issue with ggplot: mixing external variables with column names
+ There is a magical function designed for this: here()
+#+BEGIN_EXAMPLE
+ddply(df_native, c("ResourceId"), here(transform),
+      Chunk = compute_chunk(Start,End,Duration,min_time_pure))
+#+END_EXAMPLE
+ Here, min_time_pure is an external variable, not a column name.
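The factor tricks above can be mimicked in plain Python, which makes explicit what =factor(levels=, labels=)= and =relevel(ref=)= do under the hood. This is a toy sketch; the values mirror the R snippet but the variable names are made up:

```python
# Relabelling categories: factor(dtt$model, levels=c("mb","ma","mc"),
#                                 labels=c("MBB","MAA","MCC"))
levels = ["mb", "ma", "mc"]
labels = ["MBB", "MAA", "MCC"]
mapping = dict(zip(levels, labels))

model = ["ma", "mc", "mb", "ma"]
relabelled = [mapping[m] for m in model]  # each value mapped to its label

# "relevel(..., ref='MBB')": move the reference level to the front of the
# declared level order, leaving the relative order of the others unchanged.
order = ["MBB"] + [l for l in labels if l != "MBB"]
```

The point, as in R, is that the display order is carried by the declared level order, not by the alphabetical order of the values.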
+***** speeding things up with parallel plyr
+#+BEGIN_SRC R
+  library(doMC)   # the original said doMP, which is not on CRAN; doMC provides the backend
+  registerDoMC()
+  library(plyr)
+  perf_win <- ddply(df_win,c("host_id"), summarize,
+                    astro_avg=sum(et_avg*astro_win),
+                    astro_var=sum(et_var*astro_win),
+                    seti_avg=sum(et_avg*seti_win),
+                    seti_var=sum(et_var*seti_win),
+                    .parallel=TRUE, .progress = "text")
+#+END_SRC
+
+***** graph drawing with Bezier curves
+ https://gist.github.com/dsparks/4331058
+***** Side effect in local functions
+http://my.safaribooksonline.com/book/programming/r/9781449377502/9dot-functions/id3440389
+***** Non-standard evaluation
+http://adv-r.had.co.nz/Computing-on-the-language.html
+***** Arrays of functions in for loops
+http://stackoverflow.com/questions/26064649/enclosing-variables-within-for-loop
+**** R weblinks/statistics r-cran :WP8:
+ http://en.wikibooks.org/wiki/R_Programming/Graphics
+
+ http://freecode.com/articles/creating-charts-and-graphs-with-gnu-r
+
+ http://www.statmethods.net/graphs/density.html
+
+ http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=78
+ (scatterplot + histogram)
+
+ http://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf
+ (comparison and fitting of distributions)
+
+ http://www.sr.bham.ac.uk/~ajrs/R/r-gallery.html
+ http://www.som.yale.edu/faculty/pks4/files/teaching/handouts/r2_tstat_explained.pdf
+ about t-values
+
+ http://www.statmethods.net/stats/anova.html
+
+ http://www.stat.wisc.edu/courses/st850-lindstro/handouts/blocking.pdf
+ (blocking in an ANOVA in R)
+
+ http://www-rocq.inria.fr/axis/modulad/archives/numero-34/Goupy-34/goupy-34.pdf
+ (tutorial on DOE, in French)
+
+ file:/home/alegrand/Work/Documents/Enseignements/M2R_Mesures_Analyse_Eval_Perf_06/Intro_Statistics/doesimp2excerpt--chap3.pdf
+ file:/home/alegrand/Work/Documents/Enseignements/M2R_Mesures_Analyse_Eval_Perf_06/Intro_Statistics/doeprimer.pdf
+ DOE
+
+ http://cran.r-project.org/doc/contrib/Faraway-PRA.pdf
+ (a big R book on ANOVA)
+
http://pages.cs.wisc.edu/~cyffka/R_regression-and-anova.pdf
+
+ https://marvelig.liglab.fr/doku.php/thematiques/methodologie/accueil
+ Documents by Nadine Mandran, pointers to statistics courses
+
+ http://pbil.univ-lyon1.fr/R/pdf/bsa.pdf
+ http://grasland.script.univ-paris-diderot.fr/go303/ch5/doc_ch5.htm
+ A document on the analysis of spatial data
+
+ http://nsaunders.wordpress.com/2010/08/20/a-brief-introduction-to-apply-in-r/
+ Using apply
+
+ http://zoonek2.free.fr/UNIX/48_R/all.html
+ An R user who wrote down a whole lot of useful things and
+ examples, notably about programming.
+
+ http://sharpstatistics.co.uk/r/ggplot/
+ http://rug.mnhn.fr/semin-r/PDF/INED-SFdS-MNHN_Sueur_280411.pdf
+ ggplot2 tutorial
+
+ https://catalyst.uw.edu/workspace/tbranch/24589/155528
+ A course on visualization in R in the style of Tufte
+
+***** Linear regression and heteroscedasticity :ATTACH:
+:PROPERTIES:
+:Attachments: ModeleLineaireRegrDegerine.pdf Regression101R.pdf GLSHeteroskedasticity.pdf week2_ht.pdf
+:ID: b3ced951-cda8-40ce-b281-cc71b55f1da9
+:END:
+- http://ljk.imag.fr/membres/Anatoli.Iouditski/cours/MLDESS.pdf (see
+  attachment) a course in French on linear regression, from a
+  probabilistic viewpoint.
+- http://smat.epfl.ch/courses/Regression/Slides/week2_ht.pdf slides
+  on linear regression and its link with maximum likelihood
+- http://www.r-tutor.com/elementary-statistics/simple-linear-regression/confidence-interval-linear-regression
+  #+begin_src R :results output :session :exports both
+  predict(eruption.lm, newdata, interval="confidence")
+  #+end_src
+- http://www.princeton.edu/~otorres/Regression101R.pdf (std error and
+  confidence intervals on parameter estimates + heteroscedasticity)
+- http://www.econ.uiuc.edu/~wsosa/econ471/GLSHeteroskedasticity.pdf
+  How to handle heteroscedasticity.
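As a reminder of what =lm(y ~ x)= and =predict()= compute in the simple one-regressor case, here is the closed-form least-squares fit sketched in plain Python. This is a toy illustration of the formula, not R's actual implementation:

```python
# Ordinary least squares for y = a + b*x:
#   b = sum((x - mean_x)(y - mean_y)) / sum((x - mean_x)^2),  a = mean_y - b*mean_x
def ols(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

a, b = ols([0, 1, 2, 3], [1.0, 3.0, 5.0, 7.0])  # exact fit: a = 1, b = 2
```

On noiseless data the fit recovers the line exactly; the confidence-interval machinery of =predict(..., interval="confidence")= then quantifies how uncertain =a= and =b= are once noise is added.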
+***** Time series :ATTACH:
+:PROPERTIES:
+:Attachments: SCBio.pdf
+:ID: 40d5498d-e8b3-4c73-8722-7d0056667c15
+:END:
+http://ljk.imag.fr/membres/Serge.Degerine/Enseignement/SCBio.pdf
+
+**** Quantile Regression and Bootstrap :Stats:ATTACH:
+:PROPERTIES:
+:Attachments: mcgill-r.pdf st-m-app-bootstrap.pdf stnews70.pdf
+:ID: 8e3038dc-fa3e-4a7d-a4b1-216513e4359f
+:END:
+http://freakonometrics.hypotheses.org/date/2012/04 (open data and
+ecological fallacies (Simpson's paradox)).
+
+http://freakonometrics.hypotheses.org/2396
+(Talk-on-quantiles-at-the-R-Montreal-group)
+
+http://www.cscu.cornell.edu/news/statnews/stnews70.pdf
+**** Reproducible research :WP8:
+Andrew Davison's tutorial, which is full of interesting references:
+http://rrcns.readthedocs.org/en/latest/index.html
+***** org-mode
+Another approach, purely in org:
+http://orgmode.org/worg/org-contrib/babel/how-to-use-Org-Babel-for-R.html
+***** R/Sweave/knitr
+http://users.stat.umn.edu/~geyer//Sweave/
+Sweave, minimal examples, emacs.
+
+http://www.bepress.com/cgi/viewcontent.cgi?article=1001&context=bioconductor
+An article on reproducible research and Sweave.
+
+http://cran.r-project.org/web/packages/pgfSweave/vignettes/pgfSweave.pdf
+pgfSweave, a LaTeX package that improves the look and speed of
+Sweave. The package is dead, though, and my first tries were not
+conclusive, since converting everything to pgf is a bit heavy-handed.
+
+http://yihui.name/knitr/
+knitr, the latest one, quite popular, stable and very promising.
+
+
+http://www.stat.uiowa.edu/~rlenth/StatWeave/OLD/SRC-talk.pdf
+StatWeave. Also allows embedding Maple.
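All these tools (Sweave, knitr, org-babel) share the same core move: locate the code chunks inside a prose document and execute or extract them. A toy Python sketch of the extraction ("tangle") step, assuming org-style delimiters; their real implementations and APIs are of course richer:

```python
# Toy "tangle": pull the R code chunks out of an org-style document.
import re

# Build the document from pieces so the org directives stay inert here.
doc = ("Some prose.\n"
       "#+begin_src R\n"
       "x <- 1\n"
       "#+end_src\n"
       "More prose.\n"
       "#+begin_src R\n"
       "y <- 2\n"
       "#+end_src\n")

# Non-greedy match between each begin/end pair; DOTALL lets '.' span newlines.
chunks = re.findall(r"#\+begin_src R\n(.*?)#\+end_src", doc, re.DOTALL)
code = "".join(chunks)  # the concatenated program, ready to be run or saved
```

The "weave" direction is the converse: run each chunk and splice its output back into the prose before exporting to HTML or PDF.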
+***** Ipython notebook
+https://osf.io/h9gsd/
+Nice, easy to set up.
+***** ActivePapers
+- http://www.activepapers.org/
+- https://bitbucket.org/khinsen/active_papers_py/wiki/Tutorial
+***** Elsevier approach
+http://www.elsevier.com/physical-sciences/computer-science/executable-papers
+https://collage.elsevier.com/manual/
+http://is.ieis.tue.nl/staff/pvgorp/research/?page=SCP11
+***** Research Gate
+https://www.researchgate.net/publicliterature.OpenReviewInfo.html
+***** Conferences and general discussions
+http://reproducibleresearch.net/index.php/Main_Page
+http://wiki.stodden.net/Main_Page
+
+
+- [[http://www.eecg.toronto.edu/~enright/wddd/][Workshop on Duplicating, Deconstructing and Debunking (WDDD)]] ([[http://cag.engr.uconn.edu/isca2014/workshop_tutorial.html][2014
+  edition]])
+- http://evaluate2010.inf.usi.ch
+- [[http://www.stodden.net/AMP2011/][Reproducible Research: Tools and Strategies for Scientific Computing]]
+- [[http://wssspe.researchcomputing.org.uk/][Working towards Sustainable Software for Science: Practice and
+  Experiences]]
+- [[http://hunoldscience.net/conf/reppar14/pc.html][REPPAR'14: 1st International Workshop on Reproducibility in Parallel
+  Computing]]
+- [[https://www.xsede.org/web/reproducibility][Reproducibility@XSEDE: An XSEDE14 Workshop]]
+- [[http://www.occamportal.org/reproduce][Reproduce/HPCA 2014]]
+- [[http://www.ctuning.org/cm/wiki/index.php?title%3DEvents:TRUST2014][TRUST 2014]]
+- http://vee2014.cs.technion.ac.il/docs/VEE14-present602.pdf
+
+ http://www-958.ibm.com/software/data/cognos/manyeyes/visualizations
+ http://www.myexperiment.org/
+ http://wiki.galaxyproject.org/
+ http://www.runmycode.org/CompanionSite/
+
+ http://evaluate.inf.usi.ch/
+
+ github ?
+ workflow ?
+ vistrails ?
+ sumatra + vcr +***** Politics +http://michaelnielsen.org/blog/how-you-can-help-the-federal-research-public-access-act-frpaa-become-law/7 + +http://en.wikipedia.org/wiki/Federal_Research_Public_Access_Act + +http://michaelnielsen.org/blog/on-elsevier/ + +**** General discussions about scientific practice :WP8: +http://michaelnielsen.org/blog/three-myths-about-scientific-peer-review*/ +http://michaelnielsen.org/blog/some-garbage-in-gold-out/ +http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003285 +**** Coursera +- https://www.coursera.org/course/compdata +- https://class.coursera.org/exdata-002/lecture +- https://class.coursera.org/repdata-002 +**** ggplot2 cool examples +http://felixfan.github.io/rstudy/2014/02/28/ggplot2-cheatsheet/ +http://blog.revolutionanalytics.com/graphics/ +http://grrrraphics.blogspot.com.br/2012/05/ever-wanted-to-see-at-glance.html +http://www.ancienteco.com/2012/03/basic-introduction-to-ggplot2.html +http://sape.inf.usi.ch/quick-reference/ggplot2 +http://www.r-bloggers.com/overplotting-solution-for-black-and-white-graphics/ +http://stats.stackexchange.com/questions/12029/is-it-possible-to-create-parallel-sets-plot-using-r +http://novyden.blogspot.fr/2013/09/how-to-expand-color-palette-with-ggplot.html +https://gastonsanchez.wordpress.com/2012/08/27/scatterplot-matrices-with-ggplot/ +**** Visualisations +http://www.visual-literacy.org/periodic_table/periodic_table.html +**** Design of Experiments (DoE) +- Montgommery book +- http://www.cs.wayne.edu/~hzhang/courses/7290/Lectures/4%20-%20Introduction%20to%20Experimental%20Design.pdf +- http://www.obgyn.cam.ac.uk/cam-only/statsbook/stexdes.html#3g +- http://mescal.imag.fr/membres/arnaud.legrand/teaching/2011/EP_czitrom.pdf +- http://www.basic.northwestern.edu/statguidefiles/oneway_anova_ass_viol.html +- http://techdigest.jhuapl.edu/TD/td2703/telford.pdf +*** 2011-02-15 mardi +**** Réunion CIGRI +***** Présents + - Olivier Richard, Mcf UJF/MESCAL, gestions de 
ressources, + initiateur de OAR et Cigri, G5K + - Bruno Bzeznik, Ingénieur CIMENT (admin, gestion clusters) et + MESCAL (dev OAR, outils pour CIMENT). + - Chislain Charrier, Ingénieur INRIA G5K à Rennes depuis quelques + mois. Mission: s'occuper des campagnes d'expérimentations. + - Philippe Leprouster, Ingénieur CDD UJF MESCAL pour bosser sur + l'optimisation d'OAR + - Bernard Boutherin, responsable info au LPSC, noeud Tier3 de la + grille EGI (600 coeurs de calcul, 700 To de stoquage, + précurseur autour du free-cooling, installation à moins de 60 + kW depuis 2008). + - Catherine Biscarat, IR CNRS qui va s'occuper de la liaison + CIGRI/LPSC. + - Pierre Neyron, IR CNRS MESCAL/MOAIS, responsable de digitalis. +***** Point de Bruno sur l'état actuel de Cigri + Site web: https://ciment-grid.ujf-grenoble.fr + Logiciel principalement déployé dans CIMENT. Exploite + actuellement 3000 cores sur une vingtaine de machines. + R2D2 et fostino sont les plus grosses et gérées par un seul + serveur OAR. + + Resources très faiblement utilisées (en général un ou deux + utilisateurs à un instant donnée). Besoin d'accompagner les + utilisateurs qui n'ont pas forcément conscience que CIGRI est + adapté à leurs besoins. Les utilisateur qui utilisent CIGRI + actuellement sont de gros consommateurs de ressources. + + Collaboration CIGRI/LPSC initiée par un projet autour du + stoquage. Bruno a du coup équipé CIGRI de noeuds de stoquage et + a déployé Irods. +***** Plus d'infos sur : http://wiki-oar.imag.fr/index.php/CiGri-ng + +Entered on [2011-02-15 mar. 09:41] + + [[file:~/Liste.org]] +* 2012 +** 2013-02 février +**** 2013-02-11 lundi +***** Reproducible research links :WP8:R: + http://wiki.stodden.net/ICERM_Reproducibility_in_Computational_and_Experimental_Mathematics:_Readings_and_References + http://www.rpubs.com/ + An interesting article with a dissenting opinion on reproducible research: + http://cogprints.org/8675/ + + Entered on [2013-02-11 lun. 
09:52] +***** Audio StarPU :WP4: + Lionel, Samuel, Paul, Luka, Brice. +****** Séquentialisation des comms + - Idées: faire des mesures automatiques + - Deux implems' (Sam & Paul), pas équivalentes, l'une modélisant + plus les communications synchrone et l'autre les + asynchrones. À creuser. +****** Petites macros pour mesurer/injecter le temps + - Temps injecté dans la version initiale de Sam = temps moyen + observé par StarPU . + - Une fois les problème de communication réglés (virer le + slow-start, séquentialiser ce qui doit l'être), les dernières + différences viennent de la variabilité vraie vie / simu + (surtout sur CPU). + - Objectif: insérer variabilité. C'est le même problème que pour + SMPI. Dans la version actuelle, on regarde le temps pris lors + de la simulation et on le réinjecte, d'où une très mauvaise portabili + - Idée: identifier les blocs, capturer les temps et utiliser en + simu un tirage à partir du profil capturé. C'est assez + "nouveau" car FSuter capturait une trace de niveau MPI donc + sans info sur quel bloc de code => pas d'information sur la + source de la variabilité. + - On commence par une approche basique: à la compilation, on + identifie un bloc par FILE,LINE, avec éventuellement une + extension via une annotation manuelle (c'est le cas pour + StarPU qui lance les calculs toujours au même endroit). + - Niveau workflow, première exécution pour avoir les timings, + puis R, puis réinsertion ds SG. + - La capture est pas compliquée et comme il y a le même besoin + pour SMPI, on factorise pour éviter les divergences. Ce code + est donc dans SG. Paul et Luka ont fait ça la semaine dernière + et Paul l'a utilisé dans *PU, reste à tester pour confirmer + - Luka essaie maintenant de mettre ça ds SMPI, c'est plus + difficile de savoir où mettre les benchmarks. L'ideal serait + de regarder dans la pile, c'est un peu compliqué donc on reste + sur notre approche simple pour l'instant et on raffinera plus + tard si c'est vraiment nécessaire. 
L'avantage escompté, c'est + sur les plates-formes Mt Blanc par exemples, on peut exécuter + une fois et utiliser ensuite les timings pour faire des tests + de scalabilité sur une vraie machine de brute qui va vite. +****** Objectifs des uns et des autres + - Lionel & Paul à Bdx: objectif = proposer des modèles, support + - Sam: objectif = bricoler DES ordonnanceurs, lancer vite sur + différentes et évaluer l'impact de tailles de blocs ou de la + taille d'une fenêtre glissante. C'est donc clairement un outil + de développement pour tester des choses et il faut donc que + l'outil soit un minimum stable. Rien de grave mais il faut + bien en être conscient en terme de développement. Il sera + important de propager les informations du genre "attention on + a corrigé un truc, ça risque d'invalider les expériences + précédentes". + - Arnaud rappelle que d'un point de vue développement, c'est + comme pour SMPI, il faut être conscient qu'il y a trois types + de tâches toutes aussi importantes les unes que les autres + (i.e., quand on en néglige une on s'en mord toujours les + doigts à un moment ou à un autre): + + Exploration: le plus fun, de petites expériences pour voir + si ça marche. Pour moi, la démarche de Sam et ses + expériences faites rentrent dans cette catégorie. + + Ingénierie: écriture de code, petites fonctions + techniques. Dans le contexte de starPU, typiquement, il + s'agit du travail initial de Sam mais aussi du codage de la + séquentialisation des communications ou bien des macros de + capture de traces. + + Consolidation: moins drôle, mais il faut le faire pour + vérifier que tout le monde peut fairer ses mesures, + réutiliser, et qu'on puisse avancer en toute + confiance. Souvent, sur une nouvelle machine, de nouveaux + phénomènes apparaissent et ce n'est qu'avec des outils + d'exploration systématique et automatiques qu'on s'en sort. + + Il faut donc mettre en place dès le début des outils de capture + d'information et d'analyse. 
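The measure/inject idea discussed in the notes above (tag each code block by its FILE,LINE at compile time, capture the observed durations at runtime, then let the simulator draw from the captured profile rather than replaying a single mean value) can be sketched in a few lines of Python. The names are illustrative only, not the SimGrid/StarPU/SMPI API:

```python
# Sketch of capture/replay of per-block timings (illustrative names only).
import random

profile = {}  # (file, line) -> list of observed durations

def record(block, duration):
    """Capture phase: append one observed duration for this code block."""
    profile.setdefault(block, []).append(duration)

def simulated_duration(block, rng=random):
    """Replay phase: draw from the captured profile, so the simulation
    reproduces the observed variability instead of a fixed average."""
    return rng.choice(profile[block])

# Two runs of the same (hypothetical) kernel were observed:
record(("kernel.c", 42), 1.5)
record(("kernel.c", 42), 1.8)
```

Sampling from the empirical distribution is what lets a later scalability study on a fast machine reuse timings captured once on the target platform.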
+****** Roadmap + - Séquentialisation / Parallélisation: Paul à Bordeaux s'en + occupe. Il met en place le code qui crache une matrice + d'interférences et met en place dans StarPU/SG le code qui + l'exploite. + - Infrastructure de mesure / collecte de traces: Luka et Arnaud à + Grenoble s'en occupent. Réflexion sur un workflow qui va bien + pour garder de bonnes traces et pouvoir facilement tester de + nouvelles machines. + - Paul vient le 14 Mars à Grenoble et on en profitera pour faire + le point. +****** Divers +******* Campagne d'expériences pour valider le modèle / invalider les précédents. Comment faire ? + Il est difficile (impossible ?) de dire qu'un modèle est + valide. Il est plus raisonnable de montrer à quel point on a + essayer de l'invalider, ce qui permet à chacun d'évaluer à quel + point il fait confiance aux capacités d'extrapolation et + d'explication du modèle. + + On peut donc montrer l'impact des améliorations successives du + modèle, soit sur le temps final soit sur des choses plus fines + de la trace. Il faut tester sur des cas de plus en plus + complexes, d'où la nécessité d'avoir une méthode un peu + automatique pour comparer des résultats. À titre d'illustration + voici le genre de choses que Martin a raconté à l'éval d'Héméra. + + http://mescal.imag.fr/membres/arnaud.legrand/uss_simgrid/130211-HEMERA-eval.pdf + + + Mesure 0: makespan. C'est ce qui nous intéresse mais c'est + généralement très pauvre et on peut arriver par hasard à de + bons résultats ou avoir de mauvais résultats juste parce + qu'un paramètre a été mal mesuré. Même si c'est uniquement, + cette mesure là qui nous intéresse au final, il est + indispensable de comparer pour des mesures plus fines car + c'est ce qui permet de mettre une certaine confiance dans les + capacités d'extrapolation de l'outil. + + Mesure 1: comparaison visuelle de gantt chartt, peut être + joli, facile et instructif avec R, mais difficilement + quantifiable. 
+ + Mesure 2: Regarder les distributions de temps passés dans + différents états. On peut faire ça en partie pour StarPU mais + uniquement pour les temps de calcul, pas pour les temps de + communication. En effet, on a peu de maîtrises sur les temps + de comms, on ne sait pas vraiment à quel moment la + communication s'est terminée ni quand elle a commencé. + + Mesure 3: Comparer les schedules... C'est difficile car déjà + quand c'est stable, une métrique n'est pas évidente à définir + mais quand en plus c'est variable d'une fois sur + l'autre... Idéalement, il faudrait comparer plutôt la + distribution des schedules plutôt que les schedules + individuels... C'est difficile mais passionnant. Ce qui est + super, c'est qu'on a l'outil qui permet de générer les + traces. +******* Problème de main() entre *PU et SG + Passer systématiquement par le XML a résolu le problème de + l'initialisation et du lancement de simgrid. Ça reste gênant + parce qu'il faut recompiler l'appli Sam. aimerait pouvoir + changer l'appli en changeant le LD_LIBRARY_PATH, avec 2 + versions de libstarpu.so => Mais alors comment passer les + arguments à SG ? + + C'est l'appli qui donne argc,argv dans l'appel à *pu_init + L'appli risque de pas aimer les options de SG. + + *pu passe par des variables d'environnements pour éviter ces soucis + + Vision d'arnaud : --platform=toto.xml mangé par *pu, reste des + args mangés par SG Comment faire pour mettre les stats de temps + d'exécution adaptés au fichier XML Ptetre qu'on peut identifier + les GPUs par modèle plutôt que par numéro de GPU Possibilité de + vérifier les hostname (STARPU_HOSTNAME) +******* Comment faire pour que ça marche pour exécuter en local. + On peut pondre le .xml, tout classer par hostname. Tout est dans + .starpu/sampling, pour les différents hostnames et codelets, les + traces de perfs. + + Du coup ce serait *pu qui génèrerait automatiquement le .xml ? + ça coûte pas cher. 
Mais les valeurs qu'on met dedans, comment + on les obtient ? + + BP/latence du bus sont mesuréesde toutes façons au départ + + Pour la matrice d'interférence de Paul, il faudrait aussi la + mettre dans le XML. + Du coup, il serait peut-être plus naturel que ce soit le script + de Paul qui ponde le .XML qui s'occupe de l'étalonnage + + Entered on [2013-02-11 lun. 09:53] +** 2013-05 mai +*** 2013-05-21 mardi +**** BIS Workshop + +#+BEGIN_SRC sh :results output raw :exports both +for i in gnome gnome-desktop-environment ifupdown iproute iproute-dev isc-dhcp-client libatm1 network-manager network-manager-gnome ; do + dir=`apt-cache showsrc $i | grep Directory | sed 's/.*: //'` + version=`apt-cache showsrc $i | grep ^Version | sed 's/.*: *//g'` + echo "wget http://http.us.debian.org/debian/$dir/$i""_$version"_amd64.deb +done +#+END_SRC + +Entered on [2013-05-21 mar. 08:46] +**** Discussions avec Anne-Cécile à propos des timeouts TCP MPI :WP4: + I have some related news. I had the chance to chat with + Anne-Cecile and talked her about our timeout problem. After + digging a little, she was able to point me to related work: + - Understanding TCP Incast Throughput Collapse in Datacenter + Networks + - Safe and Effective Fine-grained TCP Retransmissions for + Datacenter Communication + - On the properties of an adaptive TCP minimum rto + - http://www.hjp.at/doc/rfc/rfc2988.txt + + So the problem has a name (incast) and is linked to the following + TCP parameter: + | OS | TCP RTOmin | + |---------+-------------| + | Linux | 200ms | + | BSD | 200ms | + | Solaris | 400ms | + + I haven't read the articles so I don't know the details. All I + can say so far is I don't know how to change this parameter + without recompiling the kernel... 
+
+ #+BEGIN_SRC sh :results output raw :exports both
+ cd /usr/src/linux-headers-3.2.0-4-common
+ cg 'define *TCP_RTO_MIN' '*'
+ cg 'define *HZ' '*'
+ #+END_SRC
+
+ Basically, this value is good enough for wide-area links where the RTT is
+ large, but in our SAN setting it's rather bad. Looking further
+ (http://comments.gmane.org/gmane.linux.network/162986), I learnt
+ that although this parameter cannot be modified through sysctl,
+ it can be overridden per route with iproute.
+
+
+ #+BEGIN_SRC sh :results output text :exports both
+ for i in ip ip-address ipcontroller iplogger ip-netns iproxy iptables-save ipython ip6tables ip-addrlabel ipcrm ip-maddress ip-ntable ip-rule iptables-xml ipython2.6 ip6tables-apply ipc ipcs ip-monitor ipptool iptables ip-tunnel ipython2.7 ip6tables-restore ipcluster ipengine ip-mroute ipptoolfile iptables-apply ipv6 ip6tables-save ipcmk ip-link ip-neighbour ip-route iptables-restore ip-xfrm ; do
+ man -T --troff-device=ascii $i | grep -i rto
+ done
+ #+END_SRC
+
+ #+RESULTS:
+
+Entered on [2013-05-21 mar.
12:37]
+**** Play with R xkcd :R:
+ #+begin_src R :results output :session :exports both
+ # install.packages('xkcd') # did not work so I did it manually
+ library(extrafont)
+ download.file("http://simonsoftware.se/other/xkcd.ttf", dest="xkcd.ttf")
+ system("mkdir ~/.fonts")
+ system("cp xkcd.ttf -t ~/.fonts") # fixed typo: the downloaded file is xkcd.ttf, not xkcd.tff
+ font_import()
+ loadfonts()
+ #+END_SRC
+
+ #+BEGIN_SRC R :results graphics :file /tmp/plot.png :exports results :width 600 :height 200 :session
+ library(xkcd)
+ theme_xkcd <- theme(
+     panel.background = element_rect(fill="white"),
+     axis.ticks = element_line(colour=NA),
+     panel.grid = element_line(colour="white"),
+     axis.text.y = element_text(colour=NA),
+     axis.text.x = element_text(colour="black"),
+     text = element_text(size=16, family="xkcd")
+ )
+ ggplot(data.frame(x=c(0, 10)), aes(x)) + theme_xkcd +
+     stat_function(fun=sin,position="jitter", color="red", size=2) +
+     stat_function(fun=cos,position="jitter", color="white", size=3) +
+     stat_function(fun=cos,position="jitter", color="blue", size=2) +
+     geom_text(family="xkcd", x=4, y=0.7, label="A SIN AND COS CURVE")+
+     xkcdaxis(c(0, 10),c(-1,1))
+ #+END_SRC
+
+ #+RESULTS:
+ [[file:/tmp/plot.png]]
+
Entered on [2013-05-21 mar.
15:49] +* 2015 +** 2015-01 janvier +*** 2015-01-07 mercredi +**** Helping Martin with R :Teaching:R: + +#+tblname: daily +| Date | exos_java | traces_java | exos_python | traces_python | exos_scala | traces_scala | +|------------+-----------+-------------+-------------+---------------+------------+--------------| +| 2014.9.2 | 6 | 1 | 0 | 0 | 0 | 0 | +| 2014.9.3 | 5 | 1 | 0 | 0 | 0 | 0 | +| 2014.9.4 | 8 | 2 | 0 | 0 | 0 | 0 | +| 2014.9.8 | 7 | 4 | 0 | 0 | 1290 | 86 | +| 2014.9.9 | 0 | 0 | 3 | 1 | 1615 | 86 | +| 2014.9.10 | 0 | 0 | 1 | 1 | 163 | 16 | +| 2014.9.11 | 3 | 2 | 0 | 0 | 999 | 63 | +| 2014.9.12 | 67 | 4 | 2 | 2 | 1149 | 67 | +| 2014.9.13 | 20 | 3 | 1 | 1 | 132 | 14 | +| 2014.9.14 | 7 | 1 | 0 | 0 | 170 | 12 | +| 2014.9.15 | 9 | 2 | 0 | 0 | 1112 | 73 | +| 2014.9.16 | 16 | 2 | 0 | 0 | 768 | 60 | +| 2014.9.17 | 36 | 3 | 0 | 0 | 274 | 40 | +| 2014.9.18 | 1 | 1 | 22 | 2 | 20 | 2 | +| 2014.9.19 | 1 | 1 | 18 | 2 | 10 | 2 | +| 2014.9.20 | 0 | 0 | 12 | 1 | 61 | 6 | +| 2014.9.21 | 0 | 0 | 6 | 2 | 36 | 6 | +| 2014.9.22 | 3 | 2 | 11 | 2 | 420 | 50 | +| 2014.9.23 | 1 | 1 | 0 | 0 | 218 | 31 | +| 2014.9.24 | 0 | 0 | 12 | 2 | 39 | 4 | +| 2014.9.25 | 0 | 0 | 1 | 1 | 220 | 30 | +| 2014.9.26 | 0 | 0 | 19 | 2 | 28 | 5 | +| 2014.9.27 | 0 | 0 | 10 | 1 | 17 | 4 | +| 2014.9.28 | 0 | 0 | 12 | 2 | 37 | 6 | +| 2014.9.29 | 26 | 3 | 8 | 1 | 509 | 81 | +| 2014.9.30 | 9 | 2 | 16 | 2 | 243 | 36 | +| 2014.10.1 | 1 | 1 | 26 | 14 | 99 | 16 | +| 2014.10.2 | 1 | 1 | 1 | 1 | 325 | 38 | +| 2014.10.3 | 26 | 15 | 52 | 16 | 22 | 4 | +| 2014.10.4 | 25 | 1 | 4 | 3 | 36 | 9 | +| 2014.10.5 | 10 | 1 | 2 | 1 | 49 | 9 | +| 2014.10.6 | 5 | 2 | 39 | 22 | 192 | 37 | +| 2014.10.7 | 24 | 4 | 17 | 7 | 143 | 25 | +| 2014.10.8 | 50 | 3 | 0 | 0 | 77 | 14 | +| 2014.10.9 | 24 | 2 | 11 | 3 | 48 | 9 | +| 2014.10.10 | 35 | 4 | 7 | 2 | 0 | 0 | +| 2014.10.11 | 0 | 0 | 9 | 3 | 3 | 1 | +| 2014.10.12 | 20 | 6 | 7 | 3 | 10 | 1 | +| 2014.10.13 | 32 | 4 | 18 | 4 | 0 | 0 | +| 2014.10.14 | 44 | 1 | 41 | 3 | 8 | 1 | +| 2014.10.15 
| 5 | 3 | 64 | 10 | 6 | 2 | +| 2014.10.16 | 27 | 2 | 24 | 5 | 1 | 1 | +| 2014.10.17 | 43 | 3 | 14 | 4 | 0 | 0 | +| 2014.10.18 | 84 | 2 | 57 | 8 | 0 | 0 | +| 2014.10.19 | 10 | 2 | 86 | 11 | 0 | 0 | +| 2014.10.20 | 0 | 0 | 94 | 11 | 0 | 0 | +| 2014.10.21 | 15 | 1 | 67 | 8 | 10 | 2 | +| 2014.10.22 | 20 | 5 | 76 | 15 | 1 | 1 | +| 2014.10.23 | 33 | 3 | 12 | 5 | 0 | 0 | +| 2014.10.24 | 29 | 2 | 58 | 11 | 1 | 1 | +| 2014.10.25 | 33 | 8 | 38 | 8 | 1 | 1 | +| 2014.10.26 | 13 | 6 | 39 | 8 | 34 | 3 | +| 2014.10.27 | 13 | 4 | 49 | 12 | 15 | 1 | +| 2014.10.28 | 4 | 2 | 44 | 8 | 3 | 1 | +| 2014.10.29 | 0 | 0 | 28 | 9 | 13 | 2 | +| 2014.10.30 | 4 | 3 | 49 | 8 | 0 | 0 | +| 2014.10.31 | 3 | 2 | 58 | 14 | 7 | 1 | +| 2014.11.1 | 0 | 0 | 71 | 9 | 7 | 2 | +| 2014.11.2 | 23 | 2 | 57 | 6 | 0 | 0 | +| 2014.11.3 | 10 | 1 | 18 | 5 | 0 | 0 | +| 2014.11.4 | 19 | 1 | 49 | 10 | 3 | 1 | +| 2014.11.5 | 29 | 2 | 28 | 9 | 0 | 0 | +| 2014.11.6 | 86 | 3 | 142 | 19 | 0 | 0 | +| 2014.11.7 | 38 | 2 | 4 | 2 | 0 | 0 | +| 2014.11.8 | 0 | 0 | 18 | 4 | 6 | 1 | +| 2014.11.9 | 25 | 2 | 39 | 10 | 0 | 0 | +| 2014.11.10 | 16 | 1 | 17 | 3 | 0 | 0 | +| 2014.11.11 | 0 | 0 | 70 | 16 | 1 | 1 | +| 2014.11.12 | 0 | 0 | 4 | 3 | 0 | 0 | +| 2014.11.13 | 0 | 0 | 168 | 20 | 1 | 1 | +| 2014.11.14 | 0 | 0 | 18 | 2 | 0 | 0 | +| 2014.11.15 | 0 | 0 | 5 | 2 | 8 | 1 | +| 2014.11.16 | 16 | 2 | 16 | 4 | 4 | 1 | +| 2014.11.17 | 0 | 0 | 8 | 3 | 0 | 0 | +| 2014.11.18 | 4 | 1 | 7 | 3 | 0 | 0 | +| 2014.11.19 | 17 | 2 | 4 | 1 | 0 | 0 | +| 2014.11.20 | 0 | 0 | 102 | 13 | 0 | 0 | +| 2014.11.21 | 7 | 1 | 31 | 3 | 1 | 1 | +| 2014.11.22 | 1 | 1 | 17 | 4 | 0 | 0 | +| 2014.11.23 | 4 | 1 | 25 | 6 | 0 | 0 | +| 2014.11.24 | 0 | 0 | 2 | 1 | 3 | 1 | +| 2014.11.25 | 4 | 1 | 0 | 0 | 7 | 2 | +| 2014.11.26 | 0 | 0 | 4 | 1 | 0 | 0 | +| 2014.11.27 | 0 | 0 | 1 | 1 | 6 | 1 | +| 2014.11.28 | 0 | 0 | 6 | 3 | 1 | 1 | +| 2014.11.29 | 1 | 1 | 29 | 4 | 13 | 3 | +| 2014.11.30 | 3 | 1 | 57 | 10 | 15 | 2 | +| 2014.12.1 | 8 | 1 | 15 | 4 | 7 | 3 | +| 2014.12.2 | 8 | 3 | 
17 | 5 | 0 | 0 | +| 2014.12.3 | 3 | 1 | 6 | 2 | 0 | 0 | +| 2014.12.4 | 4 | 3 | 1 | 1 | 1 | 1 | +| 2014.12.5 | 0 | 0 | 17 | 2 | 5 | 2 | +| 2014.12.6 | 0 | 0 | 6 | 2 | 3 | 1 | +| 2014.12.7 | 0 | 0 | 7 | 3 | 0 | 0 | +| 2014.12.8 | 11 | 3 | 0 | 0 | 0 | 0 | +| 2014.12.9 | 7 | 1 | 0 | 0 | 0 | 0 | +| 2014.12.10 | 27 | 2 | 0 | 0 | 0 | 0 | +| 2014.12.11 | 0 | 0 | 0 | 0 | 1 | 1 | +| 2014.12.13 | 17 | 3 | 0 | 0 | 0 | 0 | +| 2014.12.14 | 3 | 1 | 10 | 1 | 0 | 0 | +| 2014.12.15 | 25 | 3 | 1 | 1 | 9 | 2 | +| 2014.12.16 | 34 | 3 | 10 | 4 | 0 | 0 | +| 2014.12.17 | 11 | 2 | 3 | 2 | 1 | 1 | +| 2014.12.18 | 3 | 1 | 8 | 1 | 0 | 0 | +| 2014.12.19 | 7 | 1 | 1 | 1 | 9 | 1 | +| 2014.12.20 | 96 | 3 | 11 | 4 | 0 | 0 | +| 2014.12.21 | 1 | 1 | 17 | 4 | 12 | 3 | +| 2014.12.23 | 0 | 0 | 21 | 5 | 1 | 1 | +| 2014.12.24 | 5 | 1 | 11 | 4 | 0 | 0 | +| 2014.12.25 | 14 | 2 | 8 | 2 | 0 | 0 | +| 2014.12.26 | 0 | 0 | 13 | 4 | 0 | 0 | +| 2014.12.27 | 0 | 0 | 9 | 3 | 0 | 0 | +| 2014.12.28 | 0 | 0 | 24 | 4 | 0 | 0 | +| 2014.12.29 | 0 | 0 | 21 | 7 | 0 | 0 | +| 2014.12.30 | 0 | 0 | 34 | 6 | 0 | 0 | +| 2014.12.31 | 1 | 1 | 47 | 5 | 0 | 0 | +| 2015.1.1 | 0 | 0 | 33 | 4 | 0 | 0 | +| 2015.1.2 | 0 | 0 | 29 | 7 | 0 | 0 | +| 2015.1.3 | 0 | 0 | 25 | 4 | 12 | 1 | +| 2015.1.4 | 0 | 0 | 14 | 5 | 0 | 0 | +| 2015.1.5 | 12 | 2 | 0 | 0 | 0 | 0 | + + +#+tblname: idle_periods_mt +| Start | End | +|------------+------------| +| 2014.9.2 | 2014.9.20 | +| 2014.12.18 | 2014.12.31 | + + +#+begin_src R :exports both :results output graphics :var daily=daily :var idle=idle_periods_mt :file /tmp/daily.png :width 600 :height 600 + library(reshape2) + library(ggplot2) + require(gridExtra) + + daily$Date <- as.Date(daily$Date, "%Y.%m.%d") + + data_long <- melt(daily, id.vars=c("Date")) + + idle$Start <- as.Date(idle$Start, "%Y.%m.%d") + idle$End <- as.Date(idle$End, "%Y.%m.%d") + + ymax1=200 + + p1 <- ggplot() + + geom_area(data=data_long[data_long$variable %in% c("exos_scala","exos_python","exos_java"),], + aes(x=Date, y=value, 
color=variable,fill=variable)) + + ggtitle("Daily activity (exercises)") + + geom_rect(data=idle,aes(xmin=Start, xmax=End, ymin=0, ymax=ymax1),alpha=.1,fill="red",color="blue") + + theme(legend.justification=c(1,0), legend.position=c(1,.6)) + + coord_cartesian(ylim = c(0,ymax1)) + + ylab("Exercises (#)") + + ymax2 = 40 + + p2 <- ggplot() + + geom_area(data=data_long[data_long$variable %in% c("traces_scala","traces_python","traces_java"),], + aes(x=Date, y=value, color=variable,fill=variable)) + + ggtitle("Daily activity (users)") + + geom_rect(data=idle,aes(xmin=Start, xmax=End, ymin=0, ymax=ymax2),alpha=.1,fill="red",color="blue") + + theme(legend.justification=c(1,0), legend.position=c(1,.6)) + + coord_cartesian(ylim = c(0,ymax2)) + ### zoom with ggplot + ylab("Active Traces (#)") + grid.arrange(p1, p2) +#+end_src + +#+RESULTS: +[[file:/tmp/daily.png]] + +Entered on [2015-01-07 mer. 16:27] + + [[file:/tmp/plm-iticse.org::*Data%20Analysis][Data Analysis]] +** 2015-07 juillet +*** 2015-07-31 vendredi +**** LOESS :WP8:Stats:R: +An involved lecture: + http://web.as.uky.edu/statistics/users/pbreheny/621/F10/notes/11-4.pdf + +A few R examples illustrating the influence of bandwidth: +- http://research.stowers-institute.org/efg/R/Statistics/loess.htm +- http://www.duclert.org/Aide-memoire-R/Statistiques/Local-polynomial-fitting.php + +Entered on [2015-07-31 ven. 09:35] +**** Harald Servat's PhD defense +Comments: +- I really enjoyed the very *clear presentation* of the document, of the + related work, etc. +- I particularly enjoyed the fact that *hypotheses are clearly stated*, + which I think is the sign of a true *scientific approach* in terms of + methodology. I also think that moving to *continuous approximations* as + you tried by using Kriging or segmented linear regressions is an + excellent idea. +- Last, *thanks for giving me the opportunity* to think more carefully + about the mathematical foundations of such tools and how they could + make sense or not.
It actually raised a lot of questions. + +Questions: +- You did not hesitate to use *elaborate statistical tools*. Such + tools rely on probabilistic models, and one of their interesting + features is that they allow two things: + - Hypothesis testing + - Confidence interval calculation + Do you think it would be worth building on such features? +- You demonstrated that your clustering methodology could be applied + to many use cases. Can you tell me whether you can think of situations + where it would not apply? +- 11: segmented linear regression seems more meaningful than kriging + here. Is it really the case? Kriging does an interpolation and + maybe the nuggeting is unable to smooth things enough. But maybe the + different phases detected by segmented regression are not that + meaningful either? +- Reuse of previous analyses to capture better traces with a lower + overhead? + + +- 15: multi-dimensional segmented linear regression? +- 28: you are annoyed because time is a random variable too. There is + uncertainty on it, which is why the classical techniques (kriging or + segmented regression) do not apply. +- Machine learning for pointing out situations where correlation + makes sense or not. +- 39: shouldn't the compiler have been able to do this kind of + optimization? +- 49: L1-cache-based sampling allowed detecting when MPI was receiving + a message + + +Entered on [2015-07-31 ven.
10:58] +** 2015-12 décembre +*** 2015-12-22 mardi +**** Programming with Clément: magic hexagon :Teaching:Python: +***** Plain and simple permutation generation +#+begin_src python :results output :exports both +N = 5 +A = range(1,N) + +def generate(tab,i): + if i>=len(tab): + print(tab) + else: + for j in range(i,len(tab)): + tab[i],tab[j] = tab[j],tab[i] + generate(tab,i+1) + tab[i],tab[j] = tab[j],tab[i] + +generate(A,0) +#+end_src + +#+RESULTS: +#+begin_example +[1, 2, 3, 4] +[1, 2, 4, 3] +[1, 3, 2, 4] +[1, 3, 4, 2] +[1, 4, 3, 2] +[1, 4, 2, 3] +[2, 1, 3, 4] +[2, 1, 4, 3] +[2, 3, 1, 4] +[2, 3, 4, 1] +[2, 4, 3, 1] +[2, 4, 1, 3] +[3, 2, 1, 4] +[3, 2, 4, 1] +[3, 1, 2, 4] +[3, 1, 4, 2] +[3, 4, 1, 2] +[3, 4, 2, 1] +[4, 2, 3, 1] +[4, 2, 1, 3] +[4, 3, 2, 1] +[4, 3, 1, 2] +[4, 1, 3, 2] +[4, 1, 2, 3] +#+end_example + +***** Brute-force exploration +We represent the hexagon by an array numbered like this: +#+BEGIN_EXAMPLE + 0 1 2 + 3 4 5 6 +7 8 9 10 11 + 12 13 14 15 + 16 17 18 +#+END_EXAMPLE + +#+begin_src python :results output :exports both :tangle /tmp/test_bourrin.py +def check(tab): + start = 0 + for r in [3,4,5,4,3]: + if sum(tab[start:(start+r)])!=38: + return False + start = start + r + for t in [[2,6,11],[1,5,10,15],[0,4,9,14,18],[3,8,13,17],[7,12,16]]: + if sum([tab[i] for i in t])!=38: + return False + for t in [[7,3,0],[1,4,8,12],[2,5,9,13,16],[6,10,14,17],[11,15,18]]: + if sum([tab[i] for i in t])!=38: + return False + return True + +def generate(tab,i): + if i>=len(tab): + if check(tab): + print(tab) + else: + for j in range(i,len(tab)): + tab[i],tab[j] = tab[j],tab[i] + generate(tab,i+1) + tab[i],tab[j] = tab[j],tab[i] + +generate([3, 17, 18, 19, 7, 1, 11, 16, 2, 5, 6, 9, 12, 4, 8, 14, 10, 13, 15],0) +#+end_src + +Well, except that this is actually going to take a monstrously long time.
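As a quick sanity check (a sketch added here, not in the original session), one can write out all 15 alignments of the hexagon explicitly, assuming the 19 cells are numbered 0-18 row by row, and verify that the starting permutation used above is indeed the known magic solution:

```python
# Sketch: the 15 alignments of a magic hexagon whose 19 cells are
# numbered 0-18 row by row (the 5 rows, then the two diagonal directions).
LINES = [
    [0, 1, 2], [3, 4, 5, 6], [7, 8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18],
    [0, 3, 7], [1, 4, 8, 12], [2, 5, 9, 13, 16], [6, 10, 14, 17], [11, 15, 18],
    [2, 6, 11], [1, 5, 10, 15], [0, 4, 9, 14, 18], [3, 8, 13, 17], [7, 12, 16],
]

def is_magic(tab):
    # A magic hexagon places 1..19 so that every alignment sums to 38.
    return all(sum(tab[i] for i in line) == 38 for line in LINES)

print(is_magic([3, 17, 18, 19, 7, 1, 11, 16, 2, 5, 6, 9, 12, 4, 8, 14, 10, 13, 15]))  # True
print(is_magic(list(range(1, 20))))  # False: the identity permutation is not magic
```

Enumerating the alignments once, as a table of index lists, is also what makes pruning-style searches easy to get right: every constraint used during the search can be cross-checked against this single list.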
On my machine: +#+begin_src sh :results output :exports both +cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq +#+end_src + +#+RESULTS: +: 3300000 + +So, under the ultra-optimistic assumption that I could check one +permutation per clock cycle, I would need: +#+begin_src R :results output :session :exports both +factorial(19)/3300000/24/3600/365 +#+end_src + +#+RESULTS: +: [1] 1168.891 + +More than 1000 years. Well, Moore's law will eventually help us, but +not by much. :) + +***** Permutation generation with early pruning +To prune branches as early as possible, we represent the hexagon by an +array numbered like this: +#+BEGIN_EXAMPLE + 0 1 2 + 11 12 13 3 +10 17 18 14 4 + 9 16 15 5 + 8 7 6 +#+END_EXAMPLE + +And we note that we only have a branching choice for 0, 1, 3, 5, +7, 9, and 12. All the others are determined by the previous ones. + +#+begin_src python :results output :exports both :tangle /tmp/test_rapide.py +def assign(tab, i, x): + if x in tab[i:len(tab)]: + for j in range(i,len(tab)): + if(tab[j]==x): + tab[i],tab[j] = tab[j],tab[i] + generate(tab,i+1) + tab[i],tab[j] = tab[j],tab[i] + return + +def generate(tab,i): + # print(i) + if i>=len(tab): + print(tab) + else: + if i in [0,1,3,5,7,9,12]: + for j in range(i,len(tab)): + tab[i],tab[j] = tab[j],tab[i] + generate(tab,i+1) + tab[i],tab[j] = tab[j],tab[i] + elif i in [2,4,6,8,10]: + x = 38 - (tab[i-1]+tab[i-2]) + assign(tab,i,x) + elif i==11: + x = 38 - (tab[i-1]+tab[0]) + assign(tab,i,x) + elif i==13: + x = 38 - (tab[11]+tab[12]+tab[3]) + assign(tab,i,x) + elif i==14: + x = 38 - (tab[1]+tab[13]+tab[5]) + assign(tab,i,x) + elif i==15: + x = 38 - (tab[3]+tab[14]+tab[7]) + assign(tab,i,x) + elif i==16: + x = 38 - (tab[5]+tab[15]+tab[9]) + assign(tab,i,x) + elif i==17: + x = 38 - (tab[7]+tab[16]+tab[11]) + if x+tab[9]+tab[12]+tab[1]!=38: + return + assign(tab,i,x) + elif i==18: + if tab[10]+tab[17]+tab[18]+tab[14]+tab[4]==38 and \
tab[0]+tab[12]+tab[18]+tab[15]+tab[6]==38 and \ + tab[2]+tab[13]+tab[18]+tab[16]+tab[8]==38: + generate(tab,i+1) + +generate(range(1,20),0) +#+end_src + + +And now, how long does it take to get the solution? +#+begin_src sh :results output :exports both +cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor +cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq +time python /tmp/test_rapide.py 2>&1 +#+end_src + +#+RESULTS: +#+begin_example +performance +3300000 +[3, 17, 18, 11, 9, 14, 15, 13, 10, 12, 16, 19, 7, 1, 6, 8, 4, 2, 5] +[3, 19, 16, 12, 10, 13, 15, 14, 9, 11, 18, 17, 7, 2, 4, 8, 6, 1, 5] +[9, 11, 18, 17, 3, 19, 16, 12, 10, 13, 15, 14, 6, 1, 7, 2, 4, 8, 5] +[9, 14, 15, 13, 10, 12, 16, 19, 3, 17, 18, 11, 6, 8, 4, 2, 7, 1, 5] +[10, 12, 16, 19, 3, 17, 18, 11, 9, 14, 15, 13, 4, 2, 7, 1, 6, 8, 5] +[10, 13, 15, 14, 9, 11, 18, 17, 3, 19, 16, 12, 4, 8, 6, 1, 7, 2, 5] +[15, 13, 10, 12, 16, 19, 3, 17, 18, 11, 9, 14, 8, 4, 2, 7, 1, 6, 5] +[15, 14, 9, 11, 18, 17, 3, 19, 16, 12, 10, 13, 8, 6, 1, 7, 2, 4, 5] +[16, 12, 10, 13, 15, 14, 9, 11, 18, 17, 3, 19, 2, 4, 8, 6, 1, 7, 5] +[16, 19, 3, 17, 18, 11, 9, 14, 15, 13, 10, 12, 2, 7, 1, 6, 8, 4, 5] +[18, 11, 9, 14, 15, 13, 10, 12, 16, 19, 3, 17, 1, 6, 8, 4, 2, 7, 5] +[18, 17, 3, 19, 16, 12, 10, 13, 15, 14, 9, 11, 1, 7, 2, 4, 8, 6, 5] +1.25user 0.00system 0:01.26elapsed 99%CPU (0avgtext+0avgdata 6668maxresident)k +0inputs+0outputs (0major+873minor)pagefaults 0swaps +#+end_example + +***** Solutions found by other people +Finally, a bit of googling gives us this: + +http://codegolf.stackexchange.com/questions/6304/code-solution-for-the-magic-hexagon + +The C++ solution is fundamentally the same but without recursion, +i.e., it inlines the 9 loops and uses a macro to lighten the code.
It finds the same solutions as ours, but 60 +times faster: + +#+begin_src cpp :results output :exports both :tangle /tmp/test_cpp.cpp +#include +#define LOOP(V) for(int V=1;V<20;V++){if(m&1<&1 +#+end_src + +#+RESULTS: +#+begin_example +3 17 18 19 7 1 11 16 2 5 6 9 12 4 8 14 10 13 15 +3 19 16 17 7 2 12 18 1 5 4 10 11 6 8 13 9 14 15 +9 11 18 14 6 1 17 15 8 5 7 3 13 4 2 19 10 12 16 +9 14 15 11 6 8 13 18 1 5 4 10 17 7 2 12 3 19 16 +10 12 16 13 4 2 19 15 8 5 7 3 14 6 1 17 9 11 18 +10 13 15 12 4 8 14 16 2 5 6 9 19 7 1 11 3 17 18 +15 13 10 14 8 4 12 9 6 5 2 16 11 1 7 19 18 17 3 +15 14 9 13 8 6 11 10 4 5 1 18 12 2 7 17 16 19 3 +16 12 10 19 2 4 13 3 7 5 8 15 17 1 6 14 18 11 9 +16 19 3 12 2 7 17 10 4 5 1 18 13 8 6 11 15 14 9 +18 11 9 17 1 6 14 3 7 5 8 15 19 2 4 13 16 12 10 +18 17 3 11 1 7 19 9 6 5 2 16 14 8 4 12 15 13 10 +0.02user 0.00system 0:00.02elapsed 100%CPU (0avgtext+0avgdata 1284maxresident)k +0inputs+0outputs (0major+64minor)pagefaults 0swaps +#+end_example + +Entered on [2015-12-22 mar. 17:59] +*** 2015-12-23 mercredi +**** Parrot :HOME: +Called on 23/12/15 around 17:35. Case number: 516 440 +* 2016 +** 2016-02 février +*** 2016-02-17 mercredi +**** Hangman programming with Rémi :Teaching:Python:ATTACH: +:PROPERTIES: +:Attachments: lst.txt +:ID: 0032c718-137f-464c-ab82-5cd3b378a222 +:END: +***** Code +#+begin_src python :results output :exports both :tangle /tmp/pendu.py +from random import * +from sys import stdin + +def valide(mot): + for l in mot: + ok = 0 + for lp in ["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z","-"]: + if(l==lp): ok=1 + if(ok==0): return 0 + return 1 + +def lit_dictionnaire(nom_fichier): + f = open(nom_fichier, 'r') + L = [] + for line in f: + mot = line.rstrip() + if(valide(mot)): + L.append(mot) + + # print "Il y a " + str(len(L)) + " mots dans mon dictionnaire." + # print "Le premier est '" + L[0] + "'." + # print "Le dixieme est '" + L[9] + "'."
+ return L + +def choisit_mot(dico): + return dico[int(len(dico)*random())] + +def trouve(lettre,lettres_autorisees): + for l in lettres_autorisees: + if(lettre==l): + return 1 + return 0 + +def enleve(l,L): + for i in range(0,len(L)): + if(l==L[i]): + return(L[:i]+L[(i+1):]) + +def lit_lettre(lettres_autorisees): + print(lettres_autorisees) + input = stdin.readline().rstrip() + while (len(input)!=1 or (trouve(input,lettres_autorisees)!=1) ): + print trouve(input,lettres_autorisees) + print "Imbecile! Donne moi UNE lettre et qui soit autorisee!" + input = stdin.readline().rstrip() + return input + +def remplace(mot_joueur, l, mot): + # print ">>> ("+mot_joueur+","+l+","+mot+")" + a_trouve = 0 + for i in range(0,len(mot)): + if(mot[i] == l): + # print "Youpi!!! j'ai trouve: "+mot[i] + # print " "+mot_joueur + mot_joueur = mot_joueur[:i] + l + mot_joueur[(i+1):] + a_trouve = 1 + # print " "+mot_joueur + return (mot_joueur,a_trouve) + + +def motif_ok(mot,motif): + if(len(mot)!=len(motif)): + return 0 + for i in range(0,len(motif)): + if(motif[i]!="#"): + if(mot[i]!=motif[i]): + return 0; + return 1 + +def lettres_exclues_ok(mot,lettres_exclues): + for l in mot: + for le in lettres_exclues: + if l==le: + return 0 + return 1 + +def filtre(dictionnaire,motif,lettres_exclues): + nouveau_dico = [] + for mot in dictionnaire: + if(motif_ok(mot,motif) and lettres_exclues_ok(mot, lettres_exclues)): + nouveau_dico.append(mot) + return nouveau_dico + +def conseille_stupide(mots_possibles, lettres_possibles): + return lettres_possibles[0] + +def lettre_dans_mot(l,mot): + for lm in mot: + if(lm==l): + return 1 + return 0 + +def conseille(mots_possibles, lettres_possibles): + nombre_mots = len(mots_possibles) + + if nombre_mots==1: + for l in lettres_possibles: + if(lettre_dans_mot(l,mots_possibles[0])): + return l + def compte(l,mots_possibles): + num = 0 + for mot in mots_possibles: + if lettre_dans_mot(l,mot)==1: + num=num+1 + return num + + nombre_mots_avec_la_bonne_lettre = [] 
+ score = [] + for l in lettres_possibles: + num = compte(l,mots_possibles) + nombre_mots_avec_la_bonne_lettre.append(num) + score.append(abs(num-nombre_mots/2.0)) + + score_min = score[0]+.1 + i_min = 0 + for i in range(0,len(lettres_possibles)): + if(score[i]freq_max): + freq_max = frequence_lettre[i] + i_max = i + # print lettres_possibles + # print nombre_mots + # print nombre_mots_avec_la_bonne_lettre + # print score + # print i_min + return lettres_possibles[i_max] + +def jeu(dictionnaire,mot,mode): + mot_joueur = "#" * len(mot) + for i in range(0,len(mot)): + if mot[i]=="-": mot_joueur = mot_joueur[:i] + "-" + mot_joueur[(i+1):] + + lettres_autorisees = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"] + lettres_exclues = [] + max_erreur=18 + erreur=0 + mots_possibles=dictionnaire + + # print mot + + while(mot_joueur != mot): + if mode=="interactif": print mot_joueur + " | Nombre d'erreurs autorisees restant : " + str(max_erreur-erreur) + mots_possibles = filtre(mots_possibles,mot_joueur,lettres_exclues) + if mode=="interactif": print "Il reste " + str(len(mots_possibles)) + " mot(s)" + if mode=="interactif": lettre_conseillee = conseille(mots_possibles,lettres_autorisees) + if mode=="frequence": lettre_conseillee = conseille_freq(mots_possibles,lettres_autorisees) + if mode=="dichotomie": lettre_conseillee = conseille(mots_possibles,lettres_autorisees) + + if mode=="interactif": print " Conseil: " + lettre_conseillee + if mode=="interactif": lettre = lit_lettre(lettres_autorisees) + else: lettre=lettre_conseillee + (mot_joueur,a_trouve) = remplace(mot_joueur, lettre, mot) + lettres_autorisees = enleve(lettre,lettres_autorisees) + if(a_trouve==0): + erreur += 1 + lettres_exclues.append(lettre) + if(erreur==max_erreur): + if mode=="interactif": print "Tu as perdu!!!!" 
+ if mode=="interactif": print "C'etait : " + mot + return erreur + if mode=="interactif": print mot_joueur + " | Nombre d'erreurs autorisees restant : " + str(max_erreur-erreur) + if mode=="interactif": print "Bravo!!!!" + return erreur + + +def main(): + mon_dico = lit_dictionnaire("/home/alegrand/Hacking/boggle/Words.txt"); + while(1): + mot = choisit_mot(mon_dico); + jeu(mon_dico,mot,"interactif") + +def main2(): + mon_dico = lit_dictionnaire("/home/alegrand/Hacking/boggle/Words.txt"); + while(1): + mot = choisit_mot(mon_dico); + freq = jeu(mon_dico,mot,"frequence") + dicho = jeu(mon_dico,mot,"dichotomie") + print mot + " , " + str(freq) + " , " + str(dicho) + +main2() +### Quelques constructions equivalentes + +# i=0 +# while(i<10): +# print i +# i=i+1 +# +# for i in range(0,10): +# print i + +# for i in range(0,len(liste)): +# print liste[i] +# +# for mot in liste: +# print mot + +#+end_src + +J'exécute et j'arrête au boût de 3 minutes +#+begin_src sh :results output :exports both +python pendu.py > lst.txt +#+end_src + +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 400 :height 400 :session +library(ggplot2) +df=read.csv("data/00/32c718-137f-464c-ab82-5cd3b378a222/lst.txt",strip.white=T,header=F) +names(df)=c("mot","freq","dicho") +ggplot(data=df,aes(x=freq,y=dicho)) + geom_point(alpha=.3) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-9398SP8/figure9398caH.png]] + + +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 400 :height 400 :session +ggplot(data=df,aes(x=freq-dicho)) + geom_histogram() +#+end_src + +#+RESULTS: +[[file:/tmp/babel-9398SP8/figure9398QDg.png]] + + +#+begin_src R :results output :session :exports both +summary(df) +#+end_src + +#+RESULTS: +: mot freq dicho +: bionique : 2 Min. :0.000 Min. 
: 0.000 +: crevassait: 2 1st Qu.:0.000 1st Qu.: 1.000 +: pin : 2 Median :1.000 Median : 3.000 +: primevere : 2 Mean :1.296 Mean : 3.087 +: terrien : 2 3rd Qu.:2.000 3rd Qu.: 4.000 +: abattement: 1 Max. :8.000 Max. :11.000 +: (Other) :836 + +#+begin_src R :results output :session :exports both +X=df$freq-df$dicho +summary(X) +mean(X) +err = sd(X)/sqrt(length(X)) +mean(X) - 2*err +mean(X) + 2*err +#+end_src + +#+RESULTS: +: Min. 1st Qu. Median Mean 3rd Qu. Max. +: -10.000 -3.000 -1.000 -1.791 0.000 7.000 +: [1] 2.304793 +: [1] 0.07919362 +***** Links +http://web.stanford.edu/class/cs106l/handouts/assignment-2-evil-hangman.pdf +http://www.sharkfeeder.com/hangman/ +http://blog.wolfram.com/2010/08/13/25-best-hangman-words/ +Entered on [2016-02-17 mer. 10:14] +*** 2016-02-25 jeudi +**** Hacking screenkey :Python: +#+begin_src sh :session foo :results output :exports both +diff -u /usr/share/pyshared/Screenkey/listenkdb.py_old /usr/share/pyshared/Screenkey/listenkdb.py +#+end_src + +#+RESULTS: +#+begin_example +--- /usr/share/pyshared/Screenkey/listenkdb.py_old 2016-03-07 09:40:13.271193249 +0100 ++++ /usr/share/pyshared/Screenkey/listenkdb.py 2016-03-07 09:42:41.216862924 +0100 +@@ -230,7 +230,14 @@ + mod = mod + "Alt+" + if self.cmd_keys['super']: + mod = mod + "Super+" +- ++ ++ if self.cmd_keys['shift']: ++ if (len(key_shift)==1) and not(ord(key_normal) in range(97,123)) and not(ord(key_shift) in range(33,126)): ++ mod = mod + "Shift+" +65000): ++ mod = mod + "Shift+" +print "---------" +print key, key_shift, keysym + if self.cmd_keys['shift']: + key = key_shift + if self.cmd_keys['capslock'] \ +#+end_example + +Entered on [2016-02-25 jeu. 11:19] + +** 2016-07 juillet +*** 2016-07-19 mardi +**** [[http://rmarkdown.rstudio.com/flexdashboard/][flexdashboard: Easy interactive dashboards for R]] :WP7:WP8:R:twitter: + +Entered on [2016-07-19 mar. 
09:04] +**** Steps toward reproducible research (Karl Broman) :WP8:twitter:R: +- https://github.com/kbroman/Talk_ReproRes (slides) + - Why + - I'm sorry but I think you haven't used the right data. + - The results in Table 1 don’t seem to correspond to those in + Figure 2. + - In what order do I run these scripts? + - Where did we get this data file? + - Why did I omit those samples? + - How did I make that figure? + - Important points + - Organize your data & code + #+BEGIN_QUOTE + Your closest collaborator is you six months ago, + but you don’t reply to emails. + (paraphrasing Mark Holder) + #+END_QUOTE + - Everything with a script + If you do something once, you’ll do it 1000 times. + - Automate the process as much as you can + In addition to automating a complex process, it also documents + the process, including the dependencies among data files and + scripts. + - Turn scripts into reproducible reports + - Use version control (git/GitHub) + #+BEGIN_QUOTE + The most important tool is the mindset, + when starting, that the end product + will be reproducible. + – Keith Baggerly + #+END_QUOTE + +- [[http://kbroman.org/steps2rr/][initial steps toward reproducible research]] (lecture) +- https://github.com/kbroman/Tools4RR (Materials for a one-credit + course on reproducible research) + +Entered on [2016-07-19 mar. 09:06] +**** [[http://michaellevy.name/blog/teaching-r-to-200-students-in-a-week/][Teaching R to 200 students in a week • Michael Levy]] :Teaching:R: +Link from Michael Blum +- Motivation precedes detail: “Here’s what you’re going to learn to do this week” +- Live coding: shows that I make mistakes, builds in flexibility, forces to slow down +- Live code piped to their browsers (dropbox? pad? floobits?) 
+- In-class exercises instead of lectures +- Stickies and good assistants +- Daily feedback: Each day, I asked the students to fill out a quick + survey: How well do you understand what was taught today, what’s + working for you, and what could use a change? +- Advanced exercises for experts. + +Entered on [2016-07-19 mar. 09:36] +* 2017 +** 2017-01 janvier +*** 2017-01-09 lundi +**** Ridge regression :Stats:R: +http://web.as.uky.edu/statistics/users/pbreheny/764-F11/notes/9-1.pdf + +Ridge regression penalizes the size of the regression coefficients, +which is convenient in the presence of multicollinearity. + +- http://www.few.vu.nl/~wvanwie/Courses/HighdimensionalDataAnalysis/WNvanWieringen_HDDA_Lecture4_RidgeRegression_20162017.pdf +- https://arxiv.org/pdf/1509.09169.pdf + +Entered on [2017-01-09 lun. 22:10] + +*** 2017-01-19 jeudi +**** Reading group: [[file:public_html/readings/Jackson_Networks.pdf][Social and Economic Networks]] (session 1) :POLARIS: +***** Chapter 1: Introduction +- Example from the Medici graph +- degree can be seen as a measure of the influence of the node, but a + probably more interesting notion is the betweenness, which indicates + the probability that the path between a pair of people goes through + you when they communicate: + + b(k) = 1/((n-1)(n-2)/2) \sum_{i\neq j} (number of shortest paths from i to j going through + k)/(number of shortest paths from i to j) +***** Chapter 2: Basic notions +- notion of adjacency matrix with convenient properties. +- deg(i) = \sum_j g_{i,j} +- #triangles = tr(g^3)/6 +- Clustering: "how close are you from a clique" + - local notion of connectivity + - Cl_i(G) = (#of pairs j,k connected to i s.t. j and k are also + connected )/(d(i).(d(i)-1)/2) + = \sum_{j,k s.t. i\ne j\ne k} g_{i,j} g_{j,k} g_{k,i} / \sum_{j,k s.t.
i\ne j\ne k} g_{i,j} g_{i,k} + \approx (sum ith diag element of g^3) / (sum ith diag element of g^2) + - Cl(G) = 1/N \sum_i Cl_i(G) is then the /average clustering/ of the graph + if you average over nodes. If one considers a "star" of small + "cliques", all the nodes but the one in the center will have a + local clustering of 1 (hence the average clustering tends to 1) + whereas the graph is far from being a clique itself. + - You could average over triples directly and get the /overall + clustering/ by considering directly: + \sum_{i\ne j\ne k} g_{i,j} g_{j,k} g_{k,i} / \sum_{i\ne j\ne k} g_{i,j} g_{i,k} + This may be a better notion as the overall clustering would go to + 0... But we could totally have the reverse... +- Centrality: + - degree centrality is the average degree + - betweenness centrality is what we saw earlier + - Eigen centrality. Relates to a notion of prestige P: + + P_i (g) = \sum_{j\ne i} g_{i,j} P_j(g)/\delta_g(j) + + Hence P = \bar{g}.P with \bar{g} being the normalized graph (that + could be different if the graph is undirected or if one does not + consider \delta to be the degree d). Note that the degree satisfies + such an equation, hence you recover the degree centrality notion. + - In general, one may consider the largest eigenvalue of \bar{g}, and + the Perron-Frobenius theorem tells us that all components of the + corresponding eigenvector are positive. One may thus normalize by + the sum of the elements of the eigenvector P. +- Another notion is distance or decay centrality: + - \sum_j 1/d(i,j) +- We don't really know whether there is a link between + Eigen-centrality and Betweenness centrality. +**** MOOC, Jean-Marc Hasenfratz :Vulgarization:Teaching:WP8: +Discussions about the opportunity of making a reproducible research MOOC +on FUN.
We review what already exists: +- There are things on stats and on R + https://www.fun-mooc.fr/cours/#filter/subject/mathematiques-et-statistiques?page=1&rpp=50 + but nothing on reproducible research aspects or on literate + programming as such. +- On coursera, there is + https://www.coursera.org/specializations/jhu-data-science. It seems + to me that the first 5 courses (intro/techno, R basics, data cleaning, + EDA, literate programming) are worth it, but it is too long. +- Lorena Barba's tutorial on RR? + https://barbagroup.github.io/essential_skills_RRC/ +- If we stick to the idea that the "laboratory notebook" is the basis, + we have to decide which technology we push: knitr, jupyter or + org-mode. Hence a discussion planned with Konrad and Christophe, + which would moreover add an interesting multi-disciplinary + dimension. + +Entered on [2017-01-19 jeu. 14:29] + +** 2017-12 décembre +*** 2017-12-07 jeudi +**** Refs Recherche reproductible intéressantes :WP8: +[[https://www.jove.com/blog/2017/10/27/reproducibility-librarian-yes-that-should-be-your-next-job/][Reproducible research librarian]] + +- [[http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0038234][The Effects of FreeSurfer Version, Workstation Type, and Macintosh + Operating System Version on Anatomical Volume and Cortical Thickness + Measurements]]: + - No differences were detected between repeated single runs nor + between single runs and parallel runs on the same workstation and + for the same FreeSurfer and OS version. For the same OS version, + all Mac workstations produced identical results. However, + differences were revealed between: + - Mac and HP workstations + - FreeSurfer versions v4.3.1, v4.5.0, and v5.0.0 + - OSX 10.5.8 and OSX 10.6.4/5 + - The focus is on the importance of the errors, but not on their + origin, since everything is very closed-source.
+- [[https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-1044-7][Gene name errors are widespread in the scientific literature]] +- [[https://genomebiology.biomedcentral.com/articles/10.1186/s13059-015-0850-7][Five selfish reasons to work reproducibly]]: nice style +- Jupyter extension with Reprozip: https://www.youtube.com/watch?v=Y8YmGVYHhS8 +- [[https://github.com/pantsbuild/pex][Pex]]: an environment allowing a python script to be turned into a + standalone executable (internal packing of the libs). +- [[http://o2r.info/][O2R]]: very preliminary, of no interest. + +Entered on [2017-12-07 jeu. 11:09] + + [[file:~/org/journal.org::*Autotuning%20context:][Autotuning context:]] +**** [[file:~/Archives/Cours/maths/R/Verzani-SimpleR.pdf][simpleR – Using R for Introductory Statistics]] :Stats:R: + +A document I had not taken the time to read, but which is actually +quite well done. Interesting points about R: + +- Finding outliers by interacting with the plots: + =identify(BUSH,BUCHANAN,n=2)=. +- Using rlm or lqs for resistant regression. rlm minimizes the sum of + a fraction of the residuals rather than all of them. +- quite a few examples/exercises on testing (paired, with equal + variance or not, Cox-Wilkinson, chi2, etc.) + +Entered on [2017-12-07 jeu.
12:28] + + [[file:~/org/journal.org::*Refs%20Recherche%20reproductible%20int%C3%A9ressantes][Refs Recherche reproductible intéressantes]] +* 2018 +** 2018-10 octobre +*** 2018-10-02 mardi +**** Learning emacs lisp +https://www.gnu.org/software/emacs/manual/html_mono/elisp.html + +file:///home/alegrand/tmp/Programming%20in%20Emacs%20Lisp%20-%20https:_www.gnu.org_software_emacs_manual_html_mono_eintr.html + +Also eval some emacs lisp using M-: +***** Evaluation +#+begin_src emacs-lisp +(+ 2 2) +#+end_src + +#+RESULTS: +: 4 + +#+begin_src emacs-lisp +'(this is a quoted list) ;; a list +#+end_src + +#+begin_src emacs-lisp : +(this is a quoted list) ;; won't work as it will call "this" with args "is" "a" "quoted" "list" +#+end_src +***** Setting variables +#+begin_src emacs-lisp +(setq toto 2) +toto +#+end_src + +#+RESULTS: +: 2 + + +#+begin_src emacs-lisp +(setq toto 2) +(set 'toto' 2) +toto +#+end_src + +#+RESULTS: +: 2 + +#+begin_src emacs-lisp +(setq counter 0) ; Let's call this the initializer. +(setq counter (+ counter 1)) ; This is the incrementer. +counter ; This is the counter. +#+end_src + +#+RESULTS: +: 1 + +#+begin_src emacs-lisp +(let ((var1 2) + (var2 3)) + (+ var1 var2)) +#+end_src + +#+RESULTS: +: 5 + +#+begin_src emacs-lisp + (let ((zebra "stripes") + (tiger "fierce")) + (message "One kind of animal has %s and another is %s." + zebra tiger)) +#+end_src + +#+RESULTS: +: One kind of animal has stripes and another is fierce. + +***** Defining and calling function +#+begin_src emacs-lisp +(defun tutu() 2) +(defun tutu() '(2)) +(tutu) +#+end_src + +#+RESULTS: +| 2 | + +#+begin_src emacs-lisp +(functionp 'tutu) +#+end_src + +#+RESULTS: +: t + +***** Testing +#+begin_src emacs-lisp +(if (functionp 'tutu) (message "this is a function")) +#+end_src + +#+RESULTS: +: this is a function + +#+begin_src emacs-lisp +(defun type-of-animal (characteristic) + "Print message in echo area depending on CHARACTERISTIC. 
+ If the CHARACTERISTIC is the string \"fierce\", + then warn of a tiger." + (if (equal characteristic "fierce") + (message "It is a tiger!"))) +(type-of-animal "fierce") +;; (type-of-animal "striped") +#+end_src + +#+RESULTS: +: It is a tiger! + +#+begin_src emacs-lisp +(if (> 4 5) ; if-part + (message "4 falsely greater than 5!") ; then-part + (message "4 is not greater than 5!")) ; else-part +#+end_src + +#+RESULTS: +: 4 is not greater than 5! + +***** Useful functions +#+begin_src emacs-lisp +;; describe-function +;; describe-key +;; list-matching-lines +;; delete-window +;; point-to-register +;; eval-expression +;; car, cdr, cons +#+end_src +***** Playing with babel templates (1) +#+begin_src emacs-lisp +;; (add-to-list 'org-structure-template-alist +;; '("Y" "#+begin_src R\n?\n#+end_src")) +;; (add-to-list 'org-structure-template-alist +;; '("Y" '(tutu))) +(setq tata "2") +(add-to-list 'org-structure-template-alist + '("Y" tata)) +#+end_src + +#+RESULTS: +| Y | tata | +| Y | toto | +| Y | (quote (tutu)) | +| Y | (tutu) | +| Y | tutu | +| Y | #+begin_src R | + +#+begin_src emacs-lisp +(setq a (assoc "Y" org-structure-template-alist)) +a +#+end_src + +#+RESULTS: +| Y | tata | + +Unfortunately, when expending, the code checks whether the right value +is a string (through th stringp function). + +[[file:~/Work/org-mode/lisp/org.el::(defun%20org-try-structure-completion%20()][This is where org-structure-template-alist is used in org-mode 9.0.5's code]]. + +#+begin_src emacs-lisp +(defun org-try-structure-completion () + "Try to complete a structure template before point. +This looks for strings like \" @file Alya.f90 + !! @author Guillaume Houzeaux + !! 
@brief Ayla main +@@ -20,7 +21,13 @@ + use def_master, only : kfl_goblk + use def_master, only : kfl_gocou + use def_coupli, only : kfl_gozon ++ use mod_parall, only : PAR_MY_WORLD_RANK + implicit none ++ INTEGER :: iter,ierror ++ character*100 striter ++ character*100 iterfile ++ real :: tnow ++ real, dimension(2) :: tarray + ! + ! DLB should be disabled as we only wabnt to activate it for particular loops + ! Master does not disble to lend its resources automatically +@@ -39,6 +46,10 @@ + + call Parall(22270_ip) + ++ write(iterfile,'(a,i4.4,a)') 'iterations-', PAR_MY_WORLD_RANK, '.csv' ++ open(unit=2, file=iterfile) ++ ++ iter = 1 + optimization: do while ( kfl_goopt == 1 ) + + call Iniunk() +@@ -49,6 +60,11 @@ + time: do while ( kfl_gotim == 1 ) + + call Timste() ++ call ETIME(tarray, tnow) ++ write(2,*) tnow, tarray(1), tarray(2), PAR_MY_WORLD_RANK, iter ++ ++ write(striter, '(a,i3.3)') 'iter',iter ++ SCOREP_USER_REGION_BY_NAME_BEGIN(striter, SCOREP_USER_REGION_TYPE_PHASE) + + reset: do + call Begste() +@@ -77,6 +93,13 @@ + + call Endste() + ++ SCOREP_USER_REGION_BY_NAME_END(striter) ++ call ETIME(tarray, tnow) ++ write(2,*) tnow, tarray(1), tarray(2), PAR_MY_WORLD_RANK, iter ++ iter = iter + 1 ++ ++ ++ + call Filter(ITASK_ENDTIM) + call Output(ITASK_ENDTIM) + +@@ -91,6 +114,9 @@ + + end do optimization + ++ close(2) ++ + call Turnof() + ++ + end program Alya +Index: Sources/services/parall/par_prepro.f90 +=================================================================== +--- Sources/services/parall/par_prepro.f90 (revision 9154) ++++ Sources/services/parall/par_prepro.f90 (working copy) +@@ -23,6 +23,7 @@ + use mod_redistribute, only : redistribute + use mod_redistribute, only : gather_to_master + use mod_parall, only : PAR_WORLD_SIZE ++ use mod_parall, only : PAR_MY_WORLD_RANK + use mod_parall, only : PAR_METIS4 + use mod_parall, only : PAR_SFC + use mod_parall, only : PAR_ORIENTED_BIN +@@ -39,6 +40,13 @@ + integer(ip) :: npoin_tmp, nelem_tmp, 
nboun_tmp,ndims + real(rp) :: time1,time2,time3,time4,time5 + character(100) :: messa_integ ++ ! LUCAS ++ character*100 dfile ++ integer(ip) ii ++ ! END LUCAS ++ ++ ++ + ! + ! Intermediate variables for partitioning + ! +@@ -384,4 +392,19 @@ + end if + + ++ ! LUCAS ++ write(dfile,'(a,i4.4,a)') 'domain-', PAR_MY_WORLD_RANK, '.csv' ++ open(unit=12345, file=dfile) ++ ++ write(12345,*) "T1", PAR_MY_WORLD_RANK, nelem, npoin, nboun, npoi3-npoi2 ++ do ii = 1, nelem ++ write(12345,*) "T2", PAR_MY_WORLD_RANK, ii, ltype(ii) ++ end do ++ do ii = 1, nboun ++ write(12345,*) "T3", PAR_MY_WORLD_RANK, ii, ltypb(ii) ++ end do ++ ++ close(12345) ++ ! END LUCAS ++ + end subroutine par_prepro +#+end_example + +**** Compilation + +In the =./ThirdParties/metis-4.0/=, compile Metis: + +#+begin_src shell :results output +cd ./ThirdParties/metis-4.0/ +make +#+end_src + +In the =./Executables/unix/= Alya directory, do. + +Copy =./configure.in/config_gfortran.in= to config.in. + +Make it look like this (in the begginning): + +#+BEGIN_EXAMPLE +SCOREP = ~/install/scorep-4.0/bin/scorep +MPIF90 = ~//spack-ALYA/opt/spack/linux-debian9-x86_64/gcc-6.3.0/openmpi-3.0.0-a7g33v4ulwtb4g2verliyelvtifybrq3/bin/mpif90 +MPICC = ~//spack-ALYA/opt/spack/linux-debian9-x86_64/gcc-6.3.0/openmpi-3.0.0-a7g33v4ulwtb4g2verliyelvtifybrq3/bin/mpicc +F77 = $(SCOREP) --user --nocompiler --nopomp --noopenmp $(MPIF90) -cpp +F90 = $(SCOREP) --user --nocompiler --nopomp --noopenmp $(MPIF90) -cpp +FCOCC = $(SCOREP) --user --nocompiler --nopomp --noopenmp $(MPICC) -c +#+END_EXAMPLE + +Then, configure the compilation + +#+begin_src shell :results output +./configure nastin parall +make -j 48 +#+end_src + +The compilation fails with + +#+BEGIN_EXAMPLE +/gpfs/home/bsc21/bsc21835/alya-bsc-sfc/Sources/kernel/coupli/mod_commdom_alya.f90:489:2: + + CPLNG%sendrecv(1,6) = (current_task==ITASK_BEGSTE).and.( current_when==ITASK_BEFORE).and.& ! 
\
+ 1
+Error: Unclassifiable statement at (1)
+#+END_EXAMPLE
+
+The solution is to comment those lines (and =make= again).
+
+The compilation fails with =par_prepro.f90=, just comment (or remove)
+the line that has =use mod_par_partit_sfc,=.
+
+*** R
+
+Installed in the image.
+
+** Jequi
+* Scripts
+** Get info from the platform
+*** Get machine information before doing the experiment
+#+begin_src sh :results output :tangle scripts/get_info.sh :tangle-mode (identity #o755)
+#!/bin/bash
+# Script to get machine information before doing the experiment
+
+set +e # Don't fail fast since some information might not be available
+
+title="Experiment results"
+starpu_build=""
+inputfile=""
+host="$(hostname | sed 's/[0-9]*//g' | cut -d'.' -f1)"
+help_script()
+{
+ cat << EOF
+Usage: $0 [options] outputfile.org
+
+Script to get machine information before doing the experiment
+
+OPTIONS:
+ -h Show this message
+ -t Title of the output file
+ -s Path to the StarPU installation
+ -i Input file name if doing SimGrid simulation based on input
+EOF
+}
+# Parsing options
+while getopts "t:s:i:h" opt; do
+ case $opt in
+ t)
+ title="$OPTARG"
+ ;;
+ s)
+ starpu_build="$OPTARG"
+ ;;
+ i)
+ inputfile="$OPTARG"
+ ;;
+ h)
+ help_script
+ exit 4
+ ;;
+ \?)
+ echo "Invalid option: -$OPTARG"
+ help_script
+ exit 3
+ ;;
+ esac
+done
+
+shift $((OPTIND - 1))
+filedat=$1
+if [[ $# != 1 ]]; then
+ echo 'ERROR!' 
+ help_script + exit 2 +fi + +################################################## +# Preambule of the output file +echo "#+TITLE: $title" >> $filedat +echo "#+DATE: $(eval date)" >> $filedat +echo "#+AUTHOR: $(eval whoami)" >> $filedat +echo "#+MACHINE: $(eval hostname)" >> $filedat +echo "#+FILE: $(eval basename $filedat)" >> $filedat +if [[ -n "$inputfile" ]]; +then + echo "#+INPUTFILE: $inputfile" >> $filedat +fi +echo " " >> $filedat + +################################################## +# Collecting metadata +echo "* MACHINE INFO:" >> $filedat + +echo "** PEOPLE LOGGED WHEN EXPERIMENT STARTED:" >> $filedat +who >> $filedat +echo "############################################" >> $filedat + +echo "** ENVIRONMENT VARIABLES:" >> $filedat +env >> $filedat +echo "############################################" >> $filedat + +echo "** HOSTNAME:" >> $filedat +hostname >> $filedat +echo "############################################" >> $filedat + +if [[ -n $(command -v lstopo) ]]; +then + echo "** MEMORY HIERARCHY:" >> $filedat + lstopo --of console >> $filedat + echo "############################################" >> $filedat +fi + +if [[ -n "$starpu_build" ]]; +then + echo "** STARPU MACHINE DISPLAY:" >> $filedat + $starpu_build/bin/starpu_machine_display 1> tmp 2> /dev/null + cat tmp >> $filedat + rm -f tmp + echo "############################################" >> $filedat +fi + +if [ -f /proc/cpuinfo ]; +then + echo "** CPU INFO:" >> $filedat + cat /proc/cpuinfo >> $filedat + echo "############################################" >> $filedat +fi + +if [ -f /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor ]; +then + echo "** CPU GOVERNOR:" >> $filedat + ONLINECPUS=$(for CPU in $(find /sys/devices/system/cpu/ | grep cpu[0-9]*$); do [[ $(cat $CPU/online) -eq 1 ]] && echo $CPU; done | grep cpu[0-9]*$ | sed 's/.*cpu//') + for PU in ${ONLINECPUS}; do + echo -n "CPU frequency for cpu${PU}: " >> $filedat + cat /sys/devices/system/cpu/cpu${PU}/cpufreq/scaling_governor >> 
$filedat + done + echo "############################################" >> $filedat +fi + +if [ -f /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq ]; +then + echo "** CPU FREQUENCY:" >> $filedat + ONLINECPUS=$(for CPU in $(find /sys/devices/system/cpu/ | grep cpu[0-9]*$); do [[ $(cat $CPU/online) -eq 1 ]] && echo $CPU; done | grep cpu[0-9]*$ | sed 's/.*cpu//') + for PU in ${ONLINECPUS}; do + echo -n "CPU frequency for cpu${PU}: " >> $filedat + cat /sys/devices/system/cpu/cpu${PU}/cpufreq/scaling_cur_freq >> $filedat + done + echo "############################################" >> $filedat +fi + +if [ -f /usr/bin/cpufreq-info ]; +then + echo "** CPUFREQ_INFO" >> $filedat + cpufreq-info >> $filedat + echo "############################################" >> $filedat +fi + +if [ -f /usr/bin/lspci ]; +then + echo "** LSPCI" >> $filedat + lspci >> $filedat + echo "############################################" >> $filedat +fi + +if [ -f /usr/bin/ompi_info ]; +then + echo "** OMPI_INFO" >> $filedat + ompi_info --all >> $filedat + echo "############################################" >> $filedat +fi + +if [ -f /sbin/ifconfig ]; +then + echo "** IFCONFIG" >> $filedat + /sbin/ifconfig >> $filedat + echo "############################################" >> $filedat +fi + +if [[ -n $(command -v nvidia-smi) ]]; +then + echo "** GPU INFO FROM NVIDIA-SMI:" >> $filedat + nvidia-smi -q >> $filedat + echo "############################################" >> $filedat +fi + +if [ -f /proc/version ]; +then + echo "** LINUX AND GCC VERSIONS:" >> $filedat + cat /proc/version >> $filedat + echo "############################################" >> $filedat +fi + +if [[ -n $(command -v module) ]]; +then + echo "** MODULES:" >> $filedat + module list 2>> $filedat + echo "############################################" >> $filedat +fi +#+end_src + +** Common Scripts +*** Colors + error report +#+begin_src sh :results output :tangle scripts/colors_error_report.sh :tangle-mode (identity #o755) +#!/bin/bash 
+set -euo pipefail
+
+# Error function
+function error_script {
+ ERRORMSG=${1:-}
+ echo "Error: $ERRORMSG"
+ exit 1
+}
+
+# Check functions
+function check_var {
+ VAR=${1:-}
+ MSG=${2:-}
+ if [[ -z "$VAR" ]]; then
+ error_script "$MSG"
+ fi
+}
+
+function check_bin {
+ BIN=${1:-}
+ MSG=${2:-}
+ if [[ ! -x "$BIN" ]]; then
+ error_script "$MSG"
+ fi
+}
+
+function check_env {
+ ENV=${1:-}
+ MSG=${2:-}
+ if [ -z ${ENV+x} ]; then
+ error_script "$MSG"
+ fi
+}
+
+function info {
+ INFOMSG=${1:-}
+ echo "Info: $INFOMSG"
+}
+#+end_src
+
+*** Create experimental directory (only functions)
+#+begin_src sh :results output :tangle scripts/create_experiment_dir.sh :tangle-mode (identity #o755)
+#!/bin/bash
+
+# fail often, so we are robust
+set -euo pipefail
+
+# source functions
+DIR=$(dirname $0)
+source $DIR/colors_error_report.sh
+
+# Function to cat a file within an example block
+function org_example_file {
+ FILE=${1:-}
+ if [ -e "$FILE" ]; then
+ echo "#+BEGIN_EXAMPLE"
+ cat $FILE
+ echo "#+END_EXAMPLE"
+ fi
+}
+
+# Function that returns the unique error org filename given an experiment unique identifier
+function org_err_file {
+ UNIQUE=${1:-}
+ if [ -z "$UNIQUE" ]; then
+ error_script "Error: Unique key not passed as parameter to org_err_file."
+ fi
+ echo $(basename $UNIQUE .org)_err.org
+}
+
+# Function that returns the unique org filename given an experiment unique identifier
+function org_file {
+ UNIQUE=${1:-}
+ if [ -z "$UNIQUE" ]; then
+ error_script "Error: Unique key not passed as parameter to org_file."
+ fi
+ echo $(basename $UNIQUE .org).org
+}
+
+# Function that returns the unique get_info org filename given an experiment unique identifier
+function org_info_file {
+ UNIQUE=${1:-}
+ if [ -z "$UNIQUE" ]; then
+ error_script "Error: Unique key not passed as parameter to org_info_file."
+ fi
+ echo $(basename $UNIQUE .org)_get_info.org
+}
+
+# Function that returns the unique directory name given an experiment unique identifier
+function dir_file {
+ UNIQUE=${1:-}
+ if [ -z "$UNIQUE" ]; then
+ error_script "Error: Unique key not passed as parameter to dir_file." 
+ fi
+ echo $(basename $UNIQUE .org).dir
+}
+
+function create_experiment_dir {
+ UNIQUE=${1:-}
+
+ if [ -z "$UNIQUE" ]; then
+ error_script "Error: Unique key not passed as parameter to create_experiment_dir."
+ fi
+
+ # Preparation of the output (org and dir)
+ OUTPUTORG=$(org_file $UNIQUE)
+ OUTPUTORGDIR=$(dir_file $UNIQUE)
+ if [ -e $OUTPUTORG ]; then
+ error_script "$OUTPUTORG already exists. Please, remove it or use another unique name."
+ usage;
+ exit;
+ else
+ info "Experiment Org file: $OUTPUTORG"
+ fi
+ if [ -d $OUTPUTORGDIR ]; then
+ error_script "$OUTPUTORGDIR directory already exists. Please, remove it or use another unique name."
+ usage;
+ exit;
+ else
+ info "Experiment Directory: $OUTPUTORGDIR"
+ fi
+ # Everything seems okay, let's prepare the terrain
+ touch $OUTPUTORG
+ if [ $? -ne 0 ]; then
+ error_script "Creation of $OUTPUTORG did not work."
+ exit;
+ fi
+ mkdir $OUTPUTORGDIR
+ if [ $? -ne 0 ]; then
+ error_script "Creation of $OUTPUTORGDIR did not work."
+ exit;
+ fi
+}
+#+end_src
+*** Detect Hyperthreading
+Inspired by:
+http://unix.stackexchange.com/questions/33450/checking-if-hyperthreading-is-enabled-or-not/33509#33509
+#+begin_src sh :results output :tangle scripts/hyperthreading.sh :tangle-mode (identity #o755)
+#!/bin/bash
+CPUFILE=/proc/cpuinfo
+test -f $CPUFILE || exit 1
+NUMPHYCPU=`grep "physical id" $CPUFILE | sort -u | wc -l`
+NUMLOGCORE=`grep "processor" $CPUFILE | wc -l`
+NUMPHYCORE=`grep "core id" $CPUFILE | sort -u | wc -l`
+TOTALNUMPHYCORE=$(echo "$NUMPHYCPU * $NUMPHYCORE" | bc)
+MODEL=`grep "model name" $CPUFILE | sort -u | cut -d : -f 2- | sed "s/^[[:space:]]*//"`
+echo "This system has $NUMPHYCPU CPUs, of model \"$MODEL\"."
+echo "Each physical CPU is equipped with $NUMPHYCORE physical cores (total is $TOTALNUMPHYCORE)."
+if [ $TOTALNUMPHYCORE -ne $NUMLOGCORE ]; then
+ echo "Hyperthreading is ON. So, there are $NUMLOGCORE logical cores."
+else
+ echo "Hyperthreading is OFF." 
+fi +exit +#+end_src +*** Disable Hyperthreading +#+begin_src sh :results output :tangle scripts/disable_hyperthreading.sh :tangle-mode (identity #o755) +#!/bin/bash +DIR=$(dirname $0) + +#First, enable all cores +for PU in `find /sys/devices/system/cpu/ |grep cpu[0-9]*$`; do + echo "Enabling $PU now." + sudo zsh -c "echo 1 > ${PU}/online" +done + +HYPERTHREADING=`$DIR/hyperthreading.sh | grep -e "Hyperthreading is ON" | wc -l` +if [ $HYPERTHREADING -eq 0 ]; then + echo "Hyperthreading is OFF, so disabling is not necessary." + exit +else + echo "Hyperthreading is ON." +fi +echo "The number of PUs now is $(hwloc-ls --only PU | wc -l)." +echo "I will disable hyperthreading now." +# Disable hyperthreading +# Only run this if you are sure +# - Hyperthreading is enabled +# - Each physical core has two processing units (PU) +# - hwloc-ls is installed and reports two PU per physical core +for PU in `hwloc-ls --only PU | cat -n | grep -e "[[:space:]]*[0-9]*[02468][[:space:]]*PU" | sed -e "s/^[^(]*(P#\\([0-9]*\))/\1/"`; do + echo "Disabling PU $PU now." + sudo zsh -c "echo 0 > /sys/devices/system/cpu/cpu${PU}/online" +done +echo "The number of PUs now is $(hwloc-ls --only PU | wc -l)." +#+end_src +*** Disable TurboBoost +#+begin_src sh :results output :tangle scripts/disable_turboboost.sh :tangle-mode (identity #o755) +#!/bin/bash +DIR=$(dirname $0) + +if [ `lsmod | grep msr | wc -l` -ne 1 ]; then + echo "The =msr= module is not loaded. It should be." + exit 1; +fi + +# Get the list of online cores +ONLINECPUS=$(for CPU in $(find /sys/devices/system/cpu/ | grep -v cpu0 | grep cpu[0-9]*$); do [[ $(cat $CPU/online) -eq 1 ]] && echo $CPU; done | grep cpu[0-9]*$ | sed 's/.*cpu//') + +# Enable +for PU in ${ONLINECPUS}; do + sudo zsh -c "/usr/sbin/wrmsr -p${PU} 0x1a0 0x850089" +done + +# Disable & Check +for PU in ${ONLINECPUS}; do + echo "Disabling turbo boost mode for PU $PU." 
+ sudo zsh -c "/usr/sbin/wrmsr -p${PU} 0x1a0 0x4000850089"
+ TURBOBOOST=$(sudo zsh -c "/usr/sbin/rdmsr -p${PU} 0x1a0 -f 38:38")
+ if [[ "0" = $TURBOBOOST ]]; then
+ echo "Failed to disable turbo boost for PU number $PU. Aborting."
+ exit 1
+ fi
+done
+#+end_src
+*** CPU Selection
+#+begin_src sh :results output :tangle scripts/cpu_selection.sh :tangle-mode (identity #o755)
+#!/bin/bash
+DIR=$(dirname $0)
+
+function usage()
+{
+ echo "Input: number of CPUs to be used"
+ echo "Output: core identifiers (NUMA-aware)"
+ echo "$0 <ncpus>";
+}
+
+NCPUS=${1:-}
+if [ -z "$NCPUS" ]; then
+ echo "Error: <ncpus> is empty"
+ usage;
+ exit;
+fi
+
+# Check total number of CPUs
+NUMCPU=$($DIR/hyperthreading.sh | head -n1 | sed 's/^This system has \([0-9]*\) CPUs.*$/\1/')
+if [ $NCPUS -gt $NUMCPU ]; then
+ echo "You request $NCPUS CPUs, but this system has only $NUMCPU. Sorry."
+ exit
+fi
+
+RES=$(lscpu | grep "NUMA\ node[0-9]* " | sed "s/^.*:[[:space:]]*//" | head -n${NCPUS} | tr '\n' ' ' | sed "s/ $//")
+FINAL=$(for i in $RES; do
+ if [ $(echo $i | grep "-" | wc -l) -eq 0 ]; then
+ #just replace commas by spaces
+ echo $i | sed "s/,/ /g"
+ else
+ #use seq to replace core ranges by actual core ids
+ S=$(echo $i | sed "s/-.*$//");
+ E=$(echo $i | sed "s/^.*-//");
+ seq $S $E
+ fi
+done | tr '\n' ' ')
+echo "Use these: $FINAL"
+#+end_src
+*** Detect DVFS driver (acpi-cpufreq)
+#+begin_src sh :results output :tangle scripts/detect_acpidriver.sh :tangle-mode (identity #o755)
+#!/bin/bash
+DIR=$(dirname $0)
+
+function usage()
+{
+ echo "Check whether the acpi-cpufreq DVFS driver is in use"
+ echo "Exit status: 0 if it is, 1 otherwise"
+ echo "$0";
+}
+
+PRESENT=$(cpufreq-info | grep driver | uniq | grep acpi-cpufreq | wc -l)
+if [ $PRESENT -ne 1 ]; then
+ exit 1;
+fi
+
+exit 0
+#+end_src
+*** Set frequency to maximum
+#+begin_src sh :results output :tangle scripts/set_maximum_frequency.sh :tangle-mode (identity #o755)
+#!/bin/bash
+DIR=$(dirname $0)
+
+MAXFREQ=$(cpufreq-info | grep limits | sed -e "s/.*- //" -e "s/ 
//g" | uniq)
+
+# Get all online cores
+ONLINECPUS=$(for CPU in $(find /sys/devices/system/cpu/ | grep -v cpu0 | grep cpu[0-9]*$); do [[ $(cat $CPU/online) -eq 1 ]] && echo $CPU; done | grep cpu[0-9]*$ | sed 's/.*cpu//')
+
+# Core 0 is always online
+ONLINECPUS="0 ${ONLINECPUS}"
+
+for PU in ${ONLINECPUS}; do
+ echo "Setting the frequency of PU ${PU} to ${MAXFREQ}"
+ sudo cpufreq-set -c ${PU} -f ${MAXFREQ}
+ echo "After setting to max, the frequency is now $(cat /sys/devices/system/cpu/cpu${PU}/cpufreq/scaling_cur_freq)"
+done
+#+end_src
+** Configure multi-node experimental settings
+A script to control the experiment.
+- [X] Input is a file with node names
+
+For each machine:
+- [X] Disable turboboost
+- [X] Disable hyperthreading
+- [X] The =acpi_cpufreq= driver
+  - [X] Check for its use
+  - [X] Fix processor frequency to maximum
+- [X] Call the getinfo script after everything
+
+#+begin_src sh :results output :tangle scripts/control_experiment.sh :tangle-mode (identity #o755)
+#!/bin/bash
+DIR=$(dirname $0)
+
+function usage()
+{
+ echo "$0 <machinefile> [dest]";
+ echo " where <machinefile> is a file with a list of hostnames, one per line";
+ echo " where [dest] is a directory where captured files should be moved to";
+}
+
+##############################
+# Parameter handling section #
+##############################
+MACHINEFILE=${1:-}
+if [ -z "$MACHINEFILE" ]; then
+ echo "Error: <machinefile> is empty"
+ usage;
+ exit;
+fi
+
+DEST=${2:-}
+if [ -z "$DEST" ]; then
+ DEST="."
+fi
+
+##############################
+# For every host #
+##############################
+while IFS='' read -u10 -r HOST; do
+ # Ignore comments and empty lines
+ [[ "$(echo $HOST | sed 's/ //g')" =~ ^$ ]] && continue
+ [[ "$HOST" =~ ^#.*$ ]] && continue
+
+ # Clear anything after the colon
+ HOST=$(echo $HOST | sed "s/:.*$//g")
+
+ echo "** $HOST"
+ echo "*** passwordless ssh"
+
+ # Check if passwordless ssh is possible
+ X=$(eval ssh -o 'PreferredAuthentications=publickey' ${HOST} "echo -n"; echo $?) 
+ if [ $X -ne 0 ]; then
+ echo "passwordless ssh is not possible to ${HOST}."
+ exit ;
+ else
+ echo "${HOST} is accessible through passwordless ssh."
+ fi
+
+ echo "*** Check presence of acpi-cpufreq driver"
+
+ # Check for acpi-cpufreq
+ scp $DIR/detect_acpidriver.sh ${HOST}:/tmp/
+ PRESENT=$(eval ssh ${HOST} "/tmp/detect_acpidriver.sh ; echo $?")
+ if [ $PRESENT -ne 0 ]; then
+ echo "The driver acpi-cpufreq is not available in ${HOST}"
+ exit
+ else
+ echo "I believe ${HOST} is using the acpi-cpufreq DVFS driver."
+ fi
+
+ echo "*** Disable hyperthreading"
+
+ # Disable hyperthreading
+ scp $DIR/hyperthreading.sh ${HOST}:/tmp/
+ scp $DIR/disable_hyperthreading.sh ${HOST}:/tmp/
+ ssh ${HOST} /tmp/disable_hyperthreading.sh
+
+ echo "*** Disable turboboost"
+
+ # Disable turboboost
+ scp $DIR/disable_turboboost.sh ${HOST}:/tmp/
+ ssh ${HOST} /tmp/disable_turboboost.sh
+
+ echo "*** Set the frequency to maximum"
+
+ # Manually set the frequency to maximum
+ scp $DIR/set_maximum_frequency.sh ${HOST}:/tmp/
+ ssh ${HOST} /tmp/set_maximum_frequency.sh
+
+ echo "*** Call the get_info.sh"
+
+ # After everything, call the get_info.sh script
+ scp $DIR/get_info.sh ${HOST}:/tmp/
+ GETINFO=$(echo ${HOST} | sed "s/\./_/g" | sed "s/$/_get_info.org/")
+ ssh ${HOST} "rm -f /tmp/${GETINFO}"
+ ssh ${HOST} "/tmp/get_info.sh -t 'Get information from the ${HOST}' /tmp/${GETINFO}"
+ scp ${HOST}:/tmp/$GETINFO .
+ echo $GETINFO
+ if [ "$DEST" != "." ]; then
+ mv -f $GETINFO $DEST
+ fi
+
+done 10< ${MACHINEFILE}
+#+end_src
+** Run an Alya experiment
+Input:
+- [X] The machine file with the list of reserved machines
+- [X] The number of MPI processes to launch
+- [X] Path to Alya binary
+- [X] Path to the DAT file
+
+Everything should be run in the =/tmp/= directory, outside of NFS. The
+script will pushd to the directory of the DAT file (which should be
+outside of NFS) and run from there. 
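The calling convention can be sketched with a small dry-run helper (a minimal sketch; =machines.txt=, =/tmp/alya= and =/tmp/case/cavity.dat= are hypothetical example paths, and the real script below additionally sets the Score-P environment variables, core binding and metadata capture):

```shell
#!/bin/bash
# Sketch of how the run script assembles its mpirun invocation.
# All file names here are hypothetical examples; the helper only
# prints the command instead of executing it (dry run).
build_alya_command() {
    local machinefile=$1 np=$2 alya=$3 dat=$4
    local datdir datbase
    datdir=$(dirname "$dat")        # run from the DAT directory (outside NFS)
    datbase=$(basename "$dat" .dat) # Alya takes the problem name, not the .dat file
    echo "mpirun -np $np -machinefile $machinefile $alya $datdir/$datbase"
}

build_alya_command machines.txt 8 /tmp/alya /tmp/case/cavity.dat
```

Note that the problem name (the DAT file without its extension) is what gets passed to the binary, exactly as in =$ALYA ${DATDIR}/${DATBASE}= below.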
+
+#+begin_src sh :results output :tangle scripts/run_alya_experiment.sh :tangle-mode (identity #o755)
+#!/bin/bash
+DIR=$(dirname $0)
+
+source $DIR/create_experiment_dir.sh
+
+function usage()
+{
+ echo "$0 <machinefile> <np> <alya> <dat> <unique> <ib>";
+ echo " where <machinefile> is a file with a list of hostnames, one per line";
+ echo " where <np> is the number of MPI processes";
+ echo " where <alya> is the path to the Alya binary";
+ echo " where <dat> is the path to the DAT file";
+ echo " where <unique> is a comment without white spaces";
+ echo " where <ib> (for infiniband) is true or false";
+}
+
+function now() { date +%s; }
+
+##############################
+# Parameter handling section #
+##############################
+MACHINEFILE=${1:-}
+if [ -z "$MACHINEFILE" ]; then
+ echo "Error: <machinefile> is empty"
+ usage;
+ exit;
+fi
+MACHINEFILE=$(readlink -f $MACHINEFILE)
+
+if [ ! -e "$MACHINEFILE" ]; then
+ echo "Error: '$MACHINEFILE' does not exist."
+ usage;
+ exit
+fi
+
+NP=${2:-}
+if [ -z "$NP" ]; then
+ echo "Error: <np> is empty"
+ usage;
+ exit;
+fi
+
+ALYA=${3:-}
+if [ -z "$ALYA" ]; then
+ echo "Error: <alya> is empty"
+ usage;
+ exit;
+fi
+ALYA=$(readlink -f $ALYA)
+
+DAT=${4:-}
+if [ -z "$DAT" ]; then
+ echo "Error: <dat> is empty"
+ usage;
+ exit;
+fi
+DAT=$(readlink -f $DAT)
+
+UNIQUE=${5:-}
+if [ -z "$UNIQUE" ]; then
+ echo "Error: <unique> is empty"
+ usage;
+ exit;
+else
+ # Make sure unique is without spaces
+ UNIQUE=$(echo $UNIQUE | sed "s/ /_/g")
+fi
+
+IB=${6:-}
+if [ -z "$IB" ]; then
+ echo "Error: <ib> is empty";
+ exit;
+fi
+
+##############################
+# Create the directory #
+##############################
+# The org file of this script
+OUTPUTORG=$(org_file $UNIQUE)
+OUTPUTORGERR=$(org_err_file $UNIQUE)
+# The getinfo of where this script has been executed
+OUTPUTORGINFO=$(org_info_file $UNIQUE)
+# The place where results will be stored
+OUTPUTORGDIR=$(dir_file $UNIQUE)
+# Create the experimental directory
+create_experiment_dir $UNIQUE
+# Create the two org files
+touch $OUTPUTORG $OUTPUTORGINFO
+# From now on, all standard output goes to $OUTPUTORG 
+exec > $OUTPUTORG
+exec 2> $OUTPUTORGERR
+
+DRYRUN=false
+
+##############################
+# Prepare the experiment #
+##############################
+echo "* Prepare the environment"
+
+# Call the control_experiment script
+if [ "$DRYRUN" = "false" ]; then
+ $DIR/control_experiment.sh ${MACHINEFILE} ${OUTPUTORGDIR}
+else
+ echo "This is a dry run, not calling the control experiment"
+fi
+
+# Save the machinefile forever
+cp $MACHINEFILE ${OUTPUTORGDIR}
+
+echo "* Alya"
+echo "** Binary"
+echo "The alya binary: $ALYA"
+ls -lh $ALYA
+# ldd is rather important
+ldd $ALYA
+# Heinrich told me he saves the binary, let's do the same
+cp $ALYA $OUTPUTORGDIR
+echo "** Input"
+
+echo "*** DAT"
+echo "The contents of the dat file:"
+ls -lh $DAT
+org_example_file $DAT
+DATDIR=$(dirname $DAT)
+DATBASE=$(basename $DAT .dat)
+echo "The DAT dir is ${DATDIR}, the DAT base is ${DATBASE}"
+
+echo "*** DOM.DAT"
+DOMDAT=${DATDIR}/${DATBASE}.dom.dat
+echo "The contents of the $DOMDAT"
+ls -lh $DOMDAT
+org_example_file $DOMDAT
+
+echo "*** KER.DAT"
+KERDAT=${DATDIR}/${DATBASE}.ker.dat
+echo "The contents of the $KERDAT"
+ls -lh $KERDAT
+org_example_file $KERDAT
+
+echo "*** NSI.DAT"
+NSIDAT=${DATDIR}/${DATBASE}.nsi.dat
+echo "The contents of the $NSIDAT"
+ls -lh $NSIDAT
+org_example_file $NSIDAT
+
+if [ "$DRYRUN" = "false" ]; then
+ $DIR/get_info.sh -t "Get information from launcher host $(hostname)" $OUTPUTORGINFO
+else
+ echo "This is a dry run, not calling the get info script"
+fi
+
+##############################
+# Run the experiment #
+##############################
+echo "* ScoreP directory"
+SCOREPDIR="$HOME/scorep_${UNIQUE}"
+echo "$SCOREPDIR (finally moved to this experiment directory)"
+rm -rf $SCOREPDIR
+
+echo "* The Experiment"
+if [ "$IB" = "true" ]; then
+ OPENIB=""
+else
+ OPENIB="OMPI_MCA_btl=\"tcp,self,sm\""
+fi
+
+# btl_base_verbose is set to 40 "info"
+COMMAND="${OPENIB} mpirun --mca btl_base_verbose 40 -x SCOREP_MPI_ENABLE_GROUPS=no -x SCOREP_VERBOSE=true -x 
SCOREP_TOTAL_MEMORY=4G -x SCOREP_EXPERIMENT_DIRECTORY=${SCOREPDIR} -x SCOREP_OVERWRITE_EXPERIMENT_DIRECTORY=true -x SCOREP_ENABLE_TRACING=true --bind-to core --report-bindings -np $NP -machinefile $MACHINEFILE $ALYA ${DATDIR}/${DATBASE}"
+echo "The command to be executed: '$COMMAND'"
+echo "The epoch time now, before the execution, is: $(now)"
+echo "#+BEGIN_EXAMPLE"
+if [ "$DRYRUN" = "false" ]; then
+ eval $COMMAND
+fi
+echo "#+END_EXAMPLE"
+echo "The epoch time now, after the execution, is: $(now)"
+
+##############################
+# Copy LOGs #
+##############################
+echo "* The Logs"
+ls -lh ${DATDIR}/*.log
+cp ${DATDIR}/*.log $OUTPUTORGDIR
+
+echo "* Copy other small produced data"
+ls -lh ${DATDIR}/*.cvg ${DATDIR}/*.msh ${DATDIR}/*.res ${DATDIR}/*.sol ${DATDIR}/*.set
+cp ${DATDIR}/*.cvg ${DATDIR}/*.msh ${DATDIR}/*.res ${DATDIR}/*.sol ${DATDIR}/*.set $OUTPUTORGDIR
+
+echo "* Copy results of our manual instrumentation of alya code"
+ls -lh results_NPOIN_NELEM_NELEW_NBOUN.log
+mv -f results_NPOIN_NELEM_NELEW_NBOUN.log $OUTPUTORGDIR
+
+##############################
+# Save in the directory #
+##############################
+# Move scorep directory to the experiment directory
+mv -f $SCOREPDIR $OUTPUTORGDIR
+
+# Move org file to directory
+mv -f $OUTPUTORG $OUTPUTORGDIR
+
+# Move org err file to directory
+mv -f $OUTPUTORGERR $OUTPUTORGDIR
+
+# Move get_info org to directory
+mv -f $OUTPUTORGINFO $OUTPUTORGDIR
+
+# Copy myself to directory
+cp -f $0 $OUTPUTORGDIR
+
+# Copy all scripts to directory
+cp -f $DIR/*.sh $OUTPUTORGDIR
+
+# Exit gracefully
+exit 0
+#+end_src
+
+#+RESULTS:
+** Convert traces.otf2 to CSV (version with =pj_dump=)
+#+begin_src sh :results output :tangle scripts/otf22csv.sh :tangle-mode (identity #o755)
+#!/bin/bash
+DIR=$(dirname $0)
+
+otf22csv() {
+ pushd $(dirname $otf2)
+ otf22paje traces.otf2 | pj_dump | grep ^State | cut -d, -f2,4,5,8 | sed -e "s/ //g" -e "s/MPIRank//" | gzip > traces.csv.gz
+ popd
+}
+
+function usage()
+{
+ echo "$0 <otf2dir>"
+ echo " 
where <otf2dir> is a directory that contains .otf2 files";
+}
+
+##############################
+# Parameter handling section #
+##############################
+OTF2DIR=${1:-}
+if [ -z "$OTF2DIR" ]; then
+ echo "Error: <otf2dir> is empty"
+ usage;
+ exit;
+fi
+OTF2DIR=$(readlink -f $OTF2DIR)
+
+if [ ! -d "$OTF2DIR" ]; then
+ echo "Error: $OTF2DIR is not a directory"
+ usage
+ exit
+fi
+
+##############################
+# Check for necessary tools #
+##############################
+if [ -z "$(which pj_dump)" ]; then
+ echo "Error: pj_dump is not in the PATH"
+ exit
+fi
+if [ -z "$(which otf22paje)" ]; then
+ echo "Error: otf22paje is not in the PATH"
+ exit
+fi
+
+# Files already converted (whose CSV size is not zero)
+EXISTINGFILE=$(tempfile)
+OTF2FILE=$(tempfile)
+find $OTF2DIR -not -empty | grep csv.gz$ | sed -e "s/.csv.gz$//" | sort > $EXISTINGFILE
+find $OTF2DIR | grep \.otf2$ | sed -e "s/.otf2$//" | sort > $OTF2FILE
+
+for otf2 in $(comm -3 $OTF2FILE $EXISTINGFILE | sed "s/$/.otf2/"); do
+ echo $otf2
+ otf22csv $otf2
+done
+#+end_src
+** Convert traces.otf2 to CSV (version *without* =pj_dump=, only *otf22csv*)
+#+begin_src sh :results output :tangle scripts/otf22csv_faster.sh :tangle-mode (identity #o755)
+#!/bin/bash
+DIR=$(dirname $0)
+
+otf22csv_faster() {
+ pushd $(dirname $otf2)
+ otf22csv traces.otf2 | gzip > traces.csv.gz
+ popd
+}
+
+function usage()
+{
+ echo "$0 <otf2dir>"
+ echo " where <otf2dir> is a directory that contains .otf2 files";
+}
+
+##############################
+# Parameter handling section #
+##############################
+OTF2DIR=${1:-}
+if [ -z "$OTF2DIR" ]; then
+ echo "Error: <otf2dir> is empty"
+ usage;
+ exit;
+fi
+OTF2DIR=$(readlink -f $OTF2DIR)
+
+if [ ! 
-d "$OTF2DIR" ]; then + echo "Error: $OTF2DIR is not a directory" + usage + exit +fi + +############################## +# Check for necessary tools # +############################## +if [ -z "$(which otf22csv)" ]; then + echo "Error: otf22csv is not in the PATH" + exit +fi + +# Files already converted (whose CSV size is not zero) +EXISTINGFILE=$(tempfile) +OTF2FILE=$(tempfile) +find $OTF2DIR -not -empty | grep csv.gz$ | sed -e "s/.csv.gz$//" | sort > $EXISTINGFILE +find $OTF2DIR | grep \.otf2$ | sed -e "s/.otf2$//" | sort > $OTF2FILE + +for otf2 in $(comm -3 $OTF2FILE $EXISTINGFILE | sed "s/$/.otf2/"); do + echo $otf2 + otf22csv_faster $otf2 +done +#+end_src +** Post-processing in R +#+begin_src R :results output :session :exports both :tangle scripts/trace2summary.R :tangle-mode (identity #o755) +#!/usr/bin/Rscript +library(readr); +library(dplyr); +alya_scorep_trace_read <- function(filename) +{ + meta <- unlist(strsplit(unlist(strsplit(filename, "/"))[3], "_")); + read_csv(filename, + col_names=c("Rank", "Start", "End", "Value"), + progress=TRUE) %>% + # Transform Value to factor + mutate(Value = as.factor(Value)) %>% + # Detect begin and end of iterations + mutate(Iteration = case_when( + grepl("timste", .$Value) ~ 1, + grepl("endste", .$Value) ~ -1, + TRUE ~ 0)) %>% + group_by(Rank) %>% + # Create a logical to detect observations within iterations + mutate(Iteration = as.logical(cumsum(Iteration))) %>% + # Get only observations that belongs to some iteration + filter(Iteration == TRUE) %>% + ungroup() %>% + # Create the iteration by cumsum + mutate(Iteration = case_when( + grepl("timste", .$Value) ~ 1, + TRUE ~ 0)) %>% + group_by(Rank) %>% + mutate(Iteration = as.integer(cumsum(Iteration))) %>% + ungroup() %>% + # Define metadata + mutate(EID = meta[2], + Platform = meta[3], + Nodes = meta[4], + NP = meta[5], + Partitioning = meta[6], + Infiniband = as.logical(meta[7])); +} + +alya_scorep_trace_iterations <- function(filename) +{ + 
alya_scorep_trace_read(filename) %>% + group_by(Rank, Iteration, Platform, Nodes, NP, Partitioning, EID, Infiniband) %>% + filter(grepl("MPI_", Value)) %>% + summarize(N=n(), S=min(Start), E=max(End), Comm=sum(End-Start), Comp=(E-S)-Comm); +} + +args = commandArgs(trailingOnly=TRUE) +if (length(args) < 1) { + stop("Usage: trace2summary.R ", call.=FALSE) +} + +# +# parse arguments +# +outputFilename = args[[1]]; +inputs = unlist(args[2:length(args)]); + +df <- do.call("rbind", lapply(inputs, function(x) { alya_scorep_trace_iterations(x) })); +write.csv(df, outputFilename); +#+end_src +* Guilherme entries +* Meeting +** 2018-04-18 Arnaud and Lucas + +At http://rendez-vous.renater.fr/alyalbsfc + +Rick's goal: 10K cores in MN4 + +Eeach iteration has: Assemblage and Solver + +Rick has made a perfect LB in the Assembly phase only. + +The idea now is to make him repeat our observation methodology. + +** 2018-04-18 Paper Outline (Discussion @ Barcelona w. Arnaud and Ricard) +*** General information +- Target: Physics journal/conf, multi-physics simulation + developers. Note that I think this kind of work would totally make + sense in CCPE or JPDC. +- Thoughts about our work: + - Nothing specific to Alya + - Load balance in-depth study and mitigation is interesting as + everyone relies on METIS or SFC. + - Profiling technique not new for HPC performance experts but + interesting for physics simulation developers. + - 1D load balancing strategy not new from an algorithmic perspective + - Refinement process expected to work but could have failed as + well... + - Doing a good load estimation and a good load balance in + practice on a real code is not that easy common. +- Goal: explain how this can be done as it could apply to many other + codes +*** Outline +**** Intro +**** Context :Guillaume:Ricard: + - Alya structure + - 3 typical test cases) + - respiratory (up to ~1 K cpus - ~25M elem. mesh) + - combustor (up to ~3K cpus - ~50M elem. 
mesh) + - plane (up to ~6K cpus - 160M elem. mesh) + - Scalability not satifying regardless of the partitioner in use + (METIS, SFC). +**** In-depth Load Analysis of one test case (mostly :Lucas:Arnaud: + - Tracing is intrusive and cumbersome but an informed profiling with + e.g., Score-P is cheap and requires almost no modification of the + application. + - Illustrate the load imbalance and regularity of the application + across iterations. Illustrate the fact that the load imbalance is + different for assembly and for the sparse solver. + - Explain the reason behind the bad load balance (e.g., with SFC but + this does not depend on the partitioner). There is no magic + formula relating geometric characteristics with the actual + execution time. +**** Proposal: iterative improvement of the load distribution on an SFC :Lucas:Arnaud:Ricard: + parition based on a few iterations of the application. + - Illustration of the quick improvements of the load balance +**** Evaluation of the technique on the three test cases at different scales :Lucas:Arnaud:Ricard: + - Report both: + - the parallel execution time + - an evaluation of the load imbalance on the assembly and on the + solver to know how much gain can still be expected + - the amount of communications +**** Conclusion + - The main issue is the load estimation which seems very hard to do + without running the code for real. + - Our iterative load refinement allows significant gains but + requires a few "dry runs". Ideally, it would be possible to + rebalance the load at runtime, which is something on which we are + working. +*** Issues to solve (Ordered by priority) +- Built-in standard profiling :: Ricard will run the experiments as he + has more experience in this but we need systematic and precise + load profiles for both writing the paper and deciding how to + efficiently balance (e.g., balancing the assembly vs. balancing + the sparse solver vs. both at the same time). 
To this end, Lucas + should install score-p on MN4 and explain Ricard how to obtain + the traces/profiles. Arnaud and Lucas will then do the analysis + and graph generation. +- Granularity problem :: Too coarse granularity is the best + explanation (simpler than unstable memory ordering) for the lack + of good performance at scale and the "jumps" we observed with + Lucas. It seems there was some memory issue preventing Ricard to + run with a finer granularity (bin size) +- Pre-processing :: Solve pre-processing problems to run SFC on large + cases. +*** Possible future work +- Could DLB be applied with our static iterative improvemet (e.g., by + grouping heavily loaded nodes with lightly loaded ones if we cannot + solve our granularity problem). Guillaume is not convinced as there + is some overhead. +- Would it make sense to have different bin sizes, i.e., an SFC more + or less fine grained ? If so would it be possible to adjust the bin + size when reading the data structure ? What about a simple quad-tree + that would automatically split when bins are too large ? First, we + would record where elements lie, and estimate their density locally + (Note that this could be a pre-processing step independant of the + execution of Alya...) then, we would merge the density from the + different ranks. The density could then be used to determine at + which granularity the SFC should be managed. + +* Temporal entries +** Old entries (from my journal) +*** 2017-02-06 Alya Meeting :MEET: + Two alternatives with just one physics model: + - METIS partitioning + - Space filling curves + + Tasks: + 1. Check if there is really a difference + 2. 
Compare the two .par files + - With GID or ours (ggplot2) + + Sources of problems: + - Load balancing problem + - Too many communication + - Badly distributed + + Entered on [2017-02-06 Mon 17:46] +*** 2017-02-07 Installing and Testing GID :BOOT: + +http://www.gidhome.com/download/official-versions/linux64/ + +Download the =tar.gz= version: +#+begin_src shell :results output +cd /tmp/ +wget -q ftp://www.gidhome.com/pub/GiD_Official_Versions/Linux/amd64/gid13.0.1-linux-x64.tar.gz +#+end_src + +#+RESULTS: + +Extract it +#+begin_src shell :results output +cd /tmp/ +tar xfzv gid13.0.1-linux-x64.tar.gz | head +#+end_src + +#+RESULTS: +#+begin_example +gid13.0.1-x64/ +gid13.0.1-x64/gid.exe +gid13.0.1-x64/desktop_icons.tcl +gid13.0.1-x64/scripts/ +gid13.0.1-x64/scripts/StatusLabel.tcl +gid13.0.1-x64/scripts/ContourColors.tcl +gid13.0.1-x64/scripts/units.tcl +gid13.0.1-x64/scripts/gdi/ +gid13.0.1-x64/scripts/gdi/pkgIndex.tcl +gid13.0.1-x64/scripts/DataWindows.tcl +#+end_example + +Check if there is a binary: +#+begin_src shell :results output +cd /tmp/gid13.0.1-x64/ +file gid +file gid.exe +#+end_src + +#+RESULTS: +: gid: POSIX shell script, ASCII text executable +: gid.exe: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.15, BuildID[sha1]=b5e042b34364226055f5446ec033407d51bac59c, stripped + +Let's go for the script, using the terminal. + +Tool works in my machine, I can read the examples, =render= to make them +beautiful. I tried the motor example. + +Entered on [2017-02-07 Tue 09:57] +*** 2017-02-07 Discussions avec Guillaume Houzeaux (Arnaud's Notes) :MEET: +***** Original notes +- Ouverture de compte sur marenostrum pour Lucas. Guillaume s'en + occupe et nous dit dès que c'est fait. En début de semaine prochaine + normalement. 
+- Question générale: Pourquoi les space filling curves sont tellement + plus efficaces sur leur code que METHIS alors que tous les articles + disent le contraire ? +- Use case: système respiratoire. + - visualisation (avec GiD cimne x64, téléchargeable en version démo + sans problème) des mesh/sous domaine MPI avec (fichier .par) en 3D + pour visualiser qui est qui et où ils sont + - Là, le résultat de Methis montrait qu'un des process était + responsable de zones très éloignées (au début du tube + respiratoire et dans les bronches). Est-ce que ça explique le + problème ? Pas clair... Il faudra qu'on regarde ça nous même. + - Typical size: 256 cores, 5 secondes pour quelques itérations, ce + qui devrait suffire si le partitionnement est déjà fait. + - Faire le partitionnement une bonne fois pour toute et faire un + restart d'Alya à partir de là. + - Par rapport à + http://bsccase02.bsc.es/alya/index.html#Solution_procedure, la + description de Guillaume correspond à ce qui se passe dans "solve + module 1". Il y a juste un module, pas de couplage. (M= moment, + S=Schurr). + - A priori, pas de lien entre les informations applicatives et ce + qui apparaît dans une trace paraver + - Outputs: + - un .log + - -partition.par.post.msh et -partition.par.post.res + - Compilation METHIS + #+BEGIN_SRC + cd Alya + cd thirdparties/metis-4.0; make ... + cd Extecutables/unix; cat config.h ; # Guillaume will provide a typical one + ./configure nastin parall ; make + #+END_SRC + - Run: + - Guillaume will explain us how to do the partitionning and how to + reuse them. + - See the doc: + http://bsccase02.bsc.es/alya/tutorial/partitions.html. It is + actually up-to-date + - Warning: master slave organization. Hence partition on 255 and + run on 256 procs. 
+ - Dans gensap-nastin.dat, on peut définir le NUMBER_OF_STEPS + - Input: +***** Follow-up +I've checked the site: +http://bsccase02.bsc.es/alya/index.html#Solution_procedure + +#+BEGIN_EXAMPLE + - Read files, define mesh dependent arrays + - Initial solution or read restart + - Output and postprocess + + +--- do time steps + ! + | - Compute time step + | - Begin a time step: update bc, etc. + | + | do blocks ---------------------------------+ + | | + | do coupling --------+ | Modules are grouped into blocks + | - Solve module1 | | + ! - Solve module2 | | + ! ... | | Useful if some equations are coupled + | - Check coupling cvg | | and others decoupled. Example + | end do coupling --------+ | Block 1: Nastin - Turbul + | | Block 2: Chemic: species are transported + | Goto new block | + | end do blocks ---------------------------------+ + | + | - End time step + | - Output and postprocess + | + +--- end do time steps + + - Output and postprocess + - End the run +#+END_EXAMPLE + +Then, I've checked the step-by-step tutorial to partition: +http://bsccase02.bsc.es/alya/tutorial/partitions.html + +The key line in my simple view now is this one: +#+BEGIN_EXAMPLE +TASK: ONLY_PREPROCESS, BINARY, SUBDOMAIN=num_domains +#+END_EXAMPLE +present in the =case.dat= configuration file for ALYA. +Run the MPI program without =mpirun=, using a single process. + +Then, the resulting pre-processing stage can be registered in files +that are only read, and the actual simulation can take place. +*** 2017-02-24 Marenostrum account creation :MN3: +***** E-mails from Guillaume Houzeaux :ATTACH: + :PROPERTIES: + :Attachments: Screen%20Shot%202017-02-01%20at%2014.49.05.png Screen%20Shot%202017-02-01%20at%2014.49.31.png + :ID: 789c7373-448e-44a3-9c5a-a0afa244280d + :END: +Two screen shots joined here. 
+***** Paper about load balancing in Alya :ATTACH:
+      :PROPERTIES:
+      :Attachments: IJHPCA.pdf
+      :ID:       5439e62d-768a-4a98-a77f-4890b829c6ff
+      :END:
+
+Load balancing of an MPI parallel unstructured CFD code using DLB and
+OpenMP. Marta Garcia-Gasulla, Guillaume Houzeaux, Antoni Artigues,
+Jesús Labarta and Mariano Vázquez.
+
+- unstructured meshes
+- the assembly step
+  - a loop over the elements of the mesh to compute the element
+    matrix and right-hand side
+  - and then to assemble them into the global matrix and right-hand side
+- load balancing is an issue
+  - mesh partitioning is never perfect
+  - hardware can also be a random source of imbalance
+- approach
+  - dynamic load balance library (OpenMP) \to cores of idle MPI tasks
+    are used by others
+
+***** E-mail of account creation :ATTACH:
+      :PROPERTIES:
+      :Attachments: user_responsibilities.pdf 0769_001.pdf
+      :ID:       b16bcfa4-f3db-40da-b443-8db078c817e9
+      :END:
+We would like to inform you that the accounts of your activity in the
+supercomputer Marenostrum III have been created. Here you have the basic
+information to access and use the supercomputer.
+
+When accessing the machine for the first time, an error message
+regarding ssh keys may appear. This is normal and you have to erase the
+(mn1,mn2,mn3) entries from:
+=$HOME/.ssh/known_hosts=
+
+We have 3 login nodes for access to the MareNostrum III machine,
+{mn1,mn2,mn3}.bsc.es; all connections must be done through ssh
+(Secure SHell).
+
+#+BEGIN_EXAMPLE
+ssh username@mn1.bsc.es
+ssh username@mn2.bsc.es
+#+END_EXAMPLE
+
+The usernames associated to your project led by Jose M. Cela are:
+
+  =Lucas Schnorr:bsc21835=
+
+The password will be sent in a separate mail with no subject unless you already
+had one for dl01.bsc.es. In order to change your password, you have to login at:
+
+=dl01.bsc.es=
+
+with the same username and password as in the cluster. Then, you have to use
+the 'passwd' command. The new password will become effective 10 minutes after
+the change. For security reasons the password must be changed the
+first time every user accesses the machine.
+
+MareNostrum III provides 5 interactive nodes. In order to gain access to
+each of them you need an ssh connection
+
+#+BEGIN_EXAMPLE
+ssh login5
+#+END_EXAMPLE
+
+The login nodes serve as front ends and are used typically for
+editing, compiling, preparation and submission of batch executions.
+The execution of cpu-bound programs on these nodes is not permitted;
+if some execution needs more cpu time than permitted, it needs to be
+done through the batch queue system.
+
+The basic limits in queues by job are 48 "wall clock" hours and 2048 cpus. If
+you need to submit a job exceeding 1024 cpus, then you must contact the
+support team to study the scalability and schedule its execution.
+
+No connections are allowed from inside Marenostrum III to the outside world,
+so all the file transfers have to be executed from your local machines
+and not inside Marenostrum III.
+
+Example to copy files or directories from MN to an external machine:
+
+#+BEGIN_EXAMPLE
+scp -r {username}@dt01.bsc.es:"SOURCE_MN_directory" "DEST_directory"
+#+END_EXAMPLE
+
+Example to copy files or directories from an external machine to MN:
+
+#+BEGIN_EXAMPLE
+scp -r "SOURCE_directory" {username}@dt01.bsc.es:"DEST_MN_directory"
+#+END_EXAMPLE
+
+Each project needs to inform us regularly about its evolution to let us
+know the status of each project running on the machine.
+
+All accounts are for single user access and must not be shared
+among several users. If your project needs more resources or accounts,
+only the project manager must make this kind of request. You can find
+the user's guide of MareNostrum III at the following link:
+
+  http://www.bsc.es/support/MareNostrum3-ug.pdf
+
+Remember to send us the User Responsibilities document. If we do not receive it
+15 days after the account is accepted, the access will be disabled.
+
+All BSC's staff is eager to help you in the development and success of
+your project, please contact us for any question / suggestion:
+
+  e-mail: support@bsc.es
+  Barcelona Supercomputing Center
+  C/ Jordi Girona, 31
+  08034 Barcelona
+
+Entered on [2017-02-24 Fri 10:46]
+*** 2017-02-24 Trying to compile Alya, preprocess, postprocess :ALYA:
+**** Preliminaries (with Alya compilation)
+***** Local SSH configuration for BSC supercomputer
+Put this in your =.ssh/config=:
+#+BEGIN_EXAMPLE
+Host bscdata
+     Hostname dt01.bsc.es
+     User bsc21835
+Host bsc
+     Hostname mn3.bsc.es
+     User bsc21835
+#+END_EXAMPLE
+***** Source code identification
+Currently in my Dropbox. It corresponds to a tarball Guillaume
+Houzeaux sent me a couple of months ago (the same used by M. Camelo
+in his PFE).
+#+begin_src shell :results output
+cd ~/Downloads/
+md5sum AlyaR1Source.tar.gz
+#+end_src
+
+#+RESULTS:
+: e8d20de91b56c64030d04ed1a4b1c559  AlyaR1Source.tar.gz
+***** Copy Alya Sources to MN3
+#+begin_src shell :results output
+rsync -avuzP ~/Downloads/AlyaR1Source.tar.gz bscdata:.
+ssh bscdata md5sum AlyaR1Source.tar.gz
+#+end_src
+
+#+RESULTS:
+: sending incremental file list
+: AlyaR1Source.tar.gz
+:     101,349,570 100%   29.91MB/s    0:00:03 (xfr#1, to-chk=0/1)
+:
+: sent 91,884,154 bytes  received 34 bytes  20,418,708.44 bytes/sec
+: total size is 101,349,570  speedup is 1.10
+: e8d20de91b56c64030d04ed1a4b1c559  AlyaR1Source.tar.gz
+***** Alya Compilation
+Execute the following commands there.
+
+#+begin_src shell :results output
+tar xfz AlyaR1Source.tar.gz
+cd ~/AlyaR1Source/Thirdparties/metis-4.0; make
+cd ~/AlyaR1Source/Executables/unix/;
+cp configure.in/config_ifort.in config.in
+./configure nastin parall ; make
+#+end_src
+
+I've got two binaries (g is for Debug):
+
+#+BEGIN_EXAMPLE
+-rwxr-xr-x 1 bsc21835 bsc21 8.8M Feb 24 14:09 Alya.x
+-rwxr-xr-x 1 bsc21835 bsc21  27M Feb 24 14:10 Alya.g
+#+END_EXAMPLE
+**** Preprocess the =4_tufan=
+***** Copy the test case there
+#+begin_src shell :results output
+md5sum ~/Downloads/4_tufan_run.tar.gz
+rsync -avuzP ~/Downloads/4_tufan_run.tar.gz bscdata:.
+ssh bscdata md5sum 4_tufan_run.tar.gz
+#+end_src
+
+#+RESULTS:
+: b4127cb22bed9309d1234827f41135c9  /home/schnorr/Downloads/4_tufan_run.tar.gz
+: sending incremental file list
+: 4_tufan_run.tar.gz
+:     408,525,195 100%   30.11MB/s    0:00:12 (xfr#1, to-chk=0/1)
+:
+: sent 396,055,909 bytes  received 34 bytes  27,314,202.97 bytes/sec
+: total size is 408,525,195  speedup is 1.03
+: b4127cb22bed9309d1234827f41135c9  4_tufan_run.tar.gz
+***** Unpack
+#+begin_src shell :results output
+tar xfz 4_tufan_run.tar.gz
+cd 4_tufan_run/c/
+#+end_src
+***** Preprocess using METIS 4.0
+Everything is done locally, on my laptop, because I don't know yet
+how to submit jobs in MN and Alya.x fails to run on =login= nodes.
+
+See: http://bsccase02.bsc.es/alya/tutorial/partitions.html
+- Partition preprocess step
+
+I am using METIS 4, but the tutorial tells me to use version 5.
+
+Let's proceed anyway.
+
+Create the directories to hold the partitioned meshes (for 16 domains):
+
+#+begin_src shell :results output
+cd ~/svn/guilherme/alya-imag/4_tufan_run/7/
+../../Utils/user/alya-hierarchy 116
+#+end_src
+
+#+RESULTS:
+:
+: --| alya-hierarchy |--
+: --|
+: --| 2 directories have been created
+: --|
+: --| Bye.
+: --|
+: --|
+
+This is the DAT file for preprocessing:
+
+#+begin_src shell :results output
+cd ~/svn/guilherme/alya-imag/4_tufan_run/7/
+cat lucas_preprocess_case.dat
+#+end_src
+
+#+RESULTS:
+#+begin_example
+$-------------------------------------------------------------------
+RUN_DATA
+  ALYA: sq_cyl
+  RUN_TYPE: noCONTI , PRELIMINARY, FREQUENCY=100
+  LATEX_INFO_FILE: Yes
+  LIVE_INFORMATION: Screen
+END_RUN_DATA
+$-------------------------------------------------------------------
+PROBLEM_DATA
+  TIME_COUPLING: GLOBAL, PRESCR
+  TIME_INTERVAL= 0.0,100000.0
+  TIME_STEP_SIZE= 0.00025
+  NUMBER_OF_STEPS= 100000
+  MAXIMUM_NUMBER_GLOBAL= 1
+  NASTIN_MODULE: On
+  END_NASTIN_MODULE
+  PARALL_SERVICE: On
+    PARTITION_TYPE: FACES
+    FILE_HIERARCHY: ON
+    FILE_OPEN_CLOSE: Yes
+    VIRTUAL_FILE: On, MAXIMUM_MEMORY=0.5
+    TASK: ONLY_PREPROCESS, BINARY, SUBDOMAIN=16
+  END_PARALL_SERVICE
+END_PROBLEM_DATA
+$-------------------------------------------------------------------
+#+end_example
+
+Note the customized =SUBDOMAIN=16= setting.
+
+Local execution gives this as output:
+
+#+begin_src shell :results output
+cd ~/svn/guilherme/alya-imag/4_tufan_run/7/
+mpirun -np 1 ../../Executables/unix/Alya.x lucas_preprocess_case
+#+end_src
+
+#+RESULTS:
+#+begin_example
+--|
+--| ALYA START ALYA FOR PROBLEM: sq_cyl
+--|
+--| ALYA READ PROBLEM DATA
+--| ALYA START CONSTRUCT DOMAIN
+--| ALYA READ MESH DATA
+--| ALYA READ MESH ARRAYS
+--| ALYA READ ELEMENT TYPES
+--| ALYA CHECK ELEMENT TYPES
+--| ALYA READ ELEMENT CONNECTIVITY
+--| ALYA CHECK ELEMENT CONNECTIVITY
+--| ALYA READ COORDINATES
+--| ALYA READ BOUNDARY CONNECTIVITY, TYPES AND ELEMENT CONNECTIVITY
+--| ALYA CHECK BOUNDARY TYPES
+--| ALYA CHECK BOUNDARY CONNECTIVITY
+--| ALYA CHECK CONNECTIVITY BOUNDARY/ELEMENT
+--| ALYA READ BOUNDARY CONDITIONS
+--| ALYA WARNING: KERNEL WILL EXTRAPOLATE FROM BOUNDARY CODES TO NODE CODES
+--| ALYA WARNING: KERNEL READS BOUNDARY CODES ON BOUNDARIES
+--| ALYA END READ MESH ARRAYS
+--| ALYA COMPUTE GRAPH
+--| ALYA KERNEL: COMPUTE GROUPS OF DEFLATED CG
+--| ALYA START MESH PARTITION (# OF SUBDOMAINS= 16)
+--| ALYA PARALL PREPROCESS --> ALLOCATE MEMORY...
+--| ALYA PARALL PREPROCESS --> SET WEIGHTS...
+--| ALYA MASTER COMPUTES ELEMENT GRAPH (# EDGES= 6214430, MAX # EDGES/ELEMENT= 6)
+--| ALYA MASTER PARTITIONS ELEMENT GRAPH USING FACE CONNECTIVITY
+--| ALYA MASTER COMPUTES COMMUNICATION STRATEGY
+--| ALYA MASTER ORDERS INTERIOR AND BOUNDARY NODES
+--| ALYA MASTER COMPUTES PERMUTATION ARRAYS
+--| ALYA MASTER WRITES MESH AND PARTITION DATA IN RESTART FILES WITH HIERARCHY
+--| ALYA PARALL PREPROCESS --> S/R 1 - TOT 16
+--| ALYA PARALL PREPROCESS --> S/R 2 - TOT 16
+--| ALYA PARALL PREPROCESS --> S/R 3 - TOT 16
+--| ALYA PARALL PREPROCESS --> S/R 4 - TOT 16
+--| ALYA PARALL PREPROCESS --> S/R 5 - TOT 16
+--| ALYA PARALL PREPROCESS --> S/R 6 - TOT 16
+--| ALYA PARALL PREPROCESS --> S/R 7 - TOT 16
+--| ALYA PARALL PREPROCESS --> S/R 8 - TOT 16
+--| ALYA PARALL PREPROCESS --> S/R 9 - TOT 16
+--| ALYA PARALL PREPROCESS --> S/R 10 - TOT 16
+--| ALYA PARALL PREPROCESS --> S/R 11 - TOT 16
+--| ALYA PARALL PREPROCESS --> S/R 12 - TOT 16
+--| ALYA PARALL PREPROCESS --> S/R 13 - TOT 16
+--| ALYA PARALL PREPROCESS --> S/R 14 - TOT 16
+--| ALYA PARALL PREPROCESS --> S/R 15 - TOT 16
+--| ALYA PARALL PREPROCESS --> S/R 16 - TOT 16
+--| ALYA PARALL PREPROCESS --> SEND/REC 16 - TOTAL 16
+--| ALYA PARALL PREPROCESS --> COORD, LTYPE, LNODS, LTYPB, LNODB...
+--| ALYA PARALL PREPROCESS --> LBOEL...
+--| ALYA PARALL PREPROCESS --> KFL_FIELD, XFIEL, TIME_FIELD,...
+--| ALYA PARALL PREPROCESS --> SET DATA...
+--| ALYA PARALL PREPROCESS --> BC DATA...
+--| ALYA PARALL PREPROCESS --> COMMUNICATION ARRAYS...
+--| ALYA PARALL PREPROCESS --> SEND DATA TO SLAVES... DONE.
+--| ALYA END MESH PARTITION
+--|
+--| ALYA CALCULATIONS CORRECT
+--|
+#+end_example
+
+Here are the contents of the =PAR*= dirs:
+
+#+begin_src shell :results output
+cd ~/svn/guilherme/alya-imag/4_tufan_run/7/
+ls -1hl PAR000*
+#+end_src
+
+#+RESULTS:
+#+begin_example
+PAR000000:
+total 4.0K
+-rw-r--r-- 1 schnorr schnorr 2.9K Feb 24 16:02 lucas_preprocess_case.par.rst0
+
+PAR000100:
+total 102M
+-rw-r--r-- 1 schnorr schnorr 6.4M Feb 24 16:02 lucas_preprocess_case.par.rst1
+-rw-r--r-- 1 schnorr schnorr 6.4M Feb 24 16:02 lucas_preprocess_case.par.rst10
+-rw-r--r-- 1 schnorr schnorr 6.5M Feb 24 16:02 lucas_preprocess_case.par.rst11
+-rw-r--r-- 1 schnorr schnorr 6.4M Feb 24 16:02 lucas_preprocess_case.par.rst12
+-rw-r--r-- 1 schnorr schnorr 6.3M Feb 24 16:02 lucas_preprocess_case.par.rst13
+-rw-r--r-- 1 schnorr schnorr 6.7M Feb 24 16:02 lucas_preprocess_case.par.rst14
+-rw-r--r-- 1 schnorr schnorr 6.6M Feb 24 16:02 lucas_preprocess_case.par.rst15
+-rw-r--r-- 1 schnorr schnorr 6.3M Feb 24 16:02 lucas_preprocess_case.par.rst16
+-rw-r--r-- 1 schnorr schnorr 6.3M Feb 24 16:02 lucas_preprocess_case.par.rst2
+-rw-r--r-- 1 schnorr schnorr 6.5M Feb 24 16:02 lucas_preprocess_case.par.rst3
+-rw-r--r-- 1 schnorr schnorr 6.4M Feb 24 16:02 lucas_preprocess_case.par.rst4
+-rw-r--r-- 1 schnorr schnorr 6.3M Feb 24 16:02 lucas_preprocess_case.par.rst5
+-rw-r--r-- 1 schnorr schnorr 6.5M Feb 24 16:02 lucas_preprocess_case.par.rst6
+-rw-r--r-- 1 schnorr schnorr 6.3M Feb 24 16:02 lucas_preprocess_case.par.rst7
+-rw-r--r-- 1 schnorr schnorr 6.3M Feb 24 16:02 lucas_preprocess_case.par.rst8
+-rw-r--r-- 1 schnorr schnorr 6.6M Feb 24 16:02 lucas_preprocess_case.par.rst9
+#+end_example
+
+Okay, now let's try to use them while avoiding the preprocess step.
+**** Postprocess the =4_tufan=
+***** Postprocess
+Still local (laptop), for the same reasons.
+
+See: http://bsccase02.bsc.es/alya/tutorial/partitions.html
+- Partition postprocess step
+
+#+begin_src shell :results output
+cd ~/svn/guilherme/alya-imag/4_tufan_run/7/
+cat lucas_postprocess_case.dat
+#+end_src
+
+#+RESULTS:
+#+begin_example
+$-------------------------------------------------------------------
+RUN_DATA
+  ALYA: sq_cyl
+  RUN_TYPE: noCONTI , PRELIMINARY, FREQUENCY=100
+  LATEX_INFO_FILE: Yes
+  LIVE_INFORMATION: Screen
+END_RUN_DATA
+$-------------------------------------------------------------------
+PROBLEM_DATA
+  TIME_COUPLING: GLOBAL, PRESCR
+  TIME_INTERVAL= 0.0,100000.0
+  TIME_STEP_SIZE= 0.00025
+  NUMBER_OF_STEPS=2
+  MAXIMUM_NUMBER_GLOBAL= 1
+  NASTIN_MODULE: On
+  END_NASTIN_MODULE
+
+$ this is the section that does the postprocess
+  PARALL_SERVICE: On
+    PARTITION_TYPE: FACES
+    FILE_HIERARCHY: ON
+    FILE_OPEN_CLOSE: Yes
+    VIRTUAL_FILE: On, MAXIMUM_MEMORY=0.5
+    TASK: READ_PREPROCESS, BINARY
+  END_PARALL_SERVICE
+
+$ This part has been removed to only preprocess
+$  PARALL_SERVICE: On
+$    OUTPUT_FILE: OFF
+$    POSTPROCESS: MASTER
+$    PARTITION_TYPE: FACES
+$    $ COMMUNICATION: ASYNCRONOUS
+$  END_PARALL_SERVICE
+END_PROBLEM_DATA
+$-------------------------------------------------------------------
+#+end_example
+
+The machine file I have:
+
+#+begin_src shell :results output
+cd ~/svn/guilherme/alya-imag/4_tufan_run/7/
+cat machinefile
+#+end_src
+
+#+RESULTS:
+: localhost slots=32 max_slots=32
+
+Now, let's run again with 17 procs:
+
+#+begin_src shell :results output
+cd ~/svn/guilherme/alya-imag/4_tufan_run/7/
+mpirun -machinefile machinefile -np 17 ../../Executables/unix/Alya.x lucas_postprocess_case
+#+end_src
+
+#+RESULTS:
+
+This has failed. The log tells me why:
+
+#+begin_src shell :results output
+cd ~/svn/guilherme/alya-imag/4_tufan_run/7/
+cat lucas_postprocess_case.log
+#+end_src
+
+#+RESULTS:
+#+begin_example
+
+     --------
+   |- A L Y A: sq_cyl
+     --------
+
+   --------------------------------------------------------
+
+   |- AN ERROR HAS BEEN DETECTED:
+
+      ERROR WHEN OPENING THE PARALL RESTART FILE: PAR000000/lucas_postprocess_case.par.rst0
+
+#+end_example
+
+Looks like I can't change the DAT filename.
+
+Let's rename it.
+
+The next code block is non-reproducible.
+
+#+begin_src shell :results output
+cd ~/svn/guilherme/alya-imag/4_tufan_run/7/
+#cp lucas_preprocess_case.dat lucas_preprocess_case.dat.bak
+#cp lucas_postprocess_case.dat lucas_preprocess_case.dat
+cat lucas_preprocess_case.dat
+#+end_src
+
+#+RESULTS:
+#+begin_example
+$-------------------------------------------------------------------
+RUN_DATA
+  ALYA: sq_cyl
+  RUN_TYPE: noCONTI , PRELIMINARY, FREQUENCY=100
+  LATEX_INFO_FILE: Yes
+  LIVE_INFORMATION: Screen
+END_RUN_DATA
+$-------------------------------------------------------------------
+PROBLEM_DATA
+  TIME_COUPLING: GLOBAL, PRESCR
+  TIME_INTERVAL= 0.0,100000.0
+  TIME_STEP_SIZE= 0.00025
+  NUMBER_OF_STEPS=2
+  MAXIMUM_NUMBER_GLOBAL= 1
+  NASTIN_MODULE: On
+  END_NASTIN_MODULE
+
+$ this is the section that does the postprocess
+  PARALL_SERVICE: On
+    OUTPUT_FILE: OFF
+    PARTITION_TYPE: FACES
+    FILE_HIERARCHY: ON
+    FILE_OPEN_CLOSE: Yes
+    VIRTUAL_FILE: On, MAXIMUM_MEMORY=0.5
+    TASK: READ_PREPROCESS, BINARY
+  END_PARALL_SERVICE
+
+END_PROBLEM_DATA
+$-------------------------------------------------------------------
+#+end_example
+
+Okay.
+ +Let's run it: + +#+begin_src shell :results output +cd ~/svn/guilherme/alya-imag/4_tufan_run/7/ +mpirun -machinefile machinefile -np 17 ../../Executables/unix/Alya.x lucas_preprocess_case 2>&1 > x.log +#+end_src + +#+RESULTS: + +#+begin_src shell :results output +cd ~/svn/guilherme/alya-imag/4_tufan_run/7/ +cat x.log +#+end_src + +#+RESULTS: +#+begin_example +--| +--| ALYA START ALYA FOR PROBLEM: sq_cyl +--| +--| ALYA CHECK MPI. 1 MASTER + 16 SUBDOMAINS +--| ALYA MPI IS WORKING WELL +--| ALYA READ PROBLEM DATA +--| ALYA START CONSTRUCT DOMAIN +--| ALYA MASTER/SLAVES READ MESH AND PARTITION DATA FROM RESTART FILE WITH HIERARCHY +--| ALYA MASTER/SLAVES: ALL SLAVES HAVE READ THEIR RESTART FILES +--| ALYA KERMOD: READ DATA +--| ALYA KERMOD FILE DOES NOT EXITS: USE DEFAULT OPTIONS +--| ALYA PARALL: COMPUTE INTER-COLOR COMMUNICATORS +--| ALYA PARALL: COMPUTE BIN STRUCTURE FOR PARTITIONS +--| ALYA RENUMBER ELEMENTS +--| ALYA RECOMPUTE GRAPH FOR PERIODICITY +--| ALYA CHECK ELEMENT ORDERING +--| ALYA CHECK BOUNDARY ORDERING +--| ALYA PARALL: GET HALO GEOMETRY +--| ALYA PARALL: CREATE INTRA-ZONE COMMUNICATION ARRAY +--| ALYA PARALL: CREATE INTRA-SUBDOMAIN COMMUNICATION ARRAY +--| ALYA OUTPUT MESH +--| ALYA COMPUTE LUMPED MASS MATRIX +--| ALYA COMPUTE CLOSED MASS MATRIX +--| ALYA COMPUTE EXTERIOR NORMALS +--| ALYA EXTRAPOLATE BOUNDARY CODES TO NODE CODES +--| ALYA END CONSTRUCT DOMAIN +--| ALYA MODULE DATA +--| ALYA NASTIN: READ DATA +--| ALYA NASTIN: MASTER SENDS PHYSICAL AND NUMERICAL DATA TO SLAVES +--| ALYA NASTIN: MASTER SENDS PHYSICAL AND NUMERICAL ARRAYS TO SLAVES +--| ALYA WARNINGS HAVE BEEN FOUND IN MODULE NASTIN +--| +--| ALYA ABORTED IN |----> 1 ERROR HAS BEEN FOUND IN NASTIN DATA FILE +--| +#+end_example + +**** Next steps +- Get an updated version of Alya +- Fix the partition postprocess step +- Test case + - The mesh for the respiratory system + - Adapt preprocess + postprocess to this case +- Find out how to activate the second method + for mesh partitioning 
(space filling curves) +- Reconfirm the fact that SFC are faster than METIS +** 2017-03-27 Trying to compile SVN version of Alya locally +*** Configure with gfortran +#+begin_src shell :results output +cd ~/misc/alya/alya-bsc/Executables/unix/ +rm config.in +ln -s ./configure.in/config_gfortran.in config.in +./configure nastin parall +#+end_src + +#+RESULTS: +#+begin_example + +--| Alya configure |-- + +--| Build Alya makefile + + Linux OS identified. + Using Intel Fortran Compiler ifort. + + Creating makefile for the modules demanded, when the source folder exists: + + Name: adapti ( service # 59 ) --> compile = no iffold = yes iffake = no + Name: alefor ( module # 7 ) --> compile = no iffold = yes iffake = no + Name: apelme ( module # 9 ) --> compile = no iffold = no iffake = no + Name: casimi ( module # 23 ) --> compile = no iffold = no iffake = no + Name: cgns24 ( service # 56 ) --> compile = no iffold = yes iffake = yes + Name: chemic ( module # 19 ) --> compile = no iffold = yes iffake = no + Name: codire ( module # 3 ) --> compile = no iffold = yes iffake = no + Name: commdo ( service # 65 ) --> compile = no iffold = no iffake = no + Name: coupli ( service # 64 ) --> compile = no iffold = no iffake = no + Name: dodeme ( service # 57 ) --> compile = no iffold = yes iffake = yes + Name: exmedi ( module # 5 ) --> compile = no iffold = yes iffake = no + Name: gidpos ( service # 53 ) --> compile = no iffold = yes iffake = yes + Name: gotita ( module # 11 ) --> compile = no iffold = yes iffake = no + Name: handfp ( service # 54 ) --> compile = no iffold = yes iffake = yes + Name: hdfpos ( service # 62 ) --> compile = no iffold = yes iffake = no + Name: helmoz ( module # 20 ) --> compile = no iffold = yes iffake = no + Name: immbou ( module # 21 ) --> compile = no iffold = yes iffake = no + Name: latbol ( module # 8 ) --> compile = no iffold = no iffake = no + Name: levels ( module # 14 ) --> compile = no iffold = yes iffake = no + Name: magnet ( module # 16 ) --> 
compile = no iffold = yes iffake = no + Name: nasedg ( module # 18 ) --> compile = no iffold = yes iffake = no + Name: nastal ( module # 6 ) --> compile = no iffold = yes iffake = no + Name: nastin ( module # 1 ) --> compile = yes iffold = yes iffake = no --> OK to compile! + Name: neutro ( module # 26 ) --> compile = no iffold = yes iffake = no + Name: optsol ( service # 63 ) --> compile = no iffold = yes iffake = yes + Name: parall ( service # 55 ) --> compile = yes iffold = yes iffake = yes --> OK to compile! + Name: partis ( module # 17 ) --> compile = no iffold = yes iffake = no + Name: porous ( module # 24 ) --> compile = no iffold = yes iffake = no + Name: quanty ( module # 15 ) --> compile = no iffold = yes iffake = no + Name: radiat ( module # 22 ) --> compile = no iffold = yes iffake = no + Name: shapar ( service # 51 ) --> compile = no iffold = no iffake = no + Name: solidz ( module # 10 ) --> compile = no iffold = yes iffake = no + Name: solmum ( service # 52 ) --> compile = no iffold = yes iffake = yes + Name: solpls ( service # 58 ) --> compile = no iffold = yes iffake = no + Name: temper ( module # 2 ) --> compile = no iffold = yes iffake = no + Name: turbul ( module # 4 ) --> compile = no iffold = yes iffake = no + Name: wavequ ( module # 12 ) --> compile = no iffold = yes iffake = no + Name: xxxxxx ( module # 25 ) --> compile = no iffold = no iffake = no + + Makefile will make the following binaries: + + Alya.x --> Production (non debugger) binary + Alya.g --> Debugger binary + + +--| Makefile created. Ready to make. +--| The command you typed has been saved in file conf.log +--| Bye. 
+ + +#+end_example +*** Make +#+begin_src shell :results output +cd ~/misc/alya/alya-bsc/ThirdParties/metis-4.0/; make -j 4 2>&1 +cd ~/misc/alya/alya-bsc/Executables/unix/; make -j 4 2>&1 +#+end_src + +#+RESULTS: +** 2017-03-28 OAR Job Allocation +#+begin_src shell :results output +oarsub -p "cluster='taurus'" -l "nodes=4/walltime=03:00:00" -r "$(date '+%Y-%m-%d %H:%M:%S')" -t deploy +oarsub -C 852259 +#taurus-1.lyon.grid5000.fr +#taurus-16.lyon.grid5000.fr +#taurus-2.lyon.grid5000.fr +#taurus-7.lyon.grid5000.fr +kadeploy3 --env-file images/stretch_energy.env --file ${OAR_NODE_FILE} -k +cd ~/alya-bsc/Thirdparties/metis-4.0/; make -j 4 +cd ~/alya-bsc/Executables/unix/; ln -s ./configure.in/config_gfortran.in config.in; make -j 4 +#+end_src +** 2017-03-29 Experiments to execute +#+begin_src shell :results output +for CASE in sfc metis; do + for NP in 32 64 96 120; do + NP=$((NP+1)) + UNIQUE=taurus-np${NP}-${CASE} + echo "./scripts/run_alya_experiment.sh machine-file ${NP} ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_${CASE}/fensap.dat $UNIQUE" + echo "mv $UNIQUE /home/lschnorr/" + done +done > script.sh +cat script.sh +#+end_src + +#+RESULTS: +#+begin_example +./scripts/run_alya_experiment.sh machine-file 33 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_sfc/fensap.dat taurus-np33-sfc +mv taurus-np33-sfc /home/lschnorr/ +./scripts/run_alya_experiment.sh machine-file 65 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_sfc/fensap.dat taurus-np65-sfc +mv taurus-np65-sfc /home/lschnorr/ +./scripts/run_alya_experiment.sh machine-file 97 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_sfc/fensap.dat taurus-np97-sfc +mv taurus-np97-sfc /home/lschnorr/ +./scripts/run_alya_experiment.sh machine-file 121 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_sfc/fensap.dat taurus-np121-sfc +mv taurus-np121-sfc /home/lschnorr/ +./scripts/run_alya_experiment.sh machine-file 33 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_metis/fensap.dat 
taurus-np33-metis
+mv taurus-np33-metis /home/lschnorr/
+./scripts/run_alya_experiment.sh machine-file 65 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_metis/fensap.dat taurus-np65-metis
+mv taurus-np65-metis /home/lschnorr/
+./scripts/run_alya_experiment.sh machine-file 97 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_metis/fensap.dat taurus-np97-metis
+mv taurus-np97-metis /home/lschnorr/
+./scripts/run_alya_experiment.sh machine-file 121 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_metis/fensap.dat taurus-np121-metis
+mv taurus-np121-metis /home/lschnorr/
+#+end_example
+** 2017-03-29 Quick analysis of first results
+I got these:
+
+#+name: first-results
+#+begin_src shell :results output org table
+for case in $(find $HOME/segundo/taurus-np* | grep -e "sfc.org" -e "metis.org"); do echo -n "$case "; cat $case | grep epoch | sed "s/^.*: //" | tr '\n' '-' | sed -e "s/-$//" -e "s/-/ - /" -e "s/$/\n/" | bc -l; done | sed -e "s/^.*np//" -e "s/.org//" -e "s/-/ /g" -e "s/ */|/g"
+#+end_src
+
+#+RESULTS: first-results
+#+BEGIN_SRC org
+| 121 | metis |    0 |
+| 121 | sfc   |    1 |
+|  33 | metis | 1160 |
+|  33 | sfc   |  938 |
+|  65 | metis |  653 |
+|  65 | sfc   |  612 |
+|  97 | metis |  503 |
+|  97 | sfc   |  470 |
+#+END_SRC
+
+#+header: :var df=first-results
+#+begin_src R :results output graphics :file img/first-results.png :exports both :width 600 :height 400 :session
+library(dplyr);
+library(ggplot2);
+df %>%
+  rename(NP = V1,
+         Case = V2,
+         Time = V3) %>%
+  filter(NP != 121) %>%
+  ggplot(aes(x=NP, y=Time, color=Case)) +
+  theme_bw() +
+  ylim(0,NA) +
+  geom_point() +
+  geom_line() +
+  theme(legend.position = "top");
+#+end_src
+
+#+RESULTS:
+[[file:img/first-results.png]]
+** 2017-03-29 Planning the tracing with Scorep
+
+Score-P 3.0 allows the selective instrumentation of any function using
+a GCC plugin. To do so, refer to Section 3.1 of the Score-P user
+manual.
The flag is =--instrument-filter=, described in more detail in
+Section 5.3 of the same document. This method is preferred since it
+reduces the instrumentation intrusiveness. To make this method
+available in a Score-P installation, we first need to install the
+package =gcc-6-plugin-dev= (or something similar).
+
+It is useful to disable everything except MPI and what is necessary
+to trace the filtered user functions. Guilherme, for example, had
+used this line to instrument Alya:
+
+=scorep --nopomp --nocompiler --nocuda --nopdt --nouser --noopencl=
+
+We would have to keep only:
+
+=scorep --nopomp --nocuda --noopencl --nopdt=
+
+This is the filter that Guilherme used dynamically. We want to use it
+statically.
+
+#+begin_src shell :results output :tangle alya-timestep-filters.scorep
+SCOREP_REGION_NAMES_BEGIN
+EXCLUDE *
+INCLUDE MPI COM
+INCLUDE endste_ timste_
+SCOREP_REGION_NAMES_END
+#+end_src
+
+The way to use this filter at runtime is to set the appropriate
+environment variable:
+
+#+BEGIN_EXAMPLE
+SCOREP_FILTERING_FILE=name_of_the_file
+#+END_EXAMPLE
+
+_Summary_ of what has worked:
+
+Finally, I've managed to get it working with the following steps.
+
+1. Configure Score-P with the following line
+#+BEGIN_EXAMPLE
+./configure --without-gui --prefix=/home/lschnorr/install/nova/scorep-3.0-alya/ --with-shmem=no --with-opari2=no
+#+END_EXAMPLE
+- Do not forget to install the gcc plugin package first
+
+2. Configure Alya's =config.in= to use Score-P this way
+#+BEGIN_EXAMPLE
+SCOREP=scorep --compiler --instrument-filter=/home/lschnorr/alya-bsc/Executables/unix/alya-timestep-filters.scorep
+#+END_EXAMPLE
+where the contents of the filter file are like this
+#+BEGIN_EXAMPLE
+#SCOREP_FILE_NAMES_BEGIN
+#EXCLUDE *
+#INCLUDE */Alya.f90
+#SCOREP_FILE_NAMES_END
+
+SCOREP_REGION_NAMES_BEGIN
+EXCLUDE *
+INCLUDE endste
+INCLUDE timste
+SCOREP_REGION_NAMES_END
+#+END_EXAMPLE
+
+3.
Finally, run the program this way
+#+BEGIN_EXAMPLE
+mpirun -x SCOREP_FILTERING_FILE=/home/lschnorr/alya-bsc/Executables/unix/alya-timestep-filters.scorep -x SCOREP_EXPERIMENT_DIRECTORY=/home/lschnorr/xxx-scorep7 -x SCOREP_OVERWRITE_EXPERIMENT_DIRECTORY=true -x SCOREP_ENABLE_TRACING=true --bind-to core -np 12 -machinefile /tmp/Alya-Perf/machine-file /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap
+#+END_EXAMPLE
+- The traces need to be put on the HOME NFS.
+- Filtering is not mandatory here since we instrumented at compile time.
+** 2017-04-05 Infiniband problems
+On all machines that belong to an Infiniband experiment, do:
+
+1. In file =/etc/security/limits.conf=, add these lines:
+   #+BEGIN_EXAMPLE
+   * hard memlock unlimited
+   * soft memlock unlimited
+   #+END_EXAMPLE
+
+2. In file =/etc/systemd/system.conf=, make the line look like this:
+   #+BEGIN_EXAMPLE
+   DefaultLimitMEMLOCK=infinity
+   #+END_EXAMPLE
+** 2017-04-06 Nose check
+#+name: nose_table
+ | Rank |            X |            Y |            Z |
+ |------+--------------+--------------+--------------|
+ |    1 | 0.919827E-01 | 0.115893E+00 | -.639115E-01 |
+ |    2 | 0.103425E+00 | 0.149886E+00 | -.123160E+00 |
+ |    3 | 0.103773E+00 | 0.169636E+00 | -.146991E+00 |
+ |    4 | 0.961346E-01 | 0.202625E+00 | -.216010E+00 |
+ |    5 | 0.139644E+00 | 0.204323E+00 | -.223233E+00 |
+ |    6 | 0.101879E+00 | 0.850899E-01 | 0.235732E-01 |
+ |    7 | 0.933165E-01 | 0.458008E-01 | 0.261210E-01 |
+ |    8 | 0.913297E-01 | 0.682040E-01 | 0.290498E-01 |
+ |    9 | 0.952650E-01 | 0.111200E+00 | 0.149948E-01 |
+ |   10 | 0.950596E-01 | 0.122169E+00 | -.234605E-01 |
+ |   11 | -.403059E+00 | -.400000E+00 | -.489343E+00 |
+ |   12 | 0.596941E+00 | -.400000E+00 | -.489343E+00 |
+ |   13 | 0.596941E+00 | 0.240397E+00 | -.489343E+00 |
+ |   14 | -.403059E+00 | 0.240397E+00 | -.489343E+00 |
+ |   15 | -.403059E+00 | -.400000E+00 | 0.510656E+00 |
+ |   16 | 0.596941E+00 | -.400000E+00 | 0.510656E+00 |
+ |   17 | 0.596941E+00 | 0.240397E+00 | 0.510656E+00 |
| 18 | -.403059E+00 | 0.240397E+00 | 0.510656E+00 | + +#+begin_src R :results output :session :exports both :var df=nose_table +df; +#+end_src + +#+RESULTS: +#+begin_example + Rank X Y Z +1 1 0.0919827 0.1158930 -0.0639115 +2 2 0.1034250 0.1498860 -0.1231600 +3 3 0.1037730 0.1696360 -0.1469910 +4 4 0.0961346 0.2026250 -0.2160100 +5 5 0.1396440 0.2043230 -0.2232330 +6 6 0.1018790 0.0850899 0.0235732 +7 7 0.0933165 0.0458008 0.0261210 +8 8 0.0913297 0.0682040 0.0290498 +9 9 0.0952650 0.1112000 0.0149948 +10 10 0.0950596 0.1221690 -0.0234605 +11 11 -0.4030590 -0.4000000 -0.4893430 +12 12 0.5969410 -0.4000000 -0.4893430 +13 13 0.5969410 0.2403970 -0.4893430 +14 14 -0.4030590 0.2403970 -0.4893430 +15 15 -0.4030590 -0.4000000 0.5106560 +16 16 0.5969410 -0.4000000 0.5106560 +17 17 0.5969410 0.2403970 0.5106560 +18 18 -0.4030590 0.2403970 0.5106560 +#+end_example + +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session +library(ggplot2); +plot(df); +#+end_src + +#+RESULTS: +[[file:/tmp/babel-22618hTf/figure22618i_y.png]] + +** 2017-04-06 Alya modifications to get more info, set balance to 1 :EXP10: +#+BEGIN_EXAMPLE +lschnorr@fnancy:~/alya-bsc$ git diff +diff --git a/Executables/unix/configure.in/config_gfortran.in b/Executables/unix/configure.in/config_gfortran.in +index 1c73a1a..90de238 100644 +--- a/Executables/unix/configure.in/config_gfortran.in ++++ b/Executables/unix/configure.in/config_gfortran.in +@@ -4,16 +4,17 @@ + #module load gcc/5.1.0 openmpi/1.8.5 # + ################################################################### + ++SCOREP=scorep --compiler --instrument-filter=/home/lschnorr/alya-bsc/Executables/unix/alya-timestep-filters.scorep + +-F77 = mpif90 +-F90 = mpif90 +-FCOCC = mpicc -c ++F77 = $(SCOREP) mpif90 ++F90 = $(SCOREP) mpif90 ++FCOCC = $(SCOREP) mpicc -c + FCFLAGS = -c -J$O -I$O -ffree-line-length-none -fimplicit-none + FPPFLAGS = -x f95-cpp-input + EXTRALIB = -lc + EXTRAINC = 
+-fa2p = mpif90 -c -x f95-cpp-input -DMPI_OFF -J../../Utils/user/alya2pos -I../../Utils/user/alya2pos +-fa2plk = mpif90 -lc ++fa2p = $(SCOREP) mpif90 -c -x f95-cpp-input -DMPI_OFF -J../../Utils/user/alya2pos -I../../Utils/user/alya2pos ++fa2plk = $(SCOREP) mpif90 -lc + + ################################################################### + # PERFORMANCE FLAGS # +diff --git a/Sources/kernel/parall/mod_par_partit_sfc.f90 b/Sources/kernel/parall/mod_par_partit_sfc.f90 +index ee6fd1d..e55185d 100644 +--- a/Sources/kernel/parall/mod_par_partit_sfc.f90 ++++ b/Sources/kernel/parall/mod_par_partit_sfc.f90 +@@ -383,7 +383,7 @@ contains + + iboxf = lboxf(ielem) + iboxl = lboxl(ielem) +- iweig = real(ngaus(abs(ltype_par(ielem))),rp) ++ iweig = 1_rp !real(ngaus(abs(ltype_par(ielem))),rp) + + if( PAR_MY_PARMETIS_RANK /= iboxc-1 )then + if(bufwei(iboxl*2) == 0_rp) then +diff --git a/Sources/services/parall/par_outinf.f90 b/Sources/services/parall/par_outinf.f90 +index 6664b93..301522a 100644 +--- a/Sources/services/parall/par_outinf.f90 ++++ b/Sources/services/parall/par_outinf.f90 +@@ -15,6 +15,7 @@ subroutine par_outinf() + use mod_parall + implicit none + integer(ip) :: isubd,ksmin(2),ksmax(2),kaver,koutp,nb ++ integer, parameter :: out_unit_npoin=20 + + if( IMASTER ) then + +@@ -27,7 +27,11 @@ subroutine par_outinf() + ksmin = huge(1_ip) + ksmax = -huge(1_ip) + kaver = 0 ++ ! LUCAS: manual instrumentation starts here, all code marked by LUCAS ++ open (unit=out_unit_npoin,file="results_NPOIN_NELEM_NELEW_NBOUN.log",action="write",status="replace") + do isubd=1,npart_par ++ ! 
LUCAS: Here I need to print: npoin_par(isubd) ++ write (out_unit_npoin,*) "Rank ", isubd ," NPOIN ", npoin_par(isubd), " NELEM ", nelem_par(isubd), " NELEW ", nelew_par(isubd), " NBOUN ", nboun_par(isubd), " NNEIG ", lneig_par(isubd), " NBBOU ", npoin_par(isubd) + if(npoin_par(isubd)% + select(-X1) %>% + rename(Start = S, End = E) %>% + mutate(Duration = End - Start) %>% + mutate(Platform = as.factor(Platform), + Partitioning = as.factor(Partitioning), + Nodes = as.factor(Nodes)); + +df2 <- read_csv("data/11/326040-b8e2-41f3-8479-c378f70864d9/exp_11-v1_grimoire_8.csv.gz") %>% + select(-X1) %>% + rename(Start = S, End = E) %>% + mutate(Duration = End - Start) %>% + mutate(Platform = as.factor(Platform), + Partitioning = as.factor(Partitioning), + Nodes = as.factor(Nodes)); + +df <- rbind(df1, df2); +df %>% head; +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + X1 = col_integer(), + Rank = col_integer(), + Iteration = col_integer(), + Platform = col_character(), + Nodes = col_integer(), + NP = col_integer(), + Partitioning = col_character(), + EID = col_character(), + Infiniband = col_logical(), + N = col_integer(), + S = col_double(), + E = col_double(), + Comm = col_double(), + Comp = col_double() +) +Warning message: +Missing column names filled in: 'X1' [1] +Parsed with column specification: +cols( + X1 = col_integer(), + Rank = col_integer(), + Iteration = col_integer(), + Platform = col_character(), + Nodes = col_integer(), + NP = col_integer(), + Partitioning = col_character(), + EID = col_character(), + Infiniband = col_logical(), + N = col_integer(), + S = col_double(), + E = col_double(), + Comm = col_double(), + Comp = col_double() +) +Warning message: +Missing column names filled in: 'X1' [1] +# A tibble: 6 × 14 + Rank Iteration Platform Nodes NP Partitioning EID Infiniband N + +1 0 1 grimoire 8 128 sfc 10-v2 TRUE 5454 +2 0 2 grimoire 8 128 sfc 10-v2 TRUE 4752 +3 0 3 grimoire 8 128 sfc 10-v2 TRUE 4873 +4 0 4 grimoire 8 
128 sfc 10-v2 TRUE 4697 +5 0 5 grimoire 8 128 sfc 10-v2 TRUE 4660 +6 0 6 grimoire 8 128 sfc 10-v2 TRUE 4544 +# ... with 5 more variables: Start , End , Comm , Comp , +# Duration +#+end_example +**** Plot Comp +#+begin_src R :results output graphics :file img/exp_10-v2_exp_11-v1_comp.png :exports both :width 500 :height 400 :session +library(tidyr); +library(ggplot2); +df %>% + select(-Start, -End, -N, -Duration) %>% + mutate(Total = Comm+Comp) %>% + gather(Variable, Value, -Rank, -Iteration, -Platform, -Nodes, -NP, -Partitioning, -EID, -Infiniband) %>% + filter(Variable == "Comp", Iteration != 10, Rank != 0) %>% + ggplot(aes(x=as.factor(Iteration), y=Value, color=Infiniband)) + + theme_bw(base_size=12) + + geom_point(alpha=.3) + + ylim(0,NA) + + geom_line (aes(group=as.factor(Rank):as.factor(Partitioning):as.factor(Infiniband))) + + theme(legend.position="top") + + facet_grid(EID~Partitioning) +#+end_src + +#+RESULTS: +[[file:img/exp_10-v2_exp_11-v1_comp.png]] +*** Partition data +**** Process +#+begin_src shell :results output +EDIR="exp_11-v1_grimoire_8 exp_10-v2_grimoire_8" +for file in $(find $EDIR | grep results | grep log$); do + OUTPUT=$(dirname $file)/$(basename $file .log).csv + cat $file | tr -s " " | sed "s/^ //" | tr ' ' ',' | cut -d"," -f1,3,5,7,9,11,13 | uniq > $OUTPUT + cat $file | tr -s " " | sed "s/^ //" | tr ' ' ',' | cut -d"," -f2,4,6,8,10,12,14 >> $OUTPUT + echo "=> $file <=" + head $OUTPUT +done +#+end_src + +#+RESULTS: +#+begin_example +=> exp_11-v1_grimoire_8/11-v1_grimoire_8_128_sfc_true.dir/results_NPOIN_NELEM_NELEW_NBOUN.log <= +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,32182,174168,174611,2973,16,32182 +2,20919,53529,174758,5381,15,20919 +3,19727,49972,175172,5020,11,19727 +4,21918,69049,174054,4277,23,21918 +5,21683,64345,174280,4439,14,21683 +6,22205,64122,175002,4582,18,22205 +7,21361,63432,174587,4436,11,21361 +8,22162,75699,174459,3914,8,22162 +9,21389,66339,174799,4332,8,21389 +=> 
exp_11-v1_grimoire_8/11-v1_grimoire_8_128_metis_true.dir/results_NPOIN_NELEM_NELEW_NBOUN.log <= +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,22405,85371,172276,3477,4,22405 +2,22923,86049,179769,3742,5,22923 +3,24316,99541,175416,3029,5,24316 +4,22907,90905,170125,3170,4,22907 +5,22636,86964,173194,3448,5,22636 +6,22774,88416,171711,3333,6,22774 +7,23564,91625,177865,3452,7,23564 +8,22211,83394,173364,3612,5,22211 +9,23427,89980,179685,3588,4,23427 +=> exp_10-v2_grimoire_8/10-v2_grimoire_8_128_sfc_true.dir/results_NPOIN_NELEM_NELEW_NBOUN.log <= +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,14976,72536,72796,2384,17,14976 +2,14641,72392,72575,448,12,14641 +3,22264,72647,167221,4446,16,22264 +4,28129,72464,251684,7187,11,28129 +5,23239,72488,178813,4341,25,23239 +6,26093,72525,209615,5640,14,26093 +7,22996,72391,183401,4422,10,22996 +8,21190,72393,160643,3505,7,21190 +9,25178,72612,201462,5195,10,25178 +=> exp_10-v2_grimoire_8/10-v2_grimoire_8_128_metis_true.dir/results_NPOIN_NELEM_NELEW_NBOUN.log <= +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,22405,85371,172276,3477,4,22405 +2,22923,86049,179769,3742,5,22923 +3,24316,99541,175416,3029,5,24316 +4,22907,90905,170125,3170,4,22907 +5,22636,86964,173194,3448,5,22636 +6,22774,88416,171711,3333,6,22774 +7,23564,91625,177865,3452,7,23564 +8,22211,83394,173364,3612,5,22211 +9,23427,89980,179685,3588,4,23427 +#+end_example +**** Read +#+begin_src R :results output :session :exports both +library(readr); +library(dplyr); +library(tidyr); + +read_npoin <- function(filename) +{ + meta <- unlist(strsplit(unlist(strsplit(filename, "/"))[2], "_")); + meta <- gsub(".dir", "", meta); + read_csv(filename) %>% + gather(Variable, Value, -Rank) %>% + mutate(EID = meta[1], + Platform = meta[2], + Nodes = meta[3], + NP = meta[4], + Partitioning = meta[5], + Infiniband = as.logical(meta[6])); + +} +files <- list.files("exp_10-v2_grimoire_8", pattern="results_NPOIN_NELEM_NELEW_NBOUN.csv", recursive=TRUE, full.names=TRUE); +files <- c(files, 
list.files("exp_11-v1_grimoire_8", pattern="results_NPOIN_NELEM_NELEW_NBOUN.csv", recursive=TRUE, full.names=TRUE)); +dfp <- do.call("rbind", lapply(files, function(x) { read_npoin(x) })) +dfp %>% filter(Rank == 111, Variable == "NELEW") +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +# A tibble: 4 × 9 + Rank Variable Value EID Platform Nodes NP Partitioning Infiniband + +1 111 NELEW 172833 10-v2 grimoire 8 128 metis TRUE +2 111 NELEW 258786 10-v2 grimoire 8 128 sfc TRUE +3 111 NELEW 172833 11-v1 grimoire 8 128 metis TRUE +4 111 NELEW 173994 11-v1 grimoire 8 128 sfc TRUE +#+end_example +*** Merge +Get only the first iteration +#+begin_src R :results output :session :exports both +dfz <- df %>% filter(Iteration == 1, Rank != 0) %>% + select(-Platform, -Nodes, -NP, -Infiniband, -N, -Start, -End, -Comm, -Duration, -Iteration) %>% + gather(Variable, Value, Comp, -EID, -Partitioning) %>% + select(EID, Partitioning, Rank, Variable, Value); +dfpz <- dfp %>% select(-Platform, -Nodes, -NP, -Infiniband) %>% + select(EID, Partitioning, Rank, Variable, Value); +dfm <- rbind(dfz, dfpz); +dfm %>% head; +#+end_src + +#+RESULTS: 
+: # A tibble: 6 × 5 +: EID Partitioning Rank Variable Value +: +: 1 10-v2 sfc 1 Comp 8.811678 +: 2 10-v2 sfc 2 Comp 8.560127 +: 3 10-v2 sfc 3 Comp 12.641175 +: 4 10-v2 sfc 4 Comp 15.811419 +: 5 10-v2 sfc 5 Comp 13.183812 +: 6 10-v2 sfc 6 Comp 14.227162 +*** Plot +#+begin_src R :results output :session :exports both +dfm %>% .$Variable %>% unique +#+end_src + +#+RESULTS: +: [1] "Comp" "NPOIN" "NELEM" "NELEW" "NBOUN" "NNEIG" "NBBOU" + +#+begin_src R :results output graphics :file img/exp_10-v2_exp_11-v1_comp_partition.png :exports both :width 800 :height 400 :session +dfm %>% + mutate(EID = case_when( + grepl("10-v2", .$EID) ~ gsub("$", "(Equal Weight)", .$EID), + grepl("11-v1", .$EID) ~ gsub("$", "(Original Weight)", .$EID), + TRUE ~ "Undefined")) %>% + filter(Partitioning != "metis") %>% + filter(Variable %in% c("Comp", "NELEM", "NELEW")) %>% + ggplot(aes(x=Rank, y=Value, color=EID)) + + theme_bw(base_size=12) + + geom_point() + + ylim(0,NA) + + ggtitle ("SFC only, NP=128, 8-nodes@grimoire, Infiniband") + + theme(legend.position="top") + + facet_grid(Variable~EID, scales="free_y") +#+end_src + +#+RESULTS: +[[file:img/exp_10-v2_exp_11-v1_comp_partition.png]] + +** 2017-04-07 Check alternative Alya instrumentation :EXP12: +*** First try - no imbrication +#+begin_src shell :results output +ssh nancy.g5k cat ./alya-bsc/Executables/unix/all.scorep +#+end_src + +#+RESULTS: +#+begin_example +#SCOREP_FILE_NAMES_BEGIN +#EXCLUDE * +#INCLUDE */Alya.f90 +#SCOREP_FILE_NAMES_END + +SCOREP_REGION_NAMES_BEGIN +EXCLUDE * +INCLUDE endste +INCLUDE timste +INCLUDE doiter +INCLUDE concou +INCLUDE conblk +INCLUDE begzon +INCLUDE endzon +INCLUDE moduls +INCLUDE nastin +INCLUDE parall +SCOREP_REGION_NAMES_END +#+end_example + +#+begin_src shell :results output +scp nancy.g5k:./scorep_12-v1_grimoire_6_96_metis_true/traces.csv.gz . 
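+# Sanity check (sketch, not from the original run; assumes the archive
+# landed in the current directory): verify the copy is not truncated
+# before loading it in R.
+gzip -t traces.csv.gz && echo "traces.csv.gz OK"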
+#+end_src
+
+#+RESULTS:
+
+#+begin_src R :results output :session :exports both
+library(readr);
+library(dplyr);
+df <- read_csv("traces.csv.gz",
+               col_names=c("Rank", "Start", "End", "Value"),
+               progress=FALSE);
+df %>% group_by(Value) %>% summarize(N=n())
+#+end_src
+
+#+RESULTS:
+#+begin_example
+Parsed with column specification:
+cols(
+  Rank = col_integer(),
+  Start = col_double(),
+  End = col_double(),
+  Value = col_character()
+)
+# A tibble: 10 × 2
+    Value      N
+
+1  begzon     96
+2  conblk     96
+3  concou     96
+4  doiter     96
+5  endste     96
+6  endzon     96
+7  moduls   1248
+8  nastin   1152
+9  parall 431331
+10 timste     96
+#+end_example
+
+Let's concentrate on nastin:
+
+#+begin_src R :results output :session :exports both
+df %>% filter(Rank == 61)
+#+end_src
+
+#+RESULTS:
+#+begin_example
+# A tibble: 4,542 × 4
+    Rank      Start        End  Value
+
+1     61   0.010045   0.010054 parall
+2     61   0.012898   0.043018 parall
+3     61   0.043031   0.051135 parall
+4     61   0.051233   0.051235 moduls
+5     61   0.051251 249.036677 parall
+6     61 249.037003 249.094670 parall
+7     61 249.094764 249.094993 parall
+8     61 249.095035 249.095187 parall
+9     61 249.095202 249.095204 parall
+10    61 249.553567 249.709909 parall
+# ... with 4,532 more rows
+#+end_example
+
+I need the imbrication (nesting) level to understand this.
+*** With imbrication, plot
+#+begin_src shell :results output
+scp nancy.g5k:./scorep_12-v1_grimoire_6_96_metis_true/traces.csv.gz .
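+# Quick peek (sketch, not from the original run): the trace format changed
+# in this run, so check the first records to confirm the 8-column layout
+# before setting col_names in R.
+gzip -cd traces.csv.gz | head -3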
+#+end_src + +#+RESULTS: + +#+begin_src R :results output :session :exports both +df <- read_csv("traces.csv.gz", + col_names=c("Nature", "Rank", "Type", "Start", "End", "Duration", "Imbrication", "Value"), + progress=FALSE); +df; +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + Nature = col_character(), + Rank = col_integer(), + Type = col_character(), + Start = col_double(), + End = col_double(), + Duration = col_double(), + Imbrication = col_double(), + Value = col_character() +) +# A tibble: 857,455 × 8 + Nature Rank Type Start End Duration Imbrication Value + +1 State 95 STATE 0.000267 0.000279 0.000012 0 parall +2 State 95 STATE 0.000290 0.001318 0.001029 0 parall +3 State 95 STATE 0.001326 0.032243 0.030917 0 parall +4 State 95 STATE 0.032352 0.032354 0.000002 0 moduls +5 State 95 STATE 0.032367 252.062136 252.029768 0 parall +6 State 95 STATE 252.062477 252.118046 0.055569 0 parall +7 State 95 STATE 252.118115 252.118401 0.000286 0 parall +8 State 95 STATE 252.118442 252.118595 0.000153 0 parall +9 State 95 STATE 252.118610 252.118644 0.000034 0 parall +10 State 95 STATE 252.722345 252.732470 0.010126 0 parall +# ... 
with 857,445 more rows +#+end_example + +#+begin_src R :results output :session :exports both +dfplot %>% .$Imbrication %>% unique %>% sort +startTime <- df %>% filter(Value == "timste") %>% arrange(Start) %>% slice(1) %>% .$Start +startTime +endTime <- df %>% filter(Value == "endste") %>% arrange(End) %>% slice(n()) %>% .$End +endTime +dfplot %>% filter(Start > startTime, End < endTime) %>% filter(Imbrication != 10) %>% filter(Value == "moduls") +#+end_src + +#+RESULTS: +#+begin_example +[1] 0 +[1] 259.3502 +[1] 315.6014 +# A tibble: 15 × 8 + Nature Rank Type Start End Duration Imbrication Value + +1 State 4 STATE 259.6966 259.6972 0.000613 0 moduls +2 State 4 STATE 287.6941 287.7119 0.017868 0 moduls +3 State 4 STATE 288.0547 288.0553 0.000610 0 moduls +4 State 3 STATE 259.6966 259.6973 0.000682 0 moduls +5 State 3 STATE 287.6947 287.7119 0.017238 0 moduls +6 State 3 STATE 288.0546 288.0553 0.000678 0 moduls +7 State 2 STATE 259.6966 259.6972 0.000600 0 moduls +8 State 2 STATE 287.6937 287.7119 0.018168 0 moduls +9 State 2 STATE 288.0546 288.0552 0.000601 0 moduls +10 State 1 STATE 259.6966 259.6973 0.000712 0 moduls +11 State 1 STATE 287.6949 287.7119 0.017059 0 moduls +12 State 1 STATE 288.0547 288.0554 0.000707 0 moduls +13 State 0 STATE 259.6968 259.6969 0.000010 0 moduls +14 State 0 STATE 287.6883 287.7119 0.023637 0 moduls +15 State 0 STATE 288.0548 288.0548 0.000011 0 moduls +#+end_example + +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 1200 :height 300 :session +library(ggplot2); + +maxImbrication = df %>% .$Imbrication %>% max; +plot <- NULL; +dfplot <- df %>% filter(!(Value %in% c("timste", "endste"))) %>% filter(Rank < 5) %>% filter(Imbrication %in% c(0,1)); +dfplot %>% + ggplot() + + coord_cartesian(xlim=c(startTime, endTime)) + + theme_bw(base_size=12) + + scale_fill_brewer(palette = "Set1") -> plot; + +for (i in (dfplot %>% .$Imbrication %>% unique %>% sort)){ + plot <- plot + + 
geom_rect(data=(dfplot %>% filter(Imbrication == i)), aes(fill=Value, + xmin=Start, + xmax=End, + ymin=Rank, + ymax=Rank+0.9-(Imbrication*(1/5)))); +} +plot; +#+end_src + +#+RESULTS: +[[file:/tmp/babel-24853m4f/figure24853J5Y.png]] +*** New tracing with MPI + nsi functions +#+begin_src shell :results output +ssh nancy.g5k cat ./alya-bsc/Executables/unix/all.scorep +#+end_src + +#+RESULTS: +#+begin_example +#SCOREP_FILE_NAMES_BEGIN +#EXCLUDE * +#INCLUDE */Alya.f90 +#SCOREP_FILE_NAMES_END + +SCOREP_REGION_NAMES_BEGIN +EXCLUDE * +INCLUDE endste +INCLUDE timste +INCLUDE doiter +INCLUDE moduls +INCLUDE nastin +INCLUDE parall +INCLUDE nsi_doiter +INCLUDE nsi_turnon +INCLUDE nsi_parall + +SCOREP_REGION_NAMES_END +#+end_example + +#+begin_src shell :results output +scp nancy.g5k:./scorep_12-v1_grimoire_6_96_metis_true_v3/traces.csv.gz . +#+end_src + +#+RESULTS: + +#+begin_src R :results output :session :exports both +df <- read_csv("traces.csv.gz", + col_names=c("Nature", "Rank", "Type", "Start", "End", "Duration", "Imbrication", "Value"), + progress=FALSE); +df <- df %>% mutate(Value = gsub("MPI_.*", "MPI", Value)); +df; +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + Nature = col_character(), + Rank = col_integer(), + Type = col_character(), + Start = col_double(), + End = col_double(), + Duration = col_double(), + Imbrication = col_double(), + Value = col_character() +) +# A tibble: 2,338,005 × 8 + Nature Rank Type Start End Duration Imbrication Value + +1 State 95 STATE 1.770337 1.770339 2.0e-06 0 MPI +2 State 95 STATE 1.770339 1.770340 1.0e-06 0 MPI +3 State 95 STATE 1.770348 1.770353 5.0e-06 0 MPI +4 State 95 STATE 1.770357 1.770449 9.3e-05 0 MPI +5 State 95 STATE 1.770450 1.770690 2.4e-04 0 MPI +6 State 95 STATE 1.770691 1.770691 0.0e+00 0 MPI +7 State 95 STATE 1.770691 1.770691 0.0e+00 0 MPI +8 State 95 STATE 1.770692 1.770693 1.0e-06 0 MPI +9 State 95 STATE 1.770693 1.770723 3.0e-05 0 MPI +10 State 95 STATE 1.770724 1.770731 
8.0e-06 0 MPI
+# ... with 2,337,995 more rows
+#+end_example
+
+#+begin_src R :results output :session :exports both
+startTime <- df %>% filter(Value == "timste") %>% arrange(Start) %>% slice(1) %>% .$End
+startTime
+endTime <- df %>% filter(Value == "endste") %>% arrange(End) %>% slice(n()) %>% .$End
+endTime
+maxImbrication = df %>% .$Imbrication %>% max;
+maxImbrication
+#+end_src
+
+#+RESULTS:
+: [1] 126.1242
+: [1] 153.975
+: [1] 5
+
+#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 1200 :height 300 :session
+library(ggplot2);
+
+plot <- NULL;
+dfplot <- df %>%
+  filter(Rank != 0) %>%
+  filter(!(Value %in% c("timste", "endste", "nsi_turnon"))) %>%
+  filter(Imbrication != 5) %>%
+  filter(Rank < 5);
+dfplot %>%
+  ggplot() +
+  coord_cartesian(xlim=c(startTime, endTime)) +
+  xlim(startTime, endTime) +
+  theme_bw(base_size=12) +
+  scale_fill_brewer(palette = "Set1") -> plot;
+
+for (i in (dfplot %>% .$Imbrication %>% unique %>% sort)){
+  plot <- plot +
+    geom_rect(data=(dfplot %>% filter(Imbrication == i)), aes(fill=Value,
+                    xmin=Start,
+                    xmax=End,
+                    ymin=Rank,
+                    ymax=Rank+0.9-(Imbrication*(1/10))));
+}
+plot;
+#+end_src
+
+#+RESULTS:
+[[file:/tmp/babel-24853m4f/figure24853ygt.png]]
+
+Conclusions:
+- I should take the *end* of =timste= and of =endste= as timestep boundaries, because =timste= itself is long
+- the =nsi_doiter= (or the more general =doiter=) state contains the computations, but it also has a lot of MPI calls within
+- =nsi_turnon= is not necessary
+
+#+begin_src R :results output :session :exports both
+df %>% filter(Rank == 4, Start > 135, End < 140) %>% filter(!(Value %in% c("timste", "endste", "doiter", "moduls")))
+#+end_src
+
+#+RESULTS:
+#+begin_example
+# A tibble: 2,674 × 8
+   Nature  Rank  Type    Start      End Duration Imbrication  Value
+
+1   State     4 STATE 135.0003 135.0005 0.000219           4    MPI
+2   State     4 STATE 135.0008 135.0010 0.000210           4    MPI
+3   State     4 STATE 135.0012 135.0015 0.000228           4    MPI
+4   State     4 STATE 135.0017 135.0018 0.000048           4 parall
+5   State
4 STATE 135.0018 135.0018 0.000010 5 MPI +6 State 4 STATE 135.0018 135.0018 0.000003 5 MPI +7 State 4 STATE 135.0018 135.0018 0.000004 5 MPI +8 State 4 STATE 135.0018 135.0018 0.000002 5 MPI +9 State 4 STATE 135.0018 135.0018 0.000003 5 MPI +10 State 4 STATE 135.0018 135.0018 0.000001 5 MPI +# ... with 2,664 more rows +#+end_example + +#+begin_src R :results output :session :exports both +df %>% filter(Imbrication == 5) %>% .$Value %>% unique +#+end_src + +#+RESULTS: +: [1] "MPI" + +#+begin_src R :results output :session :exports both +df %>% filter(Value == "nsi_turnon") %>% filter(Rank == 4) +#+end_src + +#+RESULTS: +: # A tibble: 1 × 8 +: Nature Rank Type Start End Duration Imbrication Value +: +: 1 State 4 STATE 125.57 125.6114 0.041432 2 nsi_turnon +*** New round +#+begin_src shell :results output +ssh nancy.g5k cat ./alya-bsc/Executables/unix/all.scorep +#+end_src + +#+RESULTS: +#+begin_example +#SCOREP_FILE_NAMES_BEGIN +#EXCLUDE * +#INCLUDE */Alya.f90 +#SCOREP_FILE_NAMES_END + +SCOREP_REGION_NAMES_BEGIN +EXCLUDE * +# Timestep begins when rank leaves the timste state +INCLUDE timste +# Timestep ends when rank enters the endste +INCLUDE endste +# doiter is the iteration itself +INCLUDE doiter + +# Nastin calls +INCLUDE nastin +INCLUDE nsi_turnon +INCLUDE nsi_timste +INCLUDE nsi_iniunk +INCLUDE nsi_begste +INCLUDE nsi_doiter +INCLUDE nsi_concou +INCLUDE nsi_conblk +INCLUDE nsi_newmsh +INCLUDE nsi_endste +INCLUDE nsi_filter +INCLUDE nsi_output +INCLUDE nsi_turnof +SCOREP_REGION_NAMES_END +#+end_example + + +#+begin_src shell :results output +scp nancy.g5k:./scorep_12-v1_grimoire_6_96_metis_true_v4/traces.csv.gz . 
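+# Quick check (sketch, not from the original run): list the distinct region
+# names present in the trace (last CSV column) to verify that the new filter
+# entries were actually recorded.
+gzip -cd traces.csv.gz | cut -d, -f8 | sort -u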
+#+end_src + +#+RESULTS: + +#+begin_src R :results output :session :exports both +df <- read_csv("traces.csv.gz", + col_names=c("Nature", "Rank", "Type", "Start", "End", "Duration", "Imbrication", "Value"), + progress=FALSE); +df <- df %>% mutate(Value = gsub("MPI_.*", "MPI", Value)); +startTime <- df %>% filter(Value == "timste") %>% arrange(Start) %>% slice(1) %>% .$End +startTime +endTime <- df %>% filter(Value == "endste") %>% arrange(End) %>% slice(n()) %>% .$End +endTime +maxImbrication = df %>% .$Imbrication %>% max; +maxImbrication +df; +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + Nature = col_character(), + Rank = col_integer(), + Type = col_character(), + Start = col_double(), + End = col_double(), + Duration = col_double(), + Imbrication = col_double(), + Value = col_character() +) +[1] 123.4664 +[1] 151.6115 +[1] 3 +# A tibble: 1,906,194 × 8 + Nature Rank Type Start End Duration Imbrication Value + +1 State 95 STATE 1.110072 1.110075 0.000003 0 MPI +2 State 95 STATE 1.110076 1.110077 0.000001 0 MPI +3 State 95 STATE 1.110085 1.110090 0.000005 0 MPI +4 State 95 STATE 1.110091 1.110191 0.000100 0 MPI +5 State 95 STATE 1.110191 1.110424 0.000232 0 MPI +6 State 95 STATE 1.110425 1.110425 0.000000 0 MPI +7 State 95 STATE 1.110425 1.110425 0.000000 0 MPI +8 State 95 STATE 1.110426 1.110427 0.000001 0 MPI +9 State 95 STATE 1.110427 1.110463 0.000036 0 MPI +10 State 95 STATE 1.110463 1.110470 0.000007 0 MPI +# ... 
with 1,906,184 more rows +#+end_example + +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 1200 :height 300 :session +library(ggplot2); + +plot <- NULL; +dfplot <- df %>% + filter(Rank != 0) %>% + filter(!(Value %in% c("timste", "endste"))) %>% + filter(Imbrication != 5) %>% + filter(Rank < 5); +dfplot %>% + ggplot() + + coord_cartesian(xlim=c(135, 145)) + #startTime, endTime)) + +# xlim(startTime, endTime) + + theme_bw(base_size=12) + + scale_fill_brewer(palette = "Set1") -> plot; + +for (i in (dfplot %>% .$Imbrication %>% unique %>% sort)){ + plot <- plot + + geom_rect(data=(dfplot %>% filter(Imbrication == i)), aes(fill=Value, + xmin=Start, + xmax=End, + ymin=Rank, + ymax=Rank+0.9-(Imbrication*(1/10)))); +} +plot; +#+end_src + +#+RESULTS: +[[file:/tmp/babel-24853m4f/figure24853mXu.png]] + +Still not there, I can't see any function in between. +*** New round with =nsi_solite= +#+begin_src shell :results output +ssh nancy.g5k cat ./alya-bsc/Executables/unix/all.scorep +#+end_src + +#+RESULTS: +#+begin_example +#SCOREP_FILE_NAMES_BEGIN +#EXCLUDE * +#INCLUDE */Alya.f90 +#SCOREP_FILE_NAMES_END + +SCOREP_REGION_NAMES_BEGIN +EXCLUDE * +# Timestep begins when rank leaves the timste state +INCLUDE timste +# Timestep ends when rank enters the endste +INCLUDE endste +# doiter is the iteration itself +INCLUDE doiter + +# Nastin calls +INCLUDE nastin +INCLUDE nsi_turnon +INCLUDE nsi_timste +INCLUDE nsi_iniunk +INCLUDE nsi_begste +INCLUDE nsi_doiter +INCLUDE nsi_concou +INCLUDE nsi_conblk +INCLUDE nsi_newmsh +INCLUDE nsi_endste +INCLUDE nsi_filter +INCLUDE nsi_output +INCLUDE nsi_turnof + +# the solver +INCLUDE nsi_solite + +SCOREP_REGION_NAMES_END +#+end_example + + +#+begin_src shell :results output +scp nancy.g5k:./scorep_12-v1_grimoire_6_96_metis_true_v5/traces.csv.gz . 
+#+end_src + +#+RESULTS: + +#+begin_src R :results output :session :exports both +df <- read_csv("traces.csv.gz", + col_names=c("Nature", "Rank", "Type", "Start", "End", "Duration", "Imbrication", "Value"), + progress=FALSE); +df <- df %>% mutate(Value = gsub("MPI_.*", "MPI", Value)); +startTime <- df %>% filter(Value == "timste") %>% arrange(Start) %>% slice(1) %>% .$End +startTime +endTime <- df %>% filter(Value == "endste") %>% arrange(End) %>% slice(n()) %>% .$End +endTime +maxImbrication = df %>% .$Imbrication %>% max; +maxImbrication +df; +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + Nature = col_character(), + Rank = col_integer(), + Type = col_character(), + Start = col_double(), + End = col_double(), + Duration = col_double(), + Imbrication = col_double(), + Value = col_character() +) +[1] 125.3257 +[1] 153.2368 +[1] 4 +# A tibble: 1,906,674 × 8 + Nature Rank Type Start End Duration Imbrication Value + +1 State 95 STATE 1.025796 1.025799 0.000002 0 MPI +2 State 95 STATE 1.025800 1.025800 0.000001 0 MPI +3 State 95 STATE 1.025809 1.025814 0.000005 0 MPI +4 State 95 STATE 1.025815 1.025919 0.000104 0 MPI +5 State 95 STATE 1.025919 1.026159 0.000240 0 MPI +6 State 95 STATE 1.026160 1.026161 0.000000 0 MPI +7 State 95 STATE 1.026161 1.026161 0.000000 0 MPI +8 State 95 STATE 1.026162 1.026163 0.000001 0 MPI +9 State 95 STATE 1.026163 1.026200 0.000037 0 MPI +10 State 95 STATE 1.026200 1.026206 0.000005 0 MPI +# ... 
with 1,906,664 more rows
+#+end_example
+
+#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 1200 :height 300 :session
+library(ggplot2);
+
+plot <- NULL;
+dfplot <- df %>%
+    filter(Rank != 0) %>%
+    filter(!(Value %in% c("timste", "endste", "doiter", "nastin"))) %>%
+    filter(Imbrication != 5) %>%
+    filter(Rank < 5);
+dfplot %>%
+    ggplot() +
+    coord_cartesian(xlim=c(startTime, endTime)) +
+    theme_bw(base_size=12) +
+    scale_fill_brewer(palette = "Set1") -> plot;
+
+for (i in (dfplot %>% .$Imbrication %>% unique %>% sort)){
+    plot <- plot +
+        geom_rect(data=(dfplot %>% filter(Imbrication == i)), aes(fill=Value,
+                      xmin=Start,
+                      xmax=End,
+                      ymin=Rank,
+                      ymax=Rank+0.9-(Imbrication*(1/10))));
+}
+plot;
+#+end_src
+
+#+RESULTS:
+[[file:/tmp/babel-24853m4f/figure24853ZbQ.png]]
+
+
+We have five =nsi_solite= calls, which matches:
+
+#+BEGIN_EXAMPLE
+--| ALYA SOLVE NASTIN (MSMSC)
+--| ALYA SOLVE NASTIN (MSMSC)
+--| ALYA SOLVE NASTIN (MSMSC)
+--| ALYA SOLVE NASTIN (MSMSC)
+--| ALYA SOLVE NASTIN (MSMSC)
+#+END_EXAMPLE
+
+There are still some MPI calls in between. Perhaps one per MSMSC.
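+A quick way to test this guess directly on the trace (a sketch reusing
+the =df= already loaded in the session above):
+
+#+begin_src R :results output :session :exports both
+# For one rank, count the MPI states falling inside each nsi_solite
+# interval: one count per solver call.
+solite <- df %>% filter(Rank == 4, Value == "nsi_solite")
+mpi    <- df %>% filter(Rank == 4, Value == "MPI")
+sapply(seq_len(nrow(solite)),
+       function(i) sum(mpi$Start >= solite$Start[i] &
+                       mpi$End   <= solite$End[i]))
+#+end_src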
+*** With many functions called by nsi solite + +#+begin_src shell :results output +ssh nancy.g5k cat ./alya-bsc/Executables/unix/all.scorep +#+end_src + +#+RESULTS: +#+begin_example +#SCOREP_FILE_NAMES_BEGIN +#EXCLUDE * +#INCLUDE */Alya.f90 +#SCOREP_FILE_NAMES_END + +SCOREP_REGION_NAMES_BEGIN +EXCLUDE * +# Timestep begins when rank leaves the timste state +INCLUDE timste +# Timestep ends when rank enters the endste +INCLUDE endste +# doiter is the iteration itself +INCLUDE doiter + +# Nastin calls +INCLUDE nastin +INCLUDE nsi_turnon +INCLUDE nsi_timste +INCLUDE nsi_iniunk +INCLUDE nsi_begste +INCLUDE nsi_doiter +INCLUDE nsi_concou +INCLUDE nsi_conblk +INCLUDE nsi_newmsh +INCLUDE nsi_endste +INCLUDE nsi_filter +INCLUDE nsi_output +INCLUDE nsi_turnof + +# the solver +INCLUDE nsi_solite + +# functions called by the nsi solver +INCLUDE nsi_updbcs +INCLUDE nsi_ifconf +INCLUDE nsi_updunk +INCLUDE nsi_solsgs +INCLUDE nsi_inisol +INCLUDE nsi_matrix +INCLUDE nsimatndof +INCLUDE nsi_agmgsol +INCLUDE solver +INCLUDE nsi_multi_step_fs_solution +INCLUDE nsi_fractional_step_solution +INCLUDE nsi_schur_complement_solution +INCLUDE nsi_solver_postprocess + +SCOREP_REGION_NAMES_END +#+end_example + + +#+begin_src shell :results output +scp nancy.g5k:./scorep_12-v1_grimoire_6_96_metis_true_v6/traces.csv.gz . 
+#+end_src + +#+RESULTS: + +#+begin_src R :results output :session :exports both +df <- read_csv("traces.csv.gz", + col_names=c("Nature", "Rank", "Type", "Start", "End", "Duration", "Imbrication", "Value"), + progress=FALSE); +df <- df %>% mutate(Value = gsub("MPI_.*", "MPI", Value)); +startTime <- df %>% filter(Value == "timste") %>% arrange(Start) %>% slice(1) %>% .$End +startTime +endTime <- df %>% filter(Value == "endste") %>% arrange(End) %>% slice(n()) %>% .$End +endTime +maxImbrication = df %>% .$Imbrication %>% max; +maxImbrication +df; +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + Nature = col_character(), + Rank = col_integer(), + Type = col_character(), + Start = col_double(), + End = col_double(), + Duration = col_double(), + Imbrication = col_double(), + Value = col_character() +) +[1] 124.8219 +[1] 152.4296 +[1] 5 +# A tibble: 1,914,065 × 8 + Nature Rank Type Start End Duration Imbrication Value + +1 State 95 STATE 1.014010 1.014012 0.000002 0 MPI +2 State 95 STATE 1.014013 1.014014 0.000001 0 MPI +3 State 95 STATE 1.014022 1.014027 0.000004 0 MPI +4 State 95 STATE 1.014028 1.014128 0.000100 0 MPI +5 State 95 STATE 1.014128 1.014358 0.000229 0 MPI +6 State 95 STATE 1.014359 1.014359 0.000000 0 MPI +7 State 95 STATE 1.014359 1.014359 0.000000 0 MPI +8 State 95 STATE 1.014360 1.014361 0.000001 0 MPI +9 State 95 STATE 1.014361 1.014398 0.000037 0 MPI +10 State 95 STATE 1.014398 1.014405 0.000007 0 MPI +# ... 
with 1,914,055 more rows +#+end_example + +#+begin_src R :results output :session :exports both +df %>% select(Imbrication, Value) %>% unique %>% as.data.frame() %>% filter(Imbrication == 4) +#+end_src + +#+RESULTS: +: Imbrication Value +: 1 4 nsi_updbcs +: 2 4 nsi_ifconf +: 3 4 nsi_updunk +: 4 4 nsi_solsgs +: 5 4 nsi_matrix +: 6 4 solver +: 7 4 MPI +: 8 4 nsi_inisol + +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 1400 :height 300 :session +library(ggplot2); + +plot <- NULL; +dfplot <- df %>% + filter(Rank != 0) %>% + filter(Value %in% c(#"MPI", + "solver", + #"nsi_solite", + "nsi_matrix", + "nsi_inisol", + "nsi_updunk", + "nsi_updbcs", + "nsi_ifconf", + "nsi_updbcs")) %>% #, "nsi_matrix")) %>% +# filter(!(Value %in% c("timste", "endste", "doiter", "nastin", "nsi_begste", "nsi_endste", "nsi_concou", "nsi_output"))) %>% + filter(Imbrication != 5)# %>% +# filter(Rank < 5); +dfplot %>% + ggplot() + + coord_cartesian(xlim=c(startTime, endTime)) + + xlim(startTime, endTime) + + theme_bw(base_size=12) + + scale_fill_brewer(palette = "Set1") -> plot; + +for (i in (dfplot %>% .$Imbrication %>% unique %>% sort)){ + plot <- plot + + geom_rect(data=(dfplot %>% filter(Imbrication == i)), aes(fill=Value, + xmin=Start, + xmax=End, + ymin=Rank, + ymax=Rank+0.9-(Imbrication*(1/10)))); +} +plot; +#+end_src + +#+RESULTS: +[[file:/tmp/babel-24853m4f/figure24853PbT.png]] + + +So, for Nastin, computation time is: +- =nsi_matrix= +- =solver= +- =nsi_inisol= (hardly seen) +- +*** Final attempt +Check analysis of EXP12. +*** SCOREP Filter File :ATTACH: +:PROPERTIES: +:Attachments: all.scorep +:ID: ef6feaae-5f92-4706-a250-bde856794f47 +:END: + +#+begin_src shell :results output +scp nancy.g5k:./alya-bsc/Executables/unix/all.scorep . 
+#+end_src
+
+#+RESULTS:
+
+** 2017-05-04 Meeting D2.4 "Report on Multilevel Programming and I/O optimizations"
+Present: Guillaume, Rick, Lucas, Arnaud
+Link: https://rendez-vous.renater.fr/alya
+*** Introduction (Guillaume's Message)
+Guillaume's Message: I'm responsible for a deliverable of HPC4E, D2.4
+"Report on Multilevel Programming and I/O optimizations". We are going
+to report our work on SFC as a mesh partitioner and I was wondering if
+you want to report the performance analysis you did on this as
+well. No need to be extensive, one page would be okay.
+*** Discussed Topics
+**** Load imbalance (and SFC/Metis)
+When the number of points is higher, the time spent in the "solver"
+function is reduced. That surprises us because it should be the
+opposite. The number of points corresponds to the number of rows in
+the matrix associated with each rank. Rick told us that we should
+instead look at the number of entries in each row; that should
+directly correlate with the workload of each rank.
+
+So we should add a new row with the number of entries associated with
+each rank: =img/exp_13_solver_partition.png=.
+
+Load imbalance is independent of Metis/SFC.
+
+At one point, when load imbalance is perfect, we shall look again at
+the communication issue we have previously seen.
+**** The one-page report
+We still have 10 days to finish the deliverable. So we can check the
+number of entries previously mentioned before writing the report.
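+A rough sketch of this check (the input format is hypothetical: one
+line per matrix row with its owner =Rank= and its number of =Entries=;
+this is not an actual Alya output):
+
+#+begin_src R :results output :session :exports both
+# Aggregate the number of non-zero entries per rank and compute the
+# classical imbalance metric: max load over mean load (1 = perfect).
+rows <- read_csv("entries_per_row.csv") # hypothetical columns: Rank, Entries
+rows %>%
+    group_by(Rank) %>%
+    summarize(Entries = sum(Entries)) %>%
+    summarize(Imbalance = max(Entries) / mean(Entries))
+#+end_src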
+** 2017-05-10 Experimental session
+#+begin_src shell :results output
+oarsub -p "cluster='grisou'" -l "nodes=20,walltime=14:00:00" -r "$(date '+%Y-%m-%d 19:00:00')" -t deploy
+kadeploy3 --env-file /home/lschnorr/images/stretch_energy.env -f ${OAR_NODE_FILE} -k
+#+end_src
+** 2018-01-21 Meeting @BSC during HPC4E closure
+*** Pre-meeting notes based on e-mails
+**** Partition quality using adaptation strategy: run-time measurements + unity tests (Ricard)
+The SFC partitioning process has proven to be very fast: a mesh of 30M
+elements can be partitioned in 0.05 seconds using 4K CPUs. As you say,
+next steps must be focused on the partition quality. In particular,
+I'm thinking of a partition adaptation strategy based on run-time
+measurements or some initial measurements on unity tests.
+**** Different input to check for variability (Lucas)
+Sure. Back then, we instrumented the kernels that were used in one of
+the simulations to measure their run-time, i.e. the per-iteration load
+of each rank. The gathered information can be used to understand how
+such load evolves along the execution. As you plan, perhaps we can
+leverage that to improve the partition quality. I wonder if we can
+play with many different inputs to check for variability.
+**** LB with co-execution on hybrid systems where a proper load distribution is hard to guess (Ricard+Arnaud)
+Running several codes (coupled or not) on the same resources. For
+example, in Alya, which does multi-physics, you can activate several
+"modules" (one for the atmosphere, one for the ocean, one for ...)
+which are in fact autonomous but coupled programs. You can put one
+module on part of the resources, another module on another part of the
+resources, or both modules on all the resources. To me, when you put
+them in the same place, it is an example of co-execution; otherwise,
+they still share the network resources. We also speak of co-execution
+at the batch-scheduler level, when two truly independent applications
+are allowed to run on the same resources, but I do not think that is
+what he is talking about. Here, it is several modules within the same
+application and, just to keep things simple, these modules use the
+GPUs more or less, more or less at the same time, depending on how it
+is programmed.
+*** Meeting
+- Partitioning is very fast, so we can run several times until the
+  quality is very good. There are two approaches to improve the
+  quality: the first one is based on using real-time measurements from
+  Alya; the second is with a unity test (mimicking the Alya code). The
+  second has some disadvantages but is much faster.
+- They are using a Hilbert SFC because it provides better locality.
+  - Other approaches: Peano, Morton
+- The current weight: the number of elements, mass center of the element
+  - Other approaches Ricard tried: more sophisticated approaches, but
+    they are manual and very slow. A runtime approach with monitoring
+    is better.
+- They read the mesh in parallel
+  - Boxes and subboxes: the distribution is based on the SFC, and the
+    weight of each box is passed through (accumulated) to other boxes.
+- Test cases:
+  - sniff (17.7 M elements, 6.8M nodes); \approx1K cores (60K elements per core)
+    - in memory occupation: 50K (1GBytes per core)
+  - combustor (28.9 M elements, 7M nodes); \approx2K cores
+    - 30 to 60 number of elements per core
+- Cartesian division from the initial global bounding box
+  - Each rank reads the mesh in parallel and sends to each box (another
+    rank is responsible for it) the number of elements and their weight
+  - Starts with a coarse grid, but a finer Cartesian grid used for the
+    SFC gives better initial LB. He uses a very fine grid and it works
+    well.
+- How to improve the SFC. There are two options:
+  - You cut the line before if the weight is too much
+  - Change the Cartesian grid in some axis
+  - Change the granularity of one box with a finer SFC
+- Arnaud (another approach): perturb the partitioning and collect
+  measurements with multiple SFC configurations (with cuts slightly
+  before and after) and then,
+
+** 2018-01-29 Prep data for Arnaud (three stable runs)
+*** Introduction
+
+We are talking about experiments:
+- EXP24 (5 iterations)
+  - It is a bit strange as we previously saw
+  - See the figure in this entry to remember:
+    #+BEGIN_EXAMPLE
+    ** 4-node grisou (64 cores + 1 core) 1its, new round
+    *** Visualization / Analysis
+    **** Merge the cumsum of exp24 and exp25
+    #+END_EXAMPLE
+- EXP25 (1 iteration)
+- EXP26 (5 iterations)
+
+*** Gather data
+
+#+name: exp24_PREP
+#+header: :var dep0=exp24_enrich
+#+header: :var dep1=exp24_number_of_elements
+#+begin_src R :results output :session :exports both
+exp.ENRICH %>% left_join(exp.ELEMENTS) %>% mutate(Exp = "exp24") -> exp24.PREP
+#+end_src
+
+#+RESULTS: exp24_PREP
+: Joining, by = "Rank"
+
+#+name: exp25_PREP
+#+header: :var dep0=exp25_enrich
+#+header: :var dep1=exp25_number_of_elements
+#+begin_src R :results output :session :exports both
+exp.ENRICH %>% left_join(exp.ELEMENTS) %>% mutate(Exp = "exp25") -> exp25.PREP
+#+end_src
+
+#+RESULTS: exp25_PREP
+: Joining, by = "Rank"
+
+#+name: exp26_PREP
+#+header: :var dep0=exp26_enrich
+#+header: :var dep1=exp26_number_of_elements
+#+begin_src R :results output :session :exports both
+exp.ENRICH %>% left_join(exp.ELEMENTS) %>% mutate(Exp = "exp26") -> exp26.PREP
+#+end_src
+
+#+RESULTS: exp26_PREP
+: Joining, by = "Rank"
+
+#+begin_src R :results output :session :exports both
+exp24.PREP %>%
+    bind_rows(exp25.PREP) %>%
+    bind_rows(exp26.PREP) %>%
+    select(Phase, ID, Rank, NELEM,NPOIN,NBOUN,NPOI32,Exp) -> exp.PREP
+exp.PREP %>%
+    select(Phase, Exp) %>% unique
+#+end_src
+
+#+RESULTS:
+#+begin_example
+# A tibble: 11 x 2
+   Phase Exp
+
+ 1     1 exp24
+ 2     2 exp24
+ 3     3 exp24
+ 4     4 exp24
+ 5     5 exp24
+ 6     1 exp25
+ 7     1 exp26
+ 8     2 exp26
+ 9     3 exp26
+10     4 exp26
+11     5 exp26
+#+end_example
+
+*** Save the data here :ATTACH:
+:PROPERTIES:
+:Attachments: exp24-exp25-exp26.csv.gz
+:ID: 2c325c57-678e-491d-a7d4-f71be90dbd73
+:END:
+
+#+begin_src R :results output :session :exports both
+write_csv(exp.PREP, "exp24-exp25-exp26.csv.gz")
+#+end_src
+
+#+RESULTS:
+
+** 2018-02-06 New hypothesis check
+*** PAST: ping-pong effects can be avoided by learning from the past (many rounds)
+
+Arnaud.
+
+*** GRANULARITY: a finer SFC granularity can reveal necessary details to LB
+
+10 iterations of the test case with 128, 256, and 512 workers, for 20
+refinement rounds, with granularity levels for each partitioner
+sub-box of 128_ip (the current def), 256_ip, and 512_ip. Please, let me
+know if you think that is enough. I also intend to keep CRITERIA
+equal to 1 (unchanged) to avoid changing everything at once, except
+if I can allocate another cluster for that specific test.
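+For the record, the corresponding full factorial design (variable
+names here are illustrative, not the actual script parameters) can be
+enumerated directly in R:
+
+#+begin_src R :results output :session :exports both
+# 3 worker counts x 3 granularities x 20 refinement rounds, with
+# CRITERIA kept at 1; each configuration runs for 10 iterations.
+design <- expand.grid(Workers     = c(128, 256, 512),
+                      Granularity = c(128, 256, 512),
+                      Round       = 1:20,
+                      Criteria    = 1)
+nrow(design)
+#+end_src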
+ +*** CRITERIA: use a different =CRITERIA = 2_ip= so we consider weighted elements + +* Experiments +** 2017-03-31 Grisou experiments +#+begin_src shell :results output +SCRIPT=grisou-script.sh +rm -f ${SCRIPT} +echo 'export PATH=$PATH:$HOME/install/nova/scorep-3.0-alya/bin' >> ${SCRIPT} +echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/install/nova/scorep-3.0-alya/lib' >> ${SCRIPT} +for CASE in sfc metis; do + for NP in 127 112 96 80 64 48 32 16; do + NP=$((NP+1)) + UNIQUE=taurus-np${NP}-${CASE} + echo "./scripts/run_alya_experiment.sh machine-file ${NP} ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_${CASE}/fensap.dat $UNIQUE" + echo "mv $UNIQUE /home/lschnorr/" + done +done >> ${SCRIPT} +cat ${SCRIPT} +#+end_src + +#+RESULTS: +#+begin_example +export PATH=$PATH:$HOME/install/nova/scorep-3.0-alya/bin +export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/install/nova/scorep-3.0-alya/lib +./scripts/run_alya_experiment.sh machine-file 128 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_sfc/fensap.dat taurus-np128-sfc +mv taurus-np128-sfc /home/lschnorr/ +./scripts/run_alya_experiment.sh machine-file 113 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_sfc/fensap.dat taurus-np113-sfc +mv taurus-np113-sfc /home/lschnorr/ +./scripts/run_alya_experiment.sh machine-file 97 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_sfc/fensap.dat taurus-np97-sfc +mv taurus-np97-sfc /home/lschnorr/ +./scripts/run_alya_experiment.sh machine-file 81 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_sfc/fensap.dat taurus-np81-sfc +mv taurus-np81-sfc /home/lschnorr/ +./scripts/run_alya_experiment.sh machine-file 65 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_sfc/fensap.dat taurus-np65-sfc +mv taurus-np65-sfc /home/lschnorr/ +./scripts/run_alya_experiment.sh machine-file 49 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_sfc/fensap.dat taurus-np49-sfc +mv taurus-np49-sfc /home/lschnorr/ +./scripts/run_alya_experiment.sh machine-file 33 
~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_sfc/fensap.dat taurus-np33-sfc +mv taurus-np33-sfc /home/lschnorr/ +./scripts/run_alya_experiment.sh machine-file 17 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_sfc/fensap.dat taurus-np17-sfc +mv taurus-np17-sfc /home/lschnorr/ +./scripts/run_alya_experiment.sh machine-file 128 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_metis/fensap.dat taurus-np128-metis +mv taurus-np128-metis /home/lschnorr/ +./scripts/run_alya_experiment.sh machine-file 113 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_metis/fensap.dat taurus-np113-metis +mv taurus-np113-metis /home/lschnorr/ +./scripts/run_alya_experiment.sh machine-file 97 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_metis/fensap.dat taurus-np97-metis +mv taurus-np97-metis /home/lschnorr/ +./scripts/run_alya_experiment.sh machine-file 81 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_metis/fensap.dat taurus-np81-metis +mv taurus-np81-metis /home/lschnorr/ +./scripts/run_alya_experiment.sh machine-file 65 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_metis/fensap.dat taurus-np65-metis +mv taurus-np65-metis /home/lschnorr/ +./scripts/run_alya_experiment.sh machine-file 49 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_metis/fensap.dat taurus-np49-metis +mv taurus-np49-metis /home/lschnorr/ +./scripts/run_alya_experiment.sh machine-file 33 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_metis/fensap.dat taurus-np33-metis +mv taurus-np33-metis /home/lschnorr/ +./scripts/run_alya_experiment.sh machine-file 17 ~/alya-bsc/Executables/unix/Alya.x ~/WORK-RICARD/resp_metis/fensap.dat taurus-np17-metis +mv taurus-np17-metis /home/lschnorr/ +#+end_example + +Those experiments have failed. 
+ +** 2017-03-31 2-node grisou debugging experiments :EXP1: + +Job 1224689 @ Nancy + +#+begin_src shell :results output +git clone https://gitlab.inria.fr/schnorr/Alya-Perf.git +#+end_src + +Requirements + +- The mesh: =$HOME/node_denis/= +- The case study: =$HOME/WORK-RICARD/= +- Scorep 3.0 installed at: =$HOME/install/nova/scorep-3.0-alya/= +- Alya: =$HOME/alya-bsc/Executables/unix/Alya.x= + - Instrumented with the aid of scorep + #+BEGIN_EXAMPLE + SCOREP=scorep --compiler --instrument-filter=/home/lschnorr/alya-bsc/Executables/unix/alya-timestep-filters.scorep + #+END_EXAMPLE + Contents of the filter + #+BEGIN_EXAMPLE +SCOREP_REGION_NAMES_BEGIN +EXCLUDE * +INCLUDE endste +INCLUDE timste +SCOREP_REGION_NAMES_END + #+END_EXAMPLE + +Experimental design + +#+begin_src shell :results output +CLUSTER=grisou +NODES=2 +MACHINEFILE=machine-file +SCRIPT="script_${CLUSTER}_${NODES}_nodes.sh" +rm -f ${SCRIPT} +for NP in 31 16; do + for CASE in sfc metis; do + NPN=$((NP+1)) + UNIQUE=${CLUSTER}_${NODES}_${NPN}_${CASE} + echo "./scripts/run_alya_experiment.sh ${MACHINEFILE} ${NPN} /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_${CASE}/fensap.dat $UNIQUE" + echo "mv $UNIQUE.dir /home/lschnorr/" + done +done >> ${SCRIPT} +chmod 755 ${SCRIPT} +cat ${SCRIPT} +#+end_src + +#+RESULTS: +: ./scripts/run_alya_experiment.sh machine-file 32 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat grisou_2_32_sfc +: mv grisou_2_32_sfc.dir /home/lschnorr/ +: ./scripts/run_alya_experiment.sh machine-file 32 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat grisou_2_32_metis +: mv grisou_2_32_metis.dir /home/lschnorr/ +: ./scripts/run_alya_experiment.sh machine-file 17 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat grisou_2_17_sfc +: mv grisou_2_17_sfc.dir /home/lschnorr/ +: ./scripts/run_alya_experiment.sh machine-file 17 
/home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat grisou_2_17_metis +: mv grisou_2_17_metis.dir /home/lschnorr/ +** 2017-03-31 10-node grisou debugging experiments :EXP2:EXP3: + +Same as previous entry, but with 10 nodes. + +Job 1224741 @ Nancy + +Kadeploy the image: +=/home/lschnorr/images/stretch_energy_v6_2017_03_30.tgz= + +Same requirements as previous entry. + +Experimental design + +#+begin_src shell :results output +EID=3 +CLUSTER=grisou +NODES=10 +MACHINEFILE=machine-file +G5KHOME=/home/lschnorr/ +SCRIPT="script_${CLUSTER}_${NODES}_nodes.sh" + +rm -f $SCRIPT + +EDIR="$G5KHOME/exp_${EID}_${CLUSTER}_${NODES}" +echo "if [ -e "$EDIR" ]; then" >> $SCRIPT +echo " echo \"Directory/File '${EDIR}' already exists. Remove it or rename it.\"" >> $SCRIPT +echo " exit" >> $SCRIPT +echo "fi" >> $SCRIPT + +# Create experimental directory in the home +echo "mkdir -p $EDIR" >> $SCRIPT + +# The design +for NP in 159 144 128 112 96; do + for CASE in sfc metis; do + NPN=$((NP+1)) + UNIQUE=${EID}_${CLUSTER}_${NODES}_${NPN}_${CASE} + echo "./scripts/run_alya_experiment.sh ${MACHINEFILE} ${NPN} /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_${CASE}/fensap.dat $UNIQUE" + echo "cp -prfv $UNIQUE.dir ${EDIR}" + done +done >> ${SCRIPT} +echo "mv $SCRIPT $EDIR" >> $SCRIPT +chmod 755 ${SCRIPT} +cat ${SCRIPT} +echo "This script is at '${SCRIPT}', commit it." +#+end_src + +#+RESULTS: +#+begin_example +if [ -e /home/lschnorr//exp_3_grisou_10 ]; then + echo "Directory/File '/home/lschnorr//exp_3_grisou_10' already exists. Remove it or rename it." 
+ exit +fi +mkdir -p /home/lschnorr//exp_3_grisou_10 +./scripts/run_alya_experiment.sh machine-file 160 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 3_grisou_10_160_sfc +cp -prfv 3_grisou_10_160_sfc.dir /home/lschnorr//exp_3_grisou_10 +./scripts/run_alya_experiment.sh machine-file 160 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 3_grisou_10_160_metis +cp -prfv 3_grisou_10_160_metis.dir /home/lschnorr//exp_3_grisou_10 +./scripts/run_alya_experiment.sh machine-file 145 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 3_grisou_10_145_sfc +cp -prfv 3_grisou_10_145_sfc.dir /home/lschnorr//exp_3_grisou_10 +./scripts/run_alya_experiment.sh machine-file 145 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 3_grisou_10_145_metis +cp -prfv 3_grisou_10_145_metis.dir /home/lschnorr//exp_3_grisou_10 +./scripts/run_alya_experiment.sh machine-file 129 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 3_grisou_10_129_sfc +cp -prfv 3_grisou_10_129_sfc.dir /home/lschnorr//exp_3_grisou_10 +./scripts/run_alya_experiment.sh machine-file 129 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 3_grisou_10_129_metis +cp -prfv 3_grisou_10_129_metis.dir /home/lschnorr//exp_3_grisou_10 +./scripts/run_alya_experiment.sh machine-file 113 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 3_grisou_10_113_sfc +cp -prfv 3_grisou_10_113_sfc.dir /home/lschnorr//exp_3_grisou_10 +./scripts/run_alya_experiment.sh machine-file 113 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 3_grisou_10_113_metis +cp -prfv 3_grisou_10_113_metis.dir /home/lschnorr//exp_3_grisou_10 +./scripts/run_alya_experiment.sh machine-file 97 
/home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 3_grisou_10_97_sfc +cp -prfv 3_grisou_10_97_sfc.dir /home/lschnorr//exp_3_grisou_10 +./scripts/run_alya_experiment.sh machine-file 97 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 3_grisou_10_97_metis +cp -prfv 3_grisou_10_97_metis.dir /home/lschnorr//exp_3_grisou_10 +mv script_grisou_10_nodes.sh /home/lschnorr//exp_3_grisou_10 +This script is at 'script_grisou_10_nodes.sh', commit it. +#+end_example + +Faster way to copy is to tar (without compression) all files, and then +copy using (not the g5kg) +#+BEGIN_EXAMPLE +rsync --progress grisou-51.nancy.g5kg:~/exp_2_grisou_10.tar . +#+END_EXAMPLE + +Two replications have been made: +- exp 2 +- exp 3 +** 2017-04-03 23-node (368 cores) nova in April 3rd :EXP4: +*** Preparation for nova (prep-4-v[12]) + +#+begin_src shell :results output +EID=prep-4 +CLUSTER=grisou +NODES=4 +MACHINEFILE=machine-file +G5KHOME=/home/lschnorr/ +SCRIPT="script_${EID}_${CLUSTER}_${NODES}_nodes.sh" + +rm -f $SCRIPT + +EDIR="$G5KHOME/exp_${EID}_${CLUSTER}_${NODES}" +echo "if [ -e "$EDIR" ]; then" >> $SCRIPT +echo " echo \"Directory/File '${EDIR}' already exists. Remove it or rename it.\"" >> $SCRIPT +echo " exit" >> $SCRIPT +echo "fi" >> $SCRIPT + +# Create experimental directory in the home +echo "mkdir -p $EDIR" >> $SCRIPT + +# The design +for NP in 62 48 32 17; do + for CASE in sfc metis; do + NPN=$((NP+1)) + UNIQUE=${EID}_${CLUSTER}_${NODES}_${NPN}_${CASE} + echo "./scripts/run_alya_experiment.sh ${MACHINEFILE} ${NPN} /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_${CASE}/fensap.dat $UNIQUE" + echo "cp -prfv $UNIQUE.dir ${EDIR}" + done +done >> ${SCRIPT} +echo "mv $SCRIPT $EDIR" >> $SCRIPT +chmod 755 ${SCRIPT} +cat ${SCRIPT} +echo "This script is at '${SCRIPT}', commit it." 
+#+end_src + +#+RESULTS: +#+begin_example +if [ -e /home/lschnorr//exp_prep-4_grisou_4 ]; then + echo "Directory/File '/home/lschnorr//exp_prep-4_grisou_4' already exists. Remove it or rename it." + exit +fi +mkdir -p /home/lschnorr//exp_prep-4_grisou_4 +./scripts/run_alya_experiment.sh machine-file 63 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat prep-4_grisou_4_63_sfc +cp -prfv prep-4_grisou_4_63_sfc.dir /home/lschnorr//exp_prep-4_grisou_4 +./scripts/run_alya_experiment.sh machine-file 63 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat prep-4_grisou_4_63_metis +cp -prfv prep-4_grisou_4_63_metis.dir /home/lschnorr//exp_prep-4_grisou_4 +./scripts/run_alya_experiment.sh machine-file 49 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat prep-4_grisou_4_49_sfc +cp -prfv prep-4_grisou_4_49_sfc.dir /home/lschnorr//exp_prep-4_grisou_4 +./scripts/run_alya_experiment.sh machine-file 49 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat prep-4_grisou_4_49_metis +cp -prfv prep-4_grisou_4_49_metis.dir /home/lschnorr//exp_prep-4_grisou_4 +./scripts/run_alya_experiment.sh machine-file 33 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat prep-4_grisou_4_33_sfc +cp -prfv prep-4_grisou_4_33_sfc.dir /home/lschnorr//exp_prep-4_grisou_4 +./scripts/run_alya_experiment.sh machine-file 33 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat prep-4_grisou_4_33_metis +cp -prfv prep-4_grisou_4_33_metis.dir /home/lschnorr//exp_prep-4_grisou_4 +./scripts/run_alya_experiment.sh machine-file 18 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat prep-4_grisou_4_18_sfc +cp -prfv prep-4_grisou_4_18_sfc.dir /home/lschnorr//exp_prep-4_grisou_4 +./scripts/run_alya_experiment.sh 
machine-file 18 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat prep-4_grisou_4_18_metis +cp -prfv prep-4_grisou_4_18_metis.dir /home/lschnorr//exp_prep-4_grisou_4 +mv script_prep-4_grisou_4_nodes.sh /home/lschnorr//exp_prep-4_grisou_4 +This script is at 'script_prep-4_grisou_4_nodes.sh', commit it. +#+end_example +*** Full 23-node nova + +JobId 854496 @ Lyon. + +#+begin_src shell :results output +EID=4 +CLUSTER=grisou +NODES=23 +MACHINEFILE=machine-file +G5KHOME=/home/lschnorr/ +SCRIPT="script_${EID}_${CLUSTER}_${NODES}_nodes.sh" + +rm -f $SCRIPT + +EDIR="$G5KHOME/exp_${EID}_${CLUSTER}_${NODES}" +echo "if [ -e "$EDIR" ]; then" >> $SCRIPT +echo " echo \"Directory/File '${EDIR}' already exists. Remove it or rename it.\"" >> $SCRIPT +echo " exit" >> $SCRIPT +echo "fi" >> $SCRIPT + +# Create experimental directory in the home +echo "mkdir -p $EDIR" >> $SCRIPT + +# The design +for NP in 367 352 336 272 208 144 80 16; do + for CASE in sfc metis; do + NPN=$((NP+1)) + UNIQUE=${EID}_${CLUSTER}_${NODES}_${NPN}_${CASE} + echo "./scripts/run_alya_experiment.sh ${MACHINEFILE} ${NPN} /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_${CASE}/fensap.dat $UNIQUE" + echo "cp -prfv $UNIQUE.dir ${EDIR}" + done +done >> ${SCRIPT} +echo "cp $SCRIPT $EDIR" >> $SCRIPT +chmod 755 ${SCRIPT} +cat ${SCRIPT} +echo "This script is at '${SCRIPT}', commit it." +#+end_src + +#+RESULTS: +#+begin_example +if [ -e /home/lschnorr//exp_4_grisou_23 ]; then + echo "Directory/File '/home/lschnorr//exp_4_grisou_23' already exists. Remove it or rename it." 
+ exit +fi +mkdir -p /home/lschnorr//exp_4_grisou_23 +./scripts/run_alya_experiment.sh machine-file 368 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 4_grisou_23_368_sfc +cp -prfv 4_grisou_23_368_sfc.dir /home/lschnorr//exp_4_grisou_23 +./scripts/run_alya_experiment.sh machine-file 368 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 4_grisou_23_368_metis +cp -prfv 4_grisou_23_368_metis.dir /home/lschnorr//exp_4_grisou_23 +./scripts/run_alya_experiment.sh machine-file 353 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 4_grisou_23_353_sfc +cp -prfv 4_grisou_23_353_sfc.dir /home/lschnorr//exp_4_grisou_23 +./scripts/run_alya_experiment.sh machine-file 353 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 4_grisou_23_353_metis +cp -prfv 4_grisou_23_353_metis.dir /home/lschnorr//exp_4_grisou_23 +./scripts/run_alya_experiment.sh machine-file 337 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 4_grisou_23_337_sfc +cp -prfv 4_grisou_23_337_sfc.dir /home/lschnorr//exp_4_grisou_23 +./scripts/run_alya_experiment.sh machine-file 337 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 4_grisou_23_337_metis +cp -prfv 4_grisou_23_337_metis.dir /home/lschnorr//exp_4_grisou_23 +./scripts/run_alya_experiment.sh machine-file 273 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 4_grisou_23_273_sfc +cp -prfv 4_grisou_23_273_sfc.dir /home/lschnorr//exp_4_grisou_23 +./scripts/run_alya_experiment.sh machine-file 273 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 4_grisou_23_273_metis +cp -prfv 4_grisou_23_273_metis.dir /home/lschnorr//exp_4_grisou_23 +./scripts/run_alya_experiment.sh machine-file 209 
/home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 4_grisou_23_209_sfc +cp -prfv 4_grisou_23_209_sfc.dir /home/lschnorr//exp_4_grisou_23 +./scripts/run_alya_experiment.sh machine-file 209 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 4_grisou_23_209_metis +cp -prfv 4_grisou_23_209_metis.dir /home/lschnorr//exp_4_grisou_23 +./scripts/run_alya_experiment.sh machine-file 145 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 4_grisou_23_145_sfc +cp -prfv 4_grisou_23_145_sfc.dir /home/lschnorr//exp_4_grisou_23 +./scripts/run_alya_experiment.sh machine-file 145 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 4_grisou_23_145_metis +cp -prfv 4_grisou_23_145_metis.dir /home/lschnorr//exp_4_grisou_23 +./scripts/run_alya_experiment.sh machine-file 81 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 4_grisou_23_81_sfc +cp -prfv 4_grisou_23_81_sfc.dir /home/lschnorr//exp_4_grisou_23 +./scripts/run_alya_experiment.sh machine-file 81 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 4_grisou_23_81_metis +cp -prfv 4_grisou_23_81_metis.dir /home/lschnorr//exp_4_grisou_23 +./scripts/run_alya_experiment.sh machine-file 17 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 4_grisou_23_17_sfc +cp -prfv 4_grisou_23_17_sfc.dir /home/lschnorr//exp_4_grisou_23 +./scripts/run_alya_experiment.sh machine-file 17 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 4_grisou_23_17_metis +cp -prfv 4_grisou_23_17_metis.dir /home/lschnorr//exp_4_grisou_23 +cp script_4_grisou_23_nodes.sh /home/lschnorr//exp_4_grisou_23 +This script is at 'script_4_grisou_23_nodes.sh', commit it. 
+#+end_example
+
+** 2017-04-03 50-node (800 cores) grisou on April 4th :EXP5:
+#+begin_src shell :results output
+EID=5
+VERSION=1
+CLUSTER=grisou
+NODES=50
+MACHINEFILE=machine-file
+G5KHOME=/home/lschnorr/
+SCRIPT="script_${EID}_v${VERSION}_${CLUSTER}_${NODES}_nodes.sh"
+
+rm -f $SCRIPT
+
+EDIR="$G5KHOME/exp_${EID}-v${VERSION}_${CLUSTER}_${NODES}"
+echo "if [ -e "$EDIR" ]; then" >> $SCRIPT
+echo " echo \"Directory/File '${EDIR}' already exists. Remove it or rename it.\"" >> $SCRIPT
+echo " exit" >> $SCRIPT
+echo "fi" >> $SCRIPT
+
+# Create experimental directory in the home
+echo "mkdir -p $EDIR" >> $SCRIPT
+
+# The design
+for NP in 800 720 640 560 480 400; do
+    for CASE in sfc metis; do
+        NPN=$((NP))
+        UNIQUE=${EID}_${CLUSTER}_${NODES}_${NPN}_${CASE}
+        echo "./scripts/run_alya_experiment.sh ${MACHINEFILE} ${NPN} /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_${CASE}/fensap.dat $UNIQUE false"
+        echo "cp -prfv $UNIQUE.dir ${EDIR}"
+    done
+done >> ${SCRIPT}
+echo "cp $SCRIPT $EDIR" >> $SCRIPT
+chmod 755 ${SCRIPT}
+cat ${SCRIPT}
+echo "This script is at '${SCRIPT}', commit it."
+#+end_src
+
+#+RESULTS:
+#+begin_example
+if [ -e /home/lschnorr//exp_5-v1_grisou_50 ]; then
+ echo "Directory/File '/home/lschnorr//exp_5-v1_grisou_50' already exists. Remove it or rename it."
+ exit +fi +mkdir -p /home/lschnorr//exp_5-v1_grisou_50 +./scripts/run_alya_experiment.sh machine-file 800 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 5_grisou_50_800_sfc false +cp -prfv 5_grisou_50_800_sfc.dir /home/lschnorr//exp_5-v1_grisou_50 +./scripts/run_alya_experiment.sh machine-file 800 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 5_grisou_50_800_metis false +cp -prfv 5_grisou_50_800_metis.dir /home/lschnorr//exp_5-v1_grisou_50 +./scripts/run_alya_experiment.sh machine-file 720 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 5_grisou_50_720_sfc false +cp -prfv 5_grisou_50_720_sfc.dir /home/lschnorr//exp_5-v1_grisou_50 +./scripts/run_alya_experiment.sh machine-file 720 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 5_grisou_50_720_metis false +cp -prfv 5_grisou_50_720_metis.dir /home/lschnorr//exp_5-v1_grisou_50 +./scripts/run_alya_experiment.sh machine-file 640 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 5_grisou_50_640_sfc false +cp -prfv 5_grisou_50_640_sfc.dir /home/lschnorr//exp_5-v1_grisou_50 +./scripts/run_alya_experiment.sh machine-file 640 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 5_grisou_50_640_metis false +cp -prfv 5_grisou_50_640_metis.dir /home/lschnorr//exp_5-v1_grisou_50 +./scripts/run_alya_experiment.sh machine-file 560 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 5_grisou_50_560_sfc false +cp -prfv 5_grisou_50_560_sfc.dir /home/lschnorr//exp_5-v1_grisou_50 +./scripts/run_alya_experiment.sh machine-file 560 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 5_grisou_50_560_metis false +cp -prfv 5_grisou_50_560_metis.dir 
/home/lschnorr//exp_5-v1_grisou_50 +./scripts/run_alya_experiment.sh machine-file 480 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 5_grisou_50_480_sfc false +cp -prfv 5_grisou_50_480_sfc.dir /home/lschnorr//exp_5-v1_grisou_50 +./scripts/run_alya_experiment.sh machine-file 480 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 5_grisou_50_480_metis false +cp -prfv 5_grisou_50_480_metis.dir /home/lschnorr//exp_5-v1_grisou_50 +./scripts/run_alya_experiment.sh machine-file 400 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 5_grisou_50_400_sfc false +cp -prfv 5_grisou_50_400_sfc.dir /home/lschnorr//exp_5-v1_grisou_50 +./scripts/run_alya_experiment.sh machine-file 400 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 5_grisou_50_400_metis false +cp -prfv 5_grisou_50_400_metis.dir /home/lschnorr//exp_5-v1_grisou_50 +cp script_5_v1_grisou_50_nodes.sh /home/lschnorr//exp_5-v1_grisou_50 +This script is at 'script_5_v1_grisou_50_nodes.sh', commit it. +#+end_example + +** 2017-04-03 7-node (112 cores) grimoire with Infiniband :EXP6: +#+begin_src shell :results output +EID=6 +CLUSTER=grimoire +NODES=7 +MACHINEFILE=machine-file +G5KHOME=/home/lschnorr/ +SCRIPT="script_${EID}_${CLUSTER}_${NODES}_nodes.sh" + +rm -f $SCRIPT + +EDIR="$G5KHOME/exp_${EID}_${CLUSTER}_${NODES}" +echo "if [ -e "$EDIR" ]; then" >> $SCRIPT +echo " echo \"Directory/File '${EDIR}' already exists. 
Remove it or rename it.\"" >> $SCRIPT +echo " exit" >> $SCRIPT +echo "fi" >> $SCRIPT + +# Create experimental directory in the home +echo "mkdir -p $EDIR" >> $SCRIPT + +# The design +for NP in 111 96; do + for CASE in sfc metis; do + NPN=$((NP+1)) + UNIQUE=${EID}_${CLUSTER}_${NODES}_${NPN}_${CASE} + echo "./scripts/run_alya_experiment.sh ${MACHINEFILE} ${NPN} /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_${CASE}/fensap.dat $UNIQUE true" + echo "cp -prfv $UNIQUE.dir ${EDIR}" + done +done >> ${SCRIPT} +echo "cp $SCRIPT $EDIR" >> $SCRIPT +chmod 755 ${SCRIPT} +cat ${SCRIPT} +echo "This script is at '${SCRIPT}', commit it." +#+end_src + +#+RESULTS: +#+begin_example +if [ -e /home/lschnorr//exp_6_grimoire_7 ]; then + echo "Directory/File '/home/lschnorr//exp_6_grimoire_7' already exists. Remove it or rename it." + exit +fi +mkdir -p /home/lschnorr//exp_6_grimoire_7 +./scripts/run_alya_experiment.sh machine-file 112 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 6_grimoire_7_112_sfc true +cp -prfv 6_grimoire_7_112_sfc.dir /home/lschnorr//exp_6_grimoire_7 +./scripts/run_alya_experiment.sh machine-file 112 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 6_grimoire_7_112_metis true +cp -prfv 6_grimoire_7_112_metis.dir /home/lschnorr//exp_6_grimoire_7 +./scripts/run_alya_experiment.sh machine-file 97 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 6_grimoire_7_97_sfc true +cp -prfv 6_grimoire_7_97_sfc.dir /home/lschnorr//exp_6_grimoire_7 +./scripts/run_alya_experiment.sh machine-file 97 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 6_grimoire_7_97_metis true +cp -prfv 6_grimoire_7_97_metis.dir /home/lschnorr//exp_6_grimoire_7 +cp script_6_grimoire_7_nodes.sh /home/lschnorr//exp_6_grimoire_7 +This script is at 'script_6_grimoire_7_nodes.sh', commit it. 
+#+end_example
+
+** 2017-04-04 4-node (64 cores) grimoire with Infiniband :EXP7:
+
+The goal is simply to confirm experimentally that OpenMPI is doing
+openib communications instead of ethernet when told to do so.
+
+#+begin_src shell :results output
+EID=7
+CLUSTER=grimoire
+NODES=4
+MACHINEFILE=machine-file
+G5KHOME=/home/lschnorr/
+SCRIPT="script_${EID}_${CLUSTER}_${NODES}_nodes.sh"
+
+rm -f $SCRIPT
+
+EDIR="$G5KHOME/exp_${EID}_${CLUSTER}_${NODES}"
+echo "if [ -e "$EDIR" ]; then" >> $SCRIPT
+echo " echo \"Directory/File '${EDIR}' already exists. Remove it or rename it.\"" >> $SCRIPT
+echo " exit" >> $SCRIPT
+echo "fi" >> $SCRIPT
+
+# Create experimental directory in the home
+echo "mkdir -p $EDIR" >> $SCRIPT
+
+# The design
+NPN=112
+for INFINIBAND in true false; do
+    for CASE in sfc metis; do
+        UNIQUE=${EID}_${CLUSTER}_${NODES}_${NPN}_${CASE}
+        echo "./scripts/run_alya_experiment.sh ${MACHINEFILE} ${NPN} /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_${CASE}/fensap.dat $UNIQUE ${INFINIBAND}"
+        echo "cp -prfv $UNIQUE.dir ${EDIR}"
+    done
+done >> ${SCRIPT}
+echo "cp $SCRIPT $EDIR" >> $SCRIPT
+chmod 755 ${SCRIPT}
+cat ${SCRIPT}
+echo "This script is at '${SCRIPT}', commit it."
+#+end_src
+
+#+RESULTS:
+#+begin_example
+if [ -e /home/lschnorr//exp_7_grimoire_4 ]; then
+ echo "Directory/File '/home/lschnorr//exp_7_grimoire_4' already exists. Remove it or rename it."
+ exit +fi +mkdir -p /home/lschnorr//exp_7_grimoire_4 +./scripts/run_alya_experiment.sh machine-file 112 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 7_grimoire_4_112_sfc true +cp -prfv 7_grimoire_4_112_sfc.dir /home/lschnorr//exp_7_grimoire_4 +./scripts/run_alya_experiment.sh machine-file 112 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 7_grimoire_4_112_metis true +cp -prfv 7_grimoire_4_112_metis.dir /home/lschnorr//exp_7_grimoire_4 +./scripts/run_alya_experiment.sh machine-file 112 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 7_grimoire_4_112_sfc false +cp -prfv 7_grimoire_4_112_sfc.dir /home/lschnorr//exp_7_grimoire_4 +./scripts/run_alya_experiment.sh machine-file 112 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 7_grimoire_4_112_metis false +cp -prfv 7_grimoire_4_112_metis.dir /home/lschnorr//exp_7_grimoire_4 +cp script_7_grimoire_4_nodes.sh /home/lschnorr//exp_7_grimoire_4 +This script is at 'script_7_grimoire_4_nodes.sh', commit it. +#+end_example + +** 2017-04-05 2-node (32 cores) and 4-node (64 cores) grimoire to test Infiniband :EXP8: +#+begin_src shell :results output +EID=8 +VERSION=4 +CLUSTER=grimoire +NODES=4 +MACHINEFILE=machine-file +G5KHOME=/home/lschnorr/ +EID=${EID}-v${VERSION} +SCRIPT="script_${EID}_${CLUSTER}_${NODES}_nodes.sh" + +rm -f $SCRIPT + +EDIR="$G5KHOME/exp_${EID}_${CLUSTER}_${NODES}" +echo "if [ -e "$EDIR" ]; then" >> $SCRIPT +echo " echo \"Directory/File '${EDIR}' already exists. 
Remove it or rename it.\"" >> $SCRIPT +echo " exit" >> $SCRIPT +echo "fi" >> $SCRIPT + +# Create experimental directory in the home +echo "mkdir -p $EDIR" >> $SCRIPT + +# The design +NP=60 +for INFINIBAND in true false; do + for CASE in metis sfc; do + NPN=$((NP)) + UNIQUE=${EID}_${CLUSTER}_${NODES}_${NPN}_${CASE}_${INFINIBAND} + echo "# " + echo "# Experiment $UNIQUE" + echo "# " + echo "./scripts/run_alya_experiment.sh ${MACHINEFILE} ${NPN} /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_${CASE}/fensap.dat $UNIQUE ${INFINIBAND}" + echo "cp -prfv $UNIQUE.dir ${EDIR}" + echo "rm -rf $UNIQUE.dir" + done +done >> ${SCRIPT} +echo "# " >> ${SCRIPT} +echo "# Final" >> ${SCRIPT} +echo "# " >> ${SCRIPT} +echo "cp $SCRIPT $EDIR" >> $SCRIPT +chmod 755 ${SCRIPT} +cat ${SCRIPT} +echo "This script is at '${SCRIPT}', commit it." +#+end_src + +#+RESULTS: +#+begin_example +if [ -e /home/lschnorr//exp_8-v4_grimoire_4 ]; then + echo "Directory/File '/home/lschnorr//exp_8-v4_grimoire_4' already exists. Remove it or rename it." 
+ exit +fi +mkdir -p /home/lschnorr//exp_8-v4_grimoire_4 +# +# Experiment 8-v4_grimoire_4_60_metis_true +# +./scripts/run_alya_experiment.sh machine-file 60 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 8-v4_grimoire_4_60_metis_true true +cp -prfv 8-v4_grimoire_4_60_metis_true.dir /home/lschnorr//exp_8-v4_grimoire_4 +rm -rf 8-v4_grimoire_4_60_metis_true.dir +# +# Experiment 8-v4_grimoire_4_60_sfc_true +# +./scripts/run_alya_experiment.sh machine-file 60 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 8-v4_grimoire_4_60_sfc_true true +cp -prfv 8-v4_grimoire_4_60_sfc_true.dir /home/lschnorr//exp_8-v4_grimoire_4 +rm -rf 8-v4_grimoire_4_60_sfc_true.dir +# +# Experiment 8-v4_grimoire_4_60_metis_false +# +./scripts/run_alya_experiment.sh machine-file 60 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 8-v4_grimoire_4_60_metis_false false +cp -prfv 8-v4_grimoire_4_60_metis_false.dir /home/lschnorr//exp_8-v4_grimoire_4 +rm -rf 8-v4_grimoire_4_60_metis_false.dir +# +# Experiment 8-v4_grimoire_4_60_sfc_false +# +./scripts/run_alya_experiment.sh machine-file 60 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 8-v4_grimoire_4_60_sfc_false false +cp -prfv 8-v4_grimoire_4_60_sfc_false.dir /home/lschnorr//exp_8-v4_grimoire_4 +rm -rf 8-v4_grimoire_4_60_sfc_false.dir +# +# Final +# +cp script_8-v4_grimoire_4_nodes.sh /home/lschnorr//exp_8-v4_grimoire_4 +This script is at 'script_8-v4_grimoire_4_nodes.sh', commit it. 
+#+end_example + +** 2017-04-05 6-node grimoire with Infiniband :EXP9: +#+begin_src shell :results output +EID=9 +VERSION=1 +CLUSTER=grimoire +NODES=6 +MACHINEFILE=machine-file +G5KHOME=/home/lschnorr/ +EID=${EID}-v${VERSION} +SCRIPT="script_${EID}_${CLUSTER}_${NODES}_nodes.sh" + +rm -f $SCRIPT + +EDIR="$G5KHOME/exp_${EID}_${CLUSTER}_${NODES}" +echo "if [ -e "$EDIR" ]; then" >> $SCRIPT +echo " echo \"Directory/File '${EDIR}' already exists. Remove it or rename it.\"" >> $SCRIPT +echo " exit" >> $SCRIPT +echo "fi" >> $SCRIPT + +# Create experimental directory in the home +echo "mkdir -p $EDIR" >> $SCRIPT + +# The design +NP=96 +for INFINIBAND in true false; do + for CASE in metis sfc; do + NPN=$((NP)) + UNIQUE=${EID}_${CLUSTER}_${NODES}_${NPN}_${CASE}_${INFINIBAND} + echo "# " + echo "# Experiment $UNIQUE" + echo "# " + echo "./scripts/run_alya_experiment.sh ${MACHINEFILE} ${NPN} /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_${CASE}/fensap.dat $UNIQUE ${INFINIBAND}" + echo "cp -prfv $UNIQUE.dir ${EDIR}" + echo "rm -rf $UNIQUE.dir" + done +done >> ${SCRIPT} +echo "# " >> ${SCRIPT} +echo "# Final" >> ${SCRIPT} +echo "# " >> ${SCRIPT} +echo "cp $SCRIPT $EDIR" >> $SCRIPT +chmod 755 ${SCRIPT} +cat ${SCRIPT} +echo "This script is at '${SCRIPT}', commit it." +#+end_src + +#+RESULTS: +#+begin_example +if [ -e /home/lschnorr//exp_9-v1_grimoire_6 ]; then + echo "Directory/File '/home/lschnorr//exp_9-v1_grimoire_6' already exists. Remove it or rename it." 
+ exit +fi +mkdir -p /home/lschnorr//exp_9-v1_grimoire_6 +# +# Experiment 9-v1_grimoire_6_96_metis_true +# +./scripts/run_alya_experiment.sh machine-file 96 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 9-v1_grimoire_6_96_metis_true true +cp -prfv 9-v1_grimoire_6_96_metis_true.dir /home/lschnorr//exp_9-v1_grimoire_6 +rm -rf 9-v1_grimoire_6_96_metis_true.dir +# +# Experiment 9-v1_grimoire_6_96_sfc_true +# +./scripts/run_alya_experiment.sh machine-file 96 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 9-v1_grimoire_6_96_sfc_true true +cp -prfv 9-v1_grimoire_6_96_sfc_true.dir /home/lschnorr//exp_9-v1_grimoire_6 +rm -rf 9-v1_grimoire_6_96_sfc_true.dir +# +# Experiment 9-v1_grimoire_6_96_metis_false +# +./scripts/run_alya_experiment.sh machine-file 96 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 9-v1_grimoire_6_96_metis_false false +cp -prfv 9-v1_grimoire_6_96_metis_false.dir /home/lschnorr//exp_9-v1_grimoire_6 +rm -rf 9-v1_grimoire_6_96_metis_false.dir +# +# Experiment 9-v1_grimoire_6_96_sfc_false +# +./scripts/run_alya_experiment.sh machine-file 96 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 9-v1_grimoire_6_96_sfc_false false +cp -prfv 9-v1_grimoire_6_96_sfc_false.dir /home/lschnorr//exp_9-v1_grimoire_6 +rm -rf 9-v1_grimoire_6_96_sfc_false.dir +# +# Final +# +cp script_9-v1_grimoire_6_nodes.sh /home/lschnorr//exp_9-v1_grimoire_6 +This script is at 'script_9-v1_grimoire_6_nodes.sh', commit it. 
+#+end_example + +** 2017-04-06 8-node grimoire infiniband-only after Alya modifications :EXP10: +#+begin_src shell :results output +EID=10 +VERSION=1 +CLUSTER=grimoire +NODES=8 +MACHINEFILE=machine-file +G5KHOME=/home/lschnorr/ +EID=${EID}-v${VERSION} +SCRIPT="script_${EID}_${CLUSTER}_${NODES}_nodes.sh" + +rm -f $SCRIPT + +EDIR="$G5KHOME/exp_${EID}_${CLUSTER}_${NODES}" +echo "if [ -e "$EDIR" ]; then" >> $SCRIPT +echo " echo \"Directory/File '${EDIR}' already exists. Remove it or rename it.\"" >> $SCRIPT +echo " exit" >> $SCRIPT +echo "fi" >> $SCRIPT + +# Create experimental directory in the home +echo "mkdir -p $EDIR" >> $SCRIPT + +# The design +NP=128 +for INFINIBAND in true; do + for CASE in metis sfc; do + NPN=$((NP)) + UNIQUE=${EID}_${CLUSTER}_${NODES}_${NPN}_${CASE}_${INFINIBAND} + echo "# " + echo "# Experiment $UNIQUE" + echo "# " + echo "./scripts/run_alya_experiment.sh ${MACHINEFILE} ${NPN} /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_${CASE}/fensap.dat $UNIQUE ${INFINIBAND}" + echo "cp -prfv $UNIQUE.dir ${EDIR}" + echo "rm -rf $UNIQUE.dir" + done +done >> ${SCRIPT} +echo "# " >> ${SCRIPT} +echo "# Final" >> ${SCRIPT} +echo "# " >> ${SCRIPT} +echo "cp $SCRIPT $EDIR" >> $SCRIPT +chmod 755 ${SCRIPT} +cat ${SCRIPT} +echo "This script is at '${SCRIPT}', commit it." +#+end_src + +#+RESULTS: +#+begin_example +if [ -e /home/lschnorr//exp_10-v1_grimoire_8 ]; then + echo "Directory/File '/home/lschnorr//exp_10-v1_grimoire_8' already exists. Remove it or rename it." 
+ exit +fi +mkdir -p /home/lschnorr//exp_10-v1_grimoire_8 +# +# Experiment 10-v1_grimoire_8_128_metis_true +# +./scripts/run_alya_experiment.sh machine-file 128 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 10-v1_grimoire_8_128_metis_true true +cp -prfv 10-v1_grimoire_8_128_metis_true.dir /home/lschnorr//exp_10-v1_grimoire_8 +rm -rf 10-v1_grimoire_8_128_metis_true.dir +# +# Experiment 10-v1_grimoire_8_128_sfc_true +# +./scripts/run_alya_experiment.sh machine-file 128 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 10-v1_grimoire_8_128_sfc_true true +cp -prfv 10-v1_grimoire_8_128_sfc_true.dir /home/lschnorr//exp_10-v1_grimoire_8 +rm -rf 10-v1_grimoire_8_128_sfc_true.dir +# +# Final +# +cp script_10-v1_grimoire_8_nodes.sh /home/lschnorr//exp_10-v1_grimoire_8 +This script is at 'script_10-v1_grimoire_8_nodes.sh', commit it. +#+end_example + +** 2017-04-06 8-node grimoire infiniband-only without Alya modifications :EXP11: +I will compile Alya before the partitioning weight modifications. +#+begin_src shell :results output +EID=11 +VERSION=1 +CLUSTER=grimoire +NODES=8 +MACHINEFILE=machine-file +G5KHOME=/home/lschnorr/ +EID=${EID}-v${VERSION} +SCRIPT="script_${EID}_${CLUSTER}_${NODES}_nodes.sh" + +rm -f $SCRIPT + +EDIR="$G5KHOME/exp_${EID}_${CLUSTER}_${NODES}" +echo "if [ -e "$EDIR" ]; then" >> $SCRIPT +echo " echo \"Directory/File '${EDIR}' already exists. 
Remove it or rename it.\"" >> $SCRIPT +echo " exit" >> $SCRIPT +echo "fi" >> $SCRIPT + +# Create experimental directory in the home +echo "mkdir -p $EDIR" >> $SCRIPT + +# The design +NP=128 +for INFINIBAND in true; do + for CASE in metis sfc; do + NPN=$((NP)) + UNIQUE=${EID}_${CLUSTER}_${NODES}_${NPN}_${CASE}_${INFINIBAND} + echo "# " + echo "# Experiment $UNIQUE" + echo "# " + echo "./scripts/run_alya_experiment.sh ${MACHINEFILE} ${NPN} /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_${CASE}/fensap.dat $UNIQUE ${INFINIBAND}" + echo "cp -prfv $UNIQUE.dir ${EDIR}" + echo "rm -rf $UNIQUE.dir" + done +done >> ${SCRIPT} +echo "# " >> ${SCRIPT} +echo "# Final" >> ${SCRIPT} +echo "# " >> ${SCRIPT} +echo "cp $SCRIPT $EDIR" >> $SCRIPT +chmod 755 ${SCRIPT} +cat ${SCRIPT} +echo "This script is at '${SCRIPT}', commit it." +#+end_src + +#+RESULTS: +#+begin_example +if [ -e /home/lschnorr//exp_11-v1_grimoire_8 ]; then + echo "Directory/File '/home/lschnorr//exp_11-v1_grimoire_8' already exists. Remove it or rename it." + exit +fi +mkdir -p /home/lschnorr//exp_11-v1_grimoire_8 +# +# Experiment 11-v1_grimoire_8_128_metis_true +# +./scripts/run_alya_experiment.sh machine-file 128 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 11-v1_grimoire_8_128_metis_true true +cp -prfv 11-v1_grimoire_8_128_metis_true.dir /home/lschnorr//exp_11-v1_grimoire_8 +rm -rf 11-v1_grimoire_8_128_metis_true.dir +# +# Experiment 11-v1_grimoire_8_128_sfc_true +# +./scripts/run_alya_experiment.sh machine-file 128 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 11-v1_grimoire_8_128_sfc_true true +cp -prfv 11-v1_grimoire_8_128_sfc_true.dir /home/lschnorr//exp_11-v1_grimoire_8 +rm -rf 11-v1_grimoire_8_128_sfc_true.dir +# +# Final +# +cp script_11-v1_grimoire_8_nodes.sh /home/lschnorr//exp_11-v1_grimoire_8 +This script is at 'script_11-v1_grimoire_8_nodes.sh', commit it. 
+#+end_example + +** 2017-04-07 6-node grimoire (no Alya modifications, full computing instrumentations) :EXP12: + +For this experiment, we shall: +- remove MPI instrumentation completely +- adopt compiler instrumentation to obtain some application functions + +#+begin_src shell :results output +EID=12 +VERSION=1 +CLUSTER=grimoire +NODES=6 +MACHINEFILE=machine-file +G5KHOME=/home/lschnorr/ +EID=${EID}-v${VERSION} +SCRIPT="script_${EID}_${CLUSTER}_${NODES}_nodes.sh" + +rm -f $SCRIPT + +EDIR="$G5KHOME/exp_${EID}_${CLUSTER}_${NODES}" +echo "if [ -e "$EDIR" ]; then" >> $SCRIPT +echo " echo \"Directory/File '${EDIR}' already exists. Remove it or rename it.\"" >> $SCRIPT +echo " exit" >> $SCRIPT +echo "fi" >> $SCRIPT + +# Create experimental directory in the home +echo "mkdir -p $EDIR" >> $SCRIPT + +# The design +for NP in 96 48; do + for INFINIBAND in true; do + for CASE in metis sfc; do + NPN=$((NP)) + UNIQUE=${EID}_${CLUSTER}_${NODES}_${NPN}_${CASE}_${INFINIBAND} + echo "# " + echo "# Experiment $UNIQUE" + echo "# " + echo "./scripts/run_alya_experiment.sh ${MACHINEFILE} ${NPN} /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_${CASE}/fensap.dat $UNIQUE ${INFINIBAND}" + echo "cp -prfv $UNIQUE.dir ${EDIR}" + echo "rm -rf $UNIQUE.dir" + done + done +done >> ${SCRIPT} +echo "# " >> ${SCRIPT} +echo "# Final" >> ${SCRIPT} +echo "# " >> ${SCRIPT} +echo "cp $SCRIPT $EDIR" >> $SCRIPT +chmod 755 ${SCRIPT} +cat ${SCRIPT} +echo "This script is at '${SCRIPT}', commit it." +#+end_src + +#+RESULTS: +#+begin_example +if [ -e /home/lschnorr//exp_12-v1_grimoire_6 ]; then + echo "Directory/File '/home/lschnorr//exp_12-v1_grimoire_6' already exists. Remove it or rename it." 
+ exit +fi +mkdir -p /home/lschnorr//exp_12-v1_grimoire_6 +# +# Experiment 12-v1_grimoire_6_96_metis_true +# +./scripts/run_alya_experiment.sh machine-file 96 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 12-v1_grimoire_6_96_metis_true true +cp -prfv 12-v1_grimoire_6_96_metis_true.dir /home/lschnorr//exp_12-v1_grimoire_6 +rm -rf 12-v1_grimoire_6_96_metis_true.dir +# +# Experiment 12-v1_grimoire_6_96_sfc_true +# +./scripts/run_alya_experiment.sh machine-file 96 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 12-v1_grimoire_6_96_sfc_true true +cp -prfv 12-v1_grimoire_6_96_sfc_true.dir /home/lschnorr//exp_12-v1_grimoire_6 +rm -rf 12-v1_grimoire_6_96_sfc_true.dir +# +# Experiment 12-v1_grimoire_6_48_metis_true +# +./scripts/run_alya_experiment.sh machine-file 48 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 12-v1_grimoire_6_48_metis_true true +cp -prfv 12-v1_grimoire_6_48_metis_true.dir /home/lschnorr//exp_12-v1_grimoire_6 +rm -rf 12-v1_grimoire_6_48_metis_true.dir +# +# Experiment 12-v1_grimoire_6_48_sfc_true +# +./scripts/run_alya_experiment.sh machine-file 48 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 12-v1_grimoire_6_48_sfc_true true +cp -prfv 12-v1_grimoire_6_48_sfc_true.dir /home/lschnorr//exp_12-v1_grimoire_6 +rm -rf 12-v1_grimoire_6_48_sfc_true.dir +# +# Final +# +cp script_12-v1_grimoire_6_nodes.sh /home/lschnorr//exp_12-v1_grimoire_6 +This script is at 'script_12-v1_grimoire_6_nodes.sh', commit it. 
+#+end_example + +** 2017-04-10 12-node grisou (computing instrumentations, Alya modifs yes/no) :EXP13: +#+begin_src shell :results output +EID=13 +VERSION=1 +CLUSTER=grisou +NODES=12 +MACHINEFILE=machine-file +G5KHOME=/home/lschnorr/ +EID=${EID}-v${VERSION} +SCRIPT="script_${EID}_${CLUSTER}_${NODES}_nodes.sh" + +rm -f $SCRIPT + +EDIR="$G5KHOME/exp_${EID}_${CLUSTER}_${NODES}" +echo "if [ -e "$EDIR" ]; then" >> $SCRIPT +echo " echo \"Directory/File '${EDIR}' already exists. Remove it or rename it.\"" >> $SCRIPT +echo " exit" >> $SCRIPT +echo "fi" >> $SCRIPT + +# Create experimental directory in the home +echo "mkdir -p $EDIR" >> $SCRIPT + +# The design +ALYAO=Alya.x +ALYAM=Alya.x.modif + +for NP in 192 96; do + for INFINIBAND in true; do + for CASE in metis sfc; do + for ALYA in $ALYAO $ALYAM; do + NPN=$((NP)) + UNIQUE=${EID}_${CLUSTER}_${NODES}_${NPN}_${CASE}_${INFINIBAND}_${ALYA} + echo "# " + echo "# Experiment $UNIQUE" + echo "# " + echo "./scripts/run_alya_experiment.sh ${MACHINEFILE} ${NPN} /home/lschnorr/alya-bsc/Executables/unix/$ALYA /home/lschnorr/WORK-RICARD/resp_${CASE}/fensap.dat $UNIQUE ${INFINIBAND}" + echo "cp -prfv $UNIQUE.dir ${EDIR}" + echo "rm -rf $UNIQUE.dir" + done + done + done >> ${SCRIPT} +done +echo "# " >> ${SCRIPT} +echo "# Final" >> ${SCRIPT} +echo "# " >> ${SCRIPT} +echo "cp $SCRIPT $EDIR" >> $SCRIPT +chmod 755 ${SCRIPT} +cat ${SCRIPT} +echo "This script is at '${SCRIPT}', commit it." +#+end_src + +#+RESULTS: +#+begin_example +if [ -e /home/lschnorr//exp_13-v1_grisou_12 ]; then + echo "Directory/File '/home/lschnorr//exp_13-v1_grisou_12' already exists. Remove it or rename it." 
+ exit +fi +mkdir -p /home/lschnorr//exp_13-v1_grisou_12 +# +# Experiment 13-v1_grisou_12_192_metis_true_Alya.x +# +./scripts/run_alya_experiment.sh machine-file 192 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 13-v1_grisou_12_192_metis_true_Alya.x true +cp -prfv 13-v1_grisou_12_192_metis_true_Alya.x.dir /home/lschnorr//exp_13-v1_grisou_12 +rm -rf 13-v1_grisou_12_192_metis_true_Alya.x.dir +# +# Experiment 13-v1_grisou_12_192_metis_true_Alya.x.modif +# +./scripts/run_alya_experiment.sh machine-file 192 /home/lschnorr/alya-bsc/Executables/unix/Alya.x.modif /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 13-v1_grisou_12_192_metis_true_Alya.x.modif true +cp -prfv 13-v1_grisou_12_192_metis_true_Alya.x.modif.dir /home/lschnorr//exp_13-v1_grisou_12 +rm -rf 13-v1_grisou_12_192_metis_true_Alya.x.modif.dir +# +# Experiment 13-v1_grisou_12_192_sfc_true_Alya.x +# +./scripts/run_alya_experiment.sh machine-file 192 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 13-v1_grisou_12_192_sfc_true_Alya.x true +cp -prfv 13-v1_grisou_12_192_sfc_true_Alya.x.dir /home/lschnorr//exp_13-v1_grisou_12 +rm -rf 13-v1_grisou_12_192_sfc_true_Alya.x.dir +# +# Experiment 13-v1_grisou_12_192_sfc_true_Alya.x.modif +# +./scripts/run_alya_experiment.sh machine-file 192 /home/lschnorr/alya-bsc/Executables/unix/Alya.x.modif /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 13-v1_grisou_12_192_sfc_true_Alya.x.modif true +cp -prfv 13-v1_grisou_12_192_sfc_true_Alya.x.modif.dir /home/lschnorr//exp_13-v1_grisou_12 +rm -rf 13-v1_grisou_12_192_sfc_true_Alya.x.modif.dir +# +# Experiment 13-v1_grisou_12_96_metis_true_Alya.x +# +./scripts/run_alya_experiment.sh machine-file 96 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 13-v1_grisou_12_96_metis_true_Alya.x true +cp -prfv 13-v1_grisou_12_96_metis_true_Alya.x.dir /home/lschnorr//exp_13-v1_grisou_12 +rm -rf 
13-v1_grisou_12_96_metis_true_Alya.x.dir +# +# Experiment 13-v1_grisou_12_96_metis_true_Alya.x.modif +# +./scripts/run_alya_experiment.sh machine-file 96 /home/lschnorr/alya-bsc/Executables/unix/Alya.x.modif /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 13-v1_grisou_12_96_metis_true_Alya.x.modif true +cp -prfv 13-v1_grisou_12_96_metis_true_Alya.x.modif.dir /home/lschnorr//exp_13-v1_grisou_12 +rm -rf 13-v1_grisou_12_96_metis_true_Alya.x.modif.dir +# +# Experiment 13-v1_grisou_12_96_sfc_true_Alya.x +# +./scripts/run_alya_experiment.sh machine-file 96 /home/lschnorr/alya-bsc/Executables/unix/Alya.x /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 13-v1_grisou_12_96_sfc_true_Alya.x true +cp -prfv 13-v1_grisou_12_96_sfc_true_Alya.x.dir /home/lschnorr//exp_13-v1_grisou_12 +rm -rf 13-v1_grisou_12_96_sfc_true_Alya.x.dir +# +# Experiment 13-v1_grisou_12_96_sfc_true_Alya.x.modif +# +./scripts/run_alya_experiment.sh machine-file 96 /home/lschnorr/alya-bsc/Executables/unix/Alya.x.modif /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 13-v1_grisou_12_96_sfc_true_Alya.x.modif true +cp -prfv 13-v1_grisou_12_96_sfc_true_Alya.x.modif.dir /home/lschnorr//exp_13-v1_grisou_12 +rm -rf 13-v1_grisou_12_96_sfc_true_Alya.x.modif.dir +# +# Final +# +cp script_13-v1_grisou_12_nodes.sh /home/lschnorr//exp_13-v1_grisou_12 +This script is at 'script_13-v1_grisou_12_nodes.sh', commit it. +#+end_example + +** 2017-05-10 20-node grisou :EXP14: +*** Introduction :ATTACH: +:PROPERTIES: +:Attachments: all.scorep +:ID: b51a80da-07e4-4d58-844c-a73009092eaa +:END: + +The attached file, =all.scorep=, contains the functions that are +instrumented during compilation by scorep. Those functions are non-MPI +codes that represent the computation load. + +I am using Trunk.6515.Alya, unmodified, to get =Alya.x.orig=. 
Then, I
+manually modify Alya as described here: [[*2017-04-06 Alya modifications to get more info, set balance to 1][2017-04-06 Alya
+modifications to get more info, set balance to 1]] to get another binary
+called =Alya.x.modif= (with element weights set to 1). It seems that the
+latest version of Alya (revision 6515 at least) has slightly
+different code to deal with element weights (through the =iweig=
+variable). So, for the =modif= version, I force =iweig= to =1_rp= right
+after the =if= that is now there, effectively bypassing it.
+
+#+BEGIN_EXAMPLE
+lschnorr@grisou-1:~/Trunk.6515.Alya/Executables/unix$ svn diff ../../Sources/kernel/parall/mod_par_partit_sfc.f90
+Index: ../../Sources/kernel/parall/mod_par_partit_sfc.f90
+===================================================================
+--- ../../Sources/kernel/parall/mod_par_partit_sfc.f90	(revision 6515)
++++ ../../Sources/kernel/parall/mod_par_partit_sfc.f90	(working copy)
+@@ -459,6 +459,7 @@
+         else
+            iweig = 1_rp
+         endif
++        iweig = 1_rp
+
+         if( PAR_MY_SFC_RANK_WM /= iboxc-1 )then !if true element must be sent
+            if(bufwei(iboxl*2) == 0_rp) then
+#+END_EXAMPLE
+
+Note that the file =Sources/services/parall/par_outinf.f90= needs to be
+modified to output per-rank weights, points, etc. This is essential
+for conducting our analysis. Here's the diff:
+
+#+BEGIN_EXAMPLE
+lschnorr@grisou-1:~/Trunk.6515.Alya/Executables/unix$ svn diff ../../Sources/services/parall/par_outinf.f90
+Index: ../../Sources/services/parall/par_outinf.f90
+===================================================================
+--- ../../Sources/services/parall/par_outinf.f90	(revision 6515)
++++ ../../Sources/services/parall/par_outinf.f90	(working copy)
+@@ -15,6 +15,7 @@
+   use mod_parall
+   implicit none
+   integer(ip) :: isubd,ksmin(2),ksmax(2),kaver,koutp,nb
++  integer, parameter :: out_unit_npoin=20
+
+   if( IMASTER ) then
+
+@@ -27,7 +28,11 @@
+      ksmin = huge(1_ip)
+      ksmax = -huge(1_ip)
+      kaver = 0
++     ! 
LUCAS: manual instrumentation starts here, all code marked by LUCAS
++     open (unit=out_unit_npoin,file="results_NPOIN_NELEM_NELEW_NBOUN.log",action="write",status="replace")
+      do isubd=1,npart_par
++        ! LUCAS: Here I need to print: npoin_par(isubd)
++        write (out_unit_npoin,*) "Rank ", isubd ," NPOIN ", npoin_par(isubd), " NELEM ", nelem_par(isubd), " NELEW ", nelew_par(isubd), " NBOUN ", nboun_par(isubd), " NNEIG ", lneig_par(isubd), " NBBOU ", npoin_par(isubd)
+         if(npoin_par(isubd)<
+#+END_EXAMPLE
+
+*** Design
+
+#+begin_src shell :results output
+EID=14
+VERSION=1
+CLUSTER=grisou
+NODES=20
+MACHINEFILE=machine-file
+G5KHOME=/home/lschnorr/
+EID=${EID}-v${VERSION}
+SCRIPT="script_${EID}_${CLUSTER}_${NODES}_nodes.sh"
+
+rm -f $SCRIPT
+
+EDIR="$G5KHOME/exp_${EID}_${CLUSTER}_${NODES}"
+echo "if [ -e "$EDIR" ]; then" >> $SCRIPT
+echo " echo \"Directory/File '${EDIR}' already exists. Remove it or rename it.\"" >> $SCRIPT
+echo " exit" >> $SCRIPT
+echo "fi" >> $SCRIPT
+
+# Create experimental directory in the home
+echo "mkdir -p $EDIR" >> $SCRIPT
+
+# The design
+ALYAO=Alya.x.orig
+ALYAM=Alya.x.modif
+ALYAPATH=/home/lschnorr/Trunk.6515.Alya
+
+for NP in 320; do
+    for INFINIBAND in false; do
+        for CASE in metis sfc; do
+            for ALYA in $ALYAO $ALYAM; do
+                NPN=$((NP))
+                UNIQUE=${EID}_${CLUSTER}_${NODES}_${NPN}_${CASE}_${INFINIBAND}_${ALYA}
+                echo "# "
+                echo "# Experiment $UNIQUE"
+                echo "# "
+                echo "./scripts/run_alya_experiment.sh ${MACHINEFILE} ${NPN} ${ALYAPATH}/Executables/unix/$ALYA /home/lschnorr/WORK-RICARD/resp_${CASE}/fensap.dat $UNIQUE ${INFINIBAND}"
+                echo "cp -prfv $UNIQUE.dir ${EDIR}"
+                echo "rm -rf $UNIQUE.dir"
+            done
+        done
+    done >> ${SCRIPT}
+done
+echo "# " >> ${SCRIPT}
+echo "# Final" >> ${SCRIPT}
+echo "# " >> ${SCRIPT}
+echo "cp $SCRIPT $EDIR" >> $SCRIPT
+chmod 755 ${SCRIPT}
+cat ${SCRIPT}
+echo "This script is at '${SCRIPT}', commit it."
+#+end_src
+
+#+RESULTS:
+#+begin_example
+if [ -e /home/lschnorr//exp_14-v1_grisou_20 ]; then
+ echo "Directory/File '/home/lschnorr//exp_14-v1_grisou_20' already exists. Remove it or rename it."
+ exit +fi +mkdir -p /home/lschnorr//exp_14-v1_grisou_20 +# +# Experiment 14-v1_grisou_20_320_metis_false_Alya.x.orig +# +./scripts/run_alya_experiment.sh machine-file 320 /home/lschnorr/Trunk.6515.Alya/Executables/unix/Alya.x.orig /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 14-v1_grisou_20_320_metis_false_Alya.x.orig false +cp -prfv 14-v1_grisou_20_320_metis_false_Alya.x.orig.dir /home/lschnorr//exp_14-v1_grisou_20 +rm -rf 14-v1_grisou_20_320_metis_false_Alya.x.orig.dir +# +# Experiment 14-v1_grisou_20_320_metis_false_Alya.x.modif +# +./scripts/run_alya_experiment.sh machine-file 320 /home/lschnorr/Trunk.6515.Alya/Executables/unix/Alya.x.modif /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 14-v1_grisou_20_320_metis_false_Alya.x.modif false +cp -prfv 14-v1_grisou_20_320_metis_false_Alya.x.modif.dir /home/lschnorr//exp_14-v1_grisou_20 +rm -rf 14-v1_grisou_20_320_metis_false_Alya.x.modif.dir +# +# Experiment 14-v1_grisou_20_320_sfc_false_Alya.x.orig +# +./scripts/run_alya_experiment.sh machine-file 320 /home/lschnorr/Trunk.6515.Alya/Executables/unix/Alya.x.orig /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 14-v1_grisou_20_320_sfc_false_Alya.x.orig false +cp -prfv 14-v1_grisou_20_320_sfc_false_Alya.x.orig.dir /home/lschnorr//exp_14-v1_grisou_20 +rm -rf 14-v1_grisou_20_320_sfc_false_Alya.x.orig.dir +# +# Experiment 14-v1_grisou_20_320_sfc_false_Alya.x.modif +# +./scripts/run_alya_experiment.sh machine-file 320 /home/lschnorr/Trunk.6515.Alya/Executables/unix/Alya.x.modif /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 14-v1_grisou_20_320_sfc_false_Alya.x.modif false +cp -prfv 14-v1_grisou_20_320_sfc_false_Alya.x.modif.dir /home/lschnorr//exp_14-v1_grisou_20 +rm -rf 14-v1_grisou_20_320_sfc_false_Alya.x.modif.dir +# +# Final +# +cp script_14-v1_grisou_20_nodes.sh /home/lschnorr//exp_14-v1_grisou_20 +This script is at 'script_14-v1_grisou_20_nodes.sh', commit it. 
+#+end_example + +** 2017-05-11 44-node grisou (same as previous entry) :EXP15: +*** Design + +#+begin_src shell :results output +EID=15 +VERSION=1 +CLUSTER=grisou +NODES=44 +MACHINEFILE=machine-file +G5KHOME=/home/lschnorr/ +EID=${EID}-v${VERSION} +SCRIPT="script_${EID}_${CLUSTER}_${NODES}_nodes.sh" + +rm -f $SCRIPT + +EDIR="$G5KHOME/exp_${EID}_${CLUSTER}_${NODES}" +echo "if [ -e "$EDIR" ]; then" >> $SCRIPT +echo " echo \"Directory/File '${EDIR}' already exists. Remove it or rename it.\"" >> $SCRIPT +echo " exit" >> $SCRIPT +echo "fi" >> $SCRIPT + +# Create experimental directory in the home +echo "mkdir -p $EDIR" >> $SCRIPT + +# The design +ALYAO=Alya.x.orig +ALYAM=Alya.x.modif +ALYAPATH=/home/lschnorr/Trunk.6515.Alya + +for NP in 704; do + for INFINIBAND in false; do + for CASE in metis sfc; do + for ALYA in $ALYAO $ALYAM; do + NPN=$((NP)) + UNIQUE=${EID}_${CLUSTER}_${NODES}_${NPN}_${CASE}_${INFINIBAND}_${ALYA} + echo "# " + echo "# Experiment $UNIQUE" + echo "# " + echo "./scripts/run_alya_experiment.sh ${MACHINEFILE} ${NPN} ${ALYAPATH}/Executables/unix/$ALYA /home/lschnorr/WORK-RICARD/resp_${CASE}/fensap.dat $UNIQUE ${INFINIBAND}" + echo "cp -prfv $UNIQUE.dir ${EDIR}" + echo "rm -rf $UNIQUE.dir" + done + done + done >> ${SCRIPT} +done +echo "# " >> ${SCRIPT} +echo "# Final" >> ${SCRIPT} +echo "# " >> ${SCRIPT} +echo "cp $SCRIPT $EDIR" >> $SCRIPT +chmod 755 ${SCRIPT} +cat ${SCRIPT} +echo "This script is at '${SCRIPT}', commit it." +#+end_src + +#+RESULTS: +#+begin_example +if [ -e /home/lschnorr//exp_15-v1_grisou_44 ]; then + echo "Directory/File '/home/lschnorr//exp_15-v1_grisou_44' already exists. Remove it or rename it." 
+ exit +fi +mkdir -p /home/lschnorr//exp_15-v1_grisou_44 +# +# Experiment 15-v1_grisou_44_704_metis_false_Alya.x.orig +# +./scripts/run_alya_experiment.sh machine-file 704 /home/lschnorr/Trunk.6515.Alya/Executables/unix/Alya.x.orig /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 15-v1_grisou_44_704_metis_false_Alya.x.orig false +cp -prfv 15-v1_grisou_44_704_metis_false_Alya.x.orig.dir /home/lschnorr//exp_15-v1_grisou_44 +rm -rf 15-v1_grisou_44_704_metis_false_Alya.x.orig.dir +# +# Experiment 15-v1_grisou_44_704_metis_false_Alya.x.modif +# +./scripts/run_alya_experiment.sh machine-file 704 /home/lschnorr/Trunk.6515.Alya/Executables/unix/Alya.x.modif /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 15-v1_grisou_44_704_metis_false_Alya.x.modif false +cp -prfv 15-v1_grisou_44_704_metis_false_Alya.x.modif.dir /home/lschnorr//exp_15-v1_grisou_44 +rm -rf 15-v1_grisou_44_704_metis_false_Alya.x.modif.dir +# +# Experiment 15-v1_grisou_44_704_sfc_false_Alya.x.orig +# +./scripts/run_alya_experiment.sh machine-file 704 /home/lschnorr/Trunk.6515.Alya/Executables/unix/Alya.x.orig /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 15-v1_grisou_44_704_sfc_false_Alya.x.orig false +cp -prfv 15-v1_grisou_44_704_sfc_false_Alya.x.orig.dir /home/lschnorr//exp_15-v1_grisou_44 +rm -rf 15-v1_grisou_44_704_sfc_false_Alya.x.orig.dir +# +# Experiment 15-v1_grisou_44_704_sfc_false_Alya.x.modif +# +./scripts/run_alya_experiment.sh machine-file 704 /home/lschnorr/Trunk.6515.Alya/Executables/unix/Alya.x.modif /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 15-v1_grisou_44_704_sfc_false_Alya.x.modif false +cp -prfv 15-v1_grisou_44_704_sfc_false_Alya.x.modif.dir /home/lschnorr//exp_15-v1_grisou_44 +rm -rf 15-v1_grisou_44_704_sfc_false_Alya.x.modif.dir +# +# Final +# +cp script_15-v1_grisou_44_nodes.sh /home/lschnorr//exp_15-v1_grisou_44 +This script is at 'script_15-v1_grisou_44_nodes.sh', commit it. 
+#+end_example + +** 2017-05-11 8-node grimoire (as previous, Infiniband and Ethernet) :EXP16: +*** Design + +#+begin_src shell :results output +EID=16 +VERSION=1 +CLUSTER=grimoire +NODES=8 +MACHINEFILE=machine-file +G5KHOME=/home/lschnorr/ +EID=${EID}-v${VERSION} +SCRIPT="script_${EID}_${CLUSTER}_${NODES}_nodes.sh" + +rm -f $SCRIPT + +EDIR="$G5KHOME/exp_${EID}_${CLUSTER}_${NODES}" +echo "if [ -e "$EDIR" ]; then" >> $SCRIPT +echo " echo \"Directory/File '${EDIR}' already exists. Remove it or rename it.\"" >> $SCRIPT +echo " exit" >> $SCRIPT +echo "fi" >> $SCRIPT + +# Create experimental directory in the home +echo "mkdir -p $EDIR" >> $SCRIPT + +# The design +ALYAO=Alya.x.orig +ALYAM=Alya.x.modif +ALYAPATH=/home/lschnorr/Trunk.6515.Alya + +for NP in 128; do + for INFINIBAND in true false; do + for CASE in metis sfc; do + for ALYA in $ALYAO $ALYAM; do + NPN=$((NP)) + UNIQUE=${EID}_${CLUSTER}_${NODES}_${NPN}_${CASE}_${INFINIBAND}_${ALYA} + echo "# " + echo "# Experiment $UNIQUE" + echo "# " + echo "./scripts/run_alya_experiment.sh ${MACHINEFILE} ${NPN} ${ALYAPATH}/Executables/unix/$ALYA /home/lschnorr/WORK-RICARD/resp_${CASE}/fensap.dat $UNIQUE ${INFINIBAND}" + echo "cp -prfv $UNIQUE.dir ${EDIR}" + echo "rm -rf $UNIQUE.dir" + done + done + done >> ${SCRIPT} +done +echo "# " >> ${SCRIPT} +echo "# Final" >> ${SCRIPT} +echo "# " >> ${SCRIPT} +echo "cp $SCRIPT $EDIR" >> $SCRIPT +chmod 755 ${SCRIPT} +cat ${SCRIPT} +echo "This script is at '${SCRIPT}', commit it." +#+end_src + +#+RESULTS: +#+begin_example +if [ -e /home/lschnorr//exp_16-v1_grimoire_8 ]; then + echo "Directory/File '/home/lschnorr//exp_16-v1_grimoire_8' already exists. Remove it or rename it." 
+ exit +fi +mkdir -p /home/lschnorr//exp_16-v1_grimoire_8 +# +# Experiment 16-v1_grimoire_8_128_metis_true_Alya.x.orig +# +./scripts/run_alya_experiment.sh machine-file 128 /home/lschnorr/Trunk.6515.Alya/Executables/unix/Alya.x.orig /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 16-v1_grimoire_8_128_metis_true_Alya.x.orig true +cp -prfv 16-v1_grimoire_8_128_metis_true_Alya.x.orig.dir /home/lschnorr//exp_16-v1_grimoire_8 +rm -rf 16-v1_grimoire_8_128_metis_true_Alya.x.orig.dir +# +# Experiment 16-v1_grimoire_8_128_metis_true_Alya.x.modif +# +./scripts/run_alya_experiment.sh machine-file 128 /home/lschnorr/Trunk.6515.Alya/Executables/unix/Alya.x.modif /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 16-v1_grimoire_8_128_metis_true_Alya.x.modif true +cp -prfv 16-v1_grimoire_8_128_metis_true_Alya.x.modif.dir /home/lschnorr//exp_16-v1_grimoire_8 +rm -rf 16-v1_grimoire_8_128_metis_true_Alya.x.modif.dir +# +# Experiment 16-v1_grimoire_8_128_sfc_true_Alya.x.orig +# +./scripts/run_alya_experiment.sh machine-file 128 /home/lschnorr/Trunk.6515.Alya/Executables/unix/Alya.x.orig /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 16-v1_grimoire_8_128_sfc_true_Alya.x.orig true +cp -prfv 16-v1_grimoire_8_128_sfc_true_Alya.x.orig.dir /home/lschnorr//exp_16-v1_grimoire_8 +rm -rf 16-v1_grimoire_8_128_sfc_true_Alya.x.orig.dir +# +# Experiment 16-v1_grimoire_8_128_sfc_true_Alya.x.modif +# +./scripts/run_alya_experiment.sh machine-file 128 /home/lschnorr/Trunk.6515.Alya/Executables/unix/Alya.x.modif /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 16-v1_grimoire_8_128_sfc_true_Alya.x.modif true +cp -prfv 16-v1_grimoire_8_128_sfc_true_Alya.x.modif.dir /home/lschnorr//exp_16-v1_grimoire_8 +rm -rf 16-v1_grimoire_8_128_sfc_true_Alya.x.modif.dir +# +# Experiment 16-v1_grimoire_8_128_metis_false_Alya.x.orig +# +./scripts/run_alya_experiment.sh machine-file 128 /home/lschnorr/Trunk.6515.Alya/Executables/unix/Alya.x.orig /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 
16-v1_grimoire_8_128_metis_false_Alya.x.orig false +cp -prfv 16-v1_grimoire_8_128_metis_false_Alya.x.orig.dir /home/lschnorr//exp_16-v1_grimoire_8 +rm -rf 16-v1_grimoire_8_128_metis_false_Alya.x.orig.dir +# +# Experiment 16-v1_grimoire_8_128_metis_false_Alya.x.modif +# +./scripts/run_alya_experiment.sh machine-file 128 /home/lschnorr/Trunk.6515.Alya/Executables/unix/Alya.x.modif /home/lschnorr/WORK-RICARD/resp_metis/fensap.dat 16-v1_grimoire_8_128_metis_false_Alya.x.modif false +cp -prfv 16-v1_grimoire_8_128_metis_false_Alya.x.modif.dir /home/lschnorr//exp_16-v1_grimoire_8 +rm -rf 16-v1_grimoire_8_128_metis_false_Alya.x.modif.dir +# +# Experiment 16-v1_grimoire_8_128_sfc_false_Alya.x.orig +# +./scripts/run_alya_experiment.sh machine-file 128 /home/lschnorr/Trunk.6515.Alya/Executables/unix/Alya.x.orig /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 16-v1_grimoire_8_128_sfc_false_Alya.x.orig false +cp -prfv 16-v1_grimoire_8_128_sfc_false_Alya.x.orig.dir /home/lschnorr//exp_16-v1_grimoire_8 +rm -rf 16-v1_grimoire_8_128_sfc_false_Alya.x.orig.dir +# +# Experiment 16-v1_grimoire_8_128_sfc_false_Alya.x.modif +# +./scripts/run_alya_experiment.sh machine-file 128 /home/lschnorr/Trunk.6515.Alya/Executables/unix/Alya.x.modif /home/lschnorr/WORK-RICARD/resp_sfc/fensap.dat 16-v1_grimoire_8_128_sfc_false_Alya.x.modif false +cp -prfv 16-v1_grimoire_8_128_sfc_false_Alya.x.modif.dir /home/lschnorr//exp_16-v1_grimoire_8 +rm -rf 16-v1_grimoire_8_128_sfc_false_Alya.x.modif.dir +# +# Final +# +cp script_16-v1_grimoire_8_nodes.sh /home/lschnorr//exp_16-v1_grimoire_8 +This script is at 'script_16-v1_grimoire_8_nodes.sh', commit it. 
+#+end_example + +* Analysis +** 2-node grisou (17 and 32 ranks) :EXP1: +*** Rough epoch-based execution time +#+name: exp_1_grisou_2_epoch +#+begin_src shell :results output org table +EDIR=exp_1_grisou_2 +for case in $(find $EDIR | grep -e "sfc.org" -e "metis.org"); do + echo -n "$(basename $case) " + cat $case | grep epoch | sed "s/^.*: //" | tr '\n' '-' | sed -e "s/-$//" -e "s/-/ - /" -e "s/$/\n/" | bc -l; +done | sed -e "s/.org//" | sed -e "s/_/|/g" -e "s/-//g" +#+end_src + +#+RESULTS: exp_1_grisou_2_epoch +#+BEGIN_SRC org +| grisou | 2 | 32 | metis | 951 | +| grisou | 2 | 32 | sfc | 762 | +| grisou | 2 | 17 | metis | 1417 | +| grisou | 2 | 17 | sfc | 1225 | +#+END_SRC + +#+header: :var df=exp_1_grisou_2_epoch +#+begin_src R :results output graphics :file img/exp_1_grisou_2_epoch.png :exports both :width 600 :height 400 :session +library(dplyr); +library(ggplot2); +df %>% + rename(Cluster= V1, + Nodes = V2, + NP = V3, + Case = V4, + Time = V5) %>% + filter(NP != 121) %>% + ggplot(aes(x=NP, y=Time, color=Case)) + + theme_bw() + + ylim(0,NA) + + ylab("Time [s]") + + geom_point() + + geom_line() + + theme(legend.position = "top"); +#+end_src + +#+RESULTS: +[[file:img/exp_1_grisou_2_epoch.png]] +** 10-node grisou (97, 113, 129, 145, 160) :EXP2: +*** Rough epoch-based execution time +#+name: exp_2_grisou_10_epoch +#+begin_src shell :results output org table +EDIR=exp_2_grisou_10 +for case in $(find $EDIR | grep -e "sfc.org" -e "metis.org"); do + echo -n "$(basename $case) " + cat $case | grep epoch | sed "s/^.*: //" | tr '\n' '-' | sed -e "s/-$//" -e "s/-/ - /" -e "s/$/\n/" | bc -l; +done | sed -e "s/.org//" | sed -e "s/_/|/g" -e "s/-//g" +#+end_src + +#+RESULTS: exp_2_grisou_10_epoch +#+BEGIN_SRC org +| grisou | 10 | 160 | metis | 304 | +| grisou | 10 | 129 | sfc | 337 | +| grisou | 10 | 97 | metis | 401 | +| grisou | 10 | 145 | sfc | 315 | +| grisou | 10 | 129 | metis | 345 | +| grisou | 10 | 145 | metis | 321 | +| grisou | 10 | 97 | sfc | 380 | +| grisou | 10 | 
160 | sfc | 301 | +| grisou | 10 | 113 | metis | 363 | +| grisou | 10 | 113 | sfc | 359 | +#+END_SRC + +#+header: :var df=exp_2_grisou_10_epoch +#+begin_src R :results output graphics :file img/exp_2_grisou_10_epoch.png :exports both :width 200 :height 400 :session +library(dplyr); +library(ggplot2); +df %>% + rename(Cluster= V1, + Nodes = V2, + NP = V3, + Case = V4, + Time = V5) %>% + filter(NP != 121) %>% + ggplot(aes(x=NP, y=Time, color=Case)) + + theme_bw() + + ylim(0,NA) + + ylab("Time [s]") + + geom_point() + + geom_line() + + theme(legend.position = "top"); +#+end_src + +#+RESULTS: +[[file:img/exp_2_grisou_10_epoch.png]] +*** Convert to CSV +I need on the $PATH: +- akypuera's otf22paje (compiled using OTF2 libraries of ScoreP 3.0) +- pajeng's =pj_dump= + +#+begin_src shell :results output +export PATH=$PATH:/home/schnorr/install/stow/bin/ +export PATH=$PATH:/home/schnorr/dev/pajeng/b/ + +convert() { + pushd $(dirname $otf2) + otf22paje traces.otf2 | pj_dump | grep ^State | cut -d, -f2,4,5,8 | sed -e "s/ //g" -e "s/MPIRank//" > traces.csv + popd +} + +EDIR=exp_2_grisou_10 +N=4 # number of parallel batches +for otf2 in $(find $EDIR | grep otf2$); do + ((i=i%N)); ((i++==0)) && wait + convert $otf2 & +done +#+end_src + +#+RESULTS: +#+begin_example +/home/schnorr/install/stow/bin//otf22paje +Usage: otf22paje [OPTION...] 
ANCHORFILE +Converts an OTF2 archive to the Paje file format + + -b, --basic Avoid extended events (impoverished trace file) + -d, --dummy Read input traces but won't translate (no output) + -f, --flat Flat hierarchy (only MPI ranks) + -h, --hostfile=FILE MPI hostfile to create system hierarchy + -i, --ignore-errors Ignore errors + -l, --no-links Don't convert links + -m, --comment=COMMENT Comment is echoed to output + -n, --commentfile=FILE Comments (from file) echoed to output + -o, --only-mpi Only convert MPI states + -s, --no-states Don't convert states + -z, --normalize-mpi Try to normalize MPI state names + -?, --help Give this help list + --usage Give a short usage message + +Mandatory or optional arguments to long options are also mandatory or optional +for any corresponding short options. +~/dev/Alya-Perf/exp_2_grisou_10/grisou_10_160_metis.dir/scorep_grisou_10_160_metis ~/dev/Alya-Perf +~/dev/Alya-Perf +~/dev/Alya-Perf/exp_2_grisou_10/grisou_10_129_sfc.dir/scorep_grisou_10_129_sfc ~/dev/Alya-Perf +~/dev/Alya-Perf +~/dev/Alya-Perf/exp_2_grisou_10/grisou_10_97_metis.dir/scorep_grisou_10_97_metis ~/dev/Alya-Perf +~/dev/Alya-Perf +~/dev/Alya-Perf/exp_2_grisou_10/grisou_10_145_sfc.dir/scorep_grisou_10_145_sfc ~/dev/Alya-Perf +~/dev/Alya-Perf +~/dev/Alya-Perf/exp_2_grisou_10/grisou_10_129_metis.dir/scorep_grisou_10_129_metis ~/dev/Alya-Perf +~/dev/Alya-Perf +~/dev/Alya-Perf/exp_2_grisou_10/grisou_10_145_metis.dir/scorep_grisou_10_145_metis ~/dev/Alya-Perf +~/dev/Alya-Perf +~/dev/Alya-Perf/exp_2_grisou_10/grisou_10_97_sfc.dir/scorep_grisou_10_97_sfc ~/dev/Alya-Perf +~/dev/Alya-Perf +~/dev/Alya-Perf/exp_2_grisou_10/grisou_10_160_sfc.dir/scorep_grisou_10_160_sfc ~/dev/Alya-Perf +~/dev/Alya-Perf +~/dev/Alya-Perf/exp_2_grisou_10/grisou_10_113_metis.dir/scorep_grisou_10_113_metis ~/dev/Alya-Perf +~/dev/Alya-Perf +~/dev/Alya-Perf/exp_2_grisou_10/grisou_10_113_sfc.dir/scorep_grisou_10_113_sfc ~/dev/Alya-Perf +~/dev/Alya-Perf +#+end_example + +#+begin_src shell :results 
output +export PATH=$PATH:/home/schnorr/install/stow/bin/ +export PATH=$PATH:/home/schnorr/dev/pajeng/b/ + +convert() { + pushd $(dirname $otf2) + otf22paje traces.otf2 | pj_dump | grep ^State | cut -d, -f2,4,5,8 | sed -e "s/ //g" -e "s/MPIRank//" > traces.csv + popd +} + +EDIR=exp_2_grisou_10 +# Files already converted (whose CSV size is not zero) +EXISTINGFILE=$(tempfile) +OTF2FILE=$(tempfile) +find $EDIR -not -empty | grep csv$ | sed -e "s/.csv$//" | sort > $EXISTINGFILE +find $EDIR | grep otf2 | sed -e "s/.otf2$//" | sort > $OTF2FILE + +for otf2 in $(comm -3 $OTF2FILE $EXISTINGFILE | sed "s/$/.otf2/"); do + convert $otf2 +done +#+end_src + +#+RESULTS: +*** Read and analysis +Check read and analysis of next section. +** 10-node grisou (97, 113, 129, 145, 160) :EXP3: +*** Rough epoch-based execution time +#+name: exp_3_grisou_10_epoch +#+begin_src shell :results output org table +EDIR=exp_3_grisou_10 +for case in $(find $EDIR | grep -e "sfc.org" -e "metis.org"); do + echo -n "$(basename $case) " + cat $case | grep epoch | sed "s/^.*: //" | tr '\n' '-' | sed -e "s/-$//" -e "s/-/ - /" -e "s/$/\n/" | bc -l; +done | sed -e "s/.org//" | sed -e "s/_/|/g" -e "s/-//g" +#+end_src + +#+RESULTS: exp_3_grisou_10_epoch +#+BEGIN_SRC org +| 3 | grisou | 10 | 145 | sfc | 320 | +| 3 | grisou | 10 | 113 | sfc | 356 | +| 3 | grisou | 10 | 97 | metis | 394 | +| 3 | grisou | 10 | 129 | sfc | 336 | +| 3 | grisou | 10 | 129 | metis | 344 | +| 3 | grisou | 10 | 160 | sfc | 308 | +| 3 | grisou | 10 | 145 | metis | 319 | +| 3 | grisou | 10 | 113 | metis | 362 | +| 3 | grisou | 10 | 160 | metis | 306 | +| 3 | grisou | 10 | 97 | sfc | 380 | +#+END_SRC + +#+header: :var df=exp_3_grisou_10_epoch +#+begin_src R :results output graphics :file img/exp_3_grisou_10_epoch.png :exports both :width 200 :height 400 :session +library(dplyr); +library(ggplot2); +df %>% + rename(EID = V1, + Cluster= V2, + Nodes = V3, + NP = V4, + Case = V5, + Time = V6) %>% + filter(NP != 121) %>% + ggplot(aes(x=NP, 
y=Time, color=Case)) + + theme_bw() + + ylim(0,NA) + + ylab("Time [s]") + + geom_point() + + geom_line() + + theme(legend.position = "top"); +#+end_src + +#+RESULTS: +[[file:img/exp_3_grisou_10_epoch.png]] +*** Convert to CSV +I need on the $PATH: +- akypuera's otf22paje (compiled using OTF2 libraries of ScoreP 3.0) +- pajeng's =pj_dump= + +#+begin_src shell :results output +export PATH=$PATH:/home/schnorr/install/stow/bin/ +export PATH=$PATH:/home/schnorr/dev/pajeng/b/ + +convert() { + pushd $(dirname $otf2) + otf22paje traces.otf2 | pj_dump | grep ^State | cut -d, -f2,4,5,8 | sed -e "s/ //g" -e "s/MPIRank//" > traces.csv + popd +} + +EDIR=exp_3_grisou_10 +# Files already converted (whose CSV size is not zero) +EXISTINGFILE=$(tempfile) +OTF2FILE=$(tempfile) +find $EDIR -not -empty | grep csv$ | sed -e "s/.csv$//" | sort > $EXISTINGFILE +find $EDIR | grep otf2 | sed -e "s/.otf2$//" | sort > $OTF2FILE + +for otf2 in $(comm -3 $OTF2FILE $EXISTINGFILE | sed "s/$/.otf2/"); do + echo $otf2 + convert $otf2 +done +#+end_src + +#+RESULTS: +#+begin_example +exp_3_grisou_10/3_grisou_10_113_metis.dir/scorep_3_grisou_10_113_metis/traces.otf2 +~/dev/Alya-Perf/exp_3_grisou_10/3_grisou_10_113_metis.dir/scorep_3_grisou_10_113_metis ~/dev/Alya-Perf +~/dev/Alya-Perf +exp_3_grisou_10/3_grisou_10_113_sfc.dir/scorep_3_grisou_10_113_sfc/traces.otf2 +~/dev/Alya-Perf/exp_3_grisou_10/3_grisou_10_113_sfc.dir/scorep_3_grisou_10_113_sfc ~/dev/Alya-Perf +~/dev/Alya-Perf +exp_3_grisou_10/3_grisou_10_129_metis.dir/scorep_3_grisou_10_129_metis/traces.otf2 +~/dev/Alya-Perf/exp_3_grisou_10/3_grisou_10_129_metis.dir/scorep_3_grisou_10_129_metis ~/dev/Alya-Perf +~/dev/Alya-Perf +exp_3_grisou_10/3_grisou_10_129_sfc.dir/scorep_3_grisou_10_129_sfc/traces.otf2 +~/dev/Alya-Perf/exp_3_grisou_10/3_grisou_10_129_sfc.dir/scorep_3_grisou_10_129_sfc ~/dev/Alya-Perf +~/dev/Alya-Perf +exp_3_grisou_10/3_grisou_10_145_metis.dir/scorep_3_grisou_10_145_metis/traces.otf2 
+~/dev/Alya-Perf/exp_3_grisou_10/3_grisou_10_145_metis.dir/scorep_3_grisou_10_145_metis ~/dev/Alya-Perf +~/dev/Alya-Perf +exp_3_grisou_10/3_grisou_10_145_sfc.dir/scorep_3_grisou_10_145_sfc/traces.otf2 +~/dev/Alya-Perf/exp_3_grisou_10/3_grisou_10_145_sfc.dir/scorep_3_grisou_10_145_sfc ~/dev/Alya-Perf +~/dev/Alya-Perf +exp_3_grisou_10/3_grisou_10_160_metis.dir/scorep_3_grisou_10_160_metis/traces.otf2 +~/dev/Alya-Perf/exp_3_grisou_10/3_grisou_10_160_metis.dir/scorep_3_grisou_10_160_metis ~/dev/Alya-Perf +~/dev/Alya-Perf +exp_3_grisou_10/3_grisou_10_160_sfc.dir/scorep_3_grisou_10_160_sfc/traces.otf2 +~/dev/Alya-Perf/exp_3_grisou_10/3_grisou_10_160_sfc.dir/scorep_3_grisou_10_160_sfc ~/dev/Alya-Perf +~/dev/Alya-Perf +exp_3_grisou_10/3_grisou_10_97_metis.dir/scorep_3_grisou_10_97_metis/traces.otf2 +~/dev/Alya-Perf/exp_3_grisou_10/3_grisou_10_97_metis.dir/scorep_3_grisou_10_97_metis ~/dev/Alya-Perf +~/dev/Alya-Perf +exp_3_grisou_10/3_grisou_10_97_sfc.dir/scorep_3_grisou_10_97_sfc/traces.otf2 +~/dev/Alya-Perf/exp_3_grisou_10/3_grisou_10_97_sfc.dir/scorep_3_grisou_10_97_sfc ~/dev/Alya-Perf +~/dev/Alya-Perf +#+end_example +*** Read in R +#+begin_src R :results output :session :exports both :tangle do.R :tangle-mode (identity #o755) +#!/usr/bin/Rscript +library(readr); +library(dplyr); +alya_scorep_trace_read <- function(filename) +{ + meta <- unlist(strsplit(unlist(strsplit(filename, "/"))[3], "_")); + read_csv(filename, + col_names=c("Rank", "Start", "End", "Value"), + progress=TRUE) %>% + # Transform Value to factor + mutate(Value = as.factor(Value)) %>% + # Detect begin and end of iterations + mutate(Iteration = case_when( + grepl("timste", .$Value) ~ 1, + grepl("endste", .$Value) ~ -1, + TRUE ~ 0)) %>% + group_by(Rank) %>% + # Create a logical to detect observations within iterations + mutate(Iteration = as.logical(cumsum(Iteration))) %>% + # Get only observations that belongs to some iteration + filter(Iteration == TRUE) %>% + ungroup() %>% + # Create the iteration by 
cumsum- + mutate(Iteration = case_when( + grepl("timste", .$Value) ~ 1, + TRUE ~ 0)) %>% + group_by(Rank) %>% + mutate(Iteration = cumsum(Iteration)) %>% + mutate(EID = meta[2], + Platform = meta[3], + Nodes = meta[4], + NP = meta[5], + Partitioning = meta[6]); +} + +alya_scorep_trace_iterations <- function(filename) +{ + alya_scorep_trace_read(filename) %>% + group_by(Rank, Iteration, Platform, Nodes, NP, Partitioning, EID) %>% + summarize(N=n(), S=min(Start), E=max(End)); +} + +args = commandArgs(trailingOnly=TRUE) +print(args); +df <- do.call("rbind", lapply(args, function(x) { alya_scorep_trace_iterations(x) })); +write.csv(df, "exp2_exp3_iterations.csv"); +#+end_src + +#+RESULTS: +*** Analysis :ATTACH: +:PROPERTIES: +:Attachments: exp2_exp3_iterations.csv +:ID: d2233039-ad84-42ac-ad70-545877ba7e8e +:END: +**** Iteration-only shows SFC is worse than Metis (average per-iteration span) +#+begin_src R :results output :session :exports both +library(readr); +library(dplyr); +df <- read_csv("data/d2/233039-ad84-42ac-ad70-545877ba7e8e/exp2_exp3_iterations.csv") %>% + select(-X1) %>% + rename(Start = S, End = E) %>% + mutate(Duration = End - Start) %>% + mutate(Platform = as.factor(Platform), + Partitioning = as.factor(Partitioning), + Nodes = as.factor(Nodes)); +df %>% summary; +#+end_src + +#+RESULTS: +#+begin_example + +Attaching package: ‘dplyr’ + +The following objects are masked from ‘package:stats’: + + filter, lag + +The following objects are masked from ‘package:base’: + + intersect, setdiff, setequal, union +Parsed with column specification: +cols( + X1 = col_integer(), + Rank = col_integer(), + Iteration = col_integer(), + Platform = col_character(), + Nodes = col_integer(), + NP = col_integer(), + Partitioning = col_character(), + EID = col_integer(), + N = col_integer(), + S = col_double(), + E = col_double() +) +Warning message: +Missing column names filled in: 'X1' [1] + Rank Iteration Platform Nodes NP + Min. : 0.00 Min. 
: 1.0 grisou:25760 10:25760 Min. : 97.0 + 1st Qu.: 32.00 1st Qu.: 3.0 1st Qu.:113.0 + Median : 64.00 Median : 5.5 Median :129.0 + Mean : 65.84 Mean : 5.5 Mean :132.7 + 3rd Qu.: 96.00 3rd Qu.: 8.0 3rd Qu.:145.0 + Max. :159.00 Max. :10.0 Max. :160.0 + Partitioning EID N Start End + metis:12880 Min. :2.0 Min. : 3580 Min. :120.2 Min. :139.9 + sfc :12880 1st Qu.:2.0 1st Qu.:17618 1st Qu.:168.8 1st Qu.:188.9 + Median :2.5 Median :24816 Median :220.5 Median :241.4 + Mean :2.5 Mean :28263 Mean :222.6 Mean :243.2 + 3rd Qu.:3.0 3rd Qu.:37184 3rd Qu.:271.5 3rd Qu.:293.7 + Max. :3.0 Max. :85082 Max. :375.5 Max. :397.2 + Duration + Min. :13.57 + 1st Qu.:18.05 + Median :19.93 + Mean :20.63 + 3rd Qu.:23.17 + Max. :29.10 +#+end_example + +Execution time + +#+begin_src R :results output graphics :file img/exp2_exp3_iterations_only.png :exports both :width 600 :height 400 :session +df %>% + group_by(Platform, Nodes, NP, Partitioning, EID) %>% + summarize(Time = max(End) - min(Start)) %>% + group_by(Platform, Nodes, NP, Partitioning) %>% + summarize(Time = mean(Time), N=n()) %>% + print %>% + ggplot(aes(x=NP, y=Time, color=Partitioning)) + + theme_bw(base_size=12) + + geom_point() + + geom_line() + + ylim(0, NA) +#+end_src + +#+RESULTS: +[[file:img/exp2_exp3_iterations_only.png]] + +#+begin_src R :results output :session :exports both +df %>% + group_by(NP, Iteration, Partitioning) %>% + # calculate the mean of number of operations per iteration + summarize(N=mean(N), Z=n()); +#+end_src + +#+RESULTS: +#+begin_example +Source: local data frame [100 x 5] +Groups: NP, Iteration [?] + + NP Iteration Partitioning N Z + +1 97 1 metis 19204.61 194 +2 97 1 sfc 40975.99 194 +3 97 2 metis 18361.41 194 +4 97 2 sfc 39909.78 194 +5 97 3 metis 18358.87 194 +6 97 3 sfc 39712.10 194 +7 97 4 metis 17679.86 194 +8 97 4 sfc 38238.62 194 +9 97 5 metis 17201.62 194 +10 97 5 sfc 37063.47 194 +# ... 
with 90 more rows +#+end_example + +#+begin_src R :results output graphics :file img/exp2_exp3_number_of_operations.png :exports both :width 600 :height 400 :session +library(ggplot2); +df %>% + group_by(NP, Iteration, Partitioning) %>% + # calculate the mean of number of operations per iteration + summarize(N=mean(N), Z=n()) %>% + ungroup() %>% + ggplot(aes(x=Iteration, y=N, color=Partitioning)) + + theme_bw(base_size=12) + + ylim(0,NA) + + geom_point() + + theme(legend.position="top") + + facet_wrap(~NP, nrow=1) +#+end_src + +#+RESULTS: +[[file:img/exp2_exp3_number_of_operations.png]] +**** Addressing Rick's questions (on max per-iteration) +:RICK: +Could you give us the results shown in =exp2_exp3_iterations_only.png=, but +considering the maximum time for each CPU-core instead of the average? +:END: + +:LUCAS: +If I understood correctly your question, you want to see the time each +CPU-core took to compute the 10 iterations. So, I put Rank in the X +axis, and on Y, I put the difference of time between the moment where +a given rank enters the first iteration until the moment it leaves the +10th iteration. I select only one experiment (exp2) because they are +very similar (on the same set of machines). The exp2 experiment +contained 5 executions with different NP (97, 113, 129, 145, and +160). 
The five plots are in the attached file +=exp2_exp3_10iter_time_of_each_rank.png= +:END: + +#+begin_src R :results output graphics :file img/exp2_10iter_time_of_each_rank.png :exports both :width 600 :height 600 :session +library(ggplot2); +df %>% + filter(EID==2) %>% + group_by(Rank, NP, Partitioning, EID) %>% + summarize(Time = max(End) - min(Start)) %>% + ggplot(aes(x=Rank, y=Time, color=Partitioning)) + + theme_bw(base_size=12) + + geom_point() + + ylim(0, NA) + + facet_grid(EID~NP) + + theme(legend.position="top") +#+end_src + +#+RESULTS: +[[file:img/exp2_10iter_time_of_each_rank.png]] + +:LUCAS: +Since I am not sure I understood correctly your request, the plot +=img/exp2_exp3_iterations_only_max.png= contains the max execution time +(on Y) of each of the 10 iterations (the horizontal facets) taking +into account all ranks belonging to each of the five NP experiments +(on X). +:END: + +#+begin_src R :results output graphics :file img/exp2_exp3_iterations_only_max.png :exports both :width 1200 :height 400 :session +library(ggplot2); +df %>% + select(-Start, -End) %>% + group_by(Iteration, NP, Partitioning) %>% + summarize(Time = max(Duration), N=n()) %>% + ungroup() %>% + mutate(NP=as.factor(NP)) %>% + ggplot(aes(x=NP, y=Time, color=Partitioning, group=Partitioning)) + + theme_bw(base_size=12) + + geom_point() + + geom_line() + + ylim(0, NA) + + facet_grid(~Iteration) + + theme(legend.position="top") +#+end_src + +#+RESULTS: +[[file:img/exp2_exp3_iterations_only_max.png]] +** 23-node nova (367 352 336 272 208 144 80 16) :EXP4: +*** Preparation (2-nodes only) +**** Convert to CSV +I need on the $PATH: +- akypuera's otf22paje (compiled using OTF2 libraries of ScoreP 3.0) +- pajeng's =pj_dump= + +#+begin_src shell :results output +export PATH=$PATH:/home/schnorr/install/stow/bin/ +export PATH=$PATH:/home/schnorr/dev/pajeng/b/ + +convert() { + pushd $(dirname $otf2) + otf22paje traces.otf2 | pj_dump | grep ^State | cut -d, -f2,4,5,8 | sed -e "s/ //g" -e 
"s/MPIRank//" | gzip > traces.csv.gz + popd +} + +EDIR=exp_prep-4-v2_grisou_4 +# Files already converted (whose CSV size is not zero) +EXISTINGFILE=$(tempfile) +OTF2FILE=$(tempfile) +find $EDIR -not -empty | grep csv.gz$ | sed -e "s/.csv.gz$//" | sort > $EXISTINGFILE +find $EDIR | grep \.otf2$ | sed -e "s/.otf2$//" | sort > $OTF2FILE + +for otf2 in $(comm -3 $OTF2FILE $EXISTINGFILE | sed "s/$/.otf2/"); do + echo $otf2 + convert $otf2 +done +#+end_src + +#+RESULTS: +: exp_prep-4-v2_grisou_4/prep-4_grisou_4_63_metis.dir/scorep_prep-4_grisou_4_63_metis/traces.otf2 +: ~/dev/Alya-Perf/exp_prep-4-v2_grisou_4/prep-4_grisou_4_63_metis.dir/scorep_prep-4_grisou_4_63_metis ~/dev/Alya-Perf +: ~/dev/Alya-Perf +: exp_prep-4-v2_grisou_4/prep-4_grisou_4_63_sfc.dir/scorep_prep-4_grisou_4_63_sfc/traces.otf2 +: ~/dev/Alya-Perf/exp_prep-4-v2_grisou_4/prep-4_grisou_4_63_sfc.dir/scorep_prep-4_grisou_4_63_sfc ~/dev/Alya-Perf +: ~/dev/Alya-Perf +**** Extract iteration information :ATTACH: +:PROPERTIES: +:Attachments: exp_prep-4_iterations.csv.gz +:ID: b0e3978c-4131-4005-b979-a86aad031816 +:END: +#+begin_src shell :results output +./do.R \ +exp_prep-4-v1_grisou_4/prep-4_grisou_4_18_sfc.dir/scorep_prep-4_grisou_4_18_sfc/traces.csv.gz \ +exp_prep-4-v1_grisou_4/prep-4_grisou_4_33_sfc.dir/scorep_prep-4_grisou_4_33_sfc/traces.csv.gz \ +exp_prep-4-v1_grisou_4/prep-4_grisou_4_49_metis.dir/scorep_prep-4_grisou_4_49_metis/traces.csv.gz \ +exp_prep-4-v1_grisou_4/prep-4_grisou_4_63_sfc.dir/scorep_prep-4_grisou_4_63_sfc/traces.csv.gz \ +exp_prep-4-v1_grisou_4/prep-4_grisou_4_33_metis.dir/scorep_prep-4_grisou_4_33_metis/traces.csv.gz \ +exp_prep-4-v1_grisou_4/prep-4_grisou_4_18_metis.dir/scorep_prep-4_grisou_4_18_metis/traces.csv.gz \ +exp_prep-4-v1_grisou_4/prep-4_grisou_4_49_sfc.dir/scorep_prep-4_grisou_4_49_sfc/traces.csv.gz \ +exp_prep-4-v1_grisou_4/prep-4_grisou_4_63_metis.dir/scorep_prep-4_grisou_4_63_metis/traces.csv.gz \ 
+exp_prep-4-v2_grisou_4/prep-4_grisou_4_63_sfc.dir/scorep_prep-4_grisou_4_63_sfc/traces.csv.gz \ +exp_prep-4-v2_grisou_4/prep-4_grisou_4_63_metis.dir/scorep_prep-4_grisou_4_63_metis/traces.csv.gz \ +#+end_src + +Results are saved in a compressed CSV file called =exp_prep-4.csv.gz=. +**** Analysis +Forthcoming. +*** 23-Node data +**** Convert to CSV +I need on the $PATH: +- akypuera's otf22paje (compiled using OTF2 libraries of ScoreP 3.0) +- pajeng's =pj_dump= + +#+begin_src shell :results output +export PATH=$PATH:/home/lschnorr/akypuera/b/ +export PATH=$PATH:/home/lschnorr/pajeng/b/ + +convert() { + pushd $(dirname $otf2) + otf22paje traces.otf2 | pj_dump | grep ^State | cut -d, -f2,4,5,8 | sed -e "s/ //g" -e "s/MPIRank//" | gzip > traces.csv.gz + popd +} + +EDIR=exp_4_grisou_23_v1/ +# Files already converted (whose CSV size is not zero) +EXISTINGFILE=$(tempfile) +OTF2FILE=$(tempfile) +find $EDIR -not -empty | grep csv.gz$ | sed -e "s/.csv.gz$//" | sort > $EXISTINGFILE +find $EDIR | grep \.otf2$ | sed -e "s/.otf2$//" | sort > $OTF2FILE + +for otf2 in $(comm -3 $OTF2FILE $EXISTINGFILE | sed "s/$/.otf2/"); do + echo $otf2 + convert $otf2 +done +#+end_src + +#+RESULTS: +#+begin_example +exp_3_grisou_10/3_grisou_10_113_metis.dir/scorep_3_grisou_10_113_metis/traces.otf2 +~/dev/Alya-Perf/exp_3_grisou_10/3_grisou_10_113_metis.dir/scorep_3_grisou_10_113_metis ~/dev/Alya-Perf +~/dev/Alya-Perf +exp_3_grisou_10/3_grisou_10_113_sfc.dir/scorep_3_grisou_10_113_sfc/traces.otf2 +~/dev/Alya-Perf/exp_3_grisou_10/3_grisou_10_113_sfc.dir/scorep_3_grisou_10_113_sfc ~/dev/Alya-Perf +~/dev/Alya-Perf +exp_3_grisou_10/3_grisou_10_129_metis.dir/scorep_3_grisou_10_129_metis/traces.otf2 +~/dev/Alya-Perf/exp_3_grisou_10/3_grisou_10_129_metis.dir/scorep_3_grisou_10_129_metis ~/dev/Alya-Perf +~/dev/Alya-Perf +exp_3_grisou_10/3_grisou_10_129_sfc.dir/scorep_3_grisou_10_129_sfc/traces.otf2 +~/dev/Alya-Perf/exp_3_grisou_10/3_grisou_10_129_sfc.dir/scorep_3_grisou_10_129_sfc ~/dev/Alya-Perf 
+~/dev/Alya-Perf +exp_3_grisou_10/3_grisou_10_145_metis.dir/scorep_3_grisou_10_145_metis/traces.otf2 +~/dev/Alya-Perf/exp_3_grisou_10/3_grisou_10_145_metis.dir/scorep_3_grisou_10_145_metis ~/dev/Alya-Perf +~/dev/Alya-Perf +exp_3_grisou_10/3_grisou_10_145_sfc.dir/scorep_3_grisou_10_145_sfc/traces.otf2 +~/dev/Alya-Perf/exp_3_grisou_10/3_grisou_10_145_sfc.dir/scorep_3_grisou_10_145_sfc ~/dev/Alya-Perf +~/dev/Alya-Perf +exp_3_grisou_10/3_grisou_10_160_metis.dir/scorep_3_grisou_10_160_metis/traces.otf2 +~/dev/Alya-Perf/exp_3_grisou_10/3_grisou_10_160_metis.dir/scorep_3_grisou_10_160_metis ~/dev/Alya-Perf +~/dev/Alya-Perf +exp_3_grisou_10/3_grisou_10_160_sfc.dir/scorep_3_grisou_10_160_sfc/traces.otf2 +~/dev/Alya-Perf/exp_3_grisou_10/3_grisou_10_160_sfc.dir/scorep_3_grisou_10_160_sfc ~/dev/Alya-Perf +~/dev/Alya-Perf +exp_3_grisou_10/3_grisou_10_97_metis.dir/scorep_3_grisou_10_97_metis/traces.otf2 +~/dev/Alya-Perf/exp_3_grisou_10/3_grisou_10_97_metis.dir/scorep_3_grisou_10_97_metis ~/dev/Alya-Perf +~/dev/Alya-Perf +exp_3_grisou_10/3_grisou_10_97_sfc.dir/scorep_3_grisou_10_97_sfc/traces.otf2 +~/dev/Alya-Perf/exp_3_grisou_10/3_grisou_10_97_sfc.dir/scorep_3_grisou_10_97_sfc ~/dev/Alya-Perf +~/dev/Alya-Perf +#+end_example +**** Post-processing in R +#+begin_src R :results output :session :exports both :tangle do-exp4.R :tangle-mode (identity #o755) +#!/usr/bin/Rscript +library(readr); +library(dplyr); +alya_scorep_trace_read <- function(filename) +{ + meta <- unlist(strsplit(unlist(strsplit(filename, "/"))[3], "_")); + read_csv(filename, + col_names=c("Rank", "Start", "End", "Value"), + progress=TRUE) %>% + # Transform Value to factor + mutate(Value = as.factor(Value)) %>% + # Detect begin and end of iterations + mutate(Iteration = case_when( + grepl("timste", .$Value) ~ 1, + grepl("endste", .$Value) ~ -1, + TRUE ~ 0)) %>% + group_by(Rank) %>% + # Create a logical to detect observations within iterations + mutate(Iteration = as.logical(cumsum(Iteration))) %>% + # Get only 
observations that belong to some iteration + filter(Iteration == TRUE) %>% + ungroup() %>% + # Create the iteration by cumsum + mutate(Iteration = case_when( + grepl("timste", .$Value) ~ 1, + TRUE ~ 0)) %>% + group_by(Rank) %>% + mutate(Iteration = as.integer(cumsum(Iteration))) %>% + ungroup() %>% + # Define metadata + mutate(EID = meta[2], + Platform = meta[3], + Nodes = meta[4], + NP = meta[5], + Partitioning = meta[6]); +} + +alya_scorep_trace_iterations <- function(filename) +{ + alya_scorep_trace_read(filename) %>% + group_by(Rank, Iteration, Platform, Nodes, NP, Partitioning, EID) %>% + filter(grepl("MPI_", Value)) %>% + summarize(N=n(), S=min(Start), E=max(End), Comm=sum(End-Start), Comp=(E-S)-Comm); +} + +args = commandArgs(trailingOnly=TRUE) +print(args); +df <- do.call("rbind", lapply(args, function(x) { alya_scorep_trace_iterations(x) })); +write.csv(df, "exp4_iterations.csv"); +#+end_src + +**** Read in R :ATTACH: +:PROPERTIES: +:Attachments: exp4_iterations.csv.gz +:ID: 3b264f0c-58ba-47c0-b291-b5729da5e669 +:END: + +#+begin_src R :results output :session :exports both +library(readr); +library(dplyr); +df <- read_csv("data/3b/264f0c-58ba-47c0-b291-b5729da5e669/exp4_iterations.csv.gz") %>% + select(-X1) %>% + rename(Start = S, End = E) %>% + mutate(Duration = End - Start) %>% + mutate(Platform = as.factor(Platform), + Partitioning = as.factor(Partitioning), + Nodes = as.factor(Nodes)); +df %>% summary; +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + X1 = col_integer(), + Rank = col_integer(), + Iteration = col_integer(), + Platform = col_character(), + Nodes = col_integer(), + NP = col_integer(), + Partitioning = col_character(), + EID = col_integer(), + N = col_integer(), + S = col_double(), + E = col_double(), + Comm = col_double(), + Comp = col_double() +) +Warning message: +Missing column names filled in: 'X1' [1] + Rank Iteration Platform Nodes NP + Min. : 0.0 Min. : 1.000 grisou:19613 23:19613 Min. 
: 17.0 + 1st Qu.: 61.0 1st Qu.: 2.000 1st Qu.:209.0 + Median :132.0 Median : 5.000 Median :337.0 + Mean :145.2 Mean : 5.091 Mean :291.5 + 3rd Qu.:221.0 3rd Qu.: 8.000 3rd Qu.:353.0 + Max. :367.0 Max. :10.000 Max. :368.0 + Partitioning EID N Start End + metis: 1783 Min. :4 Min. : 3579 Min. : 131.6 Min. : 145.8 + sfc :17830 1st Qu.:4 1st Qu.: 29737 1st Qu.: 151.6 1st Qu.: 166.4 + Median :4 Median : 38587 Median : 189.7 Median : 201.3 + Mean :4 Mean : 39148 Mean : 199.2 Mean : 214.1 + 3rd Qu.:4 3rd Qu.: 47123 3rd Qu.: 223.9 3rd Qu.: 234.6 + Max. :4 Max. :103825 Max. :1417.6 Max. :1527.0 + Comm Comp Duration + Min. : 0.539 Min. : 0.01607 Min. : 8.054 + 1st Qu.: 5.304 1st Qu.: 4.92897 1st Qu.: 10.519 + Median : 5.926 Median : 5.75274 Median : 11.189 + Mean : 6.596 Mean : 8.34015 Mean : 14.936 + 3rd Qu.: 7.522 3rd Qu.: 8.05452 3rd Qu.: 15.129 + Max. :169.634 Max. :167.55322 Max. :169.672 +#+end_example +**** Plot +#+begin_src R :results output :session :exports both +df %>% filter(Partitioning == "metis") %>% tail +#+end_src + +#+RESULTS: +#+begin_example +# A tibble: 6 x 13 + Rank Iteration Platform Nodes NP Partitioning EID N Start + +1 331 1 grisou 23 337 metis 4 19477 149.0451 +2 332 1 grisou 23 337 metis 4 31973 149.0405 +3 333 1 grisou 23 337 metis 4 35097 149.0563 +4 334 1 grisou 23 337 metis 4 35097 149.0551 +5 335 1 grisou 23 337 metis 4 41345 149.0725 +6 336 1 grisou 23 337 metis 4 28849 149.0405 +# ... 
with 4 more variables: End <dbl>, Comm <dbl>, Comp <dbl>, Duration <dbl> +#+end_example + +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 500 :height 400 :session +library(ggplot2); +library(tidyr); +df %>% + select(-Start, -End, -N, -Duration) %>% + mutate(Total = Comm+Comp) %>% + gather(Variable, Value, -Rank, -Iteration, -Platform, -Nodes, -NP, -Partitioning, -EID) %>% + group_by(Iteration, Platform, Nodes, NP, Partitioning, EID, Variable) %>% + summarize(Mean = mean(Value), SE = 3*sd(Value)/sqrt(n()), N=n()) %>% + filter(NP == 368) %>% + ggplot(aes(x=Iteration, y=Mean, ymin=Mean-SE, ymax=Mean+SE, color=Partitioning)) + + theme_bw(base_size=12) + + ylim(0,NA) + + geom_point() + + geom_errorbar(width=.3) + + geom_line() + + theme(legend.position="top") + + facet_grid(NP~Variable) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-24763lPR/figure24763z4R.png]] + +Unfortunately, Metis was misconfigured in Lyon, so it ran only 1 +iteration. +**** Plot with min/max + +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 500 :height 400 :session +library(ggplot2); +library(tidyr); +df %>% + filter(Rank != 0) %>% + filter(Partitioning == "sfc") %>% + select(-Start, -End, -N, -Duration) %>% + mutate(Total = Comm+Comp) %>% + gather(Variable, Value, -Rank, -Iteration, -Platform, -Nodes, -NP, -Partitioning, -EID) %>% + filter(NP == 368) %>% + ggplot(aes(x=Iteration, y=Value, group=Iteration)) + + theme_bw(base_size=12) + + ylim(0,NA) + + geom_boxplot() + + theme(legend.position="top") + + facet_grid(Partitioning~Variable) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-24763lPR/figure24763AKM.png]] +**** Gantt to check MPI calls +***** Read the trace + +#+begin_src R :results output :session :exports both +library(readr); +library(dplyr); +alya_scorep_trace_read <- function(filename) +{ + meta <- unlist(strsplit(unlist(strsplit(filename, "/"))[3], "_")); + read_csv(filename, + col_names=c("Rank", 
"Start", "End", "Value"), + progress=TRUE) %>% + # Transform Value to factor + mutate(Value = as.factor(Value)) %>% + # Detect begin and end of iterations + mutate(Iteration = case_when( + grepl("timste", .$Value) ~ 1, + grepl("endste", .$Value) ~ -1, + TRUE ~ 0)) %>% + group_by(Rank) %>% + # Create a logical to detect observations within iterations + mutate(Iteration = as.logical(cumsum(Iteration))) %>% + # Get only observations that belong to some iteration + filter(Iteration == TRUE) %>% + ungroup() %>% + # Create the iteration by cumsum + mutate(Iteration = case_when( + grepl("timste", .$Value) ~ 1, + TRUE ~ 0)) %>% + group_by(Rank) %>% + mutate(Iteration = as.integer(cumsum(Iteration))) %>% + ungroup() %>% + # Define metadata + mutate(EID = meta[2], + Platform = meta[3], + Nodes = meta[4], + NP = meta[5], + Partitioning = meta[6]); +} +TRACE <- "exp_prep-4-v1_grisou_4/prep-4_grisou_4_63_sfc.dir/scorep_prep-4_grisou_4_63_sfc/traces.csv.gz" +df4.mpi <- alya_scorep_trace_read(TRACE); +#+end_src + +#+RESULTS: +: Parsed with column specification: +: cols( +: Rank = col_integer(), +: Start = col_double(), +: End = col_double(), +: Value = col_character() +: ) +: (readr progress-bar output trimmed) 
|========== | 16% 129 MB |========== | 16% 129 MB |========== | 16% 129 MB |========== | 16% 129 MB |========== | 16% 129 MB |========== | 16% 129 MB |========== | 16% 129 MB |========== | 16% 130 MB |========== | 16% 130 MB |========== | 16% 130 MB |========== | 16% 130 MB |========== | 16% 130 MB |========== | 16% 130 MB |=========== | 16% 130 MB |=========== | 16% 130 MB |=========== | 16% 130 MB |=========== | 16% 130 MB |=========== | 16% 130 MB |=========== | 16% 131 MB |=========== | 16% 131 MB |=========== | 17% 131 MB |=========== | 17% 131 MB |=========== | 17% 131 MB |=========== | 17% 131 MB |=========== | 17% 131 MB |=========== | 17% 131 MB |=========== | 17% 131 MB |=========== | 17% 131 MB |=========== | 17% 131 MB |=========== | 17% 131 MB |=========== | 17% 132 MB |=========== | 17% 132 MB |=========== | 17% 132 MB |=========== | 17% 132 MB |=========== | 17% 132 MB |=========== | 17% 132 MB |=========== | 17% 132 MB |=========== | 17% 132 MB |=========== | 17% 132 MB |=========== | 17% 132 MB |=========== | 17% 132 MB |=========== | 17% 132 MB |=========== | 17% 133 MB |=========== | 17% 133 MB |=========== | 17% 133 MB |=========== | 17% 133 MB |=========== | 17% 133 MB |=========== | 17% 133 MB |=========== | 17% 133 MB |=========== | 17% 133 MB |=========== | 17% 133 MB |=========== | 17% 133 MB |=========== | 17% 133 MB |=========== | 17% 133 MB |=========== | 17% 134 MB |=========== | 17% 134 MB |=========== | 17% 134 MB |=========== | 17% 134 MB |=========== | 17% 134 MB |=========== | 17% 134 MB |=========== | 17% 134 MB |=========== | 17% 134 MB |=========== | 17% 134 MB |=========== | 17% 134 MB |=========== | 17% 134 MB |=========== | 17% 135 MB |=========== | 17% 135 MB |=========== | 17% 135 MB |=========== | 17% 135 MB |=========== | 17% 135 MB |=========== | 17% 135 MB |=========== | 17% 135 MB |=========== | 17% 135 MB |=========== | 17% 135 MB |=========== | 17% 135 MB |=========== | 17% 135 MB |=========== | 17% 135 MB 
|=========== | 17% 136 MB |=========== | 17% 136 MB |=========== | 17% 136 MB |=========== | 17% 136 MB |=========== | 17% 136 MB |=========== | 17% 136 MB |=========== | 17% 136 MB |=========== | 17% 136 MB |=========== | 17% 136 MB |=========== | 17% 136 MB |=========== | 17% 136 MB |=========== | 17% 136 MB |=========== | 17% 137 MB |=========== | 17% 137 MB |=========== | 17% 137 MB |=========== | 17% 137 MB |=========== | 17% 137 MB |=========== | 17% 137 MB |=========== | 17% 137 MB |=========== | 17% 137 MB |=========== | 17% 137 MB |=========== | 17% 137 MB |=========== | 17% 137 MB |=========== | 17% 137 MB |=========== | 17% 138 MB |=========== | 17% 138 MB |=========== | 17% 138 MB |=========== | 17% 138 MB |=========== | 17% 138 MB |=========== | 17% 138 MB |=========== | 17% 138 MB |=========== | 17% 138 MB |=========== | 17% 138 MB |=========== | 18% 138 MB |=========== | 18% 138 MB |=========== | 18% 138 MB |=========== | 18% 139 MB |=========== | 18% 139 MB |=========== | 18% 139 MB |=========== | 18% 139 MB |=========== | 18% 139 MB |=========== | 18% 139 MB |=========== | 18% 139 MB |=========== | 18% 139 MB |=========== | 18% 139 MB |=========== | 18% 139 MB |=========== | 18% 139 MB |=========== | 18% 140 MB |=========== | 18% 140 MB |=========== | 18% 140 MB |=========== | 18% 140 MB |=========== | 18% 140 MB |=========== | 18% 140 MB |=========== | 18% 140 MB |=========== | 18% 140 MB |=========== | 18% 140 MB |=========== | 18% 140 MB |=========== | 18% 140 MB |=========== | 18% 140 MB |=========== | 18% 141 MB |=========== | 18% 141 MB |=========== | 18% 141 MB |=========== | 18% 141 MB |=========== | 18% 141 MB |=========== | 18% 141 MB |=========== | 18% 141 MB |=========== | 18% 141 MB |=========== | 18% 141 MB |=========== | 18% 141 MB |=========== | 18% 141 MB |=========== | 18% 141 MB |=========== | 18% 142 MB |=========== | 18% 142 MB |=========== | 18% 142 MB |=========== | 18% 142 MB |============ | 18% 142 MB |============ | 18% 
142 MB |============ | 18% 142 MB |============ | 18% 142 MB |============ | 18% 142 MB |============ | 18% 142 MB |============ | 18% 142 MB |============ | 18% 142 MB |============ | 18% 143 MB |============ | 18% 143 MB |============ | 18% 143 MB |============ | 18% 143 MB |============ | 18% 143 MB |============ | 18% 143 MB |============ | 18% 143 MB |============ | 18% 143 MB |============ | 18% 143 MB |============ | 18% 143 MB |============ | 18% 143 MB |============ | 18% 143 MB |============ | 18% 144 MB |============ | 18% 144 MB |============ | 18% 144 MB |============ | 18% 144 MB |============ | 18% 144 MB |============ | 18% 144 MB |============ | 18% 144 MB |============ | 18% 144 MB |============ | 18% 144 MB |============ | 18% 144 MB |============ | 18% 144 MB |============ | 18% 145 MB |============ | 18% 145 MB |============ | 18% 145 MB |============ | 18% 145 MB |============ | 18% 145 MB |============ | 18% 145 MB |============ | 18% 145 MB |============ | 18% 145 MB |============ | 18% 145 MB |============ | 18% 145 MB |============ | 18% 145 MB |============ | 18% 145 MB |============ | 18% 146 MB |============ | 18% 146 MB |============ | 18% 146 MB |============ | 18% 146 MB |============ | 18% 146 MB |============ | 18% 146 MB |============ | 19% 146 MB |============ | 19% 146 MB |============ | 19% 146 MB |============ | 19% 146 MB |============ | 19% 146 MB |============ | 19% 146 MB |============ | 19% 147 MB |============ | 19% 147 MB |============ | 19% 147 MB |============ | 19% 147 MB |============ | 19% 147 MB |============ | 19% 147 MB |============ | 19% 147 MB |============ | 19% 147 MB |============ | 19% 147 MB |============ | 19% 147 MB |============ | 19% 147 MB |============ | 19% 147 MB |============ | 19% 148 MB |============ | 19% 148 MB |============ | 19% 148 MB |============ | 19% 148 MB |============ | 19% 148 MB |============ | 19% 148 MB |============ | 19% 148 MB |============ | 19% 148 MB |============ | 19% 
148 MB |============ | 19% 148 MB |============ | 19% 148 MB |============ | 19% 149 MB |============ | 19% 149 MB |============ | 19% 149 MB |============ | 19% 149 MB |============ | 19% 149 MB |============ | 19% 149 MB |============ | 19% 149 MB |============ | 19% 149 MB |============ | 19% 149 MB |============ | 19% 149 MB |============ | 19% 149 MB |============ | 19% 149 MB |============ | 19% 150 MB |============ | 19% 150 MB |============ | 19% 150 MB |============ | 19% 150 MB |============ | 19% 150 MB |============ | 19% 150 MB |============ | 19% 150 MB |============ | 19% 150 MB |============ | 19% 150 MB |============ | 19% 150 MB |============ | 19% 150 MB |============ | 19% 150 MB |============ | 19% 151 MB |============ | 19% 151 MB |============ | 19% 151 MB |============ | 19% 151 MB |============ | 19% 151 MB |============ | 19% 151 MB |============ | 19% 151 MB |============ | 19% 151 MB |============ | 19% 151 MB |============ | 19% 151 MB |============ | 19% 151 MB |============ | 19% 151 MB |============ | 19% 152 MB |============ | 19% 152 MB |============ | 19% 152 MB |============ | 19% 152 MB |============ | 19% 152 MB |============ | 19% 152 MB |============ | 19% 152 MB |============ | 19% 152 MB |============ | 19% 152 MB |============ | 19% 152 MB |============ | 19% 152 MB |============ | 19% 152 MB |============ | 19% 153 MB |============ | 19% 153 MB |============ | 19% 153 MB |============ | 19% 153 MB |============ | 19% 153 MB |============ | 19% 153 MB |============ | 19% 153 MB |============ | 19% 153 MB |============ | 19% 153 MB |============ | 19% 153 MB |============ | 19% 153 MB |============ | 19% 154 MB |============ | 19% 154 MB |============ | 19% 154 MB |============= | 20% 154 MB |============= | 20% 154 MB |============= | 20% 154 MB |============= | 20% 154 MB |============= | 20% 154 MB |============= | 20% 154 MB |============= | 20% 154 MB |============= | 20% 154 MB |============= | 20% 154 MB 
|============= | 20% 155 MB |============= | 20% 155 MB |============= | 20% 155 MB |============= | 20% 155 MB |============= | 20% 155 MB |============= | 20% 155 MB |============= | 20% 155 MB |============= | 20% 155 MB |============= | 20% 155 MB |============= | 20% 155 MB |============= | 20% 155 MB |============= | 20% 155 MB |============= | 20% 156 MB |============= | 20% 156 MB |============= | 20% 156 MB |============= | 20% 156 MB |============= | 20% 156 MB |============= | 20% 156 MB |============= | 20% 156 MB |============= | 20% 156 MB |============= | 20% 156 MB |============= | 20% 156 MB |============= | 20% 156 MB |============= | 20% 157 MB |============= | 20% 157 MB |============= | 20% 157 MB |============= | 20% 157 MB |============= | 20% 157 MB |============= | 20% 157 MB |============= | 20% 157 MB |============= | 20% 157 MB |============= | 20% 157 MB |============= | 20% 157 MB |============= | 20% 157 MB |============= | 20% 157 MB |============= | 20% 158 MB |============= | 20% 158 MB |============= | 20% 158 MB |============= | 20% 158 MB |============= | 20% 158 MB |============= | 20% 158 MB |============= | 20% 158 MB |============= | 20% 158 MB |============= | 20% 158 MB |============= | 20% 158 MB |============= | 20% 158 MB |============= | 20% 158 MB |============= | 20% 159 MB |============= | 20% 159 MB |============= | 20% 159 MB |============= | 20% 159 MB |============= | 20% 159 MB |============= | 20% 159 MB |============= | 20% 159 MB |============= | 20% 159 MB |============= | 20% 159 MB |============= | 20% 159 MB |============= | 20% 159 MB |============= | 20% 159 MB |============= | 20% 160 MB |============= | 20% 160 MB |============= | 20% 160 MB |============= | 20% 160 MB |============= | 20% 160 MB |============= | 20% 160 MB |============= | 20% 160 MB |============= | 20% 160 MB |============= | 20% 160 MB |============= | 20% 160 MB |============= | 20% 160 MB |============= | 20% 161 MB 
|============= | 20% 161 MB |============= | 20% 161 MB |============= | 20% 161 MB |============= | 20% 161 MB |============= | 20% 161 MB |============= | 20% 161 MB |============= | 20% 161 MB |============= | 20% 161 MB |============= | 20% 161 MB |============= | 20% 161 MB |============= | 21% 161 MB |============= | 21% 162 MB |============= | 21% 162 MB |============= | 21% 162 MB |============= | 21% 162 MB |============= | 21% 162 MB |============= | 21% 162 MB |============= | 21% 162 MB |============= | 21% 162 MB |============= | 21% 162 MB |============= | 21% 162 MB |============= | 21% 162 MB |============= | 21% 162 MB |============= | 21% 163 MB |============= | 21% 163 MB |============= | 21% 163 MB |============= | 21% 163 MB |============= | 21% 163 MB |============= | 21% 163 MB |============= | 21% 163 MB |============= | 21% 163 MB |============= | 21% 163 MB |============= | 21% 163 MB |============= | 21% 163 MB |============= | 21% 163 MB |============= | 21% 164 MB |============= | 21% 164 MB |============= | 21% 164 MB |============= | 21% 164 MB |============= | 21% 164 MB |============= | 21% 164 MB |============= | 21% 164 MB |============= | 21% 164 MB |============= | 21% 164 MB |============= | 21% 164 MB |============= | 21% 164 MB |============= | 21% 165 MB |============= | 21% 165 MB |============= | 21% 165 MB |============= | 21% 165 MB |============= | 21% 165 MB |============= | 21% 165 MB |============= | 21% 165 MB |============= | 21% 165 MB |============= | 21% 165 MB |============= | 21% 165 MB |============= | 21% 165 MB |============= | 21% 165 MB |============= | 21% 166 MB |============== | 21% 166 MB |============== | 21% 166 MB |============== | 21% 166 MB |============== | 21% 166 MB |============== | 21% 166 MB |============== | 21% 166 MB |============== | 21% 166 MB |============== | 21% 166 MB |============== | 21% 166 MB |============== | 21% 166 MB |============== | 21% 166 MB |============== | 21% 167 MB 
|============== | 21% 167 MB |============== | 21% 167 MB |============== | 21% 167 MB |============== | 21% 167 MB |============== | 21% 167 MB |============== | 21% 167 MB |============== | 21% 167 MB |============== | 21% 167 MB |============== | 21% 167 MB |============== | 21% 167 MB |============== | 21% 167 MB |============== | 21% 168 MB |============== | 21% 168 MB |============== | 21% 168 MB |============== | 21% 168 MB |============== | 21% 168 MB |============== | 21% 168 MB |============== | 21% 168 MB |============== | 21% 168 MB |============== | 21% 168 MB |============== | 21% 168 MB |============== | 21% 168 MB |============== | 21% 168 MB |============== | 21% 169 MB |============== | 21% 169 MB |============== | 21% 169 MB |============== | 21% 169 MB |============== | 21% 169 MB |============== | 21% 169 MB |============== | 21% 169 MB |============== | 22% 169 MB |============== | 22% 169 MB |============== | 22% 169 MB |============== | 22% 169 MB |============== | 22% 169 MB |============== | 22% 170 MB |============== | 22% 170 MB |============== | 22% 170 MB |============== | 22% 170 MB |============== | 22% 170 MB |============== | 22% 170 MB |============== | 22% 170 MB |============== | 22% 170 MB |============== | 22% 170 MB |============== | 22% 170 MB |============== | 22% 170 MB |============== | 22% 171 MB |============== | 22% 171 MB |============== | 22% 171 MB |============== | 22% 171 MB |============== | 22% 171 MB |============== | 22% 171 MB |============== | 22% 171 MB |============== | 22% 171 MB |============== | 22% 171 MB |============== | 22% 171 MB |============== | 22% 171 MB |============== | 22% 171 MB |============== | 22% 172 MB |============== | 22% 172 MB |============== | 22% 172 MB |============== | 22% 172 MB |============== | 22% 172 MB |============== | 22% 172 MB |============== | 22% 172 MB |============== | 22% 172 MB |============== | 22% 172 MB |============== | 22% 172 MB |============== | 22% 172 
MB |============== | 22% 172 MB |============== | 22% 173 MB |============== | 22% 173 MB |============== | 22% 173 MB |============== | 22% 173 MB |============== | 22% 173 MB |============== | 22% 173 MB |============== | 22% 173 MB |============== | 22% 173 MB |============== | 22% 173 MB |============== | 22% 173 MB |============== | 22% 173 MB |============== | 22% 173 MB |============== | 22% 174 MB |============== | 22% 174 MB |============== | 22% 174 MB |============== | 22% 174 MB |============== | 22% 174 MB |============== | 22% 174 MB |============== | 22% 174 MB |============== | 22% 174 MB |============== | 22% 174 MB |============== | 22% 174 MB |============== | 22% 174 MB |============== | 22% 174 MB |============== | 22% 175 MB |============== | 22% 175 MB |============== | 22% 175 MB |============== | 22% 175 MB |============== | 22% 175 MB |============== | 22% 175 MB |============== | 22% 175 MB |============== | 22% 175 MB |============== | 22% 175 MB |============== | 22% 175 MB |============== | 22% 175 MB |============== | 22% 175 MB |============== | 22% 176 MB |============== | 22% 176 MB |============== | 22% 176 MB |============== | 22% 176 MB |============== | 22% 176 MB |============== | 22% 176 MB |============== | 22% 176 MB |============== | 22% 176 MB |============== | 22% 176 MB |============== | 22% 176 MB |============== | 22% 176 MB |============== | 22% 177 MB |============== | 22% 177 MB |============== | 22% 177 MB |============== | 22% 177 MB |============== | 22% 177 MB |============== | 23% 177 MB |============== | 23% 177 MB |============== | 23% 177 MB |============== | 23% 177 MB |============== | 23% 177 MB |============== | 23% 177 MB |============== | 23% 177 MB |=============== | 23% 178 MB |=============== | 23% 178 MB |=============== | 23% 178 MB |=============== | 23% 178 MB |=============== | 23% 178 MB |=============== | 23% 178 MB |=============== | 23% 178 MB |=============== | 23% 178 MB |=============== 
| 23% 178 MB |=============== | 23% 178 MB |=============== | 23% 178 MB |=============== | 23% 178 MB |=============== | 23% 179 MB |=============== | 23% 179 MB |=============== | 23% 179 MB |=============== | 23% 179 MB |=============== | 23% 179 MB |=============== | 23% 179 MB |=============== | 23% 179 MB |=============== | 23% 179 MB |=============== | 23% 179 MB |=============== | 23% 179 MB |=============== | 23% 179 MB |=============== | 23% 179 MB |=============== | 23% 180 MB |=============== | 23% 180 MB |=============== | 23% 180 MB |=============== | 23% 180 MB |=============== | 23% 180 MB |=============== | 23% 180 MB |=============== | 23% 180 MB |=============== | 23% 180 MB |=============== | 23% 180 MB |=============== | 23% 180 MB |=============== | 23% 180 MB |=============== | 23% 180 MB |=============== | 23% 181 MB |=============== | 23% 181 MB |=============== | 23% 181 MB |=============== | 23% 181 MB |=============== | 23% 181 MB |=============== | 23% 181 MB |=============== | 23% 181 MB |=============== | 23% 181 MB |=============== | 23% 181 MB |=============== | 23% 181 MB |=============== | 23% 181 MB |=============== | 23% 182 MB |=============== | 23% 182 MB |=============== | 23% 182 MB |=============== | 23% 182 MB |=============== | 23% 182 MB |=============== | 23% 182 MB |=============== | 23% 182 MB |=============== | 23% 182 MB |=============== | 23% 182 MB |=============== | 23% 182 MB |=============== | 23% 182 MB |=============== | 23% 182 MB |=============== | 23% 183 MB |=============== | 23% 183 MB |=============== | 23% 183 MB |=============== | 23% 183 MB |=============== | 23% 183 MB |=============== | 23% 183 MB |=============== | 23% 183 MB |=============== | 23% 183 MB |=============== | 23% 183 MB |=============== | 23% 183 MB |=============== | 23% 183 MB |=============== | 23% 183 MB |=============== | 23% 184 MB |=============== | 23% 184 MB |=============== | 23% 184 MB |=============== | 23% 184 MB 
|=============== | 23% 184 MB |=============== | 23% 184 MB |=============== | 23% 184 MB |=============== | 23% 184 MB |=============== | 23% 184 MB |=============== | 23% 184 MB |=============== | 23% 184 MB |=============== | 23% 184 MB |=============== | 23% 185 MB |=============== | 24% 185 MB |=============== | 24% 185 MB |=============== | 24% 185 MB |=============== | 24% 185 MB |=============== | 24% 185 MB |=============== | 24% 185 MB |=============== | 24% 185 MB |=============== | 24% 185 MB |=============== | 24% 185 MB |=============== | 24% 185 MB |=============== | 24% 185 MB |=============== | 24% 186 MB |=============== | 24% 186 MB |=============== | 24% 186 MB |=============== | 24% 186 MB |=============== | 24% 186 MB |=============== | 24% 186 MB |=============== | 24% 186 MB |=============== | 24% 186 MB |=============== | 24% 186 MB |=============== | 24% 186 MB |=============== | 24% 186 MB |=============== | 24% 186 MB |=============== | 24% 187 MB |=============== | 24% 187 MB |=============== | 24% 187 MB |=============== | 24% 187 MB |=============== | 24% 187 MB |=============== | 24% 187 MB |=============== | 24% 187 MB |=============== | 24% 187 MB |=============== | 24% 187 MB |=============== | 24% 187 MB |=============== | 24% 187 MB |=============== | 24% 187 MB |=============== | 24% 188 MB |=============== | 24% 188 MB |=============== | 24% 188 MB |=============== | 24% 188 MB |=============== | 24% 188 MB |=============== | 24% 188 MB |=============== | 24% 188 MB |=============== | 24% 188 MB |=============== | 24% 188 MB |=============== | 24% 188 MB |=============== | 24% 188 MB |=============== | 24% 189 MB |=============== | 24% 189 MB |=============== | 24% 189 MB |=============== | 24% 189 MB |=============== | 24% 189 MB |=============== | 24% 189 MB |=============== | 24% 189 MB |=============== | 24% 189 MB |=============== | 24% 189 MB |=============== | 24% 189 MB |================ | 24% 189 MB |================ 
| 24% 189 MB |================ | 24% 190 MB |================ | 24% 190 MB |================ | 24% 190 MB |================ | 24% 190 MB |================ | 24% 190 MB |================ | 24% 190 MB |================ | 24% 190 MB |================ | 24% 190 MB |================ | 24% 190 MB |================ | 24% 190 MB |================ | 24% 190 MB |================ | 24% 190 MB |================ | 24% 191 MB |================ | 24% 191 MB |================ | 24% 191 MB |================ | 24% 191 MB |================ | 24% 191 MB |================ | 24% 191 MB |================ | 24% 191 MB |================ | 24% 191 MB |================ | 24% 191 MB |================ | 24% 191 MB |================ | 24% 191 MB |================ | 24% 191 MB |================ | 24% 192 MB |================ | 24% 192 MB |================ | 24% 192 MB |================ | 24% 192 MB |================ | 24% 192 MB |================ | 24% 192 MB |================ | 24% 192 MB |================ | 24% 192 MB |================ | 24% 192 MB |================ | 25% 192 MB |================ | 25% 192 MB |================ | 25% 192 MB |================ | 25% 193 MB |================ | 25% 193 MB |================ | 25% 193 MB |================ | 25% 193 MB |================ | 25% 193 MB |================ | 25% 193 MB |================ | 25% 193 MB |================ | 25% 193 MB |================ | 25% 193 MB |================ | 25% 193 MB |================ | 25% 193 MB |================ | 25% 193 MB |================ | 25% 194 MB |================ | 25% 194 MB |================ | 25% 194 MB |================ | 25% 194 MB |================ | 25% 194 MB |================ | 25% 194 MB |================ | 25% 194 MB |================ | 25% 194 MB |================ | 25% 194 MB |================ | 25% 194 MB |================ | 25% 194 MB |================ | 25% 195 MB |================ | 25% 195 MB |================ | 25% 195 MB |================ | 25% 195 MB |================ | 25% 195 MB 
|================ | 25% 195 MB |================ | 25% 195 MB |================ | 25% 195 MB |================ | 25% 195 MB |================ | 25% 195 MB |================ | 25% 195 MB |================ | 25% 195 MB |================ | 25% 196 MB |================ | 25% 196 MB |================ | 25% 196 MB |================ | 25% 196 MB |================ | 25% 196 MB |================ | 25% 196 MB |================ | 25% 196 MB |================ | 25% 196 MB |================ | 25% 196 MB |================ | 25% 196 MB |================ | 25% 196 MB |================ | 25% 196 MB |================ | 25% 197 MB |================ | 25% 197 MB |================ | 25% 197 MB |================ | 25% 197 MB |================ | 25% 197 MB |================ | 25% 197 MB |================ | 25% 197 MB |================ | 25% 197 MB |================ | 25% 197 MB |================ | 25% 197 MB |================ | 25% 197 MB |================ | 25% 197 MB |================ | 25% 198 MB |================ | 25% 198 MB |================ | 25% 198 MB |================ | 25% 198 MB |================ | 25% 198 MB |================ | 25% 198 MB |================ | 25% 198 MB |================ | 25% 198 MB |================ | 25% 198 MB |================ | 25% 198 MB |================ | 25% 198 MB |================ | 25% 198 MB |================ | 25% 199 MB |================ | 25% 199 MB |================ | 25% 199 MB |================ | 25% 199 MB |================ | 25% 199 MB |================ | 25% 199 MB |================ | 25% 199 MB |================ | 25% 199 MB |================ | 25% 199 MB |================ | 25% 199 MB |================ | 25% 199 MB |================ | 25% 200 MB |================ | 25% 200 MB |================ | 25% 200 MB |================ | 25% 200 MB |================ | 25% 200 MB |================ | 25% 200 MB |================ | 26% 200 MB |================ | 26% 200 MB |================ | 26% 200 MB |================ | 26% 200 MB 
|================ | 26% 200 MB |================ | 26% 200 MB |================ | 26% 201 MB |================ | 26% 201 MB |================ | 26% 201 MB |================ | 26% 201 MB |================ | 26% 201 MB |================ | 26% 201 MB |================ | 26% 201 MB |================ | 26% 201 MB |================= | 26% 201 MB |================= | 26% 201 MB |================= | 26% 201 MB |================= | 26% 201 MB |================= | 26% 202 MB |================= | 26% 202 MB |================= | 26% 202 MB |================= | 26% 202 MB |================= | 26% 202 MB |================= | 26% 202 MB |================= | 26% 202 MB |================= | 26% 202 MB |================= | 26% 202 MB |================= | 26% 202 MB |================= | 26% 202 MB |================= | 26% 202 MB |================= | 26% 203 MB |================= | 26% 203 MB |================= | 26% 203 MB |================= | 26% 203 MB |================= | 26% 203 MB |================= | 26% 203 MB |================= | 26% 203 MB |================= | 26% 203 MB |================= | 26% 203 MB |================= | 26% 203 MB |================= | 26% 203 MB |================= | 26% 204 MB |================= | 26% 204 MB |================= | 26% 204 MB |================= | 26% 204 MB |================= | 26% 204 MB |================= | 26% 204 MB |================= | 26% 204 MB |================= | 26% 204 MB |================= | 26% 204 MB |================= | 26% 204 MB |================= | 26% 204 MB |================= | 26% 204 MB |================= | 26% 205 MB |================= | 26% 205 MB |================= | 26% 205 MB |================= | 26% 205 MB |================= | 26% 205 MB |================= | 26% 205 MB |================= | 26% 205 MB |================= | 26% 205 MB |================= | 26% 205 MB |================= | 26% 205 MB |================= | 26% 205 MB |================= | 26% 205 MB |================= | 26% 206 MB |================= | 26% 
|========================= | 39% 304 MB |========================= | 39% 304 MB |========================= | 39% 304 MB |========================= | 39% 304 MB |========================= | 39% 305 MB |========================= | 39% 305 MB |========================= | 39% 305 MB |========================= | 39% 305 MB |========================= | 39% 305 MB |========================= | 39% 305 MB |========================= | 39% 305 MB |========================= | 39% 305 MB |========================= | 39% 305 MB |========================= | 39% 305 MB |========================= | 39% 305 MB |========================= | 39% 305 MB |========================= | 39% 306 MB |========================= | 39% 306 MB |========================= | 39% 306 MB |========================= | 39% 306 MB |========================= | 39% 306 MB |========================= | 39% 306 MB |========================= | 39% 306 MB |========================= | 39% 306 MB |========================= | 39% 306 MB |========================= | 39% 306 MB |========================= | 39% 306 MB |========================= | 39% 306 MB |========================= | 39% 307 MB |========================= | 39% 307 MB |========================= | 39% 307 MB |========================= | 39% 307 MB |========================= | 39% 307 MB |========================= | 39% 307 MB |========================= | 39% 307 MB |========================= | 39% 307 MB |========================= | 39% 307 MB |========================= | 39% 307 MB |========================= | 39% 307 MB |========================= | 39% 307 MB |========================= | 39% 308 MB |========================= | 39% 308 MB |========================= | 39% 308 MB |========================= | 39% 308 MB |========================= | 39% 308 MB |========================== | 40% 308 MB |========================== | 40% 308 MB |========================== | 40% 308 MB |========================== | 40% 308 MB |========================== | 40% 
308 MB |========================== | 40% 308 MB |========================== | 40% 309 MB |========================== | 40% 309 MB |========================== | 40% 309 MB |========================== | 40% 309 MB |========================== | 40% 309 MB |========================== | 40% 309 MB |========================== | 40% 309 MB |========================== | 40% 309 MB |========================== | 40% 309 MB |========================== | 40% 309 MB |========================== | 40% 309 MB |========================== | 40% 309 MB |========================== | 40% 310 MB |========================== | 40% 310 MB |========================== | 40% 310 MB |========================== | 40% 310 MB |========================== | 40% 310 MB |========================== | 40% 310 MB |========================== | 40% 310 MB |========================== | 40% 310 MB |========================== | 40% 310 MB |========================== | 40% 310 MB |========================== | 40% 310 MB |========================== | 40% 310 MB |========================== | 40% 311 MB |========================== | 40% 311 MB |========================== | 40% 311 MB |========================== | 40% 311 MB |========================== | 40% 311 MB |========================== | 40% 311 MB |========================== | 40% 311 MB |========================== | 40% 311 MB |========================== | 40% 311 MB |========================== | 40% 311 MB |========================== | 40% 311 MB |========================== | 40% 311 MB |========================== | 40% 312 MB |========================== | 40% 312 MB |========================== | 40% 312 MB |========================== | 40% 312 MB |========================== | 40% 312 MB |========================== | 40% 312 MB |========================== | 40% 312 MB |========================== | 40% 312 MB |========================== | 40% 312 MB |========================== | 40% 312 MB |========================== | 40% 312 MB 
|========================== | 40% 313 MB |========================== | 40% 313 MB |========================== | 40% 313 MB |========================== | 40% 313 MB |========================== | 40% 313 MB |========================== | 40% 313 MB |========================== | 40% 313 MB |========================== | 40% 313 MB |========================== | 40% 313 MB |========================== | 40% 313 MB |========================== | 40% 313 MB |========================== | 40% 313 MB |========================== | 40% 314 MB |========================== | 40% 314 MB |========================== | 40% 314 MB |========================== | 40% 314 MB |========================== | 40% 314 MB |========================== | 40% 314 MB |========================== | 40% 314 MB |========================== | 40% 314 MB |========================== | 40% 314 MB |========================== | 40% 314 MB |========================== | 40% 314 MB |========================== | 40% 314 MB |========================== | 40% 315 MB |========================== | 40% 315 MB |========================== | 40% 315 MB |========================== | 40% 315 MB |========================== | 40% 315 MB |========================== | 40% 315 MB |========================== | 40% 315 MB |========================== | 40% 315 MB |========================== | 40% 315 MB |========================== | 40% 315 MB |========================== | 40% 315 MB |========================== | 40% 315 MB |========================== | 40% 316 MB |========================== | 41% 316 MB |========================== | 41% 316 MB |========================== | 41% 316 MB |========================== | 41% 316 MB |========================== | 41% 316 MB |========================== | 41% 316 MB |========================== | 41% 316 MB |========================== | 41% 316 MB |========================== | 41% 316 MB |========================== | 41% 316 MB |========================== | 41% 317 MB |========================== | 
41% 317 MB |========================== | 41% 317 MB |========================== | 41% 317 MB |========================== | 41% 317 MB |========================== | 41% 317 MB |========================== | 41% 317 MB |========================== | 41% 317 MB |========================== | 41% 317 MB |========================== | 41% 317 MB |========================== | 41% 317 MB |========================== | 41% 317 MB |========================== | 41% 318 MB |========================== | 41% 318 MB |========================== | 41% 318 MB |========================== | 41% 318 MB |========================== | 41% 318 MB |========================== | 41% 318 MB |========================== | 41% 318 MB |========================== | 41% 318 MB |========================== | 41% 318 MB |========================== | 41% 318 MB |========================== | 41% 318 MB |========================== | 41% 318 MB |========================== | 41% 319 MB |========================== | 41% 319 MB |========================== | 41% 319 MB |========================== | 41% 319 MB |========================== | 41% 319 MB |========================== | 41% 319 MB |========================== | 41% 319 MB |========================== | 41% 319 MB |========================== | 41% 319 MB |========================== | 41% 319 MB |========================== | 41% 319 MB |========================== | 41% 319 MB |========================== | 41% 320 MB |========================== | 41% 320 MB |========================== | 41% 320 MB |========================== | 41% 320 MB |=========================== | 41% 320 MB |=========================== | 41% 320 MB |=========================== | 41% 320 MB |=========================== | 41% 320 MB |=========================== | 41% 320 MB |=========================== | 41% 320 MB |=========================== | 41% 320 MB |=========================== | 41% 320 MB |=========================== | 41% 321 MB |=========================== | 41% 321 MB 
|=========================== | 41% 321 MB |=========================== | 41% 321 MB |=========================== | 41% 321 MB |=========================== | 41% 321 MB |=========================== | 41% 321 MB |=========================== | 41% 321 MB |=========================== | 41% 321 MB |=========================== | 41% 321 MB |=========================== | 41% 321 MB |=========================== | 41% 321 MB |=========================== | 41% 322 MB |=========================== | 41% 322 MB |=========================== | 41% 322 MB |=========================== | 41% 322 MB |=========================== | 41% 322 MB |=========================== | 41% 322 MB |=========================== | 41% 322 MB |=========================== | 41% 322 MB |=========================== | 41% 322 MB |=========================== | 41% 322 MB |=========================== | 41% 322 MB |=========================== | 41% 323 MB |=========================== | 41% 323 MB |=========================== | 41% 323 MB |=========================== | 41% 323 MB |=========================== | 41% 323 MB |=========================== | 41% 323 MB |=========================== | 41% 323 MB |=========================== | 41% 323 MB |=========================== | 41% 323 MB |=========================== | 41% 323 MB |=========================== | 41% 323 MB |=========================== | 42% 323 MB |=========================== | 42% 324 MB |=========================== | 42% 324 MB |=========================== | 42% 324 MB |=========================== | 42% 324 MB |=========================== | 42% 324 MB |=========================== | 42% 324 MB |=========================== | 42% 324 MB |=========================== | 42% 324 MB |=========================== | 42% 324 MB |=========================== | 42% 324 MB |=========================== | 42% 324 MB |=========================== | 42% 324 MB |=========================== | 42% 325 MB |=========================== | 42% 325 MB 
|=========================== | 42% 325 MB |=========================== | 42% 325 MB |=========================== | 42% 325 MB |=========================== | 42% 325 MB |=========================== | 42% 325 MB |=========================== | 42% 325 MB |=========================== | 42% 325 MB |=========================== | 42% 325 MB |=========================== | 42% 325 MB |=========================== | 42% 325 MB |=========================== | 42% 326 MB |=========================== | 42% 326 MB |=========================== | 42% 326 MB |=========================== | 42% 326 MB |=========================== | 42% 326 MB |=========================== | 42% 326 MB |=========================== | 42% 326 MB |=========================== | 42% 326 MB |=========================== | 42% 326 MB |=========================== | 42% 326 MB |=========================== | 42% 326 MB |=========================== | 42% 326 MB |=========================== | 42% 327 MB |=========================== | 42% 327 MB |=========================== | 42% 327 MB |=========================== | 42% 327 MB |=========================== | 42% 327 MB |=========================== | 42% 327 MB |=========================== | 42% 327 MB |=========================== | 42% 327 MB |=========================== | 42% 327 MB |=========================== | 42% 327 MB |=========================== | 42% 327 MB |=========================== | 42% 327 MB |=========================== | 42% 328 MB |=========================== | 42% 328 MB |=========================== | 42% 328 MB |=========================== | 42% 328 MB |=========================== | 42% 328 MB |=========================== | 42% 328 MB |=========================== | 42% 328 MB |=========================== | 42% 328 MB |=========================== | 42% 328 MB |=========================== | 42% 328 MB |=========================== | 42% 328 MB |=========================== | 42% 329 MB |=========================== | 42% 329 MB 
|=========================== | 42% 329 MB |=========================== | 42% 329 MB |=========================== | 42% 329 MB |=========================== | 42% 329 MB |=========================== | 42% 329 MB |=========================== | 42% 329 MB |=========================== | 42% 329 MB |=========================== | 42% 329 MB |=========================== | 42% 329 MB |=========================== | 42% 329 MB |=========================== | 42% 330 MB |=========================== | 42% 330 MB |=========================== | 42% 330 MB |=========================== | 42% 330 MB |=========================== | 42% 330 MB |=========================== | 42% 330 MB |=========================== | 42% 330 MB |=========================== | 42% 330 MB |=========================== | 42% 330 MB |=========================== | 42% 330 MB |=========================== | 42% 330 MB |=========================== | 42% 330 MB |=========================== | 42% 331 MB |=========================== | 42% 331 MB |=========================== | 42% 331 MB |=========================== | 42% 331 MB |=========================== | 42% 331 MB |=========================== | 42% 331 MB |=========================== | 42% 331 MB |=========================== | 43% 331 MB |=========================== | 43% 331 MB |=========================== | 43% 331 MB |=========================== | 43% 331 MB |=========================== | 43% 331 MB |=========================== | 43% 332 MB |=========================== | 43% 332 MB |============================ | 43% 332 MB |============================ | 43% 332 MB |============================ | 43% 332 MB |============================ | 43% 332 MB |============================ | 43% 332 MB |============================ | 43% 332 MB |============================ | 43% 332 MB |============================ | 43% 332 MB |============================ | 43% 332 MB |============================ | 43% 332 MB |============================ | 43% 333 MB 
|============================ | 43% 333 MB |============================ | 43% 333 MB |============================ | 43% 333 MB |============================ | 43% 333 MB |============================ | 43% 333 MB |============================ | 43% 333 MB |============================ | 43% 333 MB |============================ | 43% 333 MB |============================ | 43% 333 MB |============================ | 43% 333 MB |============================ | 43% 334 MB |============================ | 43% 334 MB |============================ | 43% 334 MB |============================ | 43% 334 MB |============================ | 43% 334 MB |============================ | 43% 334 MB |============================ | 43% 334 MB |============================ | 43% 334 MB |============================ | 43% 334 MB |============================ | 43% 334 MB |============================ | 43% 334 MB |============================ | 43% 334 MB |============================ | 43% 335 MB |============================ | 43% 335 MB |============================ | 43% 335 MB |============================ | 43% 335 MB |============================ | 43% 335 MB |============================ | 43% 335 MB |============================ | 43% 335 MB |============================ | 43% 335 MB |============================ | 43% 335 MB |============================ | 43% 335 MB |============================ | 43% 335 MB |============================ | 43% 335 MB |============================ | 43% 336 MB |============================ | 43% 336 MB |============================ | 43% 336 MB |============================ | 43% 336 MB |============================ | 43% 336 MB |============================ | 43% 336 MB |============================ | 43% 336 MB |============================ | 43% 336 MB |============================ | 43% 336 MB |============================ | 43% 336 MB |============================ | 43% 336 MB |============================ | 43% 336 MB 
|============================ | 43% 337 MB |============================ | 43% 337 MB |============================ | 43% 337 MB |============================ | 43% 337 MB |============================ | 43% 337 MB |============================ | 43% 337 MB |============================ | 43% 337 MB |============================ | 43% 337 MB |============================ | 43% 337 MB |============================ | 43% 337 MB |============================ | 43% 337 MB |============================ | 43% 337 MB |============================ | 43% 338 MB |============================ | 43% 338 MB |============================ | 43% 338 MB |============================ | 43% 338 MB |============================ | 43% 338 MB |============================ | 43% 338 MB |============================ | 43% 338 MB |============================ | 43% 338 MB |============================ | 43% 338 MB |============================ | 43% 338 MB |============================ | 43% 338 MB |============================ | 43% 339 MB |============================ | 43% 339 MB |============================ | 43% 339 MB |============================ | 43% 339 MB |============================ | 44% 339 MB |============================ | 44% 339 MB |============================ | 44% 339 MB |============================ | 44% 339 MB |============================ | 44% 339 MB |============================ | 44% 339 MB |============================ | 44% 339 MB |============================ | 44% 339 MB |============================ | 44% 340 MB |============================ | 44% 340 MB |============================ | 44% 340 MB |============================ | 44% 340 MB |============================ | 44% 340 MB |============================ | 44% 340 MB |============================ | 44% 340 MB |============================ | 44% 340 MB |============================ | 44% 340 MB |============================ | 44% 340 MB |============================ | 44% 340 MB 
|============================ | 44% 340 MB |============================ | 44% 341 MB |============================ | 44% 341 MB |============================ | 44% 341 MB |============================ | 44% 341 MB |============================ | 44% 341 MB |============================ | 44% 341 MB |============================ | 44% 341 MB |============================ | 44% 341 MB |============================ | 44% 341 MB |============================ | 44% 341 MB |============================ | 44% 341 MB |============================ | 44% 341 MB |============================ | 44% 342 MB |============================ | 44% 342 MB |============================ | 44% 342 MB |============================ | 44% 342 MB |============================ | 44% 342 MB |============================ | 44% 342 MB |============================ | 44% 342 MB |============================ | 44% 342 MB |============================ | 44% 342 MB |============================ | 44% 342 MB |============================ | 44% 342 MB |============================ | 44% 342 MB |============================ | 44% 343 MB |============================ | 44% 343 MB |============================ | 44% 343 MB |============================ | 44% 343 MB |============================ | 44% 343 MB |============================ | 44% 343 MB |============================ | 44% 343 MB |============================ | 44% 343 MB |============================ | 44% 343 MB |============================ | 44% 343 MB |============================ | 44% 343 MB |============================ | 44% 344 MB |============================= | 44% 344 MB |============================= | 44% 344 MB |============================= | 44% 344 MB |============================= | 44% 344 MB |============================= | 44% 344 MB |============================= | 44% 344 MB |============================= | 44% 344 MB |============================= | 44% 344 MB |============================= | 44% 344 MB 
|============================= | 44% 344 MB |============================= | 44% 344 MB |============================= | 44% 345 MB |============================= | 44% 345 MB |============================= | 44% 345 MB |============================= | 44% 345 MB |============================= | 44% 345 MB |============================= | 44% 345 MB |============================= | 44% 345 MB |============================= | 44% 345 MB |============================= | 44% 345 MB |============================= | 44% 345 MB |============================= | 44% 345 MB |============================= | 44% 345 MB |============================= | 44% 346 MB |============================= | 44% 346 MB |============================= | 44% 346 MB |============================= | 44% 346 MB |============================= | 44% 346 MB |============================= | 44% 346 MB |============================= | 44% 346 MB |============================= | 44% 346 MB |============================= | 44% 346 MB |============================= | 44% 346 MB |============================= | 44% 346 MB |============================= | 44% 346 MB |============================= | 45% 347 MB |============================= | 45% 347 MB |============================= | 45% 347 MB |============================= | 45% 347 MB |============================= | 45% 347 MB |============================= | 45% 347 MB |============================= | 45% 347 MB |============================= | 45% 347 MB |============================= | 45% 347 MB |============================= | 45% 347 MB |============================= | 45% 347 MB |============================= | 45% 347 MB |============================= | 45% 348 MB |============================= | 45% 348 MB |============================= | 45% 348 MB |============================= | 45% 348 MB |============================= | 45% 348 MB |============================= | 45% 348 MB |============================= | 45% 348 MB 
|============================= | 45% 348 MB |============================= | 45% 348 MB |============================= | 45% 348 MB |============================= | 45% 348 MB |============================= | 45% 349 MB |============================= | 45% 349 MB |============================= | 45% 349 MB |============================= | 45% 349 MB |============================= | 45% 349 MB |============================= | 45% 349 MB |============================= | 45% 349 MB |============================= | 45% 349 MB |============================= | 45% 349 MB |============================= | 45% 349 MB |============================= | 45% 349 MB |============================= | 45% 349 MB |============================= | 45% 350 MB |============================= | 45% 350 MB |============================= | 45% 350 MB |============================= | 45% 350 MB |============================= | 45% 350 MB |============================= | 45% 350 MB |============================= | 45% 350 MB |============================= | 45% 350 MB |============================= | 45% 350 MB |============================= | 45% 350 MB |============================= | 45% 350 MB |============================= | 45% 350 MB |============================= | 45% 351 MB |============================= | 45% 351 MB |============================= | 45% 351 MB |============================= | 45% 351 MB |============================= | 45% 351 MB |============================= | 45% 351 MB |============================= | 45% 351 MB |============================= | 45% 351 MB |============================= | 45% 351 MB |============================= | 45% 351 MB |============================= | 45% 351 MB |============================= | 45% 351 MB |============================= | 45% 352 MB |============================= | 45% 352 MB |============================= | 45% 352 MB |============================= | 45% 352 MB |============================= | 45% 352 MB 
|============================= | 45% 352 MB |============================= | 45% 352 MB |============================= | 45% 352 MB |============================= | 45% 352 MB |============================= | 45% 352 MB |============================= | 45% 352 MB |============================= | 45% 352 MB |============================= | 45% 353 MB |============================= | 45% 353 MB |============================= | 45% 353 MB |============================= | 45% 353 MB |============================= | 45% 353 MB |============================= | 45% 353 MB |============================= | 45% 353 MB |============================= | 45% 353 MB |============================= | 45% 353 MB |============================= | 45% 353 MB |============================= | 45% 353 MB |============================= | 45% 353 MB |============================= | 45% 354 MB |============================= | 45% 354 MB |============================= | 45% 354 MB |============================= | 45% 354 MB |============================= | 45% 354 MB |============================= | 45% 354 MB |============================= | 45% 354 MB |============================= | 45% 354 MB |============================= | 46% 354 MB |============================= | 46% 354 MB |============================= | 46% 354 MB |============================= | 46% 355 MB |============================= | 46% 355 MB |============================= | 46% 355 MB |============================= | 46% 355 MB |============================= | 46% 355 MB |============================= | 46% 355 MB |============================= | 46% 355 MB |============================= | 46% 355 MB |============================= | 46% 355 MB |============================= | 46% 355 MB |============================= | 46% 355 MB |============================== | 46% 355 MB |============================== | 46% 356 MB |============================== | 46% 356 MB |============================== | 46% 356 MB 
[... download progress output trimmed: 46% → 55%, 356 MB → 430 MB ...]
|==================================== | 55% 430 MB |==================================== | 55% 430 MB |==================================== | 55% 430 MB |==================================== | 55% 431 MB |==================================== | 55% 431 MB |==================================== | 55% 431 MB |==================================== | 55% 431 MB |==================================== | 55% 431 MB |==================================== | 55% 431 MB |==================================== | 55% 431 MB |==================================== | 55% 431 MB |==================================== | 55% 431 MB |==================================== | 55% 431 MB |==================================== | 56% 431 MB |==================================== | 56% 431 MB |==================================== | 56% 432 MB |==================================== | 56% 432 MB |==================================== | 56% 432 MB |==================================== | 56% 432 MB |==================================== | 56% 432 MB |==================================== | 56% 432 MB |==================================== | 56% 432 MB |==================================== | 56% 432 MB |==================================== | 56% 432 MB |==================================== | 56% 432 MB |==================================== | 56% 432 MB |==================================== | 56% 432 MB |==================================== | 56% 433 MB |==================================== | 56% 433 MB |==================================== | 56% 433 MB |==================================== | 56% 433 MB |==================================== | 56% 433 MB |==================================== | 56% 433 MB |==================================== | 56% 433 MB |==================================== | 56% 433 MB |==================================== | 56% 433 MB |==================================== | 56% 433 MB |==================================== | 56% 433 MB |==================================== | 56% 434 MB 
|==================================== | 56% 434 MB |==================================== | 56% 434 MB |==================================== | 56% 434 MB |==================================== | 56% 434 MB |==================================== | 56% 434 MB |==================================== | 56% 434 MB |==================================== | 56% 434 MB |==================================== | 56% 434 MB |==================================== | 56% 434 MB |==================================== | 56% 434 MB |==================================== | 56% 434 MB |==================================== | 56% 435 MB |==================================== | 56% 435 MB |==================================== | 56% 435 MB |==================================== | 56% 435 MB |==================================== | 56% 435 MB |==================================== | 56% 435 MB |==================================== | 56% 435 MB |==================================== | 56% 435 MB |==================================== | 56% 435 MB |==================================== | 56% 435 MB |==================================== | 56% 435 MB |==================================== | 56% 435 MB |==================================== | 56% 436 MB |==================================== | 56% 436 MB |==================================== | 56% 436 MB |==================================== | 56% 436 MB |==================================== | 56% 436 MB |==================================== | 56% 436 MB |==================================== | 56% 436 MB |==================================== | 56% 436 MB |==================================== | 56% 436 MB |==================================== | 56% 436 MB |==================================== | 56% 436 MB |==================================== | 56% 437 MB |==================================== | 56% 437 MB |==================================== | 56% 437 MB |==================================== | 56% 437 MB |==================================== | 56% 437 MB 
|==================================== | 56% 437 MB |==================================== | 56% 437 MB |==================================== | 56% 437 MB |==================================== | 56% 437 MB |==================================== | 56% 437 MB |==================================== | 56% 437 MB |==================================== | 56% 437 MB |==================================== | 56% 438 MB |==================================== | 56% 438 MB |==================================== | 56% 438 MB |==================================== | 56% 438 MB |==================================== | 56% 438 MB |==================================== | 56% 438 MB |==================================== | 56% 438 MB |==================================== | 56% 438 MB |==================================== | 56% 438 MB |==================================== | 56% 438 MB |==================================== | 56% 438 MB |===================================== | 56% 438 MB |===================================== | 56% 439 MB |===================================== | 56% 439 MB |===================================== | 56% 439 MB |===================================== | 56% 439 MB |===================================== | 56% 439 MB |===================================== | 56% 439 MB |===================================== | 57% 439 MB |===================================== | 57% 439 MB |===================================== | 57% 439 MB |===================================== | 57% 439 MB |===================================== | 57% 439 MB |===================================== | 57% 440 MB |===================================== | 57% 440 MB |===================================== | 57% 440 MB |===================================== | 57% 440 MB |===================================== | 57% 440 MB |===================================== | 57% 440 MB |===================================== | 57% 440 MB |===================================== | 57% 440 MB |===================================== | 
57% 440 MB |===================================== | 57% 440 MB |===================================== | 57% 440 MB |===================================== | 57% 440 MB |===================================== | 57% 441 MB |===================================== | 57% 441 MB |===================================== | 57% 441 MB |===================================== | 57% 441 MB |===================================== | 57% 441 MB |===================================== | 57% 441 MB |===================================== | 57% 441 MB |===================================== | 57% 441 MB |===================================== | 57% 441 MB |===================================== | 57% 441 MB |===================================== | 57% 441 MB |===================================== | 57% 442 MB |===================================== | 57% 442 MB |===================================== | 57% 442 MB |===================================== | 57% 442 MB |===================================== | 57% 442 MB |===================================== | 57% 442 MB |===================================== | 57% 442 MB |===================================== | 57% 442 MB |===================================== | 57% 442 MB |===================================== | 57% 442 MB |===================================== | 57% 442 MB |===================================== | 57% 442 MB |===================================== | 57% 443 MB |===================================== | 57% 443 MB |===================================== | 57% 443 MB |===================================== | 57% 443 MB |===================================== | 57% 443 MB |===================================== | 57% 443 MB |===================================== | 57% 443 MB |===================================== | 57% 443 MB |===================================== | 57% 443 MB |===================================== | 57% 443 MB |===================================== | 57% 443 MB |===================================== | 57% 443 MB 
|===================================== | 57% 444 MB |===================================== | 57% 444 MB |===================================== | 57% 444 MB |===================================== | 57% 444 MB |===================================== | 57% 444 MB |===================================== | 57% 444 MB |===================================== | 57% 444 MB |===================================== | 57% 444 MB |===================================== | 57% 444 MB |===================================== | 57% 444 MB |===================================== | 57% 444 MB |===================================== | 57% 445 MB |===================================== | 57% 445 MB |===================================== | 57% 445 MB |===================================== | 57% 445 MB |===================================== | 57% 445 MB |===================================== | 57% 445 MB |===================================== | 57% 445 MB |===================================== | 57% 445 MB |===================================== | 57% 445 MB |===================================== | 57% 445 MB |===================================== | 57% 445 MB |===================================== | 57% 445 MB |===================================== | 57% 446 MB |===================================== | 57% 446 MB |===================================== | 57% 446 MB |===================================== | 57% 446 MB |===================================== | 57% 446 MB |===================================== | 57% 446 MB |===================================== | 57% 446 MB |===================================== | 57% 446 MB |===================================== | 57% 446 MB |===================================== | 57% 446 MB |===================================== | 57% 446 MB |===================================== | 57% 446 MB |===================================== | 57% 447 MB |===================================== | 57% 447 MB |===================================== | 57% 447 MB 
|===================================== | 58% 447 MB |===================================== | 58% 447 MB |===================================== | 58% 447 MB |===================================== | 58% 447 MB |===================================== | 58% 447 MB |===================================== | 58% 447 MB |===================================== | 58% 447 MB |===================================== | 58% 447 MB |===================================== | 58% 447 MB |===================================== | 58% 448 MB |===================================== | 58% 448 MB |===================================== | 58% 448 MB |===================================== | 58% 448 MB |===================================== | 58% 448 MB |===================================== | 58% 448 MB |===================================== | 58% 448 MB |===================================== | 58% 448 MB |===================================== | 58% 448 MB |===================================== | 58% 448 MB |===================================== | 58% 448 MB |===================================== | 58% 449 MB |===================================== | 58% 449 MB |===================================== | 58% 449 MB |===================================== | 58% 449 MB |===================================== | 58% 449 MB |===================================== | 58% 449 MB |===================================== | 58% 449 MB |===================================== | 58% 449 MB |===================================== | 58% 449 MB |===================================== | 58% 449 MB |===================================== | 58% 449 MB |===================================== | 58% 449 MB |===================================== | 58% 450 MB |===================================== | 58% 450 MB |===================================== | 58% 450 MB |===================================== | 58% 450 MB |===================================== | 58% 450 MB |===================================== | 58% 450 MB 
|===================================== | 58% 450 MB |===================================== | 58% 450 MB |===================================== | 58% 450 MB |====================================== | 58% 450 MB |====================================== | 58% 450 MB |====================================== | 58% 450 MB |====================================== | 58% 451 MB |====================================== | 58% 451 MB |====================================== | 58% 451 MB |====================================== | 58% 451 MB |====================================== | 58% 451 MB |====================================== | 58% 451 MB |====================================== | 58% 451 MB |====================================== | 58% 451 MB |====================================== | 58% 451 MB |====================================== | 58% 451 MB |====================================== | 58% 451 MB |====================================== | 58% 452 MB |====================================== | 58% 452 MB |====================================== | 58% 452 MB |====================================== | 58% 452 MB |====================================== | 58% 452 MB |====================================== | 58% 452 MB |====================================== | 58% 452 MB |====================================== | 58% 452 MB |====================================== | 58% 452 MB |====================================== | 58% 452 MB |====================================== | 58% 452 MB |====================================== | 58% 452 MB |====================================== | 58% 453 MB |====================================== | 58% 453 MB |====================================== | 58% 453 MB |====================================== | 58% 453 MB |====================================== | 58% 453 MB |====================================== | 58% 453 MB |====================================== | 58% 453 MB |====================================== | 58% 453 MB |====================================== | 
58% 453 MB |====================================== | 58% 453 MB |====================================== | 58% 453 MB |====================================== | 58% 453 MB |====================================== | 58% 454 MB |====================================== | 58% 454 MB |====================================== | 58% 454 MB |====================================== | 58% 454 MB |====================================== | 58% 454 MB |====================================== | 58% 454 MB |====================================== | 58% 454 MB |====================================== | 58% 454 MB |====================================== | 58% 454 MB |====================================== | 58% 454 MB |====================================== | 58% 454 MB |====================================== | 59% 454 MB |====================================== | 59% 455 MB |====================================== | 59% 455 MB |====================================== | 59% 455 MB |====================================== | 59% 455 MB |====================================== | 59% 455 MB |====================================== | 59% 455 MB |====================================== | 59% 455 MB |====================================== | 59% 455 MB |====================================== | 59% 455 MB |====================================== | 59% 455 MB |====================================== | 59% 455 MB |====================================== | 59% 455 MB |====================================== | 59% 456 MB |====================================== | 59% 456 MB |====================================== | 59% 456 MB |====================================== | 59% 456 MB |====================================== | 59% 456 MB |====================================== | 59% 456 MB |====================================== | 59% 456 MB |====================================== | 59% 456 MB |====================================== | 59% 456 MB |====================================== | 59% 456 MB 
|====================================== | 59% 456 MB |====================================== | 59% 457 MB |====================================== | 59% 457 MB |====================================== | 59% 457 MB |====================================== | 59% 457 MB |====================================== | 59% 457 MB |====================================== | 59% 457 MB |====================================== | 59% 457 MB |====================================== | 59% 457 MB |====================================== | 59% 457 MB |====================================== | 59% 457 MB |====================================== | 59% 457 MB |====================================== | 59% 457 MB |====================================== | 59% 458 MB |====================================== | 59% 458 MB |====================================== | 59% 458 MB |====================================== | 59% 458 MB |====================================== | 59% 458 MB |====================================== | 59% 458 MB |====================================== | 59% 458 MB |====================================== | 59% 458 MB |====================================== | 59% 458 MB |====================================== | 59% 458 MB |====================================== | 59% 458 MB |====================================== | 59% 458 MB |====================================== | 59% 459 MB |====================================== | 59% 459 MB |====================================== | 59% 459 MB |====================================== | 59% 459 MB |====================================== | 59% 459 MB |====================================== | 59% 459 MB |====================================== | 59% 459 MB |====================================== | 59% 459 MB |====================================== | 59% 459 MB |====================================== | 59% 459 MB |====================================== | 59% 459 MB |====================================== | 59% 459 MB 
|====================================== | 59% 460 MB |====================================== | 59% 460 MB |====================================== | 59% 460 MB |====================================== | 59% 460 MB |====================================== | 59% 460 MB |====================================== | 59% 460 MB |====================================== | 59% 460 MB |====================================== | 59% 460 MB |====================================== | 59% 460 MB |====================================== | 59% 460 MB |====================================== | 59% 460 MB |====================================== | 59% 461 MB |====================================== | 59% 461 MB |====================================== | 59% 461 MB |====================================== | 59% 461 MB |====================================== | 59% 461 MB |====================================== | 59% 461 MB |====================================== | 59% 461 MB |====================================== | 59% 461 MB |====================================== | 59% 461 MB |====================================== | 59% 461 MB |====================================== | 59% 461 MB |====================================== | 59% 461 MB |====================================== | 59% 462 MB |====================================== | 59% 462 MB |====================================== | 59% 462 MB |====================================== | 59% 462 MB |====================================== | 59% 462 MB |====================================== | 59% 462 MB |====================================== | 59% 462 MB |====================================== | 59% 462 MB |======================================= | 60% 462 MB |======================================= | 60% 462 MB |======================================= | 60% 462 MB |======================================= | 60% 462 MB |======================================= | 60% 463 MB |======================================= | 60% 463 MB 
|======================================= | 60% 463 MB |======================================= | 60% 463 MB |======================================= | 60% 463 MB |======================================= | 60% 463 MB |======================================= | 60% 463 MB |======================================= | 60% 463 MB |======================================= | 60% 463 MB |======================================= | 60% 463 MB |======================================= | 60% 463 MB |======================================= | 60% 463 MB |======================================= | 60% 464 MB |======================================= | 60% 464 MB |======================================= | 60% 464 MB |======================================= | 60% 464 MB |======================================= | 60% 464 MB |======================================= | 60% 464 MB |======================================= | 60% 464 MB |======================================= | 60% 464 MB |======================================= | 60% 464 MB |======================================= | 60% 464 MB |======================================= | 60% 464 MB |======================================= | 60% 465 MB |======================================= | 60% 465 MB |======================================= | 60% 465 MB |======================================= | 60% 465 MB |======================================= | 60% 465 MB |======================================= | 60% 465 MB |======================================= | 60% 465 MB |======================================= | 60% 465 MB |======================================= | 60% 465 MB |======================================= | 60% 465 MB |======================================= | 60% 465 MB |======================================= | 60% 465 MB |======================================= | 60% 466 MB |======================================= | 60% 466 MB |======================================= | 60% 466 MB |======================================= | 60% 466 MB 
|======================================= | 60% 466 MB |======================================= | 60% 466 MB |======================================= | 60% 466 MB |======================================= | 60% 466 MB |======================================= | 60% 466 MB |======================================= | 60% 466 MB |======================================= | 60% 466 MB |======================================= | 60% 466 MB |======================================= | 60% 467 MB |======================================= | 60% 467 MB |======================================= | 60% 467 MB |======================================= | 60% 467 MB |======================================= | 60% 467 MB |======================================= | 60% 467 MB |======================================= | 60% 467 MB |======================================= | 60% 467 MB |======================================= | 60% 467 MB |======================================= | 60% 467 MB |======================================= | 60% 467 MB |======================================= | 60% 468 MB |======================================= | 60% 468 MB |======================================= | 60% 468 MB |======================================= | 60% 468 MB |======================================= | 60% 468 MB |======================================= | 60% 468 MB |======================================= | 60% 468 MB |======================================= | 60% 468 MB |======================================= | 60% 468 MB |======================================= | 60% 468 MB |======================================= | 60% 468 MB |======================================= | 60% 468 MB |======================================= | 60% 469 MB |======================================= | 60% 469 MB |======================================= | 60% 469 MB |======================================= | 60% 469 MB |======================================= | 60% 469 MB |======================================= | 60% 469 MB 
|======================================= | 60% 469 MB |======================================= | 60% 469 MB |======================================= | 60% 469 MB |======================================= | 60% 469 MB |======================================= | 60% 469 MB |======================================= | 60% 469 MB |======================================= | 60% 470 MB |======================================= | 60% 470 MB |======================================= | 60% 470 MB |======================================= | 60% 470 MB |======================================= | 61% 470 MB |======================================= | 61% 470 MB |======================================= | 61% 470 MB |======================================= | 61% 470 MB |======================================= | 61% 470 MB |======================================= | 61% 470 MB |======================================= | 61% 470 MB |======================================= | 61% 470 MB |======================================= | 61% 471 MB |======================================= | 61% 471 MB |======================================= | 61% 471 MB |======================================= | 61% 471 MB |======================================= | 61% 471 MB |======================================= | 61% 471 MB |======================================= | 61% 471 MB |======================================= | 61% 471 MB |======================================= | 61% 471 MB |======================================= | 61% 471 MB |======================================= | 61% 471 MB |======================================= | 61% 472 MB |======================================= | 61% 472 MB |======================================= | 61% 472 MB |======================================= | 61% 472 MB |======================================= | 61% 472 MB |======================================= | 61% 472 MB |======================================= | 61% 472 MB |======================================= | 61% 472 MB 
|=========================================              | 61% 472 MB ... 69% 534 MB  (download progress output trimmed)
|============================================= | 69% 534 MB |============================================= | 69% 535 MB |============================================= | 69% 535 MB |============================================= | 69% 535 MB |============================================= | 69% 535 MB |============================================= | 69% 535 MB |============================================= | 69% 535 MB |============================================= | 69% 535 MB |============================================= | 69% 535 MB |============================================= | 69% 535 MB |============================================= | 69% 535 MB |============================================= | 69% 535 MB |============================================= | 69% 535 MB |============================================= | 69% 536 MB |============================================= | 69% 536 MB |============================================= | 69% 536 MB |============================================= | 69% 536 MB |============================================= | 69% 536 MB |============================================= | 69% 536 MB |============================================= | 69% 536 MB |============================================= | 69% 536 MB |============================================= | 69% 536 MB |============================================= | 69% 536 MB |============================================= | 69% 536 MB |============================================= | 69% 536 MB |============================================= | 69% 537 MB |============================================= | 69% 537 MB |============================================= | 69% 537 MB |============================================= | 69% 537 MB |============================================= | 69% 537 MB |============================================= | 69% 537 MB |============================================= | 69% 537 MB |============================================= | 69% 537 MB 
|============================================= | 69% 537 MB |============================================= | 69% 537 MB |============================================= | 69% 537 MB |============================================= | 69% 538 MB |============================================= | 69% 538 MB |============================================= | 69% 538 MB |============================================= | 69% 538 MB |============================================= | 69% 538 MB |============================================= | 69% 538 MB |============================================= | 69% 538 MB |============================================= | 69% 538 MB |============================================= | 69% 538 MB |============================================= | 69% 538 MB |============================================= | 69% 538 MB |============================================= | 69% 538 MB |============================================= | 69% 539 MB |============================================= | 69% 539 MB |============================================= | 69% 539 MB |============================================= | 69% 539 MB |============================================= | 69% 539 MB |============================================= | 69% 539 MB |============================================= | 69% 539 MB |============================================= | 69% 539 MB |============================================= | 69% 539 MB |============================================= | 70% 539 MB |============================================= | 70% 539 MB |============================================= | 70% 539 MB |============================================= | 70% 540 MB |============================================= | 70% 540 MB |============================================= | 70% 540 MB |============================================= | 70% 540 MB |============================================= | 70% 540 MB |============================================= | 70% 540 MB 
|============================================= | 70% 540 MB |============================================= | 70% 540 MB |============================================= | 70% 540 MB |============================================= | 70% 540 MB |============================================= | 70% 540 MB |============================================= | 70% 540 MB |============================================= | 70% 541 MB |============================================= | 70% 541 MB |============================================= | 70% 541 MB |============================================= | 70% 541 MB |============================================= | 70% 541 MB |============================================= | 70% 541 MB |============================================= | 70% 541 MB |============================================= | 70% 541 MB |============================================= | 70% 541 MB |============================================= | 70% 541 MB |============================================= | 70% 541 MB |============================================= | 70% 541 MB |============================================= | 70% 542 MB |============================================= | 70% 542 MB |============================================= | 70% 542 MB |============================================= | 70% 542 MB |============================================= | 70% 542 MB |============================================= | 70% 542 MB |============================================= | 70% 542 MB |============================================= | 70% 542 MB |============================================= | 70% 542 MB |============================================= | 70% 542 MB |============================================= | 70% 542 MB |============================================= | 70% 543 MB |============================================= | 70% 543 MB |============================================= | 70% 543 MB |============================================= | 70% 543 MB 
|============================================= | 70% 543 MB |============================================= | 70% 543 MB |============================================= | 70% 543 MB |============================================= | 70% 543 MB |============================================= | 70% 543 MB |============================================= | 70% 543 MB |============================================= | 70% 543 MB |============================================= | 70% 543 MB |============================================= | 70% 544 MB |============================================= | 70% 544 MB |============================================= | 70% 544 MB |============================================= | 70% 544 MB |============================================= | 70% 544 MB |============================================= | 70% 544 MB |============================================= | 70% 544 MB |============================================= | 70% 544 MB |============================================= | 70% 544 MB |============================================= | 70% 544 MB |============================================= | 70% 544 MB |============================================= | 70% 544 MB |============================================= | 70% 545 MB |============================================= | 70% 545 MB |============================================= | 70% 545 MB |============================================= | 70% 545 MB |============================================= | 70% 545 MB |============================================= | 70% 545 MB |============================================= | 70% 545 MB |============================================= | 70% 545 MB |============================================== | 70% 545 MB |============================================== | 70% 545 MB |============================================== | 70% 545 MB |============================================== | 70% 546 MB |============================================== | 70% 546 MB 
|============================================== | 70% 546 MB |============================================== | 70% 546 MB |============================================== | 70% 546 MB |============================================== | 70% 546 MB |============================================== | 70% 546 MB |============================================== | 70% 546 MB |============================================== | 70% 546 MB |============================================== | 70% 546 MB |============================================== | 70% 546 MB |============================================== | 70% 546 MB |============================================== | 70% 547 MB |============================================== | 70% 547 MB |============================================== | 70% 547 MB |============================================== | 70% 547 MB |============================================== | 70% 547 MB |============================================== | 70% 547 MB |============================================== | 71% 547 MB |============================================== | 71% 547 MB |============================================== | 71% 547 MB |============================================== | 71% 547 MB |============================================== | 71% 547 MB |============================================== | 71% 547 MB |============================================== | 71% 548 MB |============================================== | 71% 548 MB |============================================== | 71% 548 MB |============================================== | 71% 548 MB |============================================== | 71% 548 MB |============================================== | 71% 548 MB |============================================== | 71% 548 MB |============================================== | 71% 548 MB |============================================== | 71% 548 MB |============================================== | 71% 548 MB |============================================== 
| 71% 548 MB |============================================== | 71% 548 MB |============================================== | 71% 549 MB |============================================== | 71% 549 MB |============================================== | 71% 549 MB |============================================== | 71% 549 MB |============================================== | 71% 549 MB |============================================== | 71% 549 MB |============================================== | 71% 549 MB |============================================== | 71% 549 MB |============================================== | 71% 549 MB |============================================== | 71% 549 MB |============================================== | 71% 549 MB |============================================== | 71% 550 MB |============================================== | 71% 550 MB |============================================== | 71% 550 MB |============================================== | 71% 550 MB |============================================== | 71% 550 MB |============================================== | 71% 550 MB |============================================== | 71% 550 MB |============================================== | 71% 550 MB |============================================== | 71% 550 MB |============================================== | 71% 550 MB |============================================== | 71% 550 MB |============================================== | 71% 550 MB |============================================== | 71% 551 MB |============================================== | 71% 551 MB |============================================== | 71% 551 MB |============================================== | 71% 551 MB |============================================== | 71% 551 MB |============================================== | 71% 551 MB |============================================== | 71% 551 MB |============================================== | 71% 551 MB 
|============================================== | 71% 551 MB |============================================== | 71% 551 MB |============================================== | 71% 551 MB |============================================== | 71% 551 MB |============================================== | 71% 552 MB |============================================== | 71% 552 MB |============================================== | 71% 552 MB |============================================== | 71% 552 MB |============================================== | 71% 552 MB |============================================== | 71% 552 MB |============================================== | 71% 552 MB |============================================== | 71% 552 MB |============================================== | 71% 552 MB |============================================== | 71% 552 MB |============================================== | 71% 552 MB |============================================== | 71% 553 MB |============================================== | 71% 553 MB |============================================== | 71% 553 MB |============================================== | 71% 553 MB |============================================== | 71% 553 MB |============================================== | 71% 553 MB |============================================== | 71% 553 MB |============================================== | 71% 553 MB |============================================== | 71% 553 MB |============================================== | 71% 553 MB |============================================== | 71% 553 MB |============================================== | 71% 553 MB |============================================== | 71% 554 MB |============================================== | 71% 554 MB |============================================== | 71% 554 MB |============================================== | 71% 554 MB |============================================== | 71% 554 MB |============================================== 
| 71% 554 MB |============================================== | 71% 554 MB |============================================== | 71% 554 MB |============================================== | 71% 554 MB |============================================== | 71% 554 MB |============================================== | 71% 554 MB |============================================== | 71% 554 MB |============================================== | 71% 555 MB |============================================== | 71% 555 MB |============================================== | 72% 555 MB |============================================== | 72% 555 MB |============================================== | 72% 555 MB |============================================== | 72% 555 MB |============================================== | 72% 555 MB |============================================== | 72% 555 MB |============================================== | 72% 555 MB |============================================== | 72% 555 MB |============================================== | 72% 555 MB |============================================== | 72% 555 MB |============================================== | 72% 556 MB |============================================== | 72% 556 MB |============================================== | 72% 556 MB |============================================== | 72% 556 MB |============================================== | 72% 556 MB |============================================== | 72% 556 MB |============================================== | 72% 556 MB |============================================== | 72% 556 MB |============================================== | 72% 556 MB |============================================== | 72% 556 MB |============================================== | 72% 556 MB |============================================== | 72% 557 MB |============================================== | 72% 557 MB |============================================== | 72% 557 MB 
|============================================== | 72% 557 MB |============================================== | 72% 557 MB |============================================== | 72% 557 MB |============================================== | 72% 557 MB |=============================================== | 72% 557 MB |=============================================== | 72% 557 MB |=============================================== | 72% 557 MB |=============================================== | 72% 557 MB |=============================================== | 72% 557 MB |=============================================== | 72% 558 MB |=============================================== | 72% 558 MB |=============================================== | 72% 558 MB |=============================================== | 72% 558 MB |=============================================== | 72% 558 MB |=============================================== | 72% 558 MB |=============================================== | 72% 558 MB |=============================================== | 72% 558 MB |=============================================== | 72% 558 MB |=============================================== | 72% 558 MB |=============================================== | 72% 558 MB |=============================================== | 72% 558 MB |=============================================== | 72% 559 MB |=============================================== | 72% 559 MB |=============================================== | 72% 559 MB |=============================================== | 72% 559 MB |=============================================== | 72% 559 MB |=============================================== | 72% 559 MB |=============================================== | 72% 559 MB |=============================================== | 72% 559 MB |=============================================== | 72% 559 MB |=============================================== | 72% 559 MB |=============================================== | 72% 559 MB 
|=============================================== | 72% 559 MB |=============================================== | 72% 560 MB |=============================================== | 72% 560 MB |=============================================== | 72% 560 MB |=============================================== | 72% 560 MB |=============================================== | 72% 560 MB |=============================================== | 72% 560 MB |=============================================== | 72% 560 MB |=============================================== | 72% 560 MB |=============================================== | 72% 560 MB |=============================================== | 72% 560 MB |=============================================== | 72% 560 MB |=============================================== | 72% 561 MB |=============================================== | 72% 561 MB |=============================================== | 72% 561 MB |=============================================== | 72% 561 MB |=============================================== | 72% 561 MB |=============================================== | 72% 561 MB |=============================================== | 72% 561 MB |=============================================== | 72% 561 MB |=============================================== | 72% 561 MB |=============================================== | 72% 561 MB |=============================================== | 72% 561 MB |=============================================== | 72% 561 MB |=============================================== | 72% 562 MB |=============================================== | 72% 562 MB |=============================================== | 72% 562 MB |=============================================== | 72% 562 MB |=============================================== | 72% 562 MB |=============================================== | 72% 562 MB |=============================================== | 72% 562 MB |=============================================== | 72% 562 MB 
|=============================================== | 72% 562 MB |=============================================== | 72% 562 MB |=============================================== | 72% 562 MB |=============================================== | 73% 562 MB |=============================================== | 73% 563 MB |=============================================== | 73% 563 MB |=============================================== | 73% 563 MB |=============================================== | 73% 563 MB |=============================================== | 73% 563 MB |=============================================== | 73% 563 MB |=============================================== | 73% 563 MB |=============================================== | 73% 563 MB |=============================================== | 73% 563 MB |=============================================== | 73% 563 MB |=============================================== | 73% 563 MB |=============================================== | 73% 563 MB |=============================================== | 73% 564 MB |=============================================== | 73% 564 MB |=============================================== | 73% 564 MB |=============================================== | 73% 564 MB |=============================================== | 73% 564 MB |=============================================== | 73% 564 MB |=============================================== | 73% 564 MB |=============================================== | 73% 564 MB |=============================================== | 73% 564 MB |=============================================== | 73% 564 MB |=============================================== | 73% 564 MB |=============================================== | 73% 564 MB |=============================================== | 73% 565 MB |=============================================== | 73% 565 MB |=============================================== | 73% 565 MB |=============================================== | 73% 565 MB 
|=============================================== | 73% 565 MB |=============================================== | 73% 565 MB |=============================================== | 73% 565 MB |=============================================== | 73% 565 MB |=============================================== | 73% 565 MB |=============================================== | 73% 565 MB |=============================================== | 73% 565 MB |=============================================== | 73% 566 MB |=============================================== | 73% 566 MB |=============================================== | 73% 566 MB |=============================================== | 73% 566 MB |=============================================== | 73% 566 MB |=============================================== | 73% 566 MB |=============================================== | 73% 566 MB |=============================================== | 73% 566 MB |=============================================== | 73% 566 MB |=============================================== | 73% 566 MB |=============================================== | 73% 566 MB |=============================================== | 73% 566 MB |=============================================== | 73% 567 MB |=============================================== | 73% 567 MB |=============================================== | 73% 567 MB |=============================================== | 73% 567 MB |=============================================== | 73% 567 MB |=============================================== | 73% 567 MB |=============================================== | 73% 567 MB |=============================================== | 73% 567 MB |=============================================== | 73% 567 MB |=============================================== | 73% 567 MB |=============================================== | 73% 567 MB |=============================================== | 73% 567 MB |=============================================== | 73% 568 MB 
|=============================================== | 73% 568 MB |=============================================== | 73% 568 MB |=============================================== | 73% 568 MB |=============================================== | 73% 568 MB |=============================================== | 73% 568 MB |=============================================== | 73% 568 MB |=============================================== | 73% 568 MB |=============================================== | 73% 568 MB |=============================================== | 73% 568 MB |=============================================== | 73% 568 MB |=============================================== | 73% 568 MB |=============================================== | 73% 569 MB |=============================================== | 73% 569 MB |=============================================== | 73% 569 MB |=============================================== | 73% 569 MB |=============================================== | 73% 569 MB |================================================ | 73% 569 MB |================================================ | 73% 569 MB |================================================ | 73% 569 MB |================================================ | 73% 569 MB |================================================ | 73% 569 MB |================================================ | 73% 569 MB |================================================ | 73% 569 MB |================================================ | 73% 570 MB |================================================ | 73% 570 MB |================================================ | 73% 570 MB |================================================ | 73% 570 MB |================================================ | 73% 570 MB |================================================ | 73% 570 MB |================================================ | 73% 570 MB |================================================ | 73% 570 MB |================================================ | 74% 570 MB 
|==================================================== | 80% 620 MB
[... repetitive download progress-bar output trimmed (74% -> 80%, ~570-620 MB) ...]
| 80% 621 MB |==================================================== | 80% 621 MB |==================================================== | 80% 621 MB |==================================================== | 80% 621 MB |==================================================== | 80% 621 MB |==================================================== | 80% 621 MB |==================================================== | 80% 621 MB |==================================================== | 80% 621 MB |==================================================== | 80% 621 MB |==================================================== | 80% 621 MB |==================================================== | 80% 621 MB |==================================================== | 80% 621 MB |==================================================== | 80% 622 MB |==================================================== | 80% 622 MB |==================================================== | 80% 622 MB |==================================================== | 80% 622 MB |==================================================== | 80% 622 MB |==================================================== | 80% 622 MB |==================================================== | 80% 622 MB |==================================================== | 80% 622 MB |==================================================== | 80% 622 MB |==================================================== | 80% 622 MB |==================================================== | 80% 622 MB |==================================================== | 80% 622 MB |==================================================== | 80% 623 MB |==================================================== | 80% 623 MB |==================================================== | 80% 623 MB |==================================================== | 80% 623 MB |==================================================== | 80% 623 MB |==================================================== | 80% 623 MB 
|==================================================== | 80% 623 MB |==================================================== | 80% 623 MB |==================================================== | 80% 623 MB |==================================================== | 80% 623 MB |==================================================== | 80% 623 MB |==================================================== | 80% 623 MB |==================================================== | 80% 624 MB |==================================================== | 80% 624 MB |==================================================== | 80% 624 MB |==================================================== | 80% 624 MB |==================================================== | 80% 624 MB |==================================================== | 80% 624 MB |==================================================== | 80% 624 MB |==================================================== | 80% 624 MB |==================================================== | 81% 624 MB |==================================================== | 81% 624 MB |==================================================== | 81% 624 MB |==================================================== | 81% 624 MB |==================================================== | 81% 625 MB |==================================================== | 81% 625 MB |==================================================== | 81% 625 MB |==================================================== | 81% 625 MB |==================================================== | 81% 625 MB |==================================================== | 81% 625 MB |==================================================== | 81% 625 MB |==================================================== | 81% 625 MB |==================================================== | 81% 625 MB |==================================================== | 81% 625 MB |==================================================== | 81% 625 MB |==================================================== | 
81% 625 MB |==================================================== | 81% 626 MB |==================================================== | 81% 626 MB |==================================================== | 81% 626 MB |==================================================== | 81% 626 MB |==================================================== | 81% 626 MB |==================================================== | 81% 626 MB |==================================================== | 81% 626 MB |==================================================== | 81% 626 MB |==================================================== | 81% 626 MB |==================================================== | 81% 626 MB |==================================================== | 81% 626 MB |==================================================== | 81% 626 MB |==================================================== | 81% 626 MB |==================================================== | 81% 627 MB |==================================================== | 81% 627 MB |==================================================== | 81% 627 MB |==================================================== | 81% 627 MB |==================================================== | 81% 627 MB |==================================================== | 81% 627 MB |==================================================== | 81% 627 MB |==================================================== | 81% 627 MB |==================================================== | 81% 627 MB |==================================================== | 81% 627 MB |==================================================== | 81% 627 MB |==================================================== | 81% 627 MB |==================================================== | 81% 628 MB |==================================================== | 81% 628 MB |==================================================== | 81% 628 MB |==================================================== | 81% 628 MB 
|==================================================== | 81% 628 MB |==================================================== | 81% 628 MB |==================================================== | 81% 628 MB |==================================================== | 81% 628 MB |==================================================== | 81% 628 MB |===================================================== | 81% 628 MB |===================================================== | 81% 628 MB |===================================================== | 81% 628 MB |===================================================== | 81% 629 MB |===================================================== | 81% 629 MB |===================================================== | 81% 629 MB |===================================================== | 81% 629 MB |===================================================== | 81% 629 MB |===================================================== | 81% 629 MB |===================================================== | 81% 629 MB |===================================================== | 81% 629 MB |===================================================== | 81% 629 MB |===================================================== | 81% 629 MB |===================================================== | 81% 629 MB |===================================================== | 81% 629 MB |===================================================== | 81% 630 MB |===================================================== | 81% 630 MB |===================================================== | 81% 630 MB |===================================================== | 81% 630 MB |===================================================== | 81% 630 MB |===================================================== | 81% 630 MB |===================================================== | 81% 630 MB |===================================================== | 81% 630 MB |===================================================== | 81% 630 MB 
|===================================================== | 81% 630 MB |===================================================== | 81% 630 MB |===================================================== | 81% 630 MB |===================================================== | 81% 631 MB |===================================================== | 81% 631 MB |===================================================== | 81% 631 MB |===================================================== | 81% 631 MB |===================================================== | 81% 631 MB |===================================================== | 81% 631 MB |===================================================== | 81% 631 MB |===================================================== | 81% 631 MB |===================================================== | 81% 631 MB |===================================================== | 81% 631 MB |===================================================== | 81% 631 MB |===================================================== | 81% 631 MB |===================================================== | 81% 632 MB |===================================================== | 81% 632 MB |===================================================== | 81% 632 MB |===================================================== | 81% 632 MB |===================================================== | 82% 632 MB |===================================================== | 82% 632 MB |===================================================== | 82% 632 MB |===================================================== | 82% 632 MB |===================================================== | 82% 632 MB |===================================================== | 82% 632 MB |===================================================== | 82% 632 MB |===================================================== | 82% 632 MB |===================================================== | 82% 633 MB |===================================================== | 82% 633 MB 
|===================================================== | 82% 633 MB |===================================================== | 82% 633 MB |===================================================== | 82% 633 MB |===================================================== | 82% 633 MB |===================================================== | 82% 633 MB |===================================================== | 82% 633 MB |===================================================== | 82% 633 MB |===================================================== | 82% 633 MB |===================================================== | 82% 633 MB |===================================================== | 82% 633 MB |===================================================== | 82% 634 MB |===================================================== | 82% 634 MB |===================================================== | 82% 634 MB |===================================================== | 82% 634 MB |===================================================== | 82% 634 MB |===================================================== | 82% 634 MB |===================================================== | 82% 634 MB |===================================================== | 82% 634 MB |===================================================== | 82% 634 MB |===================================================== | 82% 634 MB |===================================================== | 82% 634 MB |===================================================== | 82% 634 MB |===================================================== | 82% 635 MB |===================================================== | 82% 635 MB |===================================================== | 82% 635 MB |===================================================== | 82% 635 MB |===================================================== | 82% 635 MB |===================================================== | 82% 635 MB |===================================================== | 82% 635 MB 
|===================================================== | 82% 635 MB |===================================================== | 82% 635 MB |===================================================== | 82% 635 MB |===================================================== | 82% 635 MB |===================================================== | 82% 635 MB |===================================================== | 82% 636 MB |===================================================== | 82% 636 MB |===================================================== | 82% 636 MB |===================================================== | 82% 636 MB |===================================================== | 82% 636 MB |===================================================== | 82% 636 MB |===================================================== | 82% 636 MB |===================================================== | 82% 636 MB |===================================================== | 82% 636 MB |===================================================== | 82% 636 MB |===================================================== | 82% 636 MB |===================================================== | 82% 636 MB |===================================================== | 82% 637 MB |===================================================== | 82% 637 MB |===================================================== | 82% 637 MB |===================================================== | 82% 637 MB |===================================================== | 82% 637 MB |===================================================== | 82% 637 MB |===================================================== | 82% 637 MB |===================================================== | 82% 637 MB |===================================================== | 82% 637 MB |===================================================== | 82% 637 MB |===================================================== | 82% 637 MB |===================================================== | 82% 637 MB 
|===================================================== | 82% 638 MB |===================================================== | 82% 638 MB |===================================================== | 82% 638 MB |===================================================== | 82% 638 MB |===================================================== | 82% 638 MB |===================================================== | 82% 638 MB |===================================================== | 82% 638 MB |===================================================== | 82% 638 MB |===================================================== | 82% 638 MB |===================================================== | 82% 638 MB |===================================================== | 82% 638 MB |===================================================== | 82% 638 MB |===================================================== | 82% 639 MB |===================================================== | 82% 639 MB |===================================================== | 82% 639 MB |===================================================== | 82% 639 MB |===================================================== | 82% 639 MB |===================================================== | 82% 639 MB |===================================================== | 82% 639 MB |===================================================== | 82% 639 MB |===================================================== | 82% 639 MB |===================================================== | 82% 639 MB |===================================================== | 82% 639 MB |===================================================== | 82% 639 MB |===================================================== | 82% 639 MB |===================================================== | 83% 640 MB |===================================================== | 83% 640 MB |===================================================== | 83% 640 MB |===================================================== | 83% 640 MB 
|===================================================== | 83% 640 MB |===================================================== | 83% 640 MB |===================================================== | 83% 640 MB |====================================================== | 83% 640 MB |====================================================== | 83% 640 MB |====================================================== | 83% 640 MB |====================================================== | 83% 640 MB |====================================================== | 83% 640 MB |====================================================== | 83% 641 MB |====================================================== | 83% 641 MB |====================================================== | 83% 641 MB |====================================================== | 83% 641 MB |====================================================== | 83% 641 MB |====================================================== | 83% 641 MB |====================================================== | 83% 641 MB |====================================================== | 83% 641 MB |====================================================== | 83% 641 MB |====================================================== | 83% 641 MB |====================================================== | 83% 641 MB |====================================================== | 83% 641 MB |====================================================== | 83% 642 MB |====================================================== | 83% 642 MB |====================================================== | 83% 642 MB |====================================================== | 83% 642 MB |====================================================== | 83% 642 MB |====================================================== | 83% 642 MB |====================================================== | 83% 642 MB |====================================================== | 83% 642 MB |====================================================== | 83% 642 MB 
|====================================================== | 83% 642 MB |====================================================== | 83% 642 MB |====================================================== | 83% 642 MB |====================================================== | 83% 643 MB |====================================================== | 83% 643 MB |====================================================== | 83% 643 MB |====================================================== | 83% 643 MB |====================================================== | 83% 643 MB |====================================================== | 83% 643 MB |====================================================== | 83% 643 MB |====================================================== | 83% 643 MB |====================================================== | 83% 643 MB |====================================================== | 83% 643 MB |====================================================== | 83% 643 MB |====================================================== | 83% 643 MB |====================================================== | 83% 644 MB |====================================================== | 83% 644 MB |====================================================== | 83% 644 MB |====================================================== | 83% 644 MB |====================================================== | 83% 644 MB |====================================================== | 83% 644 MB |====================================================== | 83% 644 MB |====================================================== | 83% 644 MB |====================================================== | 83% 644 MB |====================================================== | 83% 644 MB |====================================================== | 83% 644 MB |====================================================== | 83% 644 MB |====================================================== | 83% 645 MB |====================================================== | 83% 645 
MB |====================================================== | 83% 645 MB |====================================================== | 83% 645 MB |====================================================== | 83% 645 MB |====================================================== | 83% 645 MB |====================================================== | 83% 645 MB |====================================================== | 83% 645 MB |====================================================== | 83% 645 MB |====================================================== | 83% 645 MB |====================================================== | 83% 645 MB |====================================================== | 83% 645 MB |====================================================== | 83% 646 MB |====================================================== | 83% 646 MB |====================================================== | 83% 646 MB |====================================================== | 83% 646 MB |====================================================== | 83% 646 MB |====================================================== | 83% 646 MB |====================================================== | 83% 646 MB |====================================================== | 83% 646 MB |====================================================== | 83% 646 MB |====================================================== | 83% 646 MB |====================================================== | 83% 646 MB |====================================================== | 83% 646 MB |====================================================== | 83% 647 MB |====================================================== | 83% 647 MB |====================================================== | 83% 647 MB |====================================================== | 83% 647 MB |====================================================== | 83% 647 MB |====================================================== | 83% 647 MB |====================================================== | 83% 
647 MB |====================================================== | 83% 647 MB |====================================================== | 83% 647 MB |====================================================== | 84% 647 MB |====================================================== | 84% 647 MB |====================================================== | 84% 647 MB |====================================================== | 84% 647 MB |====================================================== | 84% 648 MB |====================================================== | 84% 648 MB |====================================================== | 84% 648 MB |====================================================== | 84% 648 MB |====================================================== | 84% 648 MB |====================================================== | 84% 648 MB |====================================================== | 84% 648 MB |====================================================== | 84% 648 MB |====================================================== | 84% 648 MB |====================================================== | 84% 648 MB |====================================================== | 84% 648 MB |====================================================== | 84% 648 MB |====================================================== | 84% 649 MB |====================================================== | 84% 649 MB |====================================================== | 84% 649 MB |====================================================== | 84% 649 MB |====================================================== | 84% 649 MB |====================================================== | 84% 649 MB |====================================================== | 84% 649 MB |====================================================== | 84% 649 MB |====================================================== | 84% 649 MB |====================================================== | 84% 649 MB |====================================================== | 
84% 649 MB |====================================================== | 84% 649 MB |====================================================== | 84% 650 MB |====================================================== | 84% 650 MB |====================================================== | 84% 650 MB |====================================================== | 84% 650 MB |====================================================== | 84% 650 MB |====================================================== | 84% 650 MB |====================================================== | 84% 650 MB |====================================================== | 84% 650 MB |====================================================== | 84% 650 MB |====================================================== | 84% 650 MB |====================================================== | 84% 650 MB |====================================================== | 84% 650 MB |====================================================== | 84% 651 MB |====================================================== | 84% 651 MB |====================================================== | 84% 651 MB |====================================================== | 84% 651 MB |====================================================== | 84% 651 MB |====================================================== | 84% 651 MB |====================================================== | 84% 651 MB |====================================================== | 84% 651 MB |====================================================== | 84% 651 MB |====================================================== | 84% 651 MB |====================================================== | 84% 651 MB |====================================================== | 84% 651 MB |====================================================== | 84% 652 MB |====================================================== | 84% 652 MB |====================================================== | 84% 652 MB |====================================================== 
| 84% 652 MB |====================================================== | 84% 652 MB |====================================================== | 84% 652 MB |======================================================= | 84% 652 MB |======================================================= | 84% 652 MB |======================================================= | 84% 652 MB |======================================================= | 84% 652 MB |======================================================= | 84% 652 MB |======================================================= | 84% 652 MB |======================================================= | 84% 652 MB |======================================================= | 84% 653 MB |======================================================= | 84% 653 MB |======================================================= | 84% 653 MB |======================================================= | 84% 653 MB |======================================================= | 84% 653 MB |======================================================= | 84% 653 MB |======================================================= | 84% 653 MB |======================================================= | 84% 653 MB |======================================================= | 84% 653 MB |======================================================= | 84% 653 MB |======================================================= | 84% 653 MB |======================================================= | 84% 653 MB |======================================================= | 84% 654 MB |======================================================= | 84% 654 MB |======================================================= | 84% 654 MB |======================================================= | 84% 654 MB |======================================================= | 84% 654 MB |======================================================= | 84% 654 MB |======================================================= | 84% 654 MB 
|=========================================================== | 91% 703 MB |=========================================================== | 91% 703 MB |=========================================================== | 91% 703 MB |=========================================================== | 91% 703 MB |=========================================================== | 91% 703 MB |=========================================================== | 91% 703 MB |=========================================================== | 91% 703 MB |=========================================================== | 91% 703 MB |=========================================================== | 91% 703 MB |=========================================================== | 91% 703 MB |=========================================================== | 91% 703 MB |=========================================================== | 91% 704 MB |=========================================================== | 91% 704 MB |=========================================================== | 91% 704 MB |=========================================================== | 91% 704 MB |=========================================================== | 91% 704 MB |=========================================================== | 91% 704 MB |=========================================================== | 91% 704 MB |=========================================================== | 91% 704 MB |=========================================================== | 91% 704 MB |=========================================================== | 91% 704 MB |=========================================================== | 91% 704 MB |=========================================================== | 91% 705 MB |=========================================================== | 91% 705 MB |=========================================================== | 91% 705 MB |=========================================================== | 91% 705 MB |=========================================================== | 91% 705 MB 
|=========================================================== | 91% 705 MB |=========================================================== | 91% 705 MB |=========================================================== | 91% 705 MB |=========================================================== | 91% 705 MB |=========================================================== | 91% 705 MB |=========================================================== | 91% 705 MB |=========================================================== | 91% 705 MB |=========================================================== | 91% 706 MB |=========================================================== | 91% 706 MB |=========================================================== | 91% 706 MB |=========================================================== | 91% 706 MB |=========================================================== | 91% 706 MB |=========================================================== | 91% 706 MB |=========================================================== | 91% 706 MB |=========================================================== | 91% 706 MB |=========================================================== | 91% 706 MB |=========================================================== | 91% 706 MB |=========================================================== | 91% 706 MB |=========================================================== | 91% 706 MB |=========================================================== | 91% 707 MB |=========================================================== | 91% 707 MB |=========================================================== | 91% 707 MB |=========================================================== | 91% 707 MB |=========================================================== | 91% 707 MB |=========================================================== | 91% 707 MB |=========================================================== | 91% 707 MB |=========================================================== | 91% 707 MB 
|=========================================================== | 91% 707 MB |=========================================================== | 91% 707 MB |=========================================================== | 91% 707 MB |=========================================================== | 91% 707 MB |=========================================================== | 91% 708 MB |=========================================================== | 91% 708 MB |=========================================================== | 91% 708 MB |=========================================================== | 91% 708 MB |=========================================================== | 91% 708 MB |=========================================================== | 91% 708 MB |=========================================================== | 91% 708 MB |=========================================================== | 91% 708 MB |=========================================================== | 91% 708 MB |=========================================================== | 91% 708 MB |=========================================================== | 91% 708 MB |=========================================================== | 91% 709 MB |=========================================================== | 91% 709 MB |=========================================================== | 91% 709 MB |=========================================================== | 91% 709 MB |=========================================================== | 91% 709 MB |=========================================================== | 92% 709 MB |=========================================================== | 92% 709 MB |=========================================================== | 92% 709 MB |=========================================================== | 92% 709 MB |=========================================================== | 92% 709 MB |=========================================================== | 92% 709 MB |=========================================================== | 92% 709 MB 
|=========================================================== | 92% 710 MB |=========================================================== | 92% 710 MB |=========================================================== | 92% 710 MB |=========================================================== | 92% 710 MB |=========================================================== | 92% 710 MB |=========================================================== | 92% 710 MB |=========================================================== | 92% 710 MB |=========================================================== | 92% 710 MB |=========================================================== | 92% 710 MB |=========================================================== | 92% 710 MB |=========================================================== | 92% 710 MB |=========================================================== | 92% 710 MB |=========================================================== | 92% 711 MB |=========================================================== | 92% 711 MB |=========================================================== | 92% 711 MB |=========================================================== | 92% 711 MB |=========================================================== | 92% 711 MB |=========================================================== | 92% 711 MB |=========================================================== | 92% 711 MB |=========================================================== | 92% 711 MB |=========================================================== | 92% 711 MB |============================================================ | 92% 711 MB |============================================================ | 92% 711 MB |============================================================ | 92% 711 MB |============================================================ | 92% 712 MB |============================================================ | 92% 712 MB |============================================================ | 92% 
712 MB |============================================================ | 92% 712 MB |============================================================ | 92% 712 MB |============================================================ | 92% 712 MB |============================================================ | 92% 712 MB |============================================================ | 92% 712 MB |============================================================ | 92% 712 MB |============================================================ | 92% 712 MB |============================================================ | 92% 712 MB |============================================================ | 92% 713 MB |============================================================ | 92% 713 MB |============================================================ | 92% 713 MB |============================================================ | 92% 713 MB |============================================================ | 92% 713 MB |============================================================ | 92% 713 MB |============================================================ | 92% 713 MB |============================================================ | 92% 713 MB |============================================================ | 92% 713 MB |============================================================ | 92% 713 MB |============================================================ | 92% 713 MB |============================================================ | 92% 713 MB |============================================================ | 92% 714 MB |============================================================ | 92% 714 MB |============================================================ | 92% 714 MB |============================================================ | 92% 714 MB |============================================================ | 92% 714 MB |============================================================ | 92% 714 MB 
|============================================================ | 92% 714 MB |============================================================ | 92% 714 MB |============================================================ | 92% 714 MB |============================================================ | 92% 714 MB |============================================================ | 92% 714 MB |============================================================ | 92% 714 MB |============================================================ | 92% 715 MB |============================================================ | 92% 715 MB |============================================================ | 92% 715 MB |============================================================ | 92% 715 MB |============================================================ | 92% 715 MB |============================================================ | 92% 715 MB |============================================================ | 92% 715 MB |============================================================ | 92% 715 MB |============================================================ | 92% 715 MB |============================================================ | 92% 715 MB |============================================================ | 92% 715 MB |============================================================ | 92% 715 MB |============================================================ | 92% 716 MB |============================================================ | 92% 716 MB |============================================================ | 92% 716 MB |============================================================ | 92% 716 MB |============================================================ | 92% 716 MB |============================================================ | 92% 716 MB |============================================================ | 92% 716 MB |============================================================ | 92% 716 MB 
|============================================================ | 92% 716 MB |============================================================ | 92% 716 MB |============================================================ | 92% 716 MB |============================================================ | 92% 717 MB |============================================================ | 92% 717 MB |============================================================ | 93% 717 MB |============================================================ | 93% 717 MB |============================================================ | 93% 717 MB |============================================================ | 93% 717 MB |============================================================ | 93% 717 MB |============================================================ | 93% 717 MB |============================================================ | 93% 717 MB |============================================================ | 93% 717 MB |============================================================ | 93% 717 MB |============================================================ | 93% 717 MB |============================================================ | 93% 718 MB |============================================================ | 93% 718 MB |============================================================ | 93% 718 MB |============================================================ | 93% 718 MB |============================================================ | 93% 718 MB |============================================================ | 93% 718 MB |============================================================ | 93% 718 MB |============================================================ | 93% 718 MB |============================================================ | 93% 718 MB |============================================================ | 93% 718 MB |============================================================ | 93% 718 MB 
|============================================================ | 93% 718 MB |============================================================ | 93% 719 MB |============================================================ | 93% 719 MB |============================================================ | 93% 719 MB |============================================================ | 93% 719 MB |============================================================ | 93% 719 MB |============================================================ | 93% 719 MB |============================================================ | 93% 719 MB |============================================================ | 93% 719 MB |============================================================ | 93% 719 MB |============================================================ | 93% 719 MB |============================================================ | 93% 719 MB |============================================================ | 93% 719 MB |============================================================ | 93% 720 MB |============================================================ | 93% 720 MB |============================================================ | 93% 720 MB |============================================================ | 93% 720 MB |============================================================ | 93% 720 MB |============================================================ | 93% 720 MB |============================================================ | 93% 720 MB |============================================================ | 93% 720 MB |============================================================ | 93% 720 MB |============================================================ | 93% 720 MB |============================================================ | 93% 720 MB |============================================================ | 93% 720 MB |============================================================ | 93% 721 MB 
|============================================================ | 93% 721 MB |============================================================ | 93% 721 MB |============================================================ | 93% 721 MB |============================================================ | 93% 721 MB |============================================================ | 93% 721 MB |============================================================ | 93% 721 MB |============================================================ | 93% 721 MB |============================================================ | 93% 721 MB |============================================================ | 93% 721 MB |============================================================ | 93% 721 MB |============================================================ | 93% 721 MB |============================================================ | 93% 722 MB |============================================================ | 93% 722 MB |============================================================ | 93% 722 MB |============================================================ | 93% 722 MB |============================================================ | 93% 722 MB |============================================================ | 93% 722 MB |============================================================ | 93% 722 MB |============================================================ | 93% 722 MB |============================================================ | 93% 722 MB |============================================================ | 93% 722 MB |============================================================ | 93% 722 MB |============================================================ | 93% 722 MB |============================================================ | 93% 723 MB |============================================================ | 93% 723 MB |============================================================ | 93% 723 MB 
|============================================================ | 93% 723 MB |============================================================ | 93% 723 MB |============================================================ | 93% 723 MB |============================================================ | 93% 723 MB |============================================================= | 93% 723 MB |============================================================= | 93% 723 MB |============================================================= | 93% 723 MB |============================================================= | 93% 723 MB |============================================================= | 93% 724 MB |============================================================= | 93% 724 MB |============================================================= | 93% 724 MB |============================================================= | 93% 724 MB |============================================================= | 93% 724 MB |============================================================= | 93% 724 MB |============================================================= | 93% 724 MB |============================================================= | 93% 724 MB |============================================================= | 93% 724 MB |============================================================= | 93% 724 MB |============================================================= | 94% 724 MB |============================================================= | 94% 724 MB |============================================================= | 94% 725 MB |============================================================= | 94% 725 MB |============================================================= | 94% 725 MB |============================================================= | 94% 725 MB |============================================================= | 94% 725 MB |============================================================= | 94% 725 MB 
|============================================================= | 94% 725 MB |============================================================= | 94% 725 MB |============================================================= | 94% 725 MB |============================================================= | 94% 725 MB |============================================================= | 94% 725 MB |============================================================= | 94% 725 MB |============================================================= | 94% 726 MB |============================================================= | 94% 726 MB |============================================================= | 94% 726 MB |============================================================= | 94% 726 MB |============================================================= | 94% 726 MB |============================================================= | 94% 726 MB |============================================================= | 94% 726 MB |============================================================= | 94% 726 MB |============================================================= | 94% 726 MB |============================================================= | 94% 726 MB |============================================================= | 94% 726 MB |============================================================= | 94% 726 MB |============================================================= | 94% 727 MB |============================================================= | 94% 727 MB |============================================================= | 94% 727 MB |============================================================= | 94% 727 MB |============================================================= | 94% 727 MB |============================================================= | 94% 727 MB |============================================================= | 94% 727 MB |============================================================= | 94% 727 MB 
|============================================================= | 94% 727 MB |============================================================= | 94% 727 MB |============================================================= | 94% 727 MB |============================================================= | 94% 727 MB |============================================================= | 94% 728 MB |============================================================= | 94% 728 MB |============================================================= | 94% 728 MB |============================================================= | 94% 728 MB |============================================================= | 94% 728 MB |============================================================= | 94% 728 MB |============================================================= | 94% 728 MB |============================================================= | 94% 728 MB |============================================================= | 94% 728 MB |============================================================= | 94% 728 MB |============================================================= | 94% 728 MB |============================================================= | 94% 728 MB |============================================================= | 94% 729 MB |============================================================= | 94% 729 MB |============================================================= | 94% 729 MB |============================================================= | 94% 729 MB |============================================================= | 94% 729 MB |============================================================= | 94% 729 MB |============================================================= | 94% 729 MB |============================================================= | 94% 729 MB |============================================================= | 94% 729 MB |============================================================= | 94% 729 MB 
|============================================================= | 94% 729 MB |============================================================= | 94% 729 MB |============================================================= | 94% 730 MB |============================================================= | 94% 730 MB |============================================================= | 94% 730 MB |============================================================= | 94% 730 MB |============================================================= | 94% 730 MB |============================================================= | 94% 730 MB |============================================================= | 94% 730 MB |============================================================= | 94% 730 MB |============================================================= | 94% 730 MB |============================================================= | 94% 730 MB |============================================================= | 94% 730 MB |============================================================= | 94% 731 MB |============================================================= | 94% 731 MB |============================================================= | 94% 731 MB |============================================================= | 94% 731 MB |============================================================= | 94% 731 MB |============================================================= | 94% 731 MB |============================================================= | 94% 731 MB |============================================================= | 94% 731 MB |============================================================= | 94% 731 MB |============================================================= | 94% 731 MB |============================================================= | 94% 731 MB |============================================================= | 94% 731 MB |============================================================= | 94% 732 MB 
|=================================================================| 100% 771 MB
+
+#+begin_src R :results output :session :exports both
+df4.mpi %>%
+    filter(Iteration == 3) %>%
+    filter(Value != "timste") %>%
+    filter(grepl("MPI_", Value)) %>%
+    group_by(Value) %>%
+    summarize(Mean = sum(End-Start), N=n())
+#+end_src
+
+#+RESULTS:
+#+begin_example
+# A tibble: 8 x 3
+          Value       Mean       N
+1 MPI_Allreduce 356.860608  304794
+2   MPI_Barrier  60.943087    1575
+3 MPI_Comm_rank   0.000148     315
+4 MPI_Comm_size   0.000251     315
+5     MPI_Irecv   2.528487 1023948
+6     MPI_Isend  12.640539 1023948
+7  MPI_Sendrecv 329.724298   47880
+8   MPI_Waitall  46.962233   92814
+#+end_example
+
+***** Plot
+
+#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 1700 :height 800 :session
+df4.mpi %>%
+    filter(Rank != 0) %>%
+    filter(Iteration == 7) %>%
+    filter(End-Start > 0.1) -> tx;
+tx %>%
+    group_by(Iteration) %>%
+    mutate(End = End - min(Start),
+           Start = Start - min(Start)) %>%
+    ungroup() %>%
+    ggplot() +
+    facet_wrap(~Iteration) +
+    geom_rect(aes(fill=Value,
+                  xmin=Start,
+                  xmax=End,
+                  ymin=Rank,
+                  ymax=Rank+0.9));
+#+end_src
+
+#+RESULTS:
+[[file:/tmp/babel-24763lPR/figure24763btH.png]]
+
+#+begin_src R :results output :session :exports both +tx %>% group_by(Rank, Value, Iteration) %>% summarize(N=n()) +#+end_src + +#+RESULTS: +#+begin_example +# A tibble: 1,407 x 4 +# Groups: Rank, Value [?] + Rank Value Iteration N + + 1 0 solver 150 4 + 2 1 nsi_matrix 150 1 + 3 1 solver 150 4 + 4 2 nsi_matrix 150 1 + 5 2 solver 150 4 + 6 3 nsi_matrix 150 1 + 7 3 solver 150 4 + 8 4 nsi_matrix 150 1 + 9 4 solver 150 4 +10 5 nsi_matrix 150 1 +# ... with 1,397 more rows +#+end_example + +***** Calculate Comm / Comp for this prep trace + +#+begin_src R :results output :session :exports both +df4.mpi %>% + group_by(Rank, Iteration, Platform, Nodes, NP, Partitioning, EID) %>% + filter(grepl("MPI_", Value)) %>% + summarize(N=n(), S=min(Start), E=max(End), Comm=sum(End-Start), Comp=(E-S)-Comm) -> df4.sum; +#+end_src + +#+RESULTS: + +#+begin_src R :results output :session :exports both +df4.sum +#+end_src + +#+RESULTS: +#+begin_example +# A tibble: 630 x 12 +# Groups: Rank, Iteration, Platform, Nodes, NP, Partitioning [?] + Rank Iteration Platform Nodes NP Partitioning EID N S + + 1 0 1 grisou 4 63 sfc prep-4 5454 129.7846 + 2 0 2 grisou 4 63 sfc prep-4 4752 174.9951 + 3 0 3 grisou 4 63 sfc prep-4 4873 219.0275 + 4 0 4 grisou 4 63 sfc prep-4 4697 263.0684 + 5 0 5 grisou 4 63 sfc prep-4 4660 306.4143 + 6 0 6 grisou 4 63 sfc prep-4 4544 349.4879 + 7 0 7 grisou 4 63 sfc prep-4 4448 391.9349 + 8 0 8 grisou 4 63 sfc prep-4 4324 433.9185 + 9 0 9 grisou 4 63 sfc prep-4 4263 475.6839 +10 0 10 grisou 4 63 sfc prep-4 3579 517.5147 +# ... 
with 620 more rows, and 3 more variables: E , Comm , Comp +#+end_example + +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 500 :height 400 :session +library(ggplot2); +library(tidyr); +df4.sum %>% + filter(Rank != 0) %>% + mutate(Total = Comm+Comp) %>% + select(-N, -S, -E) %>% + gather(Variable, Value, -Rank, -Iteration, -Platform, -Nodes, -NP, -Partitioning, -EID) %>% + ggplot(aes(x=Iteration, y=Value, group=Iteration)) + + theme_bw(base_size=12) + + ylim(0,NA) + + geom_boxplot() + + theme(legend.position="top") + + facet_wrap(~Variable) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-24763lPR/figure24763CTO.png]] + +** 50-node grisou :EXP5: +** 7-node grimoire with *Infiniband* :deprecated:EXP6: +*** First attempt +Experiment did not work because Infiniband was not chosen by MPI. +The message tells me this: +#+BEGIN_EXAMPLE +No OpenFabrics connection schemes reported that they were able to be +used on a specific port. As such, the openib BTL (OpenFabrics +support) will be disabled for this port. + + Local host: grimoire-4 + Local device: mlx4_0 + Local port: 1 + CPCs attempted: udcm +#+END_EXAMPLE +*** Second attempt +JobId 1227284 @ nancy. 
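As a sanity check of the field extraction done by the conversion pipeline below, here is a toy run on a single made-up =pj_dump= State line (the exact pj_dump field layout is assumed, not verified): it keeps the container, start, end and value fields, then strips spaces and the "MPI Rank" prefix down to the bare rank number.

```shell
# Made-up pj_dump "State" line: type, container, state type, start, end,
# duration, imbrication, value (layout assumed; real pj_dump may differ).
line="State, MPI Rank 0, STATE, 0.123, 0.456, 0.333, 0, MPI_Allreduce"
# Keep container/start/end/value, drop spaces, strip the "MPIRank" prefix:
echo "$line" | grep ^State | cut -d, -f2,4,5,8 | sed -e "s/ //g" -e "s/MPIRank//"
# -> 0,0.123,0.456,MPI_Allreduce
```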
+**** Convert to CSV +I need the following on the $PATH: +- akypuera's otf22paje (compiled using OTF2 libraries of ScoreP 3.0) +- pajeng's =pj_dump= + +#+begin_src shell :results output +export PATH=$PATH:/home/lschnorr/akypuera/b/ +export PATH=$PATH:/home/lschnorr/pajeng/b/ + +convert() { + otf2=$1 + pushd $(dirname $otf2) + otf22paje traces.otf2 | pj_dump | grep ^State | cut -d, -f2,4,5,8 | sed -e "s/ //g" -e "s/MPIRank//" | gzip > traces.csv.gz + popd +} + +EDIR=exp_6_grimoire_7 +# Files already converted (whose CSV size is not zero) +EXISTINGFILE=$(tempfile) +OTF2FILE=$(tempfile) +find $EDIR -not -empty | grep csv$ | sed -e "s/.csv.gz$//" | sort > $EXISTINGFILE +find $EDIR | grep otf2 | sed -e "s/.otf2$//" | sort > $OTF2FILE + +for otf2 in $(comm -3 $OTF2FILE $EXISTINGFILE | sed "s/$/.otf2/"); do + echo $otf2 + convert $otf2 +done +#+end_src +**** Post-processing in R +#+begin_src R :results output :session :exports both :tangle do-exp6.R :tangle-mode (identity #o755) +#!/usr/bin/Rscript +library(readr); +library(dplyr); +alya_scorep_trace_read <- function(filename) +{ + meta <- unlist(strsplit(unlist(strsplit(filename, "/"))[3], "_")); + read_csv(filename, + col_names=c("Rank", "Start", "End", "Value"), + progress=TRUE) %>% + # Transform Value to factor + mutate(Value = as.factor(Value)) %>% + # Detect begin and end of iterations + mutate(Iteration = case_when( + grepl("timste", .$Value) ~ 1, + grepl("endste", .$Value) ~ -1, + TRUE ~ 0)) %>% + group_by(Rank) %>% + # Create a logical to detect observations within iterations + mutate(Iteration = as.logical(cumsum(Iteration))) %>% + # Get only observations that belong to some iteration + filter(Iteration == TRUE) %>% + ungroup() %>% + # Create the iteration by cumsum + mutate(Iteration = case_when( + grepl("timste", .$Value) ~ 1, + TRUE ~ 0)) %>% + group_by(Rank) %>% + mutate(Iteration = as.integer(cumsum(Iteration))) %>% + ungroup() %>% + # Define metadata + mutate(EID = meta[2], + Platform = meta[3], + Nodes = meta[4], + NP = 
meta[5], + Partitioning = meta[6]); +} + +alya_scorep_trace_iterations <- function(filename) +{ + alya_scorep_trace_read(filename) %>% + group_by(Rank, Iteration, Platform, Nodes, NP, Partitioning, EID) %>% + filter(grepl("MPI_", Value)) %>% + summarize(N=n(), S=min(Start), E=max(End), Comm=sum(End-Start), Comp=(E-S)-Comm); +} + +args = commandArgs(trailingOnly=TRUE) +print(args); +df <- do.call("rbind", lapply(args, function(x) { alya_scorep_trace_iterations(x) })); +write.csv(df, "exp6_iterations.csv"); +#+end_src +**** Read in R :ATTACH: +:PROPERTIES: +:Attachments: exp6_iterations.csv.gz +:ID: 4e48b93e-3322-4cc1-9ba9-3145d1298878 +:END: + +***** Reading +#+begin_src R :results output :session :exports both +library(readr); +library(dplyr); +df <- read_csv("data/4e/48b93e-3322-4cc1-9ba9-3145d1298878/exp6_iterations.csv.gz") %>% + select(-X1) %>% + rename(Start = S, End = E) %>% + mutate(Duration = End - Start) %>% + mutate(Platform = as.factor(Platform), + Partitioning = as.factor(Partitioning), + Nodes = as.factor(Nodes)); +df %>% summary; +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + X1 = col_integer(), + Rank = col_integer(), + Iteration = col_integer(), + Platform = col_character(), + Nodes = col_integer(), + NP = col_integer(), + Partitioning = col_character(), + EID = col_integer(), + N = col_integer(), + S = col_double(), + E = col_double(), + Comm = col_double(), + Comp = col_double() +) +Warning message: +Missing column names filled in: 'X1' [1] + Rank Iteration Platform Nodes NP + Min. : 0.00 Min. : 1.0 grimoire:4180 7:4180 Min. : 97 + 1st Qu.: 26.00 1st Qu.: 3.0 1st Qu.: 97 + Median : 52.00 Median : 5.5 Median :112 + Mean : 52.02 Mean : 5.5 Mean :105 + 3rd Qu.: 78.00 3rd Qu.: 8.0 3rd Qu.:112 + Max. :111.00 Max. :10.0 Max. :112 + Partitioning EID N Start End + metis:2090 Min. :6 Min. : 3579 Min. :122.2 Min. 
:149.6 + sfc :2090 1st Qu.:6 1st Qu.:16353 1st Qu.:176.8 1st Qu.:200.4 + Median :6 Median :23759 Median :243.1 Median :267.8 + Mean :6 Mean :27314 Mean :240.3 Mean :264.6 + 3rd Qu.:6 3rd Qu.:36328 3rd Qu.:295.3 3rd Qu.:319.3 + Max. :6 Max. :85081 Max. :372.1 Max. :392.9 + Comm Comp Duration + Min. : 0.5519 Min. : 0.0178 Min. :18.05 + 1st Qu.: 8.4911 1st Qu.:12.9418 1st Qu.:22.99 + Median :10.0122 Median :14.4247 Median :24.01 + Mean : 9.7437 Mean :14.5349 Mean :24.28 + 3rd Qu.:11.0630 3rd Qu.:15.8340 3rd Qu.:26.13 + Max. :28.4181 Max. :27.2230 Max. :28.45 +#+end_example +***** Plot +#+begin_src R :results output :session :exports both +df %>% select(-Start, -End, -N, -Duration) %>% mutate(Total = Comm+Comp) +#+end_src + +#+RESULTS: +#+begin_example +# A tibble: 4,180 × 10 + Rank Iteration Platform Nodes NP Partitioning EID Comm Comp + +1 0 1 grimoire 7 97 metis 6 28.41812 0.028814 +2 0 2 grimoire 7 97 metis 6 27.58274 0.024577 +3 0 3 grimoire 7 97 metis 6 27.34581 0.024256 +4 0 4 grimoire 7 97 metis 6 27.04863 0.023692 +5 0 5 grimoire 7 97 metis 6 26.84089 0.023234 +6 0 6 grimoire 7 97 metis 6 26.69917 0.022638 +7 0 7 grimoire 7 97 metis 6 26.48485 0.022195 +8 0 8 grimoire 7 97 metis 6 26.27733 0.021815 +9 0 9 grimoire 7 97 metis 6 26.06916 0.021781 +10 0 10 grimoire 7 97 metis 6 21.13847 0.017997 +# ... 
with 4,170 more rows, and 1 more variables: Total +#+end_example + +#+begin_src R :results output :session :exports both +library(tidyr); +df %>% + select(-Start, -End, -N, -Duration) %>% + mutate(Total = Comm+Comp) %>% + gather(Variable, Value, -Rank, -Iteration, -Platform, -Nodes, -NP, -Partitioning, -EID); +#+end_src + +#+RESULTS: +#+begin_example +# A tibble: 12,540 × 9 + Rank Iteration Platform Nodes NP Partitioning EID Variable Value + +1 0 1 grimoire 7 97 metis 6 Comm 28.41812 +2 0 2 grimoire 7 97 metis 6 Comm 27.58274 +3 0 3 grimoire 7 97 metis 6 Comm 27.34581 +4 0 4 grimoire 7 97 metis 6 Comm 27.04863 +5 0 5 grimoire 7 97 metis 6 Comm 26.84089 +6 0 6 grimoire 7 97 metis 6 Comm 26.69917 +7 0 7 grimoire 7 97 metis 6 Comm 26.48485 +8 0 8 grimoire 7 97 metis 6 Comm 26.27733 +9 0 9 grimoire 7 97 metis 6 Comm 26.06916 +10 0 10 grimoire 7 97 metis 6 Comm 21.13847 +# ... with 12,530 more rows +#+end_example + +#+begin_src R :results output :session :exports both +df %>% + select(-Start, -End, -N, -Duration) %>% + mutate(Total = Comm+Comp) %>% + gather(Variable, Value, -Rank, -Iteration, -Platform, -Nodes, -NP, -Partitioning, -EID) %>% + group_by (Iteration, Platform, Nodes, NP, Partitioning, EID, Variable) %>% + summarize(Mean = mean(Value), SE = 3*sd(Value)/sqrt(n()), N=n()) %>% + # to check if it worked + filter(NP == 97, Partitioning == "metis") +#+end_src + +#+RESULTS: +#+begin_example +Source: local data frame [30 x 10] +Groups: Iteration, Platform, Nodes, NP, Partitioning, EID [10] + + Iteration Platform Nodes NP Partitioning EID Variable Mean + +1 1 grimoire 7 97 metis 6 Comm 11.77403 +2 1 grimoire 7 97 metis 6 Comp 16.53532 +3 1 grimoire 7 97 metis 6 Total 28.30936 +4 2 grimoire 7 97 metis 6 Comm 11.32483 +5 2 grimoire 7 97 metis 6 Comp 16.14359 +6 2 grimoire 7 97 metis 6 Total 27.46842 +7 3 grimoire 7 97 metis 6 Comm 11.09919 +8 3 grimoire 7 97 metis 6 Comp 16.13198 +9 3 grimoire 7 97 metis 6 Total 27.23118 +10 4 grimoire 7 97 metis 6 Comm 11.01637 +# ... 
with 20 more rows, and 2 more variables: SE , N +#+end_example + + + +#+begin_src R :results output graphics :file img/exp_6-97np-metis-sfc.pdf :exports both :width 6 :height 3 :session +library(ggplot2); +library(tidyr); +df %>% + select(-Start, -End, -N, -Duration) %>% + mutate(Total = Comm+Comp, Iteration = as.integer(Iteration)) %>% + gather(Variable, Value, -Rank, -Iteration, -Platform, -Nodes, -NP, -Partitioning, -EID) %>% + group_by (Iteration, Platform, Nodes, NP, Partitioning, EID, Variable) %>% + summarize(Mean = mean(Value), SE = 3*sd(Value)/sqrt(n()), N=n()) %>% + ggplot(aes(x=Iteration, y=Mean, ymin=Mean-SE, ymax=Mean+SE, + color=Partitioning)) + + theme_bw(base_size=14) + + ylim(0,NA) + + xlim(0,10) + + scale_x_continuous(breaks=seq(1:10)) + + xlab("Iteration [count]") + + ylab("Time [seconds]") + + geom_point() + + geom_errorbar(width=.3) + + geom_line () + + theme(legend.position=c(.78,.7), + panel.margin = unit(.08, "lines"), + legend.spacing = unit(.0, "lines"), + legend.margin=margin(b = -.4, unit='cm'), + plot.margin = unit(x = c(0, 0, 0, 0), units = "mm") + ) + + facet_grid(NP~Variable) +#+end_src + +#+RESULTS: +[[file:img/exp_6-97np-metis-sfc.pdf]] +***** With Arnaud +#+begin_src R :results output graphics :file img/exp_6-97np_112np-metis-sfc-LB.pdf :exports both :width 6 :height 3 :session +library(tidyr); +library(ggplot2); +df %>% + select(-Start, -End, -N, -Duration) %>% + mutate(Total = Comm+Comp) %>% + gather(Variable, Value, -Rank, -Iteration, -Platform, -Nodes, -NP, -Partitioning, -EID) %>% + filter(Variable == "Comp", Rank != 0, Iteration != 11) %>% + ggplot(aes(x=as.factor(Iteration), y=Value, color=as.factor(Rank))) + + theme_bw(base_size=14) + + ylim(0,30) + + ylab("Computation Time [seconds]") + + xlab("Iteration [count]") + +# ggtitle("Each color is an MPI rank.") + + geom_point(alpha=.3) + + geom_line (aes(group=as.factor(Rank):as.factor(Partitioning))) + + theme(legend.position="none", + panel.margin = unit(.08, "lines"), + 
legend.spacing = unit(.0, "lines"), + legend.margin=margin(b = -.4, unit='cm'), + plot.margin = unit(x = c(0, 0, 0, 0), units = "mm") + ) + + facet_grid(NP~Partitioning) +#+end_src + +#+RESULTS: +[[file:img/exp_6-97np_112np-metis-sfc-LB.pdf]] + +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session +df %>% + select(-Start, -End, -N, -Duration) %>% + mutate(Total = Comm+Comp) %>% + gather(Variable, Value, -Rank, -Iteration, -Platform, -Nodes, -NP, -Partitioning, -EID) %>% + filter(Variable == "Comp", Rank != 0) %>% + filter(Iteration == 10) %>% + ggplot(aes(x=Value, fill=Partitioning)) + theme_bw(base_size=12) + geom_histogram() + facet_grid(Partitioning~NP); +#+end_src + +#+RESULTS: +[[file:/tmp/babel-1875bQK/figure1875FAD.png]] + + +Status: +- Load is not equally balanced among processes (considering computation only) + - Iteration time is dominated by the slowest rank (numbers) + - Such ranks also have the highest number of points, as shown in + fensap.par.log, so we could have guessed in advance that the load + balance would be a problem. + - Actually, the load balancing is bad for both SFC and METIS, + although it is slightly better (the max load is lower) for SFC. + - Communications are, as expected, more important for SFC than for + METIS, but when the communication network is good (e.g., ib) this + becomes a problem only at large scale (probably above 512 + nodes, as Ricard pointed out). However, when using a commodity + network (e.g., eth0), communications become problematic at a smaller + scale (above roughly 100 processes). Therefore, there is no need to go + to large scale to study the load-balancing problem: it is already very + bad at medium scale. +Questions: +- Do you know how to control the partitioning (of Metis and SFC), + e.g., to disable the communication cost, so that we get a "perfect" load + balance? +- Do you know how to get the mapping between the average of mesh + points coordinates and ranks?
This could help us understand why + Metis and SFC have difficulty computing a good load balance. A + question we are asking ourselves is whether the heavily loaded + processes in both METIS and SFC correspond to the same region of the + mesh or not. (ref the fig) +** 2-node and 4-node grimoire with Infiniband :EXP8: +*** v1 :ATTACH: +:PROPERTIES: +:Attachments: exp_8-v1_grimoire_2.csv.gz +:ID: faffa111-90b6-4e60-97f0-39475468994c +:END: + +#+begin_src R :results output :session :exports both +library(readr); +library(dplyr); +df <- read_csv("data/fa/ffa111-90b6-4e60-97f0-39475468994c/exp_8-v1_grimoire_2.csv.gz") %>% + select(-X1) %>% + rename(Start = S, End = E) %>% + mutate(Duration = End - Start) %>% + mutate(Platform = as.factor(Platform), + Partitioning = as.factor(Partitioning), + Nodes = as.factor(Nodes)); +df %>% summary; +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + X1 = col_integer(), + Rank = col_integer(), + Iteration = col_integer(), + Platform = col_character(), + Nodes = col_integer(), + NP = col_integer(), + Partitioning = col_character(), + EID = col_character(), + Infiniband = col_logical(), + N = col_integer(), + S = col_double(), + E = col_double(), + Comm = col_double(), + Comp = col_double() +) +Warning message: +Missing column names filled in: 'X1' [1] + Rank Iteration Platform Nodes NP + Min. : 0.00 Min. : 1.0 grimoire:640 2:640 Min. :32 + 1st Qu.: 7.75 1st Qu.: 3.0 1st Qu.:32 + Median :15.50 Median : 5.5 Median :32 + Mean :15.50 Mean : 5.5 Mean :32 + 3rd Qu.:23.25 3rd Qu.: 8.0 3rd Qu.:32 + Max. :31.00 Max. :10.0 Max. :32 + Partitioning EID Infiniband N + metis:640 Length:640 Mode :logical Min. : 3579 + Class :character FALSE:320 1st Qu.:11028 + Mode :character TRUE :320 Median :13216 + NA's :0 Mean :13343 + 3rd Qu.:15539 + Max. :22601 + Start End Comm Comp + Min. :127.2 Min. :213.0 Min. : 0.2747 Min. 
: 0.02016 + 1st Qu.:298.2 1st Qu.:382.1 1st Qu.:28.2750 1st Qu.:44.78627 + Median :507.0 Median :588.7 Median :33.1652 Median :47.69400 + Mean :504.3 Mean :585.0 Mean :33.0316 Mean :47.60811 + 3rd Qu.:711.2 3rd Qu.:791.1 3rd Qu.:36.7584 3rd Qu.:52.60334 + Max. :876.5 Max. :941.3 Max. :86.2371 Max. :84.82805 + Duration + Min. :64.25 + 1st Qu.:80.60 + Median :82.08 + Mean :80.64 + 3rd Qu.:83.44 + Max. :86.27 +#+end_example + +#+begin_src R :results output graphics :file img/exp_8-v1_total.png :exports both :width 500 :height 400 :session +library(tidyr); +library(ggplot2); +df %>% + select(-Start, -End, -N, -Duration) %>% + mutate(Total = Comm+Comp) %>% + gather(Variable, Value, -Rank, -Iteration, -Platform, -Nodes, -NP, -Partitioning, -EID, -Infiniband) %>% + filter(Variable == "Total", Rank != 0, Partitioning == "metis", NP == 32) %>% + ggplot(aes(x=as.factor(Iteration), y=Value, color=Infiniband)) + + theme_bw(base_size=12) + + geom_point(alpha=.3) + + ylim(0,NA) + + geom_line (aes(group=as.factor(Rank):as.factor(Partitioning):as.factor(Infiniband))) + + theme(legend.position="top") + + facet_grid(Partitioning~NP) +#+end_src + +#+RESULTS: +[[file:img/exp_8-v1_total.png]] + +They look identical. I think both runs used Infiniband. + +I'll adapt the Alya run script and run again to get v2 below. 
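Adapting the run script amounts to toggling Open MPI's BTL selection: =--mca btl ^openib= is Open MPI's syntax for excluding the OpenFabrics BTL. A hedged sketch of the idea; the =INFINIBAND= variable and the mpirun/alya command line are hypothetical placeholders, not the actual script:

```shell
# Hypothetical toggle; not part of the original run script.
INFINIBAND=${INFINIBAND:-false}
if [ "$INFINIBAND" = "true" ]; then
    MCA=""                        # let Open MPI pick openib if available
else
    MCA="--mca btl ^openib"       # exclude the OpenFabrics BTL (falls back to TCP)
fi
# Placeholder command line, echoed rather than executed:
echo "mpirun $MCA -np 32 ./alya case"
```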
+*** v2 :ATTACH: +:PROPERTIES: +:Attachments: exp_8-v2_grimoire_2.csv.gz +:ID: 6475f529-6f29-4c6a-8d25-49c879403b6b +:END: + +#+begin_src R :results output :session :exports both +library(readr); +library(dplyr); +df <- read_csv("data/64/75f529-6f29-4c6a-8d25-49c879403b6b/exp_8-v2_grimoire_2.csv.gz") %>% + select(-X1) %>% + rename(Start = S, End = E) %>% + mutate(Duration = End - Start) %>% + mutate(Platform = as.factor(Platform), + Partitioning = as.factor(Partitioning), + Nodes = as.factor(Nodes)); +df %>% summary; +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + X1 = col_integer(), + Rank = col_integer(), + Iteration = col_integer(), + Platform = col_character(), + Nodes = col_integer(), + NP = col_integer(), + Partitioning = col_character(), + EID = col_character(), + Infiniband = col_logical(), + N = col_integer(), + S = col_double(), + E = col_double(), + Comm = col_double(), + Comp = col_double() +) +Warning message: +Missing column names filled in: 'X1' [1] + Rank Iteration Platform Nodes NP + Min. : 0.00 Min. : 1.0 grimoire:640 2:640 Min. :32 + 1st Qu.: 7.75 1st Qu.: 3.0 1st Qu.:32 + Median :15.50 Median : 5.5 Median :32 + Mean :15.50 Mean : 5.5 Mean :32 + 3rd Qu.:23.25 3rd Qu.: 8.0 3rd Qu.:32 + Max. :31.00 Max. :10.0 Max. :32 + Partitioning EID Infiniband N + metis:640 Length:640 Mode :logical Min. : 3579 + Class :character FALSE:320 1st Qu.:11028 + Mode :character TRUE :320 Median :13216 + NA's :0 Mean :13343 + 3rd Qu.:15539 + Max. :22601 + Start End Comm Comp + Min. :126.5 Min. :213.5 Min. : 0.3249 Min. : 0.02036 + 1st Qu.:298.8 1st Qu.:383.4 1st Qu.:28.6923 1st Qu.:44.80856 + Median :507.5 Median :589.7 Median :33.4850 Median :47.64884 + Mean :505.2 Mean :586.0 Mean :33.2070 Mean :47.57824 + 3rd Qu.:710.9 3rd Qu.:790.4 3rd Qu.:36.8898 3rd Qu.:52.68086 + Max. :882.3 Max. :946.9 Max. :86.9266 Max. :85.24896 + Duration + Min. :63.92 + 1st Qu.:80.31 + Median :82.00 + Mean :80.79 + 3rd Qu.:83.64 + Max. 
:86.96 +#+end_example + +#+begin_src R :results output graphics :file img/exp_8-v2_total.png :exports both :width 500 :height 400 :session +library(tidyr); +library(ggplot2); +df %>% + select(-Start, -End, -N, -Duration) %>% + mutate(Total = Comm+Comp) %>% + gather(Variable, Value, -Rank, -Iteration, -Platform, -Nodes, -NP, -Partitioning, -EID, -Infiniband) %>% + filter(Variable == "Total", Iteration != 10, Rank != 0, Partitioning == "metis", NP == 32) %>% + ggplot(aes(x=as.factor(Iteration), y=Value, color=as.factor(Infiniband))) + + theme_bw(base_size=12) + + geom_point(alpha=.3) + + ylim(0,NA) + + geom_line (aes(group=as.factor(Rank):as.factor(Partitioning):as.factor(Infiniband))) + + theme(legend.position="top") + + facet_grid(Partitioning~NP) +#+end_src + +#+RESULTS: +[[file:img/exp_8-v2_total.png]] + +Well, it is still very similar. But now at least when Infiniband is +TRUE, the total makespan of each iteration is smaller. + +*** v3 :ATTACH: +:PROPERTIES: +:Attachments: exp_8-v3_grimoire_4.csv.gz +:ID: bd355764-1704-4485-bebd-5e714e76c2be +:END: + +#+begin_src R :results output :session :exports both +library(readr); +library(dplyr); +df <- read_csv("data/bd/355764-1704-4485-bebd-5e714e76c2be/exp_8-v3_grimoire_4.csv.gz") %>% + select(-X1) %>% + rename(Start = S, End = E) %>% + mutate(Duration = End - Start) %>% + mutate(Platform = as.factor(Platform), + Partitioning = as.factor(Partitioning), + Nodes = as.factor(Nodes)); +df %>% summary; +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + X1 = col_integer(), + Rank = col_integer(), + Iteration = col_integer(), + Platform = col_character(), + Nodes = col_integer(), + NP = col_integer(), + Partitioning = col_character(), + EID = col_character(), + Infiniband = col_logical(), + N = col_integer(), + S = col_double(), + E = col_double(), + Comm = col_double(), + Comp = col_double() +) +Warning message: +Missing column names filled in: 'X1' [1] + Rank Iteration Platform Nodes NP +
Min. : 0.00 Min. : 1.0 grimoire:1280 4:1280 Min. :64 + 1st Qu.:15.75 1st Qu.: 3.0 1st Qu.:64 + Median :31.50 Median : 5.5 Median :64 + Mean :31.50 Mean : 5.5 Mean :64 + 3rd Qu.:47.25 3rd Qu.: 8.0 3rd Qu.:64 + Max. :63.00 Max. :10.0 Max. :64 + Partitioning EID Infiniband N + metis:1280 Length:1280 Mode :logical Min. : 3579 + Class :character FALSE:640 1st Qu.:11750 + Mode :character TRUE :640 Median :14600 + NA's :0 Mean :15362 + 3rd Qu.:18626 + Max. :28849 + Start End Comm Comp + Min. :261.0 Min. :302.7 Min. : 0.3612 Min. : 0.03617 + 1st Qu.:343.8 1st Qu.:384.1 1st Qu.:13.3089 1st Qu.:21.57136 + Median :443.8 Median :483.0 Median :15.2729 Median :23.21151 + Mean :442.5 Mean :481.2 Mean :15.2131 Mean :23.46578 + 3rd Qu.:541.2 3rd Qu.:579.3 3rd Qu.:17.4194 3rd Qu.:25.69657 + Max. :621.9 Max. :653.0 Max. :41.7033 Max. :40.66069 + Duration + Min. :30.86 + 1st Qu.:38.42 + Median :39.29 + Mean :38.68 + 3rd Qu.:39.86 + Max. :41.76 +#+end_example + +#+begin_src R :results output graphics :file img/exp_8-v3_total.png :exports both :width 500 :height 400 :session +library(tidyr); +library(ggplot2); +df %>% + select(-Start, -End, -N, -Duration) %>% + mutate(Total = Comm+Comp) %>% + gather(Variable, Value, -Rank, -Iteration, -Platform, -Nodes, -NP, -Partitioning, -EID, -Infiniband) %>% + filter(Variable == "Total", Iteration != 10, Rank != 0, Partitioning == "metis", NP == 64) %>% + ggplot(aes(x=as.factor(Iteration), y=Value, color=Infiniband)) + + theme_bw(base_size=12) + + geom_point(alpha=.3) + + ylim(0,NA) + + geom_line (aes(group=as.factor(Rank):as.factor(Partitioning):as.factor(Infiniband))) + + theme(legend.position="top") + + facet_grid(Partitioning~NP) +#+end_src + +#+RESULTS: +[[file:img/exp_8-v3_total.png]] + +Same as in v2. I think Infiniband set to TRUE is already working. 
+ +*** v4 :ATTACH: +:PROPERTIES: +:Attachments: exp_8-v4_grimoire_4.csv.gz +:ID: 96a5786d-68c5-4d6c-9a54-6947b6fd5021 +:END: +#+begin_src R :results output :session :exports both +library(readr); +library(dplyr); +df <- read_csv("data/96/a5786d-68c5-4d6c-9a54-6947b6fd5021/exp_8-v4_grimoire_4.csv.gz") %>% + select(-X1) %>% + rename(Start = S, End = E) %>% + mutate(Duration = End - Start) %>% + mutate(Platform = as.factor(Platform), + Partitioning = as.factor(Partitioning), + Nodes = as.factor(Nodes)); +df %>% summary; +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + X1 = col_integer(), + Rank = col_integer(), + Iteration = col_integer(), + Platform = col_character(), + Nodes = col_integer(), + NP = col_integer(), + Partitioning = col_character(), + EID = col_character(), + Infiniband = col_logical(), + N = col_integer(), + S = col_double(), + E = col_double(), + Comm = col_double(), + Comp = col_double() +) +Warning message: +Missing column names filled in: 'X1' [1] + Rank Iteration Platform Nodes NP + Min. : 0.00 Min. : 1.0 grimoire:2400 4:2400 Min. :60 + 1st Qu.:14.75 1st Qu.: 3.0 1st Qu.:60 + Median :29.50 Median : 5.5 Median :60 + Mean :29.50 Mean : 5.5 Mean :60 + 3rd Qu.:44.25 3rd Qu.: 8.0 3rd Qu.:60 + Max. :59.00 Max. :10.0 Max. :60 + Partitioning EID Infiniband N + metis:1200 Length:2400 Mode :logical Min. : 3579 + sfc :1200 Class :character FALSE:1200 1st Qu.:13229 + Mode :character TRUE :1200 Median :22332 + NA's :0 Mean :25072 + 3rd Qu.:33946 + Max. :69461 + Start End Comm Comp + Min. :238.8 Min. :278.3 Min. : 0.5323 Min. : 0.03779 + 1st Qu.:339.3 1st Qu.:380.7 1st Qu.:10.4865 1st Qu.:23.37386 + Median :431.8 Median :471.4 Median :13.5453 Median :25.22149 + Mean :431.4 Mean :470.1 Mean :13.5252 Mean :25.13902 + 3rd Qu.:523.3 3rd Qu.:562.9 3rd Qu.:16.4835 3rd Qu.:27.24979 + Max. :643.5 Max. :676.9 Max. :44.0683 Max. :42.73652 + Duration + Min. :28.50 + 1st Qu.:36.52 + Median :38.28 + Mean :38.66 + 3rd Qu.:41.78 + Max. 
:44.13 +#+end_example + +#+begin_src R :results output graphics :file img/exp_8-v4_total.png :exports both :width 500 :height 400 :session +library(tidyr); +library(ggplot2); +df %>% + select(-Start, -End, -N, -Duration) %>% + mutate(Total = Comm+Comp) %>% + gather(Variable, Value, -Rank, -Iteration, -Platform, -Nodes, -NP, -Partitioning, -EID, -Infiniband) %>% + filter(Variable == "Total", Iteration != 10, Rank != 0, NP == 60) %>% + ggplot(aes(x=as.factor(Iteration), y=Value, color=Infiniband)) + + theme_bw(base_size=12) + + geom_point(alpha=.3) + + ylim(0,NA) + + geom_line (aes(group=as.factor(Rank):as.factor(Partitioning):as.factor(Infiniband))) + + theme(legend.position="top") + + facet_grid(NP~Partitioning) +#+end_src + +#+RESULTS: +[[file:img/exp_8-v4_total.png]] + +Looks like we have proof that Infiniband can be deactivated. +** 6-node grimoire infiniband check :EXP9:ATTACH: +:PROPERTIES: +:Attachments: exp_9-v1_grimoire_6.csv.gz +:ID: e71e949a-1221-4b20-90b5-f8b9c6585810 +:END: + +#+begin_src R :results output :session :exports both +library(readr); +library(dplyr); +df <- read_csv("data/e7/1e949a-1221-4b20-90b5-f8b9c6585810/exp_9-v1_grimoire_6.csv.gz") %>% + select(-X1) %>% + rename(Start = S, End = E) %>% + mutate(Duration = End - Start) %>% + mutate(Platform = as.factor(Platform), + Partitioning = as.factor(Partitioning), + Nodes = as.factor(Nodes)); +df %>% summary; +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + X1 = col_integer(), + Rank = col_integer(), + Iteration = col_integer(), + Platform = col_character(), + Nodes = col_integer(), + NP = col_integer(), + Partitioning = col_character(), + EID = col_character(), + Infiniband = col_logical(), + N = col_integer(), + S = col_double(), + E = col_double(), + Comm = col_double(), + Comp = col_double() +) +Warning message: +Missing column names filled in: 'X1' [1] + Rank Iteration Platform Nodes NP + Min. : 0.00 Min. : 1.0 grimoire:3840 6:3840 Min. 
:96 + 1st Qu.:23.75 1st Qu.: 3.0 1st Qu.:96 + Median :47.50 Median : 5.5 Median :96 + Mean :47.50 Mean : 5.5 Mean :96 + 3rd Qu.:71.25 3rd Qu.: 8.0 3rd Qu.:96 + Max. :95.00 Max. :10.0 Max. :96 + Partitioning EID Infiniband N + metis:1920 Length:3840 Mode :logical Min. : 3579 + sfc :1920 Class :character FALSE:1920 1st Qu.:16320 + Mode :character TRUE :1920 Median :23759 + NA's :0 Mean :27008 + 3rd Qu.:36328 + Max. :81957 + Start End Comm Comp + Min. :123.4 Min. :150.7 Min. : 0.3444 Min. : 0.01754 + 1st Qu.:190.8 1st Qu.:217.7 1st Qu.: 8.4054 1st Qu.:14.47885 + Median :251.6 Median :277.3 Median : 9.7694 Median :15.58878 + Mean :251.8 Mean :277.3 Mean : 9.6905 Mean :15.79769 + 3rd Qu.:312.4 3rd Qu.:337.4 3rd Qu.:11.4077 3rd Qu.:17.16624 + Max. :384.8 Max. :405.8 Max. :27.9786 Max. :27.03608 + Duration + Min. :19.22 + 1st Qu.:24.94 + Median :26.06 + Mean :25.49 + 3rd Qu.:26.76 + Max. :28.22 +#+end_example + +#+begin_src R :results output graphics :file img/exp_9-v1_total.png :exports both :width 500 :height 400 :session +library(tidyr); +library(ggplot2); +df %>% + select(-Start, -End, -N, -Duration) %>% + mutate(Total = Comm+Comp) %>% + gather(Variable, Value, -Rank, -Iteration, -Platform, -Nodes, -NP, -Partitioning, -EID, -Infiniband) %>% + filter(Variable == "Total", Iteration != 10, Rank != 0) %>% + ggplot(aes(x=as.factor(Iteration), y=Value, color=Infiniband)) + + theme_bw(base_size=12) + + geom_point(alpha=.3) + + ylim(0,NA) + + geom_line (aes(group=as.factor(Rank):as.factor(Partitioning):as.factor(Infiniband))) + + theme(legend.position="top") + + facet_grid(NP~Partitioning) +#+end_src + +#+RESULTS: +[[file:img/exp_9-v1_total.png]] +** 8-node grimoire infiniband-only after Alya modifications :EXP10: +*** Read :ATTACH: +:PROPERTIES: +:Attachments: exp_10-v2_grimoire_8.csv.gz +:ID: 15c9204a-2bde-4ee3-b166-242729fbc30d +:END: + +#+begin_src R :results output :session :exports both +library(readr); +library(dplyr); +df <- 
read_csv("data/15/c9204a-2bde-4ee3-b166-242729fbc30d/exp_10-v2_grimoire_8.csv.gz") %>% + select(-X1) %>% + rename(Start = S, End = E) %>% + mutate(Duration = End - Start) %>% + mutate(Platform = as.factor(Platform), + Partitioning = as.factor(Partitioning), + Nodes = as.factor(Nodes)); +df %>% summary; +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + X1 = col_integer(), + Rank = col_integer(), + Iteration = col_integer(), + Platform = col_character(), + Nodes = col_integer(), + NP = col_integer(), + Partitioning = col_character(), + EID = col_character(), + Infiniband = col_logical(), + N = col_integer(), + S = col_double(), + E = col_double(), + Comm = col_double(), + Comp = col_double() +) +Warning message: +Missing column names filled in: 'X1' [1] + Rank Iteration Platform Nodes NP + Min. : 0.00 Min. : 1.0 grimoire:2560 8:2560 Min. :128 + 1st Qu.: 31.75 1st Qu.: 3.0 1st Qu.:128 + Median : 63.50 Median : 5.5 Median :128 + Mean : 63.50 Mean : 5.5 Mean :128 + 3rd Qu.: 95.25 3rd Qu.: 8.0 3rd Qu.:128 + Max. :127.00 Max. :10.0 Max. :128 + Partitioning EID Infiniband N Start + metis:1280 Length:2560 Mode:logical Min. : 3579 Min. :121.9 + sfc :1280 Class :character TRUE:2560 1st Qu.:17937 1st Qu.:167.1 + Mode :character NA's:0 Median :25074 Median :210.0 + Mean :28662 Mean :210.3 + 3rd Qu.:37422 3rd Qu.:254.4 + Max. :91329 Max. :313.9 + End Comm Comp Duration + Min. :139.5 Min. : 0.4235 Min. : 0.01789 Min. :13.30 + 1st Qu.:186.4 1st Qu.: 3.8991 1st Qu.:10.76531 1st Qu.:16.52 + Median :228.6 Median : 6.6020 Median :11.80869 Median :17.29 + Mean :228.5 Mean : 6.3043 Mean :11.90795 Mean :18.21 + 3rd Qu.:272.9 3rd Qu.: 8.5957 3rd Qu.:13.19428 3rd Qu.:20.24 + Max. :330.2 Max. :21.5603 Max. :20.72990 Max. 
:21.59 +#+end_example +*** Plot Comp +#+begin_src R :results output graphics :file img/exp_10-v2_comp.png :exports both :width 500 :height 400 :session +library(tidyr); +library(ggplot2); +df %>% + select(-Start, -End, -N, -Duration) %>% + mutate(Total = Comm+Comp) %>% + gather(Variable, Value, -Rank, -Iteration, -Platform, -Nodes, -NP, -Partitioning, -EID, -Infiniband) %>% + filter(Variable == "Comp", Iteration != 10, Rank != 0) %>% + ggplot(aes(x=as.factor(Iteration), y=Value, color=Infiniband)) + + theme_bw(base_size=12) + + geom_point(alpha=.3) + + ylim(0,NA) + + geom_line (aes(group=as.factor(Rank):as.factor(Partitioning):as.factor(Infiniband))) + + theme(legend.position="top") + + facet_grid(NP~Partitioning) +#+end_src + +#+RESULTS: +[[file:img/exp_10-v2_comp.png]] +*** Plot mean comp (per rank, all iterations) +#+begin_src R :results output graphics :file img/exp_10-v2_comp_mean.png :exports both :width 600 :height 400 :session +library(tidyr); +library(ggplot2); +df %>% + select(-Start, -End, -N, -Duration) %>% + mutate(Total = Comm+Comp) %>% + gather(Variable, Value, -Rank, -Iteration, -Platform, -Nodes, -NP, -Partitioning, -EID, -Infiniband) %>% + filter(Variable == "Comp", Iteration != 10, Rank != 0) %>% +# group_by(Rank, Platform, Nodes, NP, Partitioning, EID, Infiniband) %>% +# summarize(MeanComp = mean(Value), SE=3*sd(Value)/sqrt(n()), N=n()) %>% + ggplot(aes(x=Rank, y=Value, color=Partitioning)) + + theme_bw(base_size=12) + + geom_point(alpha=.3) + + ylim(0,NA) + + geom_line (aes(group=as.factor(Rank):as.factor(Partitioning):as.factor(Infiniband))) + + theme(legend.position="top") + + facet_grid(NP~Partitioning) +#+end_src + +#+RESULTS: +[[file:img/exp_10-v2_comp_mean.png]] +*** Partition analysis +**** Process log files +#+begin_src shell :results output +EDIR=exp_10-v2_grimoire_8 +for file in $(find $EDIR | grep results | grep log$); do + OUTPUT=$(dirname $file)/$(basename $file .log).csv + cat $file | tr -s " " | sed "s/^ //" | tr ' ' ',' | cut 
-d"," -f1,3,5,7,9,11,13 | uniq > $OUTPUT + cat $file | tr -s " " | sed "s/^ //" | tr ' ' ',' | cut -d"," -f2,4,6,8,10,12,14 >> $OUTPUT + head $OUTPUT + tail $OUTPUT + echo +done +#+end_src + +#+RESULTS: +#+begin_example +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,14976,72536,72796,2384,17,14976 +2,14641,72392,72575,448,12,14641 +3,22264,72647,167221,4446,16,22264 +4,28129,72464,251684,7187,11,28129 +5,23239,72488,178813,4341,25,23239 +6,26093,72525,209615,5640,14,26093 +7,22996,72391,183401,4422,10,22996 +8,21190,72393,160643,3505,7,21190 +9,25178,72612,201462,5195,10,25178 +118,24285,72483,188018,6181,14,24285 +119,26512,72604,229879,6447,15,26512 +120,24865,72374,200264,5134,10,24865 +121,22912,72446,173477,4266,15,22912 +122,14466,72460,73072,776,11,14466 +123,14046,72589,72589,944,8,14046 +124,15427,72509,73961,2155,15,15427 +125,21776,72610,153930,4907,13,21776 +126,21662,72595,161294,4243,12,21662 +127,18484,72603,111590,4500,17,18484 + +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,22405,85371,172276,3477,4,22405 +2,22923,86049,179769,3742,5,22923 +3,24316,99541,175416,3029,5,24316 +4,22907,90905,170125,3170,4,22907 +5,22636,86964,173194,3448,5,22636 +6,22774,88416,171711,3333,6,22774 +7,23564,91625,177865,3452,7,23564 +8,22211,83394,173364,3612,5,22211 +9,23427,89980,179685,3588,4,23427 +118,19344,51553,177808,5044,6,19344 +119,19050,54006,174756,4829,3,19050 +120,18742,53558,169698,4652,6,18742 +121,19158,51974,179487,5103,4,19158 +122,19820,53397,179497,5028,8,19820 +123,19357,55432,172417,4679,6,19357 +124,18855,51717,176727,5006,3,18855 +125,19473,52093,179803,5107,6,19473 +126,18948,49074,179769,5225,4,18948 +127,19037,50759,172104,4863,6,19037 + +#+end_example +**** Load them in R +#+begin_src R :results output :session :exports both +library(readr); +library(dplyr); +library(tidyr); + +read_npoin <- function(filename) +{ + meta <- unlist(strsplit(unlist(strsplit(filename, "/"))[2], "_")); + meta <- gsub(".dir", "", meta); + read_csv(filename) %>% + 
gather(Variable, Value, -Rank) %>% + mutate(EID = meta[1], + Platform = meta[2], + Nodes = meta[3], + NP = meta[4], + Partitioning = meta[5], + Infiniband = as.logical(meta[6])); + +} +files <- list.files("exp_10-v2_grimoire_8", pattern="results_NPOIN_NELEM_NELEW_NBOUN.csv", recursive=TRUE, full.names=TRUE); +files; +df <- do.call("rbind", lapply(files, function(x) { read_npoin(x) })) +df %>% filter(Rank == 111, Variable == "NELEW") +#+end_src + +#+RESULTS: +#+begin_example +[1] "exp_10-v2_grimoire_8/10-v2_grimoire_8_128_metis_true.dir/results_NPOIN_NELEM_NELEW_NBOUN.csv" +[2] "exp_10-v2_grimoire_8/10-v2_grimoire_8_128_sfc_true.dir/results_NPOIN_NELEM_NELEW_NBOUN.csv" +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +# A tibble: 2 × 9 + Rank Variable Value EID Platform Nodes NP Partitioning Infiniband + +1 111 NELEW 172833 10-v2 grimoire 8 128 metis TRUE +2 111 NELEW 258786 10-v2 grimoire 8 128 sfc TRUE +#+end_example + +#+begin_src R :results output graphics :file img/exp_10-v2_partition.png :exports both :width 600 :height 800 :session +df %>% + ggplot(aes(x=Rank, y=Value, color=Partitioning)) + + theme_bw(base_size=12) + + geom_point() + + ylim(0,NA) + + geom_line (aes(group=as.factor(Rank):as.factor(Partitioning):as.factor(Infiniband))) + + theme(legend.position="top") + + facet_grid(Variable~Partitioning, scales="free_y") +#+end_src + +#+RESULTS: +[[file:img/exp_10-v2_partition.png]] +** 8-node grimoire infiniband-only before Alya modifs :EXP11: +*** Read :ATTACH: +:PROPERTIES: +:Attachments: exp_11-v1_grimoire_8.csv.gz +:ID: 11326040-b8e2-41f3-8479-c378f70864d9 +:END: + 
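The =data/11/...= path read in the next block follows org-attach's layout: the first two characters of the entry's =:ID:= (recorded in the drawer above) become one directory level and the remainder of the ID the next. A minimal sketch of that mapping in plain Python; the =attach_path= helper is ours for illustration, not part of the notebook:

```python
import os

# org-attach stores an entry's files under <base>/<first 2 chars of ID>/<rest of ID>/.
# attach_path is a hypothetical helper illustrating that convention.
def attach_path(entry_id, filename, base="data"):
    return os.path.join(base, entry_id[:2], entry_id[2:], filename)

# The :ID: of this entry, as recorded in the property drawer above:
print(attach_path("11326040-b8e2-41f3-8479-c378f70864d9",
                  "exp_11-v1_grimoire_8.csv.gz"))
# -> data/11/326040-b8e2-41f3-8479-c378f70864d9/exp_11-v1_grimoire_8.csv.gz
```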
+#+begin_src R :results output :session :exports both +library(readr); +library(dplyr); +df <- read_csv("data/11/326040-b8e2-41f3-8479-c378f70864d9/exp_11-v1_grimoire_8.csv.gz") %>% + select(-X1) %>% + rename(Start = S, End = E) %>% + mutate(Duration = End - Start) %>% + mutate(Platform = as.factor(Platform), + Partitioning = as.factor(Partitioning), + Nodes = as.factor(Nodes)); +df %>% summary; +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + X1 = col_integer(), + Rank = col_integer(), + Iteration = col_integer(), + Platform = col_character(), + Nodes = col_integer(), + NP = col_integer(), + Partitioning = col_character(), + EID = col_character(), + Infiniband = col_logical(), + N = col_integer(), + S = col_double(), + E = col_double(), + Comm = col_double(), + Comp = col_double() +) +Warning message: +Missing column names filled in: 'X1' [1] + Rank Iteration Platform Nodes NP + Min. : 0.00 Min. : 1.0 grimoire:2560 8:2560 Min. :128 + 1st Qu.: 31.75 1st Qu.: 3.0 1st Qu.:128 + Median : 63.50 Median : 5.5 Median :128 + Mean : 63.50 Mean : 5.5 Mean :128 + 3rd Qu.: 95.25 3rd Qu.: 8.0 3rd Qu.:128 + Max. :127.00 Max. :10.0 Max. :128 + Partitioning EID Infiniband N Start + metis:1280 Length:2560 Mode:logical Min. : 3579 Min. :121.7 + sfc :1280 Class :character TRUE:2560 1st Qu.:17937 1st Qu.:168.6 + Mode :character NA's:0 Median :24815 Median :217.9 + Mean :28185 Mean :217.4 + 3rd Qu.:37093 3rd Qu.:265.8 + Max. :81957 Max. :313.0 + End Comm Comp Duration + Min. :142.9 Min. : 0.3322 Min. : 0.02086 Min. :15.47 + 1st Qu.:189.2 1st Qu.: 7.0911 1st Qu.:10.84112 1st Qu.:19.75 + Median :237.8 Median : 8.1973 Median :11.59094 Median :20.00 + Mean :237.1 Mean : 7.9373 Mean :11.81608 Mean :19.75 + 3rd Qu.:285.6 3rd Qu.: 9.0958 3rd Qu.:12.73405 3rd Qu.:20.39 + Max. :328.8 Max. :21.1947 Max. :20.53365 Max. 
:21.22 +#+end_example +*** Plot Comp +#+begin_src R :results output graphics :file img/exp_11-v1_comp.png :exports both :width 500 :height 400 :session +library(tidyr); +library(ggplot2); +df %>% + select(-Start, -End, -N, -Duration) %>% + mutate(Total = Comm+Comp) %>% + gather(Variable, Value, -Rank, -Iteration, -Platform, -Nodes, -NP, -Partitioning, -EID, -Infiniband) %>% + filter(Variable == "Comp", Iteration != 10, Rank != 0) %>% + ggplot(aes(x=as.factor(Iteration), y=Value, color=Infiniband)) + + theme_bw(base_size=12) + + geom_point(alpha=.3) + + ylim(0,NA) + + geom_line (aes(group=as.factor(Rank):as.factor(Partitioning):as.factor(Infiniband))) + + theme(legend.position="top") + + facet_grid(NP~Partitioning) +#+end_src + +#+RESULTS: +[[file:img/exp_11-v1_comp.png]] +*** Partition analysis +**** Process log files +#+begin_src shell :results output +EDIR=exp_11-v1_grimoire_8 +for file in $(find $EDIR | grep results | grep log$); do + OUTPUT=$(dirname $file)/$(basename $file .log).csv + cat $file | tr -s " " | sed "s/^ //" | tr ' ' ',' | cut -d"," -f1,3,5,7,9,11,13 | uniq > $OUTPUT + cat $file | tr -s " " | sed "s/^ //" | tr ' ' ',' | cut -d"," -f2,4,6,8,10,12,14 >> $OUTPUT + head $OUTPUT + tail $OUTPUT + echo +done +#+end_src + +#+RESULTS: +#+begin_example +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,32182,174168,174611,2973,16,32182 +2,20919,53529,174758,5381,15,20919 +3,19727,49972,175172,5020,11,19727 +4,21918,69049,174054,4277,23,21918 +5,21683,64345,174280,4439,14,21683 +6,22205,64122,175002,4582,18,22205 +7,21361,63432,174587,4436,11,21361 +8,22162,75699,174459,3914,8,22162 +9,21389,66339,174799,4332,8,21389 +118,19736,47749,174249,5084,13,19736 +119,27122,103080,174535,7852,12,27122 +120,21785,59025,175145,5304,13,21785 +121,20722,56163,175018,4869,12,20722 +122,21269,59300,174615,4657,10,21269 +123,22403,70519,174270,4135,13,22403 +124,30100,145205,174682,2874,9,30100 +125,31603,142203,174120,5271,15,31603 +126,22843,73894,175386,4418,11,22843 
+127,26719,97390,174439,6359,13,26719 + +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,22405,85371,172276,3477,4,22405 +2,22923,86049,179769,3742,5,22923 +3,24316,99541,175416,3029,5,24316 +4,22907,90905,170125,3170,4,22907 +5,22636,86964,173194,3448,5,22636 +6,22774,88416,171711,3333,6,22774 +7,23564,91625,177865,3452,7,23564 +8,22211,83394,173364,3612,5,22211 +9,23427,89980,179685,3588,4,23427 +118,19344,51553,177808,5044,6,19344 +119,19050,54006,174756,4829,3,19050 +120,18742,53558,169698,4652,6,18742 +121,19158,51974,179487,5103,4,19158 +122,19820,53397,179497,5028,8,19820 +123,19357,55432,172417,4679,6,19357 +124,18855,51717,176727,5006,3,18855 +125,19473,52093,179803,5107,6,19473 +126,18948,49074,179769,5225,4,18948 +127,19037,50759,172104,4863,6,19037 + +#+end_example +**** Load them in R +#+begin_src R :results output :session :exports both +library(readr); +library(dplyr); +library(tidyr); + +read_npoin <- function(filename) +{ + meta <- unlist(strsplit(unlist(strsplit(filename, "/"))[2], "_")); + meta <- gsub(".dir", "", meta); + read_csv(filename) %>% + gather(Variable, Value, -Rank) %>% + mutate(EID = meta[1], + Platform = meta[2], + Nodes = meta[3], + NP = meta[4], + Partitioning = meta[5], + Infiniband = as.logical(meta[6])); + +} +files <- list.files("exp_11-v1_grimoire_8", pattern="results_NPOIN_NELEM_NELEW_NBOUN.csv", recursive=TRUE, full.names=TRUE); +files; +df <- do.call("rbind", lapply(files, function(x) { read_npoin(x) })) +df %>% filter(Rank == 111, Variable == "NELEW") +#+end_src + +#+RESULTS: +#+begin_example +[1] "exp_11-v1_grimoire_8/11-v1_grimoire_8_128_metis_true.dir/results_NPOIN_NELEM_NELEW_NBOUN.csv" +[2] "exp_11-v1_grimoire_8/11-v1_grimoire_8_128_sfc_true.dir/results_NPOIN_NELEM_NELEW_NBOUN.csv" +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +Parsed with column 
specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +# A tibble: 2 × 9 + Rank Variable Value EID Platform Nodes NP Partitioning Infiniband + +1 111 NELEW 172833 11-v1 grimoire 8 128 metis TRUE +2 111 NELEW 173994 11-v1 grimoire 8 128 sfc TRUE +#+end_example +**** Plot +#+begin_src R :results output graphics :file img/exp_11-v1_partition.png :exports both :width 600 :height 800 :session +df %>% + ggplot(aes(x=Rank, y=Value, color=Partitioning)) + + theme_bw(base_size=12) + + geom_point() + + ylim(0,NA) + + geom_line (aes(group=as.factor(Rank):as.factor(Partitioning):as.factor(Infiniband))) + + theme(legend.position="top") + + facet_grid(Variable~Partitioning, scales="free_y") +#+end_src + +#+RESULTS: +[[file:img/exp_11-v1_partition.png]] +** 6-node grimoire tracing only application functions :EXP12: +:PROPERTIES: +:Attachments: scorep_12-v1_grimoire_6_48_metis_true_traces.csv.gz scorep_12-v1_grimoire_6_48_sfc_true_traces.csv.gz scorep_12-v1_grimoire_6_96_metis_true_traces.csv.gz scorep_12-v1_grimoire_6_96_sfc_true_traces.csv.gz +:ID: 58ddccd9-95c8-4841-9c07-336a0209e264 +:END: + +*** Which files +#+name: exp_12_traces +#+begin_src shell :results output org +find data/ | grep scorep_12-v1 +#+end_src + +#+RESULTS: exp_12_traces +#+BEGIN_SRC org +data/58/ddccd9-95c8-4841-9c07-336a0209e264/scorep_12-v1_grimoire_6_96_metis_true_traces.csv.gz +data/58/ddccd9-95c8-4841-9c07-336a0209e264/scorep_12-v1_grimoire_6_48_metis_true_traces.csv.gz +data/58/ddccd9-95c8-4841-9c07-336a0209e264/scorep_12-v1_grimoire_6_96_sfc_true_traces.csv.gz +data/58/ddccd9-95c8-4841-9c07-336a0209e264/scorep_12-v1_grimoire_6_48_sfc_true_traces.csv.gz +#+END_SRC +*** Read in R :ATTACH: + +- =nsi_matrix= +- =solver= +- =nsi_inisol= (hardly seen) + +#+begin_src R :results output :session :exports both :var files=exp_12_traces +library(readr); +library(dplyr); 
+alya_scorep_trace_read <- function(filename) +{ + meta <- unlist(strsplit(unlist(strsplit(filename, "/"))[4], "_")); + read_csv(filename, + col_names=c("Rank", "Start", "End", "Value"), + progress=FALSE) %>% + # Transform Value to factor + mutate(Value = as.factor(Value)) %>% + # Detect begin and end of iterations + mutate(Iteration = case_when( + (.$Value == "timste") ~ 1, + (.$Value == "endste") ~ -1, + TRUE ~ 0)) %>% + group_by(Rank) %>% + # Create a logical to detect observations within iterations + mutate(Iteration = as.logical(cumsum(Iteration))) %>% + # Get only observations that belongs to some iteration + filter(Iteration == TRUE) %>% + ungroup() %>% + # Create the iteration by cumsum + mutate(Iteration = case_when( + (.$Value == "timste") ~ 1, + TRUE ~ 0)) %>% + group_by(Rank) %>% + mutate(Iteration = cumsum(Iteration)) %>% + mutate(EID = meta[2], + Platform = meta[3], + Nodes = meta[4], + NP = meta[5], + Partitioning = meta[6]); +} + +files <- strsplit(files, "\n")[[1]] +df <- do.call("rbind", lapply(files, function(x) { alya_scorep_trace_read(x) })); + +# +# Keep only states that are interesting +# + +df <- df %>% filter(Value %in% c("nsi_matrix", "solver", "nsi_inisol")); +#+end_src + +#+RESULTS: +#+begin_example + +Attaching package: ‘dplyr’ + +The following objects are masked from ‘package:stats’: + + filter, lag + +The following objects are masked from ‘package:base’: + + intersect, setdiff, setequal, union +Parsed with column specification: +cols( + Rank = col_integer(), + Start = col_double(), + End = col_double(), + Value = col_character() +) +Parsed with column specification: +cols( + Rank = col_integer(), + Start = col_double(), + End = col_double(), + Value = col_character() +) +Parsed with column specification: +cols( + Rank = col_integer(), + Start = col_double(), + End = col_double(), + Value = col_character() +) +Parsed with column specification: +cols( + Rank = col_integer(), + Start = col_double(), + End = col_double(), + Value = 
col_character() +) +#+end_example + +#+begin_src R :results output :session :exports both +df %>% + group_by(Rank, Iteration, EID, Platform, Nodes, NP, Partitioning, Value) %>% + summarize(NewComp = sum(End - Start)); +#+end_src + +#+RESULTS: +#+begin_example +Source: local data frame [86,400 x 9] +Groups: Rank, Iteration, EID, Platform, Nodes, NP, Partitioning [?] + + Rank Iteration EID Platform Nodes NP Partitioning Value NewComp + +1 0 1 12-v1 grimoire 6 48 metis nsi_inisol 0.000013 +2 0 1 12-v1 grimoire 6 48 metis nsi_matrix 0.000075 +3 0 1 12-v1 grimoire 6 48 metis solver 42.482770 +4 0 1 12-v1 grimoire 6 48 sfc nsi_inisol 0.000012 +5 0 1 12-v1 grimoire 6 48 sfc nsi_matrix 0.000079 +6 0 1 12-v1 grimoire 6 48 sfc solver 35.042236 +7 0 1 12-v1 grimoire 6 96 metis nsi_inisol 0.000014 +8 0 1 12-v1 grimoire 6 96 metis nsi_matrix 0.000076 +9 0 1 12-v1 grimoire 6 96 metis solver 21.981336 +10 0 1 12-v1 grimoire 6 96 sfc nsi_inisol 0.000012 +# ... with 86,390 more rows +#+end_example +*** Plots +#+begin_src R :results output graphics :file img/alya_nastin_solver_3_funs.png :exports both :width 600 :height 400 :session +library(ggplot2); +df %>% + group_by(Rank, Iteration, EID, Platform, Nodes, NP, Partitioning) %>% + summarize(NewComp = sum(End - Start)) %>% + ggplot(aes(x=Iteration, y=NewComp, color=as.factor(Rank))) + + theme_bw(base_size=12) + + geom_point(alpha=.3) + + ylim(0,NA) + + geom_line (aes(group=as.factor(Rank):as.factor(Partitioning))) + + theme(legend.position="none") + + facet_grid(NP~Partitioning); +#+end_src + +#+RESULTS: +[[file:img/alya_nastin_solver_3_funs.png]] + +#+begin_src R :results output graphics :file img/alya_nastin_solver_solver.png :exports both :width 600 :height 400 :session +df %>% + filter(Value == "solver") %>% + group_by(Rank, Iteration, EID, Platform, Nodes, NP, Value, Partitioning) %>% + summarize(NewComp = sum(End - Start)) %>% + ggplot(aes(x=Iteration, y=NewComp, color=as.factor(Rank))) + + theme_bw(base_size=12) + + 
geom_point(alpha=.3) + + ylim(0,NA) + + geom_line (aes(group=as.factor(Rank):as.factor(Partitioning))) + + theme(legend.position="none") + + facet_grid(NP~Partitioning); +#+end_src + +#+RESULTS: +[[file:img/alya_nastin_solver_solver.png]] + +#+begin_src R :results output graphics :file img/alya_nastin_solver_nsi_matrix.png :exports both :width 600 :height 400 :session +df %>% + filter(Value == "nsi_matrix") %>% + group_by(Rank, Iteration, EID, Platform, Nodes, NP, Value, Partitioning) %>% + summarize(NewComp = sum(End - Start)) %>% + ggplot(aes(x=Iteration, y=NewComp, color=as.factor(Rank))) + + theme_bw(base_size=12) + + geom_point(alpha=.3) + + ylim(0,NA) + + geom_line (aes(group=as.factor(Rank):as.factor(Partitioning))) + + theme(legend.position="none") + + facet_grid(NP~Partitioning);#, scales="free_y") +#+end_src + +#+RESULTS: +[[file:img/alya_nastin_solver_nsi_matrix.png]] + + +#+begin_src R :results output graphics :file img/alya_nastin_solver_nsi_inisol.png :exports both :width 600 :height 400 :session +df %>% + filter(Value == "nsi_inisol") %>% + group_by(Rank, Iteration, EID, Platform, Nodes, NP, Value, Partitioning) %>% + summarize(NewComp = sum(End - Start)) %>% + ggplot(aes(x=Iteration, y=NewComp, color=as.factor(Rank))) + + theme_bw(base_size=12) + + geom_point(alpha=.3) + + ylim(0,NA) + + geom_line (aes(group=as.factor(Rank):as.factor(Partitioning))) + + theme(legend.position="none") + + facet_grid(NP~Partitioning);#, scales="free_y") +#+end_src + +#+RESULTS: +[[file:img/alya_nastin_solver_nsi_inisol.png]] +** 12-node grisou (computing instrumentations, Alya modifs yes/no) :EXP13:ATTACH: +:PROPERTIES: +:ID: 0717e0e2-99b8-44c7-acfc-4d4ad685a2c5 +:Attachments: exp_13-v1_grisou_12 +:END: + +*** Which files +#+name: exp_13_traces +#+begin_src shell :results output +cd data/07/17e0e2-99b8-44c7-acfc-4d4ad685a2c5/ +find exp_13-v1_grisou_12/| grep csv.gz +#+end_src + +#+RESULTS: exp_13_traces +: 
exp_13-v1_grisou_12/13-v1_grisou_12_96_sfc_true_Alya.x.modif.dir/scorep_13-v1_grisou_12_96_sfc_true_Alya.x.modif/traces.csv.gz +: exp_13-v1_grisou_12/13-v1_grisou_12_96_metis_true_Alya.x.modif.dir/scorep_13-v1_grisou_12_96_metis_true_Alya.x.modif/traces.csv.gz +: exp_13-v1_grisou_12/13-v1_grisou_12_192_metis_true_Alya.x.dir/scorep_13-v1_grisou_12_192_metis_true_Alya.x/traces.csv.gz +: exp_13-v1_grisou_12/13-v1_grisou_12_192_sfc_true_Alya.x.dir/scorep_13-v1_grisou_12_192_sfc_true_Alya.x/traces.csv.gz +: exp_13-v1_grisou_12/13-v1_grisou_12_192_metis_true_Alya.x.modif.dir/scorep_13-v1_grisou_12_192_metis_true_Alya.x.modif/traces.csv.gz +: exp_13-v1_grisou_12/13-v1_grisou_12_96_metis_true_Alya.x.dir/scorep_13-v1_grisou_12_96_metis_true_Alya.x/traces.csv.gz +: exp_13-v1_grisou_12/13-v1_grisou_12_96_sfc_true_Alya.x.dir/scorep_13-v1_grisou_12_96_sfc_true_Alya.x/traces.csv.gz +: exp_13-v1_grisou_12/13-v1_grisou_12_192_sfc_true_Alya.x.modif.dir/scorep_13-v1_grisou_12_192_sfc_true_Alya.x.modif/traces.csv.gz + +#+begin_src R :results output :session :exports both :var files=exp_13_traces +files <- strsplit(files, "\n")[[1]] +filename <- files[[1]] +filename; +meta <- unlist(strsplit(unlist(strsplit(filename, "/"))[3], "_")); +EID = meta[2] +Platform = meta[3] +Nodes = meta[4] +NP = meta[5] +Partitioning = meta[6] +Infiniband = meta[7] +Alya = meta[8] +EID; +Platform; +Nodes; +NP; +Partitioning; +Infiniband; +Alya; +#+end_src + +#+RESULTS: +: [1] "exp_13-v1_grisou_12/13-v1_grisou_12_96_sfc_true_Alya.x.modif.dir/scorep_13-v1_grisou_12_96_sfc_true_Alya.x.modif/traces.csv.gz" +: [1] "13-v1" +: [1] "grisou" +: [1] "12" +: [1] "96" +: [1] "sfc" +: [1] "true" +: [1] "Alya.x.modif" +*** Read in R to =dft13= +#+begin_src R :results output :session :exports both :var files=exp_13_traces +library(readr); +library(dplyr); +alya_scorep_trace_read <- function(filename, basedir = ".") +{ + meta <- unlist(strsplit(unlist(strsplit(filename, "/"))[3], "_")); + fullpath <- paste0(basedir, "/", 
filename); + read_csv(fullpath, + col_names=c("Rank", "Start", "End", "Value"), + progress=FALSE) %>% + # Transform Value to factor + mutate(Value = as.factor(Value)) %>% + # Detect begin and end of iterations + mutate(Iteration = case_when( + (.$Value == "timste") ~ 1, + (.$Value == "endste") ~ -1, + TRUE ~ 0)) %>% + group_by(Rank) %>% + # Create a logical to detect observations within iterations + mutate(Iteration = as.logical(cumsum(Iteration))) %>% + # Get only observations that belongs to some iteration + filter(Iteration == TRUE) %>% + ungroup() %>% + # Create the iteration by cumsum + mutate(Iteration = case_when( + (.$Value == "timste") ~ 1, + TRUE ~ 0)) %>% + group_by(Rank) %>% + mutate(Iteration = cumsum(Iteration)) %>% + mutate(EID = as.factor(meta[2]), + Platform = as.factor(meta[3]), + Nodes = as.factor(meta[4]), + NP = as.factor(meta[5]), + Partitioning = as.factor(meta[6]), + Infiniband = as.logical(meta[7]), + Alya = as.factor(meta[8])); +} + +files <- strsplit(files, "\n")[[1]] +dft13 <- do.call("rbind", lapply(files, function(x) { alya_scorep_trace_read(x, basedir="data/07/17e0e2-99b8-44c7-acfc-4d4ad685a2c5") })); + +# +# Keep only states that are interesting +# + +dft13 <- dft13 %>% filter(Value %in% c("nsi_matrix", "solver", "nsi_inisol")); +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + Rank = col_integer(), + Start = col_double(), + End = col_double(), + Value = col_character() +) +Parsed with column specification: +cols( + Rank = col_integer(), + Start = col_double(), + End = col_double(), + Value = col_character() +) +Parsed with column specification: +cols( + Rank = col_integer(), + Start = col_double(), + End = col_double(), + Value = col_character() +) +Parsed with column specification: +cols( + Rank = col_integer(), + Start = col_double(), + End = col_double(), + Value = col_character() +) +Parsed with column specification: +cols( + Rank = col_integer(), + Start = col_double(), + End = col_double(), 
+ Value = col_character() +) +Parsed with column specification: +cols( + Rank = col_integer(), + Start = col_double(), + End = col_double(), + Value = col_character() +) +Parsed with column specification: +cols( + Rank = col_integer(), + Start = col_double(), + End = col_double(), + Value = col_character() +) +Parsed with column specification: +cols( + Rank = col_integer(), + Start = col_double(), + End = col_double(), + Value = col_character() +) +Warning messages: +1: In bind_rows_(x, .id) : Unequal factor levels: coercing to character +2: In bind_rows_(x, .id) : Unequal factor levels: coercing to character +3: In bind_rows_(x, .id) : Unequal factor levels: coercing to character +#+end_example + +#+begin_src R :results output :session :exports both +dft13 %>% .$NP %>% unique +dft13 %>% .$Alya %>% unique +dft13; +dft13 %>% mutate(Partitioning=as.factor(Partitioning), NP=as.factor(NP), Alya=as.factor(Alya)) %>% summary; +#+end_src + +#+RESULTS: +#+begin_example +[1] "96" "192" +[1] "Alya.x.modif" "Alya.x" +Source: local data frame [2,695,680 x 12] +Groups: Rank [192] + + Rank Start End Value Iteration EID Platform Nodes NP + +1 95 0.333969 0.333970 nsi_inisol 1 13-v1 grisou 12 96 +2 95 0.335719 1.417964 nsi_matrix 1 13-v1 grisou 12 96 +3 95 1.420657 1.978257 solver 1 13-v1 grisou 12 96 +4 95 1.982556 2.796641 solver 1 13-v1 grisou 12 96 +5 95 2.799346 2.991894 solver 1 13-v1 grisou 12 96 +6 95 3.010278 3.864155 solver 1 13-v1 grisou 12 96 +7 95 4.066358 4.066361 nsi_inisol 1 13-v1 grisou 12 96 +8 95 4.066362 4.119009 solver 1 13-v1 grisou 12 96 +9 95 4.888364 5.940038 nsi_matrix 1 13-v1 grisou 12 96 +10 95 5.942733 6.901603 solver 1 13-v1 grisou 12 96 +# ... with 2,695,670 more rows, and 3 more variables: Partitioning , +# Infiniband , Alya + Rank Start End Value + Min. : 0.00 Min. : 0.1715 Min. 
: 0.1715 solver :1843200 + 1st Qu.: 35.75 1st Qu.: 270.0145 1st Qu.: 270.2660 nsi_inisol: 483840 + Median : 71.50 Median : 529.9820 Median : 530.3756 nsi_matrix: 368640 + Mean : 79.50 Mean : 572.4097 Mean : 572.7664 doiter : 0 + 3rd Qu.:119.25 3rd Qu.: 785.8916 3rd Qu.: 786.1151 endste : 0 + Max. :191.00 Max. :1721.8589 Max. :1721.8598 nastin : 0 + (Other) : 0 + Iteration EID Platform Nodes NP + Min. : 1.00 13-v1:2695680 grisou:2695680 12:2695680 192:1797120 + 1st Qu.: 21.00 96 : 898560 + Median : 47.00 + Mean : 47.81 + 3rd Qu.: 74.00 + Max. :100.00 + + Partitioning Infiniband Alya + metis:1347840 Mode:logical Alya.x :1347840 + sfc :1347840 TRUE:2695680 Alya.x.modif:1347840 + NA's:0 +#+end_example +*** Plot only "computing time" +**** nastin solver 3 functions =nsi_matrix=, =solver=, =nsi_inisol= +#+begin_src R :results output graphics :file img/exp_13_alya_nastin_solver_3_funs.png :exports both :width 1200 :height 400 :session +library(ggplot2); +dft13 %>% + filter(Rank != 0) %>% + filter(NP == 192, Partitioning == "sfc") %>% + group_by(Rank, Iteration, EID, Platform, Nodes, NP, Partitioning, Infiniband, Alya) %>% + summarize(NewComp = sum(End - Start)) %>% + ggplot(aes(x=Iteration, y=NewComp, color=as.factor(Rank))) + + theme_bw(base_size=12) + + geom_point() + + ylim(0,NA) + + geom_line (aes(group=as.factor(Rank):as.factor(Partitioning))) + + theme(legend.position="none") + + facet_grid(Partitioning~Alya); +#+end_src + +#+RESULTS: +[[file:img/exp_13_alya_nastin_solver_3_funs.png]] +**** nastin solver only =nsi_matrix= +#+begin_src R :results output graphics :file img/exp_13_alya_nastin_nsi_matrix.png :exports both :width 1200 :height 400 :session +library(ggplot2); +dft13 %>% + filter(Value == "nsi_matrix") %>% + filter(Rank != 0) %>% + filter(NP == 192, Partitioning == "sfc") %>% + group_by(Rank, Iteration, EID, Platform, Nodes, NP, Partitioning, Infiniband, Alya) %>% + summarize(NewComp = sum(End - Start)) %>% + ggplot(aes(x=Iteration, y=NewComp, 
color=as.factor(Rank))) + + theme_bw(base_size=12) + + geom_point() + + ylim(0,NA) + + geom_line (aes(group=as.factor(Rank):as.factor(Partitioning))) + + theme(legend.position="none") + + facet_grid(Partitioning~Alya); +#+end_src + +#+RESULTS: +[[file:img/exp_13_alya_nastin_nsi_matrix.png]] + +**** nastin solver only =solver= +#+begin_src R :results output graphics :file img/exp_13_alya_nastin_solver.png :exports both :width 1200 :height 400 :session +library(ggplot2); +dft13 %>% + filter(Value == "solver") %>% + filter(Rank != 0) %>% + filter(NP == 192, Partitioning == "sfc") %>% + group_by(Rank, Iteration, EID, Platform, Nodes, NP, Partitioning, Infiniband, Alya) %>% + summarize(NewComp = sum(End - Start)) %>% + ggplot(aes(x=Iteration, y=NewComp, color=as.factor(Rank))) + + theme_bw(base_size=12) + + geom_point() + + ylim(0,NA) + + geom_line (aes(group=as.factor(Rank):as.factor(Partitioning))) + + theme(legend.position="none") + + facet_grid(Partitioning~Alya); +#+end_src + +#+RESULTS: +[[file:img/exp_13_alya_nastin_solver.png]] + +**** nastin solver only =nsi_inisol= +#+begin_src R :results output graphics :file img/exp_13_alya_nastin_nsi_inisol.png :exports both :width 1200 :height 400 :session +library(ggplot2); +dft13 %>% + filter(Value == "nsi_inisol") %>% + filter(Rank != 0) %>% + filter(NP == 192, Partitioning == "sfc") %>% + group_by(Rank, Iteration, EID, Platform, Nodes, NP, Partitioning, Infiniband, Alya) %>% + summarize(NewComp = sum(End - Start)) %>% + ggplot(aes(x=Iteration, y=NewComp, color=as.factor(Rank))) + + theme_bw(base_size=12) + + geom_point() + + ylim(0,NA) + + geom_line (aes(group=as.factor(Rank):as.factor(Partitioning))) + + theme(legend.position="none") + + facet_grid(Partitioning~Alya); +#+end_src + +#+RESULTS: +[[file:img/exp_13_alya_nastin_nsi_inisol.png]] +**** Number of calls to these functions along iterations +#+begin_src R :results output :session :exports both +dft13 %>% + filter(Value == "nsi_matrix") %>% + filter(Rank != 0) 
%>% + filter(NP==192, Rank == 95, Iteration %in% c(1,100), Alya == "Alya.x") %>% + mutate(Duration = End - Start) %>% + select(-EID, -Platform, -Nodes, -Infiniband, -NP, -Partitioning) %>% + group_by(Iteration) %>% + mutate(N=n()) %>% + ungroup() %>% + arrange(Start, Iteration) %>% + as.data.frame() +#+end_src + +#+RESULTS: +#+begin_example + Rank Start End Value Iteration Alya Duration N +1 95 0.283716 0.856840 nsi_matrix 1 Alya.x 0.573124 10 +2 95 0.378482 0.927219 nsi_matrix 1 Alya.x 0.548737 10 +3 95 3.298924 3.875023 nsi_matrix 1 Alya.x 0.576099 10 +4 95 3.320866 3.865816 nsi_matrix 1 Alya.x 0.544950 10 +5 95 6.500311 7.052999 nsi_matrix 1 Alya.x 0.552688 10 +6 95 6.595464 7.164909 nsi_matrix 1 Alya.x 0.569445 10 +7 95 9.646956 10.194573 nsi_matrix 1 Alya.x 0.547617 10 +8 95 9.782018 10.349733 nsi_matrix 1 Alya.x 0.567715 10 +9 95 12.637448 13.184027 nsi_matrix 1 Alya.x 0.546579 10 +10 95 12.929592 13.499010 nsi_matrix 1 Alya.x 0.569418 10 +11 95 902.438214 902.987738 nsi_matrix 100 Alya.x 0.549524 6 +12 95 905.091038 905.636659 nsi_matrix 100 Alya.x 0.545621 6 +13 95 907.747868 908.293808 nsi_matrix 100 Alya.x 0.545940 6 +14 95 914.429194 914.996252 nsi_matrix 100 Alya.x 0.567058 6 +15 95 917.061845 917.634066 nsi_matrix 100 Alya.x 0.572221 6 +16 95 919.813262 920.384819 nsi_matrix 100 Alya.x 0.571557 6 +#+end_example + +For =nsi_matrix= on Rank 95 there are: +- 10 calls in iteration 1 +- but only 6 in iteration 100.
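These per-iteration call counts rely on the iteration bracketing done in =alya_scorep_trace_read= above: =timste= opens an iteration, =endste= closes it, and a running sum over those markers tags every event in between (the closing marker itself falls outside, exactly as in the R =cumsum= filter). A plain-Python sketch of that idea; the =tag_iterations= helper is ours, the event names are the real ones from the traces:

```python
def tag_iterations(events):
    """Tag each event with its iteration number, mirroring the cumsum trick
    of alya_scorep_trace_read: 'timste' opens an iteration, 'endste' closes
    it; events outside any iteration (and 'endste' itself) are dropped."""
    tagged, inside, iteration = [], False, 0
    for name in events:
        if name == "timste":
            inside = True
            iteration += 1
        elif name == "endste":
            inside = False  # closing marker excluded, as in the R cumsum filter
        if inside:
            tagged.append((name, iteration))
    return tagged

events = ["setup", "timste", "nsi_matrix", "solver", "endste",
          "io", "timste", "solver", "endste"]
print(tag_iterations(events))
# -> [('timste', 1), ('nsi_matrix', 1), ('solver', 1), ('timste', 2), ('solver', 2)]
```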
+ +#+begin_src R :results output :session :exports both +dft13 %>% + filter(Value == "solver") %>% + filter(Rank != 0) %>% + filter(NP == 192, Partitioning == "sfc") %>% + filter(Rank == 3, Iteration %in% c(1,100), Alya == "Alya.x") %>% + mutate(Duration = End-Start) %>% + select(-EID, -Platform, -Nodes, -Infiniband, -NP, -Partitioning) %>% + group_by(Iteration) %>% + mutate(N=n()) %>% + ungroup() %>% + arrange(Start, Iteration) %>% + as.data.frame() +#+end_src + +#+RESULTS: +#+begin_example + Rank Start End Value Iteration Alya Duration N +1 3 0.806091 1.397790 solver 1 Alya.x 0.591699 25 +2 3 1.400163 1.883410 solver 1 Alya.x 0.483247 25 +3 3 1.884954 1.987991 solver 1 Alya.x 0.103037 25 +4 3 1.995055 2.475562 solver 1 Alya.x 0.480507 25 +5 3 2.558144 2.693178 solver 1 Alya.x 0.135034 25 +6 3 3.821168 4.586247 solver 1 Alya.x 0.765079 25 +7 3 4.588758 5.062848 solver 1 Alya.x 0.474090 25 +8 3 5.064446 5.387481 solver 1 Alya.x 0.323035 25 +9 3 5.394513 5.799146 solver 1 Alya.x 0.404633 25 +10 3 5.881718 6.016954 solver 1 Alya.x 0.135236 25 +11 3 7.130104 7.898844 solver 1 Alya.x 0.768740 25 +12 3 7.901355 8.296631 solver 1 Alya.x 0.395276 25 +13 3 8.298181 8.584412 solver 1 Alya.x 0.286231 25 +14 3 8.591484 9.002578 solver 1 Alya.x 0.411094 25 +15 3 9.085122 9.221985 solver 1 Alya.x 0.136863 25 +16 3 10.303988 11.044705 solver 1 Alya.x 0.740717 25 +17 3 11.047239 11.510650 solver 1 Alya.x 0.463411 25 +18 3 11.512203 11.736265 solver 1 Alya.x 0.224062 25 +19 3 11.743266 12.150266 solver 1 Alya.x 0.407000 25 +20 3 12.232800 12.369987 solver 1 Alya.x 0.137187 25 +21 3 13.452176 14.151580 solver 1 Alya.x 0.699404 25 +22 3 14.154113 14.507091 solver 1 Alya.x 0.352978 25 +23 3 14.508630 14.696163 solver 1 Alya.x 0.187533 25 +24 3 14.703222 15.079027 solver 1 Alya.x 0.375805 25 +25 3 15.161596 15.297411 solver 1 Alya.x 0.135815 25 +26 3 914.954351 915.583598 solver 100 Alya.x 0.629247 15 +27 3 915.586138 915.877785 solver 100 Alya.x 0.291647 15 +28 3 915.879321 
916.002794 solver 100 Alya.x 0.123473 15 +29 3 916.009884 916.294803 solver 100 Alya.x 0.284919 15 +30 3 916.381085 916.513058 solver 100 Alya.x 0.131973 15 +31 3 917.587713 918.332705 solver 100 Alya.x 0.744992 15 +32 3 918.335221 918.630677 solver 100 Alya.x 0.295456 15 +33 3 918.632221 918.755400 solver 100 Alya.x 0.123179 15 +34 3 918.762429 919.045410 solver 100 Alya.x 0.282981 15 +35 3 919.131495 919.265010 solver 100 Alya.x 0.133515 15 +36 3 920.334900 921.001416 solver 100 Alya.x 0.666516 15 +37 3 921.003959 921.290349 solver 100 Alya.x 0.286390 15 +38 3 921.291913 921.425185 solver 100 Alya.x 0.133272 15 +39 3 921.432295 921.728660 solver 100 Alya.x 0.296365 15 +40 3 921.815039 921.947326 solver 100 Alya.x 0.132287 15 +#+end_example + +For =solver= on Rank 3 there are: +- 25 calls in iteration 1 +- but only 15 in iteration 100. + +Let's check for other ranks and kernels: + +#+begin_src R :results output :session :exports both +dft13 %>% + filter(Value == "solver") %>% + filter(Rank != 0) %>% + filter(NP == 192, Partitioning == "sfc") %>% + filter(Iteration %in% c(1,100), Alya == "Alya.x") %>% + mutate(Duration = End-Start) %>% + select(-EID, -Platform, -Nodes, -Infiniband, -NP, -Partitioning) %>% + group_by(Rank, Value, Iteration) %>% + mutate(N=n()) %>% + ungroup() %>% + select(Rank, Iteration, N, Value) %>% + unique %>% + sample_n(20) %>% + arrange(Iteration, Value, N) +#+end_src + +#+RESULTS: +#+begin_example +# A tibble: 20 × 4 + Rank Iteration N Value + +1 163 1 25 solver +2 36 1 25 solver +3 129 1 25 solver +4 66 1 25 solver +5 89 1 25 solver +6 185 1 25 solver +7 97 1 25 solver +8 152 1 25 solver +9 114 1 25 solver +10 159 1 25 solver +11 94 1 25 solver +12 57 1 25 solver +13 134 1 25 solver +14 46 100 15 solver +15 101 100 15 solver +16 159 100 15 solver +17 147 100 15 solver +18 153 100 15 solver +19 124 100 15 solver +20 174 100 15 solver +#+end_example + +#+begin_src R :results output :session :exports both +dft13 %>% + filter(Value == 
"nsi_matrix") %>% + filter(Rank != 0) %>% + filter(NP == 192, Partitioning == "sfc") %>% + filter(Iteration %in% c(1,100), Alya == "Alya.x") %>% + mutate(Duration = End-Start) %>% + select(-EID, -Platform, -Nodes, -Infiniband, -NP, -Partitioning) %>% + group_by(Rank, Value, Iteration) %>% + mutate(N=n()) %>% + ungroup() %>% + select(Rank, Iteration, N, Value) %>% + unique %>% + sample_n(20) %>% + arrange(Iteration, Value, N) +#+end_src + +#+RESULTS: +#+begin_example +# A tibble: 20 × 4 + Rank Iteration N Value + +1 167 1 5 nsi_matrix +2 63 1 5 nsi_matrix +3 153 1 5 nsi_matrix +4 75 1 5 nsi_matrix +5 124 1 5 nsi_matrix +6 23 1 5 nsi_matrix +7 26 1 5 nsi_matrix +8 18 1 5 nsi_matrix +9 13 1 5 nsi_matrix +10 31 100 3 nsi_matrix +11 4 100 3 nsi_matrix +12 125 100 3 nsi_matrix +13 17 100 3 nsi_matrix +14 75 100 3 nsi_matrix +15 29 100 3 nsi_matrix +16 69 100 3 nsi_matrix +17 117 100 3 nsi_matrix +18 106 100 3 nsi_matrix +19 129 100 3 nsi_matrix +20 111 100 3 nsi_matrix +#+end_example + +#+begin_src R :results output :session :exports both +dft13 %>% + filter(Value == "nsi_inisol") %>% + filter(Rank != 0) %>% + filter(NP == 192, Partitioning == "sfc") %>% + filter(Iteration %in% c(1,100), Alya == "Alya.x") %>% + mutate(Duration = End-Start) %>% + select(-EID, -Platform, -Nodes, -Infiniband, -NP, -Partitioning) %>% + group_by(Rank, Value, Iteration) %>% + mutate(N=n()) %>% + ungroup() %>% + select(Rank, Iteration, N, Value) %>% + unique %>% + sample_n(20) %>% + arrange(Iteration, Value, N) +#+end_src + +#+RESULTS: +#+begin_example +# A tibble: 20 × 4 + Rank Iteration N Value + +1 27 1 6 nsi_inisol +2 171 1 6 nsi_inisol +3 155 1 6 nsi_inisol +4 166 1 6 nsi_inisol +5 134 1 6 nsi_inisol +6 137 1 6 nsi_inisol +7 45 100 4 nsi_inisol +8 110 100 4 nsi_inisol +9 7 100 4 nsi_inisol +10 9 100 4 nsi_inisol +11 89 100 4 nsi_inisol +12 171 100 4 nsi_inisol +13 190 100 4 nsi_inisol +14 16 100 4 nsi_inisol +15 30 100 4 nsi_inisol +16 186 100 4 nsi_inisol +17 32 100 4 nsi_inisol +18 10 
100 4 nsi_inisol +19 23 100 4 nsi_inisol +20 146 100 4 nsi_inisol +#+end_example + +Ok, looks like it happens for all ranks and all three kernels. +*** Partition processing and reading to =dfp13= +**** Process log files :deprecated: +This has already been done. Not necessary to run it again. +#+begin_src shell :results output +EDIR=exp_13-v1_grisou_12 +for file in $(find $EDIR | grep results | grep log$); do + OUTPUT=$(dirname $file)/$(basename $file .log).csv + cat $file | tr -s " " | sed "s/^ //" | tr ' ' ',' | cut -d"," -f1,3,5,7,9,11,13 | uniq > $OUTPUT + cat $file | tr -s " " | sed "s/^ //" | tr ' ' ',' | cut -d"," -f2,4,6,8,10,12,14 >> $OUTPUT + head $OUTPUT + tail $OUTPUT + echo +done +#+end_src + +#+RESULTS: +#+begin_example +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,19074,96948,97208,2652,14,19074 +2,22045,96927,135194,2345,14,22045 +3,37420,96812,334007,9529,13,37420 +4,30041,96942,231282,5441,22,30041 +5,33838,96890,274890,7297,16,33838 +6,29070,96827,226562,5140,9,29070 +7,32347,96944,266119,6787,8,32347 +8,28077,96663,216923,4865,21,28077 +9,29850,96977,237277,5651,14,29850 +86,37931,96789,333099,9441,13,37931 +87,34578,96921,288111,9405,11,34578 +88,28456,96827,200262,7317,13,28456 +89,35143,97224,298674,8919,13,35143 +90,32222,96838,261743,6612,14,32222 +91,24232,96830,163453,3222,9,24232 +92,18280,97057,97057,1365,7,18280 +93,21127,97038,104990,3262,16,21127 +94,29570,96812,223299,6226,11,29570 +95,26541,97016,173035,6317,16,26541 + +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,26205,84975,235380,6679,1,26205 +2,25585,77582,231837,6526,3,25585 +3,25400,80246,231231,6601,1,25400 +4,25438,75758,234708,6609,3,25438 +5,26406,81896,239781,6312,3,26406 +6,27502,95784,232364,5466,3,27502 +7,26084,82696,234631,6072,2,26084 +8,25678,78418,226628,6240,3,25678 +9,27945,100132,233397,5326,2,27945 +86,42850,239542,239542,7616,2,42850 +87,32666,141054,226647,7925,4,32666 +88,26385,86684,226804,5612,3,26385 +89,28345,99000,239538,6482,3,28345 
+90,25067,72579,229554,6279,5,25067 +91,26476,78672,234982,6248,6,26476 +92,24732,69329,230939,6461,3,24732 +93,26325,75795,237965,6480,6,26325 +94,26700,84639,231454,5870,5,26700 +95,27806,91670,237940,5852,3,27806 + +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,14777,51617,119407,2695,4,14777 +2,14355,48681,119571,2829,2,14355 +3,15811,59853,119398,2382,8,15811 +4,13971,46640,114060,2702,5,13971 +5,15956,61067,117792,2279,6,15956 +6,14174,45564,119584,2966,5,14174 +7,13726,42104,119489,3111,4,13726 +8,14348,47398,118708,2849,5,14348 +9,13241,37425,112720,3062,7,13241 +182,13291,41151,119231,3125,2,13291 +183,12167,34134,114984,3484,1,12167 +184,12777,36965,112710,2996,3,12777 +185,13270,41033,116648,3018,4,13270 +186,12727,35021,112596,3165,3,12727 +187,12767,37392,117632,3197,2,12767 +188,12278,36271,113276,3455,1,12278 +189,14272,40797,133607,3999,1,14272 +190,12589,34973,112463,3186,2,12589 +191,12543,35660,108215,3399,2,12543 + +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,22391,115758,116201,2773,17,22391 +2,18665,78117,116201,2224,18,18665 +3,13657,33602,116107,3330,12,13657 +4,13789,33607,116562,3312,12,13789 +5,13741,34479,116214,3280,11,13741 +6,15414,50810,115805,2684,26,15414 +7,14543,46329,115864,2777,10,14543 +8,14137,34979,115739,3382,16,14137 +9,14986,47013,116828,2819,12,14986 +182,14122,37659,115339,3118,11,14122 +183,14068,37478,115968,3182,12,14068 +184,15411,45675,116180,2817,14,15411 +185,15406,46701,116587,2783,15,15406 +186,19875,88273,116135,2108,13,19875 +187,22958,114601,116053,1817,16,22958 +188,19757,84545,116610,4213,16,19757 +189,14620,41964,116199,3117,12,14620 +190,16129,53509,116005,2870,12,16129 +191,19139,75123,115333,4732,16,19139 + +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,14777,51617,119407,2695,4,14777 +2,14355,48681,119571,2829,2,14355 +3,15811,59853,119398,2382,8,15811 +4,13971,46640,114060,2702,5,13971 +5,15956,61067,117792,2279,6,15956 +6,14174,45564,119584,2966,5,14174 +7,13726,42104,119489,3111,4,13726 
+8,14348,47398,118708,2849,5,14348 +9,13241,37425,112720,3062,7,13241 +182,13291,41151,119231,3125,2,13291 +183,12167,34134,114984,3484,1,12167 +184,12777,36965,112710,2996,3,12777 +185,13270,41033,116648,3018,4,13270 +186,12727,35021,112596,3165,3,12727 +187,12767,37392,117632,3197,2,12767 +188,12278,36271,113276,3455,1,12278 +189,14272,40797,133607,3999,1,14272 +190,12589,34973,112463,3186,2,12589 +191,12543,35660,108215,3399,2,12543 + +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,26205,84975,235380,6679,1,26205 +2,25585,77582,231837,6526,3,25585 +3,25400,80246,231231,6601,1,25400 +4,25438,75758,234708,6609,3,25438 +5,26406,81896,239781,6312,3,26406 +6,27502,95784,232364,5466,3,27502 +7,26084,82696,234631,6072,2,26084 +8,25678,78418,226628,6240,3,25678 +9,27945,100132,233397,5326,2,27945 +86,42850,239542,239542,7616,2,42850 +87,32666,141054,226647,7925,4,32666 +88,26385,86684,226804,5612,3,26385 +89,28345,99000,239538,6482,3,28345 +90,25067,72579,229554,6279,5,25067 +91,26476,78672,234982,6248,6,26476 +92,24732,69329,230939,6461,3,24732 +93,26325,75795,237965,6480,6,26325 +94,26700,84639,231454,5870,5,26700 +95,27806,91670,237940,5852,3,27806 + +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,39242,194445,234302,5042,17,39242 +2,26669,67067,232312,6639,13,26669 +3,28792,85760,233695,6014,24,28792 +4,28333,82323,232988,6163,12,28333 +5,28519,87908,233618,5864,10,28519 +6,29001,98118,233378,5377,6,29001 +7,28361,89068,233513,5797,6,28361 +8,30944,105897,233817,5141,21,30944 +9,29093,89869,233224,5813,14,29093 +86,26762,67433,233208,6686,13,26762 +87,27239,68676,233356,6608,14,27239 +88,26002,64284,233259,6717,10,26002 +89,33558,119562,233842,9571,11,33558 +90,28270,76939,233804,7049,12,28270 +91,27152,74923,233598,6402,11,27152 +92,29544,93864,233745,5565,11,29544 +93,40608,201534,233173,3896,10,40608 +94,34003,127993,232868,7401,10,34003 +95,34604,129341,233472,7659,13,34604 + +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,9924,48191,48191,1030,17,9924 
+2,10099,48316,48576,1622,11,10099 +3,10178,48177,48360,180,13,10178 +4,12632,48021,82050,1985,14,12632 +5,19592,48328,168948,4879,15,19592 +6,19218,48419,165914,4699,12,19218 +7,15023,48096,112771,2615,15,15023 +8,15439,47989,115964,2765,26,15439 +9,19266,48171,157121,4519,18,19266 +182,14742,48263,110099,2721,14,14742 +183,9944,48050,48662,574,12,9944 +184,9674,48278,48278,674,7,9674 +185,9761,48251,48251,456,10,9761 +186,10670,48164,49616,1545,18,10670 +187,10975,48238,54738,1717,14,10975 +188,16603,48242,125207,3894,16,16603 +189,13632,48288,98488,2381,12,13632 +190,15115,48224,106984,3735,14,15115 +191,12197,47973,64554,2533,19,12197 + +#+end_example +**** Load them in R +#+begin_src R :results output :session :exports both +library(readr); +library(dplyr); +library(tidyr); + +read_npoin <- function(filename) +{ + meta <- unlist(strsplit(unlist(strsplit(filename, "/"))[5], "_")); + meta <- gsub(".dir", "", meta); + read_csv(filename) %>% + gather(Variable, Value, -Rank) %>% + mutate(EID = meta[1], + Platform = meta[2], + Nodes = meta[3], + NP = meta[4], + Partitioning = meta[5], + Infiniband = as.logical(meta[6]), + Alya = meta[7]); + +} +files <- list.files("data/07/17e0e2-99b8-44c7-acfc-4d4ad685a2c5/exp_13-v1_grisou_12", pattern="results_NPOIN_NELEM_NELEW_NBOUN.csv", recursive=TRUE, full.names=TRUE); +files; +dfp13 <- do.call("rbind", lapply(files, function(x) { read_npoin(x) })) +dfp13 <- dfp13 %>% + mutate(Partitioning=as.factor(Partitioning), + NP=as.factor(NP), + Alya=as.factor(Alya), + Nodes=as.factor(Nodes), + EID=as.factor(EID), + Platform=as.factor(Platform)) +dfp13 %>% filter(Rank == 111, Variable == "NELEW") +#+end_src + +#+RESULTS: +#+begin_example +[1] "data/07/17e0e2-99b8-44c7-acfc-4d4ad685a2c5/exp_13-v1_grisou_12/13-v1_grisou_12_192_metis_true_Alya.x.dir/results_NPOIN_NELEM_NELEW_NBOUN.csv" +[2] "data/07/17e0e2-99b8-44c7-acfc-4d4ad685a2c5/exp_13-v1_grisou_12/13-v1_grisou_12_192_metis_true_Alya.x.modif.dir/results_NPOIN_NELEM_NELEW_NBOUN.csv" 
+[3] "data/07/17e0e2-99b8-44c7-acfc-4d4ad685a2c5/exp_13-v1_grisou_12/13-v1_grisou_12_192_sfc_true_Alya.x.dir/results_NPOIN_NELEM_NELEW_NBOUN.csv" +[4] "data/07/17e0e2-99b8-44c7-acfc-4d4ad685a2c5/exp_13-v1_grisou_12/13-v1_grisou_12_192_sfc_true_Alya.x.modif.dir/results_NPOIN_NELEM_NELEW_NBOUN.csv" +[5] "data/07/17e0e2-99b8-44c7-acfc-4d4ad685a2c5/exp_13-v1_grisou_12/13-v1_grisou_12_96_metis_true_Alya.x.dir/results_NPOIN_NELEM_NELEW_NBOUN.csv" +[6] "data/07/17e0e2-99b8-44c7-acfc-4d4ad685a2c5/exp_13-v1_grisou_12/13-v1_grisou_12_96_metis_true_Alya.x.modif.dir/results_NPOIN_NELEM_NELEW_NBOUN.csv" +[7] "data/07/17e0e2-99b8-44c7-acfc-4d4ad685a2c5/exp_13-v1_grisou_12/13-v1_grisou_12_96_sfc_true_Alya.x.dir/results_NPOIN_NELEM_NELEW_NBOUN.csv" +[8] "data/07/17e0e2-99b8-44c7-acfc-4d4ad685a2c5/exp_13-v1_grisou_12/13-v1_grisou_12_96_sfc_true_Alya.x.modif.dir/results_NPOIN_NELEM_NELEW_NBOUN.csv" +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +Parsed with 
column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +# A tibble: 4 × 10 + Rank Variable Value EID Platform Nodes NP Partitioning Infiniband + +1 111 NELEW 114133 13-v1 grisou 12 192 metis TRUE +2 111 NELEW 114133 13-v1 grisou 12 192 metis TRUE +3 111 NELEW 115711 13-v1 grisou 12 192 sfc TRUE +4 111 NELEW 101339 13-v1 grisou 12 192 sfc TRUE +# ... with 1 more variables: Alya +#+end_example + +#+begin_src R :results output :session :exports both +dfp13 %>% summary +#+end_src + +#+RESULTS: +#+begin_example + Rank Variable Value EID + Min. : 1.00 Length:6864 Min. : 1 13-v1:6864 + 1st Qu.: 36.00 Class :character 1st Qu.: 3245 + Median : 72.00 Mode :character Median : 15673 + Mean : 80.06 Mean : 43719 + 3rd Qu.:120.00 3rd Qu.: 50688 + Max. :191.00 Max. 
:335716 + Platform Nodes NP Partitioning Infiniband + grisou:6864 12:6864 192:4584 metis:3432 Mode:logical + 96 :2280 sfc :3432 TRUE:6864 + NA's:0 + + + + Alya + Alya.x :3432 + Alya.x.modif:3432 +#+end_example +*** Merge =dft13= and =dfp13= +#+begin_src R :results output :session :exports both +library(tidyr); +dft13z <- dft13 %>% + # Get all ranks except rank 0 + filter(Rank != 0) %>% + # Create a new variable from the value (kernel name); concatenating it with the iteration is commented out + mutate(Variable = Value) %>% ##paste0(Value, "-", Iteration)) %>% + # Calculate the duration (put in the Value column) & remove unnecessary columns + mutate(Value = End - Start) %>% select(-Start, -End) %>% + select(Rank, Variable, Value, Iteration, EID, Platform, Nodes, NP, Partitioning, Infiniband, Alya) %>% + # Since we have several calls to our kernels for each iteration, we need to sum them up + group_by(Rank, Variable, Iteration, EID, Platform, Nodes, NP, Partitioning, Infiniband, Alya) %>% + summarize(N=n(), Value=sum(Value)) %>% + ungroup() %>% + # Impose a column order + select(Rank, Variable, Value, Iteration, EID, Platform, Nodes, NP, Partitioning, Alya); +dfp13z <- dfp13 %>% + mutate(Iteration = NA) %>% + select(Rank, Variable, Value, Iteration, EID, Platform, Nodes, NP, Partitioning, Alya); +dfm <- rbind(dft13z, dfp13z); +dfm <- dfm %>% + mutate(Iteration = as.factor(Iteration), + NP = factor(NP), + Partitioning = factor(Partitioning), + Alya = factor(Alya), + Variable = factor(Variable)); +dfm %>% head; +dfm %>% tail; +#+end_src + +#+RESULTS: +#+begin_example +# A tibble: 6 × 10 + Rank Variable Value Iteration EID Platform Nodes NP Partitioning + +1 1 nsi_inisol 1.7e-05 1 13-v1 grisou 12 192 metis +2 1 nsi_inisol 1.4e-05 1 13-v1 grisou 12 192 metis +3 1 nsi_inisol 1.7e-05 1 13-v1 grisou 12 192 sfc +4 1 nsi_inisol 2.6e-05 1 13-v1 grisou 12 192 sfc +5 1 nsi_inisol 1.8e-05 1 13-v1 grisou 12 96 metis +6 1 nsi_inisol 1.9e-05 1 13-v1 grisou 12 96 metis +# ... 
with 1 more variables: Alya +# A tibble: 6 × 10 + Rank Variable Value Iteration EID Platform Nodes NP Partitioning + +1 90 NBBOU 32222 NA 13-v1 grisou 12 96 sfc +2 91 NBBOU 24232 NA 13-v1 grisou 12 96 sfc +3 92 NBBOU 18280 NA 13-v1 grisou 12 96 sfc +4 93 NBBOU 21127 NA 13-v1 grisou 12 96 sfc +5 94 NBBOU 29570 NA 13-v1 grisou 12 96 sfc +6 95 NBBOU 26541 NA 13-v1 grisou 12 96 sfc +# ... with 1 more variables: Alya +#+end_example +*** Plot +#+begin_src R :results output graphics :file img/exp_13_solver_partition.png :exports both :width 800 :height 600 :session +library(cowplot); +dfm %>% + filter(Partitioning != "metis") %>% + filter(NP == 192) %>% + filter(is.na(Iteration) | Iteration %in% c(1,10,12, 100)) %>% + filter((Variable %in% c("NELEM", "NELEW")) | grepl("solver", Variable)) %>% + ggplot(aes(x=Rank, y=Value, color=Iteration)) + + theme_bw(base_size=14) + + geom_point() + + ylim(0,NA) + + theme(legend.position="top", + legend.direction="horizontal", + panel.margin = unit(.08, "lines"), + legend.spacing = unit(.0, "lines"), + legend.margin=margin(b = -.4, unit='cm'), + plot.margin = unit(x = c(0, 0, 0, 0), units = "mm") + ) + +# ggtitle ("SFC only, NP=192, 12-nodes@grisou, Ethernet, Original (left) and modified (right) Alya, Color represents four iterations") + + facet_grid(Variable~Alya, scales="free_y") -> p; +save_plot("img/exp_13_solver_partition.pdf", p, base_aspect_ratio = 3.5); +p; +#+end_src + +#+RESULTS: +[[file:img/exp_13_solver_partition.png]] +*** During the HPC4E-report preparation meeting +#+begin_src R :results output graphics :file img/exp_13_solver_partition_v2.png :exports both :width 800 :height 600 :session +library(cowplot); +dfm %>% + filter(Partitioning != "metis") %>% + filter(NP == 192) %>% + filter(is.na(Iteration) | Iteration %in% c(1,10,12, 100)) %>% + filter((Variable %in% c("NELEM", "NELEW", "NPOIN")) | grepl("solver", Variable)) %>% + ggplot(aes(x=Rank, y=Value, color=Iteration)) + + theme_bw(base_size=12) + + geom_point() + + 
ylim(0,NA) + + ggtitle ("SFC only, NP=192, 12-nodes@grisou, Ethernet, Original (left) and modified (right) Alya, Color represents four iterations") + + facet_grid(Variable~Alya, scales="free_y") -> p; +save_plot("img/exp_13_solver_partition_v2.pdf", p, base_aspect_ratio = 3); +p; +#+end_src + +#+RESULTS: +[[file:img/exp_13_solver_partition_v2.png]] + +*** General makespan timings +#+begin_src R :results output :session :exports both +dft13 %>% group_by (Partitioning, Alya) %>% summarize(Makespan=max(End)-min(Start)) +#+end_src + +#+RESULTS: +: Source: local data frame [4 x 3] +: Groups: Partitioning [?] +: +: Partitioning Alya Makespan +: +: 1 metis Alya.x 1721.482 +: 2 metis Alya.x.modif 1695.907 +: 3 sfc Alya.x 1606.594 +: 4 sfc Alya.x.modif 1382.505 + +** 44-node grisou (number of non-zero entries) :EXP15: +*** List files +#+begin_src shell :results output +find exp_15-* | grep otf2$ +#+end_src + +#+RESULTS: +: exp_15-v1_grisou_44/15-v1_grisou_44_704_metis_false_Alya.x.modif.dir/scorep_15-v1_grisou_44_704_metis_false_Alya.x.modif/traces.otf2 +: exp_15-v1_grisou_44/15-v1_grisou_44_704_metis_false_Alya.x.orig.dir/scorep_15-v1_grisou_44_704_metis_false_Alya.x.orig/traces.otf2 +: exp_15-v1_grisou_44/15-v1_grisou_44_704_sfc_false_Alya.x.orig.dir/scorep_15-v1_grisou_44_704_sfc_false_Alya.x.orig/traces.otf2 +: exp_15-v1_grisou_44/15-v1_grisou_44_704_sfc_false_Alya.x.modif.dir/scorep_15-v1_grisou_44_704_sfc_false_Alya.x.modif/traces.otf2 +*** Post-processing (convert to CSV) +#+begin_src shell :results output +export PATH=$PATH:~/dev/akypuera3/b/ +BASEDIR=$(pwd) +for directory in $(find exp_15-* | grep otf2$); do + ./scripts/otf22csv_faster.sh $(dirname $directory) +done +#+end_src + +#+RESULTS: +#+begin_example +/home/schnorr/dev/Alya-Perf/exp_15-v1_grisou_44/15-v1_grisou_44_704_metis_false_Alya.x.modif.dir/scorep_15-v1_grisou_44_704_metis_false_Alya.x.modif/traces.otf2 
+~/dev/Alya-Perf/exp_15-v1_grisou_44/15-v1_grisou_44_704_metis_false_Alya.x.modif.dir/scorep_15-v1_grisou_44_704_metis_false_Alya.x.modif ~/dev/Alya-Perf +~/dev/Alya-Perf +/home/schnorr/dev/Alya-Perf/exp_15-v1_grisou_44/15-v1_grisou_44_704_metis_false_Alya.x.orig.dir/scorep_15-v1_grisou_44_704_metis_false_Alya.x.orig/traces.otf2 +~/dev/Alya-Perf/exp_15-v1_grisou_44/15-v1_grisou_44_704_metis_false_Alya.x.orig.dir/scorep_15-v1_grisou_44_704_metis_false_Alya.x.orig ~/dev/Alya-Perf +~/dev/Alya-Perf +/home/schnorr/dev/Alya-Perf/exp_15-v1_grisou_44/15-v1_grisou_44_704_sfc_false_Alya.x.orig.dir/scorep_15-v1_grisou_44_704_sfc_false_Alya.x.orig/traces.otf2 +~/dev/Alya-Perf/exp_15-v1_grisou_44/15-v1_grisou_44_704_sfc_false_Alya.x.orig.dir/scorep_15-v1_grisou_44_704_sfc_false_Alya.x.orig ~/dev/Alya-Perf +~/dev/Alya-Perf +/home/schnorr/dev/Alya-Perf/exp_15-v1_grisou_44/15-v1_grisou_44_704_sfc_false_Alya.x.modif.dir/scorep_15-v1_grisou_44_704_sfc_false_Alya.x.modif/traces.otf2 +~/dev/Alya-Perf/exp_15-v1_grisou_44/15-v1_grisou_44_704_sfc_false_Alya.x.modif.dir/scorep_15-v1_grisou_44_704_sfc_false_Alya.x.modif ~/dev/Alya-Perf +~/dev/Alya-Perf +#+end_example + +#+name: exp_15_traces +#+begin_src shell :results output +find exp_15-* | grep traces.csv.gz$ | sort +#+end_src + +#+RESULTS: exp_15_traces +: exp_15-v1_grisou_44/15-v1_grisou_44_704_metis_false_Alya.x.modif.dir/scorep_15-v1_grisou_44_704_metis_false_Alya.x.modif/traces.csv.gz +: exp_15-v1_grisou_44/15-v1_grisou_44_704_metis_false_Alya.x.orig.dir/scorep_15-v1_grisou_44_704_metis_false_Alya.x.orig/traces.csv.gz +: exp_15-v1_grisou_44/15-v1_grisou_44_704_sfc_false_Alya.x.modif.dir/scorep_15-v1_grisou_44_704_sfc_false_Alya.x.modif/traces.csv.gz +: exp_15-v1_grisou_44/15-v1_grisou_44_704_sfc_false_Alya.x.orig.dir/scorep_15-v1_grisou_44_704_sfc_false_Alya.x.orig/traces.csv.gz + +*** 1 Trace: Read and filter computation states in R to =dft15= +**** Read +#+name: read_exp_15_traces +#+begin_src R :results output :session :exports 
both :var files=exp_15_traces +library(readr); +library(dplyr); +alya_scorep_trace_read <- function(filename, basedir = ".") +{ + meta <- unlist(strsplit(unlist(strsplit(filename, "/"))[3], "_")); + fullpath <- paste0(basedir, "/", filename); + read_delim(fullpath, + delim=" ", + col_names=c("Rank", "Start", "End", "Value"), + progress=FALSE) %>% + # Transform Value to factor + mutate(Value = as.factor(Value)) %>% + # Detect begin and end of iterations + mutate(Iteration = case_when( + (.$Value == "timste") ~ 1, + (.$Value == "endste") ~ -1, + TRUE ~ 0)) %>% + group_by(Rank) %>% + # Create a logical to detect observations within iterations + mutate(Iteration = as.logical(cumsum(Iteration))) %>% + # Get only observations that belongs to some iteration + filter(Iteration == TRUE) %>% + ungroup() %>% + # Create the iteration by cumsum + mutate(Iteration = case_when( + (.$Value == "timste") ~ 1, + TRUE ~ 0)) %>% + group_by(Rank) %>% + mutate(Iteration = cumsum(Iteration)) %>% + mutate(EID = as.factor(meta[2]), + Platform = as.factor(meta[3]), + Nodes = as.factor(meta[4]), + NP = as.factor(meta[5]), + Partitioning = as.factor(meta[6]), + Infiniband = as.logical(meta[7]), + Alya = as.factor(meta[8])); +} +# Clean-up the files variable +files <- strsplit(files, "\n")[[1]] +# Do the reading, bind_rows everything +dft15 <- do.call("bind_rows", lapply(files, function(file) { alya_scorep_trace_read(file, basedir=".") } ));#ata/07/17e0e2-99b8-44c7-acfc-4d4ad685a2c5") })); +# Fix the columns factor +dft15 <- dft15 %>% mutate(Partitioning = as.factor(Partitioning), Alya = as.factor(Alya)); +dft15 %>% summary; +#+end_src + +#+RESULTS: read_exp_15_traces +#+begin_example + +Attaching package: ‘dplyr’ + +The following objects are masked from ‘package:stats’: + + filter, lag + +The following objects are masked from ‘package:base’: + + intersect, setdiff, setequal, union +Parsed with column specification: +cols( + Rank = col_integer(), + Start = col_double(), + End = col_double(), + 
Value = col_character() +) +Parsed with column specification: +cols( + Rank = col_integer(), + Start = col_double(), + End = col_double(), + Value = col_character() +) +Parsed with column specification: +cols( + Rank = col_integer(), + Start = col_double(), + End = col_double(), + Value = col_character() +) +Parsed with column specification: +cols( + Rank = col_integer(), + Start = col_double(), + End = col_double(), + Value = col_character() +) +Warning messages: +1: In bind_rows_(x, .id) : Unequal factor levels: coercing to character +2: In bind_rows_(x, .id) : + binding character and factor vector, coercing into character vector +3: In bind_rows_(x, .id) : + binding character and factor vector, coercing into character vector +4: In bind_rows_(x, .id) : Unequal factor levels: coercing to character +5: In bind_rows_(x, .id) : + binding character and factor vector, coercing into character vector +6: In bind_rows_(x, .id) : + binding character and factor vector, coercing into character vector +7: In bind_rows_(x, .id) : + binding character and factor vector, coercing into character vector +8: In bind_rows_(x, .id) : + binding character and factor vector, coercing into character vector +9: In bind_rows_(x, .id) : + binding character and factor vector, coercing into character vector + Rank Start End Value + Min. : 0.0 Min. : 0.0882 Min. : 0.0882 solver :8729600 + 1st Qu.:175.8 1st Qu.: 237.9752 1st Qu.: 238.0672 nsi_updunk:6307840 + Median :351.5 Median : 449.0312 Median : 449.0313 nsi_solsgs:3491840 + Mean :351.5 Mean : 465.6487 Mean : 465.7192 nsi_updbcs:2872320 + 3rd Qu.:527.2 3rd Qu.: 657.0840 3rd Qu.: 657.0843 nsi_inisol:2309120 + Max. :703.0 Max. :1135.5267 Max. :1135.5267 nastin :2252800 + (Other) :8616960 + Iteration EID Platform Nodes + Min. : 1.00 15-v1:34580480 grisou:34580480 44:34580480 + 1st Qu.: 47.00 + Median : 98.00 + Mean : 98.34 + 3rd Qu.:149.00 + Max. 
:200.00 + + NP Partitioning Infiniband Alya + 704:34580480 metis:17290240 Mode :logical Alya.x.modif:17290240 + sfc :17290240 FALSE:34580480 Alya.x.orig :17290240 +#+end_example +**** Filter to get only the computation states we are interested in +#+begin_src R :results output :session :exports both +dft15f <- dft15 %>% filter(Value %in% c("nsi_matrix", "solver")); +dft15f +#+end_src + +#+RESULTS: +#+begin_example +# A tibble: 10,475,520 x 12 +# Groups: Rank [704] + Rank Start End Value Iteration EID Platform Nodes NP + + 1 0 0.116300 0.116323 nsi_matrix 1 15-v1 grisou 44 704 + 2 169 0.116469 0.247080 nsi_matrix 1 15-v1 grisou 44 704 + 3 511 0.116494 0.247081 nsi_matrix 1 15-v1 grisou 44 704 + 4 646 0.116509 0.248749 nsi_matrix 1 15-v1 grisou 44 704 + 5 391 0.116514 0.252305 nsi_matrix 1 15-v1 grisou 44 704 + 6 631 0.116504 0.252883 nsi_matrix 1 15-v1 grisou 44 704 + 7 655 0.116534 0.253021 nsi_matrix 1 15-v1 grisou 44 704 + 8 695 0.116470 0.253728 nsi_matrix 1 15-v1 grisou 44 704 + 9 374 0.116484 0.253761 nsi_matrix 1 15-v1 grisou 44 704 +10 387 0.116504 0.253844 nsi_matrix 1 15-v1 grisou 44 704 +# ... with 10,475,510 more rows, and 3 more variables: Partitioning , +# Infiniband , Alya +#+end_example +*** 2 Per-rank number of elements, points, etc (process and read) +The files are obtained with manual instrumentation of Alya code. 
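The raw log lines alternate column names and values (=Rank 1 NPOIN 4743 NELEM 17409 ...=), which is why the processing script keeps the odd comma-separated fields once as a CSV header and appends the even fields as data rows. A minimal sketch of this odd/even field split, on a hypothetical one-line log:

#+begin_src shell :results output
# Hypothetical log line: column names and values alternate
line="Rank 1 NPOIN 4743 NELEM 17409"
# Odd fields give the CSV header...
echo $line | tr ' ' ',' | cut -d"," -f1,3,5
# ...and even fields give the matching values
echo $line | tr ' ' ',' | cut -d"," -f2,4,6
#+end_src

On real logs the header fields repeat on every line, hence the =uniq= in the processing script.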
+**** Process log files +#+begin_src shell :results output +EDIR=exp_15-v1_grisou_44 +for file in $(find $EDIR | grep results | grep log$ | sort); do + OUTPUT=$(dirname $file)/$(basename $file .log).csv + cat $file | tr -s " " | sed "s/^ //" | tr ' ' ',' | cut -d"," -f1,3,5,7,9,11,13 | uniq > $OUTPUT + cat $file | tr -s " " | sed "s/^ //" | tr ' ' ',' | cut -d"," -f2,4,6,8,10,12,14 >> $OUTPUT + head $OUTPUT + tail $OUTPUT + echo +done +#+end_src + +#+RESULTS: +#+begin_example +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,4743,17409,32204,591,10,4743 +2,4140,13120,31555,736,9,4140 +3,4073,12119,32464,812,7,4073 +4,5416,22804,32324,376,13,5416 +5,4393,15261,30631,612,8,4393 +6,4064,12109,31914,790,8,4064 +7,5310,22579,32154,382,12,5310 +8,4283,13578,32453,751,10,4283 +9,4038,11988,31728,793,6,4038 +694,3586,9252,30627,845,6,3586 +695,3573,8651,30626,880,7,3573 +696,3789,9266,32156,928,6,3789 +697,3703,9287,31107,867,8,3703 +698,4062,11473,32358,837,7,4062 +699,3681,10121,32016,877,3,3681 +700,3697,10030,31585,866,3,3697 +701,3652,9850,30665,830,4,3652 +702,3870,10305,31765,850,6,3870 +703,4020,11401,32101,825,6,4020 + +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,4743,17409,32204,591,10,4743 +2,4140,13120,31555,736,9,4140 +3,4073,12119,32464,812,7,4073 +4,5416,22804,32324,376,13,5416 +5,4393,15261,30631,612,8,4393 +6,4064,12109,31914,790,8,4064 +7,5310,22579,32154,382,12,5310 +8,4283,13578,32453,751,10,4283 +9,4038,11988,31728,793,6,4038 +694,3586,9252,30627,845,6,3586 +695,3573,8651,30626,880,7,3573 +696,3789,9266,32156,928,6,3789 +697,3703,9287,31107,867,8,3703 +698,4062,11473,32358,837,7,4062 +699,3681,10121,32016,877,3,3681 +700,3697,10030,31585,866,3,3697 +701,3652,9850,30665,830,4,3652 +702,3870,10305,31765,850,6,3870 +703,4020,11401,32101,825,6,4020 + +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,3215,13088,13088,387,24,3215 +2,3068,13094,13094,265,16,3068 +3,3060,13086,13086,70,13,3060 +4,3265,13092,13092,586,19,3265 +5,3275,13101,13361,914,21,3275 
+6,3024,13106,13106,300,14,3024 +7,3018,13096,13096,116,16,3018 +8,2994,13100,13100,14,15,2994 +9,3003,13090,13273,121,13,3003 +694,3765,13097,22845,533,15,3765 +695,3948,13085,23885,481,15,3948 +696,4566,13108,32555,827,12,4566 +697,4915,13066,33827,951,22,4915 +698,4257,13110,29261,995,15,4257 +699,3772,13132,19330,825,18,3772 +700,4661,13103,29948,1472,20,4661 +701,3588,13098,18284,1121,16,3588 +702,3312,13098,13098,106,26,3312 +703,3064,13096,13096,381,21,3064 + +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,7070,31542,31542,652,20,7070 +2,7222,31267,31527,1476,20,7222 +3,6680,31528,31528,524,14,6680 +4,6433,31379,31562,129,13,6433 +5,6734,31539,31539,77,18,6734 +6,6133,24911,31560,838,16,6133 +7,3800,9018,31538,906,13,3800 +8,4095,8713,31453,960,16,4095 +9,4041,8784,31699,909,11,4041 +694,4721,12142,31657,764,21,4721 +695,4360,11261,31521,958,15,4360 +696,4715,14535,31538,733,15,4715 +697,5309,18369,31484,659,17,5309 +698,4427,12713,31430,796,13,4427 +699,4600,12161,31572,901,22,4600 +700,4631,14031,31540,1044,13,4631 +701,5290,18228,31523,1443,14,5290 +702,5418,17552,31595,1806,17,5418 +703,6894,29918,31451,686,23,6894 + +#+end_example +**** Read in R +#+begin_src R :results output :session :exports both +library(readr); +library(dplyr); +library(tidyr); + +read_npoin <- function(filename) +{ + meta <- unlist(strsplit(unlist(strsplit(filename, "/"))[2], "_")); + meta <- gsub(".dir", "", meta); + read_csv(filename) %>% + gather(Variable, Value, -Rank) %>% + mutate(EID = meta[1], + Platform = meta[2], + Nodes = meta[3], + NP = meta[4], + Partitioning = meta[5], + Infiniband = as.logical(meta[6]), + Alya = meta[7]); + +} +files <- list.files("exp_15-v1_grisou_44", pattern="results_NPOIN_NELEM_NELEW_NBOUN.csv", recursive=TRUE, full.names=TRUE); +dfp15 <- do.call("bind_rows", lapply(files, function(file) { read_npoin(file) })) +dfp15 <- dfp15 %>% + mutate(Partitioning=as.factor(Partitioning), + NP=as.factor(NP), + Alya=as.factor(Alya), + Nodes=as.factor(Nodes), + 
EID=as.factor(EID), + Platform=as.factor(Platform)) +dfp15 +dfp15 %>% summary; +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +# A tibble: 16,872 x 10 + Rank Variable Value EID Platform Nodes NP Partitioning Infiniband + + 1 1 NPOIN 4743 15-v1 grisou 44 704 metis FALSE + 2 2 NPOIN 4140 15-v1 grisou 44 704 metis FALSE + 3 3 NPOIN 4073 15-v1 grisou 44 704 metis FALSE + 4 4 NPOIN 5416 15-v1 grisou 44 704 metis FALSE + 5 5 NPOIN 4393 15-v1 grisou 44 704 metis FALSE + 6 6 NPOIN 4064 15-v1 grisou 44 704 metis FALSE + 7 7 NPOIN 5310 15-v1 grisou 44 704 metis FALSE + 8 8 NPOIN 4283 15-v1 grisou 44 704 metis FALSE + 9 9 NPOIN 4038 15-v1 grisou 44 704 metis FALSE +10 10 NPOIN 5952 15-v1 grisou 44 704 metis FALSE +# ... with 16,862 more rows, and 1 more variables: Alya + Rank Variable Value EID + Min. : 1 Length:16872 Min. : 0.0 15-v1:16872 + 1st Qu.:176 Class :character 1st Qu.: 815.8 + Median :352 Mode :character Median : 4204.0 + Mean :352 Mean : 9028.3 + 3rd Qu.:528 3rd Qu.:12905.2 + Max. :703 Max. 
:51434.0 + Platform Nodes NP Partitioning Infiniband + grisou:16872 44:16872 704:16872 metis:8436 Mode :logical + sfc :8436 FALSE:16872 + + + + + Alya + Alya.x.modif:8436 + Alya.x.orig :8436 +#+end_example +*** 3 Per-rank number of graph entries +The data is obtained thanks to recent changes in Alya; it is stored in the =partition.par.post.res= file. +**** Process log files +To process such files, I need to extract all entries in the section +identified by =NUMBER_NODE_GRAPH_ENTRIES=. To do so, +#+begin_src shell :results output +EDIR=exp_15-v1_grisou_44 +for file in $(find $EDIR | grep partition.par.post.res$ | sort); do + OUTPUT=$(dirname $file)/$(basename $file .log)-entries.csv + NP=$(echo $file | cut -d"/" -f2 | cut -d"_" -f4) + START_LINE=$(cat -n $file | grep NUMBER_NODE_GRAPH_ENTRIES | grep ComponentNames | tr '\t' ' ' | sed "s/^[[:space:]]*//" | cut -d" " -f1) + START_LINE=$(($START_LINE + 2)) + END_LINE=$(($START_LINE + $NP - 1)) + rm -f $OUTPUT + echo "Rank,ENTRIES" >> $OUTPUT + sed -n "${START_LINE},${END_LINE}p" $file | tr -s ' ' | sed "s/^ //" | tr ' ' ',' >> $OUTPUT + head $OUTPUT + tail $OUTPUT + echo +done +#+end_src + +#+RESULTS: +#+begin_example +Rank,ENTRIES +1,0.675840E+05 +2,0.733730E+05 +3,0.675900E+05 +4,0.679910E+05 +5,0.792140E+05 +6,0.687050E+05 +7,0.672740E+05 +8,0.781860E+05 +9,0.697650E+05 +695,0.620460E+05 +696,0.618930E+05 +697,0.653090E+05 +698,0.634750E+05 +699,0.678220E+05 +700,0.642810E+05 +701,0.639790E+05 +702,0.626180E+05 +703,0.656300E+05 +704,0.672040E+05 + +Rank,ENTRIES +1,0.709660E+05 +2,0.396250E+05 +3,0.386560E+05 +4,0.386280E+05 +5,0.399290E+05 +6,0.402150E+05 +7,0.384340E+05 +8,0.383040E+05 +9,0.381220E+05 +695,0.548130E+05 +696,0.572480E+05 +697,0.717460E+05 +698,0.757930E+05 +699,0.655330E+05 +700,0.508300E+05 +701,0.689990E+05 +702,0.480920E+05 +703,0.402860E+05 +704,0.386360E+05 + +Rank,ENTRIES +1,0.709740E+05 +2,0.910620E+05 +3,0.919480E+05 +4,0.884060E+05 +5,0.866770E+05 +6,0.886620E+05 +7,0.833750E+05 +8,0.646940E+05
+9,0.665630E+05 +695,0.720350E+05 +696,0.688360E+05 +697,0.718070E+05 +698,0.764690E+05 +699,0.694130E+05 +700,0.707080E+05 +701,0.709610E+05 +702,0.763700E+05 +703,0.771440E+05 +704,0.893860E+05 + +#+end_example + +**** Read in R +#+begin_src R :results output :session :exports both +library(readr); +library(dplyr); +library(tidyr); + +read_entries <- function(filename) +{ + meta <- unlist(strsplit(unlist(strsplit(filename, "/"))[2], "_")); + meta <- gsub(".dir", "", meta); + read_csv(filename) %>% + gather(Variable, Value, -Rank) %>% + mutate(EID = meta[1], + Platform = meta[2], + Nodes = meta[3], + NP = meta[4], + Partitioning = meta[5], + Infiniband = as.logical(meta[6]), + Alya = meta[7]); + +} +files <- list.files("exp_15-v1_grisou_44", pattern="fensap-partition.par.post.res-entries.csv", recursive=TRUE, full.names=TRUE); +dfe15 <- do.call("bind_rows", lapply(files, function(file) { read_entries(file) })) +dfe15 <- dfe15 %>% + mutate(Partitioning=as.factor(Partitioning), + NP=as.factor(NP), + Alya=as.factor(Alya), + Nodes=as.factor(Nodes), + EID=as.factor(EID), + Platform=as.factor(Platform)) %>% + # Here ranks are identified starting from 1, so: + # we must reduce their values by 1 to match + # the trace identification + mutate(Rank = Rank - 1); +dfe15 +dfe15 %>% summary; +dfe15 %>% .$Rank %>% unique %>% sort +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + Rank = col_integer(), + ENTRIES = col_double() +) +Parsed with column specification: +cols( + Rank = col_integer(), + ENTRIES = col_double() +) +Parsed with column specification: +cols( + Rank = col_integer(), + ENTRIES = col_double() +) +# A tibble: 2,112 x 10 + Rank Variable Value EID Platform Nodes NP Partitioning Infiniband + + 1 0 ENTRIES 67584 15-v1 grisou 44 704 metis FALSE + 2 1 ENTRIES 73373 15-v1 grisou 44 704 metis FALSE + 3 2 ENTRIES 67590 15-v1 grisou 44 704 metis FALSE + 4 3 ENTRIES 67991 15-v1 grisou 44 704 metis FALSE + 5 4 ENTRIES 79214 15-v1 grisou 44 
704 metis FALSE + 6 5 ENTRIES 68705 15-v1 grisou 44 704 metis FALSE + 7 6 ENTRIES 67274 15-v1 grisou 44 704 metis FALSE + 8 7 ENTRIES 78186 15-v1 grisou 44 704 metis FALSE + 9 8 ENTRIES 69765 15-v1 grisou 44 704 metis FALSE +10 9 ENTRIES 66860 15-v1 grisou 44 704 metis FALSE +# ... with 2,102 more rows, and 1 more variables: Alya + Rank Variable Value EID Platform + Min. : 0.0 Length:2112 Min. : 36475 15-v1:2112 grisou:2112 + 1st Qu.:175.8 Class :character 1st Qu.: 65154 + Median :351.5 Mode :character Median : 68318 + Mean :351.5 Mean : 69842 + 3rd Qu.:527.2 3rd Qu.: 75119 + Max. :703.0 Max. :105113 + Nodes NP Partitioning Infiniband Alya + 44:2112 704:2112 metis: 704 Mode :logical Alya.x.modif:1408 + sfc :1408 FALSE:2112 Alya.x.orig : 704 + [1] 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 + [19] 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 + [37] 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 + [55] 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 + [73] 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 + [91] 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 +[109] 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 +[127] 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 +[145] 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 +[163] 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 +[181] 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 +[199] 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 +[217] 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 +[235] 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 +[253] 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 +[271] 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 +[289] 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 +[307] 306 307 
308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 +[325] 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 +[343] 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 +[361] 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 +[379] 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 +[397] 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 +[415] 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 +[433] 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 +[451] 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 +[469] 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 +[487] 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 +[505] 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 +[523] 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 +[541] 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 +[559] 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 +[577] 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 +[595] 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 +[613] 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 +[631] 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 +[649] 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 +[667] 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 +[685] 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 +[703] 702 703 +#+end_example + +*** Merge trace and per-rank non-temporal data +#+begin_src R :results output :session :exports both +library(tidyr); +dft15z <- dft15f %>% + # Get all ranks except rank 0 + filter(Rank != 0) %>% + # Create a new variable named by the 
concatenation of the value (kernel name) and iteration + mutate(Variable = Value) %>% ##paste0(Value, "-", Iteration)) %>% + # Calculate the duration (put in the Value column) & remove unnecessary columns + mutate(Value = End - Start) %>% select(-Start, -End) %>% + select(Rank, Variable, Value, Iteration, EID, Platform, Nodes, NP, Partitioning, Infiniband, Alya) %>% + # Since we have several calls to our kernels for each iteration, we need to sum up + group_by(Rank, Variable, Iteration, EID, Platform, Nodes, NP, Partitioning, Infiniband, Alya) %>% + summarize(N=n(), Value=sum(Value)) %>% + ungroup() %>% + # Impose an order + select(Rank, Variable, Value, Iteration, EID, Platform, Nodes, NP, Partitioning, Alya); +dfp15z <- dfp15 %>% + mutate(Iteration = NA) %>% + select(Rank, Variable, Value, Iteration, EID, Platform, Nodes, NP, Partitioning, Alya); +dfe15z <- dfe15 %>% + mutate(Iteration = NA) %>% + select(Rank, Variable, Value, Iteration, EID, Platform, Nodes, NP, Partitioning, Alya); +dfm15 <- bind_rows(dft15z, dfp15z, dfe15z); +dfm15 <- dfm15 %>% + mutate(Iteration = as.factor(Iteration), + NP = factor(NP), + Partitioning = factor(Partitioning), + Alya = factor(Alya), + Variable = factor(Variable)); +dfm15 %>% summary +dfm15 %>% .$Variable %>% unique +#+end_src + +#+RESULTS: +#+begin_example +Warning messages: +1: In bind_rows_(x, .id) : + binding factor and character vector, coercing into character vector +2: In bind_rows_(x, .id) : + binding character and factor vector, coercing into character vector + Rank Variable Value Iteration + Min. : 0 nsi_matrix:562400 Min. : 0.00 1 : 5624 + 1st Qu.:176 solver :562400 1st Qu.: 0.48 2 : 5624 + Median :352 NBBOU : 2812 Median : 1.89 3 : 5624 + Mean :352 NBOUN : 2812 Mean : 264.05 4 : 5624 + 3rd Qu.:528 NELEM : 2812 3rd Qu.: 3.70 5 : 5624 + Max. :703 NELEW : 2812 Max. 
:105113.00 (Other):1096680 + (Other) : 7736 NA's : 18984 + EID Platform Nodes NP Partitioning + 15-v1:1143784 grisou:1143784 44:1143784 704:1143784 metis:571540 + sfc :572244 + + + + + + Alya + Alya.x.modif:572244 + Alya.x.orig :571540 +[1] nsi_matrix solver NPOIN NELEM NELEW NBOUN NNEIG +[8] NBBOU ENTRIES +Levels: ENTRIES NBBOU NBOUN NELEM NELEW NNEIG NPOIN nsi_matrix solver +#+end_example + +*** Plot the correlation between computation time and NELEM, NELEW, ENTRIES +#+begin_src R :results output graphics :file img/exp_15_solver_partition_entries.png :exports both :width 800 :height 600 :session +library(cowplot); +dfm15 %>% + filter(Partitioning != "metis") %>% + filter(is.na(Iteration) | Iteration %in% c(50, 100, 150, 200)) %>% + filter((Variable %in% c("NELEM", "NELEW", "ENTRIES")) | grepl("solver", Variable)) -> dfm15sel; + +title = paste("Experiment:", + (dfm15sel %>% .$EID %>% unique), + (dfm15sel %>% .$Platform %>% unique), + (dfm15sel %>% .$Nodes %>% unique), + (dfm15sel %>% .$NP %>% unique), + (dfm15sel %>% .$Partitioning %>% unique), sep = " "); + +dfm15sel %>% + ggplot(aes(x=Rank, y=Value, color=Iteration)) + + theme_bw(base_size=16) + + geom_point(size=.3) + +# ylim(0,NA) + + theme(legend.position="top", + legend.direction="horizontal", + panel.margin = unit(.08, "lines"), + legend.spacing = unit(.0, "lines"), + legend.margin=margin(b = -.4, unit='cm'), + plot.margin = unit(x = c(0, 0, 0, 0), units = "mm")) + +# ggtitle (title) + + facet_grid(Variable~Alya, scales="free_y") -> p; +save_plot("img/exp_15_solver_partition_entries.pdf", p, base_aspect_ratio = 1.6); +p; +#+end_src + +#+RESULTS: +[[file:img/exp_15_solver_partition_entries.png]] + +*** Analysis with Arnaud +**** Solver time as a function of NELEW +#+begin_src R :results output graphics :file img/exp_15_solver_as_nelew.png :exports both :width 600 :height 400 :session +dfm15sel %>% + filter(Rank != 0) %>% + filter(Partitioning == "sfc", Iteration == 150 | is.na(Iteration)) %>% + 
select(-Iteration) %>% + spread(Variable, Value) %>% + ggplot(aes(x=NELEW, y=solver, color=Rank)) + geom_point() + facet_wrap(~Alya); +#+end_src + +#+RESULTS: +[[file:img/exp_15_solver_as_nelew.png]] +**** Solver time as a function of ENTRIES +#+begin_src R :results output graphics :file img/exp_15_solver_as_entries.png :exports both :width 600 :height 400 :session +dfm15sel %>% + filter(Rank != 0) %>% + filter(Partitioning == "sfc", Iteration == 100 | is.na(Iteration)) %>% + select(-Iteration) %>% + spread(Variable, Value) %>% + ggplot(aes(x=ENTRIES, y=solver, color=Rank)) + geom_point() + facet_wrap(~Alya); +#+end_src + +#+RESULTS: +[[file:img/exp_15_solver_as_entries.png]] + +**** How much time one iteration takes + +#+begin_src R :results output :session :exports both +dft15 %>% + group_by(EID, Platform, Nodes, NP, Infiniband, Alya, Partitioning) %>% + summarize(N=n(), Makespan = max(End) - min(Start), NBIter = max(Iteration)) %>% + as.data.frame() +#+end_src + +#+RESULTS: +#+begin_example + EID Platform Nodes NP Infiniband Alya Partitioning N +1 15-v1 grisou 44 704 FALSE Alya.x.modif metis 8645120 +2 15-v1 grisou 44 704 FALSE Alya.x.modif sfc 8645120 +3 15-v1 grisou 44 704 FALSE Alya.x.orig metis 8645120 +4 15-v1 grisou 44 704 FALSE Alya.x.orig sfc 8645120 + Makespan NBIter +1 721.8956 200 +2 1058.6879 200 +3 723.8556 200 +4 1135.4094 200 +#+end_example + +**** How many calls to "solver" per iteration + +#+begin_src R :results output :session :exports both +dft15 %>% + filter(Value == "solver") %>% + filter(Iteration == 150) %>% + filter(Partitioning == "sfc") %>% + group_by(Rank, EID, Platform, Nodes, NP, Infiniband, Alya, Partitioning) %>% + summarize(N=n()) %>% + filter(grepl("modif", Alya)) %>% + .$N %>% unique %>% sort +# summary +#+end_src + +#+RESULTS: +: [1] 15 + +*** Plot solver computation time across iterations +#+begin_src R :results output graphics :file img/exp_15_solver_as_iteration.png :exports both :width 1200 :height 400 :session 
+library(ggplot2); +dfm15 %>% + filter(Variable == "solver") %>% + ggplot(aes(x=as.integer(Iteration), y=Value, color=as.factor(Rank))) + + theme_bw(base_size=16) + + geom_point(alpha=.1, size=.3) + + ylim(0,NA) + + geom_line (aes(group=as.factor(Rank):as.factor(Partitioning))) + + theme(legend.position="none") + + facet_grid(NP~Partitioning); +#+end_src + +#+RESULTS: +[[file:img/exp_15_solver_as_iteration.png]] + +*** Hilbert Load Curve (@BSC) + +#+begin_src R :results output :session :exports both +dfm15 %>% + filter(Iteration == 130 | is.na(Iteration)) %>% + filter(Partitioning == "sfc", Alya == "Alya.x.modif") -> dfm15.sel; + +dfm15.sel %>% +# filter(Variable == "solver") %>% + filter(Iteration == 130) %>% + group_by(Rank, Iteration, Partitioning, EID, Platform, Nodes, NP) %>% + summarize(Computing = sum(Value)) %>% + ungroup() %>% + select(-Iteration) %>% + arrange(Rank) -> dfm15.time; + +dfm15.sel %>% + filter(is.na(Iteration)) %>% + select(-Iteration) %>% + arrange(Rank) %>% + filter(Variable == "NPOIN") -> dfm15.point; +#+end_src + +#+RESULTS: + +#+begin_src R :results output :session :exports both +dfm15.time +#+end_src + +#+RESULTS: +#+begin_example +# A tibble: 703 x 7 + Rank Partitioning EID Platform Nodes NP Computing + + 1 1 sfc 15-v1 grisou 44 704 4.348808 + 2 2 sfc 15-v1 grisou 44 704 4.342806 + 3 3 sfc 15-v1 grisou 44 704 4.338238 + 4 4 sfc 15-v1 grisou 44 704 4.349558 + 5 5 sfc 15-v1 grisou 44 704 4.348942 + 6 6 sfc 15-v1 grisou 44 704 4.342368 + 7 7 sfc 15-v1 grisou 44 704 4.343007 + 8 8 sfc 15-v1 grisou 44 704 4.349633 + 9 9 sfc 15-v1 grisou 44 704 4.349449 +10 10 sfc 15-v1 grisou 44 704 4.342640 +# ... 
with 693 more rows +#+end_example + +#+begin_src R :results output :session :exports both +dfm15.time %>% + left_join(dfm15.point %>% rename(Npoint = Value)) %>% + select(-Partitioning, -EID, -Platform, -Alya, -Variable) -> dfm15.2; +dfm15.2; +#+end_src + +#+RESULTS: +#+begin_example +Joining, by = c("Rank", "Partitioning", "EID", "Platform", "Nodes", "NP") +# A tibble: 703 x 5 + Rank Nodes NP Computing Npoint + + 1 1 44 704 4.348808 3215 + 2 2 44 704 4.342806 3068 + 3 3 44 704 4.338238 3060 + 4 4 44 704 4.349558 3265 + 5 5 44 704 4.348942 3275 + 6 6 44 704 4.342368 3024 + 7 7 44 704 4.343007 3018 + 8 8 44 704 4.349633 2994 + 9 9 44 704 4.349449 3003 +10 10 44 704 4.342640 3093 +# ... with 693 more rows +#+end_example + +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session +m1=.85 +m2=m1+.05 +dfm15.2 %>% + arrange(Rank) %>% + mutate(Computing.2 = cumsum(Computing)/sum(Computing), + Npoint.2 = cumsum(Npoint)/sum(Npoint)) %>% + ggplot(aes(x=Npoint.2, y=Computing.2)) + geom_point(size=2) + geom_line() + + ylim(m1,m2) + xlim(m1,m2) + + geom_hline(yintercept=(0:703)/703) # + geom_vline(xintercept=(0:703)/703) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-24763lPR/figure24763_IF.png]] + +#+begin_src R :results output :session :exports both +dfm15.sel %>% pull(Variable) %>% unique +#summary(dfm15.sel) %>% +#+end_src + +#+RESULTS: +: [1] nsi_matrix solver NPOIN NELEM NELEW NBOUN NNEIG +: [8] NBBOU ENTRIES +: Levels: ENTRIES NBBOU NBOUN NELEM NELEW NNEIG NPOIN nsi_matrix solver + +*** Gantt-chart +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 1700 :height 400 :session +dft15 %>% + filter(End-Start > 0.1) %>% + filter(Partitioning == "sfc", Alya == "Alya.x.modif") %>% + filter(Iteration %in% c(7)) %>%#, 151, 152, 153)) %>% + filter(End < 792) -> tx; +tx %>% + ggplot() + + geom_rect(aes(fill=Value, + xmin=Start, + xmax=End, + ymin=Rank, + 
ymax=Rank+0.9)); +#+end_src + +#+RESULTS: +[[file:/tmp/babel-24763lPR/figure24763cZy.png]] + +#+begin_src R :results output :session :exports both +tx %>% group_by(Rank, Value, Iteration) %>% summarize(N=n()) +#+end_src + +#+RESULTS: +#+begin_example +# A tibble: 2,111 x 4 +# Groups: Rank, Value [?] + Rank Value Iteration N + + 1 0 nsi_solsgs 7 2 + 2 0 solver 7 20 + 3 1 nsi_matrix 7 5 + 4 1 nsi_solsgs 7 2 + 5 1 solver 7 20 + 6 2 nsi_matrix 7 5 + 7 2 nsi_solsgs 7 2 + 8 2 solver 7 20 + 9 3 nsi_matrix 7 5 +10 3 nsi_solsgs 7 2 +# ... with 2,101 more rows +#+end_example + +*** Gantt-chart replay + +#+begin_src R :results output :session :exports both +dft15 %>% pull(Value) %>% unique +#+end_src + +#+RESULTS: +: [1] timste nsi_updunk nsi_updbcs nsi_begste nastin nsi_inisol +: [7] nsi_ifconf nsi_solsgs nsi_matrix solver nsi_solite nsi_doiter +: [13] doiter nsi_concou nsi_endste +: 21 Levels: doiter endste nastin nsi_begste nsi_concou nsi_doiter ... timste + +#+begin_src R :results output graphics :file img/exp_15_gantt_chart_Alya.x.modif_iteration_7.png :exports both :width 800 :height 400 :session +library(ggplot2); +dft15 %>% + filter(End-Start > 0.1) %>% + filter(Partitioning == "sfc", Alya == "Alya.x.modif") %>% + filter(Iteration %in% c(7)) %>%#, 151, 152, 153)) %>% + filter(End < 792) -> tx; +tx %>% + ggplot() + + geom_rect(aes(fill=Value, + xmin=Start, + xmax=End, + ymin=Rank, + ymax=Rank+0.9)); +#+end_src + +#+RESULTS: +[[file:/tmp/babel-10163Wik/figure10163A6o.png]] + +** 8-node grimoire :EXP16: +*** List files +#+begin_src shell :results output +find exp_16-* | grep otf2$ +#+end_src + +#+RESULTS: +: exp_16-v1_grimoire_8/16-v1_grimoire_8_128_metis_false_Alya.x.orig.dir/scorep_16-v1_grimoire_8_128_metis_false_Alya.x.orig/traces.otf2 +: exp_16-v1_grimoire_8/16-v1_grimoire_8_128_metis_true_Alya.x.modif.dir/scorep_16-v1_grimoire_8_128_metis_true_Alya.x.modif/traces.otf2 +: 
exp_16-v1_grimoire_8/16-v1_grimoire_8_128_sfc_true_Alya.x.modif.dir/scorep_16-v1_grimoire_8_128_sfc_true_Alya.x.modif/traces.otf2 +: exp_16-v1_grimoire_8/16-v1_grimoire_8_128_metis_true_Alya.x.orig.dir/scorep_16-v1_grimoire_8_128_metis_true_Alya.x.orig/traces.otf2 +: exp_16-v1_grimoire_8/16-v1_grimoire_8_128_sfc_true_Alya.x.orig.dir/scorep_16-v1_grimoire_8_128_sfc_true_Alya.x.orig/traces.otf2 +*** Post-processing (convert to CSV) +#+begin_src shell :results output +export PATH=$PATH:~/dev/akypuera3/b/ +BASEDIR=$(pwd) +for directory in $(find exp_16-* | grep otf2$); do + ./scripts/otf22csv_faster.sh $(dirname $directory) +done +#+end_src + +#+RESULTS: +#+begin_example +/home/schnorr/dev/Alya-Perf/exp_16-v1_grimoire_8/16-v1_grimoire_8_128_metis_false_Alya.x.orig.dir/scorep_16-v1_grimoire_8_128_metis_false_Alya.x.orig/traces.otf2 +~/dev/Alya-Perf/exp_16-v1_grimoire_8/16-v1_grimoire_8_128_metis_false_Alya.x.orig.dir/scorep_16-v1_grimoire_8_128_metis_false_Alya.x.orig ~/dev/Alya-Perf +~/dev/Alya-Perf +/home/schnorr/dev/Alya-Perf/exp_16-v1_grimoire_8/16-v1_grimoire_8_128_metis_true_Alya.x.modif.dir/scorep_16-v1_grimoire_8_128_metis_true_Alya.x.modif/traces.otf2 +~/dev/Alya-Perf/exp_16-v1_grimoire_8/16-v1_grimoire_8_128_metis_true_Alya.x.modif.dir/scorep_16-v1_grimoire_8_128_metis_true_Alya.x.modif ~/dev/Alya-Perf +~/dev/Alya-Perf +/home/schnorr/dev/Alya-Perf/exp_16-v1_grimoire_8/16-v1_grimoire_8_128_sfc_true_Alya.x.modif.dir/scorep_16-v1_grimoire_8_128_sfc_true_Alya.x.modif/traces.otf2 +~/dev/Alya-Perf/exp_16-v1_grimoire_8/16-v1_grimoire_8_128_sfc_true_Alya.x.modif.dir/scorep_16-v1_grimoire_8_128_sfc_true_Alya.x.modif ~/dev/Alya-Perf +~/dev/Alya-Perf +/home/schnorr/dev/Alya-Perf/exp_16-v1_grimoire_8/16-v1_grimoire_8_128_metis_true_Alya.x.orig.dir/scorep_16-v1_grimoire_8_128_metis_true_Alya.x.orig/traces.otf2 +~/dev/Alya-Perf/exp_16-v1_grimoire_8/16-v1_grimoire_8_128_metis_true_Alya.x.orig.dir/scorep_16-v1_grimoire_8_128_metis_true_Alya.x.orig ~/dev/Alya-Perf 
+~/dev/Alya-Perf +/home/schnorr/dev/Alya-Perf/exp_16-v1_grimoire_8/16-v1_grimoire_8_128_sfc_true_Alya.x.orig.dir/scorep_16-v1_grimoire_8_128_sfc_true_Alya.x.orig/traces.otf2 +~/dev/Alya-Perf/exp_16-v1_grimoire_8/16-v1_grimoire_8_128_sfc_true_Alya.x.orig.dir/scorep_16-v1_grimoire_8_128_sfc_true_Alya.x.orig ~/dev/Alya-Perf +~/dev/Alya-Perf +#+end_example + +#+name: exp_16_traces +#+begin_src shell :results output +find exp_16-* | grep traces.csv.gz$ +#+end_src + +#+RESULTS: exp_16_traces +: exp_16-v1_grimoire_8/16-v1_grimoire_8_128_metis_false_Alya.x.orig.dir/scorep_16-v1_grimoire_8_128_metis_false_Alya.x.orig/traces.csv.gz +: exp_16-v1_grimoire_8/16-v1_grimoire_8_128_metis_true_Alya.x.modif.dir/scorep_16-v1_grimoire_8_128_metis_true_Alya.x.modif/traces.csv.gz +: exp_16-v1_grimoire_8/16-v1_grimoire_8_128_sfc_true_Alya.x.modif.dir/scorep_16-v1_grimoire_8_128_sfc_true_Alya.x.modif/traces.csv.gz +: exp_16-v1_grimoire_8/16-v1_grimoire_8_128_metis_true_Alya.x.orig.dir/scorep_16-v1_grimoire_8_128_metis_true_Alya.x.orig/traces.csv.gz +: exp_16-v1_grimoire_8/16-v1_grimoire_8_128_sfc_true_Alya.x.orig.dir/scorep_16-v1_grimoire_8_128_sfc_true_Alya.x.orig/traces.csv.gz + +*** 1 Trace: Read computation states in R to =dft16= +**** Read +#+name: read_exp_16_traces +#+begin_src R :results output :session :exports both :var files=exp_16_traces +library(readr); +library(dplyr); +alya_scorep_trace_read <- function(filename, basedir = ".") +{ + meta <- unlist(strsplit(unlist(strsplit(filename, "/"))[3], "_")); + fullpath <- paste0(basedir, "/", filename); + read_delim(fullpath, + delim=" ", + col_names=c("Rank", "Start", "End", "Value"), + progress=FALSE) %>% + # Transform Value to factor + mutate(Value = as.factor(Value)) %>% + # Detect begin and end of iterations + mutate(Iteration = case_when( + (.$Value == "timste") ~ 1, + (.$Value == "endste") ~ -1, + TRUE ~ 0)) %>% + group_by(Rank) %>% + # Create a logical to detect observations within iterations + mutate(Iteration = 
as.logical(cumsum(Iteration))) %>% + # Get only observations that belongs to some iteration + filter(Iteration == TRUE) %>% + ungroup() %>% + # Create the iteration by cumsum + mutate(Iteration = case_when( + (.$Value == "timste") ~ 1, + TRUE ~ 0)) %>% + group_by(Rank) %>% + mutate(Iteration = cumsum(Iteration)) %>% + mutate(EID = as.factor(meta[2]), + Platform = as.factor(meta[3]), + Nodes = as.factor(meta[4]), + NP = as.factor(meta[5]), + Partitioning = as.factor(meta[6]), + Infiniband = as.logical(meta[7]), + Alya = as.factor(meta[8])); +} +# Clean-up the files variable +files <- strsplit(files, "\n")[[1]] +# Do the reading, bind_rows everything +dft16 <- do.call("bind_rows", lapply(files, function(file) { alya_scorep_trace_read(file, basedir=".") } ));#ata/07/17e0e2-99b8-44c7-acfc-4d4ad685a2c5") })); +# Fix the columns factor +dft16 <- dft16 %>% mutate(Partitioning = as.factor(Partitioning), Alya = as.factor(Alya)); +dft16 %>% summary; +#+end_src + +#+RESULTS: read_exp_16_traces +#+begin_example +Parsed with column specification: +cols( + Rank = col_integer(), + Start = col_double(), + End = col_double(), + Value = col_character() +) +Parsed with column specification: +cols( + Rank = col_integer(), + Start = col_double(), + End = col_double(), + Value = col_character() +) +Parsed with column specification: +cols( + Rank = col_integer(), + Start = col_double(), + End = col_double(), + Value = col_character() +) +Parsed with column specification: +cols( + Rank = col_integer(), + Start = col_double(), + End = col_double(), + Value = col_character() +) +Parsed with column specification: +cols( + Rank = col_integer(), + Start = col_double(), + End = col_double(), + Value = col_character() +) +Warning messages: +1: In bind_rows_(x, .id) : Unequal factor levels: coercing to character +2: In bind_rows_(x, .id) : Unequal factor levels: coercing to character + Rank Start End Value + Min. : 0.00 Min. : 0.243 Min. 
: 0.243 solver :1984000 + 1st Qu.: 31.75 1st Qu.: 629.751 1st Qu.: 629.751 nsi_updunk:1433600 + Median : 63.50 Median :1206.153 Median :1206.207 nsi_solsgs: 793600 + Mean : 63.50 Mean :1208.624 Mean :1208.801 nsi_updbcs: 652800 + 3rd Qu.: 95.25 3rd Qu.:1772.050 3rd Qu.:1772.199 nsi_inisol: 524800 + Max. :127.00 Max. :2529.016 Max. :2529.016 nastin : 512000 + (Other) :1958400 + Iteration EID Platform Nodes NP + Min. : 1.00 16-v1:7859200 grimoire:7859200 8:7859200 128:7859200 + 1st Qu.: 47.00 + Median : 98.00 + Mean : 98.34 + 3rd Qu.:149.00 + Max. :200.00 + + Partitioning Infiniband Alya + metis:4715520 Mode :logical Alya.x.modif:3143680 + sfc :3143680 FALSE:1571840 Alya.x.orig :4715520 + TRUE :6287360 + NA's :0 +#+end_example +**** Filter to get only the computation states we are interested in +#+begin_src R :results output :session :exports both +dft16f <- dft16 %>% filter(Value %in% c("nsi_matrix", "solver")); +dft16f +#+end_src + +#+RESULTS: +#+begin_example +Source: local data frame [2,380,800 x 12] +Groups: Rank [128] + + Rank Start End Value Iteration EID Platform Nodes NP + +1 0 0.429466 0.429481 nsi_matrix 1 16-v1 grimoire 8 128 +2 75 0.430559 1.132667 nsi_matrix 1 16-v1 grimoire 8 128 +3 113 0.430731 1.190444 nsi_matrix 1 16-v1 grimoire 8 128 +4 112 0.430729 1.204946 nsi_matrix 1 16-v1 grimoire 8 128 +5 94 0.430749 1.206544 nsi_matrix 1 16-v1 grimoire 8 128 +6 120 0.430872 1.208798 nsi_matrix 1 16-v1 grimoire 8 128 +7 117 0.430724 1.210292 nsi_matrix 1 16-v1 grimoire 8 128 +8 127 0.430743 1.210979 nsi_matrix 1 16-v1 grimoire 8 128 +9 118 0.430760 1.217279 nsi_matrix 1 16-v1 grimoire 8 128 +10 90 0.430767 1.219594 nsi_matrix 1 16-v1 grimoire 8 128 +# ... with 2,380,790 more rows, and 3 more variables: Partitioning , +# Infiniband , Alya +#+end_example +*** 2 Per-rank number of elements, points, etc (process and read) +The files are obtained with manual instrumentation of Alya code. 
+**** Process log files +#+begin_src shell :results output +EDIR=exp_16-v1_grimoire_8 +for file in $(find $EDIR | grep results | grep log$); do + OUTPUT=$(dirname $file)/$(basename $file .log).csv + cat $file | tr -s " " | sed "s/^ //" | tr ' ' ',' | cut -d"," -f1,3,5,7,9,11,13 | uniq > $OUTPUT + cat $file | tr -s " " | sed "s/^ //" | tr ' ' ',' | cut -d"," -f2,4,6,8,10,12,14 >> $OUTPUT + head $OUTPUT + tail $OUTPUT + echo +done +#+end_src + +#+RESULTS: +#+begin_example +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,22405,85371,172276,3477,4,22405 +2,22923,86049,179769,3742,5,22923 +3,24316,99541,175416,3029,5,24316 +4,22907,90905,170125,3170,4,22907 +5,22636,86964,173194,3448,5,22636 +6,22774,88416,171711,3333,6,22774 +7,23564,91625,177865,3452,7,23564 +8,22211,83394,173364,3612,5,22211 +9,23427,89980,179685,3588,4,23427 +118,19344,51553,177808,5044,6,19344 +119,19050,54006,174756,4829,3,19050 +120,18742,53558,169698,4652,6,18742 +121,19158,51974,179487,5103,4,19158 +122,19820,53397,179497,5028,8,19820 +123,19357,55432,172417,4679,6,19357 +124,18855,51717,176727,5006,3,18855 +125,19473,52093,179803,5107,6,19473 +126,18948,49074,179769,5225,4,18948 +127,19037,50759,172104,4863,6,19037 + +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,22405,85371,172276,3477,4,22405 +2,22923,86049,179769,3742,5,22923 +3,24316,99541,175416,3029,5,24316 +4,22907,90905,170125,3170,4,22907 +5,22636,86964,173194,3448,5,22636 +6,22774,88416,171711,3333,6,22774 +7,23564,91625,177865,3452,7,23564 +8,22211,83394,173364,3612,5,22211 +9,23427,89980,179685,3588,4,23427 +118,19344,51553,177808,5044,6,19344 +119,19050,54006,174756,4829,3,19050 +120,18742,53558,169698,4652,6,18742 +121,19158,51974,179487,5103,4,19158 +122,19820,53397,179497,5028,8,19820 +123,19357,55432,172417,4679,6,19357 +124,18855,51717,176727,5006,3,18855 +125,19473,52093,179803,5107,6,19473 +126,18948,49074,179769,5225,4,18948 +127,19037,50759,172104,4863,6,19037 + +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU 
+1,14976,72536,72796,2384,17,14976 +2,14641,72392,72575,448,12,14641 +3,22264,72647,167221,4446,16,22264 +4,28129,72464,251684,7187,11,28129 +5,23239,72488,178813,4341,25,23239 +6,26093,72525,209615,5640,14,26093 +7,22996,72391,183401,4422,10,22996 +8,21190,72393,160643,3505,7,21190 +9,25178,72612,201462,5195,10,25178 +118,24285,72483,188018,6181,14,24285 +119,26512,72604,229879,6447,15,26512 +120,24865,72374,200264,5134,10,24865 +121,22912,72446,173477,4266,15,22912 +122,14466,72460,73072,776,11,14466 +123,14046,72589,72589,944,8,14046 +124,15427,72509,73961,2155,15,15427 +125,21776,72610,153930,4907,13,21776 +126,21662,72595,161294,4243,12,21662 +127,18484,72603,111590,4500,17,18484 + +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,22405,85371,172276,3477,4,22405 +2,22923,86049,179769,3742,5,22923 +3,24316,99541,175416,3029,5,24316 +4,22907,90905,170125,3170,4,22907 +5,22636,86964,173194,3448,5,22636 +6,22774,88416,171711,3333,6,22774 +7,23564,91625,177865,3452,7,23564 +8,22211,83394,173364,3612,5,22211 +9,23427,89980,179685,3588,4,23427 +118,19344,51553,177808,5044,6,19344 +119,19050,54006,174756,4829,3,19050 +120,18742,53558,169698,4652,6,18742 +121,19158,51974,179487,5103,4,19158 +122,19820,53397,179497,5028,8,19820 +123,19357,55432,172417,4679,6,19357 +124,18855,51717,176727,5006,3,18855 +125,19473,52093,179803,5107,6,19473 +126,18948,49074,179769,5225,4,18948 +127,19037,50759,172104,4863,6,19037 + +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,32182,174168,174611,2973,16,32182 +2,20919,53529,174758,5381,15,20919 +3,19727,49972,175172,5020,11,19727 +4,21918,69049,174054,4277,23,21918 +5,21683,64345,174280,4439,14,21683 +6,22205,64122,175002,4582,18,22205 +7,21361,63432,174587,4436,11,21361 +8,22162,75699,174459,3914,8,22162 +9,21389,66339,174799,4332,8,21389 +118,19736,47749,174249,5084,13,19736 +119,27122,103080,174535,7852,12,27122 +120,21785,59025,175145,5304,13,21785 +121,20722,56163,175018,4869,12,20722 +122,21269,59300,174615,4657,10,21269 
+123,22403,70519,174270,4135,13,22403 +124,30100,145205,174682,2874,9,30100 +125,31603,142203,174120,5271,15,31603 +126,22843,73894,175386,4418,11,22843 +127,26719,97390,174439,6359,13,26719 + +#+end_example +**** Read in R +#+begin_src R :results output :session :exports both +library(readr); +library(dplyr); +library(tidyr); + +read_npoin <- function(filename) +{ + meta <- unlist(strsplit(unlist(strsplit(filename, "/"))[2], "_")); + meta <- gsub(".dir", "", meta); + read_csv(filename) %>% + gather(Variable, Value, -Rank) %>% + mutate(EID = meta[1], + Platform = meta[2], + Nodes = meta[3], + NP = meta[4], + Partitioning = meta[5], + Infiniband = as.logical(meta[6]), + Alya = meta[7]); + +} +files <- list.files("exp_16-v1_grimoire_8", pattern="results_NPOIN_NELEM_NELEW_NBOUN.csv", recursive=TRUE, full.names=TRUE); +dfp16 <- do.call("bind_rows", lapply(files, function(file) { read_npoin(file) })) +dfp16 <- dfp16 %>% + mutate(Partitioning=as.factor(Partitioning), + NP=as.factor(NP), + Alya=as.factor(Alya), + Nodes=as.factor(Nodes), + EID=as.factor(EID), + Platform=as.factor(Platform)) +dfp16 +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + 
NBBOU = col_integer()
+)
+# A tibble: 3,810 × 10
+   Rank Variable Value   EID Platform Nodes    NP Partitioning Infiniband
+
+1     1    NPOIN 22405 16-v1 grimoire     8   128        metis      FALSE
+2     2    NPOIN 22923 16-v1 grimoire     8   128        metis      FALSE
+3     3    NPOIN 24316 16-v1 grimoire     8   128        metis      FALSE
+4     4    NPOIN 22907 16-v1 grimoire     8   128        metis      FALSE
+5     5    NPOIN 22636 16-v1 grimoire     8   128        metis      FALSE
+6     6    NPOIN 22774 16-v1 grimoire     8   128        metis      FALSE
+7     7    NPOIN 23564 16-v1 grimoire     8   128        metis      FALSE
+8     8    NPOIN 22211 16-v1 grimoire     8   128        metis      FALSE
+9     9    NPOIN 23427 16-v1 grimoire     8   128        metis      FALSE
+10   10    NPOIN 22871 16-v1 grimoire     8   128        metis      FALSE
+# ... with 3,800 more rows, and 1 more variables: Alya
+#+end_example
+*** 3 Per-rank number of graph entries
+The data is produced by recent changes in Alya; it is stored in the =partition.par.post.res= file.
+**** Process log files
+To process such files, I need to extract all entries in the section
+identified by =NUMBER_NODE_GRAPH_ENTRIES=.
To do so, +#+begin_src shell :results output +EDIR=exp_16-v1_grimoire_8 +for file in $(find $EDIR | grep partition.par.post.res$); do + OUTPUT=$(dirname $file)/$(basename $file .log)-entries.csv + NP=$(echo $file | cut -d"/" -f2 | cut -d"_" -f4) + START_LINE=$(cat -n $file | grep NUMBER_NODE_GRAPH_ENTRIES | grep ComponentNames | tr '\t' ' ' | sed "s/^[[:space:]]*//" | cut -d" " -f1) + START_LINE=$(($START_LINE + 2)) + END_LINE=$(($START_LINE + $NP - 1)) + rm -f $OUTPUT + echo "Rank,ENTRIES" >> $OUTPUT + sed -n "${START_LINE},${END_LINE}p" $file | tr -s ' ' | sed "s/^ //" | tr ' ' ',' >> $OUTPUT + head $OUTPUT + tail $OUTPUT + echo +done +#+end_src + +#+RESULTS: +#+begin_example +Rank,ENTRIES +1,0.360783E+06 +2,0.369863E+06 +3,0.381955E+06 +4,0.389698E+06 +5,0.371953E+06 +6,0.372738E+06 +7,0.372212E+06 +8,0.385382E+06 +9,0.369259E+06 +119,0.348536E+06 +120,0.342990E+06 +121,0.335094E+06 +122,0.348674E+06 +123,0.354342E+06 +124,0.343209E+06 +125,0.343299E+06 +126,0.351681E+06 +127,0.346950E+06 +128,0.339977E+06 + +Rank,ENTRIES +1,0.360783E+06 +2,0.369863E+06 +3,0.381955E+06 +4,0.389698E+06 +5,0.371953E+06 +6,0.372738E+06 +7,0.372212E+06 +8,0.385382E+06 +9,0.369259E+06 +119,0.348536E+06 +120,0.342990E+06 +121,0.335094E+06 +122,0.348674E+06 +123,0.354342E+06 +124,0.343209E+06 +125,0.343299E+06 +126,0.351681E+06 +127,0.346950E+06 +128,0.339977E+06 + +Rank,ENTRIES +1,0.372646E+06 +2,0.201132E+06 +3,0.198459E+06 +4,0.360684E+06 +5,0.499009E+06 +6,0.381273E+06 +7,0.436721E+06 +8,0.384670E+06 +9,0.346130E+06 +119,0.399209E+06 +120,0.462622E+06 +121,0.417481E+06 +122,0.372852E+06 +123,0.197842E+06 +124,0.194630E+06 +125,0.205185E+06 +126,0.342284E+06 +127,0.349014E+06 +128,0.269424E+06 + +Rank,ENTRIES +1,0.675840E+05 +2,0.733730E+05 +3,0.675900E+05 +4,0.679910E+05 +5,0.792140E+05 +6,0.687050E+05 +7,0.672740E+05 +8,0.781860E+05 +9,0.697650E+05 +119,0.665170E+05 +120,0.660530E+05 +121,0.680130E+05 +122,0.693500E+05 +123,0.704970E+05 +124,0.677320E+05 +125,0.659370E+05 
+126,0.718380E+05 +127,0.664600E+05 +128,0.671620E+05 + +Rank,ENTRIES +1,0.371651E+06 +2,0.456808E+06 +3,0.356423E+06 +4,0.348377E+06 +5,0.365696E+06 +6,0.363469E+06 +7,0.368031E+06 +8,0.361453E+06 +9,0.369350E+06 +119,0.346784E+06 +120,0.408946E+06 +121,0.364677E+06 +122,0.356768E+06 +123,0.360449E+06 +124,0.370175E+06 +125,0.437280E+06 +126,0.446339E+06 +127,0.374283E+06 +128,0.403337E+06 + +#+end_example + +**** Read in R +#+begin_src R :results output :session :exports both +library(readr); +library(dplyr); +library(tidyr); + +read_entries <- function(filename) +{ + meta <- unlist(strsplit(unlist(strsplit(filename, "/"))[2], "_")); + meta <- gsub(".dir", "", meta); + read_csv(filename) %>% + gather(Variable, Value, -Rank) %>% + mutate(EID = meta[1], + Platform = meta[2], + Nodes = meta[3], + NP = meta[4], + Partitioning = meta[5], + Infiniband = as.logical(meta[6]), + Alya = meta[7]); + +} +files <- list.files("exp_16-v1_grimoire_8", pattern="fensap-partition.par.post.res-entries.csv", recursive=TRUE, full.names=TRUE); +dfe16 <- do.call("bind_rows", lapply(files, function(file) { read_entries(file) })) +dfe16 <- dfe16 %>% + mutate(Partitioning=as.factor(Partitioning), + NP=as.factor(NP), + Alya=as.factor(Alya), + Nodes=as.factor(Nodes), + EID=as.factor(EID), + Platform=as.factor(Platform)) %>% + # Here ranks are identified starting from 1, so: + # we must reduce their values by 1 to match + # the trace identification + mutate(Rank = Rank - 1) +dfe16 +dfe16 %>% summary +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + Rank = col_integer(), + ENTRIES = col_double() +) +Parsed with column specification: +cols( + Rank = col_integer(), + ENTRIES = col_double() +) +Parsed with column specification: +cols( + Rank = col_integer(), + ENTRIES = col_double() +) +Parsed with column specification: +cols( + Rank = col_integer(), + ENTRIES = col_double() +) +Parsed with column specification: +cols( + Rank = col_integer(), + ENTRIES = 
col_double()
+)
+# A tibble: 640 × 10
+   Rank Variable  Value   EID Platform Nodes    NP Partitioning Infiniband
+
+1     0  ENTRIES 360783 16-v1 grimoire     8   128        metis      FALSE
+2     1  ENTRIES 369863 16-v1 grimoire     8   128        metis      FALSE
+3     2  ENTRIES 381955 16-v1 grimoire     8   128        metis      FALSE
+4     3  ENTRIES 389698 16-v1 grimoire     8   128        metis      FALSE
+5     4  ENTRIES 371953 16-v1 grimoire     8   128        metis      FALSE
+6     5  ENTRIES 372738 16-v1 grimoire     8   128        metis      FALSE
+7     6  ENTRIES 372212 16-v1 grimoire     8   128        metis      FALSE
+8     7  ENTRIES 385382 16-v1 grimoire     8   128        metis      FALSE
+9     8  ENTRIES 369259 16-v1 grimoire     8   128        metis      FALSE
+10    9  ENTRIES 386269 16-v1 grimoire     8   128        metis      FALSE
+# ... with 630 more rows, and 1 more variables: Alya
+      Rank         Variable             Value           EID
+ Min.   :  0.00   Length:640         Min.   : 60096   16-v1:640
+ 1st Qu.: 31.75   Class :character   1st Qu.:311744
+ Median : 63.50   Mode  :character   Median :355139
+ Mean   : 63.50                      Mean   :306673
+ 3rd Qu.: 95.25                      3rd Qu.:374714
+ Max.   :127.00                      Max.   :524192
+      Platform   Nodes     NP      Partitioning  Infiniband
+ grimoire:640   8:640   128:640   metis:384     Mode :logical
+                                  sfc  :256     FALSE:128
+                                                TRUE :512
+                                                NA's :0
+
+
+           Alya
+ Alya.x.modif:256
+ Alya.x.orig :384
+#+end_example
+
+*** Merge trace and per-rank non-temporal data
+#+begin_src R :results output :session :exports both
+library(tidyr);
+dft16z <- dft16f %>%
+    # Get all ranks except rank 0
+    filter(Rank != 0) %>%
+    # Create a new variable named by the concatenation of the value (kernel name) and iteration
+    mutate(Variable = Value) %>% ##paste0(Value, "-", Iteration)) %>%
+    # Calculate the duration (put in the Value column) & remove unnecessary columns
+    mutate(Value = End - Start) %>% select(-Start, -End) %>%
+    select(Rank, Variable, Value, Iteration, EID, Platform, Nodes, NP, Partitioning, Infiniband, Alya) %>%
+    # Since we have several calls to our kernels for each iteration, we need to sum up
+    group_by(Rank, Variable, Iteration, EID, Platform, Nodes, NP, Partitioning, Infiniband, Alya) %>%
+    summarize(N=n(), Value=sum(Value)) %>%
+    ungroup() 
%>% + # Impose an order + select(Rank, Variable, Value, Iteration, EID, Platform, Nodes, NP, Partitioning, Infiniband, Alya); +dfp16z <- dfp16 %>% + mutate(Iteration = NA) %>% + select(Rank, Variable, Value, Iteration, EID, Platform, Nodes, NP, Partitioning, Infiniband, Alya); +dfe16z <- dfe16 %>% + mutate(Iteration = NA) %>% + select(Rank, Variable, Value, Iteration, EID, Platform, Nodes, NP, Partitioning, Infiniband, Alya); +dfm16 <- bind_rows(dft16z, dfp16z, dfe16z); +dfm16 <- dfm16 %>% + mutate(Iteration = as.factor(Iteration), + NP = factor(NP), + Partitioning = factor(Partitioning), + Alya = factor(Alya), + Variable = factor(Variable)); +dfm16 %>% summary +dfm16 %>% .$Variable %>% unique +#+end_src + +#+RESULTS: +#+begin_example +Warning message: +In bind_rows_(x, .id) : + binding factor and character vector, coercing into character vector + Rank Variable Value Iteration + Min. : 0 nsi_matrix:127000 Min. : 1.0 1 : 1270 + 1st Qu.: 32 solver :127000 1st Qu.: 2.7 2 : 1270 + Median : 64 ENTRIES : 640 Median : 4.0 3 : 1270 + Mean : 64 NBBOU : 635 Mean : 1488.2 4 : 1270 + 3rd Qu.: 96 NBOUN : 635 3rd Qu.: 5.8 5 : 1270 + Max. :127 NELEM : 635 Max. 
:524192.0   (Other):247650
+            (Other)   :  1905                     NA's   :  4450
+      EID            Platform     Nodes          NP         Partitioning
+ 16-v1:258450   grimoire:258450   8:258450   128:258450   metis:155070
+                                                          sfc  :103380
+
+
+
+
+
+ Infiniband        Alya
+ Mode :logical   Alya.x.modif:103380
+ FALSE:51690     Alya.x.orig :155070
+ TRUE :206760
+ NA's :0
+[1] nsi_matrix solver     NPOIN      NELEM      NELEW      NBOUN      NNEIG
+[8] NBBOU      ENTRIES
+Levels: ENTRIES NBBOU NBOUN NELEM NELEW NNEIG NPOIN nsi_matrix solver
+#+end_example
+
+*** Plot the correlation between computation time and NELEM, NELEW, ENTRIES
+#+begin_src R :results output graphics :file img/exp_16_solver_partition_entries.png :exports both :width 800 :height 600 :session
+library(cowplot);
+dfm16 %>%
+    filter(Partitioning == "sfc") %>%
+    filter(Infiniband == TRUE) %>%
+    filter(is.na(Iteration) | Iteration %in% c(50, 100, 150, 200)) %>%
+    filter((Variable %in% c("NELEM", "NELEW", "ENTRIES")) | grepl("solver", Variable)) -> dfm16sel;
+
+title = paste("Experiment:",
+              (dfm16sel %>% .$EID %>% unique),
+              (dfm16sel %>% .$Platform %>% unique),
+              (dfm16sel %>% .$Nodes %>% unique),
+              (dfm16sel %>% .$NP %>% unique),
+              (dfm16sel %>% .$Partitioning %>% unique),
+              (dfm16sel %>% .$Infiniband %>% unique),
+              sep = " ");
+
+dfm16sel %>%
+    ggplot(aes(x=Rank, y=Value, color=Iteration)) +
+    theme_bw(base_size=14) +
+    geom_point() +
+    ylim(0,NA) +
+    theme(legend.position="top",
+          legend.direction="horizontal",
+          panel.margin = unit(.08, "lines"),
+          legend.spacing = unit(.0, "lines"),
+          legend.margin=margin(b = -.3, unit='cm'),
+          plot.margin = unit(x = c(0, 0, 0, 0), units = "mm")) +
+    ggtitle (title) +
+    facet_grid(Variable~Alya, scales="free_y") -> p;
+save_plot("img/exp_16_solver_partition_entries.pdf", p, base_aspect_ratio = 3.5);
+p;
+#+end_src
+
+#+RESULTS:
+[[file:img/exp_16_solver_partition_entries.png]]
+*** Check doiter to verify iteration duration
+#+begin_src R :results output :session :exports both
+dft16 %>%
+    group_by(Iteration, Partitioning, Infiniband, Alya, Platform, EID, Nodes, NP) %>%
+    
summarize(Duration = max(End)-min(Start)) %>% + ungroup() %>% + filter(Partitioning == "sfc", Infiniband == TRUE) %>% + select(-Platform, -EID, -Nodes, -NP) %>% + head(n=100) %>% + as.data.frame() +#+end_src + +#+RESULTS: +#+begin_example + Iteration Partitioning Infiniband Alya Duration +1 1 sfc TRUE Alya.x.modif 17.756578 +2 1 sfc TRUE Alya.x.orig 20.941277 +3 2 sfc TRUE Alya.x.modif 17.117745 +4 2 sfc TRUE Alya.x.orig 20.327477 +5 3 sfc TRUE Alya.x.modif 17.066452 +6 3 sfc TRUE Alya.x.orig 20.412655 +7 4 sfc TRUE Alya.x.modif 16.960388 +8 4 sfc TRUE Alya.x.orig 20.091386 +9 5 sfc TRUE Alya.x.modif 16.866464 +10 5 sfc TRUE Alya.x.orig 20.008922 +11 6 sfc TRUE Alya.x.modif 16.846428 +12 6 sfc TRUE Alya.x.orig 19.990011 +13 7 sfc TRUE Alya.x.modif 16.496945 +14 7 sfc TRUE Alya.x.orig 20.041291 +15 8 sfc TRUE Alya.x.modif 16.504957 +16 8 sfc TRUE Alya.x.orig 19.540478 +17 9 sfc TRUE Alya.x.modif 16.239494 +18 9 sfc TRUE Alya.x.orig 19.506054 +19 10 sfc TRUE Alya.x.modif 13.295363 +20 10 sfc TRUE Alya.x.orig 15.737029 +21 11 sfc TRUE Alya.x.modif 13.309243 +22 11 sfc TRUE Alya.x.orig 15.792624 +23 12 sfc TRUE Alya.x.modif 10.215306 +24 12 sfc TRUE Alya.x.orig 12.058212 +25 13 sfc TRUE Alya.x.modif 9.988408 +26 13 sfc TRUE Alya.x.orig 12.180781 +27 14 sfc TRUE Alya.x.modif 10.165395 +28 14 sfc TRUE Alya.x.orig 12.023906 +29 15 sfc TRUE Alya.x.modif 10.110094 +30 15 sfc TRUE Alya.x.orig 11.953535 +31 16 sfc TRUE Alya.x.modif 10.148953 +32 16 sfc TRUE Alya.x.orig 11.879859 +33 17 sfc TRUE Alya.x.modif 10.057250 +34 17 sfc TRUE Alya.x.orig 12.007429 +35 18 sfc TRUE Alya.x.modif 10.186872 +36 18 sfc TRUE Alya.x.orig 11.914862 +37 19 sfc TRUE Alya.x.modif 10.128210 +38 19 sfc TRUE Alya.x.orig 11.862125 +39 20 sfc TRUE Alya.x.modif 9.999596 +40 20 sfc TRUE Alya.x.orig 11.919397 +41 21 sfc TRUE Alya.x.modif 10.091912 +42 21 sfc TRUE Alya.x.orig 12.247608 +43 22 sfc TRUE Alya.x.modif 10.104880 +44 22 sfc TRUE Alya.x.orig 12.039911 +45 23 sfc TRUE Alya.x.modif 9.919457 +46 23 
sfc TRUE Alya.x.orig 11.934311 +47 24 sfc TRUE Alya.x.modif 9.991665 +48 24 sfc TRUE Alya.x.orig 11.993532 +49 25 sfc TRUE Alya.x.modif 10.103905 +50 25 sfc TRUE Alya.x.orig 11.880654 +51 26 sfc TRUE Alya.x.modif 10.190846 +52 26 sfc TRUE Alya.x.orig 11.911790 +53 27 sfc TRUE Alya.x.modif 10.105440 +54 27 sfc TRUE Alya.x.orig 11.895977 +55 28 sfc TRUE Alya.x.modif 10.087383 +56 28 sfc TRUE Alya.x.orig 12.017077 +57 29 sfc TRUE Alya.x.modif 9.849281 +58 29 sfc TRUE Alya.x.orig 11.878089 +59 30 sfc TRUE Alya.x.modif 9.890735 +60 30 sfc TRUE Alya.x.orig 11.919046 +61 31 sfc TRUE Alya.x.modif 10.014491 +62 31 sfc TRUE Alya.x.orig 11.833787 +63 32 sfc TRUE Alya.x.modif 9.874768 +64 32 sfc TRUE Alya.x.orig 11.929741 +65 33 sfc TRUE Alya.x.modif 9.846555 +66 33 sfc TRUE Alya.x.orig 11.786520 +67 34 sfc TRUE Alya.x.modif 9.994509 +68 34 sfc TRUE Alya.x.orig 11.853944 +69 35 sfc TRUE Alya.x.modif 9.700519 +70 35 sfc TRUE Alya.x.orig 11.885302 +71 36 sfc TRUE Alya.x.modif 9.728538 +72 36 sfc TRUE Alya.x.orig 11.809225 +73 37 sfc TRUE Alya.x.modif 9.683195 +74 37 sfc TRUE Alya.x.orig 11.887436 +75 38 sfc TRUE Alya.x.modif 9.729218 +76 38 sfc TRUE Alya.x.orig 11.993540 +77 39 sfc TRUE Alya.x.modif 9.638608 +78 39 sfc TRUE Alya.x.orig 11.743263 +79 40 sfc TRUE Alya.x.modif 9.652496 +80 40 sfc TRUE Alya.x.orig 11.822727 +81 41 sfc TRUE Alya.x.modif 9.731062 +82 41 sfc TRUE Alya.x.orig 11.695359 +83 42 sfc TRUE Alya.x.modif 9.711456 +84 42 sfc TRUE Alya.x.orig 11.847502 +85 43 sfc TRUE Alya.x.modif 9.571125 +86 43 sfc TRUE Alya.x.orig 11.761173 +87 44 sfc TRUE Alya.x.modif 9.561946 +88 44 sfc TRUE Alya.x.orig 11.770252 +89 45 sfc TRUE Alya.x.modif 9.715087 +90 45 sfc TRUE Alya.x.orig 11.695930 +91 46 sfc TRUE Alya.x.modif 9.623916 +92 46 sfc TRUE Alya.x.orig 11.655520 +93 47 sfc TRUE Alya.x.modif 9.627460 +94 47 sfc TRUE Alya.x.orig 11.747964 +95 48 sfc TRUE Alya.x.modif 9.634499 +96 48 sfc TRUE Alya.x.orig 11.622833 +97 49 sfc TRUE Alya.x.modif 9.584689 +98 49 sfc TRUE 
Alya.x.orig 11.777986 +99 50 sfc TRUE Alya.x.modif 9.727763 +100 50 sfc TRUE Alya.x.orig 11.625774 +#+end_example +*** Sum all functions that have been traced to get compute time +#+begin_src R :results output :session :exports both +dft16 %>% + filter(Partitioning == "sfc", Infiniband == TRUE) %>% + mutate(Duration = End-Start) %>% + group_by(Iteration, Rank, Value, Partitioning, Infiniband, Alya, Platform, EID, Nodes, NP) %>% + summarize(N=n(), D=sum(Duration)) %>% + filter(Iteration == 50) -> ret; +ret; +#+end_src + +#+RESULTS: +#+begin_example +# A tibble: 3,840 x 12 +# Groups: Iteration, Rank, Value, Partitioning, Infiniband, Alya, Platform, +# EID, Nodes [3,840] + Iteration Rank Value Partitioning Infiniband Alya Platform + + 1 50 0 doiter sfc TRUE Alya.x.modif grimoire + 2 50 0 doiter sfc TRUE Alya.x.orig grimoire + 3 50 0 nastin sfc TRUE Alya.x.modif grimoire + 4 50 0 nastin sfc TRUE Alya.x.orig grimoire + 5 50 0 nsi_begste sfc TRUE Alya.x.modif grimoire + 6 50 0 nsi_begste sfc TRUE Alya.x.orig grimoire + 7 50 0 nsi_concou sfc TRUE Alya.x.modif grimoire + 8 50 0 nsi_concou sfc TRUE Alya.x.orig grimoire + 9 50 0 nsi_doiter sfc TRUE Alya.x.modif grimoire +10 50 0 nsi_doiter sfc TRUE Alya.x.orig grimoire +# ... 
with 3,830 more rows, and 5 more variables: EID , Nodes , +# NP , N , D +#+end_example + +#+begin_src R :results output :session :exports both +ret %>% + ungroup() %>% + filter(grepl("modif", Alya)) %>% + head(n=100) %>% + select(-Platform, -EID, -Nodes, -NP) %>% + as.data.frame() +#+end_src + +#+RESULTS: +#+begin_example + Iteration Rank Value Partitioning Infiniband Alya N D +1 50 0 doiter sfc TRUE Alya.x.modif 1 0.000010 +2 50 0 nastin sfc TRUE Alya.x.modif 4 0.000002 +3 50 0 nsi_begste sfc TRUE Alya.x.modif 1 0.000002 +4 50 0 nsi_concou sfc TRUE Alya.x.modif 1 0.000009 +5 50 0 nsi_doiter sfc TRUE Alya.x.modif 1 0.000001 +6 50 0 nsi_endste sfc TRUE Alya.x.modif 1 0.000001 +7 50 0 nsi_ifconf sfc TRUE Alya.x.modif 3 0.000002 +8 50 0 nsi_inisol sfc TRUE Alya.x.modif 4 0.000009 +9 50 0 nsi_matrix sfc TRUE Alya.x.modif 3 0.000045 +10 50 0 nsi_solite sfc TRUE Alya.x.modif 3 0.000685 +11 50 0 nsi_solsgs sfc TRUE Alya.x.modif 6 1.519370 +12 50 0 nsi_updbcs sfc TRUE Alya.x.modif 5 0.000008 +13 50 0 nsi_updunk sfc TRUE Alya.x.modif 11 0.000010 +14 50 0 solver sfc TRUE Alya.x.modif 15 8.010709 +15 50 0 timste sfc TRUE Alya.x.modif 1 0.000242 +16 50 1 doiter sfc TRUE Alya.x.modif 1 0.000010 +17 50 1 nastin sfc TRUE Alya.x.modif 4 0.000001 +18 50 1 nsi_begste sfc TRUE Alya.x.modif 1 0.000002 +19 50 1 nsi_concou sfc TRUE Alya.x.modif 1 0.000001 +20 50 1 nsi_doiter sfc TRUE Alya.x.modif 1 0.000001 +21 50 1 nsi_endste sfc TRUE Alya.x.modif 1 0.000435 +22 50 1 nsi_ifconf sfc TRUE Alya.x.modif 3 0.000001 +23 50 1 nsi_inisol sfc TRUE Alya.x.modif 4 0.000018 +24 50 1 nsi_matrix sfc TRUE Alya.x.modif 3 1.896372 +25 50 1 nsi_solite sfc TRUE Alya.x.modif 3 0.000632 +26 50 1 nsi_solsgs sfc TRUE Alya.x.modif 6 1.518940 +27 50 1 nsi_updbcs sfc TRUE Alya.x.modif 5 0.000282 +28 50 1 nsi_updunk sfc TRUE Alya.x.modif 11 0.004265 +29 50 1 solver sfc TRUE Alya.x.modif 15 5.649254 +30 50 1 timste sfc TRUE Alya.x.modif 1 0.000006 +31 50 2 doiter sfc TRUE Alya.x.modif 1 0.000008 +32 50 2 nastin 
sfc TRUE Alya.x.modif 4 0.000004 +33 50 2 nsi_begste sfc TRUE Alya.x.modif 1 0.000001 +34 50 2 nsi_concou sfc TRUE Alya.x.modif 1 0.000001 +35 50 2 nsi_doiter sfc TRUE Alya.x.modif 1 0.000000 +36 50 2 nsi_endste sfc TRUE Alya.x.modif 1 0.000369 +37 50 2 nsi_ifconf sfc TRUE Alya.x.modif 3 0.000001 +38 50 2 nsi_inisol sfc TRUE Alya.x.modif 4 0.000011 +39 50 2 nsi_matrix sfc TRUE Alya.x.modif 3 1.868698 +40 50 2 nsi_solite sfc TRUE Alya.x.modif 3 0.000310 +41 50 2 nsi_solsgs sfc TRUE Alya.x.modif 6 1.518933 +42 50 2 nsi_updbcs sfc TRUE Alya.x.modif 5 0.000163 +43 50 2 nsi_updunk sfc TRUE Alya.x.modif 11 0.003600 +44 50 2 solver sfc TRUE Alya.x.modif 15 5.705557 +45 50 2 timste sfc TRUE Alya.x.modif 1 0.000005 +46 50 3 doiter sfc TRUE Alya.x.modif 1 0.000008 +47 50 3 nastin sfc TRUE Alya.x.modif 4 0.000002 +48 50 3 nsi_begste sfc TRUE Alya.x.modif 1 0.000001 +49 50 3 nsi_concou sfc TRUE Alya.x.modif 1 0.000001 +50 50 3 nsi_doiter sfc TRUE Alya.x.modif 1 0.000000 +51 50 3 nsi_endste sfc TRUE Alya.x.modif 1 0.000498 +52 50 3 nsi_ifconf sfc TRUE Alya.x.modif 3 0.000001 +53 50 3 nsi_inisol sfc TRUE Alya.x.modif 4 0.000005 +54 50 3 nsi_matrix sfc TRUE Alya.x.modif 3 2.669050 +55 50 3 nsi_solite sfc TRUE Alya.x.modif 3 0.000575 +56 50 3 nsi_solsgs sfc TRUE Alya.x.modif 6 1.518712 +57 50 3 nsi_updbcs sfc TRUE Alya.x.modif 5 0.000274 +58 50 3 nsi_updunk sfc TRUE Alya.x.modif 11 0.005965 +59 50 3 solver sfc TRUE Alya.x.modif 15 4.819236 +60 50 3 timste sfc TRUE Alya.x.modif 1 0.000006 +61 50 4 doiter sfc TRUE Alya.x.modif 1 0.000009 +62 50 4 nastin sfc TRUE Alya.x.modif 4 0.000000 +63 50 4 nsi_begste sfc TRUE Alya.x.modif 1 0.000002 +64 50 4 nsi_concou sfc TRUE Alya.x.modif 1 0.000001 +65 50 4 nsi_doiter sfc TRUE Alya.x.modif 1 0.000001 +66 50 4 nsi_endste sfc TRUE Alya.x.modif 1 0.000576 +67 50 4 nsi_ifconf sfc TRUE Alya.x.modif 3 0.000002 +68 50 4 nsi_inisol sfc TRUE Alya.x.modif 4 0.000007 +69 50 4 nsi_matrix sfc TRUE Alya.x.modif 3 3.379627 +70 50 4 nsi_solite sfc TRUE 
Alya.x.modif 3 0.000760 +71 50 4 nsi_solsgs sfc TRUE Alya.x.modif 6 1.518519 +72 50 4 nsi_updbcs sfc TRUE Alya.x.modif 5 0.000369 +73 50 4 nsi_updunk sfc TRUE Alya.x.modif 11 0.006534 +74 50 4 solver sfc TRUE Alya.x.modif 15 4.039487 +75 50 4 timste sfc TRUE Alya.x.modif 1 0.000005 +76 50 5 doiter sfc TRUE Alya.x.modif 1 0.000007 +77 50 5 nastin sfc TRUE Alya.x.modif 4 0.000002 +78 50 5 nsi_begste sfc TRUE Alya.x.modif 1 0.000002 +79 50 5 nsi_concou sfc TRUE Alya.x.modif 1 0.000000 +80 50 5 nsi_doiter sfc TRUE Alya.x.modif 1 0.000000 +81 50 5 nsi_endste sfc TRUE Alya.x.modif 1 0.000514 +82 50 5 nsi_ifconf sfc TRUE Alya.x.modif 3 0.000001 +83 50 5 nsi_inisol sfc TRUE Alya.x.modif 4 0.000008 +84 50 5 nsi_matrix sfc TRUE Alya.x.modif 3 2.733239 +85 50 5 nsi_solite sfc TRUE Alya.x.modif 3 0.000602 +86 50 5 nsi_solsgs sfc TRUE Alya.x.modif 6 1.518674 +87 50 5 nsi_updbcs sfc TRUE Alya.x.modif 5 0.000279 +88 50 5 nsi_updunk sfc TRUE Alya.x.modif 11 0.006134 +89 50 5 solver sfc TRUE Alya.x.modif 15 4.745333 +90 50 5 timste sfc TRUE Alya.x.modif 1 0.000005 +91 50 6 doiter sfc TRUE Alya.x.modif 1 0.000008 +92 50 6 nastin sfc TRUE Alya.x.modif 4 0.000003 +93 50 6 nsi_begste sfc TRUE Alya.x.modif 1 0.000001 +94 50 6 nsi_concou sfc TRUE Alya.x.modif 1 0.000001 +95 50 6 nsi_doiter sfc TRUE Alya.x.modif 1 0.000000 +96 50 6 nsi_endste sfc TRUE Alya.x.modif 1 0.000550 +97 50 6 nsi_ifconf sfc TRUE Alya.x.modif 3 0.000001 +98 50 6 nsi_inisol sfc TRUE Alya.x.modif 4 0.000007 +99 50 6 nsi_matrix sfc TRUE Alya.x.modif 3 3.059509 +100 50 6 nsi_solite sfc TRUE Alya.x.modif 3 0.000682 +#+end_example +** 15-node chetemi :EXP17: +Goal: Check Score-P profile with per-region MPI counts +*** Alya.f90 Instrumentation +#+begin_src shell :results output +cd ~/misc/alya-bsc/ +svn diff Sources/kernel/master/Alya.f90 +#+end_src + +#+RESULTS: +#+begin_example +Index: Sources/kernel/master/Alya.f90 +=================================================================== +--- Sources/kernel/master/Alya.f90 
(revision 8414) ++++ Sources/kernel/master/Alya.f90 (working copy) +@@ -1,3 +1,4 @@ ++#include "scorep/SCOREP_User.inc" + !> @file Alya.f90 + !! @author Guillaume Houzeaux + !! @brief Ayla main +@@ -21,6 +22,9 @@ + use def_master, only : kfl_gocou + use def_coupli, only : kfl_gozon + implicit none ++ INTEGER :: iter ++ character*100 striter ++ SCOREP_USER_REGION_DEFINE(lucas) + ! + ! DLB should be disabled as we only wabnt to activate it for particular loops + ! Master does not disble to lend its resources automatically +@@ -28,7 +32,7 @@ + #ifdef ALYA_DLB + include 'dlbf.h' + if( INOTMASTER ) then +- if( dlb_disable() < DLB_SUCCESS ) call par_livinf(17_ip,'DLB COULD NOT BE DISABLED',0_ip) ++ if( dlb_disable() < DLB_SUCCESS ) call runend('ALYA: DLB DOES NOT WORK PROPERLY') + end if + #endif + #ifdef EXTRAE +@@ -39,6 +43,7 @@ + + call Parall(22270_ip) + ++ iter = 1 + optimization: do while ( kfl_goopt == 1 ) + + call Iniunk() +@@ -50,6 +55,9 @@ + + call Timste() + ++ write(striter, '(a,i1)') 'iter',iter ++ SCOREP_USER_REGION_BY_NAME_BEGIN(striter, SCOREP_USER_REGION_TYPE_PHASE) ++ + reset: do + call Begste() + +@@ -77,6 +85,10 @@ + + call Endste() + ++ SCOREP_USER_REGION_BY_NAME_END(striter) ++ iter = iter + 1 ++ lucas = SCOREP_USER_INVALID_REGION ++ + call Filter(ITASK_ENDTIM) + call Output(ITASK_ENDTIM) + +#+end_example +*** Contents of the experiment +#+begin_src shell :results output +ls -lhR exp_17_chetemi_15_manual +#+end_src + +#+RESULTS: +#+begin_example +exp_17_chetemi_15_manual: +total 1.1M +-rw-r----- 1 schnorr schnorr 227 Jan 23 06:44 fensap.ker.log +-rw-r----- 1 schnorr schnorr 113K Jan 23 06:44 fensap.log +-rw-r----- 1 schnorr schnorr 7.6K Jan 23 06:44 fensap.nsi.log +-rw-r----- 1 schnorr schnorr 3.5K Jan 23 06:44 fensap.par.log +-rw-r----- 1 schnorr schnorr 849K Jan 23 06:44 fensap-sgs.nsi.log +-rw-r----- 1 schnorr schnorr 1.3K Jan 23 06:44 fensap-system.log +-rw-r--r-- 1 schnorr schnorr 41K Jan 23 06:44 results_NPOIN_NELEM_NELEW_NBOUN.log +drwxr-xr-x 
2 schnorr schnorr 4.0K Jan 23 06:42 scorep-20180123_0156_642858649812104
+
+exp_17_chetemi_15_manual/scorep-20180123_0156_642858649812104:
+total 1.4M
+-rw-r--r-- 1 schnorr schnorr 1.4M Jan 23 06:42 profile.cubex
+-rw-r--r-- 1 schnorr schnorr 1.3K Jan 23 06:42 scorep.cfg
+#+end_example
+*** Check profile.cubex
+
+You'll need a CUBE installation, with its auxiliary tools.
+
+#+name: exp17_cubex_to_open
+#+begin_src shell :results output
+cd exp_17_chetemi_15_manual/scorep-20180123_0156_642858649812104
+~/install/cube-4.3.5/bin/cube_dump -c all -m all -s csv2 profile.cubex > profile.csv
+~/install/cube-4.3.5/bin/cube_dump -w profile.cubex | tail -n+33 | head -n279 | tr '(' ';' | sed -e "s/[[:space:]]*//g" -e "s/,.*$//" -e "s/id=//" -e "s/:[^;]*;/;/" > regions-codes.csv
+~/install/cube-4.3.5/bin/cube_dump -w profile.cubex > cube_info.txt
+#+end_src
+
+#+RESULTS: exp17_cubex_to_open
+
+*** Code from Arnaud to parse the Call tree
+
+#+name: exp17_cube_calltree
+#+header: :var dep0=exp17_cubex_to_open
+#+begin_src perl :results output :exports both
+use strict;
+my($filename) = "./exp_17_chetemi_15_manual/scorep-20180123_0156_642858649812104/cube_info.txt";
+my($line);
+open(INPUT,$filename);
+my($in_CALLTREE) = 0;
+
+my($VAR_iteration) = -1;
+my($VAR_type) = -1;
+my($VAR_id) = -1;
+
+my($filename_out) = $filename;
+$filename_out =~ s/txt$/csv/;
+
+open(OUTPUT,"> ".$filename_out);
+
+while(defined($line=<INPUT>)) {
+    chomp $line;
+    if($line =~ "CALL TREE") { $in_CALLTREE = 1; }
+    if(!$in_CALLTREE) { next; }
+    if($line =~ "SYSTEM DIMENSION") { $in_CALLTREE = 0; }
+
+    if($line =~ /^iter(\d*)\s*.*id=(\d*),.*/) {
+        $VAR_iteration = $1;
+        $VAR_type = "Computation";
+        $VAR_id = $2;
+        # print "|$VAR_iteration | $VAR_type | $VAR_id |\n";
+        print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n";
+    }
+    if($line =~ /^\s*\|\-(\S*)\s.*id=(\d*),.*/) {
+#       print $line."\n";
+        $VAR_type = $1;
+        $VAR_id = $2;
+#       print "|$VAR_iteration | $VAR_type | $VAR_id |\n";
+        print OUTPUT 
"$VAR_iteration,$VAR_type,$VAR_id\n"; + } +} +close(OUTPUT); +print $filename_out; +#+end_src + +#+RESULTS: exp17_cube_calltree +: ./exp_17_chetemi_15_manual/scorep-20180123_0156_642858649812104/cube_info.csv + +*** Enrich the call tree with profile measurements + +#+name: exp17_enrich +#+header: :var CSV=exp17_cube_calltree +#+begin_src R :results output :session :exports both +WD = "./exp_17_chetemi_15_manual/scorep-20180123_0156_642858649812104/"; +PROFILE = paste0(WD, "profile.csv"); +REGION = paste0(WD, "regions-codes.csv"); +read_delim(CSV, delim=",", col_names=FALSE) %>% + rename(Phase = X1, Code = X2, ID = X3) -> d; +df.PROF <- read_csv(PROFILE); +df.REGION <- read_delim(REGION, col_names=FALSE, delim=";"); +d %>% + left_join(df.PROF, by=c("ID" = "Cnode ID")) %>% + rename(Rank = `Thread ID`) -> df +#+end_src + +#+RESULTS: exp17_enrich +#+begin_example +Parsed with column specification: +cols( + X1 = col_integer(), + X2 = col_character(), + X3 = col_integer() +) +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 6594 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 4501 bytes_sent an integer 3570819600 file 2 4501 bytes_received an integer 3570819600 row 3 4502 bytes_sent an integer 3570819600 col 4 4502 bytes_received an integer 3570819600 expected 5 4503 bytes_sent an integer 3570819600 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... 
............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. + +Warning message: +In rbind(names(probs), probs_f) : + number of columns of result is not a multiple of vector length (arg 1) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +#+end_example + +*** 2 Per-rank number of elements, points, etc (process and read) +**** Process log files + +#+name: exp17_log_to_csv_points +#+begin_src shell :results output +EDIR=exp_17_chetemi_15_manual/ +for file in $(find $EDIR | grep results | grep log$); do + OUTPUT=$(dirname $file)/$(basename $file .log).csv + cat $file | tr -s " " | sed "s/^ //" | tr ' ' ',' | cut -d"," -f1,3,5,7,9,11,13 | uniq > $OUTPUT + cat $file | tr -s " " | sed "s/^ //" | tr ' ' ',' | cut -d"," -f2,4,6,8,10,12,14 >> $OUTPUT + head $OUTPUT + tail $OUTPUT + echo +done +#+end_src + +#+RESULTS: exp17_log_to_csv_points +#+begin_example +Rank,NPOIN,NELEM,NELEW,NBOUN,NNEIG,NBBOU +1,15156,73772,74032,2384,18,15156 +2,14967,74015,74198,471,13,14967 +3,11537,43926,74370,1797,14,11537 +4,8919,20973,73418,2127,13,8919 +5,9211,20762,74212,2191,14,9211 +6,8915,21347,74002,2097,9,8915 +7,8657,20938,73503,2113,12,8657 +8,9076,24188,73808,1979,10,9076 +9,10327,34522,73757,1590,13,10327 +290,9808,27318,74279,1877,15,9808 +291,13682,63829,74331,1228,15,13682 +292,14573,74413,74413,1097,8,14573 +293,15587,72889,74341,1767,19,15587 +294,13482,58001,74276,2750,15,13482 +295,9462,25039,74549,1962,13,9462 +296,9744,28896,74320,1986,10,9744 +297,11034,37519,74875,1702,15,11034 +298,10946,35249,73659,2651,12,10946 +299,13374,51818,73849,3020,19,13374 + +#+end_example + +**** Read in R + +#+name: exp17_points +#+header: :var dep0=exp17_log_to_csv_points +#+begin_src R 
:results output :session :exports both +library(readr); +library(dplyr); +library(tidyr); + +read_npoin <- function(filename) +{ + meta <- unlist(strsplit(unlist(strsplit(filename, "/"))[2], "_")); + meta <- gsub(".dir", "", meta); + read_csv(filename); +} +files <- list.files("exp_17_chetemi_15_manual", pattern="results_NPOIN_NELEM_NELEW_NBOUN.csv", recursive=TRUE, full.names=TRUE); +dfp17 <- do.call("bind_rows", lapply(files, function(file) { read_npoin(file) })) +dfp17 +#+end_src + +#+RESULTS: exp17_points +#+begin_example +Parsed with column specification: +cols( + Rank = col_integer(), + NPOIN = col_integer(), + NELEM = col_integer(), + NELEW = col_integer(), + NBOUN = col_integer(), + NNEIG = col_integer(), + NBBOU = col_integer() +) +# A tibble: 299 x 7 + Rank NPOIN NELEM NELEW NBOUN NNEIG NBBOU + + 1 1 15156 73772 74032 2384 18 15156 + 2 2 14967 74015 74198 471 13 14967 + 3 3 11537 43926 74370 1797 14 11537 + 4 4 8919 20973 73418 2127 13 8919 + 5 5 9211 20762 74212 2191 14 9211 + 6 6 8915 21347 74002 2097 9 8915 + 7 7 8657 20938 73503 2113 12 8657 + 8 8 9076 24188 73808 1979 10 9076 + 9 9 10327 34522 73757 1590 13 10327 +10 10 9811 26661 73911 1962 26 9811 +# ... 
with 289 more rows
+#+end_example
+
+*** Rough initial LB Analysis
+#+header: :var dep0=exp17_enrich
+#+header: :var dep1=exp17_points
+#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session
+df %>%
+    left_join(dfp17) %>%
+    group_by(Phase, Rank, Code) %>%
+    summarize(Sum = sum(time)) %>%
+    ungroup() %>%
+    filter(!is.na(Phase)) %>%
+    filter(Rank != 0) %>%
+    filter(Code == "Computation") %>%
+    ggplot(aes(x=as.integer(Phase), y=Sum, color=Rank)) +
+    geom_point() + geom_line(aes(group=Rank), alpha=.1) +
+    ylim(0,NA)
+#+end_src
+
+#+RESULTS:
+[[file:/tmp/babel-24763lPR/figure24763VS2.png]]
+
+
+#+begin_src R :results output :session :exports both
+
+#+end_src
+*** Cumsum finally with plot (Hilbert Load Curve)
+
+#+header: :var dep0=exp17_enrich
+#+header: :var dep1=exp17_points
+#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session
+m1=0
+m2=m1+.05
+NP = 299
+df %>%
+    left_join(dfp17) %>%
+    filter(Rank != 0) %>%
+    filter(Phase == 5) %>%
+    filter(Code == "Computation") %>%
+    arrange(Rank) %>%
+    select(Rank, time, NELEW) %>%
+    mutate(TIMESUM = cumsum(time)/sum(time)) %>%
+    mutate(NELEWSUM = cumsum(NELEW)/sum(NELEW)) -> df.CUMSUM;
+df.CUMSUM %>%
+    ggplot(aes(x=NELEWSUM, y=TIMESUM)) +
+    geom_point() +
+    geom_line() + ylim(m1,m2) + xlim(m1,m2) +
+    geom_hline(yintercept=(0:NP)/NP)
+#+end_src
+
+#+RESULTS:
+[[file:/tmp/babel-24763lPR/figure247630Oq.png]]
+
+*** Inverting Hilbert Load Curve
+
+#+name: exp17_hilbert
+#+begin_src R :results output :session *R* :exports both
+df_cum_inverse = function(df,yval) {
+    df = rbind(data.frame(x=0,y=0),df[names(df) %in% c("x","y")])
+    N = nrow(df);
+    xval = rep(NA,length(yval))
+#    print(N);
+    for(i in 1:length(yval)) {
+#        print(yval[i])
+        if(yval[i]<=0) {xval[i] = 0;}
+        else if(yval[i]>=1) {xval[i] = 1;}
+        else {
+            idx = df %>% mutate(Order = 1:nrow(.)) %>% filter(y >= yval[i]) %>% slice(1) 
%>% pull(Order) + idx = idx-1; +# print(idx) +# print(df[idx:(idx+1),]) + eps_x = (df[idx+1,]$x - df[idx,]$x); + eps_y = (df[idx+1,]$y - df[idx,]$y); + xval[i] = ((yval[i]-df[idx,]$y)*eps_x/eps_y + df[idx,]$x); + } + } + return(xval); +} +#+end_src + +#+RESULTS: exp17_hilbert + +#+header: :var dep0=exp17_hilbert +#+begin_src R :results output :session :exports both +df.CUMSUM %>% rename(y = TIMESUM, x = NELEWSUM) -> df.CUMSUM.2 +nb_proc = nrow(df.CUMSUM.2) +df_cum_inverse(df.CUMSUM.2, yval = (1:nb_proc)/nb_proc) -> l.TARGET; +df.CUMSUM.2 %>% + mutate(Target.x = l.TARGET, + Target.y = (1:nb_proc)/nb_proc) -> df.TARGET; +df.TARGET; +#+end_src + +#+RESULTS: +#+begin_example +# A tibble: 299 x 7 + Rank time NELEW y x Target.x Target.y + <int> <dbl> <int> <dbl> <dbl> <dbl> <dbl> + 1 1 26.696322 74032 0.006082734 0.003338458 0.001835591 0.003344482 + 2 2 26.975246 74198 0.012229020 0.006684402 0.003668480 0.006688963 + 3 3 18.678574 74370 0.016484917 0.010038101 0.005489164 0.010033445 + 4 4 12.662677 73418 0.019370097 0.013348871 0.007589754 0.013377926 + 5 5 6.610780 74212 0.020876357 0.016695446 0.010310625 0.016722408 + 6 6 12.895270 74002 0.023814533 0.020032551 0.014896989 0.020066890 + 7 7 12.642058 73503 0.026695015 0.023347154 0.019574650 0.023411371 + 8 8 8.923145 73808 0.028728146 0.026675511 0.023446748 0.026755853 + 9 9 16.234130 73757 0.032427079 0.030001567 0.027909374 0.030100334 +10 10 14.480795 73911 0.035726515 0.033334569 0.031029658 0.033444816 +# ...
with 289 more rows +#+end_example + +#+begin_src R :results output :session :exports both +df.TARGET %>% select(Rank, x, y) -> df.1; +df.TARGET %>% select(Rank, Target.x, Target.y) -> df.2; +df.1 %>% + mutate(Category = "Observed") %>% + bind_rows(df.2 %>% + mutate(Category = "Target") %>% + rename(x = Target.x, y = Target.y)) -> df.TARGET.tidy; +df.TARGET %>% select(x, y, Target.x, Target.y) %>% filter(x == Target.x) +#+end_src + +#+RESULTS: +: # A tibble: 1 x 4 +: x y Target.x Target.y +: <dbl> <dbl> <dbl> <dbl> +: 1 1 1 1 1 + +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 800 :height 400 :session +m1=0 +m2=m1+0.02 +NP = 299 +df.TARGET.tidy %>% + ggplot(aes(x=x, y=y, group=Category)) + + theme_bw(base_size=20) + + geom_curve(data=df.TARGET %>% filter(x!=1), aes(x=x, y=y, xend=Target.x, yend=Target.y), arrow=arrow(length = unit(0.03, "npc"))) + + geom_point(size=4, aes(shape=Category, color=as.factor(Rank%%8))) + + scale_color_brewer(palette = "Set1") + +# geom_line(data=df.TARGET.tidy %>% filter(Category == "Observed")) + + coord_cartesian(xlim=c(m1, m2), ylim=c(m1, m2)) + + geom_hline(yintercept=(0:NP)/NP) + +#+end_src + +#+RESULTS: +[[file:/tmp/babel-24763lPR/figure24763qjJ.png]] + +#+begin_src R :results output :session :exports both +df_cum_inverse(df.CUMSUM.2, yval = 0.02) +#+end_src + +#+RESULTS: +: [1] 300 +: [1] 0.02 +: [1] 5 +: x y +: 5 0.01334887 0.01937010 +: 6 0.01669545 0.02087636 +: [1] 0.01474838 +*** Provide Rick with a file with the number of elements + +The following is not necessary because Ricard is going to define a +format and we will comply with it.
+ +#+begin_src R :results output :session :exports both +dfp17 %>% + select(Rank, NELEM) %>% + write_csv("NELEM-300ranks.csv") +#+end_src + +#+RESULTS: +** 50-node grisou + 8-node grimoire (928 cores + 1K iterations) :EXP18: +*** Alya.f90 Instrumentation +#+begin_src shell :results output +cd ~/misc/alya-bsc/ +svn diff Sources/kernel/master/Alya.f90 +#+end_src + +#+RESULTS: +#+begin_example +Index: Sources/kernel/master/Alya.f90 +=================================================================== +--- Sources/kernel/master/Alya.f90 (revision 8439) ++++ Sources/kernel/master/Alya.f90 (working copy) +@@ -1,3 +1,4 @@ ++#include "scorep/SCOREP_User.inc" + !> @file Alya.f90 + !! @author Guillaume Houzeaux + !! @brief Ayla main +@@ -20,7 +21,13 @@ + use def_master, only : kfl_goblk + use def_master, only : kfl_gocou + use def_coupli, only : kfl_gozon ++ use mod_parall, only : PAR_MY_WORLD_RANK + implicit none ++ INTEGER :: iter,ierror ++ character*100 striter ++ character*100 iterfile ++ real :: tnow ++ real, dimension(2) :: tarray + ! + ! DLB should be disabled as we only wabnt to activate it for particular loops + ! 
Master does not disble to lend its resources automatically +@@ -39,6 +46,10 @@ + + call Parall(22270_ip) + ++ write(iterfile,'(a,i4.4,a)') 'iterations-', PAR_MY_WORLD_RANK, '.csv' ++ open(unit=2, file=iterfile) ++ ++ iter = 1 + optimization: do while ( kfl_goopt == 1 ) + + call Iniunk() +@@ -49,6 +60,11 @@ + time: do while ( kfl_gotim == 1 ) + + call Timste() ++ call ETIME(tarray, tnow) ++ write(2,*) tnow, tarray(1), tarray(2), PAR_MY_WORLD_RANK, iter ++ ++ write(striter, '(a,i3.3)') 'iter',iter ++ SCOREP_USER_REGION_BY_NAME_BEGIN(striter, SCOREP_USER_REGION_TYPE_PHASE) + + reset: do + call Begste() +@@ -77,6 +93,13 @@ + + call Endste() + ++ SCOREP_USER_REGION_BY_NAME_END(striter) ++ call ETIME(tarray, tnow) ++ write(2,*) tnow, tarray(1), tarray(2), PAR_MY_WORLD_RANK, iter ++ iter = iter + 1 ++ ++ ++ + call Filter(ITASK_ENDTIM) + call Output(ITASK_ENDTIM) + +@@ -91,6 +114,9 @@ + + end do optimization + ++ close(2) ++ + call Turnof() + ++ + end program Alya +#+end_example +*** config.in with scorep +#+begin_src shell :results output +cd ~/misc/alya-bsc/ +head -n 25 Executables/unix/config.in +#+end_src + +#+RESULTS: +#+begin_example +################################################################### +# GFORTRAN CONFIGURE # +#MN3 RECOMENDED MODULE: # +#module load gcc/5.1.0 openmpi/1.8.5 # +################################################################### + +#@Compiler: Using gfortran Compiler. 
+SCOREP = ~/install/nova/scorep-3.0-alya/bin/scorep +MPIF90 = ~//spack-ALYA/opt/spack/linux-debian9-x86_64/gcc-6.3.0/openmpi-3.0.0-a7g33v4ulwtb4g2verliyelvtifybrq3/bin/mpif90 +MPICC = ~//spack-ALYA/opt/spack/linux-debian9-x86_64/gcc-6.3.0/openmpi-3.0.0-a7g33v4ulwtb4g2verliyelvtifybrq3/bin/mpicc +F77 = $(SCOREP) --user --nocompiler --nopomp --noopenmp $(MPIF90) -cpp +F90 = $(SCOREP) --user --nocompiler --nopomp --noopenmp $(MPIF90) -cpp +FCOCC = $(SCOREP) --user --nocompiler --nopomp --noopenmp $(MPICC) -c +FCFLAGS = -c -J$O -I$O -ffree-line-length-none -fimplicit-none +FPPFLAGS = -x f95-cpp-input +EXTRALIB = -lc +EXTRAINC = +fa2p = $(MPIF90) -cpp -c -x f95-cpp-input -DMPI_OFF -J../../Utils/user/alya2pos -I../../Utils/user/alya2pos +fa2plk = $(MPIF90) -cpp -lc + +################################################################### +# PERFORMANCE FLAGS # +################################################################### +#@Optimization: O1 +FOPT = -O2 +#+end_example +*** Comments on tracing only COLL MPI group :deprecated: + +#+begin_src shell :results output +export SCOREP_MPI_ENABLE_GROUPS=COLL +export SCOREP_ENABLE_PROFILING=TRUE +export SCOREP_ENABLE_TRACING=TRUE +#+end_src + +If I do as above (only collectives), the trace becomes big. And otf22csv is +strange: it allocates all the memory before dumping anything. If I +only enable the COLL group, profiling gets affected by that and my +per-phase compute measure becomes unreliable. So, I've decided to write +the begin/end of each phase to files.
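+
+For the record, the retained approach has each rank write one line at
+the beginning and one at the end of every iteration (wall-clock time,
+user time, system time, rank, iteration), as done in the instrumented
+Alya.f90 above. Below is a minimal sketch, on made-up values and with
+column names of my own choosing (the files themselves carry no
+header), of how one such iterations-XXXX.csv file can be turned into
+per-iteration durations:
+
+#+begin_src R :results output :exports both
+library(dplyr);
+## Toy stand-in for one iterations file (made-up values).
+## Each iteration contributes a begin line and an end line.
+d <- read.table(text="
+10.0  9.5 0.1 4 1
+12.5 12.0 0.1 4 1
+12.5 12.0 0.1 4 2
+15.5 15.0 0.2 4 2",
+col.names=c("Wall","User","Sys","Rank","Iteration"));
+d %>% group_by(Rank, Iteration) %>%
+  summarize(Duration = max(Wall) - min(Wall))
+#+end_src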
+ +*** Execution + +#+begin_src shell :results output +source ~/spack-ALYA/share/spack/setup-env.sh +export PATH=$(spack location -i openmpi)/bin:$PATH +$(which mpirun) \ + -x SCOREP_TOTAL_MEMORY=4GB \ + -x SCOREP_MPI_ENABLE_GROUPS=ALL \ + -x SCOREP_ENABLE_TRACING=FALSE \ + -x SCOREP_ENABLE_PROFILING=TRUE \ + -x LD_LIBRARY_PATH=$(spack location -i openmpi)/lib/ \ + -np $(cat ~/machine-file | wc -l) \ + -machinefile ~/machine-file \ + ~/alya-bsc/Executables/unix/Alya.x fensap +#+end_src + +*** Details about each partition + +- Output the points (and details) for each partition + - Carlos told me that I can put this to obtain some info + #+BEGIN_EXAMPLE + PARALL_SERVICE: On + OUTPUT_FILE: On + #+END_EXAMPLE + +*** 0. Experiment directory + +#+name: exp18_dir +#+begin_src shell :results output +echo -n "exp_18_grisou_grimoire_58_manual/np912_1000iter_scorep-20180125_0517_105434259013856/" +#+end_src + +#+RESULTS: exp18_dir +: exp_18_grisou_grimoire_58_manual/np912_1000iter_scorep-20180125_0517_105434259013856/ + +*** Data Transformation +**** 1. 
Read Iteration Timings + +#+name: exp18_iteration_timings +#+header: :var DIR=exp18_dir +#+begin_src R :results output :session :exports both +suppressMessages(library(tidyverse)); +do.call("bind_rows", lapply(list.files(DIR, pattern="iterations.*csv$", full.names=TRUE), + function(file) { + read_table(file, col_names=FALSE) %>% + rename(Time.User = X1, + Time.System = X2, + Time.Run = X3, + Rank = X4, + Iteration = X5) %>% + filter(Iteration != 0) %>% + group_by(Rank, Iteration) %>% + summarize(Start = min(Time.User), + End = max(Time.User)) %>% + group_by(Rank) %>% + mutate(End = End - min(Start), + Start = Start - min(Start)) + })) -> exp.iter; +#+end_src + +#+RESULTS: exp18_iteration_timings +#+begin_example +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = 
col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +[... the same "Parsed with column specification" message repeats once per remaining iterations file; repetitions trimmed ...]
+) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column 
specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = 
col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + 
X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = 
col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() 
+) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column 
specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = 
col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + 
X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = 
col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() 
+) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column 
specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = 
col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + 
X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = 
col_integer(),
+  X5 = col_integer()
+)
+) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column 
specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = 
col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + 
X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = 
col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() 
+) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column 
specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = 
col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + 
X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = 
col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() 
+) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column 
specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = 
col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + 
X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = 
col_integer(), + X5 = col_integer() +) +#+end_example + +**** 2. Transform profile.cubex + +#+name: exp18_current_dir +#+header: :var DIR=exp18_dir +#+begin_src R :results verbatim :session :exports both +print(DIR) +#+end_src + +#+RESULTS: exp18_current_dir +: exp_18_grisou_grimoire_58_manual/np912_1000iter_scorep-20180125_0517_105434259013856/ + +#+name: exp18_cubex_to_open +#+header: :var DIR=exp18_current_dir +#+begin_src shell :results output +cd $DIR +~/install/cube-4.3.5/bin/cube_dump -c all -m all -s csv2 profile.cubex > profile.csv +~/install/cube-4.3.5/bin/cube_dump -w profile.cubex | tail -n+33 | head -n279 | tr '(' ';' | sed -e "s/[[:space:]]*//g" -e "s/,.*$//" -e "s/id=//" -e "s/:[^;]*;/;/" > regions-codes.csv +~/install/cube-4.3.5/bin/cube_dump -w profile.cubex > cube_info.txt +#+end_src + +#+RESULTS: exp18_cubex_to_open + +**** 3. Parse the call tree + +#+name: exp18_cube_calltree +#+header: :var dep0=exp18_cubex_to_open +#+header: :var DIR=exp18_current_dir +#+begin_src perl :results output :exports both +use strict; +my($filename) = $DIR . 
"/cube_info.txt"; +my($line); +open(INPUT,$filename); +my($in_CALLTREE) = 0; + +my($VAR_iteration) = -1; +my($VAR_type) = -1; +my($VAR_id) = -1; + +my($filename_out) = $filename; +$filename_out =~ s/txt$/csv/; + +open(OUTPUT,"> ".$filename_out); + +while(defined($line=<INPUT>)) { + chomp $line; + if($line =~ "CALL TREE") { $in_CALLTREE = 1; } + if(!$in_CALLTREE) { next; } + if($line =~ "SYSTEM DIMENSION") { $in_CALLTREE = 0; } + + if($line =~ /^iter(\d*)\s*.*id=(\d*),.*/) { + $VAR_iteration = $1; + $VAR_type = "Computation"; + $VAR_id = $2; + # print "|$VAR_iteration | $VAR_type | $VAR_id |\n"; + print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n"; + } + if($line =~ /^\s*\|\-(\S*)\s.*id=(\d*),.*/) { +# print $line."\n"; + $VAR_type = $1; + $VAR_id = $2; +# print "|$VAR_iteration | $VAR_type | $VAR_id |\n"; + print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n"; + } +} +close(OUTPUT); +print $filename_out; +#+end_src + +#+RESULTS: exp18_cube_calltree +: exp_18_grisou_grimoire_58_manual/np912_1000iter_scorep-20180125_0517_105434259013856//cube_info.csv + +**** 4. Enrich the call tree + +#+name: exp18_enrich +#+header: :var CSV=exp18_cube_calltree +#+header: :var DIR=exp18_current_dir +#+begin_src R :results output :session :exports both +suppressMessages(library(tidyverse)); +CSV = "exp_18_grisou_grimoire_58_manual/np912_1000iter_scorep-20180125_0517_105434259013856//cube_info.csv" +PROFILE = paste0(DIR, "/profile.csv"); +REGION = paste0(DIR, "/regions-codes.csv"); +df.PROF <- read_csv(PROFILE); +exp.REGION <- read_delim(REGION, col_names=FALSE, delim=";") %>% + rename(ID = X2, Name = X1); +exp.ENRICH <- read_delim(CSV, delim=",", col_names=FALSE) %>% + rename(Phase = X1, Code = X2, ID = X3) %>% + left_join(df.PROF, by=c("ID" = "Cnode ID")) %>% + rename(Rank = `Thread ID`) %>% + filter(Code == "Computation") %>% + mutate(Phase = as.integer(Phase)) %>% + filter(Rank != 0) +#+end_src + +**** X. 
Verify the load imbalance (and summarize it if necessary) +***** Rough initial LB Analysis + +#+header: :var dep0=exp18_enrich +#+begin_src R :results output graphics :file img/exp18_1K_iterations_912_NP.png :exports both :width 1400 :height 400 :session +exp.ENRICH %>% + group_by(Phase, Rank, Code) %>% + summarize(Sum = sum(time)) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + ggplot(aes(x=as.integer(Phase), y=Sum, color=Rank)) + + #geom_line(aes(group=Rank), alpha=.1) + + geom_point(alpha=.1, size=.1) + + ylim(0,NA) + + theme_bw(base_size=18) +#+end_src + +#+RESULTS: +[[file:img/exp18_1K_iterations_912_NP.png]] + +***** Calculate the average + +#+name: exp18_lb_average +#+header: :var dep0=exp18_enrich +#+begin_src R :results output :session :exports both +exp.ENRICH %>% + filter(Phase <= 9) %>% + group_by(Rank) %>% + summarize(time = mean(time)) -> exp.ENRICH +exp.ENRICH +#+end_src + +#+RESULTS: exp18_lb_average +#+begin_example +# A tibble: 64 x 2 + Rank time + + 1 1 42.92296 + 2 2 64.98924 + 3 3 66.13120 + 4 4 62.14705 + 5 5 64.66830 + 6 6 59.88956 + 7 7 61.76306 + 8 8 67.09068 + 9 9 57.22863 +10 10 64.70995 +# ... with 54 more rows +#+end_example + +**** 5. 
The number of elements on T1 + +#+name: exp18_number_of_elements_shell +#+header: :var DIR=exp18_current_dir +#+begin_src shell :results table :colnames yes +cd $DIR +echo "Type Rank NELEM NPOIN NBOUN NPOI32" +cat domain-*.csv | grep ^.*T1 +#+end_src + +#+RESULTS: exp18_number_of_elements_shell +| Type | Rank | NELEM | NPOIN | NBOUN | NPOI32 | +| T1 | 0 | 0 | 0 | 553075 | -1 | +| T1 | 1 | 143862 | 27303 | 2832 | 2362 | +| T1 | 2 | 143921 | 48988 | 11415 | 4174 | +| T1 | 3 | 143822 | 47644 | 9906 | 4947 | +| T1 | 4 | 143695 | 43667 | 8197 | 3921 | +| T1 | 5 | 143675 | 46242 | 9442 | 4191 | +| T1 | 6 | 143752 | 43209 | 7433 | 4876 | +| T1 | 7 | 143768 | 48927 | 11007 | 4580 | +| T1 | 8 | 143792 | 47241 | 10828 | 2428 | +| T1 | 9 | 143945 | 39022 | 6023 | 3238 | +| T1 | 10 | 144243 | 47255 | 9907 | 4088 | +| T1 | 11 | 143833 | 52178 | 12835 | 4720 | +| T1 | 12 | 143842 | 48489 | 13095 | 2082 | +| T1 | 13 | 143754 | 47833 | 10503 | 3770 | +| T1 | 14 | 143817 | 35239 | 3680 | 3938 | +| T1 | 15 | 143776 | 38633 | 5338 | 4204 | +| T1 | 16 | 144023 | 37650 | 5066 | 3630 | +| T1 | 17 | 143773 | 42217 | 7839 | 3269 | +| T1 | 18 | 143746 | 32849 | 2614 | 3662 | +| T1 | 19 | 143824 | 37433 | 4732 | 4118 | +| T1 | 20 | 143821 | 38632 | 5583 | 3796 | +| T1 | 21 | 143849 | 42960 | 8158 | 3320 | +| T1 | 22 | 143789 | 46616 | 9577 | 4177 | +| T1 | 23 | 143887 | 43733 | 8093 | 3975 | +| T1 | 24 | 143817 | 40890 | 6690 | 3904 | +| T1 | 25 | 143742 | 37534 | 4465 | 4698 | +| T1 | 26 | 143741 | 34257 | 3128 | 3959 | +| T1 | 27 | 143887 | 39461 | 6142 | 3517 | +| T1 | 28 | 143876 | 35937 | 4111 | 3778 | +| T1 | 29 | 144122 | 46780 | 9805 | 3777 | +| T1 | 30 | 143919 | 47984 | 11611 | 2013 | +| T1 | 31 | 143703 | 43909 | 9221 | 3191 | +| T1 | 32 | 143864 | 41551 | 7138 | 3673 | +| T1 | 33 | 143920 | 45680 | 9560 | 3251 | +| T1 | 34 | 143917 | 43972 | 8473 | 3576 | +| T1 | 35 | 143733 | 36957 | 5195 | 2808 | +| T1 | 36 | 143974 | 37075 | 5274 | 2810 | +| T1 | 37 | 143763 | 33188 | 2855 | 
3421 | +| T1 | 38 | 143915 | 43860 | 8409 | 3526 | +| T1 | 39 | 143733 | 43199 | 7629 | 4442 | +| T1 | 40 | 144072 | 50989 | 12285 | 3312 | +| T1 | 41 | 143873 | 34536 | 7704 | 3727 | +| T1 | 42 | 143719 | 36628 | 6242 | 3728 | +| T1 | 43 | 143858 | 46666 | 11830 | 3627 | +| T1 | 44 | 143779 | 49048 | 12550 | 4756 | +| T1 | 45 | 143927 | 50855 | 11622 | 4467 | +| T1 | 46 | 143677 | 51004 | 11178 | 4918 | +| T1 | 47 | 143977 | 51247 | 11323 | 4801 | +| T1 | 48 | 143872 | 49662 | 11307 | 3910 | +| T1 | 49 | 144004 | 43839 | 8154 | 3973 | +| T1 | 50 | 143728 | 43759 | 8449 | 3406 | +| T1 | 51 | 143841 | 41709 | 7511 | 3472 | +| T1 | 52 | 143929 | 50158 | 11469 | 4245 | +| T1 | 53 | 143971 | 41111 | 6898 | 3765 | +| T1 | 54 | 143831 | 55000 | 13607 | 4988 | +| T1 | 55 | 143580 | 55935 | 14390 | 4804 | +| T1 | 56 | 143751 | 56151 | 13884 | 6112 | +| T1 | 57 | 143993 | 53463 | 13028 | 4431 | +| T1 | 58 | 144026 | 55324 | 14121 | 4205 | +| T1 | 59 | 143762 | 46085 | 12560 | 4506 | +| T1 | 60 | 143832 | 50092 | 12743 | 4424 | +| T1 | 61 | 143674 | 44626 | 8912 | 3867 | +| T1 | 62 | 143894 | 26865 | 1694 | 2141 | +| T1 | 63 | 143735 | 36132 | 7190 | 4182 | +| T1 | 64 | 144024 | 38547 | 8615 | 3754 | + +#+name: exp18_number_of_elements +#+header: :var TABLE=exp18_number_of_elements_shell +#+begin_src R :results output :session :exports both :colnames yes +TABLE %>% + as_tibble() %>% + filter(Rank != 0) %>% + select(-Type) -> exp.ELEMENTS; +exp.ELEMENTS; +#+end_src + +#+RESULTS: exp18_number_of_elements +#+begin_example +# A tibble: 64 x 5 + Rank NELEM NPOIN NBOUN NPOI32 + + 1 1 143862 27303 2832 2362 + 2 2 143921 48988 11415 4174 + 3 3 143822 47644 9906 4947 + 4 4 143695 43667 8197 3921 + 5 5 143675 46242 9442 4191 + 6 6 143752 43209 7433 4876 + 7 7 143768 48927 11007 4580 + 8 8 143792 47241 10828 2428 + 9 9 143945 39022 6023 3238 +10 10 144243 47255 9907 4088 +# ... with 54 more rows +#+end_example + +**** 6. 
Integrate number of elements with compute cost + +#+name: exp18_integrate +#+header: :var dep1=exp18_lb_average +#+header: :var dep0=exp18_number_of_elements +#+begin_src R :results output :session :exports both +exp.ENRICH %>% + left_join(exp.ELEMENTS) %>% + select(Rank, time, NELEM) -> df.INTEGRATED; +df.INTEGRATED +#+end_src + +#+RESULTS: exp18_integrate +#+begin_example +Joining, by = "Rank" +# A tibble: 64 x 3 + Rank time NELEM + + 1 1 28.18224 143862 + 2 2 41.61312 143921 + 3 3 40.45403 143822 + 4 4 38.41553 143695 + 5 5 40.30081 143675 + 6 6 37.42849 143752 + 7 7 42.15025 143768 + 8 8 42.18351 143792 + 9 9 36.24237 143945 +10 10 40.10288 144243 +# ... with 54 more rows +#+end_example + +**** 7. Do the cumsum + +#+name: exp18_cumsum +#+header: :var dep0=exp18_integrate +#+begin_src R :results output :session :exports both +df.INTEGRATED %>% + arrange(Rank) %>% + mutate(TIMESUM = cumsum(time)/sum(time)) %>% + mutate(NELEMSUM = cumsum(NELEM)/sum(NELEM)) -> df.CUMSUM; +df.CUMSUM -> df.exp18.CUMSUM; +df.CUMSUM +#+end_src + +#+RESULTS: exp18_cumsum +#+begin_example +# A tibble: 64 x 5 + Rank time NELEM TIMESUM NELEMSUM + + 1 1 60.43743 184387 0.01510068 0.02002865 + 2 2 68.98525 135390 0.03233709 0.03473510 + 3 3 62.07687 134635 0.04784740 0.04935954 + 4 4 61.41787 140751 0.06319305 0.06464832 + 5 5 66.77389 137213 0.07987694 0.07955279 + 6 6 60.27102 146712 0.09493604 0.09548907 + 7 7 69.88611 141375 0.11239754 0.11084562 + 8 8 63.52919 131541 0.12827072 0.12513399 + 9 9 64.41341 152492 0.14436482 0.14169811 +10 10 66.84620 136200 0.16106678 0.15649254 +# ... with 54 more rows +#+end_example + +**** 8. 
Target with Inverting Hilbert Load Curve +***** The function +#+name: exp18_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df_cum_inverse = function(df,yval) { + df = rbind(data.frame(x=0,y=0),df[names(df) %in% c("x","y")]) + N = nrow(df); + xval = rep(NA,length(yval)) +# print(N); + for(i in 1:length(yval)) { +# print(yval[i]) + if(yval[i]<=0) {xval[i] = 0;} + else if(yval[i]>=1) {xval[i] = 1;} + else { + idx = df %>% mutate(Order = 1:nrow(.)) %>% filter(y >= yval[i]) %>% slice(1) %>% pull(Order) + idx = idx-1; +# print(idx) +# print(df[idx:(idx+1),]) + eps_x = (df[idx+1,]$x - df[idx,]$x); + eps_y = (df[idx+1,]$y - df[idx,]$y); + xval[i] = ((yval[i]-df[idx,]$y)*eps_x/eps_y + df[idx,]$x); + } + } + return(xval); +} +#+end_src + +#+RESULTS: exp18_hilbert_invertion_function +***** Calculate the Target +#+name: exp18_target +#+header: :var dep0=exp18_cumsum +#+header: :var dep1=exp18_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df.CUMSUM %>% + rename(y = TIMESUM, x = NELEMSUM) -> df.CUMSUM.2; +nb_proc = nrow(df.CUMSUM.2) +df_cum_inverse(df.CUMSUM.2, y = (1:nb_proc)/nb_proc) -> l.TARGET; +df.CUMSUM.2 %>% + mutate(Target.x = l.TARGET, + Target.y = (1:nb_proc)/nb_proc) -> df.TARGET; +df.TARGET; +#+end_src + +#+name: exp18_tidying +#+header: :var dep0=exp18_target +#+begin_src R :results output :session :exports both +df.TARGET %>% select(Rank, x, y) -> df.1; +df.TARGET %>% select(Rank, Target.x, Target.y) -> df.2; +df.1 %>% + mutate(Category = "Observed") %>% + bind_rows(df.2 %>% + mutate(Category = "Target") %>% + rename(x = Target.x, y = Target.y)) -> df.TARGET.tidy; +df.TARGET %>% select(x, y, Target.x, Target.y) %>% filter(x == Target.x) +#+end_src + +#+RESULTS: +: # A tibble: 1 x 4 +: x y Target.x Target.y +: +: 1 1 1 1 1 + +***** Plot with the arrows + +#+header: :var dep0=exp18_tidying +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 800 
:height 400 :session +m1=0 +m2=m1+1 +NP = df.TARGET.tidy %>% pull(Rank) %>% unique %>% length +df.TARGET.tidy %>% + ggplot(aes(x=x, y=y, group=Category)) + + theme_bw(base_size=20) + + geom_curve(data=df.TARGET %>% filter(x!=1), aes(x=x, y=y, xend=Target.x, yend=Target.y), arrow=arrow(length = unit(0.03, "npc"))) + + geom_point(size=1, aes(shape=Category, color=as.factor(Rank%%8))) + + scale_color_brewer(palette = "Set1") + +# geom_line(data=df.TARGET.tidy %>% filter(Category == "Observed")) + + coord_cartesian(xlim=c(m1, m2), ylim=c(m1, m2)) #+ +# geom_hline(yintercept=(0:NP)/NP) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-16900S5M/figure16900T5J.png]] + +**** 9. Output the new =rank-elements.dat= file + +#+header: :var dep0=exp18_target +#+begin_src R :results output :session :exports both +df.TARGET %>% + select(Rank, Target.x) %>% + mutate(Target.x.aux = c(0, Target.x[-n()])) %>% + mutate(Target = (Target.x - Target.x.aux) * NP) %>% + select(Rank, Target) %>% + write_delim("rank-elements.dat", delim=" ", col_names=FALSE) +#+end_src + +#+RESULTS: + +**** 9.1 Check which rank has the larger correction + +#+header: :var dep0=exp18_target +#+begin_src R :results output :session :exports both +t = 61 +df.TARGET %>% + select(Rank, Target.x) %>% + mutate(Target.x.aux = c(0, Target.x[-n()])) %>% + mutate(Target = (Target.x - Target.x.aux) * NP) %>% + select(Rank, Target) %>% + filter(Rank >= t) +#+end_src + +#+RESULTS: +: # A tibble: 4 x 2 +: Rank Target +: +: 1 61 0.9131876 +: 2 62 1.1703797 +: 3 63 1.0715776 +: 4 64 1.0107468 + +#+begin_src R :results output :session :exports both +df.INTEGRATED %>% + mutate(F = time / NELEM) %>% + mutate(diff = mean(time) - time) %>% + filter(Rank >= t) +#+end_src + +#+RESULTS: +: # A tibble: 4 x 5 +: Rank time NELEM F diff +: +: 1 61 72.95337 141232 0.0005165499 -10.41747 +: 2 62 81.25564 221395 0.0003670166 -18.71973 +: 3 63 29.42375 162676 0.0001808734 33.11215 +: 4 64 66.49157 157024 0.0004234484 -3.95566 + +#+begin_src R 
:results output :session :exports both +exp.ENRICH %>% pull(Rank) %>% min +exp.ENRICH %>% summary +exp.ENRICH %>% pull(time) %>% sd +exp.ENRICH %>% arrange(-time) +#+end_src + +#+RESULTS: +#+begin_example +[1] 1 + Rank time + Min. : 1.00 Min. : 32.55 + 1st Qu.:16.75 1st Qu.: 59.81 + Median :32.50 Median : 61.14 + Mean :32.50 Mean : 61.80 + 3rd Qu.:48.25 3rd Qu.: 63.00 + Max. :64.00 Max. :103.14 +[1] 12.66165 +# A tibble: 64 x 2 + Rank time + + 1 31 103.14403 + 2 59 97.99925 + 3 46 96.40160 + 4 44 91.13452 + 5 3 81.42977 + 6 20 77.99168 + 7 1 67.54315 + 8 36 66.69546 + 9 56 65.95448 +10 39 65.71467 +# ... with 54 more rows +#+end_example + +** FINALLY :EXP20: +*** Use the =sfc= branch of alya + +#+begin_src shell :results output +cd ~/misc/ +svn checkout svn+ssh://bsc21835@dt01.bsc.es/gpfs/projects/bsc21/svnroot/Alya/branches/sfc alya-bsc-sfc +#+end_src + +Copy-paste: +- my modified version of Alya.f90 on top of the alya-bsc-sfc branch. +- my modified version of inivar.f90 on top of the alya-bsc-sfc branch. + +The file should be named =rank-elements.dat=: +- First column: the rank +- Second column: the relative weight of each partition +Columns are separated by a single space. + +- All the data about the partition (subdomain) is recorded in a file + called =def_domain.f90=, within kernel/defmod + +#+begin_src shell :results output +cd ~/misc/ +ORIGIN=alya-bsc +TARGET=alya-bsc-sfc +cp $ORIGIN/Sources/kernel/master/Alya.f90 $TARGET/Sources/kernel/master/ +cp $ORIGIN/Sources/services/parall/par_prepro.f90 $TARGET/Sources/services/parall/ +cp $ORIGIN/Executables/unix/config.in $TARGET/Executables/unix/ +#+end_src + +#+RESULTS: + +*** Alya Instrumentation +#+begin_src shell :results output +cd ~/misc/alya-bsc-sfc/ +svn diff +#+end_src + +#+RESULTS: +#+begin_example +Index: Sources/kernel/master/Alya.f90 +=================================================================== +--- Sources/kernel/master/Alya.f90 (revision 8442) ++++ Sources/kernel/master/Alya.f90 (working copy) +@@ -1,3 +1,4 @@ 
++#include "scorep/SCOREP_User.inc" + !> @file Alya.f90 + !! @author Guillaume Houzeaux + !! @brief Ayla main +@@ -20,7 +21,13 @@ + use def_master, only : kfl_goblk + use def_master, only : kfl_gocou + use def_coupli, only : kfl_gozon ++ use mod_parall, only : PAR_MY_WORLD_RANK + implicit none ++ INTEGER :: iter,ierror ++ character*100 striter ++ character*100 iterfile ++ real :: tnow ++ real, dimension(2) :: tarray + ! + ! DLB should be disabled as we only wabnt to activate it for particular loops + ! Master does not disble to lend its resources automatically +@@ -39,6 +46,10 @@ + + call Parall(22270_ip) + ++ write(iterfile,'(a,i4.4,a)') 'iterations-', PAR_MY_WORLD_RANK, '.csv' ++ open(unit=2, file=iterfile) ++ ++ iter = 1 + optimization: do while ( kfl_goopt == 1 ) + + call Iniunk() +@@ -49,6 +60,11 @@ + time: do while ( kfl_gotim == 1 ) + + call Timste() ++ call ETIME(tarray, tnow) ++ write(2,*) tnow, tarray(1), tarray(2), PAR_MY_WORLD_RANK, iter ++ ++ write(striter, '(a,i3.3)') 'iter',iter ++ SCOREP_USER_REGION_BY_NAME_BEGIN(striter, SCOREP_USER_REGION_TYPE_PHASE) + + reset: do + call Begste() +@@ -77,6 +93,13 @@ + + call Endste() + ++ SCOREP_USER_REGION_BY_NAME_END(striter) ++ call ETIME(tarray, tnow) ++ write(2,*) tnow, tarray(1), tarray(2), PAR_MY_WORLD_RANK, iter ++ iter = iter + 1 ++ ++ ++ + call Filter(ITASK_ENDTIM) + call Output(ITASK_ENDTIM) + +@@ -91,6 +114,9 @@ + + end do optimization + ++ close(2) ++ + call Turnof() + ++ + end program Alya +Index: Sources/services/parall/par_prepro.f90 +=================================================================== +--- Sources/services/parall/par_prepro.f90 (revision 8442) ++++ Sources/services/parall/par_prepro.f90 (working copy) +@@ -19,6 +19,7 @@ + use mod_memory + use mod_par_partit_sfc, only : par_partit_sfc + use mod_parall, only : PAR_WORLD_SIZE ++ use mod_parall, only : PAR_MY_WORLD_RANK + use mod_parall, only : PAR_METIS4 + use mod_parall, only : PAR_SFC + use mod_parall, only : PAR_ORIENTED_BIN +@@ 
-32,6 +33,9 @@ + integer(ip) :: npoin_tmp, nelem_tmp, nboun_tmp + real(rp) :: time1,time2,time3,time4,time5 + character(100) :: messa_integ ++ character*100 dfile ++ integer(ip) ii ++ + ! + ! Output + ! +@@ -354,4 +358,20 @@ + end if + + ++ ! LUCAS ++ write(dfile,'(a,i4.4,a)') 'domain-', PAR_MY_WORLD_RANK, '.csv' ++ open(unit=12345, file=dfile) ++ ++ write(12345,*) "T1", PAR_MY_WORLD_RANK, nelem, npoin, nboun, npoi3-npoi2 ++ do ii = 1, nelem ++ write(12345,*) "T2", PAR_MY_WORLD_RANK, ii, ltype(ii) ++ end do ++ do ii = 1, nboun ++ write(12345,*) "T3", PAR_MY_WORLD_RANK, ii, ltypb(ii) ++ end do ++ ++ close(12345) ++ ! END LUCAS ++ ++ + end subroutine par_prepro +#+end_example + +*** Compile the sfc branch (MPI 3.0 + SCOREP 3) +#+begin_src shell :results output +source ~/spack-ALYA/share/spack/setup-env.sh +export PATH=$(spack location -i openmpi)/bin:$PATH +#+end_src +*** Run the sfc branch with my modifications + +#+begin_src shell :results output +source ~/spack-ALYA/share/spack/setup-env.sh +export PATH=$(spack location -i openmpi)/bin:$PATH +$(which mpirun) --bind-to core:overload-allowed --report-bindings \ + -x SCOREP_TOTAL_MEMORY=4GB \ + -x SCOREP_MPI_ENABLE_GROUPS=ALL \ + -x SCOREP_ENABLE_TRACING=FALSE \ + -x SCOREP_ENABLE_PROFILING=TRUE \ + -x LD_LIBRARY_PATH=$(spack location -i openmpi)/lib/ \ + -np $(cat ~/machine-file | wc -l) \ + -machinefile ~/machine-file \ + ~/alya-bsc-sfc/Executables/unix/Alya.x fensap +#+end_src +*** Extracting information from the partitions +#+BEGIN_EXAMPLE + character*100 dfile + integer(ip) ii + + ! LUCAS + write(dfile,'(a,i4.4,a)') 'domain-', PAR_MY_WORLD_RANK, '.csv' + open(unit=12345, file=dfile) + + write(12345,*) "T1", PAR_MY_WORLD_RANK, nelem, npoin, nboun, npoi3-npoi2 + do ii = 1, nelem + write(12345,*) "T2", PAR_MY_WORLD_RANK, ii, ltype(ii) + end do + do ii = 1, nboun + write(12345,*) "T3", PAR_MY_WORLD_RANK, ii, ltypb(ii) + end do + + close(12345) + ! END LUCAS +#+END_EXAMPLE +*** Data Transformation +**** 0. 
Experiment directory + +#+name: exp20_dir +#+begin_src shell :results output +echo -n "exp_20_grisou_4_manual/scorep-20180125_1159_34483171676480/" +#+end_src + +#+RESULTS: exp20_dir +: exp_20_grisou_4_manual/scorep-20180125_1159_34483171676480/ + +**** 1. Read Iteration Timings + +#+name: exp20_iteration_timings +#+header: :var DIR=exp20_dir +#+begin_src R :results output :session :exports both +suppressMessages(library(tidyverse)); +do.call("bind_rows", lapply(list.files(DIR, pattern="iterations.*csv$", full.names=TRUE), + function(file) { + read_table(file, col_names=FALSE) %>% + rename(Time.User = X1, + Time.System = X2, + Time.Run = X3, + Rank = X4, + Iteration = X5) %>% + filter(Iteration != 0) %>% + group_by(Rank, Iteration) %>% + summarize(Start = min(Time.User), + End = max(Time.User)) %>% + group_by(Rank) %>% + mutate(End = End - min(Start), + Start = Start - min(Start)) + })) -> exp.iter; +#+end_src + +**** 2. Transform profile.cubex + +#+name: exp20_cubex_to_open +#+header: :var DIR=exp20_dir +#+begin_src shell :results output +cd $DIR +~/install/cube-4.3.5/bin/cube_dump -c all -m all -s csv2 profile.cubex > profile.csv +~/install/cube-4.3.5/bin/cube_dump -w profile.cubex | tail -n+33 | head -n279 | tr '(' ';' | sed -e "s/[[:space:]]*//g" -e "s/,.*$//" -e "s/id=//" -e "s/:[^;]*;/;/" > regions-codes.csv +~/install/cube-4.3.5/bin/cube_dump -w profile.cubex > cube_info.txt +#+end_src + +#+RESULTS: exp20_cubex_to_open + +**** 3. Parse the call tree + +#+name: exp20_cube_calltree +#+header: :var dep0=exp20_cubex_to_open +#+header: :var DIR=exp20_dir +#+begin_src perl :results output :exports both +use strict; +my($filename) = $DIR . 
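+# (Comment added for clarity.)  cube_info.txt below is the raw output of
+# `cube_dump -w`.  The loop keeps only the CALL TREE section and writes one
+# "iteration,type,cnode-id" line per call-tree node to cube_info.csv; for
+# instance, a hypothetical call-tree line such as `iter003 ... id=42, ...`
+# would come out as `003,Computation,42`.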
"/cube_info.txt";
+my($line);
+open(INPUT,$filename);
+my($in_CALLTREE) = 0;
+
+my($VAR_iteration) = -1;
+my($VAR_type) = -1;
+my($VAR_id) = -1;
+
+my($filename_out) = $filename;
+$filename_out =~ s/txt$/csv/;
+
+open(OUTPUT,"> ".$filename_out);
+
+while(defined($line=<INPUT>)) {
+    chomp $line;
+    if($line =~ "CALL TREE") { $in_CALLTREE = 1; }
+    if(!$in_CALLTREE) { next; }
+    if($line =~ "SYSTEM DIMENSION") { $in_CALLTREE = 0; }
+
+    if($line =~ /^iter(\d*)\s*.*id=(\d*),.*/) {
+        $VAR_iteration = $1;
+        $VAR_type = "Computation";
+        $VAR_id = $2;
+        # print "|$VAR_iteration | $VAR_type | $VAR_id |\n";
+        print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n";
+    }
+    if($line =~ /^\s*\|\-(\S*)\s.*id=(\d*),.*/) {
+#        print $line."\n";
+        $VAR_type = $1;
+        $VAR_id = $2;
+#        print "|$VAR_iteration | $VAR_type | $VAR_id |\n";
+        print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n";
+    }
+}
+close(OUTPUT);
+print $filename_out;
+#+end_src
+
+#+RESULTS: exp20_cube_calltree
+: exp_20_grisou_4_manual/scorep-20180125_1159_34483171676480//cube_info.csv
+
+**** 4. 
Enrich the call tree + +#+name: exp20_enrich +#+header: :var CSV=exp20_cube_calltree +#+header: :var DIR=exp20_dir +#+begin_src R :results output :session :exports both +PROFILE = paste0(DIR, "/profile.csv"); +REGION = paste0(DIR, "/regions-codes.csv"); +df.PROF <- read_csv(PROFILE); +exp.REGION <- read_delim(REGION, col_names=FALSE, delim=";") %>% + rename(ID = X2, Name = X1); +exp.ENRICH <- read_delim(CSV, delim=",", col_names=FALSE) %>% + rename(Phase = X1, Code = X2, ID = X3) %>% + left_join(df.PROF, by=c("ID" = "Cnode ID")) %>% + rename(Rank = `Thread ID`); +#+end_src + +#+RESULTS: exp20_enrich +#+begin_example +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +#+end_example + +**** 5. 
The number of elements + +#+name: exp20_number_of_elements_shell +#+header: :var DIR=exp20_dir +#+begin_src shell :results table :colnames yes +cd $DIR +echo "Type Rank NELEM NPOIN NBOUN NPOI32" +cat domain-*.csv | grep ^.*T1 +#+end_src + +#+name: exp20_number_of_elements +#+header: :var TABLE=exp20_number_of_elements_shell +#+begin_src R :results output :session :exports both :colnames yes +TABLE %>% + as_tibble() %>% + filter(Rank != 0) %>% + select(-Type) -> exp.ELEMENTS; +exp.ELEMENTS; +#+end_src + +#+RESULTS: exp20_number_of_elements +#+begin_example +# A tibble: 64 x 5 + Rank NELEM NPOIN NBOUN NPOI32 + + 1 1 143862 27303 2832 2362 + 2 2 143921 48988 11415 4174 + 3 3 143822 47644 9906 4947 + 4 4 143695 43667 8197 3921 + 5 5 143675 46242 9442 4191 + 6 6 143752 43209 7433 4876 + 7 7 143768 48927 11007 4580 + 8 8 143792 47241 10828 2428 + 9 9 143945 39022 6023 3238 +10 10 144243 47255 9907 4088 +# ... with 54 more rows +#+end_example + +**** 6. Integrate number of elements with compute cost + +#+name: exp20_integrate +#+header: :var dep1=exp20_enrich +#+header: :var dep0=exp20_number_of_elements +#+begin_src R :results output :session :exports both +exp.ENRICH %>% + filter(Rank != 0) %>% + filter(Code == "Computation") %>% + mutate(Phase = as.integer(Phase)) %>% + left_join(exp.ELEMENTS) %>% + select(Phase, Rank, time, NELEM) -> df.INTEGRATED; +#+end_src + +#+RESULTS: +: Joining, by = "Rank" + +**** 7. 
Do the cumsum + +#+name: exp20_cumsum +#+header: :var dep0=exp20_integrate +#+begin_src R :results output :session :exports both +df.INTEGRATED %>% + arrange(Phase, Rank) %>% + group_by(Phase) %>% + mutate(TIMESUM = cumsum(time)/sum(time)) %>% + mutate(NELEMSUM = cumsum(NELEM)/sum(NELEM)) %>% + ungroup() -> df.CUMSUM; +df.CUMSUM +#+end_src + +#+RESULTS: exp20_cumsum +#+begin_example +# A tibble: 128 x 6 + Phase Rank time NELEM TIMESUM NELEMSUM + + 1 1 1 48.16957 143862 0.01204776 0.01562670 + 2 1 2 68.57628 143921 0.02919947 0.03125982 + 3 1 3 62.76540 143822 0.04489782 0.04688218 + 4 1 4 70.52202 143695 0.06253618 0.06249074 + 5 1 5 51.59148 143675 0.07543980 0.07809713 + 6 1 6 57.46073 143752 0.08981139 0.09371189 + 7 1 7 66.35009 143768 0.10640630 0.10932838 + 8 1 8 68.51840 143792 0.12354354 0.12494748 + 9 1 9 64.11168 143945 0.13957860 0.14058320 +10 1 10 57.19916 144243 0.15388477 0.15625129 +# ... with 118 more rows +#+end_example + +**** 8. Target with Inverting Hilbert Load Curve + +#+name: exp20_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df_cum_inverse = function(df,yval) { + df = rbind(data.frame(x=0,y=0),df[names(df) %in% c("x","y")]) + N = nrow(df); + xval = rep(NA,length(yval)) +# print(N); + for(i in 1:length(yval)) { +# print(yval[i]) + if(yval[i]<=0) {xval[i] = 0;} + else if(yval[i]>=1) {xval[i] = 1;} + else { + idx = df %>% mutate(Order = 1:nrow(.)) %>% filter(y >= yval[i]) %>% slice(1) %>% pull(Order) + idx = idx-1; +# print(idx) +# print(df[idx:(idx+1),]) + eps_x = (df[idx+1,]$x - df[idx,]$x); + eps_y = (df[idx+1,]$y - df[idx,]$y); + xval[i] = ((yval[i]-df[idx,]$y)*eps_x/eps_y + df[idx,]$x); + } + } + return(xval); +} +#+end_src + +#+RESULTS: exp20_hilbert_invertion_function + +#+name: exp20_target +#+header: :var dep0=exp20_cumsum +#+header: :var dep1=exp20_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df.CUMSUM %>% + filter(Phase == 2) %>% + rename(y = TIMESUM, x = 
NELEMSUM) -> df.CUMSUM.2; +nb_proc = nrow(df.CUMSUM.2) +df_cum_inverse(df.CUMSUM.2, y = (1:nb_proc)/nb_proc) -> l.TARGET; +df.CUMSUM.2 %>% + mutate(Target.x = l.TARGET, + Target.y = (1:nb_proc)/nb_proc) -> df.TARGET; +df.TARGET; +#+end_src + +#+RESULTS: exp20_target +#+begin_example +# A tibble: 64 x 8 + Phase Rank time NELEM y x Target.x Target.y + + 1 2 1 51.55150 143862 0.01265736 0.01562670 0.01834748 0.015625 + 2 2 2 69.44857 143921 0.02970895 0.03125982 0.03265310 0.031250 + 3 2 3 70.37532 143822 0.04698809 0.04688218 0.04677993 0.046875 + 4 2 4 73.12350 143695 0.06494199 0.06249074 0.06036775 0.062500 + 5 2 5 62.68656 143675 0.08033332 0.07809713 0.07585795 0.078125 + 6 2 6 68.69897 143752 0.09720087 0.09371189 0.09051732 0.093750 + 7 2 7 69.42635 143768 0.11424701 0.10932838 0.10486498 0.109375 + 8 2 8 63.86800 143792 0.12992842 0.12494748 0.12003864 0.125000 + 9 2 9 53.87067 143945 0.14315520 0.14058320 0.13759219 0.140625 +10 2 10 82.17739 144243 0.16333208 0.15625129 0.15075179 0.156250 +# ... 
with 54 more rows
+#+end_example
+
+#+begin_src R :results output :session :exports both
+df.TARGET %>% select(Rank, x, y) -> df.1;
+df.TARGET %>% select(Rank, Target.x, Target.y) -> df.2;
+df.1 %>%
+    mutate(Category = "Observed") %>%
+    bind_rows(df.2 %>%
+              mutate(Category = "Target") %>%
+              rename(x = Target.x, y = Target.y)) -> df.TARGET.tidy;
+df.TARGET %>% select(x, y, Target.x, Target.y) %>% filter(x == Target.x)
+#+end_src
+
+#+RESULTS:
+: # A tibble: 1 x 4
+:       x     y Target.x Target.y
+:   <dbl> <dbl>    <dbl>    <dbl>
+: 1     1     1        1        1
+
+#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 800 :height 400 :session
+m1=0.5
+m2=m1+0.3
+NP = df.TARGET.tidy %>% pull(Rank) %>% unique %>% length
+df.TARGET.tidy %>%
+    ggplot(aes(x=x, y=y, group=Category)) +
+    theme_bw(base_size=20) +
+    geom_curve (data=df.TARGET %>% filter(x!=1), aes(x=x, y=y, xend=Target.x, yend=Target.y), arrow=arrow(length = unit(0.03, "npc"))) +
+    geom_point(size=4, aes(shape=Category, color=as.factor(Rank%%8))) +
+    scale_color_brewer(palette = "Set1") +
+#    geom_line(data=df.TARGET.tidy %>% filter(Category == "Observed")) +
+    coord_cartesian(xlim=c(m1, m2), ylim=c(m1, m2)) +
+    geom_hline(yintercept=(0:NP)/NP)
+#+end_src
+
+#+RESULTS:
+[[file:/tmp/babel-16900S5M/figure169004sh.png]]
+
+#+begin_src R :results output :session :exports both
+df_cum_inverse(df.CUMSUM.2, yval = 0.02)
+#+end_src
+
+#+RESULTS:
+: [1] 300
+: [1] 0.02
+: [1] 5
+:            x          y
+: 5 0.01334887 0.01937010
+: 6 0.01669545 0.02087636
+: [1] 0.01474838
+**** 9. 
Output the new =rank-elements.dat= file + +#+header: :var dep0=exp20_target +#+begin_src R :results output :session :exports both +df.TARGET %>% + select(Rank, Target.x) %>% + mutate(Target.x.aux = c(0, Target.x[-n()])) %>% + mutate(Target = (Target.x - Target.x.aux) * NP) %>% + select(Rank, Target) %>% + write_delim("rank-elements.dat", delim=" ", col_names=FALSE) +#+end_src + +#+RESULTS: + +*** Visualization / Analysis +**** Rough initial LB Analysis +#+header: :var dep1=exp20_points + +#+header: :var dep0=exp20_enrich +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session +exp.ENRICH %>% + group_by(Phase, Rank, Code) %>% + summarize(Sum = sum(time)) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + filter(Rank != 0) %>% + filter(Code == "Computation") %>% + ggplot(aes(x=as.integer(Phase), y=Sum, color=Rank)) + + geom_point() + geom_line(aes(group=Rank), alpha=.1) + + ylim(0,NA) + + theme_bw(base_size=18) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-16900S5M/figure16900R5y.png]] +** 4-node grisou (64 cores + 1 core) with 10 iterations :EXP21: +*** Data Transformation +**** 0. Experiment directory + +#+name: exp21_dir +#+begin_src shell :results output +echo -n "exp_21_grisou_4_manual/scorep-20180125_1431_56251511746269/" +#+end_src + +#+RESULTS: exp21_dir +: exp_21_grisou_4_manual/scorep-20180125_1431_56251511746269/ + +**** 1. 
Read Iteration Timings + +#+name: exp21_iteration_timings +#+header: :var DIR=exp21_dir +#+begin_src R :results output :session :exports both +suppressMessages(library(tidyverse)); +do.call("bind_rows", lapply(list.files(DIR, pattern="iterations.*csv$", full.names=TRUE), + function(file) { + read_table(file, col_names=FALSE) %>% + rename(Time.User = X1, + Time.System = X2, + Time.Run = X3, + Rank = X4, + Iteration = X5) %>% + filter(Iteration != 0) %>% + group_by(Rank, Iteration) %>% + summarize(Start = min(Time.User), + End = max(Time.User)) %>% + group_by(Rank) %>% + mutate(End = End - min(Start), + Start = Start - min(Start)) + })) -> exp.iter; +#+end_src + +**** 2. Transform profile.cubex + +#+name: exp21_cubex_to_open +#+header: :var DIR=exp21_dir +#+begin_src shell :results output +cd $DIR +~/install/cube-4.3.5/bin/cube_dump -c all -m all -s csv2 profile.cubex > profile.csv +~/install/cube-4.3.5/bin/cube_dump -w profile.cubex | tail -n+33 | head -n279 | tr '(' ';' | sed -e "s/[[:space:]]*//g" -e "s/,.*$//" -e "s/id=//" -e "s/:[^;]*;/;/" > regions-codes.csv +~/install/cube-4.3.5/bin/cube_dump -w profile.cubex > cube_info.txt +#+end_src + +#+RESULTS: exp21_cubex_to_open + +**** 3. Parse the call tree + +#+name: exp21_cube_calltree +#+header: :var dep0=exp21_cubex_to_open +#+header: :var DIR=exp21_dir +#+begin_src perl :results output :exports both +use strict; +my($filename) = $DIR . 
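+# (Comment added for clarity.)  As in the exp20 workflow, everything outside
+# the CALL TREE section of cube_info.txt is skipped.  Note that the iteration
+# number is captured as a zero-padded string (e.g. "003") and is only cast to
+# an integer later, on the R side, with as.integer(Phase).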
"/cube_info.txt";
+my($line);
+open(INPUT,$filename);
+my($in_CALLTREE) = 0;
+
+my($VAR_iteration) = -1;
+my($VAR_type) = -1;
+my($VAR_id) = -1;
+
+my($filename_out) = $filename;
+$filename_out =~ s/txt$/csv/;
+
+open(OUTPUT,"> ".$filename_out);
+
+while(defined($line=<INPUT>)) {
+    chomp $line;
+    if($line =~ "CALL TREE") { $in_CALLTREE = 1; }
+    if(!$in_CALLTREE) { next; }
+    if($line =~ "SYSTEM DIMENSION") { $in_CALLTREE = 0; }
+
+    if($line =~ /^iter(\d*)\s*.*id=(\d*),.*/) {
+        $VAR_iteration = $1;
+        $VAR_type = "Computation";
+        $VAR_id = $2;
+        # print "|$VAR_iteration | $VAR_type | $VAR_id |\n";
+        print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n";
+    }
+    if($line =~ /^\s*\|\-(\S*)\s.*id=(\d*),.*/) {
+#        print $line."\n";
+        $VAR_type = $1;
+        $VAR_id = $2;
+#        print "|$VAR_iteration | $VAR_type | $VAR_id |\n";
+        print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n";
+    }
+}
+close(OUTPUT);
+print $filename_out;
+#+end_src
+
+#+RESULTS: exp21_cube_calltree
+: exp_20_grisou_4_manual/scorep-20180125_1159_34483171676480//cube_info.csv
+
+**** 4. 
Enrich the call tree + +#+name: exp21_enrich +#+header: :var CSV=exp21_cube_calltree +#+header: :var DIR=exp21_dir +#+begin_src R :results output :session :exports both +PROFILE = paste0(DIR, "/profile.csv"); +REGION = paste0(DIR, "/regions-codes.csv"); +df.PROF <- read_csv(PROFILE); +exp.REGION <- read_delim(REGION, col_names=FALSE, delim=";") %>% + rename(ID = X2, Name = X1); +exp.ENRICH <- read_delim(CSV, delim=",", col_names=FALSE) %>% + rename(Phase = X1, Code = X2, ID = X3) %>% + left_join(df.PROF, by=c("ID" = "Cnode ID")) %>% + rename(Rank = `Thread ID`) %>% + filter(Code == "Computation") %>% + mutate(Phase = as.integer(Phase)) %>% + filter(Rank != 0) +#+end_src + +#+RESULTS: exp21_enrich +#+begin_example +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +#+end_example + +**** X. 
Verify the load imbalance (and eventually summarize it if necessary) +***** Rough initial LB Analysis + +#+header: :var dep0=exp21_enrich +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session +exp.ENRICH %>% + group_by(Phase, Rank, Code) %>% + summarize(Sum = sum(time)) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + ggplot(aes(x=as.integer(Phase), y=Sum, color=Rank)) + + geom_point() + geom_line(aes(group=Rank), alpha=.2) + + ylim(0,NA) + + theme_bw(base_size=18) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-16900S5M/figure16900F4Q.png]] + +***** Calculate the average + +#+name: exp21_lb_average +#+header: :var dep0=exp21_enrich +#+begin_src R :results output :session :exports both +exp.ENRICH %>% + filter(Phase <= 9) %>% + group_by(Rank) %>% + summarize(time = mean(time)) -> exp.ENRICH +exp.ENRICH +#+end_src + +#+RESULTS: exp21_lb_average +#+begin_example +# A tibble: 64 x 2 + Rank time + + 1 1 42.92296 + 2 2 64.98924 + 3 3 66.13120 + 4 4 62.14705 + 5 5 64.66830 + 6 6 59.88956 + 7 7 61.76306 + 8 8 67.09068 + 9 9 57.22863 +10 10 64.70995 +# ... with 54 more rows +#+end_example + +**** 5. 
The number of elements on T1 + +#+name: exp21_number_of_elements_shell +#+header: :var DIR=exp21_dir +#+begin_src shell :results table :colnames yes +cd $DIR +echo "Type Rank NELEM NPOIN NBOUN NPOI32" +cat domain-*.csv | grep ^.*T1 +#+end_src + +#+name: exp21_number_of_elements +#+header: :var TABLE=exp21_number_of_elements_shell +#+begin_src R :results output :session :exports both :colnames yes +TABLE %>% + as_tibble() %>% + filter(Rank != 0) %>% + select(-Type) -> exp.ELEMENTS; +exp.ELEMENTS; +#+end_src + +#+RESULTS: exp21_number_of_elements +#+begin_example +# A tibble: 64 x 5 + Rank NELEM NPOIN NBOUN NPOI32 + + 1 1 143862 27303 2832 2362 + 2 2 143921 48988 11415 4174 + 3 3 143822 47644 9906 4947 + 4 4 143695 43667 8197 3921 + 5 5 143675 46242 9442 4191 + 6 6 143752 43209 7433 4876 + 7 7 143768 48927 11007 4580 + 8 8 143792 47241 10828 2428 + 9 9 143945 39022 6023 3238 +10 10 144243 47255 9907 4088 +# ... with 54 more rows +#+end_example + +**** 6. Integrate number of elements with compute cost + +#+name: exp21_integrate +#+header: :var dep1=exp21_lb_average +#+header: :var dep0=exp21_number_of_elements +#+begin_src R :results output :session :exports both +exp.ENRICH %>% + left_join(exp.ELEMENTS) %>% + select(Rank, time, NELEM) -> df.INTEGRATED; +df.INTEGRATED +#+end_src + +#+RESULTS: exp21_integrate +#+begin_example +Joining, by = "Rank" +# A tibble: 64 x 3 + Rank time NELEM + + 1 1 42.92296 143862 + 2 2 64.98924 143921 + 3 3 66.13120 143822 + 4 4 62.14705 143695 + 5 5 64.66830 143675 + 6 6 59.88956 143752 + 7 7 61.76306 143768 + 8 8 67.09068 143792 + 9 9 57.22863 143945 +10 10 64.70995 144243 +# ... with 54 more rows +#+end_example + +**** 7. 
Do the cumsum + +#+name: exp21_cumsum +#+header: :var dep0=exp21_integrate +#+begin_src R :results output :session :exports both +df.INTEGRATED %>% + arrange(Rank) %>% + mutate(TIMESUM = cumsum(time)/sum(time)) %>% + mutate(NELEMSUM = cumsum(NELEM)/sum(NELEM)) -> df.CUMSUM; +df.CUMSUM -> df.exp21.CUMSUM; +df.CUMSUM +#+end_src + +#+RESULTS: exp21_cumsum +#+begin_example +# A tibble: 64 x 5 + Rank time NELEM TIMESUM NELEMSUM + + 1 1 42.92296 143862 0.01094528 0.01562670 + 2 2 64.98924 143921 0.02751743 0.03125982 + 3 3 66.13120 143822 0.04438077 0.04688218 + 4 4 62.14705 143695 0.06022817 0.06249074 + 5 5 64.66830 143675 0.07671848 0.07809713 + 6 6 59.88956 143752 0.09199021 0.09371189 + 7 7 61.76306 143768 0.10773969 0.10932838 + 8 8 67.09068 143792 0.12484770 0.12494748 + 9 9 57.22863 143945 0.13944090 0.14058320 +10 10 64.70995 144243 0.15594183 0.15625129 +# ... with 54 more rows +#+end_example + +**** 8. Target with Inverting Hilbert Load Curve +***** The function +#+name: exp21_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df_cum_inverse = function(df,yval) { + df = rbind(data.frame(x=0,y=0),df[names(df) %in% c("x","y")]) + N = nrow(df); + xval = rep(NA,length(yval)) +# print(N); + for(i in 1:length(yval)) { +# print(yval[i]) + if(yval[i]<=0) {xval[i] = 0;} + else if(yval[i]>=1) {xval[i] = 1;} + else { + idx = df %>% mutate(Order = 1:nrow(.)) %>% filter(y >= yval[i]) %>% slice(1) %>% pull(Order) + idx = idx-1; +# print(idx) +# print(df[idx:(idx+1),]) + eps_x = (df[idx+1,]$x - df[idx,]$x); + eps_y = (df[idx+1,]$y - df[idx,]$y); + xval[i] = ((yval[i]-df[idx,]$y)*eps_x/eps_y + df[idx,]$x); + } + } + return(xval); +} +#+end_src + +#+RESULTS: exp21_hilbert_invertion_function +***** Calculate the Target +#+name: exp21_target +#+header: :var dep0=exp21_cumsum +#+header: :var dep1=exp21_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df.CUMSUM %>% + rename(y = TIMESUM, x = NELEMSUM) -> df.CUMSUM.2; 
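+# (Comment added for clarity.)  After this rename, df.CUMSUM.2 describes the
+# cumulative load curve y(x): y is the cumulative share of compute time and
+# x the cumulative share of mesh elements, both taken along the Hilbert (SFC)
+# rank order.  df_cum_inverse() below inverts this piecewise-linear curve at
+# the evenly spaced loads k/nb_proc, yielding element-share cut points that
+# would equalize the per-rank compute time.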
+nb_proc = nrow(df.CUMSUM.2)
+df_cum_inverse(df.CUMSUM.2, y = (1:nb_proc)/nb_proc) -> l.TARGET;
+df.CUMSUM.2 %>%
+    mutate(Target.x = l.TARGET,
+           Target.y = (1:nb_proc)/nb_proc) -> df.TARGET;
+df.TARGET;
+#+end_src
+
+#+name: exp21_tidying
+#+header: :var dep0=exp21_target
+#+begin_src R :results output :session :exports both
+df.TARGET %>% select(Rank, x, y) -> df.1;
+df.TARGET %>% select(Rank, Target.x, Target.y) -> df.2;
+df.1 %>%
+    mutate(Category = "Observed") %>%
+    bind_rows(df.2 %>%
+              mutate(Category = "Target") %>%
+              rename(x = Target.x, y = Target.y)) -> df.TARGET.tidy;
+df.TARGET %>% select(x, y, Target.x, Target.y) %>% filter(x == Target.x)
+#+end_src
+
+#+RESULTS:
+: # A tibble: 1 x 4
+:       x     y Target.x Target.y
+:   <dbl> <dbl>    <dbl>    <dbl>
+: 1     1     1        1        1
+***** Plot with the arrows
+
+#+header: :var dep0=exp21_tidying
+#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 800 :height 400 :session
+m1=0
+m2=m1+1
+NP = df.TARGET.tidy %>% pull(Rank) %>% unique %>% length
+df.TARGET.tidy %>%
+    ggplot(aes(x=x, y=y, group=Category)) +
+    theme_bw(base_size=20) +
+    geom_curve (data=df.TARGET %>% filter(x!=1), aes(x=x, y=y, xend=Target.x, yend=Target.y), arrow=arrow(length = unit(0.03, "npc"))) +
+    geom_point(size=1, aes(shape=Category, color=as.factor(Rank%%8))) +
+    scale_color_brewer(palette = "Set1") +
+#    geom_line(data=df.TARGET.tidy %>% filter(Category == "Observed")) +
+    coord_cartesian(xlim=c(m1, m2), ylim=c(m1, m2)) #+
+#    geom_hline(yintercept=(0:NP)/NP)
+#+end_src
+
+#+RESULTS:
+[[file:/tmp/babel-16900S5M/figure16900iIJ.png]]
+
+**** 9. 
Output the new =rank-elements.dat= file + +#+header: :var dep0=exp21_target +#+begin_src R :results output :session :exports both +df.TARGET %>% + select(Rank, Target.x) %>% + mutate(Target.x.aux = c(0, Target.x[-n()])) %>% + mutate(Target = (Target.x - Target.x.aux) * NP) %>% + select(Rank, Target) %>% + write_delim("rank-elements.dat", delim=" ", col_names=FALSE) +#+end_src + +#+RESULTS: +**** 9.1 Check which rank has the larger correction + +#+header: :var dep0=exp21_target +#+begin_src R :results output :session :exports both +t = 61 +df.TARGET %>% + select(Rank, Target.x) %>% + mutate(Target.x.aux = c(0, Target.x[-n()])) %>% + mutate(Target = (Target.x - Target.x.aux) * NP) %>% + select(Rank, Target) %>% + filter(Rank >= t) +#+end_src + +#+RESULTS: +: # A tibble: 4 x 2 +: Rank Target +: +: 1 61 0.9816559 +: 2 62 1.5405989 +: 3 63 1.1311167 +: 4 64 1.0905741 + +#+begin_src R :results output :session :exports both +df.INTEGRATED %>% + mutate(F = time / NELEM) %>% + mutate(diff = mean(time) - time) %>% + filter(Rank >= t) +#+end_src + +#+RESULTS: +: # A tibble: 4 x 5 +: Rank time NELEM F diff +: +: 1 61 32.55414 143674 0.0002265834 30.051996 +: 2 62 48.51618 143894 0.0003371661 14.089961 +: 3 63 56.97007 143735 0.0003963549 5.636066 +: 4 64 57.51256 144024 0.0003993262 5.093579 + +*** Visualization / Analysis +**** Rough initial LB Analysis + +#+header: :var dep0=exp21_enrich +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session +exp.ENRICH %>% + group_by(Phase, Rank, Code) %>% + summarize(Sum = sum(time)) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + filter(Rank != 0) %>% + filter(Code == "Computation") %>% + ggplot(aes(x=as.integer(Phase), y=Sum, color=Rank)) + + geom_point(alpha=.2, size=3) + geom_line(aes(group=Rank), alpha=.2) + + ylim(0,NA) + + theme_bw(base_size=18) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-16900S5M/figure16900YNW.png]] +*** Trying to craft a model based on 
nelem and ltype(elements) +**** 0. The number of elements on T2 + +#+name: exp21_number_of_elements_shell_t2 +#+header: :var DIR=exp21_dir +#+begin_src shell :results output +cd $DIR +TEMPFILE=/tmp/xis +#$(mktemp) +echo "Type,Rank,Element.ID,Element.Type" > $TEMPFILE +cat domain-*.csv | grep ^.*T2 | sed -e "s/ */,/g" -e "s/^,//" >> $TEMPFILE +echo -n $TEMPFILE +#+end_src + +#+RESULTS: exp21_number_of_elements_shell_t2 +: /tmp/xis + +#+header: :var T2FILE=exp21_number_of_elements_shell_t2 +#+begin_src R :results output :session :exports both +T2FILE %>% + read_csv %>% + select(-Type) %>% + group_by(Rank, Element.Type) %>% + summarize(N=n()) %>% + mutate(Element.Type = case_when(Element.Type == 30 ~ "Tetra", + Element.Type == 37 ~ "Hexa", + Element.Type == 32 ~ "Pyra", + Element.Type == 34 ~ "Penta", + TRUE ~ "Unknown")) -> exp.T2DATA; +exp.T2DATA; +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + Type = col_character(), + Rank = col_integer(), + Element.ID = col_integer(), + Element.Type = col_integer() +) +# A tibble: 138 x 3 +# Groups: Rank [64] + Rank Element.Type N + + 1 1 Tetra 143767 + 2 1 Pyra 32 + 3 1 Penta 63 + 4 2 Tetra 90229 + 5 2 Pyra 76 + 6 2 Penta 53616 + 7 3 Tetra 95492 + 8 3 Penta 48330 + 9 4 Tetra 102585 +10 4 Penta 41110 +# ... with 128 more rows +#+end_example + +#+name: exp21_number_of_elements +#+header: :var TABLE=exp21_number_of_elements_shell +#+begin_src R :results output :session :exports both :colnames yes +TABLE %>% + as_tibble() %>% + filter(Rank != 0) %>% + select(-Type) -> exp.ELEMENTS; +exp.ELEMENTS; +#+end_src + +**** 1. 
Dataset from previous workflow +***** Get the data +#+header: :var dep0=exp21_enrich +#+header: :var dep1=exp21_number_of_elements +#+begin_src R :results output :session :exports both +exp.ENRICH %>% + left_join(exp.T2DATA) %>% + select(Phase, Rank, time, Element.Type, N) -> exp.DATASET; +#+end_src + +#+RESULTS: +: Joining, by = "Rank" + +#+begin_src R :results output :session :exports both +exp.ENRICH %>% + left + +#exp.DATASET %>% +# select( +# spread(Type, Amount, - +#+end_src + +#+RESULTS: +: Error in overscope_eval_next(overscope, expr) : +: object 'Element.Type' not found + + + +***** Try to fit + +#+begin_src R :results output :session :exports both +summary(lm(data=exp.DATASET, time ~ N+Element.Type)) +#NELEM+NPOIN+NBOUN+NPOI32)) +#+end_src + +#+RESULTS: +#+begin_example + +Call: +lm(formula = time ~ Element.Type, data = exp.DATASET) + +Residuals: + Min 1Q Median 3Q Max +-39.874 -5.407 3.344 9.236 27.574 + +Coefficients: + Estimate Std. Error t value Pr(>|t|) +(Intercept) 6.127e+01 5.416e-01 113.141 < 2e-16 *** +Element.TypePyra -9.350e+00 1.473e+00 -6.346 2.99e-10 *** +Element.TypeTetra 1.003e-14 7.659e-01 0.000 1 +--- +Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 + +Residual standard error: 13.7 on 1377 degrees of freedom +Multiple R-squared: 0.03041, Adjusted R-squared: 0.02901 +F-statistic: 21.6 on 2 and 1377 DF, p-value: 5.82e-10 +#+end_example +***** Plot the chaos +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session +ggplot(data=exp.DATASET, aes(x=NELEM,y=time, color=Rank)) + geom_point() +#+end_src + +#+RESULTS: +[[file:/tmp/babel-16900S5M/figure16900sNQ.png]] +** 4-node grisou (64 cores + 1 core) with 10 iters and rank elements :EXP22: +*** Data Transformation +**** 0. 
Experiment directory + +#+name: exp22_dir +#+begin_src shell :results output +echo -n "exp_22_grisou_4_manual/scorep-20180125_1509_61786778640531" +#+end_src + +#+RESULTS: exp22_dir +: exp_22_grisou_4_manual/scorep-20180125_1509_61786778640531 + +**** 1. Read Iteration Timings + +#+name: exp22_iteration_timings +#+header: :var DIR=exp22_dir +#+begin_src R :results output :session :exports both +suppressMessages(library(tidyverse)); +do.call("bind_rows", lapply(list.files(DIR, pattern="iterations.*csv$", full.names=TRUE), + function(file) { + read_table(file, col_names=FALSE) %>% + rename(Time.User = X1, + Time.System = X2, + Time.Run = X3, + Rank = X4, + Iteration = X5) %>% + filter(Iteration != 0) %>% + group_by(Rank, Iteration) %>% + summarize(Start = min(Time.User), + End = max(Time.User)) %>% + group_by(Rank) %>% + mutate(End = End - min(Start), + Start = Start - min(Start)) + })) -> exp.iter; +#+end_src + +**** 2. Transform profile.cubex + +#+name: exp22_cubex_to_open +#+header: :var DIR=exp22_dir +#+begin_src shell :results output +cd $DIR +~/install/cube-4.3.5/bin/cube_dump -c all -m all -s csv2 profile.cubex > profile.csv +~/install/cube-4.3.5/bin/cube_dump -w profile.cubex | tail -n+33 | head -n279 | tr '(' ';' | sed -e "s/[[:space:]]*//g" -e "s/,.*$//" -e "s/id=//" -e "s/:[^;]*;/;/" > regions-codes.csv +~/install/cube-4.3.5/bin/cube_dump -w profile.cubex > cube_info.txt +#+end_src + +#+RESULTS: exp22_cubex_to_open + +**** 3. Parse the call tree + +#+name: exp22_cube_calltree +#+header: :var dep0=exp22_cubex_to_open +#+header: :var DIR=exp22_dir +#+begin_src perl :results output :exports both +use strict; +my($filename) = $DIR . 
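+# (Comment added for clarity.)  The final `print $filename_out;` is what the
+# #+RESULTS: block captures, so Babel can feed the generated CSV path to the
+# next step through its `:var CSV=exp22_cube_calltree` header.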
"/cube_info.txt";
+my($line);
+open(INPUT,$filename);
+my($in_CALLTREE) = 0;
+
+my($VAR_iteration) = -1;
+my($VAR_type) = -1;
+my($VAR_id) = -1;
+
+my($filename_out) = $filename;
+$filename_out =~ s/txt$/csv/;
+
+open(OUTPUT,"> ".$filename_out);
+
+while(defined($line=<INPUT>)) {
+    chomp $line;
+    if($line =~ "CALL TREE") { $in_CALLTREE = 1; }
+    if(!$in_CALLTREE) { next; }
+    if($line =~ "SYSTEM DIMENSION") { $in_CALLTREE = 0; }
+
+    if($line =~ /^iter(\d*)\s*.*id=(\d*),.*/) {
+        $VAR_iteration = $1;
+        $VAR_type = "Computation";
+        $VAR_id = $2;
+        # print "|$VAR_iteration | $VAR_type | $VAR_id |\n";
+        print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n";
+    }
+    if($line =~ /^\s*\|\-(\S*)\s.*id=(\d*),.*/) {
+#        print $line."\n";
+        $VAR_type = $1;
+        $VAR_id = $2;
+#        print "|$VAR_iteration | $VAR_type | $VAR_id |\n";
+        print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n";
+    }
+}
+close(OUTPUT);
+print $filename_out;
+#+end_src
+
+#+RESULTS: exp22_cube_calltree
+: exp_20_grisou_4_manual/scorep-20180125_1159_34483171676480//cube_info.csv
+
+**** 4. 
Enrich the call tree + +#+name: exp22_enrich +#+header: :var CSV=exp22_cube_calltree +#+header: :var DIR=exp22_dir +#+begin_src R :results output :session :exports both +PROFILE = paste0(DIR, "/profile.csv"); +REGION = paste0(DIR, "/regions-codes.csv"); +df.PROF <- read_csv(PROFILE); +exp.REGION <- read_delim(REGION, col_names=FALSE, delim=";") %>% + rename(ID = X2, Name = X1); +exp.ENRICH <- read_delim(CSV, delim=",", col_names=FALSE) %>% + rename(Phase = X1, Code = X2, ID = X3) %>% + left_join(df.PROF, by=c("ID" = "Cnode ID")) %>% + rename(Rank = `Thread ID`) %>% + filter(Code == "Computation") %>% + mutate(Phase = as.integer(Phase)) %>% + filter(Rank != 0) +#+end_src + +#+RESULTS: exp22_enrich +#+begin_example +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +#+end_example + +**** X. 
Verify the load imbalance (and eventually summarize it if necessary) +***** Rough initial LB Analysis + +#+header: :var dep0=exp22_enrich +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session +exp.ENRICH %>% + group_by(Phase, Rank, Code) %>% + summarize(Sum = sum(time)) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + ggplot(aes(x=as.integer(Phase), y=Sum, color=Rank)) + + geom_point() + geom_line(aes(group=Rank), alpha=.2) + + ylim(0,NA) + + theme_bw(base_size=18) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-16900S5M/figure16900skL.png]] + +***** Calculate the average + +#+name: exp22_lb_average +#+header: :var dep0=exp22_enrich +#+begin_src R :results output :session :exports both +exp.ENRICH %>% + filter(Phase <= 9) %>% + group_by(Rank) %>% + summarize(time = mean(time)) -> exp.ENRICH +exp.ENRICH +#+end_src + +#+RESULTS: exp22_lb_average +#+begin_example +# A tibble: 64 x 2 + Rank time + + 1 1 42.92296 + 2 2 64.98924 + 3 3 66.13120 + 4 4 62.14705 + 5 5 64.66830 + 6 6 59.88956 + 7 7 61.76306 + 8 8 67.09068 + 9 9 57.22863 +10 10 64.70995 +# ... with 54 more rows +#+end_example + +**** 5. 
The number of elements on T1 + +#+name: exp22_number_of_elements_shell +#+header: :var DIR=exp22_dir +#+begin_src shell :results table :colnames yes +cd $DIR +echo "Type Rank NELEM NPOIN NBOUN NPOI32" +cat domain-*.csv | grep ^.*T1 +#+end_src + +#+name: exp22_number_of_elements +#+header: :var TABLE=exp22_number_of_elements_shell +#+begin_src R :results output :session :exports both :colnames yes +TABLE %>% + as_tibble() %>% + filter(Rank != 0) %>% + select(-Type) -> exp.ELEMENTS; +exp.ELEMENTS; +#+end_src + +#+RESULTS: exp22_number_of_elements +#+begin_example +# A tibble: 64 x 5 + Rank NELEM NPOIN NBOUN NPOI32 + + 1 1 143862 27303 2832 2362 + 2 2 143921 48988 11415 4174 + 3 3 143822 47644 9906 4947 + 4 4 143695 43667 8197 3921 + 5 5 143675 46242 9442 4191 + 6 6 143752 43209 7433 4876 + 7 7 143768 48927 11007 4580 + 8 8 143792 47241 10828 2428 + 9 9 143945 39022 6023 3238 +10 10 144243 47255 9907 4088 +# ... with 54 more rows +#+end_example + +**** 6. Integrate number of elements with compute cost + +#+name: exp22_integrate +#+header: :var dep1=exp22_lb_average +#+header: :var dep0=exp22_number_of_elements +#+begin_src R :results output :session :exports both +exp.ENRICH %>% + left_join(exp.ELEMENTS) %>% + select(Rank, time, NELEM) -> df.INTEGRATED; +df.INTEGRATED +#+end_src + +#+RESULTS: exp22_integrate +#+begin_example +Joining, by = "Rank" +# A tibble: 64 x 3 + Rank time NELEM + + 1 1 42.92296 143862 + 2 2 64.98924 143921 + 3 3 66.13120 143822 + 4 4 62.14705 143695 + 5 5 64.66830 143675 + 6 6 59.88956 143752 + 7 7 61.76306 143768 + 8 8 67.09068 143792 + 9 9 57.22863 143945 +10 10 64.70995 144243 +# ... with 54 more rows +#+end_example + +**** 7. 
Do the cumsum + +#+name: exp22_cumsum +#+header: :var dep0=exp22_integrate +#+begin_src R :results output :session :exports both +df.INTEGRATED %>% + arrange(Rank) %>% + mutate(TIMESUM = cumsum(time)/sum(time)) %>% + mutate(NELEMSUM = cumsum(NELEM)/sum(NELEM)) -> df.CUMSUM; +df.CUMSUM -> df.exp22.CUMSUM; +df.CUMSUM +#+end_src + +#+RESULTS: exp22_cumsum +#+begin_example +# A tibble: 64 x 5 + Rank time NELEM TIMESUM NELEMSUM + + 1 1 60.43743 184387 0.01510068 0.02002865 + 2 2 68.98525 135390 0.03233709 0.03473510 + 3 3 62.07687 134635 0.04784740 0.04935954 + 4 4 61.41787 140751 0.06319305 0.06464832 + 5 5 66.77389 137213 0.07987694 0.07955279 + 6 6 60.27102 146712 0.09493604 0.09548907 + 7 7 69.88611 141375 0.11239754 0.11084562 + 8 8 63.52919 131541 0.12827072 0.12513399 + 9 9 64.41341 152492 0.14436482 0.14169811 +10 10 66.84620 136200 0.16106678 0.15649254 +# ... with 54 more rows +#+end_example + +**** 8. Target with Inverting Hilbert Load Curve +***** The function +#+name: exp22_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df_cum_inverse = function(df,yval) { + df = rbind(data.frame(x=0,y=0),df[names(df) %in% c("x","y")]) + N = nrow(df); + xval = rep(NA,length(yval)) +# print(N); + for(i in 1:length(yval)) { +# print(yval[i]) + if(yval[i]<=0) {xval[i] = 0;} + else if(yval[i]>=1) {xval[i] = 1;} + else { + idx = df %>% mutate(Order = 1:nrow(.)) %>% filter(y >= yval[i]) %>% slice(1) %>% pull(Order) + idx = idx-1; +# print(idx) +# print(df[idx:(idx+1),]) + eps_x = (df[idx+1,]$x - df[idx,]$x); + eps_y = (df[idx+1,]$y - df[idx,]$y); + xval[i] = ((yval[i]-df[idx,]$y)*eps_x/eps_y + df[idx,]$x); + } + } + return(xval); +} +#+end_src + +#+RESULTS: exp22_hilbert_invertion_function +***** Calculate the Target +#+name: exp22_target +#+header: :var dep0=exp22_cumsum +#+header: :var dep1=exp22_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df.CUMSUM %>% + rename(y = TIMESUM, x = NELEMSUM) -> df.CUMSUM.2; 
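+## Sanity check added while editing this example document (not part of the
+## original recorded run): df_cum_inverse, defined above, should invert a
+## made-up piecewise-linear cumulative curve through (0.5, 0.25) and (1, 1).
+## stopifnot() stays silent when the check passes, so the results below are
+## unaffected.
+toy <- data.frame(x=c(0.5, 1), y=c(0.25, 1))
+stopifnot(isTRUE(all.equal(df_cum_inverse(toy, c(0.125, 0.25, 0.625)),
+                           c(0.25, 0.5, 0.75))))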
+nb_proc = nrow(df.CUMSUM.2)
+df_cum_inverse(df.CUMSUM.2, y = (1:nb_proc)/nb_proc) -> l.TARGET;
+df.CUMSUM.2 %>%
+    mutate(Target.x = l.TARGET,
+           Target.y = (1:nb_proc)/nb_proc) -> df.TARGET;
+df.TARGET;
+#+end_src
+
+#+name: exp22_tidying
+#+header: :var dep0=exp22_target
+#+begin_src R :results output :session :exports both
+df.TARGET %>% select(Rank, x, y) -> df.1;
+df.TARGET %>% select(Rank, Target.x, Target.y) -> df.2;
+df.1 %>%
+    mutate(Category = "Observed") %>%
+    bind_rows(df.2 %>%
+              mutate(Category = "Target") %>%
+              rename(x = Target.x, y = Target.y)) -> df.TARGET.tidy;
+df.TARGET %>% select(x, y, Target.x, Target.y) %>% filter(x == Target.x)
+#+end_src
+
+#+RESULTS:
+: # A tibble: 1 x 4
+: x y Target.x Target.y
+: 
+: 1 1 1 1 1
+
+***** Plot with the arrows
+
+#+header: :var dep0=exp22_tidying
+#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 800 :height 400 :session
+m1=0
+m2=m1+1
+NP = df.TARGET.tidy %>% pull(Rank) %>% unique %>% length
+df.TARGET.tidy %>%
+    ggplot(aes(x=x, y=y, group=Category)) +
+    theme_bw(base_size=20) +
+    geom_curve(data=df.TARGET %>% filter(x!=1), aes(x=x, y=y, xend=Target.x, yend=Target.y), arrow=arrow(length = unit(0.03, "npc"))) +
+    geom_point(size=1, aes(shape=Category, color=as.factor(Rank%%8))) +
+    scale_color_brewer(palette = "Set1") +
+#    geom_line(data=df.TARGET.tidy %>% filter(Category == "Observed")) +
+    coord_cartesian(xlim=c(m1, m2), ylim=c(m1, m2)) #+
+#    geom_hline(yintercept=(0:NP)/NP)
+#+end_src
+
+#+RESULTS:
+[[file:/tmp/babel-16900S5M/figure16900k8u.png]]
+
+**** 9. 
Output the new =rank-elements.dat= file + +#+header: :var dep0=exp22_target +#+begin_src R :results output :session :exports both +df.TARGET %>% + select(Rank, Target.x) %>% + mutate(Target.x.aux = c(0, Target.x[-n()])) %>% + mutate(Target = (Target.x - Target.x.aux) * NP) %>% + select(Rank, Target) %>% + write_delim("rank-elements.dat", delim=" ", col_names=FALSE) +#+end_src + +#+RESULTS: + +**** 9.1 Check which rank has the larger correction + +#+header: :var dep0=exp22_target +#+begin_src R :results output :session :exports both +t = 61 +df.TARGET %>% + select(Rank, Target.x) %>% + mutate(Target.x.aux = c(0, Target.x[-n()])) %>% + mutate(Target = (Target.x - Target.x.aux) * NP) %>% + select(Rank, Target) %>% + filter(Rank == 30) +#+end_src + +#+RESULTS: +: # A tibble: 1 x 2 +: Rank Target +: +: 1 30 1.325439 + +#+begin_src R :results output :session :exports both +df.INTEGRATED %>% + mutate(F = time / NELEM) %>% + mutate(diff = mean(time) - time) %>% + filter(Rank >= t) +#+end_src + +#+RESULTS: +: # A tibble: 4 x 5 +: Rank time NELEM F diff +: +: 1 61 72.95337 141232 0.0005165499 -10.41747 +: 2 62 81.25564 221395 0.0003670166 -18.71973 +: 3 63 29.42375 162676 0.0001808734 33.11215 +: 4 64 66.49157 157024 0.0004234484 -3.95566 + +#+begin_src R :results output :session :exports both +exp.ENRICH %>% pull(Rank) %>% min +exp.ENRICH %>% summary +exp.ENRICH %>% pull(time) %>% sd +#+end_src + +#+RESULTS: +: [1] 1 +: Rank time +: Min. : 1.00 Min. :29.42 +: 1st Qu.:16.75 1st Qu.:60.40 +: Median :32.50 Median :62.84 +: Mean :32.50 Mean :62.54 +: 3rd Qu.:48.25 3rd Qu.:66.38 +: Max. :64.00 Max. 
:89.45 +: [1] 11.35958 + +*** Visualization / Analysis +**** Rough initial LB Analysis + +#+header: :var dep0=exp22_enrich +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session +exp.ENRICH %>% + group_by(Phase, Rank, Code) %>% + summarize(Sum = sum(time)) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + filter(Rank != 0) %>% + filter(Code == "Computation") %>% + ggplot(aes(x=as.integer(Phase), y=Sum, color=Rank)) + + geom_point(alpha=.2, size=3) + geom_line(aes(group=Rank), alpha=.2) + + ylim(0,NA) + + theme_bw(base_size=18) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-16900S5M/figure16900xgb.png]] +**** Merge the cumsum of exp21 and exp22 +#+header: :var dep0=exp21_cumsum +#+header: :var dep1=exp22_cumsum + +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 1200 :height 800 :session +z0 = 0.5 +z1 = z0+0.5 +df.exp21.CUMSUM %>% + mutate(Type = "exp21") %>% + bind_rows(df.exp22.CUMSUM %>% + mutate(Type = "exp22")) %>% +ggplot(aes(x=NELEMSUM, y=TIMESUM)) + + geom_line() + + geom_point(aes(color=Type, size=time)) + + coord_cartesian(xlim=c(z0, z1), ylim=c(z0, z1)) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-16900S5M/figure16900M6C.png]] +** 4-node grisou (64 cores + 1 core) 10its, new round :EXP23: +*** Data Transformation +**** 0. Experiment directory + +#+name: exp23_dir +#+begin_src shell :results output +echo -n "exp_23_grisou_4_manual/scorep-20180125_1607_70169635606280" +#+end_src + +#+RESULTS: exp23_dir +: exp_23_grisou_4_manual/scorep-20180125_1607_70169635606280 + +**** 1. 
Read Iteration Timings + +#+name: exp23_iteration_timings +#+header: :var DIR=exp23_dir +#+begin_src R :results output :session :exports both +suppressMessages(library(tidyverse)); +do.call("bind_rows", lapply(list.files(DIR, pattern="iterations.*csv$", full.names=TRUE), + function(file) { + read_table(file, col_names=FALSE) %>% + rename(Time.User = X1, + Time.System = X2, + Time.Run = X3, + Rank = X4, + Iteration = X5) %>% + filter(Iteration != 0) %>% + group_by(Rank, Iteration) %>% + summarize(Start = min(Time.User), + End = max(Time.User)) %>% + group_by(Rank) %>% + mutate(End = End - min(Start), + Start = Start - min(Start)) + })) -> exp.iter; +#+end_src + +**** 2. Transform profile.cubex + +#+name: exp23_cubex_to_open +#+header: :var DIR=exp23_dir +#+begin_src shell :results output +cd $DIR +~/install/cube-4.3.5/bin/cube_dump -c all -m all -s csv2 profile.cubex > profile.csv +~/install/cube-4.3.5/bin/cube_dump -w profile.cubex | tail -n+33 | head -n279 | tr '(' ';' | sed -e "s/[[:space:]]*//g" -e "s/,.*$//" -e "s/id=//" -e "s/:[^;]*;/;/" > regions-codes.csv +~/install/cube-4.3.5/bin/cube_dump -w profile.cubex > cube_info.txt +#+end_src + +#+RESULTS: exp23_cubex_to_open + +**** 3. Parse the call tree + +#+name: exp23_cube_calltree +#+header: :var dep0=exp23_cubex_to_open +#+header: :var DIR=exp23_dir +#+begin_src perl :results output :exports both +use strict; +my($filename) = $DIR . 
"/cube_info.txt";
+my($line);
+open(INPUT,$filename) or die "Cannot open $filename: $!";
+my($in_CALLTREE) = 0;
+
+my($VAR_iteration) = -1;
+my($VAR_type) = -1;
+my($VAR_id) = -1;
+
+my($filename_out) = $filename;
+$filename_out =~ s/txt$/csv/;
+
+open(OUTPUT,"> ".$filename_out) or die "Cannot open $filename_out: $!";
+
+while(defined($line=<INPUT>)) {
+    chomp $line;
+    if($line =~ "CALL TREE") { $in_CALLTREE = 1; }
+    if(!$in_CALLTREE) { next; }
+    if($line =~ "SYSTEM DIMENSION") { $in_CALLTREE = 0; }
+
+    if($line =~ /^iter(\d*)\s*.*id=(\d*),.*/) {
+        $VAR_iteration = $1;
+        $VAR_type = "Computation";
+        $VAR_id = $2;
+        # print "|$VAR_iteration | $VAR_type | $VAR_id |\n";
+        print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n";
+    }
+    if($line =~ /^\s*\|\-(\S*)\s.*id=(\d*),.*/) {
+#        print $line."\n";
+        $VAR_type = $1;
+        $VAR_id = $2;
+#        print "|$VAR_iteration | $VAR_type | $VAR_id |\n";
+        print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n";
+    }
+}
+close(INPUT);
+close(OUTPUT);
+print $filename_out;
+#+end_src
+
+#+RESULTS: exp23_cube_calltree
+: exp_20_grisou_4_manual/scorep-20180125_1159_34483171676480//cube_info.csv
+
+**** 4. 
Enrich the call tree + +#+name: exp23_enrich +#+header: :var CSV=exp23_cube_calltree +#+header: :var DIR=exp23_dir +#+begin_src R :results output :session :exports both +PROFILE = paste0(DIR, "/profile.csv"); +REGION = paste0(DIR, "/regions-codes.csv"); +df.PROF <- read_csv(PROFILE); +exp.REGION <- read_delim(REGION, col_names=FALSE, delim=";") %>% + rename(ID = X2, Name = X1); +exp.ENRICH <- read_delim(CSV, delim=",", col_names=FALSE) %>% + rename(Phase = X1, Code = X2, ID = X3) %>% + left_join(df.PROF, by=c("ID" = "Cnode ID")) %>% + rename(Rank = `Thread ID`) %>% + filter(Code == "Computation") %>% + mutate(Phase = as.integer(Phase)) %>% + filter(Rank != 0) +#+end_src + +#+RESULTS: exp23_enrich +#+begin_example +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +#+end_example + +**** X. 
Verify the load imbalance (and eventually summarize it if necessary) +***** Rough initial LB Analysis + +#+header: :var dep0=exp23_enrich +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session +exp.ENRICH %>% + group_by(Phase, Rank, Code) %>% + summarize(Sum = sum(time)) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + ggplot(aes(x=as.integer(Phase), y=Sum, color=Rank)) + + geom_point() + geom_line(aes(group=Rank), alpha=.2) + + ylim(0,NA) + + theme_bw(base_size=18) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-16900S5M/figure16900TRG.png]] + +***** Calculate the average + +#+name: exp23_lb_average +#+header: :var dep0=exp23_enrich +#+begin_src R :results output :session :exports both +exp.ENRICH %>% + filter(Phase <= 9) %>% + group_by(Rank) %>% + summarize(time = mean(time)) -> exp.ENRICH +exp.ENRICH +#+end_src + +#+RESULTS: exp23_lb_average +#+begin_example +# A tibble: 64 x 2 + Rank time + + 1 1 42.92296 + 2 2 64.98924 + 3 3 66.13120 + 4 4 62.14705 + 5 5 64.66830 + 6 6 59.88956 + 7 7 61.76306 + 8 8 67.09068 + 9 9 57.22863 +10 10 64.70995 +# ... with 54 more rows +#+end_example + +**** 5. 
The number of elements on T1 + +#+name: exp23_number_of_elements_shell +#+header: :var DIR=exp23_dir +#+begin_src shell :results table :colnames yes +cd $DIR +echo "Type Rank NELEM NPOIN NBOUN NPOI32" +cat domain-*.csv | grep ^.*T1 +#+end_src + +#+name: exp23_number_of_elements +#+header: :var TABLE=exp23_number_of_elements_shell +#+begin_src R :results output :session :exports both :colnames yes +TABLE %>% + as_tibble() %>% + filter(Rank != 0) %>% + select(-Type) -> exp.ELEMENTS; +exp.ELEMENTS; +#+end_src + +#+RESULTS: exp23_number_of_elements +#+begin_example +# A tibble: 64 x 5 + Rank NELEM NPOIN NBOUN NPOI32 + + 1 1 143862 27303 2832 2362 + 2 2 143921 48988 11415 4174 + 3 3 143822 47644 9906 4947 + 4 4 143695 43667 8197 3921 + 5 5 143675 46242 9442 4191 + 6 6 143752 43209 7433 4876 + 7 7 143768 48927 11007 4580 + 8 8 143792 47241 10828 2428 + 9 9 143945 39022 6023 3238 +10 10 144243 47255 9907 4088 +# ... with 54 more rows +#+end_example + +**** 6. Integrate number of elements with compute cost + +#+name: exp23_integrate +#+header: :var dep1=exp23_lb_average +#+header: :var dep0=exp23_number_of_elements +#+begin_src R :results output :session :exports both +exp.ENRICH %>% + left_join(exp.ELEMENTS) %>% + select(Rank, time, NELEM) -> df.INTEGRATED; +df.INTEGRATED +#+end_src + +#+RESULTS: exp23_integrate +#+begin_example +Joining, by = "Rank" +# A tibble: 64 x 3 + Rank time NELEM + + 1 1 42.92296 143862 + 2 2 64.98924 143921 + 3 3 66.13120 143822 + 4 4 62.14705 143695 + 5 5 64.66830 143675 + 6 6 59.88956 143752 + 7 7 61.76306 143768 + 8 8 67.09068 143792 + 9 9 57.22863 143945 +10 10 64.70995 144243 +# ... with 54 more rows +#+end_example + +**** 7. 
Do the cumsum + +#+name: exp23_cumsum +#+header: :var dep0=exp23_integrate +#+begin_src R :results output :session :exports both +df.INTEGRATED %>% + arrange(Rank) %>% + mutate(TIMESUM = cumsum(time)/sum(time)) %>% + mutate(NELEMSUM = cumsum(NELEM)/sum(NELEM)) -> df.CUMSUM; +df.CUMSUM -> df.exp23.CUMSUM; +df.CUMSUM +#+end_src + +#+RESULTS: exp23_cumsum +#+begin_example +# A tibble: 64 x 5 + Rank time NELEM TIMESUM NELEMSUM + + 1 1 60.43743 184387 0.01510068 0.02002865 + 2 2 68.98525 135390 0.03233709 0.03473510 + 3 3 62.07687 134635 0.04784740 0.04935954 + 4 4 61.41787 140751 0.06319305 0.06464832 + 5 5 66.77389 137213 0.07987694 0.07955279 + 6 6 60.27102 146712 0.09493604 0.09548907 + 7 7 69.88611 141375 0.11239754 0.11084562 + 8 8 63.52919 131541 0.12827072 0.12513399 + 9 9 64.41341 152492 0.14436482 0.14169811 +10 10 66.84620 136200 0.16106678 0.15649254 +# ... with 54 more rows +#+end_example + +**** 8. Target with Inverting Hilbert Load Curve +***** The function +#+name: exp23_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df_cum_inverse = function(df,yval) { + df = rbind(data.frame(x=0,y=0),df[names(df) %in% c("x","y")]) + N = nrow(df); + xval = rep(NA,length(yval)) +# print(N); + for(i in 1:length(yval)) { +# print(yval[i]) + if(yval[i]<=0) {xval[i] = 0;} + else if(yval[i]>=1) {xval[i] = 1;} + else { + idx = df %>% mutate(Order = 1:nrow(.)) %>% filter(y >= yval[i]) %>% slice(1) %>% pull(Order) + idx = idx-1; +# print(idx) +# print(df[idx:(idx+1),]) + eps_x = (df[idx+1,]$x - df[idx,]$x); + eps_y = (df[idx+1,]$y - df[idx,]$y); + xval[i] = ((yval[i]-df[idx,]$y)*eps_x/eps_y + df[idx,]$x); + } + } + return(xval); +} +#+end_src + +#+RESULTS: exp23_hilbert_invertion_function +***** Calculate the Target +#+name: exp23_target +#+header: :var dep0=exp23_cumsum +#+header: :var dep1=exp23_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df.CUMSUM %>% + rename(y = TIMESUM, x = NELEMSUM) -> df.CUMSUM.2; 
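+## Sanity check added while editing this example document (not part of the
+## original recorded run): both cumulative fractions must be non-decreasing
+## and reach 1 (up to floating-point rounding), otherwise inverting the
+## curve below makes no sense. stopifnot() is silent when the check passes.
+stopifnot(!is.unsorted(df.CUMSUM.2$x), !is.unsorted(df.CUMSUM.2$y),
+          isTRUE(all.equal(max(df.CUMSUM.2$x), 1)),
+          isTRUE(all.equal(max(df.CUMSUM.2$y), 1)))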
+nb_proc = nrow(df.CUMSUM.2)
+df_cum_inverse(df.CUMSUM.2, y = (1:nb_proc)/nb_proc) -> l.TARGET;
+df.CUMSUM.2 %>%
+    mutate(Target.x = l.TARGET,
+           Target.y = (1:nb_proc)/nb_proc) -> df.TARGET;
+df.TARGET;
+#+end_src
+
+#+name: exp23_tidying
+#+header: :var dep0=exp23_target
+#+begin_src R :results output :session :exports both
+df.TARGET %>% select(Rank, x, y) -> df.1;
+df.TARGET %>% select(Rank, Target.x, Target.y) -> df.2;
+df.1 %>%
+    mutate(Category = "Observed") %>%
+    bind_rows(df.2 %>%
+              mutate(Category = "Target") %>%
+              rename(x = Target.x, y = Target.y)) -> df.TARGET.tidy;
+df.TARGET %>% select(x, y, Target.x, Target.y) %>% filter(x == Target.x)
+#+end_src
+
+#+RESULTS:
+: # A tibble: 1 x 4
+: x y Target.x Target.y
+: 
+: 1 1 1 1 1
+
+***** Plot with the arrows
+
+#+header: :var dep0=exp23_tidying
+#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 800 :height 400 :session
+m1=0
+m2=m1+1
+NP = df.TARGET.tidy %>% pull(Rank) %>% unique %>% length
+df.TARGET.tidy %>%
+    ggplot(aes(x=x, y=y, group=Category)) +
+    theme_bw(base_size=20) +
+    geom_curve(data=df.TARGET %>% filter(x!=1), aes(x=x, y=y, xend=Target.x, yend=Target.y), arrow=arrow(length = unit(0.03, "npc"))) +
+    geom_point(size=1, aes(shape=Category, color=as.factor(Rank%%8))) +
+    scale_color_brewer(palette = "Set1") +
+#    geom_line(data=df.TARGET.tidy %>% filter(Category == "Observed")) +
+    coord_cartesian(xlim=c(m1, m2), ylim=c(m1, m2)) #+
+#    geom_hline(yintercept=(0:NP)/NP)
+#+end_src
+
+#+RESULTS:
+[[file:/tmp/babel-16900S5M/figure16900k8u.png]]
+
+**** 9. 
Output the new =rank-elements.dat= file + +#+header: :var dep0=exp23_target +#+begin_src R :results output :session :exports both +df.TARGET %>% + select(Rank, Target.x) %>% + mutate(Target.x.aux = c(0, Target.x[-n()])) %>% + mutate(Target = (Target.x - Target.x.aux) * NP) %>% + select(Rank, Target) %>% + write_delim("rank-elements.dat", delim=" ", col_names=FALSE) +#+end_src + +#+RESULTS: + +**** 9.1 Check which rank has the larger correction + +#+header: :var dep0=exp23_target +#+begin_src R :results output :session :exports both +t = 61 +df.TARGET %>% + select(Rank, Target.x) %>% + mutate(Target.x.aux = c(0, Target.x[-n()])) %>% + mutate(Target = (Target.x - Target.x.aux) * NP) %>% + select(Rank, Target) %>% + filter(Rank >= t) +#+end_src + +#+RESULTS: +: # A tibble: 4 x 2 +: Rank Target +: +: 1 61 0.9190534 +: 2 62 1.0775470 +: 3 63 1.1810391 +: 4 64 1.0176706 + +#+begin_src R :results output :session :exports both +df.INTEGRATED %>% + mutate(F = time / NELEM) %>% + mutate(diff = mean(time) - time) %>% + filter(Rank >= t) +#+end_src + +#+RESULTS: +: # A tibble: 4 x 5 +: Rank time NELEM F diff +: +: 1 61 72.95337 141232 0.0005165499 -10.41747 +: 2 62 81.25564 221395 0.0003670166 -18.71973 +: 3 63 29.42375 162676 0.0001808734 33.11215 +: 4 64 66.49157 157024 0.0004234484 -3.95566 + +#+begin_src R :results output :session :exports both +exp.ENRICH %>% pull(Rank) %>% min +exp.ENRICH %>% summary +exp.ENRICH %>% pull(time) %>% sd +exp.ENRICH %>% arrange(-time) +#+end_src + +#+RESULTS: +#+begin_example +[1] 1 + Rank time + Min. : 1.00 Min. :31.29 + 1st Qu.:16.75 1st Qu.:61.21 + Median :32.50 Median :62.77 + Mean :32.50 Mean :62.20 + 3rd Qu.:48.25 3rd Qu.:64.36 + Max. :64.00 Max. :94.47 +[1] 9.980286 +# A tibble: 64 x 2 + Rank time + + 1 30 94.46917 + 2 63 91.98761 + 3 41 79.08293 + 4 21 76.44483 + 5 62 71.74396 + 6 29 70.84170 + 7 43 70.29370 + 8 42 69.40785 + 9 17 67.79000 +10 4 66.53386 +# ... 
with 54 more rows
+#+end_example
+*** Visualization / Analysis
+**** Merge the cumsum of exp21, exp22 and exp23
+#+header: :var dep0=exp21_cumsum
+#+header: :var dep1=exp22_cumsum
+#+header: :var dep2=exp23_cumsum
+#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 1200 :height 800 :session
+z0 = 0
+z1 = z0+1
+df.exp21.CUMSUM %>%
+    mutate(Type = "exp21") %>%
+    bind_rows(df.exp22.CUMSUM %>% mutate(Type = "exp22")) %>%
+    bind_rows(df.exp23.CUMSUM %>% mutate(Type = "exp23")) %>%
+    group_by(Type) %>%
+    mutate(Rank.Worst = (time == max(time))) %>%
+    ggplot(aes(x=NELEMSUM, y=TIMESUM)) +
+    geom_line() +
+    geom_point(aes(color=Type, size=time, shape=Rank.Worst)) +
+    coord_cartesian(xlim=c(z0, z1), ylim=c(z0, z1)) +
+    theme_bw(base_size = 22)
+#+end_src
+
+#+RESULTS:
+[[file:/tmp/babel-16900S5M/figure16900PxZ.png]]
+** 4-node grisou (64 cores + 1 core) 5its (STABLE: HT, TB) :EXP24:
+*** Differences
+- New round
+- No hyperthreading
+- No turboboost
+- MPI bind to core
+*** Data Transformation
+**** 0. Experiment directory
+
+#+name: exp24_dir
+#+begin_src shell :results output
+echo -n "exp_24_grisou_4_manual/scorep-20180125_1706_78570606376628/";
+#+end_src
+
+#+RESULTS: exp24_dir
+: exp_24_grisou_4_manual/scorep-20180125_1706_78570606376628/
+
+**** 1. Read Iteration Timings
+
+#+name: exp24_iteration_timings
+#+header: :var DIR=exp24_dir
+#+begin_src R :results output :session :exports both
+suppressMessages(library(tidyverse));
+do.call("bind_rows", lapply(list.files(DIR, pattern="iterations.*csv$", full.names=TRUE),
+                            function(file) {
+                                read_table(file, col_names=FALSE) %>%
+                                    rename(Time.User = X1,
+                                           Time.System = X2,
+                                           Time.Run = X3,
+                                           Rank = X4,
+                                           Iteration = X5) %>%
+                                    filter(Iteration != 0) %>%
+                                    group_by(Rank, Iteration) %>%
+                                    summarize(Start = min(Time.User),
+                                              End = max(Time.User)) %>%
+                                    group_by(Rank) %>%
+                                    mutate(End = End - min(Start),
+                                           Start = Start - min(Start))
+                            })) -> exp.iter;
+#+end_src
+
+**** 2. 
Transform profile.cubex
+
+#+name: exp24_cubex_to_open
+#+header: :var DIR=exp24_dir
+#+begin_src shell :results output
+cd $DIR
+~/install/cube-4.3.5/bin/cube_dump -c all -m all -s csv2 profile.cubex > profile.csv
+~/install/cube-4.3.5/bin/cube_dump -w profile.cubex | tail -n+33 | head -n279 | tr '(' ';' | sed -e "s/[[:space:]]*//g" -e "s/,.*$//" -e "s/id=//" -e "s/:[^;]*;/;/" > regions-codes.csv
+~/install/cube-4.3.5/bin/cube_dump -w profile.cubex > cube_info.txt
+#+end_src
+
+#+RESULTS: exp24_cubex_to_open
+
+**** 3. Parse the call tree
+
+#+name: exp24_cube_calltree
+#+header: :var dep0=exp24_cubex_to_open
+#+header: :var DIR=exp24_dir
+#+begin_src perl :results output :exports both
+use strict;
+my($filename) = $DIR . "/cube_info.txt";
+my($line);
+open(INPUT,$filename) or die "Cannot open $filename: $!";
+my($in_CALLTREE) = 0;
+
+my($VAR_iteration) = -1;
+my($VAR_type) = -1;
+my($VAR_id) = -1;
+
+my($filename_out) = $filename;
+$filename_out =~ s/txt$/csv/;
+
+open(OUTPUT,"> ".$filename_out) or die "Cannot open $filename_out: $!";
+
+while(defined($line=<INPUT>)) {
+    chomp $line;
+    if($line =~ "CALL TREE") { $in_CALLTREE = 1; }
+    if(!$in_CALLTREE) { next; }
+    if($line =~ "SYSTEM DIMENSION") { $in_CALLTREE = 0; }
+
+    if($line =~ /^iter(\d*)\s*.*id=(\d*),.*/) {
+        $VAR_iteration = $1;
+        $VAR_type = "Computation";
+        $VAR_id = $2;
+        # print "|$VAR_iteration | $VAR_type | $VAR_id |\n";
+        print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n";
+    }
+    if($line =~ /^\s*\|\-(\S*)\s.*id=(\d*),.*/) {
+#        print $line."\n";
+        $VAR_type = $1;
+        $VAR_id = $2;
+#        print "|$VAR_iteration | $VAR_type | $VAR_id |\n";
+        print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n";
+    }
+}
+close(INPUT);
+close(OUTPUT);
+print $filename_out;
+#+end_src
+
+#+RESULTS: exp24_cube_calltree
+: exp_20_grisou_4_manual/scorep-20180125_1159_34483171676480//cube_info.csv
+
+**** 4. 
Enrich the call tree + +#+name: exp24_enrich +#+header: :var CSV=exp24_cube_calltree +#+header: :var DIR=exp24_dir +#+begin_src R :results output :session :exports both +PROFILE = paste0(DIR, "/profile.csv"); +REGION = paste0(DIR, "/regions-codes.csv"); +df.PROF <- read_csv(PROFILE); +exp.REGION <- read_delim(REGION, col_names=FALSE, delim=";") %>% + rename(ID = X2, Name = X1); +exp.ENRICH <- read_delim(CSV, delim=",", col_names=FALSE) %>% + rename(Phase = X1, Code = X2, ID = X3) %>% + left_join(df.PROF, by=c("ID" = "Cnode ID")) %>% + rename(Rank = `Thread ID`) %>% + filter(Code == "Computation") %>% + mutate(Phase = as.integer(Phase)) %>% + filter(Rank != 0) +#+end_src + +#+RESULTS: exp24_enrich +#+begin_example +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +#+end_example + +**** X. 
Verify the load imbalance (and eventually summarize it if necessary) +***** Rough initial LB Analysis + +#+header: :var dep0=exp24_enrich +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session +exp.ENRICH %>% + group_by(Phase, Rank, Code) %>% + summarize(Sum = sum(time)) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + ggplot(aes(x=as.integer(Phase), y=Sum, color=Rank)) + + geom_point() + geom_line(aes(group=Rank), alpha=.2) + + ylim(0,NA) + + theme_bw(base_size=18) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-27902p-4/figure27902TuK.png]] + +***** Calculate the average + +#+name: exp24_lb_average +#+header: :var dep0=exp24_enrich +#+begin_src R :results output :session :exports both +exp.ENRICH %>% + filter(Phase <= 9) %>% + group_by(Rank) %>% + summarize(time = mean(time)) -> exp.ENRICH +exp.ENRICH +#+end_src + +#+RESULTS: exp24_lb_average +#+begin_example +# A tibble: 64 x 2 + Rank time + + 1 1 42.92296 + 2 2 64.98924 + 3 3 66.13120 + 4 4 62.14705 + 5 5 64.66830 + 6 6 59.88956 + 7 7 61.76306 + 8 8 67.09068 + 9 9 57.22863 +10 10 64.70995 +# ... with 54 more rows +#+end_example + +**** 5. 
The number of elements on T1 + +#+name: exp24_number_of_elements_shell +#+header: :var DIR=exp24_dir +#+begin_src shell :results table :colnames yes +cd $DIR +echo "Type Rank NELEM NPOIN NBOUN NPOI32" +cat domain-*.csv | grep ^.*T1 +#+end_src + +#+name: exp24_number_of_elements +#+header: :var TABLE=exp24_number_of_elements_shell +#+begin_src R :results output :session :exports both :colnames yes +TABLE %>% + as_tibble() %>% + filter(Rank != 0) %>% + select(-Type) -> exp.ELEMENTS; +exp.ELEMENTS; +#+end_src + +#+RESULTS: exp24_number_of_elements +#+begin_example +# A tibble: 64 x 5 + Rank NELEM NPOIN NBOUN NPOI32 + + 1 1 143862 27303 2832 2362 + 2 2 143921 48988 11415 4174 + 3 3 143822 47644 9906 4947 + 4 4 143695 43667 8197 3921 + 5 5 143675 46242 9442 4191 + 6 6 143752 43209 7433 4876 + 7 7 143768 48927 11007 4580 + 8 8 143792 47241 10828 2428 + 9 9 143945 39022 6023 3238 +10 10 144243 47255 9907 4088 +# ... with 54 more rows +#+end_example + +**** 6. Integrate number of elements with compute cost + +#+name: exp24_integrate +#+header: :var dep1=exp24_lb_average +#+header: :var dep0=exp24_number_of_elements +#+begin_src R :results output :session :exports both +exp.ENRICH %>% + left_join(exp.ELEMENTS) %>% + select(Rank, time, NELEM) -> df.INTEGRATED; +df.INTEGRATED +#+end_src + +#+RESULTS: exp24_integrate +#+begin_example +Joining, by = "Rank" +# A tibble: 64 x 3 + Rank time NELEM + + 1 1 42.92296 143862 + 2 2 64.98924 143921 + 3 3 66.13120 143822 + 4 4 62.14705 143695 + 5 5 64.66830 143675 + 6 6 59.88956 143752 + 7 7 61.76306 143768 + 8 8 67.09068 143792 + 9 9 57.22863 143945 +10 10 64.70995 144243 +# ... with 54 more rows +#+end_example + +**** 7. 
Do the cumsum + +#+name: exp24_cumsum +#+header: :var dep0=exp24_integrate +#+begin_src R :results output :session :exports both +df.INTEGRATED %>% + arrange(Rank) %>% + mutate(TIMESUM = cumsum(time)/sum(time)) %>% + mutate(NELEMSUM = cumsum(NELEM)/sum(NELEM)) -> df.CUMSUM; +df.CUMSUM -> df.exp24.CUMSUM; +df.CUMSUM +#+end_src + +#+RESULTS: exp24_cumsum +#+begin_example +# A tibble: 64 x 5 + Rank time NELEM TIMESUM NELEMSUM + + 1 1 60.43743 184387 0.01510068 0.02002865 + 2 2 68.98525 135390 0.03233709 0.03473510 + 3 3 62.07687 134635 0.04784740 0.04935954 + 4 4 61.41787 140751 0.06319305 0.06464832 + 5 5 66.77389 137213 0.07987694 0.07955279 + 6 6 60.27102 146712 0.09493604 0.09548907 + 7 7 69.88611 141375 0.11239754 0.11084562 + 8 8 63.52919 131541 0.12827072 0.12513399 + 9 9 64.41341 152492 0.14436482 0.14169811 +10 10 66.84620 136200 0.16106678 0.15649254 +# ... with 54 more rows +#+end_example + +**** 8. Target with Inverting Hilbert Load Curve +***** The function +#+name: exp24_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df_cum_inverse = function(df,yval) { + df = rbind(data.frame(x=0,y=0),df[names(df) %in% c("x","y")]) + N = nrow(df); + xval = rep(NA,length(yval)) +# print(N); + for(i in 1:length(yval)) { +# print(yval[i]) + if(yval[i]<=0) {xval[i] = 0;} + else if(yval[i]>=1) {xval[i] = 1;} + else { + idx = df %>% mutate(Order = 1:nrow(.)) %>% filter(y >= yval[i]) %>% slice(1) %>% pull(Order) + idx = idx-1; +# print(idx) +# print(df[idx:(idx+1),]) + eps_x = (df[idx+1,]$x - df[idx,]$x); + eps_y = (df[idx+1,]$y - df[idx,]$y); + xval[i] = ((yval[i]-df[idx,]$y)*eps_x/eps_y + df[idx,]$x); + } + } + return(xval); +} +#+end_src + +#+RESULTS: exp24_hilbert_invertion_function +***** Calculate the Target +#+name: exp24_target +#+header: :var dep0=exp24_cumsum +#+header: :var dep1=exp24_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df.CUMSUM %>% + rename(y = TIMESUM, x = NELEMSUM) -> df.CUMSUM.2; 
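+## Boundary check added while editing this example document (not part of
+## the original recorded run): df_cum_inverse should clamp its result to
+## [0,1] at the endpoints. stopifnot() is silent when the check passes.
+stopifnot(isTRUE(all.equal(df_cum_inverse(df.CUMSUM.2, c(0, 1)), c(0, 1))))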
+nb_proc = nrow(df.CUMSUM.2)
+df_cum_inverse(df.CUMSUM.2, y = (1:nb_proc)/nb_proc) -> l.TARGET;
+df.CUMSUM.2 %>%
+    mutate(Target.x = l.TARGET,
+           Target.y = (1:nb_proc)/nb_proc) -> df.TARGET;
+df.TARGET;
+#+end_src
+
+#+name: exp24_tidying
+#+header: :var dep0=exp24_target
+#+begin_src R :results output :session :exports both
+df.TARGET %>% select(Rank, x, y) -> df.1;
+df.TARGET %>% select(Rank, Target.x, Target.y) -> df.2;
+df.1 %>%
+    mutate(Category = "Observed") %>%
+    bind_rows(df.2 %>%
+              mutate(Category = "Target") %>%
+              rename(x = Target.x, y = Target.y)) -> df.TARGET.tidy;
+df.TARGET %>% select(x, y, Target.x, Target.y) %>% filter(x == Target.x)
+#+end_src
+
+#+RESULTS:
+: # A tibble: 1 x 4
+: x y Target.x Target.y
+: 
+: 1 1 1 1 1
+
+***** Plot with the arrows
+
+#+header: :var dep0=exp24_tidying
+#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 800 :height 400 :session
+m1=0
+m2=m1+1
+NP = df.TARGET.tidy %>% pull(Rank) %>% unique %>% length
+df.TARGET.tidy %>%
+    ggplot(aes(x=x, y=y, group=Category)) +
+    theme_bw(base_size=20) +
+    geom_curve(data=df.TARGET %>% filter(x!=1), aes(x=x, y=y, xend=Target.x, yend=Target.y), arrow=arrow(length = unit(0.03, "npc"))) +
+    geom_point(size=1, aes(shape=Category, color=as.factor(Rank%%8))) +
+    scale_color_brewer(palette = "Set1") +
+#    geom_line(data=df.TARGET.tidy %>% filter(Category == "Observed")) +
+    coord_cartesian(xlim=c(m1, m2), ylim=c(m1, m2)) #+
+#    geom_hline(yintercept=(0:NP)/NP)
+#+end_src
+
+#+RESULTS:
+[[file:/tmp/babel-16900S5M/figure16900k8u.png]]
+
+**** 9. 
Output the new =rank-elements.dat= file + +#+header: :var dep0=exp24_target +#+begin_src R :results output :session :exports both +df.TARGET %>% + select(Rank, Target.x) %>% + mutate(Target.x.aux = c(0, Target.x[-n()])) %>% + mutate(Target = (Target.x - Target.x.aux) * NP) %>% + select(Rank, Target) %>% + write_delim("rank-elements.dat", delim=" ", col_names=FALSE) +#+end_src + +#+RESULTS: + +**** 9.1 Check which rank has the larger correction + +#+header: :var dep0=exp24_target +#+begin_src R :results output :session :exports both +t = 61 +df.TARGET %>% + select(Rank, Target.x) %>% + mutate(Target.x.aux = c(0, Target.x[-n()])) %>% + mutate(Target = (Target.x - Target.x.aux) * NP) %>% + select(Rank, Target) %>% + filter(Rank >= t) +#+end_src + +#+RESULTS: +: # A tibble: 4 x 2 +: Rank Target +: +: 1 61 1.0533256 +: 2 62 0.7642869 +: 3 63 0.7402923 +: 4 64 0.8513753 + +#+begin_src R :results output :session :exports both +df.INTEGRATED %>% + mutate(F = time / NELEM) %>% + mutate(diff = mean(time) - time) %>% + filter(Rank >= t) +#+end_src + +#+RESULTS: +: # A tibble: 4 x 5 +: Rank time NELEM F diff +: +: 1 61 72.95337 141232 0.0005165499 -10.41747 +: 2 62 81.25564 221395 0.0003670166 -18.71973 +: 3 63 29.42375 162676 0.0001808734 33.11215 +: 4 64 66.49157 157024 0.0004234484 -3.95566 + +#+begin_src R :results output :session :exports both +exp.ENRICH %>% pull(Rank) %>% min +exp.ENRICH %>% summary +exp.ENRICH %>% pull(time) %>% sd +exp.ENRICH %>% arrange(-time) +#+end_src + +#+RESULTS: +#+begin_example +[1] 1 + Rank time + Min. : 1.00 Min. : 30.46 + 1st Qu.:16.75 1st Qu.: 59.09 + Median :32.50 Median : 63.61 + Mean :32.50 Mean : 63.07 + 3rd Qu.:48.25 3rd Qu.: 71.25 + Max. :64.00 Max. :104.52 +[1] 15.64072 +# A tibble: 64 x 2 + Rank time + + 1 63 104.52326 + 2 39 89.81473 + 3 1 88.79020 + 4 19 85.54385 + 5 25 84.91805 + 6 35 84.68214 + 7 18 81.35206 + 8 15 81.18039 + 9 34 76.98144 +10 26 76.94032 +# ... 
with 54 more rows +#+end_example +** 4-node grisou (64 cores + 1 core) 1its, new round :EXP25: +*** Differences +The same as exp24 +*** Data Transformation +**** 0. Experiment directory + +#+name: exp25_dir +#+begin_src shell :results output +echo -n "exp_25_grisou_4_manual/scorep-20180125_1730_82025208841684/" +#+end_src + +#+RESULTS: exp25_dir +: exp_25_grisou_4_manual/scorep-20180125_1730_82025208841684/ + +**** 1. Read Iteration Timings + +#+name: exp25_iteration_timings +#+header: :var DIR=exp25_dir +#+begin_src R :results output :session :exports both +suppressMessages(library(tidyverse)); +do.call("bind_rows", lapply(list.files(DIR, pattern="iterations.*csv$", full.names=TRUE), + function(file) { + read_table(file, col_names=FALSE) %>% + rename(Time.User = X1, + Time.System = X2, + Time.Run = X3, + Rank = X4, + Iteration = X5) %>% + filter(Iteration != 0) %>% + group_by(Rank, Iteration) %>% + summarize(Start = min(Time.User), + End = max(Time.User)) %>% + group_by(Rank) %>% + mutate(End = End - min(Start), + Start = Start - min(Start)) + })) -> exp.iter; +#+end_src + +**** 2. Transform profile.cubex + +#+name: exp25_cubex_to_open +#+header: :var DIR=exp25_dir +#+begin_src shell :results output +cd $DIR +~/install/cube-4.3.5/bin/cube_dump -c all -m all -s csv2 profile.cubex > profile.csv +~/install/cube-4.3.5/bin/cube_dump -w profile.cubex | tail -n+33 | head -n279 | tr '(' ';' | sed -e "s/[[:space:]]*//g" -e "s/,.*$//" -e "s/id=//" -e "s/:[^;]*;/;/" > regions-codes.csv +~/install/cube-4.3.5/bin/cube_dump -w profile.cubex > cube_info.txt +#+end_src + +#+RESULTS: exp25_cubex_to_open + +**** 3. Parse the call tree + +#+name: exp25_cube_calltree +#+header: :var dep0=exp25_cubex_to_open +#+header: :var DIR=exp25_dir +#+begin_src perl :results output :exports both +use strict; +my($filename) = $DIR . 
"/cube_info.txt";
+my($line);
+open(INPUT,$filename);
+my($in_CALLTREE) = 0;
+
+my($VAR_iteration) = -1;
+my($VAR_type) = -1;
+my($VAR_id) = -1;
+
+my($filename_out) = $filename;
+$filename_out =~ s/txt$/csv/;
+
+open(OUTPUT,"> ".$filename_out);
+
+while(defined($line=<INPUT>)) {
+ chomp $line;
+ if($line =~ "CALL TREE") { $in_CALLTREE = 1; }
+ if(!$in_CALLTREE) { next; }
+ if($line =~ "SYSTEM DIMENSION") { $in_CALLTREE = 0; }
+
+ if($line =~ /^iter(\d*)\s*.*id=(\d*),.*/) {
+ $VAR_iteration = $1;
+ $VAR_type = "Computation";
+ $VAR_id = $2;
+ # print "|$VAR_iteration | $VAR_type | $VAR_id |\n";
+ print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n";
+ }
+ if($line =~ /^\s*\|\-(\S*)\s.*id=(\d*),.*/) {
+# print $line."\n";
+ $VAR_type = $1;
+ $VAR_id = $2;
+# print "|$VAR_iteration | $VAR_type | $VAR_id |\n";
+ print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n";
+ }
+}
+close(OUTPUT);
+print $filename_out;
+#+end_src
+
+#+RESULTS: exp25_cube_calltree
+: exp_20_grisou_4_manual/scorep-20180125_1159_34483171676480//cube_info.csv
+
+**** 4. 
Enrich the call tree + +#+name: exp25_enrich +#+header: :var CSV=exp25_cube_calltree +#+header: :var DIR=exp25_dir +#+begin_src R :results output :session :exports both +PROFILE = paste0(DIR, "/profile.csv"); +REGION = paste0(DIR, "/regions-codes.csv"); +df.PROF <- read_csv(PROFILE); +exp.REGION <- read_delim(REGION, col_names=FALSE, delim=";") %>% + rename(ID = X2, Name = X1); +exp.ENRICH <- read_delim(CSV, delim=",", col_names=FALSE) %>% + rename(Phase = X1, Code = X2, ID = X3) %>% + left_join(df.PROF, by=c("ID" = "Cnode ID")) %>% + rename(Rank = `Thread ID`) %>% + filter(Code == "Computation") %>% + mutate(Phase = as.integer(Phase)) %>% + filter(Rank != 0) +#+end_src + +#+RESULTS: exp25_enrich +#+begin_example +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +#+end_example + +**** X. 
Verify the load imbalance (and eventually summarize it if necessary) +***** Rough initial LB Analysis + +#+header: :var dep0=exp25_enrich +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session +exp.ENRICH %>% + group_by(Phase, Rank, Code) %>% + summarize(Sum = sum(time)) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + ggplot(aes(x=as.integer(Phase), y=Sum, color=Rank)) + + geom_point() + geom_line(aes(group=Rank), alpha=.2) + + ylim(0,NA) + + theme_bw(base_size=18) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-16900S5M/figure16900TFI.png]] + +***** Calculate the average + +#+name: exp25_lb_average +#+header: :var dep0=exp25_enrich +#+begin_src R :results output :session :exports both +exp.ENRICH %>% + filter(Phase <= 9) %>% + group_by(Rank) %>% + summarize(time = mean(time)) -> exp.ENRICH +exp.ENRICH +#+end_src + +#+RESULTS: exp25_lb_average +#+begin_example +# A tibble: 64 x 2 + Rank time + + 1 1 42.92296 + 2 2 64.98924 + 3 3 66.13120 + 4 4 62.14705 + 5 5 64.66830 + 6 6 59.88956 + 7 7 61.76306 + 8 8 67.09068 + 9 9 57.22863 +10 10 64.70995 +# ... with 54 more rows +#+end_example + +**** 5. 
The number of elements on T1 + +#+name: exp25_number_of_elements_shell +#+header: :var DIR=exp25_dir +#+begin_src shell :results table :colnames yes +cd $DIR +echo "Type Rank NELEM NPOIN NBOUN NPOI32" +cat domain-*.csv | grep ^.*T1 +#+end_src + +#+name: exp25_number_of_elements +#+header: :var TABLE=exp25_number_of_elements_shell +#+begin_src R :results output :session :exports both :colnames yes +TABLE %>% + as_tibble() %>% + filter(Rank != 0) %>% + select(-Type) -> exp.ELEMENTS; +exp.ELEMENTS; +#+end_src + +#+RESULTS: exp25_number_of_elements +#+begin_example +# A tibble: 64 x 5 + Rank NELEM NPOIN NBOUN NPOI32 + + 1 1 143862 27303 2832 2362 + 2 2 143921 48988 11415 4174 + 3 3 143822 47644 9906 4947 + 4 4 143695 43667 8197 3921 + 5 5 143675 46242 9442 4191 + 6 6 143752 43209 7433 4876 + 7 7 143768 48927 11007 4580 + 8 8 143792 47241 10828 2428 + 9 9 143945 39022 6023 3238 +10 10 144243 47255 9907 4088 +# ... with 54 more rows +#+end_example + +**** 6. Integrate number of elements with compute cost + +#+name: exp25_integrate +#+header: :var dep1=exp25_lb_average +#+header: :var dep0=exp25_number_of_elements +#+begin_src R :results output :session :exports both +exp.ENRICH %>% + left_join(exp.ELEMENTS) %>% + select(Rank, time, NELEM) -> df.INTEGRATED; +df.INTEGRATED +#+end_src + +#+RESULTS: exp25_integrate +#+begin_example +Joining, by = "Rank" +# A tibble: 64 x 3 + Rank time NELEM + + 1 1 42.92296 143862 + 2 2 64.98924 143921 + 3 3 66.13120 143822 + 4 4 62.14705 143695 + 5 5 64.66830 143675 + 6 6 59.88956 143752 + 7 7 61.76306 143768 + 8 8 67.09068 143792 + 9 9 57.22863 143945 +10 10 64.70995 144243 +# ... with 54 more rows +#+end_example + +**** 7. 
Do the cumsum + +#+name: exp25_cumsum +#+header: :var dep0=exp25_integrate +#+begin_src R :results output :session :exports both +df.INTEGRATED %>% + arrange(Rank) %>% + mutate(TIMESUM = cumsum(time)/sum(time)) %>% + mutate(NELEMSUM = cumsum(NELEM)/sum(NELEM)) -> df.CUMSUM; +df.CUMSUM -> df.exp25.CUMSUM; +df.CUMSUM +#+end_src + +#+RESULTS: exp25_cumsum +#+begin_example +# A tibble: 64 x 5 + Rank time NELEM TIMESUM NELEMSUM + + 1 1 60.43743 184387 0.01510068 0.02002865 + 2 2 68.98525 135390 0.03233709 0.03473510 + 3 3 62.07687 134635 0.04784740 0.04935954 + 4 4 61.41787 140751 0.06319305 0.06464832 + 5 5 66.77389 137213 0.07987694 0.07955279 + 6 6 60.27102 146712 0.09493604 0.09548907 + 7 7 69.88611 141375 0.11239754 0.11084562 + 8 8 63.52919 131541 0.12827072 0.12513399 + 9 9 64.41341 152492 0.14436482 0.14169811 +10 10 66.84620 136200 0.16106678 0.15649254 +# ... with 54 more rows +#+end_example + +**** 8. Target with Inverting Hilbert Load Curve +***** The function +#+name: exp25_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df_cum_inverse = function(df,yval) { + df = rbind(data.frame(x=0,y=0),df[names(df) %in% c("x","y")]) + N = nrow(df); + xval = rep(NA,length(yval)) +# print(N); + for(i in 1:length(yval)) { +# print(yval[i]) + if(yval[i]<=0) {xval[i] = 0;} + else if(yval[i]>=1) {xval[i] = 1;} + else { + idx = df %>% mutate(Order = 1:nrow(.)) %>% filter(y >= yval[i]) %>% slice(1) %>% pull(Order) + idx = idx-1; +# print(idx) +# print(df[idx:(idx+1),]) + eps_x = (df[idx+1,]$x - df[idx,]$x); + eps_y = (df[idx+1,]$y - df[idx,]$y); + xval[i] = ((yval[i]-df[idx,]$y)*eps_x/eps_y + df[idx,]$x); + } + } + return(xval); +} +#+end_src + +#+RESULTS: exp25_hilbert_invertion_function +***** Calculate the Target +#+name: exp25_target +#+header: :var dep0=exp25_cumsum +#+header: :var dep1=exp25_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df.CUMSUM %>% + rename(y = TIMESUM, x = NELEMSUM) -> df.CUMSUM.2; 
+nb_proc = nrow(df.CUMSUM.2)
+df_cum_inverse(df.CUMSUM.2, y = (1:nb_proc)/nb_proc) -> l.TARGET;
+df.CUMSUM.2 %>%
+ mutate(Target.x = l.TARGET,
+ Target.y = (1:nb_proc)/nb_proc) -> df.TARGET;
+df.TARGET;
+#+end_src
+
+#+name: exp25_tidying
+#+header: :var dep0=exp25_target
+#+begin_src R :results output :session :exports both
+df.TARGET %>% select(Rank, x, y) -> df.1;
+df.TARGET %>% select(Rank, Target.x, Target.y) -> df.2;
+df.1 %>%
+ mutate(Category = "Observed") %>%
+ bind_rows(df.2 %>%
+ mutate(Category = "Target") %>%
+ rename(x = Target.x, y = Target.y)) -> df.TARGET.tidy;
+df.TARGET %>% select(x, y, Target.x, Target.y) %>% filter(x == Target.x)
+#+end_src
+
+#+RESULTS:
+: # A tibble: 1 x 4
+: x y Target.x Target.y
+: <dbl> <dbl> <dbl> <dbl>
+: 1 1 1 1 1
+
+***** Plot with the arrows
+
+#+header: :var dep0=exp25_tidying
+#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 800 :height 400 :session
+m1=0
+m2=m1+1
+NP = df.TARGET.tidy %>% pull(Rank) %>% unique %>% length
+df.TARGET.tidy %>%
+ ggplot(aes(x=x, y=y, group=Category)) +
+ theme_bw(base_size=20) +
+ geom_curve (data=df.TARGET %>% filter(x!=1), aes(x=x, y=y, xend=Target.x, yend=Target.y), arrow=arrow(length = unit(0.03, "npc"))) +
+ geom_point(size=1, aes(shape=Category, color=as.factor(Rank%%8))) +
+ scale_color_brewer(palette = "Set1") +
+# geom_line(data=df.TARGET.tidy %>% filter(Category == "Observed")) +
+ coord_cartesian(xlim=c(m1, m2), ylim=c(m1, m2)) #+
+# geom_hline(yintercept=(0:NP)/NP)
+#+end_src
+
+#+RESULTS:
+[[file:/tmp/babel-16900S5M/figure16900k8u.png]]
+
+**** 9. 
Output the new =rank-elements.dat= file + +#+header: :var dep0=exp25_target +#+begin_src R :results output :session :exports both +df.TARGET %>% + select(Rank, Target.x) %>% + mutate(Target.x.aux = c(0, Target.x[-n()])) %>% + mutate(Target = (Target.x - Target.x.aux) * NP) %>% + select(Rank, Target) %>% + write_delim("rank-elements.dat", delim=" ", col_names=FALSE) +#+end_src + +#+RESULTS: + +**** 9.1 Check which rank has the larger correction + +#+header: :var dep0=exp25_target +#+begin_src R :results output :session :exports both +t = 61 +df.TARGET %>% + select(Rank, Target.x) %>% + mutate(Target.x.aux = c(0, Target.x[-n()])) %>% + mutate(Target = (Target.x - Target.x.aux) * NP) %>% + select(Rank, Target) %>% + filter(Rank >= t) +#+end_src + +#+RESULTS: +: # A tibble: 4 x 2 +: Rank Target +: +: 1 61 0.8920903 +: 2 62 1.1812863 +: 3 63 1.0454509 +: 4 64 0.9953072 + +#+begin_src R :results output :session :exports both +df.INTEGRATED %>% + mutate(F = time / NELEM) %>% + mutate(diff = mean(time) - time) %>% + filter(Rank >= t) +#+end_src + +#+RESULTS: +: # A tibble: 4 x 5 +: Rank time NELEM F diff +: +: 1 61 55.86152 151485 0.0003687594 6.125037 +: 2 62 38.89725 109918 0.0003538752 23.089306 +: 3 63 48.54840 106440 0.0004561104 13.438163 +: 4 64 52.56137 122507 0.0004290479 9.425188 + +#+begin_src R :results output :session :exports both +exp.ENRICH %>% pull(Rank) %>% min +exp.ENRICH %>% summary +exp.ENRICH %>% pull(time) %>% sd +exp.ENRICH %>% arrange(-time) +#+end_src + +#+RESULTS: +#+begin_example +[1] 1 + Rank time + Min. : 1.00 Min. : 35.71 + 1st Qu.:16.75 1st Qu.: 51.84 + Median :32.50 Median : 59.71 + Mean :32.50 Mean : 61.99 + 3rd Qu.:48.25 3rd Qu.: 70.18 + Max. :64.00 Max. :101.61 +[1] 14.53319 +# A tibble: 64 x 2 + Rank time + + 1 33 101.60772 + 2 57 96.35507 + 3 20 95.07653 + 4 46 93.58917 + 5 50 89.33524 + 6 56 85.84247 + 7 38 85.29922 + 8 51 84.17110 + 9 12 74.24943 +10 48 74.22989 +# ... 
with 54 more rows
+#+end_example
+*** Visualization / Analysis
+**** Merge the cumsum of exp24 and exp25
+
+#+header: :var dep0=exp24_cumsum
+#+header: :var dep1=exp25_cumsum
+#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 1200 :height 800 :session
+z0 = 0
+z1 = z0+1
+df.exp24.CUMSUM %>% mutate(Type = "exp24") %>%
+ bind_rows(df.exp25.CUMSUM %>% mutate(Type = "exp25")) %>%
+group_by(Type) %>%
+mutate(Rank.Worst = (time == max(time)),
+ Ynonnormalized = cumsum(time)) %>%
+ggplot(aes(x=NELEMSUM, y=Ynonnormalized)) +
+ geom_line() +
+ geom_point(aes(color=Type, size=time, shape=Rank.Worst)) +
+ coord_cartesian(xlim=c(z0, z1)) + #, ylim=c(z0, z1)) +
+ theme_bw(base_size = 22)
+#+end_src
+
+#+RESULTS:
+[[file:/tmp/babel-27902p-4/figure279026aF.png]]
+** 4-node grisou (64 cores + 1 core) 5its, new round :EXP26:
+*** Data Transformation
+**** 0. Experiment directory
+
+#+name: exp26_dir
+#+begin_src shell :results output
+echo -n "exp_26_grisou_4_manual/scorep-20180125_1752_85193983406456"
+#+end_src
+
+#+RESULTS: exp26_dir
+: exp_26_grisou_4_manual/scorep-20180125_1752_85193983406456
+
+**** 1. Read Iteration Timings
+
+#+name: exp26_iteration_timings
+#+header: :var DIR=exp26_dir
+#+begin_src R :results output :session :exports both
+suppressMessages(library(tidyverse));
+do.call("bind_rows", lapply(list.files(DIR, pattern="iterations.*csv$", full.names=TRUE),
+ function(file) {
+ read_table(file, col_names=FALSE) %>%
+ rename(Time.User = X1,
+ Time.System = X2,
+ Time.Run = X3,
+ Rank = X4,
+ Iteration = X5) %>%
+ filter(Iteration != 0) %>%
+ group_by(Rank, Iteration) %>%
+ summarize(Start = min(Time.User),
+ End = max(Time.User)) %>%
+ group_by(Rank) %>%
+ mutate(End = End - min(Start),
+ Start = Start - min(Start))
+ })) -> exp.iter;
+#+end_src
+
+**** 2. 
Transform profile.cubex
+
+#+name: exp26_cubex_to_open
+#+header: :var DIR=exp26_dir
+#+begin_src shell :results output
+cd $DIR
+~/install/cube-4.3.5/bin/cube_dump -c all -m all -s csv2 profile.cubex > profile.csv
+~/install/cube-4.3.5/bin/cube_dump -w profile.cubex | tail -n+33 | head -n279 | tr '(' ';' | sed -e "s/[[:space:]]*//g" -e "s/,.*$//" -e "s/id=//" -e "s/:[^;]*;/;/" > regions-codes.csv
+~/install/cube-4.3.5/bin/cube_dump -w profile.cubex > cube_info.txt
+#+end_src
+
+#+RESULTS: exp26_cubex_to_open
+
+**** 3. Parse the call tree
+
+#+name: exp26_cube_calltree
+#+header: :var dep0=exp26_cubex_to_open
+#+header: :var DIR=exp26_dir
+#+begin_src perl :results output :exports both
+use strict;
+my($filename) = $DIR . "/cube_info.txt";
+my($line);
+open(INPUT,$filename);
+my($in_CALLTREE) = 0;
+
+my($VAR_iteration) = -1;
+my($VAR_type) = -1;
+my($VAR_id) = -1;
+
+my($filename_out) = $filename;
+$filename_out =~ s/txt$/csv/;
+
+open(OUTPUT,"> ".$filename_out);
+
+while(defined($line=<INPUT>)) {
+ chomp $line;
+ if($line =~ "CALL TREE") { $in_CALLTREE = 1; }
+ if(!$in_CALLTREE) { next; }
+ if($line =~ "SYSTEM DIMENSION") { $in_CALLTREE = 0; }
+
+ if($line =~ /^iter(\d*)\s*.*id=(\d*),.*/) {
+ $VAR_iteration = $1;
+ $VAR_type = "Computation";
+ $VAR_id = $2;
+ # print "|$VAR_iteration | $VAR_type | $VAR_id |\n";
+ print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n";
+ }
+ if($line =~ /^\s*\|\-(\S*)\s.*id=(\d*),.*/) {
+# print $line."\n";
+ $VAR_type = $1;
+ $VAR_id = $2;
+# print "|$VAR_iteration | $VAR_type | $VAR_id |\n";
+ print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n";
+ }
+}
+close(OUTPUT);
+print $filename_out;
+#+end_src
+
+#+RESULTS: exp26_cube_calltree
+: exp_20_grisou_4_manual/scorep-20180125_1159_34483171676480//cube_info.csv
+
+**** 4. 
Enrich the call tree + +#+name: exp26_enrich +#+header: :var CSV=exp26_cube_calltree +#+header: :var DIR=exp26_dir +#+begin_src R :results output :session :exports both +PROFILE = paste0(DIR, "/profile.csv"); +REGION = paste0(DIR, "/regions-codes.csv"); +df.PROF <- read_csv(PROFILE); +exp.REGION <- read_delim(REGION, col_names=FALSE, delim=";") %>% + rename(ID = X2, Name = X1); +exp.ENRICH <- read_delim(CSV, delim=",", col_names=FALSE) %>% + rename(Phase = X1, Code = X2, ID = X3) %>% + left_join(df.PROF, by=c("ID" = "Cnode ID")) %>% + rename(Rank = `Thread ID`) %>% + filter(Code == "Computation") %>% + mutate(Phase = as.integer(Phase)) %>% + filter(Rank != 0) +#+end_src + +#+RESULTS: exp26_enrich +#+begin_example +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +#+end_example + +**** X. 
Verify the load imbalance (and eventually summarize it if necessary) +***** Rough initial LB Analysis + +#+header: :var dep0=exp26_enrich +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session +exp.ENRICH %>% + group_by(Phase, Rank, Code) %>% + summarize(Sum = sum(time)) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + ggplot(aes(x=as.integer(Phase), y=Sum, color=Rank)) + + geom_point() + geom_line(aes(group=Rank), alpha=.2) + + ylim(0,NA) + + theme_bw(base_size=18) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-16900S5M/figure16900Vtv.png]] + +***** Calculate the average + +#+name: exp26_lb_average +#+header: :var dep0=exp26_enrich +#+begin_src R :results output :session :exports both +exp.ENRICH %>% + filter(Phase <= 9) %>% + group_by(Rank) %>% + summarize(time = mean(time)) -> exp.ENRICH +exp.ENRICH +#+end_src + +#+RESULTS: exp26_lb_average +#+begin_example +# A tibble: 64 x 2 + Rank time + + 1 1 42.92296 + 2 2 64.98924 + 3 3 66.13120 + 4 4 62.14705 + 5 5 64.66830 + 6 6 59.88956 + 7 7 61.76306 + 8 8 67.09068 + 9 9 57.22863 +10 10 64.70995 +# ... with 54 more rows +#+end_example + +**** 5. 
The number of elements on T1 + +#+name: exp26_number_of_elements_shell +#+header: :var DIR=exp26_dir +#+begin_src shell :results table :colnames yes +cd $DIR +echo "Type Rank NELEM NPOIN NBOUN NPOI32" +cat domain-*.csv | grep ^.*T1 +#+end_src + +#+name: exp26_number_of_elements +#+header: :var TABLE=exp26_number_of_elements_shell +#+begin_src R :results output :session :exports both :colnames yes +TABLE %>% + as_tibble() %>% + filter(Rank != 0) %>% + select(-Type) -> exp.ELEMENTS; +exp.ELEMENTS; +#+end_src + +#+RESULTS: exp26_number_of_elements +#+begin_example +# A tibble: 64 x 5 + Rank NELEM NPOIN NBOUN NPOI32 + + 1 1 143862 27303 2832 2362 + 2 2 143921 48988 11415 4174 + 3 3 143822 47644 9906 4947 + 4 4 143695 43667 8197 3921 + 5 5 143675 46242 9442 4191 + 6 6 143752 43209 7433 4876 + 7 7 143768 48927 11007 4580 + 8 8 143792 47241 10828 2428 + 9 9 143945 39022 6023 3238 +10 10 144243 47255 9907 4088 +# ... with 54 more rows +#+end_example + +**** 6. Integrate number of elements with compute cost + +#+name: exp26_integrate +#+header: :var dep1=exp26_lb_average +#+header: :var dep0=exp26_number_of_elements +#+begin_src R :results output :session :exports both +exp.ENRICH %>% + left_join(exp.ELEMENTS) %>% + select(Rank, time, NELEM) -> df.INTEGRATED; +df.INTEGRATED +#+end_src + +#+RESULTS: exp26_integrate +#+begin_example +Joining, by = "Rank" +# A tibble: 64 x 3 + Rank time NELEM + + 1 1 42.92296 143862 + 2 2 64.98924 143921 + 3 3 66.13120 143822 + 4 4 62.14705 143695 + 5 5 64.66830 143675 + 6 6 59.88956 143752 + 7 7 61.76306 143768 + 8 8 67.09068 143792 + 9 9 57.22863 143945 +10 10 64.70995 144243 +# ... with 54 more rows +#+end_example + +**** 7. 
Do the cumsum + +#+name: exp26_cumsum +#+header: :var dep0=exp26_integrate +#+begin_src R :results output :session :exports both +df.INTEGRATED %>% + arrange(Rank) %>% + mutate(TIMESUM = cumsum(time)/sum(time)) %>% + mutate(NELEMSUM = cumsum(NELEM)/sum(NELEM)) -> df.CUMSUM; +df.CUMSUM -> df.exp26.CUMSUM; +df.CUMSUM +#+end_src + +#+RESULTS: exp26_cumsum +#+begin_example +# A tibble: 64 x 5 + Rank time NELEM TIMESUM NELEMSUM + + 1 1 60.43743 184387 0.01510068 0.02002865 + 2 2 68.98525 135390 0.03233709 0.03473510 + 3 3 62.07687 134635 0.04784740 0.04935954 + 4 4 61.41787 140751 0.06319305 0.06464832 + 5 5 66.77389 137213 0.07987694 0.07955279 + 6 6 60.27102 146712 0.09493604 0.09548907 + 7 7 69.88611 141375 0.11239754 0.11084562 + 8 8 63.52919 131541 0.12827072 0.12513399 + 9 9 64.41341 152492 0.14436482 0.14169811 +10 10 66.84620 136200 0.16106678 0.15649254 +# ... with 54 more rows +#+end_example + +**** 8. Target with Inverting Hilbert Load Curve +***** The function +#+name: exp26_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df_cum_inverse = function(df,yval) { + df = rbind(data.frame(x=0,y=0),df[names(df) %in% c("x","y")]) + N = nrow(df); + xval = rep(NA,length(yval)) +# print(N); + for(i in 1:length(yval)) { +# print(yval[i]) + if(yval[i]<=0) {xval[i] = 0;} + else if(yval[i]>=1) {xval[i] = 1;} + else { + idx = df %>% mutate(Order = 1:nrow(.)) %>% filter(y >= yval[i]) %>% slice(1) %>% pull(Order) + idx = idx-1; +# print(idx) +# print(df[idx:(idx+1),]) + eps_x = (df[idx+1,]$x - df[idx,]$x); + eps_y = (df[idx+1,]$y - df[idx,]$y); + xval[i] = ((yval[i]-df[idx,]$y)*eps_x/eps_y + df[idx,]$x); + } + } + return(xval); +} +#+end_src + +#+RESULTS: exp26_hilbert_invertion_function +***** Calculate the Target +#+name: exp26_target +#+header: :var dep0=exp26_cumsum +#+header: :var dep1=exp26_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df.CUMSUM %>% + rename(y = TIMESUM, x = NELEMSUM) -> df.CUMSUM.2; 
+nb_proc = nrow(df.CUMSUM.2)
+df_cum_inverse(df.CUMSUM.2, y = (1:nb_proc)/nb_proc) -> l.TARGET;
+df.CUMSUM.2 %>%
+ mutate(Target.x = l.TARGET,
+ Target.y = (1:nb_proc)/nb_proc) -> df.TARGET;
+df.TARGET;
+#+end_src
+
+#+name: exp26_tidying
+#+header: :var dep0=exp26_target
+#+begin_src R :results output :session :exports both
+df.TARGET %>% select(Rank, x, y) -> df.1;
+df.TARGET %>% select(Rank, Target.x, Target.y) -> df.2;
+df.1 %>%
+ mutate(Category = "Observed") %>%
+ bind_rows(df.2 %>%
+ mutate(Category = "Target") %>%
+ rename(x = Target.x, y = Target.y)) -> df.TARGET.tidy;
+df.TARGET %>% select(x, y, Target.x, Target.y) %>% filter(x == Target.x)
+#+end_src
+
+#+RESULTS:
+: # A tibble: 1 x 4
+: x y Target.x Target.y
+: <dbl> <dbl> <dbl> <dbl>
+: 1 1 1 1 1
+
+***** Plot with the arrows
+
+#+header: :var dep0=exp26_tidying
+#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 800 :height 400 :session
+m1=0
+m2=m1+1
+NP = df.TARGET.tidy %>% pull(Rank) %>% unique %>% length
+df.TARGET.tidy %>%
+ ggplot(aes(x=x, y=y, group=Category)) +
+ theme_bw(base_size=20) +
+ geom_curve (data=df.TARGET %>% filter(x!=1), aes(x=x, y=y, xend=Target.x, yend=Target.y), arrow=arrow(length = unit(0.03, "npc"))) +
+ geom_point(size=1, aes(shape=Category, color=as.factor(Rank%%8))) +
+ scale_color_brewer(palette = "Set1") +
+# geom_line(data=df.TARGET.tidy %>% filter(Category == "Observed")) +
+ coord_cartesian(xlim=c(m1, m2), ylim=c(m1, m2)) #+
+# geom_hline(yintercept=(0:NP)/NP)
+#+end_src
+
+#+RESULTS:
+[[file:/tmp/babel-16900S5M/figure16900T5J.png]]
+
+**** 9. 
Output the new =rank-elements.dat= file + +#+header: :var dep0=exp26_target +#+begin_src R :results output :session :exports both +df.TARGET %>% + select(Rank, Target.x) %>% + mutate(Target.x.aux = c(0, Target.x[-n()])) %>% + mutate(Target = (Target.x - Target.x.aux) * NP) %>% + select(Rank, Target) %>% + write_delim("rank-elements.dat", delim=" ", col_names=FALSE) +#+end_src + +#+RESULTS: + +**** 9.1 Check which rank has the larger correction + +#+header: :var dep0=exp26_target +#+begin_src R :results output :session :exports both +t = 61 +df.TARGET %>% + select(Rank, Target.x) %>% + mutate(Target.x.aux = c(0, Target.x[-n()])) %>% + mutate(Target = (Target.x - Target.x.aux) * NP) %>% + select(Rank, Target) %>% + filter(Rank >= t) +#+end_src + +#+RESULTS: +: # A tibble: 4 x 2 +: Rank Target +: +: 1 61 0.9131876 +: 2 62 1.1703797 +: 3 63 1.0715776 +: 4 64 1.0107468 + +#+begin_src R :results output :session :exports both +df.INTEGRATED %>% + mutate(F = time / NELEM) %>% + mutate(diff = mean(time) - time) %>% + filter(Rank >= t) +#+end_src + +#+RESULTS: +: # A tibble: 4 x 5 +: Rank time NELEM F diff +: +: 1 61 72.95337 141232 0.0005165499 -10.41747 +: 2 62 81.25564 221395 0.0003670166 -18.71973 +: 3 63 29.42375 162676 0.0001808734 33.11215 +: 4 64 66.49157 157024 0.0004234484 -3.95566 + +#+begin_src R :results output :session :exports both +exp.ENRICH %>% pull(Rank) %>% min +exp.ENRICH %>% summary +exp.ENRICH %>% pull(time) %>% sd +exp.ENRICH %>% arrange(-time) +#+end_src + +#+RESULTS: +#+begin_example +[1] 1 + Rank time + Min. : 1.00 Min. : 32.55 + 1st Qu.:16.75 1st Qu.: 59.81 + Median :32.50 Median : 61.14 + Mean :32.50 Mean : 61.80 + 3rd Qu.:48.25 3rd Qu.: 63.00 + Max. :64.00 Max. :103.14 +[1] 12.66165 +# A tibble: 64 x 2 + Rank time + + 1 31 103.14403 + 2 59 97.99925 + 3 46 96.40160 + 4 44 91.13452 + 5 3 81.42977 + 6 20 77.99168 + 7 1 67.54315 + 8 36 66.69546 + 9 56 65.95448 +10 39 65.71467 +# ... 
with 54 more rows +#+end_example +*** Visualization / Analysis +**** Merge the cumsum of exp24 and exp25 and exp26 + +#+header: :var dep0=exp24_cumsum +#+header: :var dep1=exp25_cumsum +#+header: :var dep2=exp26_cumsum +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 1200 :height 800 :session +z0 = 0 +z1 = z0+1 +df.exp24.CUMSUM %>% mutate(Type = "exp24") %>% + bind_rows(df.exp25.CUMSUM %>% mutate(Type = "exp25")) %>% + bind_rows(df.exp26.CUMSUM %>% mutate(Type = "exp26")) %>% +filter(Type != "exp24") %>% +group_by(Type) %>% +mutate(Rank.Worst = (time == max(time)), + Ynonnormalized = cumsum(time)) %>% +ggplot(aes(x=NELEMSUM, y=Ynonnormalized)) + + geom_line() + + geom_point(aes(color=Type, size=time, shape=Rank.Worst)) + + coord_cartesian(xlim=c(z0, z1)) + #, ylim=c(z0, z1)) + + theme_bw(base_size = 22) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-27902p-4/figure279028vF.png]] + + +#+begin_src R :results output graphics :file img/exp25-exp26-small_big_outliers_are_together.png :exports both :width 1200 :height 800 :session +library(ggrepel); +z0 = 0 +z1 = z0+1 +df.exp24.CUMSUM %>% mutate(Type = "exp24") %>% + bind_rows(df.exp25.CUMSUM %>% mutate(Type = "exp25")) %>% + bind_rows(df.exp26.CUMSUM %>% mutate(Type = "exp26")) %>% +filter(Type == "exp26") %>% +group_by(Type) %>% +mutate(Rank.Worst = (time > 0.9*max(time) | time <= 1.2*min(time)), + Ynonnormalized = cumsum(time)) -> t +t %>% filter(Rank.Worst) -> z; +t %>% +ggplot(aes(x=NELEMSUM, y=TIMESUM)) + + geom_line(aes(group=Type), alpha=.6) + + geom_point(aes(color=as.factor(Rank%%8), shape=Type, size=time, alpha=Rank.Worst)) + + coord_cartesian(xlim=c(z0, z1), ylim=c(z0, z1)) + + theme_bw(base_size = 16) + + scale_color_brewer(palette = "Set1") + + geom_hline(data=tibble(VAL=1:NP), aes(yintercept=VAL/NP, color=as.factor(VAL%%8))) + + geom_text_repel(data=z, force=15, size=4, box.padding=1, max_iter=100, aes(label=paste0( "Rank:", Rank, "\nTime:", 
round(time,2)))) +
+ xlim(z0, z1)
+#+end_src
+
+#+RESULTS:
+[[file:img/exp25-exp26-small_big_outliers_are_together.png]]
+
+
+Interpretation:
+- The extremes (smallest and biggest time outliers) end up next to
+  each other.
+
+#+begin_src R :results output graphics :file img/exp25-exp26-cost_of_region_is_surprising.png :exports both :width 1200 :height 800 :session
+library(ggrepel);
+z0 = 0.8
+z1 = z0+0.2
+df.exp24.CUMSUM %>% mutate(Type = "exp24") %>%
+ bind_rows(df.exp25.CUMSUM %>% mutate(Type = "exp25")) %>%
+ bind_rows(df.exp26.CUMSUM %>% mutate(Type = "exp26")) %>%
+filter(Type != "exp24") %>%
+group_by(Type) %>%
+mutate(Rank.Worst = (time > 0.9*max(time) | time <= 1.2*min(time)),
+ Ynonnormalized = cumsum(time)) -> t
+t %>%
+ggplot(aes(x=NELEMSUM, y=TIMESUM)) +
+ geom_line(aes(group=Type), alpha=.6) +
+ geom_point(aes(color=as.factor(Rank%%8), shape=Type, size=time)) +
+ coord_cartesian(xlim=c(z0, z1), ylim=c(z0, z1)) +
+ theme_bw(base_size = 16) +
+ scale_color_brewer(palette = "Set1") +
+ geom_hline(data=tibble(VAL=1:NP), aes(yintercept=VAL/NP, color=as.factor(VAL%%8))) +
+ geom_text_repel(force=30, aes(label=paste0( "Rank:", Rank, "\nTime:", round(time,2)))) +
+ xlim(z0, z1)
+#+end_src
+
+#+RESULTS:
+[[file:img/exp25-exp26-cost_of_region_is_surprising.png]]
+
+Interpretation:
+- Rank 59 of exp26 lasts 98 seconds while Rank 58 of exp25 lasts
+ 42 seconds, yet both occupy almost the same region of elements (look
+ to the left).
+- Rank 57 of exp25 lasts 96 seconds while Rank 58 of exp26 lasts
+ 32 seconds, and both occupy a similar region (large overlap on X from
+ the previous point, to the left following the line).
+- This is surprising; we should run multiple experiments with the same
+ distribution to confirm that those regions are indeed very costly. 
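+
+Once several runs with the same =rank-elements.dat= are available, one
+quick check is to pair the ranks of two runs and look at their time
+ratios: if the ratios stay close to 1, the costly regions are a stable
+property of the element distribution rather than noise. A minimal
+sketch on made-up numbers (=runA= and =runB= are hypothetical
+stand-ins for the per-rank times of two such runs):
+
+#+begin_src R :results output :session :exports both
+## Toy data: per-rank times of two runs with identical distributions
+suppressMessages(library(dplyr))
+runA <- tibble(Rank = 1:4, time = c(60, 98, 42, 55))
+runB <- tibble(Rank = 1:4, time = c(58, 95, 40, 57))
+runA %>%
+  inner_join(runB, by = "Rank", suffix = c(".A", ".B")) %>%
+  mutate(Ratio = time.A / time.B) %>%  # ~1 => cost is reproducible
+  arrange(desc(pmax(Ratio, 1/Ratio)))  # most unstable ranks first
+#+end_src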
+ +#+begin_src R :results output graphics :file img/exp25-exp26-cost_of_region_is_surprising_2.png :exports both :width 1200 :height 800 :session +library(ggrepel); +z0 = 0.65 +z1 = z0+0.2 +df.exp24.CUMSUM %>% mutate(Type = "exp24") %>% + bind_rows(df.exp25.CUMSUM %>% mutate(Type = "exp25")) %>% + bind_rows(df.exp26.CUMSUM %>% mutate(Type = "exp26")) %>% +filter(Type != "exp24") %>% +group_by(Type) %>% +mutate(Rank.Worst = (time > 0.9*max(time) | time <= 1.2*min(time)), + Ynonnormalized = cumsum(time)) -> t +t %>% +ggplot(aes(x=NELEMSUM, y=TIMESUM)) + + geom_line(aes(group=Type), alpha=.6) + + geom_point(aes(color=as.factor(Rank%%8), shape=Type, size=time)) + + coord_cartesian(xlim=c(z0, z1), ylim=c(z0, z1)) + + theme_bw(base_size = 16) + + scale_color_brewer(palette = "Set1") + + geom_hline(data=tibble(VAL=1:NP), aes(yintercept=VAL/NP, color=as.factor(VAL%%8))) + + geom_text_repel(force=30, aes(label=paste0( "Rank:", Rank, "\nTime:", round(time,2)))) + + xlim(z0, z1) +#+end_src + +#+RESULTS: +[[file:img/exp25-exp26-cost_of_region_is_surprising_2.png]] + +Interpretation: +- Rank 48 exp25 (red) takes 74 seconds while the Rank 47 of exp26 + (very small pink) occupies almost the same region and takes only 33 + seconds. The small additional region (non overlapping between them) + is therefore extremely costly. + +** 4-node grisou (64 cores + 1 core) 10its, check stability :EXP27: +*** Goal for next +- Run two executions with exactly the same weight in rank-elements.csv + - Weight will be 1 so all ranks get the same load +*** Initial rank-elements.dat +#+begin_src shell :results output +for i in $(seq 1 64); do + echo $i 1 +done > rank-elements.dat +#+end_src + +#+RESULTS: +*** Verify contents of the four executions + +We should use all but the RANK.HETER whose rank-elements.dat was not 1. 
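+
+To enumerate the usable runs, the heterogeneous one can simply be
+filtered out by name; a small sketch, assuming the directory layout
+shown by =du= below:
+
+#+begin_src shell :results output
+# Keep only the homogeneous runs; RANK.HETER_* used a different
+# rank-elements.dat and must be excluded.
+ls -d exp_27_grisou_4_manual/exp_27/*scorep-* | grep -v 'RANK\.HETER'
+#+end_src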
+ +#+begin_src shell :results output +du -h --max-depth=1 exp_27_grisou_4_manual/exp_27 +#+end_src + +#+RESULTS: +: 395M exp_27_grisou_4_manual/exp_27/scorep-20180126_1257_27316133992896 +: 395M exp_27_grisou_4_manual/exp_27/RANK.HETER_scorep-20180126_1152_17923889031732 +: 395M exp_27_grisou_4_manual/exp_27/scorep-20180126_1506_45830289772336 +: 395M exp_27_grisou_4_manual/exp_27/scorep-20180126_1217_21570417239832 +: 395M exp_27_grisou_4_manual/exp_27/scorep-20180126_1440_42090372406932 +: 2.0G exp_27_grisou_4_manual/exp_27 +*** Data Transformation +**** 1. Read Iteration Timings + +#+name: exp27_iteration_timings +#+begin_src R :results output :session :exports both +suppressMessages(library(tidyverse)); +do.call("bind_rows", lapply(list.files(DIR, pattern="iterations.*csv$", full.names=TRUE), + function(file) { + read_table(file, col_names=FALSE) %>% + rename(Time.User = X1, + Time.System = X2, + Time.Run = X3, + Rank = X4, + Iteration = X5) %>% + filter(Iteration != 0) %>% + group_by(Rank, Iteration) %>% + summarize(Start = min(Time.User), + End = max(Time.User)) %>% + group_by(Rank) %>% + mutate(End = End - min(Start), + Start = Start - min(Start)) + })) -> exp.iter; +#+end_src + +**** 2. Transform profile.cubex + +#+name: exp27_current_dir +#+begin_src R :results verbatim :session :exports both +print(DIR) +#+end_src + +#+RESULTS: exp27_current_dir +: exp_27_grisou_4_manual/exp_27/scorep-20180126_1257_27316133992896/ + +#+name: exp27_cubex_to_open +#+header: :var DIR=exp27_current_dir +#+begin_src shell :results output +cd $DIR +~/install/cube-4.3.5/bin/cube_dump -c all -m all -s csv2 profile.cubex > profile.csv +~/install/cube-4.3.5/bin/cube_dump -w profile.cubex | tail -n+33 | head -n279 | tr '(' ';' | sed -e "s/[[:space:]]*//g" -e "s/,.*$//" -e "s/id=//" -e "s/:[^;]*;/;/" > regions-codes.csv +~/install/cube-4.3.5/bin/cube_dump -w profile.cubex > cube_info.txt +#+end_src + +#+RESULTS: exp27_cubex_to_open + +**** 3. 
Parse the call tree
+
+#+name: exp27_cube_calltree
+#+header: :var dep0=exp27_cubex_to_open
+#+header: :var DIR=exp27_current_dir
+#+begin_src perl :results output :exports both
+use strict;
+my($filename) = $DIR . "/cube_info.txt";
+my($line);
+open(INPUT,$filename);
+my($in_CALLTREE) = 0;
+
+my($VAR_iteration) = -1;
+my($VAR_type) = -1;
+my($VAR_id) = -1;
+
+my($filename_out) = $filename;
+$filename_out =~ s/txt$/csv/;
+
+open(OUTPUT,"> ".$filename_out);
+
+while(defined($line=<INPUT>)) {
+    chomp $line;
+    if($line =~ "CALL TREE") { $in_CALLTREE = 1; }
+    if(!$in_CALLTREE) { next; }
+    if($line =~ "SYSTEM DIMENSION") { $in_CALLTREE = 0; }
+
+    if($line =~ /^iter(\d*)\s*.*id=(\d*),.*/) {
+        $VAR_iteration = $1;
+        $VAR_type = "Computation";
+        $VAR_id = $2;
+        # print "|$VAR_iteration | $VAR_type | $VAR_id |\n";
+        print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n";
+    }
+    if($line =~ /^\s*\|\-(\S*)\s.*id=(\d*),.*/) {
+#        print $line."\n";
+        $VAR_type = $1;
+        $VAR_id = $2;
+#        print "|$VAR_iteration | $VAR_type | $VAR_id |\n";
+        print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n";
+    }
+}
+close(OUTPUT);
+print $filename_out;
+#+end_src
+
+#+RESULTS: exp27_cube_calltree
+: exp_20_grisou_4_manual/scorep-20180125_1159_34483171676480//cube_info.csv
+
+**** 4. 
Enrich the call tree + +#+name: exp27_enrich +#+header: :var CSV=exp27_cube_calltree +#+header: :var DIR=exp27_current_dir +#+begin_src R :results output :session :exports both +suppressMessages(library(tidyverse)); +PROFILE = paste0(DIR, "/profile.csv"); +REGION = paste0(DIR, "/regions-codes.csv"); +df.PROF <- read_csv(PROFILE); +exp.REGION <- read_delim(REGION, col_names=FALSE, delim=";") %>% + rename(ID = X2, Name = X1); +exp.ENRICH <- read_delim(CSV, delim=",", col_names=FALSE) %>% + rename(Phase = X1, Code = X2, ID = X3) %>% + left_join(df.PROF, by=c("ID" = "Cnode ID")) %>% + rename(Rank = `Thread ID`) %>% + filter(Code == "Computation") %>% + mutate(Phase = as.integer(Phase)) %>% + filter(Rank != 0) +#+end_src + +#+RESULTS: exp27_enrich +#+begin_example +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +#+end_example + +**** X. 
Verify the load imbalance (and eventually summarize it if necessary) +***** Rough initial LB Analysis + +#+header: :var dep0=exp27_enrich +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session +exp.ENRICH %>% + group_by(Phase, Rank, Code) %>% + summarize(Sum = sum(time)) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + ggplot(aes(x=as.integer(Phase), y=Sum, color=Rank)) + + geom_point() + geom_line(aes(group=Rank), alpha=.2) + + ylim(0,NA) + + theme_bw(base_size=18) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-27902p-4/figure27902m1n.png]] + +***** Calculate the average + +#+name: exp27_lb_average +#+header: :var dep0=exp27_enrich +#+begin_src R :results output :session :exports both +exp.ENRICH %>% + filter(Phase <= 9) %>% + group_by(Rank) %>% + summarize(time = mean(time)) -> exp.ENRICH +exp.ENRICH +#+end_src + +#+RESULTS: exp27_lb_average +#+begin_example +# A tibble: 64 x 2 + Rank time + + 1 1 42.92296 + 2 2 64.98924 + 3 3 66.13120 + 4 4 62.14705 + 5 5 64.66830 + 6 6 59.88956 + 7 7 61.76306 + 8 8 67.09068 + 9 9 57.22863 +10 10 64.70995 +# ... with 54 more rows +#+end_example + +**** 5. 
The number of elements on T1 + +#+name: exp27_number_of_elements_shell +#+header: :var DIR=exp27_current_dir +#+begin_src shell :results table :colnames yes +cd $DIR +echo "Type Rank NELEM NPOIN NBOUN NPOI32" +cat domain-*.csv | grep ^.*T1 +#+end_src + +#+name: exp27_number_of_elements +#+header: :var TABLE=exp27_number_of_elements_shell +#+begin_src R :results output :session :exports both :colnames yes +TABLE %>% + as_tibble() %>% + filter(Rank != 0) %>% + select(-Type) -> exp.ELEMENTS; +exp.ELEMENTS; +#+end_src + +#+RESULTS: exp27_number_of_elements +#+begin_example +# A tibble: 64 x 5 + Rank NELEM NPOIN NBOUN NPOI32 + + 1 1 143862 27303 2832 2362 + 2 2 143921 48988 11415 4174 + 3 3 143822 47644 9906 4947 + 4 4 143695 43667 8197 3921 + 5 5 143675 46242 9442 4191 + 6 6 143752 43209 7433 4876 + 7 7 143768 48927 11007 4580 + 8 8 143792 47241 10828 2428 + 9 9 143945 39022 6023 3238 +10 10 144243 47255 9907 4088 +# ... with 54 more rows +#+end_example + +**** 6. Integrate number of elements with compute cost + +#+name: exp27_integrate +#+header: :var dep1=exp27_lb_average +#+header: :var dep0=exp27_number_of_elements +#+begin_src R :results output :session :exports both +exp.ENRICH %>% + left_join(exp.ELEMENTS) %>% + select(Rank, time, NELEM) -> df.INTEGRATED; +df.INTEGRATED +#+end_src + +#+RESULTS: exp27_integrate +#+begin_example +Joining, by = "Rank" +# A tibble: 64 x 3 + Rank time NELEM + + 1 1 42.92296 143862 + 2 2 64.98924 143921 + 3 3 66.13120 143822 + 4 4 62.14705 143695 + 5 5 64.66830 143675 + 6 6 59.88956 143752 + 7 7 61.76306 143768 + 8 8 67.09068 143792 + 9 9 57.22863 143945 +10 10 64.70995 144243 +# ... with 54 more rows +#+end_example + +**** 7. 
Do the cumsum + +#+name: exp27_cumsum +#+header: :var dep0=exp27_integrate +#+begin_src R :results output :session :exports both +df.INTEGRATED %>% + arrange(Rank) %>% + mutate(TIMESUM = cumsum(time)/sum(time)) %>% + mutate(NELEMSUM = cumsum(NELEM)/sum(NELEM)) -> df.CUMSUM; +df.CUMSUM -> df.exp27.CUMSUM; +df.CUMSUM +#+end_src + +#+RESULTS: exp27_cumsum +#+begin_example +# A tibble: 64 x 5 + Rank time NELEM TIMESUM NELEMSUM + + 1 1 60.43743 184387 0.01510068 0.02002865 + 2 2 68.98525 135390 0.03233709 0.03473510 + 3 3 62.07687 134635 0.04784740 0.04935954 + 4 4 61.41787 140751 0.06319305 0.06464832 + 5 5 66.77389 137213 0.07987694 0.07955279 + 6 6 60.27102 146712 0.09493604 0.09548907 + 7 7 69.88611 141375 0.11239754 0.11084562 + 8 8 63.52919 131541 0.12827072 0.12513399 + 9 9 64.41341 152492 0.14436482 0.14169811 +10 10 66.84620 136200 0.16106678 0.15649254 +# ... with 54 more rows +#+end_example + +**** 8. Target with Inverting Hilbert Load Curve +***** The function +#+name: exp27_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df_cum_inverse = function(df,yval) { + df = rbind(data.frame(x=0,y=0),df[names(df) %in% c("x","y")]) + N = nrow(df); + xval = rep(NA,length(yval)) +# print(N); + for(i in 1:length(yval)) { +# print(yval[i]) + if(yval[i]<=0) {xval[i] = 0;} + else if(yval[i]>=1) {xval[i] = 1;} + else { + idx = df %>% mutate(Order = 1:nrow(.)) %>% filter(y >= yval[i]) %>% slice(1) %>% pull(Order) + idx = idx-1; +# print(idx) +# print(df[idx:(idx+1),]) + eps_x = (df[idx+1,]$x - df[idx,]$x); + eps_y = (df[idx+1,]$y - df[idx,]$y); + xval[i] = ((yval[i]-df[idx,]$y)*eps_x/eps_y + df[idx,]$x); + } + } + return(xval); +} +#+end_src + +#+RESULTS: exp27_hilbert_invertion_function +***** Calculate the Target +#+name: exp27_target +#+header: :var dep0=exp27_cumsum +#+header: :var dep1=exp27_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df.CUMSUM %>% + rename(y = TIMESUM, x = NELEMSUM) -> df.CUMSUM.2; 
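+## Note (added): df_cum_inverse() assumes the cumulative curve (x, y) is
+## sorted and non-decreasing; this holds here because TIMESUM and
+## NELEMSUM are cumulative sums of positive quantities, but checking it
+## explicitly makes the inversion fail early instead of silently.
+stopifnot(!is.unsorted(df.CUMSUM.2$x), !is.unsorted(df.CUMSUM.2$y))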
+
+nb_proc = nrow(df.CUMSUM.2)
+df_cum_inverse(df.CUMSUM.2, y = (1:nb_proc)/nb_proc) -> l.TARGET;
+df.CUMSUM.2 %>%
+  mutate(Target.x = l.TARGET,
+         Target.y = (1:nb_proc)/nb_proc) -> df.TARGET;
+df.TARGET;
+#+end_src
+
+#+name: exp27_tidying
+#+header: :var dep0=exp27_target
+#+begin_src R :results output :session :exports both
+df.TARGET %>% select(Rank, x, y) -> df.1;
+df.TARGET %>% select(Rank, Target.x, Target.y) -> df.2;
+df.1 %>%
+  mutate(Category = "Observed") %>%
+  bind_rows(df.2 %>%
+            mutate(Category = "Target") %>%
+            rename(x = Target.x, y = Target.y)) -> df.TARGET.tidy;
+df.TARGET %>% select(x, y, Target.x, Target.y) %>% filter(x == Target.x)
+#+end_src
+
+#+RESULTS:
+: # A tibble: 1 x 4
+:       x     y Target.x Target.y
+:   <dbl> <dbl>    <dbl>    <dbl>
+: 1     1     1        1        1
+
+***** Plot with the arrows
+
+#+header: :var dep0=exp27_tidying
+#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 800 :height 400 :session
+m1=0
+m2=m1+1
+NP = df.TARGET.tidy %>% pull(Rank) %>% unique %>% length
+df.TARGET.tidy %>%
+  ggplot(aes(x=x, y=y, group=Category)) +
+  theme_bw(base_size=20) +
+  geom_curve (data=df.TARGET %>% filter(x!=1), aes(x=x, y=y, xend=Target.x, yend=Target.y), arrow=arrow(length = unit(0.03, "npc"))) +
+  geom_point(size=1, aes(shape=Category, color=as.factor(Rank%%8))) +
+  scale_color_brewer(palette = "Set1") +
+#  geom_line(data=df.TARGET.tidy %>% filter(Category == "Observed")) +
+  coord_cartesian(xlim=c(m1, m2), ylim=c(m1, m2)) #+
+#  geom_hline(yintercept=(0:NP)/NP)
+#+end_src
+
+#+RESULTS:
+[[file:/tmp/babel-16900S5M/figure16900T5J.png]]
+
+**** 9. 
Output the new =rank-elements.dat= file + +#+header: :var dep0=exp27_target +#+begin_src R :results output :session :exports both +df.TARGET %>% + select(Rank, Target.x) %>% + mutate(Target.x.aux = c(0, Target.x[-n()])) %>% + mutate(Target = (Target.x - Target.x.aux) * NP) %>% + select(Rank, Target) %>% + write_delim("rank-elements.dat", delim=" ", col_names=FALSE) +#+end_src + +#+RESULTS: + +**** 9.1 Check which rank has the larger correction + +#+header: :var dep0=exp27_target +#+begin_src R :results output :session :exports both +t = 61 +df.TARGET %>% + select(Rank, Target.x) %>% + mutate(Target.x.aux = c(0, Target.x[-n()])) %>% + mutate(Target = (Target.x - Target.x.aux) * NP) %>% + select(Rank, Target) %>% + filter(Rank >= t) +#+end_src + +#+RESULTS: +: # A tibble: 4 x 2 +: Rank Target +: +: 1 61 0.9131876 +: 2 62 1.1703797 +: 3 63 1.0715776 +: 4 64 1.0107468 + +#+begin_src R :results output :session :exports both +df.INTEGRATED %>% + mutate(F = time / NELEM) %>% + mutate(diff = mean(time) - time) %>% + filter(Rank >= t) +#+end_src + +#+RESULTS: +: # A tibble: 4 x 5 +: Rank time NELEM F diff +: +: 1 61 72.95337 141232 0.0005165499 -10.41747 +: 2 62 81.25564 221395 0.0003670166 -18.71973 +: 3 63 29.42375 162676 0.0001808734 33.11215 +: 4 64 66.49157 157024 0.0004234484 -3.95566 + +#+begin_src R :results output :session :exports both +exp.ENRICH %>% pull(Rank) %>% min +exp.ENRICH %>% summary +exp.ENRICH %>% pull(time) %>% sd +exp.ENRICH %>% arrange(-time) +#+end_src + +#+RESULTS: +#+begin_example +[1] 1 + Rank time + Min. : 1.00 Min. : 32.55 + 1st Qu.:16.75 1st Qu.: 59.81 + Median :32.50 Median : 61.14 + Mean :32.50 Mean : 61.80 + 3rd Qu.:48.25 3rd Qu.: 63.00 + Max. :64.00 Max. :103.14 +[1] 12.66165 +# A tibble: 64 x 2 + Rank time + + 1 31 103.14403 + 2 59 97.99925 + 3 46 96.40160 + 4 44 91.13452 + 5 3 81.42977 + 6 20 77.99168 + 7 1 67.54315 + 8 36 66.69546 + 9 56 65.95448 +10 39 65.71467 +# ... 
with 54 more rows
+#+end_example
+*** Code to get the four runs together in the same data frame
+
+Run this once:
+
+#+begin_src R :results output :session :exports both
+exp27 = list()
+#+end_src
+
+#+RESULTS:
+
+Each time, pick one of the four run directories from the following list:
+
+#+begin_src R :results output :session :exports both
+DIR=list.files("exp_27_grisou_4_manual/exp_27", pattern="^scorep", full.names=TRUE)[[4]]
+DIR
+#+end_src
+
+#+RESULTS:
+: [1] "exp_27_grisou_4_manual/exp_27/scorep-20180126_1506_45830289772336"
+
+Each time you define DIR as above, run the code below:
+
+#+header: :var dep0=exp27_enrich
+#+header: :var dep1=exp27_number_of_elements
+#+begin_src R :results output :session :exports both
+DIR
+position = length(exp27) + 1
+exp27[position] <- list(exp.ENRICH %>% left_join(exp.ELEMENTS) %>% mutate(Case = position));
+length(exp27);
+#+end_src
+
+#+RESULTS:
+: [1] "exp_27_grisou_4_manual/exp_27/scorep-20180126_1506_45830289772336"
+: Joining, by = "Rank"
+: [1] 4
+
+#+begin_src R :results output :session :exports both
+exp.ALL <- do.call("bind_rows", exp27) %>% mutate(Case = as.factor(Case))
+exp.ALL
+#+end_src
+
+#+RESULTS:
+#+begin_example
+# A tibble: 2,560 x 21
+   Phase        Code    ID  Rank visits     time min_time max_time bytes_put
+   <int>       <chr> <int> <int>  <int>    <dbl>    <dbl>    <dbl>     <int>
+ 1     1 Computation    14     1      1 26.89853 138.7910 138.7910         0
+ 2     1 Computation    14     2      1 41.27433 138.7933 138.7933         0
+ 3     1 Computation    14     3      1 39.85998 138.7941 138.7941         0
+ 4     1 Computation    14     4      1 38.27706 138.7927 138.7927         0
+ 5     1 Computation    14     5      1 38.98377 138.7940 138.7940         0
+ 6     1 Computation    14     6      1 37.20629 138.7926 138.7926         0
+ 7     1 Computation    14     7      1 40.47147 138.7942 138.7942         0
+ 8     1 Computation    14     8      1 40.49991 138.7933 138.7933         0
+ 9     1 Computation    14     9      1 35.34918 138.7931 138.7931         0
+10     1 Computation    14    10      1 40.26140 138.7932 138.7932         0
+# ... 
with 2,550 more rows, and 12 more variables: bytes_get , +# ALLOCATION_SIZE , DEALLOCATION_SIZE , bytes_leaked , +# maximum_heap_memory_allocated , bytes_sent , +# bytes_received , NELEM , NPOIN , NBOUN , NPOI32 , +# Case +#+end_example + +*** People are anxious, so let's see the result (Rough initial LB Analysis \times 4) + +#+begin_src R :results output graphics :file img/exp27_stability_check.png :exports both :width 1000 :height 800 :session +exp.ALL %>% + group_by(Phase, Rank, Code, Case) %>% + summarize(Sum = sum(time)) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + ggplot(aes(x=as.integer(Phase), y=Sum, color=Rank)) + + geom_point() + geom_line(aes(group=Rank), alpha=.2) + + ylim(0,NA) + + theme_bw(base_size=18) + + facet_wrap(~Case) +#+end_src + +#+RESULTS: +[[file:img/exp27_stability_check.png]] + +#+begin_src R :results output graphics :file img/exp27_stability_check_v2_phase1.png :exports both :width 1000 :height 800 :session +exp.ALL %>% + filter(Phase == 1) %>% + group_by(Phase, Rank, Code) %>% + summarize(Sum = mean(time), SE=3*sd(time)/sqrt(n())) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + ggplot(aes(x=Rank, y=Sum)) + + geom_point(alpha=.5) + + geom_line(aes(group=Rank), alpha=.2) + + geom_errorbar(aes(ymin=Sum-SE, ymax=Sum+SE)) + + ylab("Mean") + + theme_bw(base_size=18) + + facet_wrap(~Phase) +#+end_src + +#+RESULTS: +[[file:img/exp27_stability_check_v2_phase1.png]] + +#+begin_src R :results output graphics :file img/exp27_stability_check_v2.png :exports both :width 1000 :height 800 :session +exp.ALL %>% + group_by(Phase, Rank, Code) %>% + summarize(Sum = mean(time), SE=3*sd(time)/sqrt(n())) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + ggplot(aes(x=Rank, y=Sum)) + + geom_point(alpha=.5) + + geom_line(aes(group=Rank), alpha=.2) + + geom_errorbar(aes(ymin=Sum-SE, ymax=Sum+SE), width=.5) + + ylab("Mean") + + theme_bw(base_size=18) + + facet_wrap(~Phase) +#+end_src + +#+RESULTS: +[[file:img/exp27_stability_check_v2.png]] + +*** Textual 
check just to be sure + +#+begin_src R :results output :session :exports both +exp.ALL %>% + filter(Rank == 27) %>% + filter(Phase == 2) %>% + pull(time) %>% + unique +#+end_src + +#+RESULTS: +: [1] 35.57875 35.51689 36.33292 36.20780 + +*** Save the data :ATTACH: +:PROPERTIES: +:Attachments: exp27_stability_check.csv.gz +:ID: 39cb473a-83fa-4019-a2fe-9aea7e4cb373 +:END: + + +#+begin_src R :results output :session :exports both +write_csv(exp.ALL, "exp27_stability_check.csv") +#+end_src + +#+RESULTS: +** 15-node chetemi (300 cores) :EXP28: +*** Goals +1. Finer partitioner mesh + #+BEGIN_EXAMPLE + Sources/kernel/parall/mod_par_partit_sfc.f90 + #+END_EXAMPLE + Change =DIM_BIN_CORE= from /128_ip/ to /256_ip/ +2. Check computing stability + - Run 5 times with /rank-elements.dat/ set to 1 +3. Try to automatize the refinement +*** Check Alya modifications of the =sfc= branch +#+begin_src shell :results output :dir /ssh:lille.g5k:./alya-bsc-sfc/ +svn diff +#+end_src + +#+RESULTS: +#+begin_example +Index: Sources/kernel/coupli/mod_commdom_alya.f90 +=================================================================== +--- Sources/kernel/coupli/mod_commdom_alya.f90 (revision 8445) ++++ Sources/kernel/coupli/mod_commdom_alya.f90 (working copy) +@@ -485,8 +485,8 @@ + ! | |_ moduls(ITASK_TIMSTE) + ! | |_ setgts(2_ip) <-- minimum of critical time steps. NUNCA PASA POR MODULS!! + ! |_Begste <-- aqui ya se tiene el minimo general de todos los modulos!! +- CPLNG%sendrecv(1,6) = (current_task==ITASK_BEGSTE).and.( current_when==ITASK_BEFORE).and.& ! \ +- ( iblok==1 ).and.(CPLNG%send_modul_i==modul ) ! |__ 2014Dic10 ++ !CPLNG%sendrecv(1,6) = (current_task==ITASK_BEGSTE).and.( current_when==ITASK_BEFORE).and.& ! \ ++ ! ( iblok==1 ).and.(CPLNG%send_modul_i==modul ) ! |__ 2014Dic10 + CPLNG%sendrecv(2,6) = (current_task==ITASK_BEGSTE).and.( current_when==ITASK_BEFORE).and.& ! | + ( iblok==1 ).and.(CPLNG%send_modul_j==modul) ! 
/ + +Index: Sources/kernel/master/Alya.f90 +=================================================================== +--- Sources/kernel/master/Alya.f90 (revision 8445) ++++ Sources/kernel/master/Alya.f90 (working copy) +@@ -1,3 +1,4 @@ ++#include "scorep/SCOREP_User.inc" + !> @file Alya.f90 + !! @author Guillaume Houzeaux + !! @brief Ayla main +@@ -20,7 +21,13 @@ + use def_master, only : kfl_goblk + use def_master, only : kfl_gocou + use def_coupli, only : kfl_gozon ++ use mod_parall, only : PAR_MY_WORLD_RANK + implicit none ++ INTEGER :: iter,ierror ++ character*100 striter ++ character*100 iterfile ++ real :: tnow ++ real, dimension(2) :: tarray + ! + ! DLB should be disabled as we only wabnt to activate it for particular loops + ! Master does not disble to lend its resources automatically +@@ -39,6 +46,10 @@ + + call Parall(22270_ip) + ++ write(iterfile,'(a,i4.4,a)') 'iterations-', PAR_MY_WORLD_RANK, '.csv' ++ open(unit=2, file=iterfile) ++ ++ iter = 1 + optimization: do while ( kfl_goopt == 1 ) + + call Iniunk() +@@ -49,6 +60,11 @@ + time: do while ( kfl_gotim == 1 ) + + call Timste() ++ call ETIME(tarray, tnow) ++ write(2,*) tnow, tarray(1), tarray(2), PAR_MY_WORLD_RANK, iter ++ ++ write(striter, '(a,i3.3)') 'iter',iter ++ SCOREP_USER_REGION_BY_NAME_BEGIN(striter, SCOREP_USER_REGION_TYPE_PHASE) + + reset: do + call Begste() +@@ -77,6 +93,13 @@ + + call Endste() + ++ SCOREP_USER_REGION_BY_NAME_END(striter) ++ call ETIME(tarray, tnow) ++ write(2,*) tnow, tarray(1), tarray(2), PAR_MY_WORLD_RANK, iter ++ iter = iter + 1 ++ ++ ++ + call Filter(ITASK_ENDTIM) + call Output(ITASK_ENDTIM) + +@@ -91,6 +114,9 @@ + + end do optimization + ++ close(2) ++ + call Turnof() + ++ + end program Alya +Index: Sources/kernel/parall/mod_par_partit_sfc.f90 +=================================================================== +--- Sources/kernel/parall/mod_par_partit_sfc.f90 (revision 8445) ++++ Sources/kernel/parall/mod_par_partit_sfc.f90 (working copy) +@@ -36,7 +36,7 @@ + + implicit 
none + +- integer(ip),PARAMETER :: DIM_BIN_CORE = 128_ip ! Number of boxes per direction per partitioning process ++ integer(ip),PARAMETER :: DIM_BIN_CORE = 256_ip ! Number of boxes per direction per partitioning process + integer(ip),PARAMETER :: VISUALIZATION = 0_ip ! Do you want to visualize: 1=weight, 2=partition + integer(ip),PARAMETER :: CRITERIA = 1_ip ! Partition criteria: 0=nodes, 1=elements, 2=weigh.elem.(gauss points) + ! 3=weigh. elem. (entries) +Index: Sources/services/parall/par_prepro.f90 +=================================================================== +--- Sources/services/parall/par_prepro.f90 (revision 8445) ++++ Sources/services/parall/par_prepro.f90 (working copy) +@@ -19,6 +19,7 @@ + use mod_memory + use mod_par_partit_sfc, only : par_partit_sfc + use mod_parall, only : PAR_WORLD_SIZE ++ use mod_parall, only : PAR_MY_WORLD_RANK + use mod_parall, only : PAR_METIS4 + use mod_parall, only : PAR_SFC + use mod_parall, only : PAR_ORIENTED_BIN +@@ -32,6 +33,9 @@ + integer(ip) :: npoin_tmp, nelem_tmp, nboun_tmp + real(rp) :: time1,time2,time3,time4,time5 + character(100) :: messa_integ ++ character*100 dfile ++ integer(ip) ii ++ + ! + ! Output + ! +@@ -354,4 +358,20 @@ + end if + + ++ ! LUCAS ++ write(dfile,'(a,i4.4,a)') 'domain-', PAR_MY_WORLD_RANK, '.csv' ++ open(unit=12345, file=dfile) ++ ++ write(12345,*) "T1", PAR_MY_WORLD_RANK, nelem, npoin, nboun, npoi3-npoi2 ++ do ii = 1, nelem ++ write(12345,*) "T2", PAR_MY_WORLD_RANK, ii, ltype(ii) ++ end do ++ do ii = 1, nboun ++ write(12345,*) "T3", PAR_MY_WORLD_RANK, ii, ltypb(ii) ++ end do ++ ++ close(12345) ++ ! 
END LUCAS ++ ++ + end subroutine par_prepro +Index: Utils/user/alya2pos/compile.sh +=================================================================== +--- Utils/user/alya2pos/compile.sh (revision 8445) ++++ Utils/user/alya2pos/compile.sh (working copy) +@@ -1,27 +1,27 @@ +-ifort -c -traceback -O3 -fpp def_kintyp.f90 -o def_kintyp.o +-ifort -c -traceback -O3 def_elmtyp.f90 -o def_elmtyp.o +-ifort -c -traceback -O3 def_inpout.f90 -o def_inpout.o +-ifort -c -traceback -O3 gidele.f90 -o gidele.o +-ifort -c -traceback -O3 gidres_he.f90 -o gidres_he.o +-ifort -c -traceback -O3 gidres_gp.f90 -o gidres_gp.o +-ifort -c -traceback -O3 elmtyp.f90 -o elmtyp.o +-ifort -c -traceback -O3 ecoute.f90 -o ecoute.o +-ifort -c -traceback -O3 runend.f90 -o runend.o +-ifort -c -traceback -O3 connpo.f90 -o connpo.o +-ifort -c -traceback -O3 vu_msh.f90 -o vu_msh.o +-ifort -c -traceback -O3 vu_res.f90 -o vu_res.o +-ifort -c -traceback -O3 vu_filter.f90 -o vu_filter.o +-ifort -c -traceback -O3 ensmsh.f90 -o ensmsh.o +-ifort -c -traceback -O3 ensmsh_bin.f90 -o ensmsh_bin.o +-ifort -c -traceback -O3 ensres_bin.f90 -o ensres_bin.o +-ifort -c -traceback -O3 ensres_filter.f90 -o ensres_filter.o +-ifort -c -traceback -O3 ensmsh_filter.f90 -o ensmsh_filter.o +-ifort -c -traceback -O3 ensres.f90 -o ensres.o +-ifort -c -traceback -O3 txtres.f90 -o txtres.o +-ifort -c -traceback -O3 alya2pos.f90 -o alya2pos.o +-ifort -c -traceback -O3 wristl.f90 -o wristl.o +-ifort -c -traceback -O3 ensmsh_filter.f90 -o ensmsh_filter.o +-ifort -c -traceback -O3 reahed.f90 -o reahed.o +-ifort -c -traceback -O3 zfemres.f90 -o zfemres.o +-ifort -traceback -O3 -o alya2pos.x *.o ++gfortran -c -O3 -cpp def_kintyp.f90 -o def_kintyp.o ++gfortran -c -O3 def_elmtyp.f90 -o def_elmtyp.o ++gfortran -c -O3 def_inpout.f90 -o def_inpout.o ++gfortran -c -O3 gidele.f90 -o gidele.o ++gfortran -c -O3 gidres_he.f90 -o gidres_he.o ++gfortran -c -O3 gidres_gp.f90 -o gidres_gp.o ++gfortran -c -O3 elmtyp.f90 -o elmtyp.o ++gfortran -c -O3 
ecoute.f90 -o ecoute.o
++gfortran -c -O3 runend.f90 -o runend.o
++gfortran -c -O3 connpo.f90 -o connpo.o
++gfortran -c -O3 vu_msh.f90 -o vu_msh.o
++gfortran -c -O3 vu_res.f90 -o vu_res.o
++gfortran -c -O3 vu_filter.f90 -o vu_filter.o
++gfortran -c -O3 ensmsh.f90 -o ensmsh.o
++gfortran -c -O3 ensmsh_bin.f90 -o ensmsh_bin.o
++gfortran -c -O3 ensres_bin.f90 -o ensres_bin.o
++gfortran -c -O3 ensres_filter.f90 -o ensres_filter.o
++gfortran -c -O3 ensmsh_filter.f90 -o ensmsh_filter.o
++gfortran -c -O3 ensres.f90 -o ensres.o
++gfortran -c -O3 txtres.f90 -o txtres.o
++gfortran -c -O3 alya2pos.f90 -o alya2pos.o
++gfortran -c -O3 wristl.f90 -o wristl.o
++gfortran -c -O3 ensmsh_filter.f90 -o ensmsh_filter.o
++gfortran -c -O3 reahed.f90 -o reahed.o
++gfortran -c -O3 zfemres.f90 -o zfemres.o
++gfortran -O3 -o alya2pos.x *.o
+ rm -rf *.o rm *_genmod.f90 *.mod
+#+end_example
+
+*** Stability check
+**** Initial rank-elements.dat
+
+#+begin_src shell :results output
+for i in $(seq 1 64); do
+    echo $i 1
+done > rank-elements.dat
+scp rank-elements.dat lille.g5k:.
+#+end_src
+
+#+RESULTS:
+
+**** Execution script
+
+I have manually run the script that controls the experiment:
+- Disable HT
+- Disable Turboboost
+- Log the state of the machines in per-machine ORG files
+
+#+begin_src shell :results output
+source ~/spack-ALYA/share/spack/setup-env.sh
+export PATH=$(spack location -i openmpi)/bin:$PATH
+
+pushd $HOME/WORK-RICARD/resp_sfc/
+export EXPEDIR=$HOME/exp_28_chetemi_15_manual/
+mkdir -p $EXPEDIR
+for RUN in $(seq 1 5); do
+    RUNKEY="RUN${RUN}"
+    SCOREPDIR="scorep-${RUNKEY}"
+    echo $RUNKEY
+
+    # copy machine-file
+    cp $HOME/machine-file .
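+    # Note (added): the Score-P settings below enable profiling only
+    # (SCOREP_ENABLE_TRACING=FALSE, SCOREP_ENABLE_PROFILING=TRUE), so each
+    # run produces a compact profile.cubex summary suitable for cube_dump
+    # instead of a full trace.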
+ + # Run the program + $(which mpirun) \ + --mca btl_base_warn_component_unused 0 \ + --bind-to core:overload-allowed \ + --report-bindings \ + -x SCOREP_TOTAL_MEMORY=3900MB \ + -x SCOREP_MPI_ENABLE_GROUPS=ALL \ + -x SCOREP_ENABLE_TRACING=FALSE \ + -x SCOREP_ENABLE_PROFILING=TRUE \ + -x SCOREP_EXPERIMENT_DIRECTORY=$SCOREPDIR \ + -x LD_LIBRARY_PATH=$(spack location -i openmpi)/lib/ \ + -np 65 \ + -machinefile ./machine-file \ + $HOME/alya-bsc-sfc/Executables/unix/Alya.x fensap + + # Save the data into the SCOREPDIR + cp machine-file $SCOREPDIR + cp *.log fensap.dat $SCOREPDIR + mv domain-*.csv iterations-*.csv $SCOREPDIR + cp rank-elements.dat $SCOREPDIR + + # Move to EXPEDIR + mv $SCOREPDIR $EXPEDIR +done > execution.log +mv execution.log $EXPEDIR +popd +#+end_src + +#+RESULTS: +: RUN1 +: RUN2 +: RUN3 +: RUN4 +: RUN5 +**** Data Transformation +***** 1. Read Iteration Timings + +#+name: exp28_iteration_timings +#+begin_src R :results output :session :exports both +suppressMessages(library(tidyverse)); +do.call("bind_rows", lapply(list.files(DIR, pattern="iterations.*csv$", full.names=TRUE), + function(file) { + read_table(file, col_names=FALSE) %>% + rename(Time.User = X1, + Time.System = X2, + Time.Run = X3, + Rank = X4, + Iteration = X5) %>% + filter(Iteration != 0) %>% + group_by(Rank, Iteration) %>% + summarize(Start = min(Time.User), + End = max(Time.User)) %>% + group_by(Rank) %>% + mutate(End = End - min(Start), + Start = Start - min(Start)) + })) -> exp.iter; +#+end_src + +#+RESULTS: exp28_iteration_timings +#+begin_example +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = 
col_integer()
+)
+... (the same "Parsed with column specification" message is repeated
+for each of the remaining iterations-*.csv files) ...
+Parsed with column specification:
+cols(
+  X1 = 
col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + 
X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = 
col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_double(), + X2 = col_double(), + X3 = col_double(), + X4 = col_integer(), + X5 = col_integer() +) +#+end_example + +***** 2. 
Transform profile.cubex + +#+name: exp28_current_dir +#+begin_src R :results verbatim :session :exports both +print(DIR) +#+end_src + +#+RESULTS: exp28_current_dir +: exp_28_chetemi_15_manual//scorep-RUN1 + +#+name: exp28_cubex_to_open +#+header: :var DIR=exp28_current_dir +#+begin_src shell :results output +cd $DIR +~/install/cube-4.3.5/bin/cube_dump -c all -m all -s csv2 profile.cubex > profile.csv +~/install/cube-4.3.5/bin/cube_dump -w profile.cubex | tail -n+33 | head -n279 | tr '(' ';' | sed -e "s/[[:space:]]*//g" -e "s/,.*$//" -e "s/id=//" -e "s/:[^;]*;/;/" > regions-codes.csv +~/install/cube-4.3.5/bin/cube_dump -w profile.cubex > cube_info.txt +#+end_src + +#+RESULTS: exp28_cubex_to_open + +***** 3. Parse the call tree + +#+name: exp28_cube_calltree +#+header: :var dep0=exp28_cubex_to_open +#+header: :var DIR=exp28_current_dir +#+begin_src perl :results output :exports both +use strict; +my($filename) = $DIR . "/cube_info.txt"; +my($line); +open(INPUT,$filename); +my($in_CALLTREE) = 0; + +my($VAR_iteration) = -1; +my($VAR_type) = -1; +my($VAR_id) = -1; + +my($filename_out) = $filename; +$filename_out =~ s/txt$/csv/; + +open(OUTPUT,"> ".$filename_out); + +# Read cube_info.txt line by line and keep only the CALL TREE section +while(defined($line=<INPUT>)) { + chomp $line; + if($line =~ "CALL TREE") { $in_CALLTREE = 1; } + if(!$in_CALLTREE) { next; } + if($line =~ "SYSTEM DIMENSION") { $in_CALLTREE = 0; } + + if($line =~ /^iter(\d*)\s*.*id=(\d*),.*/) { + $VAR_iteration = $1; + $VAR_type = "Computation"; + $VAR_id = $2; + # print "|$VAR_iteration | $VAR_type | $VAR_id |\n"; + print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n"; + } + if($line =~ /^\s*\|\-(\S*)\s.*id=(\d*),.*/) { +# print $line."\n"; + $VAR_type = $1; + $VAR_id = $2; +# print "|$VAR_iteration | $VAR_type | $VAR_id |\n"; + print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n"; + } +} +close(INPUT); +close(OUTPUT); +print $filename_out; +#+end_src + +#+RESULTS: exp28_cube_calltree +: exp_28_chetemi_15_manual//scorep-RUN1/cube_info.csv + +***** 4.
Enrich the call tree + +#+name: exp28_enrich +#+header: :var CSV=exp28_cube_calltree +#+header: :var DIR=exp28_current_dir +#+begin_src R :results output :session :exports both +suppressMessages(library(tidyverse)); +PROFILE = paste0(DIR, "/profile.csv"); +REGION = paste0(DIR, "/regions-codes.csv"); +df.PROF <- read_csv(PROFILE); +exp.REGION <- read_delim(REGION, col_names=FALSE, delim=";") %>% + rename(ID = X2, Name = X1); +exp.ENRICH <- read_delim(CSV, delim=",", col_names=FALSE) %>% + rename(Phase = X1, Code = X2, ID = X3) %>% + left_join(df.PROF, by=c("ID" = "Cnode ID")) %>% + rename(Rank = `Thread ID`) %>% + filter(Code == "Computation") %>% + mutate(Phase = as.integer(Phase)) %>% + filter(Rank != 0) +#+end_src + +#+RESULTS: exp28_enrich +#+begin_example +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +#+end_example + +***** X. 
Verify the load imbalance (and summarize it if necessary) +****** Rough initial LB Analysis + +#+header: :var dep0=exp28_enrich +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session +exp.ENRICH %>% + group_by(Phase, Rank, Code) %>% + summarize(Sum = sum(time)) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + ggplot(aes(x=as.integer(Phase), y=Sum, color=Rank)) + + geom_point() + geom_line(aes(group=Rank), alpha=.2) + + ylim(0,NA) + + theme_bw(base_size=18) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-18310RUI/figure18310MBK.png]] + +****** Calculate the average + +#+name: exp28_lb_average +#+header: :var dep0=exp28_enrich +#+begin_src R :results output :session :exports both +exp.ENRICH %>% + filter(Phase <= 9) %>% + group_by(Rank) %>% + summarize(time = mean(time)) -> exp.ENRICH +exp.ENRICH +#+end_src + +#+RESULTS: exp28_lb_average +#+begin_example +# A tibble: 64 x 2 + Rank time + <int> <dbl> + 1 1 42.92296 + 2 2 64.98924 + 3 3 66.13120 + 4 4 62.14705 + 5 5 64.66830 + 6 6 59.88956 + 7 7 61.76306 + 8 8 67.09068 + 9 9 57.22863 +10 10 64.70995 +# ... with 54 more rows +#+end_example + +***** 5.
The number of elements on T1 + +#+name: exp28_number_of_elements_shell +#+header: :var DIR=exp28_current_dir +#+begin_src shell :results table :colnames yes +cd $DIR +echo "Type Rank NELEM NPOIN NBOUN NPOI32" +cat domain-*.csv | grep ^.*T1 +#+end_src + +#+RESULTS: exp28_number_of_elements_shell +| Type | Rank | NELEM | NPOIN | NBOUN | NPOI32 | +| T1 | 0 | 0 | 0 | 553075 | -1 | +| T1 | 1 | 143862 | 27303 | 2832 | 2362 | +| T1 | 2 | 143921 | 48988 | 11415 | 4174 | +| T1 | 3 | 143822 | 47644 | 9906 | 4947 | +| T1 | 4 | 143695 | 43667 | 8197 | 3921 | +| T1 | 5 | 143675 | 46242 | 9442 | 4191 | +| T1 | 6 | 143752 | 43209 | 7433 | 4876 | +| T1 | 7 | 143768 | 48927 | 11007 | 4580 | +| T1 | 8 | 143792 | 47241 | 10828 | 2428 | +| T1 | 9 | 143945 | 39022 | 6023 | 3238 | +| T1 | 10 | 144243 | 47255 | 9907 | 4088 | +| T1 | 11 | 143833 | 52178 | 12835 | 4720 | +| T1 | 12 | 143842 | 48489 | 13095 | 2082 | +| T1 | 13 | 143754 | 47833 | 10503 | 3770 | +| T1 | 14 | 143817 | 35239 | 3680 | 3938 | +| T1 | 15 | 143776 | 38633 | 5338 | 4204 | +| T1 | 16 | 144023 | 37650 | 5066 | 3630 | +| T1 | 17 | 143773 | 42217 | 7839 | 3269 | +| T1 | 18 | 143746 | 32849 | 2614 | 3662 | +| T1 | 19 | 143824 | 37433 | 4732 | 4118 | +| T1 | 20 | 143821 | 38632 | 5583 | 3796 | +| T1 | 21 | 143849 | 42960 | 8158 | 3320 | +| T1 | 22 | 143789 | 46616 | 9577 | 4177 | +| T1 | 23 | 143887 | 43733 | 8093 | 3975 | +| T1 | 24 | 143817 | 40890 | 6690 | 3904 | +| T1 | 25 | 143742 | 37534 | 4465 | 4698 | +| T1 | 26 | 143741 | 34257 | 3128 | 3959 | +| T1 | 27 | 143887 | 39461 | 6142 | 3517 | +| T1 | 28 | 143876 | 35937 | 4111 | 3778 | +| T1 | 29 | 144122 | 46780 | 9805 | 3777 | +| T1 | 30 | 143919 | 47984 | 11611 | 2013 | +| T1 | 31 | 143703 | 43909 | 9221 | 3191 | +| T1 | 32 | 143864 | 41551 | 7138 | 3673 | +| T1 | 33 | 143920 | 45680 | 9560 | 3251 | +| T1 | 34 | 143917 | 43972 | 8473 | 3576 | +| T1 | 35 | 143733 | 36957 | 5195 | 2808 | +| T1 | 36 | 143974 | 37075 | 5274 | 2810 | +| T1 | 37 | 143763 | 33188 | 2855 | 
3421 | +| T1 | 38 | 143915 | 43860 | 8409 | 3526 | +| T1 | 39 | 143733 | 43199 | 7629 | 4442 | +| T1 | 40 | 144072 | 50989 | 12285 | 3312 | +| T1 | 41 | 143873 | 34536 | 7704 | 3727 | +| T1 | 42 | 143719 | 36628 | 6242 | 3728 | +| T1 | 43 | 143858 | 46666 | 11830 | 3627 | +| T1 | 44 | 143779 | 49048 | 12550 | 4756 | +| T1 | 45 | 143927 | 50855 | 11622 | 4467 | +| T1 | 46 | 143677 | 51004 | 11178 | 4918 | +| T1 | 47 | 143977 | 51247 | 11323 | 4801 | +| T1 | 48 | 143872 | 49662 | 11307 | 3910 | +| T1 | 49 | 144004 | 43839 | 8154 | 3973 | +| T1 | 50 | 143728 | 43759 | 8449 | 3406 | +| T1 | 51 | 143841 | 41709 | 7511 | 3472 | +| T1 | 52 | 143929 | 50158 | 11469 | 4245 | +| T1 | 53 | 143971 | 41111 | 6898 | 3765 | +| T1 | 54 | 143831 | 55000 | 13607 | 4988 | +| T1 | 55 | 143580 | 55935 | 14390 | 4804 | +| T1 | 56 | 143751 | 56151 | 13884 | 6112 | +| T1 | 57 | 143993 | 53463 | 13028 | 4431 | +| T1 | 58 | 144026 | 55324 | 14121 | 4205 | +| T1 | 59 | 143762 | 46085 | 12560 | 4506 | +| T1 | 60 | 143832 | 50092 | 12743 | 4424 | +| T1 | 61 | 143674 | 44626 | 8912 | 3867 | +| T1 | 62 | 143894 | 26865 | 1694 | 2141 | +| T1 | 63 | 143735 | 36132 | 7190 | 4182 | +| T1 | 64 | 144024 | 38547 | 8615 | 3754 | + +#+name: exp28_number_of_elements +#+header: :var TABLE=exp28_number_of_elements_shell +#+begin_src R :results output :session :exports both :colnames yes +TABLE %>% + as_tibble() %>% + filter(Rank != 0) %>% + select(-Type) -> exp.ELEMENTS; +exp.ELEMENTS; +#+end_src + +#+RESULTS: exp28_number_of_elements +#+begin_example +# A tibble: 64 x 5 + Rank NELEM NPOIN NBOUN NPOI32 + + 1 1 143862 27303 2832 2362 + 2 2 143921 48988 11415 4174 + 3 3 143822 47644 9906 4947 + 4 4 143695 43667 8197 3921 + 5 5 143675 46242 9442 4191 + 6 6 143752 43209 7433 4876 + 7 7 143768 48927 11007 4580 + 8 8 143792 47241 10828 2428 + 9 9 143945 39022 6023 3238 +10 10 144243 47255 9907 4088 +# ... with 54 more rows +#+end_example + +***** 6. 
Integrate number of elements with compute cost + +#+name: exp28_integrate +#+header: :var dep1=exp28_lb_average +#+header: :var dep0=exp28_number_of_elements +#+begin_src R :results output :session :exports both +exp.ENRICH %>% + left_join(exp.ELEMENTS) %>% + select(Rank, time, NELEM) -> df.INTEGRATED; +df.INTEGRATED +#+end_src + +#+RESULTS: exp28_integrate +#+begin_example +Joining, by = "Rank" +# A tibble: 64 x 3 + Rank time NELEM + + 1 1 28.18224 143862 + 2 2 41.61312 143921 + 3 3 40.45403 143822 + 4 4 38.41553 143695 + 5 5 40.30081 143675 + 6 6 37.42849 143752 + 7 7 42.15025 143768 + 8 8 42.18351 143792 + 9 9 36.24237 143945 +10 10 40.10288 144243 +# ... with 54 more rows +#+end_example + +***** 7. Do the cumsum + +#+name: exp28_cumsum +#+header: :var dep0=exp28_integrate +#+begin_src R :results output :session :exports both +df.INTEGRATED %>% + arrange(Rank) %>% + mutate(TIMESUM = cumsum(time)/sum(time)) %>% + mutate(NELEMSUM = cumsum(NELEM)/sum(NELEM)) -> df.CUMSUM; +df.CUMSUM -> df.exp28.CUMSUM; +df.CUMSUM +#+end_src + +#+RESULTS: exp28_cumsum +#+begin_example +# A tibble: 64 x 5 + Rank time NELEM TIMESUM NELEMSUM + + 1 1 60.43743 184387 0.01510068 0.02002865 + 2 2 68.98525 135390 0.03233709 0.03473510 + 3 3 62.07687 134635 0.04784740 0.04935954 + 4 4 61.41787 140751 0.06319305 0.06464832 + 5 5 66.77389 137213 0.07987694 0.07955279 + 6 6 60.27102 146712 0.09493604 0.09548907 + 7 7 69.88611 141375 0.11239754 0.11084562 + 8 8 63.52919 131541 0.12827072 0.12513399 + 9 9 64.41341 152492 0.14436482 0.14169811 +10 10 66.84620 136200 0.16106678 0.15649254 +# ... with 54 more rows +#+end_example + +***** 8. 
Target with Inverting Hilbert Load Curve +****** The function +#+name: exp28_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df_cum_inverse = function(df,yval) { + df = rbind(data.frame(x=0,y=0),df[names(df) %in% c("x","y")]) + N = nrow(df); + xval = rep(NA,length(yval)) +# print(N); + for(i in 1:length(yval)) { +# print(yval[i]) + if(yval[i]<=0) {xval[i] = 0;} + else if(yval[i]>=1) {xval[i] = 1;} + else { + idx = df %>% mutate(Order = 1:nrow(.)) %>% filter(y >= yval[i]) %>% slice(1) %>% pull(Order) + idx = idx-1; +# print(idx) +# print(df[idx:(idx+1),]) + eps_x = (df[idx+1,]$x - df[idx,]$x); + eps_y = (df[idx+1,]$y - df[idx,]$y); + xval[i] = ((yval[i]-df[idx,]$y)*eps_x/eps_y + df[idx,]$x); + } + } + return(xval); +} +#+end_src + +#+RESULTS: exp28_hilbert_invertion_function +****** Calculate the Target +#+name: exp28_target +#+header: :var dep0=exp28_cumsum +#+header: :var dep1=exp28_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df.CUMSUM %>% + rename(y = TIMESUM, x = NELEMSUM) -> df.CUMSUM.2; +nb_proc = nrow(df.CUMSUM.2) +df_cum_inverse(df.CUMSUM.2, y = (1:nb_proc)/nb_proc) -> l.TARGET; +df.CUMSUM.2 %>% + mutate(Target.x = l.TARGET, + Target.y = (1:nb_proc)/nb_proc) -> df.TARGET; +df.TARGET; +#+end_src + +#+name: exp28_tidying +#+header: :var dep0=exp28_target +#+begin_src R :results output :session :exports both +df.TARGET %>% select(Rank, x, y) -> df.1; +df.TARGET %>% select(Rank, Target.x, Target.y) -> df.2; +df.1 %>% + mutate(Category = "Observed") %>% + bind_rows(df.2 %>% + mutate(Category = "Target") %>% + rename(x = Target.x, y = Target.y)) -> df.TARGET.tidy; +df.TARGET %>% select(x, y, Target.x, Target.y) %>% filter(x == Target.x) +#+end_src + +#+RESULTS: +: # A tibble: 1 x 4 +: x y Target.x Target.y +: +: 1 1 1 1 1 + +****** Plot with the arrows + +#+header: :var dep0=exp28_tidying +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 
800 :height 400 :session +m1=0 +m2=m1+1 +NP = df.TARGET.tidy %>% pull(Rank) %>% unique %>% length +df.TARGET.tidy %>% + ggplot(aes(x=x, y=y, group=Category)) + + theme_bw(base_size=20) + + geom_curve(data=df.TARGET %>% filter(x!=1), aes(x=x, y=y, xend=Target.x, yend=Target.y), arrow=arrow(length = unit(0.03, "npc"))) + + geom_point(size=1, aes(shape=Category, color=as.factor(Rank%%8))) + + scale_color_brewer(palette = "Set1") + +# geom_line(data=df.TARGET.tidy %>% filter(Category == "Observed")) + + coord_cartesian(xlim=c(m1, m2), ylim=c(m1, m2)) #+ +# geom_hline(yintercept=(0:NP)/NP) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-16900S5M/figure16900T5J.png]] + +***** 9. Output the new =rank-elements.dat= file + +#+header: :var dep0=exp28_target +#+begin_src R :results output :session :exports both +df.TARGET %>% + select(Rank, Target.x) %>% + mutate(Target.x.aux = c(0, Target.x[-n()])) %>% + mutate(Target = (Target.x - Target.x.aux) * NP) %>% + select(Rank, Target) %>% + write_delim("rank-elements.dat", delim=" ", col_names=FALSE) +#+end_src + +#+RESULTS: + +***** 9.1 Check which rank has the largest correction + +#+header: :var dep0=exp28_target +#+begin_src R :results output :session :exports both +t = 61 +df.TARGET %>% + select(Rank, Target.x) %>% + mutate(Target.x.aux = c(0, Target.x[-n()])) %>% + mutate(Target = (Target.x - Target.x.aux) * NP) %>% + select(Rank, Target) %>% + filter(Rank >= t) +#+end_src + +#+RESULTS: +: # A tibble: 4 x 2 +: Rank Target +: <int> <dbl> +: 1 61 0.9131876 +: 2 62 1.1703797 +: 3 63 1.0715776 +: 4 64 1.0107468 + +#+begin_src R :results output :session :exports both +df.INTEGRATED %>% + mutate(F = time / NELEM) %>% + mutate(diff = mean(time) - time) %>% + filter(Rank >= t) +#+end_src + +#+RESULTS: +: # A tibble: 4 x 5 +: Rank time NELEM F diff +: <int> <dbl> <int> <dbl> <dbl> +: 1 61 72.95337 141232 0.0005165499 -10.41747 +: 2 62 81.25564 221395 0.0003670166 -18.71973 +: 3 63 29.42375 162676 0.0001808734 33.11215 +: 4 64 66.49157 157024 0.0004234484 -3.95566 + +#+begin_src R
:results output :session :exports both +exp.ENRICH %>% pull(Rank) %>% min +exp.ENRICH %>% summary +exp.ENRICH %>% pull(time) %>% sd +exp.ENRICH %>% arrange(-time) +#+end_src + +#+RESULTS: +#+begin_example +[1] 1 + Rank time + Min. : 1.00 Min. : 32.55 + 1st Qu.:16.75 1st Qu.: 59.81 + Median :32.50 Median : 61.14 + Mean :32.50 Mean : 61.80 + 3rd Qu.:48.25 3rd Qu.: 63.00 + Max. :64.00 Max. :103.14 +[1] 12.66165 +# A tibble: 64 x 2 + Rank time + <int> <dbl> + 1 31 103.14403 + 2 59 97.99925 + 3 46 96.40160 + 4 44 91.13452 + 5 3 81.42977 + 6 20 77.99168 + 7 1 67.54315 + 8 36 66.69546 + 9 56 65.95448 +10 39 65.71467 +# ... with 54 more rows +#+end_example +**** Code to get the five runs together in the same data frame + +Run this once: + +#+begin_src R :results output :session :exports both +exp28 = list() +#+end_src + +#+RESULTS: + +Then select one run directory at a time: + +#+begin_src R :results output :session :exports both +DIR=list.files("exp_28_chetemi_15_manual", pattern="^scorep", full.names=TRUE)[[5]] +DIR +#+end_src + +#+RESULTS: +: [1] "exp_28_chetemi_15_manual/scorep-RUN5" + +Each time DIR has been redefined above, run the code below: + +#+header: :var dep0=exp28_enrich +#+header: :var dep1=exp28_number_of_elements +#+begin_src R :results output :session :exports both +DIR +position = length(exp28) + 1 +exp28[position] <- list(exp.ENRICH %>% left_join(exp.ELEMENTS) %>% mutate(Case = position)); +length(exp28); +#+end_src + +#+RESULTS: +: [1] "exp_28_chetemi_15_manual/scorep-RUN5" +: Joining, by = "Rank" +: [1] 5 + +#+begin_src R :results output :session :exports both +exp.ALL <- do.call("bind_rows", exp28) %>% mutate(Case = as.factor(Case)) +exp.ALL +#+end_src + +#+RESULTS: +#+begin_example +# A tibble: 64,000 x 21 + Phase Code ID Rank visits time min_time max_time bytes_put + <int> <chr> <int> <int> <int> <dbl> <dbl> <dbl> <int> + 1 1 Computation 14 1 1 28.73742 50.40382 50.40382 0 + 2 1 Computation 14 2 1 43.04585 50.40640 50.40640 0 + 3 1 Computation 14 3 1 41.43878 50.40730 50.40730 0 + 4 1 Computation 14 4 1 38.93097 
50.40588 50.40588 0 + 5 1 Computation 14 5 1 41.11821 50.40721 50.40721 0 + 6 1 Computation 14 6 1 38.17838 50.40571 50.40571 0 + 7 1 Computation 14 7 1 42.57766 50.40739 50.40739 0 + 8 1 Computation 14 8 1 43.47140 50.40639 50.40639 0 + 9 1 Computation 14 9 1 37.08179 50.40619 50.40619 0 +10 1 Computation 14 10 1 40.99082 50.40620 50.40620 0 +# ... with 63,990 more rows, and 12 more variables: bytes_get , +# ALLOCATION_SIZE , DEALLOCATION_SIZE , bytes_leaked , +# maximum_heap_memory_allocated , bytes_sent , +# bytes_received , NELEM , NPOIN , NBOUN , NPOI32 , +# Case +#+end_example + +**** Textual check just to be sure + +#+begin_src R :results output :session :exports both +exp.ALL %>% + filter(Rank == 27) %>% + filter(Phase == 2) %>% + pull(time) %>% + unique +#+end_src + +#+RESULTS: +: [1] 36.85080 36.88063 36.85116 36.73310 36.63673 + +**** Save the data :ATTACH: +:PROPERTIES: +:Attachments: exp28_stability_check.csv.gz +:ID: 05da4262-5e83-43d7-809a-f0f03cd159f2 +:END: + +#+begin_src R :results output :session :exports both +write_csv(exp.ALL, "exp28_stability_check.csv.gz") +#+end_src + +#+RESULTS: + +**** Visualize +***** All points (each facet is a run) + +#+begin_src R :results output graphics :file img/exp28_stability_check.png :exports both :width 1000 :height 800 :session +read_csv("data/05/da4262-5e83-43d7-809a-f0f03cd159f2/exp28_stability_check.csv.gz") %>% + group_by(Phase, Rank, Code, Case) %>% + summarize(Sum = sum(time)) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + ggplot(aes(x=as.integer(Phase), y=Sum, color=Rank)) + + geom_point() + geom_line(aes(group=Rank), alpha=.2) + + ylim(0,NA) + + theme_bw(base_size=18) + + facet_wrap(~Case) +#+end_src + +#+RESULTS: +[[file:img/exp28_stability_check.png]] + +***** Per-rank average of points (all iterations of all five runs together) + +#+begin_src R :results output graphics :file img/exp28_stability_check_v2_phase1.png :exports both :width 1000 :height 800 :session 
+read_csv("data/05/da4262-5e83-43d7-809a-f0f03cd159f2/exp28_stability_check.csv.gz") %>% + filter(Phase == 1) %>% + group_by(Phase, Rank, Code) %>% + summarize(Sum = mean(time), SE=3*sd(time)/sqrt(n())) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + ggplot(aes(x=Rank, y=Sum)) + + geom_point(alpha=.5) + + geom_line(aes(group=Rank), alpha=.2) + + geom_errorbar(aes(ymin=Sum-SE, ymax=Sum+SE)) + + ylab("Mean") + + theme_bw(base_size=18) + + facet_wrap(~Phase) +#+end_src + +#+RESULTS: +[[file:img/exp28_stability_check_v2_phase1.png]] + +***** Per-iteration variability (considering five runs) + +#+begin_src R :results output graphics :file img/exp28_stability_check_v2.png :exports both :width 1000 :height 1600 :session +read_csv("data/05/da4262-5e83-43d7-809a-f0f03cd159f2/exp28_stability_check.csv.gz") %>% + group_by(Phase, Rank, Code) %>% + summarize(Sum = mean(time), SE=3*sd(time)/sqrt(n())) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + ggplot(aes(x=Rank, y=Sum)) + + geom_point(alpha=.2) + + geom_errorbar(aes(ymin=Sum-SE, ymax=Sum+SE), width=.5) + + ylab("Mean") + + theme_bw(base_size=18) + + facet_wrap(~Phase) +#+end_src + +#+RESULTS: +[[file:img/exp28_stability_check_v2.png]] +*** Automate the refinement + +The workflow has been moved to its own Org file: +- [[./Refinement_Workflow.org]] +Check its first section for details on how to use it.
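The heart of the refinement workflow above (sections 7 and 8) is the inversion of the piecewise-linear cumulative load curve: given the observed cumulative points (NELEMSUM, TIMESUM), rank i's target is the abscissa at which the curve reaches i/N. As a language-neutral illustration of the same logic as =df_cum_inverse= (a sketch only, not part of the workflow; the name =cum_inverse= is ours):

```python
# Piecewise-linear inversion of a cumulative curve, as in df_cum_inverse.
# points: (x, y) pairs of the cumulative curve, (0, 0) excluded,
#         with y increasing up to 1.
# yvals:  target cumulative loads (e.g. i/N for i = 1..N).
def cum_inverse(points, yvals):
    pts = [(0.0, 0.0)] + list(points)
    xvals = []
    for y in yvals:
        if y <= 0:
            xvals.append(0.0)
        elif y >= 1:
            xvals.append(1.0)
        else:
            # first curve point whose cumulative y reaches the target
            idx = next(i for i, (_, py) in enumerate(pts) if py >= y)
            x0, y0 = pts[idx - 1]
            x1, y1 = pts[idx]
            # linear interpolation on that segment
            xvals.append(x0 + (y - y0) * (x1 - x0) / (y1 - y0))
    return xvals
```

For instance, with the curve =[(0.5, 0.25), (1.0, 1.0)]= (the first half of the elements accounts for a quarter of the time) and two ranks, the first target lands well past x = 0.5, i.e. the slow second half gets fewer elements.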
+** 23-node chetemi/chifflet (524 cores), 1 it., check automation :EXP29: +*** Goal +- Verify that the automation works +*** Execution script + +I manually ran the script that controls the experiment: +- Disabling HT +- Disabling Turboboost +- Logging the state of the machines in per-machine Org files + +#+begin_src shell :results output +# Configuration part +export STEPS=50 +export NP=129 +export EXPEKEY=exp_29_chetemi_chifflet_23_manual_${NP} +export EXPEDIR=${HOME}/${EXPEKEY} +export MACHINEFILE=$HOME/machine-file +export CASEDIR=$HOME/WORK-RICARD/resp_sfc/ +export CASENAME=fensap +export ALYA=$HOME/alya-bsc-sfc/Executables/unix/Alya.x +rm -rf $EXPEDIR +mkdir -p $EXPEDIR +export GITREPO=$HOME/Alya-Perf/ + +# Control part +cat $MACHINEFILE | uniq > /tmp/mf-uniq +$GITREPO/scripts/control_experiment.sh /tmp/mf-uniq $EXPEDIR + +# Use spack +source ~/spack-ALYA/share/spack/setup-env.sh +# Use the correct mpicc/mpif90/mpirun +export PATH=$(spack location -i openmpi)/bin:$PATH +# Use the cube_dump for the refinement workflow +export PATH=$(spack location -i cube)/bin:$PATH + +# Generate initial rank-elements.dat to case +for i in $(seq 1 $(echo ${NP} - 1 | bc)); do + echo $i 1 +done > $CASEDIR/rank-elements.dat + +pushd $CASEDIR +for RUN in $(seq 1 ${STEPS}); do + RUNKEY="RUN${RUN}_of_${STEPS}" + SCOREPDIR="scorep-${RUNKEY}" + rm -rf $SCOREPDIR + echo $RUNKEY + + # copy machine-file + cp $MACHINEFILE .
+ + # Run the program + $(which mpirun) \ + --mca btl_base_warn_component_unused 0 \ + --bind-to core:overload-allowed \ + --report-bindings \ + -x SCOREP_TOTAL_MEMORY=3900MB \ + -x SCOREP_MPI_ENABLE_GROUPS=ALL \ + -x SCOREP_ENABLE_TRACING=FALSE \ + -x SCOREP_ENABLE_PROFILING=TRUE \ + -x SCOREP_EXPERIMENT_DIRECTORY=$SCOREPDIR \ + -x LD_LIBRARY_PATH=$(spack location -i openmpi)/lib/ \ + -np ${NP} \ + -machinefile $MACHINEFILE \ + $ALYA $CASENAME + + # Save the data into the SCOREPDIR + cp $MACHINEFILE $SCOREPDIR + cp *.log ${CASENAME}.dat $SCOREPDIR + mv domain-*.csv iterations-*.csv $SCOREPDIR + cp rank-elements.dat $SCOREPDIR + + # Run the refinement workflow + pushd $SCOREPDIR + cp $GITREPO/Refinement_Workflow.org . + emacs -batch -l ~/.emacs.d/init.el \ + --eval "(setq vc-follow-symlinks nil)" \ + --eval "(setq ess-ask-for-ess-directory nil)" \ + --eval "(setq org-export-babel-evaluate t)" \ + Refinement_Workflow.org \ + --funcall org-babel-execute-buffer + + # Adjust rank-elements_next.dat + NRANKS=rank-elements_next.dat + NRANKSNEW=rank-elements_next_new.dat + MAXRANK=$(cat $NRANKS | tail -n1 | cut -d" " -f1) + MAXRANKVALUE=$(cat $NRANKS | tail -n1 | cut -d" " -f2) + SUMSEC=$(echo $(cat $NRANKS | cut -d" " -f2 | tr '\n' '+' | sed 's/+$//') | bc -l) + DIFF=$(echo "$MAXRANK - $SUMSEC" | bc -l) + head -n-1 $NRANKS > $NRANKSNEW + echo "$MAXRANK $(echo "$MAXRANKVALUE + $DIFF" | bc -l)" >> $NRANKSNEW + cp $NRANKSNEW $CASEDIR/rank-elements.dat + popd + + # Move all data to EXPEDIR for archiving purposes + mv $SCOREPDIR $EXPEDIR +done > execution.log +mv execution.log $EXPEDIR +popd +#+end_src +*** Analysis + +- The program ran 13 times before crashing + +** 27-node (432 cores) parasilo, 1 it., check update :FAIL:EXP30: +*** Goal + +- Verify whether 256+1 ranks now work with the updated Alya +- Launch from =parasilo-1=.
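The =# Adjust rank-elements_next.dat= step in these scripts tops up the last rank's weight so that the fractional per-rank weights sum exactly to the number of worker ranks (ranks 1..N, one weight per line). The same arithmetic, done above with =bc=, can be sketched as follows (a sketch only; the name =normalize_weights= is ours):

```python
# Fold the rounding residue of fractional per-rank weights into the last
# rank so that the weights sum exactly to the number of ranks, mirroring
# the MAXRANK/SUMSEC/DIFF arithmetic done with bc in the shell scripts.
def normalize_weights(weights):
    target = len(weights)            # ranks are 1..N, so the sum must be N
    residue = target - sum(weights)  # what rounding took away (or added)
    return weights[:-1] + [weights[-1] + residue]
```

The adjusted weights are then copied back as the =rank-elements.dat= of the next run.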
+ +*** Execution script + +#+begin_src shell :results output +# Configuration part +export STEPS=3 +export NP=257 +export EXPEKEY=exp_30_parasilo_27_manual_${NP} +export EXPEDIR=${HOME}/${EXPEKEY} +export MACHINEFILE=$HOME/machine-file +export CASEDIR=$HOME/WORK-RICARD/resp_sfc/ +export CASENAME=fensap +export ALYA=$HOME/svn-bsc-sfc-2/Executables/unix/Alya.x +rm -rf $EXPEDIR +mkdir -p $EXPEDIR +export GITREPO=$HOME/Alya-Perf/ + +# Control part +cat $MACHINEFILE | uniq > /tmp/mf-uniq +$GITREPO/scripts/control_experiment.sh /tmp/mf-uniq $EXPEDIR + +# Use spack +source ~/spack-ALYA/share/spack/setup-env.sh +# Use the correct mpicc/mpif90/mpirun +export PATH=$(spack location -i openmpi)/bin:$PATH +# Use the cube_dump for the refinement workflow +export PATH=$(spack location -i cube)/bin:$PATH + +# Generate initial rank-elements.dat to case +for i in $(seq 1 $(echo ${NP} - 1 | bc)); do + echo $i 1 +done > $CASEDIR/rank-elements.dat + +pushd $CASEDIR +for RUN in $(seq 1 ${STEPS}); do + RUNKEY="RUN${RUN}_of_${STEPS}" + SCOREPDIR="scorep-${RUNKEY}" + rm -rf $SCOREPDIR + echo $RUNKEY + + # copy machine-file + cp $MACHINEFILE . + + # Run the program + $(which mpirun) \ + --mca btl_base_warn_component_unused 0 \ + --bind-to core:overload-allowed \ + --report-bindings \ + -x SCOREP_TOTAL_MEMORY=3900MB \ + -x SCOREP_MPI_ENABLE_GROUPS=ALL \ + -x SCOREP_ENABLE_TRACING=FALSE \ + -x SCOREP_ENABLE_PROFILING=TRUE \ + -x SCOREP_EXPERIMENT_DIRECTORY=$SCOREPDIR \ + -x LD_LIBRARY_PATH=$(spack location -i openmpi)/lib/ \ + -np ${NP} \ + -machinefile $MACHINEFILE \ + $ALYA $CASENAME + + # Save the data into the SCOREPDIR + cp $MACHINEFILE $SCOREPDIR + cp *.log ${CASENAME}.dat $SCOREPDIR + mv domain-*.csv iterations-*.csv $SCOREPDIR + cp rank-elements.dat $SCOREPDIR + + # Run the refinement workflow + pushd $SCOREPDIR + cp $GITREPO/Refinement_Workflow.org . 
+ emacs -batch -l ~/.emacs.d/init.el \ + --eval "(setq vc-follow-symlinks nil)" \ + --eval "(setq ess-ask-for-ess-directory nil)" \ + --eval "(setq org-export-babel-evaluate t)" \ + Refinement_Workflow.org \ + --funcall org-babel-execute-buffer + + # Adjust rank-elements_next.dat + NRANKS=rank-elements_next.dat + NRANKSNEW=rank-elements_next_new.dat + MAXRANK=$(cat $NRANKS | tail -n1 | cut -d" " -f1) + MAXRANKVALUE=$(cat $NRANKS | tail -n1 | cut -d" " -f2) + SUMSEC=$(echo $(cat $NRANKS | cut -d" " -f2 | tr '\n' '+' | sed 's/+$//') | bc -l) + DIFF=$(echo "$MAXRANK - $SUMSEC" | bc -l) + head -n-1 $NRANKS > $NRANKSNEW + echo "$MAXRANK $(echo "$MAXRANKVALUE + $DIFF" | bc -l)" >> $NRANKSNEW + cp $NRANKSNEW $CASEDIR/rank-elements.dat + popd + + # Move all data to EXPEDIR for archiving purposes + mv $SCOREPDIR $EXPEDIR +done > execution.log +mv execution.log $EXPEDIR +popd +#+end_src +*** Summary +- Alya did not work. +** 4-node (76 cores) rennes, check update :EXP31: +*** Goal + +- Verify that Alya works + +- Pre-check + - Install xclip and ess + +*** Execution script + +#+begin_src shell :results output +# Configuration part +export STEPS=5 +export NSTEPS=1 +export NP=76 +export EXPEKEY=exp_31_rennes_4_manual_${NP} +export EXPEDIR=${HOME}/${EXPEKEY} +export MACHINEFILE=$HOME/machine-file +export CASEDIR=$HOME/WORK-RICARD/resp_sfc/ +export CASENAME=fensap +export ALYA=$HOME/alya-bsc-sfc/Executables/unix/Alya.g +rm -rf $EXPEDIR +mkdir -p $EXPEDIR +export GITREPO=$HOME/Alya-Perf/ + +# Control part +cat $MACHINEFILE | uniq > /tmp/mf-uniq +$GITREPO/scripts/control_experiment.sh /tmp/mf-uniq $EXPEDIR + +# Use spack +source ~/spack-ALYA/share/spack/setup-env.sh +# Use the correct mpicc/mpif90/mpirun +export PATH=$(spack location -i openmpi)/bin:$PATH +# Use the cube_dump for the refinement workflow +export PATH=$(spack location -i cube)/bin:$PATH + +# Generate initial rank-elements.dat to case +for i in $(seq 1 $(echo ${NP} - 1 | bc)); do + echo $i 1 +done > 
$CASEDIR/rank-elements.dat
+
+pushd $CASEDIR
+sed -i "s/NUMBER_OF_STEPS=.*$/NUMBER_OF_STEPS=${NSTEPS}/" fensap.dat
+for RUN in $(seq 1 ${STEPS}); do
+    RUNKEY="RUN${RUN}_of_${STEPS}"
+    SCOREPDIR="scorep-${RUNKEY}"
+    rm -rf $SCOREPDIR
+    echo $RUNKEY
+
+    # Run the program
+    $(which mpirun) \
+        --mca btl_base_warn_component_unused 0 \
+        --bind-to core:overload-allowed \
+        --report-bindings \
+        -x SCOREP_TOTAL_MEMORY=3900MB \
+        -x SCOREP_MPI_ENABLE_GROUPS=ALL \
+        -x SCOREP_ENABLE_TRACING=FALSE \
+        -x SCOREP_ENABLE_PROFILING=TRUE \
+        -x SCOREP_EXPERIMENT_DIRECTORY=$SCOREPDIR \
+        -x LD_LIBRARY_PATH=$(spack location -i openmpi)/lib/ \
+        -np ${NP} \
+        -machinefile $MACHINEFILE \
+        $ALYA $CASENAME
+
+    # Save the data into the SCOREPDIR
+    cp $MACHINEFILE $SCOREPDIR
+    cp *.log ${CASENAME}.dat $SCOREPDIR
+    mv domain-*.csv iterations-*.csv $SCOREPDIR
+    cp rank-elements.dat $SCOREPDIR
+
+    # Run the refinement workflow
+    pushd $SCOREPDIR
+    cp $GITREPO/Refinement_Workflow.org .
+    emacs -batch -l ~/.emacs.d/init.el \
+        --eval "(setq vc-follow-symlinks nil)" \
+        --eval "(setq ess-ask-for-ess-directory nil)" \
+        --eval "(setq org-export-babel-evaluate t)" \
+        Refinement_Workflow.org \
+        --funcall org-babel-execute-buffer
+
+    # Adjust rank-elements_next.dat
+    NRANKS=rank-elements_next.dat
+    cp $NRANKS $CASEDIR/rank-elements.dat
+    popd
+
+    # Move all data to EXPEDIR for archiving purposes
+    mv $SCOREPDIR $EXPEDIR
+done > execution.log
+mv execution.log $EXPEDIR
+popd
+#+end_src
+** 65-node (1040 cores) rennes paravance, check update :EXP32:
+*** Goal
+- Image has been updated with xclip and ess
+- Verify if Alya works up to 1024 cores
+
+- paravance-7[0123] had major problems
+  - Unable to launch mpirun because of them
+  - Solution was to remove them from the machinefile
+
+*** Execution script
+
+#+begin_src shell :results output
+# Configuration part
+export STEPS=5
+export NSTEPS=10
+export EXPEKEY=exp_32_paravance_68_manual
+export EXPEDIR=${HOME}/${EXPEKEY}
+export 
MACHINEFILE=$HOME/paravance +export CASEDIR=$HOME/WORK-RICARD/resp_sfc/ +export CASENAME=fensap +export ALYA=$HOME/alya-bsc-sfc/Executables/unix/Alya.x +rm -rf $EXPEDIR +mkdir -p $EXPEDIR +export GITREPO=$HOME/Alya-Perf/ + +# Control part +cat $MACHINEFILE | uniq > /tmp/mf-uniq +$GITREPO/scripts/control_experiment.sh /tmp/mf-uniq $EXPEDIR + +# Use spack +source ~/spack-ALYA/share/spack/setup-env.sh +# Use the correct mpicc/mpif90/mpirun +export PATH=$(spack location -i openmpi)/bin:$PATH +# Use the cube_dump for the refinement workflow +export PATH=$(spack location -i cube)/bin:$PATH + +pushd $CASEDIR +sed -i "s/NUMBER_OF_STEPS=.*$/NUMBER_OF_STEPS=${NSTEPS}/" fensap.dat +for NP in 1025 513 257 129; do + + # Generate initial rank-elements.dat to case + for i in $(seq 1 $(echo ${NP} - 1 | bc)); do + echo $i 1 + done > $CASEDIR/rank-elements.dat + +for RUN in $(seq 1 ${STEPS}); do + RUNKEY="RUN_${NP}_STEP_${RUN}_of_${STEPS}" + SCOREPDIR="scorep-${RUNKEY}" + rm -rf $SCOREPDIR + echo $RUNKEY + + # Run the program + $(which mpirun) \ + --mca btl_base_warn_component_unused 0 \ + --bind-to core:overload-allowed \ + --report-bindings \ + -x SCOREP_TOTAL_MEMORY=3900MB \ + -x SCOREP_MPI_ENABLE_GROUPS=ALL \ + -x SCOREP_ENABLE_TRACING=FALSE \ + -x SCOREP_ENABLE_PROFILING=TRUE \ + -x SCOREP_EXPERIMENT_DIRECTORY=$SCOREPDIR \ + -x LD_LIBRARY_PATH=$(spack location -i openmpi)/lib/ \ + -np ${NP} \ + -machinefile $MACHINEFILE \ + $ALYA $CASENAME + + # Save the data into the SCOREPDIR + cp $MACHINEFILE $SCOREPDIR + cp *.log ${CASENAME}.dat $SCOREPDIR + mv domain-*.csv iterations-*.csv $SCOREPDIR + cp rank-elements.dat $SCOREPDIR + + # Run the refinement workflow + pushd $SCOREPDIR + cp $GITREPO/Refinement_Workflow.org . 
+    emacs -batch -l ~/.emacs.d/init.el \
+        --eval "(setq vc-follow-symlinks nil)" \
+        --eval "(setq ess-ask-for-ess-directory nil)" \
+        --eval "(setq org-export-babel-evaluate t)" \
+        Refinement_Workflow.org \
+        --funcall org-babel-execute-buffer
+
+    # Adjust rank-elements_next.dat
+    NRANKS=rank-elements_next.dat
+    cp $NRANKS $CASEDIR/rank-elements.dat
+    popd
+
+    # Move all data to EXPEDIR for archiving purposes
+    mv $SCOREPDIR $EXPEDIR
+done
+done > execution.log
+mv execution.log $EXPEDIR
+popd
+#+end_src
+*** Data Transformation
+**** 3. Parse the call tree (code block has been tangled and committed)
+
+#+name: exp32_cube_calltree
+#+begin_src perl :results output :exports both :tangle scripts/cube_calltree.pl :tangle-mode (identity #o755)
+#!/usr/bin/perl
+use strict;
+my($DIR) = $ARGV[0];
+my($filename) = $DIR . "/cube_info.txt";
+my($line);
+open(INPUT,$filename);
+my($in_CALLTREE) = 0;
+
+my($VAR_iteration) = -1;
+my($VAR_type) = -1;
+my($VAR_id) = -1;
+
+my($filename_out) = $filename;
+$filename_out =~ s/txt$/csv/;
+
+open(OUTPUT,"> ".$filename_out);
+
+while(defined($line=<INPUT>)) {
+    chomp $line;
+    if($line =~ "CALL TREE") { $in_CALLTREE = 1; }
+    if(!$in_CALLTREE) { next; }
+    if($line =~ "SYSTEM DIMENSION") { $in_CALLTREE = 0; }
+
+    if($line =~ /^iter(\d*)\s*.*id=(\d*),.*/) {
+        $VAR_iteration = $1;
+        $VAR_type = "Computation";
+        $VAR_id = $2;
+        # print "|$VAR_iteration | $VAR_type | $VAR_id |\n";
+        print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n";
+    }
+    if($line =~ /^\s*\|\-(\S*)\s.*id=(\d*),.*/) {
+#       print $line."\n";
+        $VAR_type = $1;
+        $VAR_id = $2;
+#       print "|$VAR_iteration | $VAR_type | $VAR_id |\n";
+        print OUTPUT "$VAR_iteration,$VAR_type,$VAR_id\n";
+    }
+}
+close(OUTPUT);
+print($filename_out . "\n");
+#+end_src
+
+#+RESULTS: exp32_cube_calltree
+: exp_20_grisou_4_manual/scorep-20180125_1159_34483171676480//cube_info.csv
+
+**** 5. 
The number of elements on T1 (code block has been tangled and committed)
+
+#+begin_src shell :results output :tangle scripts/no_elements_domain.sh :tangle-mode (identity #o755)
+#!/bin/bash
+DIR=$1
+FILE=$2
+
+pushd $DIR
+echo "Rank,NELEM,NPOIN,NBOUN,NPOI32" > $FILE
+zcat domain-*.csv.gz | grep ^.*T1 | sed -e "s/ T1//" -e "s/[[:space:]]\+/ /g" -e "s/^ //" -e "s/ /,/g" >> $FILE
+popd
+#+end_src
+
+**** 0. DIR
+
+#+begin_src R :results output :session :exports both
+DIR="exp_32_paravance_65_manual/scorep-RUN_513_STEP_2_of_5"
+#+end_src
+
+#+RESULTS:
+
+#+name: exp32_current_dir
+#+begin_src R :results verbatim :session :exports both
+print(DIR)
+#+end_src
+
+#+RESULTS: exp32_current_dir
+: exp_32_paravance_65_manual/scorep-RUN_513_STEP_2_of_5
+
+**** Read one DIR function
+
+#+name: exp32_read_one_dir
+#+begin_src R :results output :session :exports both
+suppressMessages(library(tidyverse));
+read_one_DIR <- function(DIR) {
+    # Get basic info about the experiment
+    tibble(Dir=DIR) %>%
+        separate(Dir, into=c("X", "Case"), sep="/") %>% select(-X) %>%
+        separate(Case, into=c("X1", "NP", "X3", "Seq", "X5", "X6"), sep="_") %>% select(-X1, -X3, -X5, -X6) -> basic.info;
+
+    # 2. Transform profile.cubex
+    PROFILE.CUBEX = paste0(DIR, "/profile.cubex");
+    PROFILE.CSV = paste0(DIR, "/profile.csv");
+    REGIONSCODES.CSV = paste0(DIR, "/regions-codes.csv");
+    CUBEINFO.TXT = paste0(DIR, "/cube_info.txt");
+    CUBEINFO.CSV = paste0(DIR, "/cube_info.csv");
+
+    system2("cube_dump", args=paste("-c all -m all -s csv2", PROFILE.CUBEX, ">", PROFILE.CSV));
+    system2("cube_dump", args=paste("-w", PROFILE.CUBEX, "| tail -n+33 | head -n279 | tr '(' ';' | sed -e \"s/[[:space:]]*//g\" -e \"s/,.*$//\" -e \"s/id=//\" -e \"s/:[^;]*;/;/\" > ", REGIONSCODES.CSV));
+    system2("cube_dump", args=paste("-w", PROFILE.CUBEX, ">", CUBEINFO.TXT));
+    system2("./scripts/cube_calltree.pl", args=DIR);
+
+    # 1. 
Read Iteration Timings
+    if(FALSE){
+    do.call("bind_rows", lapply(list.files(DIR, pattern="iterations.*csv.gz$", full.names=TRUE),
+                                function(file) {
+                                    read_table(file, col_names=FALSE) %>%
+                                        rename(Time.User = X1,
+                                               Time.System = X2,
+                                               Time.Run = X3,
+                                               Rank = X4,
+                                               Iteration = X5) %>%
+                                        filter(Iteration != 0) %>%
+                                        group_by(Rank, Iteration) %>%
+                                        summarize(Start = min(Time.User),
+                                                  End = max(Time.User)) %>%
+                                        group_by(Rank) %>%
+                                        mutate(End = End - min(Start),
+                                               Start = Start - min(Start))
+                                })) -> exp.iter;
+    }
+
+    # 4. Enrich the call tree
+    df.PROF <- read_csv(PROFILE.CSV);
+    exp.REGION <- read_delim(REGIONSCODES.CSV, col_names=FALSE, delim=";") %>%
+        rename(ID = X2, Name = X1);
+    exp.ENRICH <- read_delim(CUBEINFO.CSV, delim=",", col_names=FALSE) %>%
+        rename(Phase = X1, Code = X2, ID = X3) %>%
+        left_join(df.PROF, by=c("ID" = "Cnode ID")) %>%
+        rename(Rank = `Thread ID`) %>%
+        filter(Code == "Computation") %>%
+        mutate(Phase = as.integer(Phase)) %>%
+        filter(Rank != 0) %>%
+        mutate(NP = basic.info$NP,
+               Seq = basic.info$Seq)
+
+    # NB: the function returns here, so steps 5-7 below are currently not executed
+    return(exp.ENRICH);
+
+    # 5. The number of elements on T1
+    NOELEMENTS.CSV = "no_elements.csv";
+    system2("./scripts/no_elements_domain.sh", args=paste(DIR, NOELEMENTS.CSV));
+    exp.ELEMENTS <- read_csv(paste0(DIR, "/", NOELEMENTS.CSV)) %>%
+        filter(Rank != 0)
+
+    # Calculate the average
+    exp.ENRICH %>%
+        filter(Phase <= 9) %>%
+        group_by(Rank) %>%
+        summarize(time = mean(time)) -> exp.ENRICH.AVERAGE;
+
+    # 6. Integrate number of elements with compute cost
+    exp.ENRICH.AVERAGE %>%
+        left_join(exp.ELEMENTS) %>%
+        select(Rank, time, NELEM) -> df.INTEGRATED;
+
+    # 7. 
Do the cumsum + df.INTEGRATED %>% + arrange(Rank) %>% + mutate(TIMESUM = cumsum(time)/sum(time)) %>% + mutate(NELEMSUM = cumsum(NELEM)/sum(NELEM)) -> df.CUMSUM; +} +#+end_src + +#+RESULTS: exp32_read_one_dir + +**** Test read one DIR + +#+header: :var dep0=exp32_read_one_dir +#+begin_src R :results output :session :exports both +df.TMP <- read_one_DIR(DIR) +#+end_src + +#+RESULTS: +#+begin_example +exp_32_paravance_65_manual/scorep-RUN_513_STEP_2_of_5/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. 
+
+Parsed with column specification:
+cols(
+  X1 = col_character(),
+  X2 = col_integer()
+)
+Parsed with column specification:
+cols(
+  X1 = col_character(),
+  X2 = col_character(),
+  X3 = col_integer()
+)
+Warning message:
+In rbind(names(probs), probs_f) :
+  number of columns of result is not a multiple of vector length (arg 1)
+#+end_example
+
+#+begin_src R :results output :session :exports both
+df.TMP
+#+end_src
+
+#+RESULTS:
+#+begin_example
+# A tibble: 5,120 x 18
+   Phase        Code    ID  Rank visits     time min_time max_time bytes_put
+   <int>       <chr> <int> <int>  <int>    <dbl>    <dbl>    <dbl>     <int>
+ 1     1 Computation    14     1      1 4.485452 10.63609 10.63609         0
+ 2     1 Computation    14     2      1 4.495881 10.63586 10.63586         0
+ 3     1 Computation    14     3      1 4.677876 10.63597 10.63597         0
+ 4     1 Computation    14     4      1 4.510341 10.63589 10.63589         0
+ 5     1 Computation    14     5      1 4.474415 10.63602 10.63602         0
+ 6     1 Computation    14     6      1 4.673044 10.63584 10.63584         0
+ 7     1 Computation    14     7      1 4.433373 10.63605 10.63605         0
+ 8     1 Computation    14     8      1 4.791339 10.63593 10.63593         0
+ 9     1 Computation    14     9      1 4.694905 10.63616 10.63616         0
+10     1 Computation    14    10      1 4.711255 10.63598 10.63598         0
+# ... 
with 5,110 more rows, and 9 more variables: bytes_get <int>,
+#   ALLOCATION_SIZE <int>, DEALLOCATION_SIZE <int>, bytes_leaked <int>,
+#   maximum_heap_memory_allocated <int>, bytes_sent <int>,
+#   bytes_received <int>, NP <chr>, Seq <chr>
+#+end_example
+
+**** Read ALL DIRs
+
+#+begin_src R :results output :session :exports both
+do.call("bind_rows", lapply(list.files("exp_32_paravance_65_manual", pattern="scorep", full.names=TRUE),
+                            function(case) { read_one_DIR(case); }
+                            )) -> exp32.ALL;
+#+end_src
+
+#+RESULTS:
+#+begin_example
+exp_32_paravance_65_manual/scorep-RUN_1025_STEP_1_of_5/cube_info.csv
+Parsed with column specification:
+cols(
+  `Cnode ID` = col_integer(),
+  `Thread ID` = col_integer(),
+  visits = col_integer(),
+  time = col_double(),
+  min_time = col_double(),
+  max_time = col_double(),
+  bytes_put = col_integer(),
+  bytes_get = col_integer(),
+  ALLOCATION_SIZE = col_integer(),
+  DEALLOCATION_SIZE = col_integer(),
+  bytes_leaked = col_integer(),
+  maximum_heap_memory_allocated = col_integer(),
+  bytes_sent = col_integer(),
+  bytes_received = col_integer()
+)
+Warning: 20500 parsing failures.
+row # A tibble: 5 x 5 col row col expected actual expected actual 1 15376 bytes_sent an integer 12200300300 file 2 15376 bytes_received an integer 12200300300 row 3 15377 bytes_sent an integer 12200300300 col 4 15377 bytes_received an integer 12200300300 expected 5 15378 bytes_sent an integer 12200300300 actual # ... with 1 more variables: file
+... ................. ... ............................................. ........ ............................................. ...... ............................................. .... ............................................. ... ............................................. ... ............................................. ........ ............................................. ...... .......................................
+See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_32_paravance_65_manual/scorep-RUN_1025_STEP_2_of_5/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 20500 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 15376 bytes_sent an integer 12200300300 file 2 15376 bytes_received an integer 12200300300 row 3 15377 bytes_sent an integer 12200300300 col 4 15377 bytes_received an integer 12200300300 expected 5 15378 bytes_sent an integer 12200300300 actual # ... with 1 more variables: file +... ................. ... ............................................. ........ ............................................. ...... ............................................. .... ............................................. ... ............................................. ... ............................................. ........ ............................................. ...... ....................................... +See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_32_paravance_65_manual/scorep-RUN_1025_STEP_3_of_5/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 20500 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 15376 bytes_sent an integer 12200300300 file 2 15376 bytes_received an integer 12200300300 row 3 15377 bytes_sent an integer 12200300300 col 4 15377 bytes_received an integer 12200300300 expected 5 15378 bytes_sent an integer 12200300300 actual # ... with 1 more variables: file +... ................. ... ............................................. ........ ............................................. ...... ............................................. .... ............................................. ... ............................................. ... ............................................. ........ ............................................. ...... ....................................... +See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_32_paravance_65_manual/scorep-RUN_1025_STEP_4_of_5/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 20500 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 15376 bytes_sent an integer 12200300300 file 2 15376 bytes_received an integer 12200300300 row 3 15377 bytes_sent an integer 12200300300 col 4 15377 bytes_received an integer 12200300300 expected 5 15378 bytes_sent an integer 12200300300 actual # ... with 1 more variables: file +... ................. ... ............................................. ........ ............................................. ...... ............................................. .... ............................................. ... ............................................. ... ............................................. ........ ............................................. ...... ....................................... +See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_32_paravance_65_manual/scorep-RUN_1025_STEP_5_of_5/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 20500 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 15376 bytes_sent an integer 12200300300 file 2 15376 bytes_received an integer 12200300300 row 3 15377 bytes_sent an integer 12200300300 col 4 15377 bytes_received an integer 12200300300 expected 5 15378 bytes_sent an integer 12200300300 actual # ... with 1 more variables: file +... ................. ... ............................................. ........ ............................................. ...... ............................................. .... ............................................. ... ............................................. ... ............................................. ........ ............................................. ...... ....................................... +See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_32_paravance_65_manual/scorep-RUN_129_STEP_1_of_5/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_32_paravance_65_manual/scorep-RUN_129_STEP_2_of_5/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_32_paravance_65_manual/scorep-RUN_129_STEP_3_of_5/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + 
bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_32_paravance_65_manual/scorep-RUN_129_STEP_4_of_5/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_32_paravance_65_manual/scorep-RUN_129_STEP_5_of_5/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) 
+exp_32_paravance_65_manual/scorep-RUN_257_STEP_1_of_5/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 4626 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 3856 bytes_sent an integer 3059002124 file 2 3856 bytes_received an integer 3059002124 row 3 3857 bytes_sent an integer 3059002124 col 4 3857 bytes_received an integer 3059002124 expected 5 3858 bytes_sent an integer 3059002124 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_32_paravance_65_manual/scorep-RUN_257_STEP_2_of_5/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 4626 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 3856 bytes_sent an integer 3059002124 file 2 3856 bytes_received an integer 3059002124 row 3 3857 bytes_sent an integer 3059002124 col 4 3857 bytes_received an integer 3059002124 expected 5 3858 bytes_sent an integer 3059002124 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_32_paravance_65_manual/scorep-RUN_257_STEP_3_of_5/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 4626 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 3856 bytes_sent an integer 3059002124 file 2 3856 bytes_received an integer 3059002124 row 3 3857 bytes_sent an integer 3059002124 col 4 3857 bytes_received an integer 3059002124 expected 5 3858 bytes_sent an integer 3059002124 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_32_paravance_65_manual/scorep-RUN_257_STEP_4_of_5/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 4626 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 3856 bytes_sent an integer 3059002124 file 2 3856 bytes_received an integer 3059002124 row 3 3857 bytes_sent an integer 3059002124 col 4 3857 bytes_received an integer 3059002124 expected 5 3858 bytes_sent an integer 3059002124 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_32_paravance_65_manual/scorep-RUN_257_STEP_5_of_5/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 4626 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 3856 bytes_sent an integer 3059002124 file 2 3856 bytes_received an integer 3059002124 row 3 3857 bytes_sent an integer 3059002124 col 4 3857 bytes_received an integer 3059002124 expected 5 3858 bytes_sent an integer 3059002124 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_32_paravance_65_manual/scorep-RUN_513_STEP_1_of_5/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_32_paravance_65_manual/scorep-RUN_513_STEP_2_of_5/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_32_paravance_65_manual/scorep-RUN_513_STEP_3_of_5/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) + |=================================================================| 100% 4 MB
+Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. + +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_32_paravance_65_manual/scorep-RUN_513_STEP_4_of_5/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 10260 parsing failures. 
+row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. + +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_32_paravance_65_manual/scorep-RUN_513_STEP_5_of_5/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... 
............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. + +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +There were 15 warnings (use warnings() to see them) +#+end_example + +#+begin_src R :results output :session :exports both +exp32.ALL +#+end_src + +**** ALL Rough initial LB Analysis + +#+begin_src R :results output graphics :file img/exp32_ALL.png :exports both :width 1200 :height 800 :session +exp32.ALL %>% + mutate(NP = as.integer(NP), + Seq = as.integer(Seq)) %>% + group_by(NP, Seq, Phase, Rank, Code) %>% + summarize(Sum = sum(time)) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + ggplot(aes(x=as.integer(Phase), y=Sum, color=Rank)) + + geom_line(aes(group=Rank), alpha=.1) + + geom_point() + + ylim(0,NA) + + theme_bw(base_size=18) + + facet_grid(NP~Seq, scales="free") +#+end_src + +#+RESULTS: +[[file:img/exp32_ALL.png]] + +**** Save data for Arnaud's analysis :ATTACH: +:PROPERTIES: +:Attachments: exp32_compute_ALL.csv.gz +:ID: 572c142b-dd52-4aff-bc0a-1c26f5bad357 +:END: + +#+begin_src R :results output :session :exports both +write_csv(exp32.ALL, "exp32_compute_ALL.csv.gz"); +#+end_src + +#+RESULTS: + +**** TODO X. 
Verify the load imbalance (and summarize it if necessary) +***** Rough initial LB Analysis + +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session +df.TMP %>% + group_by(Phase, Rank, Code) %>% + summarize(Sum = sum(time)) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + ggplot(aes(x=as.integer(Phase), y=Sum, color=Rank)) + + geom_line(aes(group=Rank), alpha=.1) + + geom_point() + + ylim(0,NA) + + theme_bw(base_size=18) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-22415aoO/figure22415k9p.png]] + +***** Calculate the average + +#+name: exp32_lb_average +#+header: :var dep0=exp32_enrich +#+begin_src R :results output :session :exports both + +exp.ENRICH +#+end_src + +#+RESULTS: exp32_lb_average +#+begin_example +# A tibble: 64 x 2 + Rank time + + 1 1 42.92296 + 2 2 64.98924 + 3 3 66.13120 + 4 4 62.14705 + 5 5 64.66830 + 6 6 59.88956 + 7 7 61.76306 + 8 8 67.09068 + 9 9 57.22863 +10 10 64.70995 +# ... with 54 more rows +#+end_example + +**** TODO 8. 
Target with Inverting Hilbert Load Curve +***** The function +#+name: exp32_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df_cum_inverse = function(df,yval) { + df = rbind(data.frame(x=0,y=0),df[names(df) %in% c("x","y")]) + N = nrow(df); + xval = rep(NA,length(yval)) +# print(N); + for(i in 1:length(yval)) { +# print(yval[i]) + if(yval[i]<=0) {xval[i] = 0;} + else if(yval[i]>=1) {xval[i] = 1;} + else { + idx = df %>% mutate(Order = 1:nrow(.)) %>% filter(y >= yval[i]) %>% slice(1) %>% pull(Order) + idx = idx-1; +# print(idx) +# print(df[idx:(idx+1),]) + eps_x = (df[idx+1,]$x - df[idx,]$x); + eps_y = (df[idx+1,]$y - df[idx,]$y); + xval[i] = ((yval[i]-df[idx,]$y)*eps_x/eps_y + df[idx,]$x); + } + } + return(xval); +} +#+end_src + +#+RESULTS: exp32_hilbert_invertion_function +***** Calculate the Target +#+name: exp32_target +#+header: :var dep0=exp32_cumsum +#+header: :var dep1=exp32_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df.CUMSUM %>% + rename(y = TIMESUM, x = NELEMSUM) -> df.CUMSUM.2; +nb_proc = nrow(df.CUMSUM.2) +df_cum_inverse(df.CUMSUM.2, y = (1:nb_proc)/nb_proc) -> l.TARGET; +df.CUMSUM.2 %>% + mutate(Target.x = l.TARGET, + Target.y = (1:nb_proc)/nb_proc) -> df.TARGET; +df.TARGET; +#+end_src + +#+name: exp32_tidying +#+header: :var dep0=exp32_target +#+begin_src R :results output :session :exports both +df.TARGET %>% select(Rank, x, y) -> df.1; +df.TARGET %>% select(Rank, Target.x, Target.y) -> df.2; +df.1 %>% + mutate(Category = "Observed") %>% + bind_rows(df.2 %>% + mutate(Category = "Target") %>% + rename(x = Target.x, y = Target.y)) -> df.TARGET.tidy; +df.TARGET %>% select(x, y, Target.x, Target.y) %>% filter(x == Target.x) +#+end_src + +#+RESULTS: +: # A tibble: 1 x 4 +: x y Target.x Target.y +: +: 1 1 1 1 1 + +***** Plot with the arrows + +#+header: :var dep0=exp32_tidying +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 800 
:height 400 :session +m1=0 +m2=m1+1 +NP = df.TARGET.tidy %>% pull(Rank) %>% unique %>% length +df.TARGET.tidy %>% + ggplot(aes(x=x, y=y, group=Category)) + + theme_bw(base_size=20) + + geom_curve(data=df.TARGET %>% filter(x!=1), aes(x=x, y=y, xend=Target.x, yend=Target.y), arrow=arrow(length = unit(0.03, "npc"))) + + geom_point(size=1, aes(shape=Category, color=as.factor(Rank%%8))) + + scale_color_brewer(palette = "Set1") + +# geom_line(data=df.TARGET.tidy %>% filter(Category == "Observed")) + + coord_cartesian(xlim=c(m1, m2), ylim=c(m1, m2)) #+ +# geom_hline(yintercept=(0:NP)/NP) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-16900S5M/figure16900T5J.png]] + +**** TODO 9. Output the new =rank-elements.dat= file + +#+header: :var dep0=exp32_target +#+begin_src R :results output :session :exports both +df.TARGET %>% + select(Rank, Target.x) %>% + mutate(Target.x.aux = c(0, Target.x[-n()])) %>% + mutate(Target = (Target.x - Target.x.aux) * NP) %>% + select(Rank, Target) %>% + write_delim("rank-elements.dat", delim=" ", col_names=FALSE) +#+end_src + +#+RESULTS: + +**** TODO 9.1 Check which rank has the largest correction + +#+header: :var dep0=exp32_target +#+begin_src R :results output :session :exports both +t = 61 +df.TARGET %>% + select(Rank, Target.x) %>% + mutate(Target.x.aux = c(0, Target.x[-n()])) %>% + mutate(Target = (Target.x - Target.x.aux) * NP) %>% + select(Rank, Target) %>% + filter(Rank >= t) +#+end_src + +#+RESULTS: +: # A tibble: 4 x 2 +: Rank Target +: +: 1 61 0.9131876 +: 2 62 1.1703797 +: 3 63 1.0715776 +: 4 64 1.0107468 + +#+begin_src R :results output :session :exports both +df.INTEGRATED %>% + mutate(F = time / NELEM) %>% + mutate(diff = mean(time) - time) %>% + filter(Rank >= t) +#+end_src + +#+RESULTS: +: # A tibble: 4 x 5 +: Rank time NELEM F diff +: +: 1 61 72.95337 141232 0.0005165499 -10.41747 +: 2 62 81.25564 221395 0.0003670166 -18.71973 +: 3 63 29.42375 162676 0.0001808734 33.11215 +: 4 64 66.49157 157024 0.0004234484 -3.95566 + 
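+The per-rank targets above come from inverting the cumulative load
+curve at evenly spaced quantiles. The following toy sketch is not part
+of the experiment (=cum.df=, =inv= and the numbers are illustrative
+only); it shows the same idea with base-R =approx()= standing in for
+=df_cum_inverse=:
+
+#+begin_src R :results output :session :exports both
+## Toy cumulative curve: x = cumulative element fraction,
+## y = cumulative time fraction (prepend the (0,0) origin, as above).
+cum.df <- data.frame(x = c(0.3, 0.6, 1.0),
+                     y = c(0.5, 0.8, 1.0))
+nb_proc <- nrow(cum.df)
+## Linearly interpolate the inverse curve y -> x at the ideal quantiles.
+inv <- approx(x = c(0, cum.df$y), y = c(0, cum.df$x),
+              xout = (1:nb_proc)/nb_proc)$y
+## Per-rank element share, rescaled so a perfect balance gives 1 per
+## rank, as in the rank-elements.dat computation above.
+diff(c(0, inv)) * nb_proc
+#+end_src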
+#+begin_src R :results output :session :exports both +exp.ENRICH %>% pull(Rank) %>% min +exp.ENRICH %>% summary +exp.ENRICH %>% pull(time) %>% sd +exp.ENRICH %>% arrange(-time) +#+end_src + +#+RESULTS: +#+begin_example +[1] 1 + Rank time + Min. : 1.00 Min. : 32.55 + 1st Qu.:16.75 1st Qu.: 59.81 + Median :32.50 Median : 61.14 + Mean :32.50 Mean : 61.80 + 3rd Qu.:48.25 3rd Qu.: 63.00 + Max. :64.00 Max. :103.14 +[1] 12.66165 +# A tibble: 64 x 2 + Rank time + + 1 31 103.14403 + 2 59 97.99925 + 3 46 96.40160 + 4 44 91.13452 + 5 3 81.42977 + 6 20 77.99168 + 7 1 67.54315 + 8 36 66.69546 + 9 56 65.95448 +10 39 65.71467 +# ... with 54 more rows +#+end_example + +** 65-nodes (1040 cores) rennes paravance, check many rounds :EXP33: +*** Execution script + +#+begin_src shell :results output +# Configuration part +export STEPS=50 +export TIMESTEPS=10 +export EXPEKEY=exp_33_paravance_65_manual +export EXPEDIR=${HOME}/${EXPEKEY} +export MACHINEFILE=$HOME/paravance +export CASEDIR=$HOME/WORK-RICARD/resp_sfc/ +export CASENAME=fensap +export ALYA=$HOME/alya-bsc-sfc/Executables/unix/Alya.x +rm -rf $EXPEDIR +mkdir -p $EXPEDIR +export GITREPO=$HOME/Alya-Perf/ + +# Control part +cat $MACHINEFILE | uniq > /tmp/mf-uniq +$GITREPO/scripts/control_experiment.sh /tmp/mf-uniq $EXPEDIR + +# Use spack +source ~/spack-ALYA/share/spack/setup-env.sh +# Use the correct mpicc/mpif90/mpirun +export PATH=$(spack location -i openmpi)/bin:$PATH +# Use the cube_dump for the refinement workflow +export PATH=$(spack location -i cube)/bin:$PATH + +pushd $CASEDIR +sed -i "s/NUMBER_OF_STEPS=.*$/NUMBER_OF_STEPS=${TIMESTEPS}/" fensap.dat +for NP in 513 1025; do + + # Generate initial rank-elements.dat to case + for i in $(seq 1 $(echo ${NP} - 1 | bc)); do + echo $i 1 + done > $CASEDIR/rank-elements.dat + +for RUN in $(seq 1 ${STEPS}); do + RUNKEY="RUN_${NP}_STEP_${RUN}_of_${STEPS}" + SCOREPDIR="scorep-${RUNKEY}" + rm -rf $SCOREPDIR + echo $RUNKEY + + # Run the program + $(which mpirun) \ + --mca 
btl_base_warn_component_unused 0 \ + --bind-to core:overload-allowed \ + --report-bindings \ + -x SCOREP_TOTAL_MEMORY=3900MB \ + -x SCOREP_MPI_ENABLE_GROUPS=ALL \ + -x SCOREP_ENABLE_TRACING=FALSE \ + -x SCOREP_ENABLE_PROFILING=TRUE \ + -x SCOREP_EXPERIMENT_DIRECTORY=$SCOREPDIR \ + -x LD_LIBRARY_PATH=$(spack location -i openmpi)/lib/ \ + -np ${NP} \ + -machinefile $MACHINEFILE \ + $ALYA $CASENAME + + # Save the data into the SCOREPDIR + cp $MACHINEFILE $SCOREPDIR + cp *.log ${CASENAME}.dat $SCOREPDIR + mv domain-*.csv iterations-*.csv $SCOREPDIR + cp rank-elements.dat $SCOREPDIR + + # Run the refinement workflow + pushd $SCOREPDIR + cp $GITREPO/Refinement_Workflow.org . + emacs -batch -l ~/.emacs.d/init.el \ + --eval "(setq vc-follow-symlinks nil)" \ + --eval "(setq ess-ask-for-ess-directory nil)" \ + --eval "(setq org-export-babel-evaluate t)" \ + Refinement_Workflow.org \ + --funcall org-babel-execute-buffer + + # Compress data (after workflow) + gzip *.csv + + # Adjust rank-elements_next.dat + NRANKS=rank-elements_next.dat + cp $NRANKS $CASEDIR/rank-elements.dat + popd + + # Move all data to EXPEDIR for archiving purposes + mv $SCOREPDIR $EXPEDIR +done +done > execution.log +mv execution.log $EXPEDIR +popd +#+end_src +*** Data Transformation +**** Read one DIR function + +#+name: exp33_read_one_dir +#+begin_src R :results output :session :exports both +suppressMessages(library(tidyverse)); +read_one_DIR <- function(DIR) { + # Get basic info about the experiment + tibble(Dir=DIR) %>% + separate(Dir, into=c("X", "Case"), sep="/") %>% select(-X) %>% + separate(Case, into=c("X1", "NP", "X3", "Seq", "X5", "X6"), sep="_") %>% select(-X1, -X3, -X5, -X6) -> basic.info; + + # 2. 
Transform profile.cubex + PROFILE.CUBEX = paste0(DIR, "/profile.cubex"); + PROFILE.CSV = paste0(DIR, "/profile.csv"); + REGIONSCODES.CSV = paste0(DIR, "/regions-codes.csv"); + CUBEINFO.TXT = paste0(DIR, "/cube_info.txt"); + CUBEINFO.CSV = paste0(DIR, "/cube_info.csv"); + + system2("cube_dump", args=paste("-c all -m all -s csv2", PROFILE.CUBEX, ">", PROFILE.CSV)); + system2("cube_dump", args=paste("-w", PROFILE.CUBEX, "| tail -n+33 | head -n279 | tr '(' ';' | sed -e \"s/[[:space:]]*//g\" -e \"s/,.*$//\" -e \"s/id=//\" -e \"s/:[^;]*;/;/\" > ", REGIONSCODES.CSV)); + system2("cube_dump", args=paste("-w", PROFILE.CUBEX, ">", CUBEINFO.TXT)); + system2("./scripts/cube_calltree.pl", args=DIR); + + # 1. Read Iteration Timings + if(FALSE){ + do.call("bind_rows", lapply(list.files(DIR, pattern="iterations.*csv.gz$", full.names=TRUE), + function(file) { + read_table(file, col_names=FALSE) %>% + rename(Time.User = X1, + Time.System = X2, + Time.Run = X3, + Rank = X4, + Iteration = X5) %>% + filter(Iteration != 0) %>% + group_by(Rank, Iteration) %>% + summarize(Start = min(Time.User), + End = max(Time.User)) %>% + group_by(Rank) %>% + mutate(End = End - min(Start), + Start = Start - min(Start)) + })) -> exp.iter; + } + + # 4. Enrich the call tree + df.PROF <- read_csv(PROFILE.CSV); + exp.REGION <- read_delim(REGIONSCODES.CSV, col_names=FALSE, delim=";") %>% + rename(ID = X2, Name = X1); + exp.ENRICH <- read_delim(CUBEINFO.CSV, delim=",", col_names=FALSE) %>% + rename(Phase = X1, Code = X2, ID = X3) %>% + left_join(df.PROF, by=c("ID" = "Cnode ID")) %>% + rename(Rank = `Thread ID`) %>% + filter(Code == "Computation") %>% + mutate(Phase = as.integer(Phase)) %>% + filter(Rank != 0) %>% + mutate(NP = basic.info$NP, + Seq = basic.info$Seq) + + return(exp.ENRICH); + + # 5. 
The number of elements on T1 + NOELEMENTS.CSV = "no_elements.csv"; + system2("./scripts/no_elements_domain.sh", args=paste(DIR, NOELEMENTS.CSV)); + exp.ELEMENTS <- read_csv(paste0(DIR, "/", NOELEMENTS.CSV)) %>% + filter(Rank != 0) + + # Calculate the average + exp.ENRICH %>% + filter(Phase <= 9) %>% + group_by(Rank) %>% + summarize(time = mean(time)) -> exp.ENRICH.AVERAGE; + + # 6. Integrate number of elements with compute cost + exp.ENRICH.AVERAGE %>% + left_join(exp.ELEMENTS) %>% + select(Rank, time, NELEM) -> df.INTEGRATED; + + # 7. Do the cumsum + df.INTEGRATED %>% + arrange(Rank) %>% + mutate(TIMESUM = cumsum(time)/sum(time)) %>% + mutate(NELEMSUM = cumsum(NELEM)/sum(NELEM)) -> df.CUMSUM; +} +#+end_src + +#+RESULTS: exp33_read_one_dir + +**** Read ALL DIRs + +#+begin_src R :results output :session :exports both +do.call("bind_rows", lapply(list.files("exp_33_paravance_65_manual", pattern="scorep", full.names=TRUE), + function(case) { read_one_DIR(case); } + )) -> exp33.ALL; +#+end_src + +#+RESULTS: +#+begin_example +exp_33_paravance_65_manual/scorep-RUN_513_STEP_1_of_50/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... 
............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. + +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_33_paravance_65_manual/scorep-RUN_513_STEP_10_of_50/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... 
+See problems(...) for more details. + +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_33_paravance_65_manual/scorep-RUN_513_STEP_11_of_50/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_33_paravance_65_manual/scorep-RUN_513_STEP_12_of_50/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. 
+[... the same readr "Parsed with column specification" messages, the 10260-parsing-failure warning, and download progress bars repeat verbatim for the remaining cube_info.csv files (scorep-RUN_513_STEP_2, _3, and _13 through _31 of 50) ...]
+See problems(...) for more details. + +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_33_paravance_65_manual/scorep-RUN_513_STEP_32_of_50/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) + |============== | 19% |=============== | 20% |=============== | 21% |=============== | 23% 1 MB |=============== | 24% 1 MB |================ | 25% 1 MB |================= | 26% 1 MB |================== | 28% 1 MB |=================== | 29% 1 MB |=================== | 30% 1 MB |==================== | 31% 1 MB |===================== | 33% 1 MB |====================== | 34% 1 MB |====================== | 35% 1 MB |======================= | 36% 1 MB |======================== | 37% 1 MB |========================= | 39% 1 MB |========================== | 40% 1 MB |=========================== | 41% 1 MB |=========================== | 42% 1 MB |============================ | 43% 1 MB |============================= | 45% 1 MB |============================== | 46% 2 MB |=============================== | 47% 2 MB |=============================== | 48% 2 MB |================================ | 50% 2 MB |================================= | 51% 2 MB |================================== | 52% 2 MB |================================== | 53% 2 MB |=================================== | 55% 2 MB |==================================== | 56% 2 MB |===================================== | 57% 2 MB 
|====================================== | 58% 2 MB |======================================= | 60% 2 MB |======================================= | 61% 2 MB |======================================== | 62% 2 MB |========================================= | 63% 2 MB |========================================== | 65% 2 MB |=========================================== | 66% 2 MB |============================================ | 67% 2 MB |============================================= | 69% 3 MB |============================================= | 70% 3 MB |============================================== | 71% 3 MB |=============================================== | 73% 3 MB |================================================ | 74% 3 MB |================================================= | 75% 3 MB |================================================== | 77% 3 MB |=================================================== | 78% 3 MB |=================================================== | 79% 3 MB |==================================================== | 81% 3 MB |===================================================== | 82% 3 MB |====================================================== | 83% 3 MB |======================================================= | 85% 3 MB |======================================================== | 86% 3 MB |========================================================= | 87% 3 MB |========================================================== | 89% 3 MB |========================================================== | 90% 3 MB |=========================================================== | 91% 3 MB |============================================================ | 93% 4 MB |============================================================= | 94% 4 MB |============================================================== | 95% 4 MB |=============================================================== | 97% 4 MB |================================================================| 98% 4 MB 
|================================================================| 99% 4 MB |=================================================================| 100% 4 MB +Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. + +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_33_paravance_65_manual/scorep-RUN_513_STEP_33_of_50/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 10260 parsing failures. 
+row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. + +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_33_paravance_65_manual/scorep-RUN_513_STEP_34_of_50/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... 
............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. + +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_33_paravance_65_manual/scorep-RUN_513_STEP_35_of_50/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... 
+See problems(...) for more details. + +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_33_paravance_65_manual/scorep-RUN_513_STEP_36_of_50/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) + |== | 3% |=== | 4% |==== | 5% |===== | 7% |====== | 8% |======= | 9% |======= | 10% |======== | 12% |========= | 13% |========== | 14% |=========== | 15% |============ | 17% |============= | 18% |============== | 19% |=============== | 20% |=============== | 21% |=============== | 23% 1 MB |=============== | 24% 1 MB |================ | 25% 1 MB |================= | 26% 1 MB |================== | 28% 1 MB |=================== | 29% 1 MB |=================== | 30% 1 MB |==================== | 31% 1 MB |===================== | 32% 1 MB |====================== | 34% 1 MB |====================== | 35% 1 MB |======================= | 36% 1 MB |======================== | 37% 1 MB |========================= | 39% 1 MB |========================== | 40% 1 MB |=========================== | 41% 1 MB |=========================== | 42% 1 MB |============================ | 43% 1 MB |============================= | 45% 1 MB |============================== | 46% 2 MB |============================== | 47% 2 MB |=============================== | 48% 2 MB |================================ | 50% 2 MB |================================= | 51% 2 MB |================================== | 52% 2 MB 
|================================== | 53% 2 MB |=================================== | 55% 2 MB |==================================== | 56% 2 MB |===================================== | 57% 2 MB |====================================== | 58% 2 MB |======================================= | 60% 2 MB |======================================= | 61% 2 MB |======================================== | 62% 2 MB |========================================= | 63% 2 MB |========================================== | 65% 2 MB |=========================================== | 66% 2 MB |============================================ | 67% 2 MB |============================================= | 69% 3 MB |============================================= | 70% 3 MB |============================================== | 71% 3 MB |=============================================== | 73% 3 MB |================================================ | 74% 3 MB |================================================= | 75% 3 MB |================================================== | 77% 3 MB |=================================================== | 78% 3 MB |=================================================== | 79% 3 MB |==================================================== | 81% 3 MB |===================================================== | 82% 3 MB |====================================================== | 83% 3 MB |======================================================= | 85% 3 MB |======================================================== | 86% 3 MB |========================================================= | 87% 3 MB |========================================================== | 89% 3 MB |========================================================== | 90% 3 MB |=========================================================== | 91% 3 MB |============================================================ | 93% 4 MB |============================================================= | 94% 4 MB 
|============================================================== | 95% 4 MB |=============================================================== | 97% 4 MB |================================================================| 98% 4 MB |================================================================| 99% 4 MB |=================================================================| 100% 4 MB +Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_33_paravance_65_manual/scorep-RUN_513_STEP_37_of_50/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_33_paravance_65_manual/scorep-RUN_513_STEP_38_of_50/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_33_paravance_65_manual/scorep-RUN_513_STEP_39_of_50/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_33_paravance_65_manual/scorep-RUN_513_STEP_4_of_50/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_33_paravance_65_manual/scorep-RUN_513_STEP_40_of_50/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_33_paravance_65_manual/scorep-RUN_513_STEP_41_of_50/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_33_paravance_65_manual/scorep-RUN_513_STEP_42_of_50/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_33_paravance_65_manual/scorep-RUN_513_STEP_43_of_50/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. 
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +exp_33_paravance_65_manual/scorep-RUN_513_STEP_44_of_50/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Warning: 10260 parsing failures. +row # A tibble: 5 x 5 col row col expected actual expected actual 1 7696 bytes_sent an integer 6106101516 file 2 7696 bytes_received an integer 6106101516 row 3 7697 bytes_sent an integer 6106101516 col 4 7697 bytes_received an integer 6106101516 expected 5 7698 bytes_sent an integer 6106101516 actual # ... with 1 more variables: file +... ................. ... ............................................ ........ ............................................ ...... ............................................ .... ............................................ ... ............................................ ... ............................................ ........ ............................................ ...... ....................................... +See problems(...) for more details. 
+
+[... the same column-specification output and "10260 parsing failures" warning repeated for each remaining exp_33_paravance_65_manual/scorep-RUN_513_STEP_*_of_50/cube_info.csv ...]
+ +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +There were 50 or more warnings (use warnings() to see the first 50) +#+end_example + +#+begin_src R :results output :session :exports both +exp33.ALL +#+end_src + +#+RESULTS: +#+begin_example +# A tibble: 256,000 x 18 + Phase Code ID Rank visits time min_time max_time bytes_put + + 1 1 Computation 14 1 1 3.405599 11.07904 11.07904 0 + 2 1 Computation 14 2 1 3.347164 11.07898 11.07898 0 + 3 1 Computation 14 3 1 3.393387 11.07907 11.07907 0 + 4 1 Computation 14 4 1 3.409869 11.07887 11.07887 0 + 5 1 Computation 14 5 1 3.303782 11.07908 11.07908 0 + 6 1 Computation 14 6 1 3.601905 11.07898 11.07898 0 + 7 1 Computation 14 7 1 3.286918 11.07910 11.07910 0 + 8 1 Computation 14 8 1 3.443075 11.07900 11.07900 0 + 9 1 Computation 14 9 1 3.302105 11.07908 11.07908 0 +10 1 Computation 14 10 1 3.667202 11.07895 11.07895 0 +# ... 
with 255,990 more rows, and 9 more variables: bytes_get , +# ALLOCATION_SIZE , DEALLOCATION_SIZE , bytes_leaked , +# maximum_heap_memory_allocated , bytes_sent , +# bytes_received , NP , Seq +#+end_example + +**** ALL Rough initial LB Analysis + +#+begin_src R :results output graphics :file img/exp33_ALL.png :exports both :width 1800 :height 400 :session +exp33.ALL %>% + mutate(NP = as.integer(NP), + Seq = as.integer(Seq)) %>% + filter(NP == 513) %>% + group_by(NP, Seq, Phase, Rank, Code) %>% + summarize(Sum = sum(time)) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + ggplot(aes(x=as.integer(Phase), y=Sum, color=Rank)) + + geom_line(aes(group=Rank), alpha=.1) + + geom_point() + + ylim(0,NA) + + theme_bw(base_size=18) + + facet_wrap(~Seq, nrow=2) +#+end_src + +#+RESULTS: +[[file:img/exp33_ALL.png]] + +**** Save data for Arnaud's analysis :ATTACH: +:PROPERTIES: +:Attachments: exp33_compute_ALL.csv.gz +:ID: 5f287c9b-27f1-4fe6-a368-220504cba4ad +:END: + +#+begin_src R :results output :session :exports both +write_csv(exp33.ALL, "exp33_compute_ALL.csv.gz"); +#+end_src + +#+RESULTS: + +**** TODO X. 
Verify the load imbalance (and eventually summarize it if necessary) +***** Rough initial LB Analysis + +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session +df.TMP %>% + group_by(Phase, Rank, Code) %>% + summarize(Sum = sum(time)) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + ggplot(aes(x=as.integer(Phase), y=Sum, color=Rank)) + + geom_line(aes(group=Rank), alpha=.1) + + geom_point() + + ylim(0,NA) + + theme_bw(base_size=18) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-22415aoO/figure22415k9p.png]] + +***** Calculate the average + +#+name: exp33_lb_average +#+header: :var dep0=exp33_enrich +#+begin_src R :results output :session :exports both + +exp.ENRICH +#+end_src + +#+RESULTS: exp33_lb_average +#+begin_example +# A tibble: 64 x 2 + Rank time + + 1 1 42.92296 + 2 2 64.98924 + 3 3 66.13120 + 4 4 62.14705 + 5 5 64.66830 + 6 6 59.88956 + 7 7 61.76306 + 8 8 67.09068 + 9 9 57.22863 +10 10 64.70995 +# ... with 54 more rows +#+end_example + +**** TODO 8. 
Target with Inverting Hilbert Load Curve +***** The function +#+name: exp33_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df_cum_inverse = function(df,yval) { + df = rbind(data.frame(x=0,y=0),df[names(df) %in% c("x","y")]) + N = nrow(df); + xval = rep(NA,length(yval)) +# print(N); + for(i in 1:length(yval)) { +# print(yval[i]) + if(yval[i]<=0) {xval[i] = 0;} + else if(yval[i]>=1) {xval[i] = 1;} + else { + idx = df %>% mutate(Order = 1:nrow(.)) %>% filter(y >= yval[i]) %>% slice(1) %>% pull(Order) + idx = idx-1; +# print(idx) +# print(df[idx:(idx+1),]) + eps_x = (df[idx+1,]$x - df[idx,]$x); + eps_y = (df[idx+1,]$y - df[idx,]$y); + xval[i] = ((yval[i]-df[idx,]$y)*eps_x/eps_y + df[idx,]$x); + } + } + return(xval); +} +#+end_src + +#+RESULTS: exp33_hilbert_invertion_function +***** Calculate the Target +#+name: exp33_target +#+header: :var dep0=exp33_cumsum +#+header: :var dep1=exp33_hilbert_invertion_function +#+begin_src R :results output :session :exports both +df.CUMSUM %>% + rename(y = TIMESUM, x = NELEMSUM) -> df.CUMSUM.2; +nb_proc = nrow(df.CUMSUM.2) +df_cum_inverse(df.CUMSUM.2, y = (1:nb_proc)/nb_proc) -> l.TARGET; +df.CUMSUM.2 %>% + mutate(Target.x = l.TARGET, + Target.y = (1:nb_proc)/nb_proc) -> df.TARGET; +df.TARGET; +#+end_src + +#+name: exp33_tidying +#+header: :var dep0=exp33_target +#+begin_src R :results output :session :exports both +df.TARGET %>% select(Rank, x, y) -> df.1; +df.TARGET %>% select(Rank, Target.x, Target.y) -> df.2; +df.1 %>% + mutate(Category = "Observed") %>% + bind_rows(df.2 %>% + mutate(Category = "Target") %>% + rename(x = Target.x, y = Target.y)) -> df.TARGET.tidy; +df.TARGET %>% select(x, y, Target.x, Target.y) %>% filter(x == Target.x) +#+end_src + +#+RESULTS: +: # A tibble: 1 x 4 +: x y Target.x Target.y +: +: 1 1 1 1 1 + +***** Plot with the arrows + +#+header: :var dep0=exp33_tidying +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 800 
:height 400 :session
+m1=0
+m2=m1+1
+NP = df.TARGET.tidy %>% pull(Rank) %>% unique %>% length
+df.TARGET.tidy %>%
+  ggplot(aes(x=x, y=y, group=Category)) +
+  theme_bw(base_size=20) +
+  geom_curve(data=df.TARGET %>% filter(x!=1), aes(x=x, y=y, xend=Target.x, yend=Target.y), arrow=arrow(length = unit(0.03, "npc"))) +
+  geom_point(size=1, aes(shape=Category, color=as.factor(Rank%%8))) +
+  scale_color_brewer(palette = "Set1") +
+#  geom_line(data=df.TARGET.tidy %>% filter(Category == "Observed")) +
+  coord_cartesian(xlim=c(m1, m2), ylim=c(m1, m2)) #+
+#  geom_hline(yintercept=(0:NP)/NP)
+#+end_src
+
+#+RESULTS:
+[[file:/tmp/babel-16900S5M/figure16900T5J.png]]
+
+**** TODO 9. Output the new =rank-elements.dat= file
+
+#+header: :var dep0=exp33_target
+#+begin_src R :results output :session :exports both
+df.TARGET %>%
+  select(Rank, Target.x) %>%
+  mutate(Target.x.aux = c(0, Target.x[-n()])) %>%
+  mutate(Target = (Target.x - Target.x.aux) * NP) %>%
+  select(Rank, Target) %>%
+  write_delim("rank-elements.dat", delim=" ", col_names=FALSE)
+#+end_src
+
+#+RESULTS:
+
+**** TODO 9.1 Check which rank has the largest correction
+
+#+header: :var dep0=exp33_target
+#+begin_src R :results output :session :exports both
+t = 61
+df.TARGET %>%
+  select(Rank, Target.x) %>%
+  mutate(Target.x.aux = c(0, Target.x[-n()])) %>%
+  mutate(Target = (Target.x - Target.x.aux) * NP) %>%
+  select(Rank, Target) %>%
+  filter(Rank >= t)
+#+end_src
+
+#+RESULTS:
+: # A tibble: 4 x 2
+:    Rank    Target
+:   <int>     <dbl>
+: 1    61 0.9131876
+: 2    62 1.1703797
+: 3    63 1.0715776
+: 4    64 1.0107468
+
+#+begin_src R :results output :session :exports both
+df.INTEGRATED %>%
+  mutate(F = time / NELEM) %>%
+  mutate(diff = mean(time) - time) %>%
+  filter(Rank >= t)
+#+end_src
+
+#+RESULTS:
+: # A tibble: 4 x 5
+:    Rank     time  NELEM            F      diff
+:   <int>    <dbl>  <int>        <dbl>     <dbl>
+: 1    61 72.95337 141232 0.0005165499 -10.41747
+: 2    62 81.25564 221395 0.0003670166 -18.71973
+: 3    63 29.42375 162676 0.0001808734  33.11215
+: 4    64 66.49157 157024 0.0004234484  -3.95566
+
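+Steps 8 and 9 above both rely on the same trick: invert the
+piecewise-linear cumulative load curve at equally spaced time
+fractions, then difference the resulting cut points to get per-rank
+element shares. A minimal self-contained sketch of that idea in base
+R (the 4-rank curve below is made up for illustration, not taken from
+the experiment):
+
+#+begin_src R :results output :exports both
+# Hypothetical cumulative curve for 4 ranks (made-up numbers):
+# x = cumulative fraction of elements, y = cumulative fraction of time.
+df <- data.frame(x = c(0.20, 0.45, 0.80, 1.00),
+                 y = c(0.40, 0.60, 0.85, 1.00))
+
+# Piecewise-linear inverse of the curve (same idea as df_cum_inverse):
+# for each target time fraction, interpolate the x where it is reached.
+cum_inverse <- function(df, yval) {
+  df <- rbind(data.frame(x = 0, y = 0), df)
+  sapply(yval, function(yv) {
+    if (yv <= 0) return(0)
+    if (yv >= 1) return(1)
+    i  <- which(df$y >= yv)[1] - 1        # segment containing yv
+    dx <- df$x[i + 1] - df$x[i]
+    dy <- df$y[i + 1] - df$y[i]
+    df$x[i] + (yv - df$y[i]) * dx / dy    # linear interpolation
+  })
+}
+
+np <- 4
+targets <- cum_inverse(df, (1:np) / np)   # equal-time cut points in x
+share   <- diff(c(0, targets)) * np       # per-rank element share
+print(targets)
+print(share)
+#+end_src
+
+Differencing the cut points is exactly what the
+=mutate(Target = (Target.x - Target.x.aux) * NP)= step does on the
+real data.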
+#+begin_src R :results output :session :exports both +exp.ENRICH %>% pull(Rank) %>% min +exp.ENRICH %>% summary +exp.ENRICH %>% pull(time) %>% sd +exp.ENRICH %>% arrange(-time) +#+end_src + +#+RESULTS: +#+begin_example +[1] 1 + Rank time + Min. : 1.00 Min. : 32.55 + 1st Qu.:16.75 1st Qu.: 59.81 + Median :32.50 Median : 61.14 + Mean :32.50 Mean : 61.80 + 3rd Qu.:48.25 3rd Qu.: 63.00 + Max. :64.00 Max. :103.14 +[1] 12.66165 +# A tibble: 64 x 2 + Rank time + + 1 31 103.14403 + 2 59 97.99925 + 3 46 96.40160 + 4 44 91.13452 + 5 3 81.42977 + 6 20 77.99168 + 7 1 67.54315 + 8 36 66.69546 + 9 56 65.95448 +10 39 65.71467 +# ... with 54 more rows +#+end_example + +** Quick analysis of whether NELEM, NPOIN, NBOUN, NPOI32 could be used as a predictor :EXP26:EXP25:EXP24: +Let's load the data Lucas sent me: +#+begin_src R :results output :session *R* :exports both +suppressMessages(library(tidyverse)); +df_params = read_csv("exp24-exp25-exp26.csv.gz") +df_params +#+end_src + +#+RESULTS: +#+begin_example +Parsed with column specification: +cols( + Phase = col_integer(), + ID = col_integer(), + Rank = col_integer(), + NELEM = col_integer(), + NPOIN = col_integer(), + NBOUN = col_integer(), + NPOI32 = col_integer(), + Exp = col_character() +) +# A tibble: 704 x 8 + Phase ID Rank NELEM NPOIN NBOUN NPOI32 Exp + + 1 1 14 1 190452 37438 4546 2406 exp24 + 2 1 14 2 120468 45173 11308 3570 exp24 + 3 1 14 3 134197 43452 8856 4330 exp24 + 4 1 14 4 133436 40913 7662 3896 exp24 + 5 1 14 5 130623 42986 9030 3931 exp24 + 6 1 14 6 145601 43436 7358 4927 exp24 + 7 1 14 7 130407 44445 9289 5185 exp24 + 8 1 14 8 127508 43276 9881 3418 exp24 + 9 1 14 9 134033 41766 7851 4177 exp24 +10 1 14 10 141351 40484 6690 3872 exp24 +# ... with 694 more rows +#+end_example + +Let's load all the corresponding traces. 
+#+header: :var dep0=exp24_cumsum +#+header: :var dep1=exp25_cumsum +#+header: :var dep2=exp26_cumsum + +#+begin_src R :results output :session *R* :exports both +df.exp24.CUMSUM %>% mutate(Type = "exp24") %>% + bind_rows(df.exp25.CUMSUM %>% mutate(Type = "exp25")) %>% + bind_rows(df.exp26.CUMSUM %>% mutate(Type = "exp26")) -> df_time +df_time +#+end_src + +#+RESULTS: +#+begin_example +# A tibble: 192 x 6 + Rank time NELEM TIMESUM NELEMSUM Type + + 1 1 88.8 190452 0.0220 0.0207 exp24 + 2 2 33.1 120468 0.0302 0.0338 exp24 + 3 3 63.3 134197 0.0459 0.0483 exp24 + 4 4 35.4 133436 0.0546 0.0628 exp24 + 5 5 37.2 130623 0.0639 0.0770 exp24 + 6 6 42.2 145601 0.0743 0.0928 exp24 + 7 7 61.3 130407 0.0895 0.107 exp24 + 8 8 62.4 127508 0.105 0.121 exp24 + 9 9 72.4 134033 0.123 0.135 exp24 +10 10 59.6 141351 0.138 0.151 exp24 +# ... with 182 more rows +#+end_example + +#+begin_src R :results output :session *R* :exports both +df_time %>% select(-NELEM) %>% + left_join(df_params %>% filter(Phase == 1), + by=c("Type" = "Exp", "Rank" = "Rank")) -> df_merged_lm +df_merged_lm %>% group_by(Type) %>% + summarize(time.mean = mean(time), time.max = max(time), time.min = min(time)) +#+end_src + +#+RESULTS: +: # A tibble: 3 x 4 +: Type time.mean time.max time.min +: +: 1 exp24 63.1 105 30.5 +: 2 exp25 62.0 102 35.7 +: 3 exp26 61.8 103 32.5 + +Then using information from +https://stackoverflow.com/questions/1169539/linear-regression-and-group-by-in-r + +#+begin_src R :results output :session *R* :exports both +library(broom) +df_merged_lm %>% group_by(Type) %>% + do(model = lm(time ~ (NELEM + NPOIN + NBOUN + NPOI32), data = .)) -> reg +#+end_src + +#+RESULTS: + +Let's have a look at the R^2: +#+begin_src R :results output :session *R* :exports both +reg %>% glance(model) +#+end_src + +#+RESULTS: +: # A tibble: 3 x 12 +: # Groups: Type [3] +: Type r.squared adj.r.squared sigma statistic p.value df logLik AIC +: +: 1 exp24 0.368 0.325 12.9 8.58 0.0000158 5 -252 515 +: 2 exp25 0.315 0.268 
12.4 6.77 0.000148 5 -250 511 +: 3 exp26 0.196 0.142 11.7 3.60 0.0108 5 -246 504 +: # ... with 3 more variables: BIC , deviance , df.residual + +Yuck! Do we have significant parameters? +#+begin_src R :results output :session *R* :exports both +reg %>% tidy(model) %>% mutate(signif=ifelse(p.value<.001, "***", + ifelse(p.value<.01, "**", + ifelse(p.value<.05, "*", + ifelse(p.value<.1, ".", ""))))) +#+end_src + +#+RESULTS: +#+begin_example +# A tibble: 15 x 7 +# Groups: Type [3] + Type term estimate std.error statistic p.value signif + + 1 exp24 (Intercept) 47.9 11.4 4.20 0.0000919 *** + 2 exp24 NELEM 0.000588 0.000150 3.93 0.000223 *** + 3 exp24 NPOIN - 0.00212 0.000849 -2.50 0.0153 * + 4 exp24 NBOUN 0.00218 0.00169 1.30 0.200 "" + 5 exp24 NPOI32 0.00110 0.00255 0.431 0.668 "" + 6 exp25 (Intercept) 34.9 9.54 3.66 0.000540 *** + 7 exp25 NELEM - 0.000213 0.000189 -1.13 0.265 "" + 8 exp25 NPOIN 0.00183 0.000949 1.92 0.0592 . + 9 exp25 NBOUN - 0.00214 0.00181 -1.18 0.243 "" +10 exp25 NPOI32 - 0.000908 0.00253 -0.358 0.721 "" +11 exp26 (Intercept) 32.8 8.53 3.84 0.000303 *** +12 exp26 NELEM 0.000235 0.000135 1.74 0.0863 . +13 exp26 NPOIN - 0.000350 0.000755 -0.464 0.645 "" +14 exp26 NBOUN 0.00104 0.00150 0.693 0.491 "" +15 exp26 NPOI32 0.000378 0.00216 0.175 0.862 "" +#+end_example + +Not really. :( Let's have a look at the residuals to check whether +they are structured or not. 
+#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 1600 :height 400 :session *R* +reg %>% augment(model) %>% ggplot(aes(x=.fitted,y=.resid)) + facet_wrap(~Type) + geom_smooth(color="red") + geom_point() +#+end_src + +#+RESULTS: +[[file:/tmp/babel-18292gz_/figure182927Vd.png]] + +** 10-node (160 cores) rennes parasilo, many rounds (after MN4) :EXP34: +*** Execution script + +#+begin_src shell :results output +# Configuration part +export STEPS=10 +export TIMESTEPS=5 +export EXPEKEY=exp_34_parasilo_10_manual +export EXPEDIR=${HOME}/${EXPEKEY} +export MACHINEFILE=$HOME/parasilo +export CASEDIR=$HOME/WORK-RICARD/resp_sfc/ +export CASENAME=fensap +export ALYA=$HOME/alya-bsc-sfc/Executables/unix/Alya.x +rm -rf $EXPEDIR +mkdir -p $EXPEDIR +export GITREPO=$HOME/Alya-Perf/ + +# Control part +cat $MACHINEFILE | uniq > /tmp/mf-uniq +$GITREPO/scripts/control_experiment.sh /tmp/mf-uniq $EXPEDIR + +# Use spack +source ~/spack-ALYA/share/spack/setup-env.sh +# Use the correct mpicc/mpif90/mpirun +export PATH=$(spack location -i openmpi)/bin:$PATH +# Use the cube_dump for the refinement workflow +export PATH=$(spack location -i cube)/bin:$PATH + +pushd $CASEDIR +sed -i "s/NUMBER_OF_STEPS=.*$/NUMBER_OF_STEPS=${TIMESTEPS}/" fensap.dat +for NP in 160; do + + # Generate initial rank-elements.dat to case + for i in $(seq 1 $(echo ${NP} - 1 | bc)); do + echo $i 1 + done > $CASEDIR/rank-elements.dat + +for RUN in $(seq 1 ${STEPS}); do + RUNKEY="RUN_${NP}_STEP_${RUN}_of_${STEPS}" + SCOREPDIR="scorep-${RUNKEY}" + rm -rf $SCOREPDIR + echo $RUNKEY + + # Run the program + $(which mpirun) \ + --mca btl_base_warn_component_unused 0 \ + --bind-to core:overload-allowed \ + --report-bindings \ + -x SCOREP_TOTAL_MEMORY=3900MB \ + -x SCOREP_MPI_ENABLE_GROUPS=ALL \ + -x SCOREP_ENABLE_TRACING=FALSE \ + -x SCOREP_ENABLE_PROFILING=TRUE \ + -x SCOREP_EXPERIMENT_DIRECTORY=$SCOREPDIR \ + -x LD_LIBRARY_PATH=$(spack location -i openmpi)/lib/ \ + -np ${NP} 
\ + -machinefile $MACHINEFILE \ + $ALYA $CASENAME + + # Save the data into the SCOREPDIR + cp $MACHINEFILE $SCOREPDIR + cp *.log ${CASENAME}.dat $SCOREPDIR + mv domain-*.csv iterations-*.csv $SCOREPDIR + cp rank-elements.dat $SCOREPDIR + + # Run the refinement workflow + pushd $SCOREPDIR + cp $GITREPO/Refinement_Workflow.org . + emacs -batch -l ~/.emacs.d/init.el \ + --eval "(setq vc-follow-symlinks nil)" \ + --eval "(setq ess-ask-for-ess-directory nil)" \ + --eval "(setq org-export-babel-evaluate t)" \ + Refinement_Workflow.org \ + --funcall org-babel-execute-buffer + + # Compress data (after workflow) + gzip *.csv + + # Adjust rank-elements_next.dat + NRANKS=rank-elements_next.dat + cp $NRANKS $CASEDIR/rank-elements.dat + popd + + # Move all data to EXPEDIR for archiving purposes + mv $SCOREPDIR $EXPEDIR +done +done > execution.log +mv execution.log $EXPEDIR +popd +#+end_src +*** Data Transformation +**** Read one DIR function + +#+name: exp34_read_one_dir +#+begin_src R :results output :session :exports both +suppressMessages(library(tidyverse)); +read_one_DIR <- function(DIR) { + # Get basic info about the experience + tibble(Dir=DIR) %>% + separate(Dir, into=c("X", "Case"), sep="/") %>% select(-X) %>% + separate(Case, into=c("X1", "NP", "X3", "Seq", "X5", "X6"), sep="_") %>% select(-X1, -X3, -X5, -X6) -> basic.info; + + print(basic.info); + + # 2. 
Transform profile.cubex + PROFILE.CUBEX = paste0(DIR, "/profile.cubex"); + PROFILE.CSV = paste0(DIR, "/profile.csv"); + REGIONSCODES.CSV = paste0(DIR, "/regions-codes.csv.gz"); + CUBEINFO.TXT = paste0(DIR, "/cube_info.txt"); + CUBEINFO.CSV = paste0(DIR, "/cube_info.csv.gz"); + + CUBE.DUMP = "/home/schnorr/install/stow/bin//cube_dump" + + system2(CUBE.DUMP, args=paste("-c all -m all -s csv2", PROFILE.CUBEX, ">", PROFILE.CSV)); + system2(CUBE.DUMP, args=paste("-w", PROFILE.CUBEX, "| tail -n+34 | head -n283 | tr '(' ';' | sed -e \"s/[[:space:]]*//g\" -e \"s/,.*$//\" -e \"s/id=//\" -e \"s/:[^;]*;/;/\" > ", REGIONSCODES.CSV)); + system2(CUBE.DUMP, args=paste("-w", PROFILE.CUBEX, ">", CUBEINFO.TXT)); + system2("./scripts/cube_calltree.pl", args=DIR); + + # 1. Read Iteration Timings + if(FALSE){ + do.call("bind_rows", lapply(list.files(DIR, pattern="iterations-.*csv.gz$", full.names=TRUE), + function(file) { + read_table(file, col_names=FALSE) %>% + rename(Time.User = X1, + Time.System = X2, + Time.Run = X3, + Rank = X4, + Iteration = X5) %>% + filter(Iteration != 0) %>% + group_by(Rank, Iteration) %>% + summarize(Start = min(Time.User), + End = max(Time.User)) %>% + group_by(Rank) %>% + mutate(End = End - min(Start), + Start = Start - min(Start)) + })) -> exp.iter; + } + + # 4. Enrich the call tree + df.PROF <- read_csv(PROFILE.CSV); + + exp.REGION <- read_delim(REGIONSCODES.CSV, col_names=FALSE, delim=";") %>% + rename(ID = X2, Name = X1); + + exp.ENRICH <- read_delim(CUBEINFO.CSV, delim=",", col_names=FALSE) %>% + rename(Phase = X1, Code = X2, ID = X3) %>% + left_join(df.PROF, by=c("ID" = "Cnode ID")) %>% + rename(Rank = `Thread ID`) %>% + filter(Code == "Computation") %>% + mutate(Phase = as.integer(Phase)) %>% + filter(Rank != 0) %>% + mutate(NP = basic.info$NP, + Seq = basic.info$Seq) + + return(exp.ENRICH); + + # 5. 
The number of elements on T1 + NOELEMENTS.CSV = "no_elements.csv"; + system2("./scripts/no_elements_domain.sh", args=paste(DIR, NOELEMENTS.CSV)); + exp.ELEMENTS <- read_csv(paste0(DIR, "/", NOELEMENTS.CSV)) %>% + filter(Rank != 0) + + # Calculate the average + exp.ENRICH %>% + filter(Phase <= 9) %>% + group_by(Rank) %>% + summarize(time = mean(time)) -> exp.ENRICH.AVERAGE; + + # 6. Integrate number of elements with compute cost + exp.ENRICH.AVERAGE %>% + left_join(exp.ELEMENTS) %>% + select(Rank, time, NELEM) -> df.INTEGRATED; + + # 7. Do the cumsum + df.INTEGRATED %>% + arrange(Rank) %>% + mutate(TIMESUM = cumsum(time)/sum(time)) %>% + mutate(NELEMSUM = cumsum(NELEM)/sum(NELEM)) -> df.CUMSUM; +} +#+end_src + +#+RESULTS: exp34_read_one_dir + +**** Read ALL DIRs + +#+begin_src R :results output :session :exports both +do.call("bind_rows", lapply(list.files("exp_34_parasilo_10_manual", pattern="scorep", full.names=TRUE), + function(case) { read_one_DIR(case); } + )) -> exp34.ALL; +#+end_src + +#+RESULTS: +#+begin_example +# A tibble: 1 x 2 + NP Seq +,* +1 160 1 +exp_34_parasilo_10_manual/scorep-RUN_160_STEP_1_of_10/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +# A tibble: 1 x 2 + NP Seq +,* +1 160 10 +exp_34_parasilo_10_manual/scorep-RUN_160_STEP_10_of_10/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` 
= col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +# A tibble: 1 x 2 + NP Seq +,* +1 160 2 +exp_34_parasilo_10_manual/scorep-RUN_160_STEP_2_of_10/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +# A tibble: 1 x 2 + NP Seq +,* +1 160 3 +exp_34_parasilo_10_manual/scorep-RUN_160_STEP_3_of_10/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + 
bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +# A tibble: 1 x 2 + NP Seq +,* +1 160 4 +exp_34_parasilo_10_manual/scorep-RUN_160_STEP_4_of_10/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +# A tibble: 1 x 2 + NP Seq +,* +1 160 5 +exp_34_parasilo_10_manual/scorep-RUN_160_STEP_5_of_10/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +# A tibble: 1 x 2 + NP Seq +,* +1 160 6 +exp_34_parasilo_10_manual/scorep-RUN_160_STEP_6_of_10/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = 
col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +# A tibble: 1 x 2 + NP Seq +,* +1 160 7 +exp_34_parasilo_10_manual/scorep-RUN_160_STEP_7_of_10/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +# A tibble: 1 x 2 + NP Seq +,* +1 160 8 +exp_34_parasilo_10_manual/scorep-RUN_160_STEP_8_of_10/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + 
bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +# A tibble: 1 x 2 + NP Seq +,* +1 160 9 +exp_34_parasilo_10_manual/scorep-RUN_160_STEP_9_of_10/cube_info.csv +Parsed with column specification: +cols( + `Cnode ID` = col_integer(), + `Thread ID` = col_integer(), + visits = col_integer(), + time = col_double(), + min_time = col_double(), + max_time = col_double(), + bytes_put = col_integer(), + bytes_get = col_integer(), + ALLOCATION_SIZE = col_integer(), + DEALLOCATION_SIZE = col_integer(), + bytes_leaked = col_integer(), + maximum_heap_memory_allocated = col_integer(), + bytes_sent = col_integer(), + bytes_received = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_integer() +) +Parsed with column specification: +cols( + X1 = col_character(), + X2 = col_character(), + X3 = col_integer() +) +#+end_example + +**** ALL Rough initial LB Analysis + +#+begin_src R :results output graphics :file img/exp34_ALL.png :exports both :width 1000 :height 400 :session +exp34.ALL %>% + mutate(NP = as.integer(NP), + Seq = as.integer(Seq)) %>% + group_by(NP, Seq, Phase, Rank, Code) %>% + summarize(Sum = sum(time)) %>% + ungroup() %>% + filter(!is.na(Phase)) %>% + ggplot(aes(x=as.integer(Phase), y=Sum, color=Rank)) + + geom_line(aes(group=Rank), alpha=.1) + + geom_point() + + ylim(0,NA) + + theme_bw(base_size=18) + + facet_wrap(~Seq, nrow=2) +#+end_src + +#+RESULTS: +[[file:img/exp34_ALL.png]] + diff --git a/module2/ressources/video_examples/labbook_single.org b/module2/ressources/video_examples/labbook_single.org new file mode 100644 index 0000000..4589c1f --- /dev/null +++ b/module2/ressources/video_examples/labbook_single.org @@ -0,0 +1,6906 @@ +# -*- coding: utf-8 -*- +#+TITLE: Research journal +#+LANGUAGE: EN +#+CATEGORY: INRIA +#+STARTUP: overview 
indent inlineimages logdrawer hidestars
+#+TAGS: [ SIMGRID : SMPI(s) ]
+#+TAGS: [ PROGRAMMING : C(C) CPP(c) PYTHON(p) R(r) ]
+#+TAGS: [ TOOLS : ORGMODE(o) GIT(g) GDB ]
+#+TAGS: [ EXPERIMENTS(x) : EXP_SETUP EXP_EXEC EXP_RESULT EXP_ANALYSIS ]
+#+TAGS: TRACING(t) PERFORMANCE(X) PROFILING(R) BUG(b) PAPER(P) HPL(h) MEETING(m) G5K(G) VALIDATION(v) REPORT(V)
+#+TAGS: STAMPEDE(S) CBLAS(B)
+#+LOGGING: lognoterepeat
+#+SEQ_TODO: TODO(t!) STARTED(s!) WAITING(w@) APPT(a!) CANCELLED(c@) DEFERRED(f@) | DONE(d!)
+#+SEQ_TODO: UNREAD | READ
+#+SEQ_TODO: GOOD(g!) CRITICISM INTERESTING(w!) INVESTIGATE PROPOSAL
+#+SEQ_TODO: QUESTION(q!) | RESOLVED(r!)
+
+* 2017
+** 2017-02 February
+*** 2017-02-06 Monday
+**** DONE Read [[file:5218a011.pdf][An Evaluation of Network Architectures for Next Generation Supercomputers]] :PAPER:
+Bibtex: Chen:2016:ENA:3019057.3019059
+- The authors roughly do what we want to do: they use a simulator to evaluate the performance of different topologies,
+  with different workloads and routing algorithms.
+- In a first part, they describe these topologies, routing algorithms and workloads. This could give us some
+  ideas of what to test. Maybe we could try to reproduce their results?
+- They focus on topologies that:
+  + Offer full uniform bandwidth.
+  + Have good partitionability and can be grown modularly.
+  + Come at a lower cost than a 3-level fat tree (which is the state of the art in terms of pure performance).
+- They test adversarial traffic (task i sends to task (i+D) mod G, tuned to “be bad”).
+  + The fat tree has great performance, regardless of the routing algorithm.
+  + Other topologies (Dragonfly+, Stacked all-to-all, Stacked 2D HyperX) have terrible performance with direct
+    routing. With indirect or adaptive routing, performance is much better (but still a factor of 2 lower than the fat
+    tree).
+- Then, they test neighbor traffic (the logical topology is a grid, for instance).
+  + Again, the fat tree has nearly full performance, regardless of the routing algorithm.
+  + Other topologies have lower performance with indirect routing. Their performance is ok with direct or adaptive
+    routing.
+- Next, they look at AMR.
+  + Here, all topologies and routing algorithms perform poorly.
+  + The average throughput is high at the beginning, but decreases very quickly to nearly 0. This long tail with low
+    throughput accounts for the major part of the execution time.
+  + Thus, AMR seems to be inherently bad for parallelism.
+- To sum up, the best routing algorithm is adaptive routing (except maybe for the fat tree), and the best topology is
+  the fat tree.
+- The authors then had a look at random mappings of the processes to the nodes (until now, the mapping was ideal). This
+  could reflect what a scheduler that is not topology-aware would do. In general, with adaptive routing, the Fat
+  Tree and the Dragonfly+ are very robust to irregular placements: the completion time is not impacted too much. This is
+  not the case for stacked topologies (due to a lack of path diversity). Thus, we should use a topology-aware job
+  scheduler, especially for stacked topologies. With non-adaptive routing, all the topologies suffer from performance
+  degradation.
+**** DONE Talk with Arnaud about the internship. :MEETING:
+***** Two possible things to have a look at.
+- Simulate the impact of network failures on performance.
+  + May need to work on the Simgrid implementation, to handle failures.
+  + A recent paper has shown that, in their case, removing one of the two root switches of their fat tree did not
+    significantly impact performance.
+  + A reason is that jobs rarely occupy the full tree; they are localized in one of its sub-trees. Thus, nearly no
+    communication goes through the top switches.
+- Model the [[https://www.tacc.utexas.edu/stampede/][Stampede]] supercomputer in Simgrid.
+  + It uses a fat tree topology.
+  + We have access to real benchmark results.
+  + We have access to its configuration (e.g. OS and compiler used).
+***** A first step for this internship would be to run HPL on a fat tree, with Simgrid.
+***** Some features of Simgrid to speed up a simulation.
+- A macro to only run the first steps of a loop and infer the total time from it.
+- An allocator (replacing malloc/free) to share memory between processes.
+***** Some technical details about Simgrid
+- For every process, we run each piece of code until we reach an MPI operation. This gives us the execution time of this
+  code block.
+- We know all the communication flows of the “current step”, thanks to the routing.
+  We thus have a list of linear constraints (e.g. the bandwidth of all flows going
+  through the same link should not exceed the capacity of this link). We solve this
+  by maximizing the minimum bandwidth of any flow (empirically, this is close to
+  reality, where flows get a fair share of the resources).
+- Routing is done with an AS hierarchy. There are local routing decisions (within an
+  AS) and global routing decisions (between two ASes).
+***** There exist other simulators.
+Mainly CODES/ROSS. These are discrete event simulators, so they consider the problem at a lower level. But being too
+precise has some drawbacks:
+- The exact version of every piece of code can have a noticeable impact → tedious to calibrate.
+- The simulation takes much more time and does not scale as well as Simgrid.
+*** 2017-02-07 Tuesday
+**** DONE Begin writing a journal \o/
+**** DONE Read [[file:8815a909.pdf][Characterizing Parallel Scientific Applications on Commodity Clusters: An Empirical Study of a Tapered Fat-Tree]] :PAPER:
+Bibtex: Leon:2016:CPS:3014904.3015009
+- The authors want to characterize the behavior of applications that run on their clusters, with an emphasis on
+  communication requirements.
+- This should help to make more informed choices when building new clusters (should we use our budget to get more links
+  or more nodes?).
+- They measured the utilization of their cluster during one week. It has a fat tree topology. The measurements show
+  that the network is not used very much: the maximal link utilization is approximately 50%, the average link
+  utilization is 2.4%.
+- They took the same measurements with a tapered fat tree (they removed one of the root switches). Except for some
+  outliers reaching 90% link utilization at some point, this modification did not have a major impact on the link
+  utilization, which was 3.9%.
+- The authors recorded which types of jobs were submitted. The great majority of them were really small: 95% of jobs
+  have at most 16 nodes, 76% have only one node. Jobs of less than 64 nodes consume 75% of the time. Thus, if the jobs
+  are well placed, the need for distant communications is very low, which explains the good performance of the tapered
+  fat tree. Of course this may change from one cluster to another, so we should reproduce these measurements and draw
+  our own conclusions.
+- Then, the authors removed one of their two top switches.
+- A first micro-benchmark shows that it only impacts the aggregate bisection bandwidth, for large messages (> 32kB).
+- Then, they evaluated the impact of the tapering on the performance of several “real-life” applications.
+- They found that only one of these applications was sensitive to the tapering. This application does collective
+  communications as well as point-to-point communications of large messages.
+- However, the impact on the execution time of this application remains small: only 1-2% (it impacts its communication
+  time by 6-7.5%, which itself accounts for only 9-15%). Furthermore, this only happens for a large number of nodes (>
+  512).
+- Finally, the authors claim that next generation hardware (faster CPUs, memory and network, accelerators...) will lead
+  to some rewriting of the applications to leverage this new hardware. In some applications, message sizes will be
+  larger. Thus, a tapered fat tree may have more impact with this new hardware; new experiments will be needed to
+  find out.
+**** DONE Some thoughts regarding previous paper, to discuss with Arnaud :PAPER:MEETING:
+:LOGBOOK:
+- State "DONE" from "TODO" [2017-03-03 Fri 17:51]
+:END:
+***** Can we have data about the utilization of clusters we will work on (Stampede, Bull)?
+- It would help us to find relevant hypotheses (e.g. “pruning the fat-tree will not have any impact”).
+- We need this for the simulation. What kind of jobs should we run? Small ones? Large ones?
+***** Can we have data about the nature of the jobs submitted on these clusters?
+- What are these applications?
+- What fraction of the time do they use for communications?
+- Small or large messages?
+- Again, it will help us to make hypotheses and perform meaningful experiments.
+\to It changes a lot from one cluster to another, or even across time. It is also hard to record (a batch scheduler does
+not know the nature of the jobs that it handles).
+***** How to simulate “big nodes”?
+- Can we simulate MPI+OpenMP programs with Simgrid?
+- The paper from Christian explains briefly how Simgrid simulates multi-core machines (with one MPI process per core, no
+  threads). Why don't they talk about it in the other paper? Both papers are from the same year.
+\to It would be very hard to support OpenMP in Simgrid, the standard is quite big. Also, in OpenMP, communications are
+made with shared memory, so they are much harder to track than MPI communications.
+***** Are time-independent traces larger than the “classical” traces?
+\to No, same thing.
+***** With ScalaTrace, traces have “near-constant size”. How?
+\to Compression, lossless or lossy.
+***** What is “detached mode” in point-to-point communication?
+- Does the OS of the sender interrupt it, to ask it to send the data?
+- If so, why is the large-message mode slower for the sender? In detached mode, the sender has to stop what it is doing,
+  whereas in synchronous mode it is waiting.
+\to Yes, the kernel interrupts the sender when the receiver is ready. Simgrid does not model the small messages used for
+the synchronization.
+***** What does the community think of closed-source simulators, like xSim? Researchers behind xSim are making strong claims that cannot be verified by independent researchers...
+***** Why are there never confidence intervals in the plots of Simgrid's papers?
+\to They are often not needed, because the variation is too small.
+***** About the paper Git/Org-mode
+- Is there an implementation somewhere? Creating custom git commands seems [[http://thediscoblog.com/blog/2014/03/29/custom-git-commands-in-3-steps/][really easy]].
+  \to Yes, but not packaged yet. To test the “beta version”, ask Vincent.
+- I was not convinced by subsection 4.3.3 (“fixing code”). When terminating an experiment, they revert all the changes
+  made to the source code, since these may be ad hoc changes. Then the user has to cherry-pick the changes (s)he wants
+  to keep. Sounds really dirty... It seems better to have generic scripts that you configure by giving command line
+  arguments and/or configuration files. Then you can simply put these arguments/files in the journal.
+***** Routing in Simgrid (according to the doc)
+- Routing tables are static (to achieve high performance). → Does that mean that handling link failures and dynamic
+  re-routing will require a large code refactoring? What about the performance penalty?
+- Routing algorithms are either based on shortest paths (e.g. Floyd, Dijkstra) or manually entered. What about
+  “classical” algorithms like D-mod-K? An example is provided on [[https://github.com/simgrid/simgrid/blob/master/examples/platforms/cluster_fat_tree.xml][Github]]. The example implements a two-level fat tree with
+  D-mod-K. However, D-mod-K is not specified in the XML; it seems to be implicit. Does that mean that we are forced to
+  use this routing algorithm for fat trees?
+\to Read the code. Shortest-path routing is a feature introduced by some Belgian researchers. For specific topologies
+like fat trees, the routing algorithm is hard-coded.
+**** DONE Read [[file:hal-01415484.pdf][Simulating MPI applications: the SMPI approach]] :PAPER:
+Bibtex: degomme:hal-01415484
+- This paper is about the simulation of HPC systems.
+- The authors claim that some research papers are based on simulations made with one-off programs with poor
+  documentation, making simplifying assumptions. Worse, these programs are sometimes not public. This is a big issue for
+  reproducibility.
+- The whole paper considers several important aspects that a good simulator should take care of.
+- Several use cases for simulation.
+  + Quantitative performance evaluation (what will the performance be if we take a bigger version of our hardware?).
+  + Qualitative performance evaluation (what will the performance be if we take different hardware?).
+  + Detection of hardware misconfiguration (leading to unexpected performance behaviors).
+  + MPI runtime tuning (e.g. choosing the algorithms of MPI collective operations).
+  + Teaching (supercomputers are expensive, we cannot let the students play with them).
+***** Capturing the behavior of an application.
+- Off-line simulation. A trace of MPI communication events is first obtained and then replayed.
+  + We measure the durations of the CPU bursts. Then, when replaying the application, we modify them to account for the
+    performance differences between the target platform and the platform used to get the traces.
+  + One problem is the size of the traces, which can be very large. To fix this, we may only record aggregated
+    statistics. These can be enough to detect some anomalies, but we cannot do more in-depth analysis.
+  + Another issue is extrapolation. Being able to extrapolate in the general case requires assumptions that are hard to
+    justify.
+  + In SMPI, they use “time-independent traces”. Instead of recording time durations, they log the number of
+    instructions and the number of bytes transferred by MPI primitives. These are independent of the hardware, so the
+    extrapolation issue is fixed.
+  + It does not solve anything for applications that adapt their behavior to the platform. But this is hopeless with
+    off-line simulation.
+  + There is still the issue of very large traces: they grow linearly with the problem size and the number of
+    processes. This seems to be fixed by ScalaTrace, but no explanation is given.
+- On-line simulation. The actual application code is executed, and part of the instruction stream is intercepted and
+  passed to a simulator.
+  + Several challenges: intercepting MPI calls, interactions between the application and the simulation kernel,
+    obtaining full coverage of the MPI standard, over-subscribing resources.
+  + Several possibilities to capture MPI calls. Use the PMPI interface (provided by every MPI implementation), but this
+    is limited to the high-level calls. Design a specific MPICH or OpenMPI driver, but this ties the solution to a
+    specific implementation. One can also develop an ad-hoc implementation of the MPI standard.
+  + Many tools fold the application into a single process with several threads. This raises an issue for global
+    variables: they must be protected. One can duplicate the memory area of the global variables, or use a trick based
+    on the Global Offset Table (GOT).
+  + SMPI is based on a complete reimplementation of the MPI standard. No full coverage yet (e.g. remote memory access or
+    multithreaded MPI applications).
+  + They run MPICH's internal compliance tests as part of their automatic testing.
+  + To protect global variables, they duplicate their memory zone using mmap (a smart trick, much more efficient thanks
+    to COW).
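The mmap/COW trick can be illustrated in isolation. This is only a hedged sketch, not SMPI's actual implementation (which remaps the executable's data segment in C); it merely demonstrates the copy-on-write semantics of MAP_PRIVATE mappings (Unix-only) that make per-rank copies of the global variables cheap:

```python
import mmap
import os
import tempfile

# A file standing in for the application's global-variable segment.
fd, path = tempfile.mkstemp()
os.write(fd, b"initial-globals!" * 256)   # 16 * 256 = 4096 bytes
size = 4096

# One MAP_PRIVATE (copy-on-write) mapping per simulated rank: pages are
# physically shared until a rank writes to them, which keeps the memory
# overhead low while still isolating each rank's globals.
rank0 = mmap.mmap(fd, size, flags=mmap.MAP_PRIVATE)
rank1 = mmap.mmap(fd, size, flags=mmap.MAP_PRIVATE)

rank0[0:5] = b"RANK0"        # rank 0 updates "its" globals
print(rank0[0:5])            # b'RANK0'
print(rank1[0:5])            # b'initi'  (rank 1's view is untouched)
```

SMPI does this for the whole data segment and switches the mapping at each simulated context switch, so every rank sees private globals while sharing unmodified pages.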
***** Modeling the infrastructure (network and CPU)
+- Network modeling.
+  + Several solutions exist to model the network.
+  + Packet-level simulation: here, we look at individual packets. It is very precise, but it is hard to know precisely
+    what we are modeling. Being precise with a wrong model is useless. Moreover, this model is very costly in terms of
+    simulation time.
+  + Flow model. The finest grain here is the communication. Time to transfer a message of size S from i to j:
+    L_i,j + S/B_i,j, where B_i,j is the bandwidth allotted to the flow. The B_i,j are not constant; they need to be
+    re-evaluated at every moment. This model catches some complex behaviors (e.g. RTT unfairness of TCP). It is quite
+    complex to implement and more costly than the delay model. Also, until recently, contention could be neglected.
+  + Delay model: some equations describe the communication times (e.g. LogP, LogGPS). It is elegant and cheap in terms
+    of simulation time, but very imprecise. It does not take into account the network topology (and possible contention)
+    and supposes a processor can only send one message at a time (single-port model).
+  + SMPI uses a hybrid network model. Point-to-point communications are divided into three modes: asynchronous, detached
+    and synchronous. Each mode has different values of bandwidth and latency, estimated by doing some benchmarks and
+    then a linear regression.
+  + To model network contention, SMPI has three logical links for any physical link: a downlink, an uplink, and a
+    limiter link. The bandwidth of uploads (resp. downloads) must be lower than the capacity of uplinks
+    (resp. downlinks). The sum of the bandwidths must be lower than the capacity of the limiter link.
+- CPU modeling.
+  + Like network modeling, several solutions.
+  + Microscopic models: very precise, but also very costly.
+  + Models with a coarser grain. For instance, we neglect the CPU load induced by communications → focus on Sequential
+    Execution Blocks (SEB).
+  + The most simplistic model: “CPU A is x times faster than CPU B”. Results are ok for similar architectures, but not
+    precise at all if they are too different (for instance in the number of registers, number of hyperthreading cores,
+    speed of floating point computations, bandwidth to memory, etc.).
+  + Thus, it is impossible to predict precisely without a perfect knowledge of the system state (and therefore a
+    microscopic model).
+  + Approach of SMPI: run the SEB on a processor of the target architecture. Predict the performance of *similar*
+    architectures by applying a constant factor.
+  + Also, not all the code logic is data dependent. We can therefore greatly decrease the simulation time with two
+    tricks.
+    - Kernel sampling. Annotate some regions with macros. Execute them only a few times to obtain estimations, then skip
+      them.
+    - Memory folding. Share some data structures across processes.
+***** Modeling the collective operations
+- Again, several solutions for the modeling.
+- More analytical ones: each collective operation has a cost equation (depending for instance on the message size and
+  the number of processes). As discussed for the network modeling, such approaches do not capture possible network
+  contention.
+- Another approach is to benchmark each collective operation on the target platform, with various parameters and
+  communicators. Then, the obtained timings are reinjected into the simulation. We cannot do performance extrapolation
+  with this approach. Also, the benchmarking phase may be very long.
+- Some replace every collective operation with the corresponding sequence of point-to-point communications (at compile
+  time). This does not capture the logic of selecting the right algorithm.
+- Others capture this decomposition into point-to-point communications during the execution, then replay it. But this is
+  limited to off-line analysis.
+- Simgrid implements all the collective algorithms and selection logic of both OpenMPI and MPICH.
We are sure to
+  capture the behavior of the operations correctly, but this is a substantial amount of work. Another interesting
+  feature is that the user can choose the selector or the algorithm from the command line.
+***** Efficient simulation engine
+- Rely on an efficient Discrete Event Simulation (DES) kernel.
+- Some simulators parallelized this part (using MPI). But this results in a more complex implementation.
+- In the way Simgrid works, there is not much potential parallelism. They therefore decided to keep a sequential DES.
+- The simulation cost comes from the application itself (which can be greatly reduced, cf. CPU modeling) and from the
+  flow-level model.
+***** Evaluation
+Here, the authors show that the use cases mentioned at the beginning of the paper are all realized by Simgrid.
+- Simgrid is very scalable, more than xSim, which is already one of the most scalable simulators (self-proclaimed).
+- Kernel sampling and memory folding enable simulations of non-trivial applications with a very large number of cores.
+- Then, the ability to make good predictions is demonstrated with a Mont-Blanc project example. Here, Simgrid is much
+  closer to reality than the LogGPS model. However, no comparison is done with other simulators, so this result is hard
+  to evaluate.
+- A quantitative performance extrapolation is demonstrated, showing good results.
+- Empirically, the largest error made by SMPI in terms of time prediction is 5%. This allows using SMPI to detect
+  hardware misconfigurations. Indeed, it already happened to the Simgrid team.
+- Similarly to the previous point, the good accuracy of SMPI allows investigating which MPI parameters lead to the best
+  performance.
+- Finally, for obvious reasons, using a simulator is great for teaching MPI (rather than using a real cluster).
+***** Conclusion
+- The paper focused on MPI applications.
+- But Simgrid has other use cases: formal verification of HPC applications, hybrid applications (MPI+CUDA).
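A side note on the fluid model in these notes: the bandwidth-sharing step (maximize the minimum bandwidth of any flow, subject to per-link capacity constraints) can be sketched with the classical progressive-filling algorithm. This is only an illustration of max-min fairness, not Simgrid's actual solver (which additionally handles latencies, weights, and the uplink/downlink/limiter links described above); the flow and link names are made up:

```python
def max_min_share(flows, capacity):
    """Progressive filling: raise every unfrozen flow's rate at the same
    speed; when a link saturates, freeze the flows crossing it.
    flows: {flow name: set of links it crosses}
    capacity: {link name: capacity (e.g. in Gbit/s)}"""
    rate = {f: 0.0 for f in flows}
    remaining = dict(capacity)
    active = set(flows)
    while active:
        # Smallest equal increment that saturates some link in use.
        inc = min(remaining[l] / sum(l in flows[f] for f in active)
                  for l in remaining
                  if any(l in flows[f] for f in active))
        for f in active:
            rate[f] += inc
        for l in remaining:
            remaining[l] -= inc * sum(l in flows[f] for f in active)
        saturated = {l for l in remaining if remaining[l] < 1e-9}
        active = {f for f in active if not (flows[f] & saturated)}
    return rate

# Hypothetical platform: flows A and B share link "up";
# B also crosses a slower link "down".
shares = max_min_share({"A": {"up"}, "B": {"up", "down"}},
                       {"up": 1.0, "down": 0.25})
print(shares)   # {'A': 0.75, 'B': 0.25}
```

Once "down" saturates, B is frozen at 0.25 and the leftover capacity of "up" goes to A, which is exactly the "maximize the minimum bandwidth" behavior described in the meeting notes above.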
**** DONE Read [[file:hal-01446134.pdf][Predicting the Performance and the Power Consumption of MPI Applications With SimGrid]] :PAPER:
+Bibtex: heinrich:hal-01446134
+- The paper is about using Simgrid to predict energy consumption.
+- This is a challenging question; the modeling is tricky.
+  + The power consumption of nodes has a static part (consumption of the node when idle) and a dynamic part.
+  + The static part is very significant (~50%), so we should really do something when the core is idle.
+  + A first solution is to power off the node, but the latency to power it back on is large.
+  + Another solution is to use Dynamic Voltage and Frequency Scaling (DVFS). This is not limited to the case where the
+    core is idle; it can also be used when the load is low but non-null. The performance loss is linear in the decrease
+    of the frequency, but the power consumption is quadratic.
+  + No other HPC simulator than Simgrid embeds a power model yet.
+***** Modeling multi-core architecture
+- If two processes are in the same node (either on the same core, or on two cores of the same CPU), the simulation
+  becomes tricky.
+  + The “naive” approach is to simply give a fair share to each of these processes. But this does not take into account
+    some memory effects.
+  + Simgrid can be pessimistic for processes heavily exploiting the L1 cache. In the simulation, the cache will be cold
+    after each MPI call; in reality, the cache would be hot.
+  + Simgrid can be optimistic for processes heavily exploiting the L3 cache and the memory. In the simulation, they will
+    have exclusive access; in reality, they will interfere with each other.
+***** Modeling energy consumption
+- The instantaneous power consumption is P_i,f,w(u) = Pstatic_i,f + Pdynamic_i,f,w * u, for a machine i, a frequency f,
+  a computational workload w and a usage u.
+- In general, we assume that Pstatic_i,f = Pstatic_i (idle state, so the frequency does not matter).
+- Users can specify an arbitrary relation (linear in the usage) for each possible frequency (in general, these relations should be
+ quadratic in the frequency, but this may change with new technologies).
+- Each machine can have its own model, accounting for heterogeneity in the platform.
+- The power consumption of each host is exposed to the application, allowing it to dynamically decide whether to change the
+ current frequency.
+***** Modeling multicore computation
+- A first step is to run the target application with a small workload using all the cores of a single node, on the target platform.
+- Then, re-execute the application with the same workload on top of the simulator (hence using a single core).
+- From these measurements, associate to each code region a speedup factor that should be applied when emulating.
+- In some applications, speedups are very close to 1. In other applications, some regions have a speedup of 0.16 while
+ other regions have a speedup of 14. Not taking this into account can result in a large inaccuracy (~20-30%).
+***** Modeling the network
+- See the other paper for the details on the network model of SMPI.
+- The authors also discuss local communications, within a node. They are implemented with shared memory. The model
+ here is also piecewise linear, but with less variability and higher speed. However, they did not implement this model;
+ they kept the classical network model, since local communications were rare enough.
+***** Validation
+- The authors obtain a very good accuracy for performance estimations (as stated in the previous paper).
+- For two of the three applications, they also have a very good accuracy for energy consumption estimations.
+- With the last application, the accuracy is bad. The reason is that the application (HPL) does busy waiting on
+ communications (with MPI_Probe). In the current model, they assume that this does not cost energy.
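The linear power model above can be sketched in a few lines; only the shape of the model comes from the paper, while the frequencies and wattage values below are invented for illustration:

#+begin_src python
# Sketch of the per-host linear power model P_i,f(u) = Pstatic_i,f + Pdynamic_i,f * u.
# The frequencies and wattages are made up, not taken from the paper.
POWER_MODEL = {
    # frequency (GHz): (static part in W, dynamic part in W)
    1.2: (95.0, 40.0),
    2.0: (95.0, 85.0),
    2.6: (95.0, 140.0),  # the dynamic part grows roughly quadratically with f
}

def power(frequency, usage):
    """Instantaneous power of one host at the given frequency and CPU usage in [0, 1]."""
    p_static, p_dynamic = POWER_MODEL[frequency]
    return p_static + p_dynamic * usage
#+end_src

An idle host (usage 0) still pays the whole static part, which is why powering nodes off or scaling the frequency down matters so much.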
+***** Experimental environment
+Minor modifications to the setup can have a major impact on performance and/or power consumption. The authors
+therefore give a list of settings to track.
+- Hardware. If we suppose that the cluster is homogeneous, it has to actually be the case. Two CPUs of the same type can still
+ exhibit different performance (e.g. if they come from two different batches/factories).
+- Date of the measurements. A lot of things having an impact can change over time: temperature of the machine room,
+ vibrations, BIOS and firmware versions, etc.
+- Operating system. The whole software stack and how it is compiled can have a huge impact. Also, always observe a delay
+ between the boot and the beginning of experiments.
+- Kernel configuration. For instance, its version, the scheduling algorithm, technologies like hyperthreading, etc.
+- The application itself and the runtime (e.g. the algorithms used for collective operations).
+***** Conclusion / future work
+- The approach to simulate power consumption is accurate only if the application is regular in time. To handle
+ applications with very different computation patterns, we could specify the power consumption for each code
+ region. But to do so, Simgrid has to be modified, and there need to be very precise measurements to instantiate the
+ model (impossible with the hardware of Grid'5000, whose sampling rate is only 1Hz).
+- In Simgrid, we currently cannot have different network models at the same time, to account for local and remote
+ communications. A refactoring of the code is underway to fix this.
+*** 2017-02-08 Wednesday
+**** DONE Paper reading.
+- Notes have been added in the relevant section.
+- One paper read today: “Simulating MPI applications: the SMPI approach”.
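Part of the checklist above (date, OS, kernel, hardware) can be captured automatically at the start of an experiment. A minimal sketch using only the Python standard library (the field names are my own):

#+begin_src python
import datetime
import platform

def capture_environment():
    """Record a few of the settings listed above, to be stored next to the results."""
    return {
        "date": datetime.datetime.now().isoformat(),
        "hostname": platform.node(),
        "machine": platform.machine(),      # hardware architecture
        "system": platform.system(),        # operating system
        "kernel": platform.release(),       # kernel version
        "python": platform.python_version(),
    }
#+end_src

Storing this dictionary (e.g. as JSON) with every run makes it possible to notice, months later, that two series of measurements were taken on different kernels.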
+*** 2017-02-09 Thursday
+**** TODO Read [[file:hdr.pdf][Scheduling for Large Scale Distributed Computing Systems: Approaches and Performance Evaluation Issues]] :PAPER:
+Bibtex: legrand:tel-01247932
+**** DONE Read [[file:SIGOPS_paper.pdf][An Effective Git And Org-Mode Based Workflow For Reproducible Research]] :ORGMODE:GIT:PAPER:
+Bibtex: stanisic:hal-01112795
+- A branching scheme for git, based on four types of branches.
+ + One src branch, where the code to run the experiments is located. This branch is quite light.
+ + One xp branch per experiment, which exists only during the period of the experiment. We can find here all the data
+ specific to this experiment. Also a light branch, since it is limited to one experiment.
+ + One data branch, into which all xp branches are merged when they are terminated. Quite a heavy branch, containing a lot of things.
+ + One art branch per article, where only the code and data related to the article are pulled from the data branch.
+ When an xp branch is merged into data and deleted, a tag is added. Then, we can easily check out this experiment in the future.
+- Org-mode used as a laboratory notebook. All details about the experiments (what, why, how...) are written here.
+ Thanks to literate programming, the command lines to execute are also contained in the notebook.
+**** Presentation about org-mode by Christian. :ORGMODE:
+- Have a per-day entry in the journal. If you work more than an hour without writing anything in the journal, there is an
+ issue.
+- Put tags in the headlines, to be able to search them (e.g. :SMPI: or :PRESENTATION:). Search with the “match” keyword.
+ Hierarchy of tags, described in the headline.
+- About papers, tags READ/UNREAD. Also the bibtex included in the file. Attach files to the org file (different from a
+ simple link). Use C-a, then move.
+- Spacemacs: adds a lot of stuff to evil mode.
+- Can also use tags to have links on entries, use the CUSTOM_ID property.
+- Can use org-mode to include some code.
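Putting these conventions together, a journal entry could look like the following sketch (the headline, tags and property value are invented for illustration):

#+begin_example
*** 2017-02-09 Thursday
**** DONE Read some interesting paper                  :PAPER:READ:SMPI:
:PROPERTIES:
:CUSTOM_ID: some-interesting-paper
:END:
- Notes about the paper, with the PDF attached rather than a plain link.
#+end_example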
+**** DONE Paper reading.
+- One paper read today: “Predicting the Performance and the Power Consumption of MPI Applications With SimGrid”.
+- Notes have been added in the relevant section.
+**** DONE Apply the things learnt at the org-mode presentation.
+*** 2017-02-10 Friday
+**** Tried to get good org-mode settings. :ORGMODE:
+- Cloned the org-mode [[/home/tom/.emacs.d/org-mode][git repository]] to have the latest version (success).
+- Tried to install [[http://mescal.imag.fr/membres/arnaud.legrand/misc/init.php][Arnaud's configuration file]] (fail).
+- Will try Christian's configuration file on Monday.
+**** DONE Paper reading.
+- One paper read today: “An Effective Git and Org-Mode Based Workflow for Reproducible Research”.
+- Notes have been added in the relevant section.
+**** Begin looking at the documentation.
+- Documentation about the [[http://simgrid.gforge.inria.fr/simgrid/3.13/doc/platform.html][topology]].
+**** Run a matrix product MPI code in a fat tree
+- Code from the parallel systems course.
+- Tried the [[https://github.com/simgrid/simgrid/blob/master/examples/platforms/cluster_fat_tree.xml][Github]] example (fat tree =2;4,4;1,2;1,2=, 2 levels and 16 nodes).
+- Tried a personal example (fat tree =3;4,4,4;1,4,2;1,1,1=, 3 levels and 64 nodes).
+**** DONE Find something to automatically draw a fat tree.
+ + Maybe such a tool already exists? Did not find one, however.
+ + Maybe Simgrid has a way to export a topology in a graphical way? That would be very nice.
+ + Could adapt the Tikz code I wrote during my 2015 internship?
+*** 2017-02-13 Monday
+**** Keep working on the [[file:/home/tom/Documents/Fac/2017_Stage_LIG/small_tests/matmul.c][matrix product]]. :SMPI:C:BUG:
+- Observed strange behavior.
+ + Commit: 719a0fd1775340628ef8f1ec0e7aa4033470356b
+ + Compilation: smpicc -O4 matmul.c -o matmul
+ + Execution: smpirun --cfg=smpi/bcast:mpich --cfg=smpi/running-power:6217956542.969 -np 64 -hostfile ./hostfile_64.txt
+ -platform ./cluster_fat_tree_64.xml ./matmul 2000
+ Then, processes 0 and 63 behave very differently from the others.
+ + Processes 0 and 63 have a communication time of about 0.21 and a computation time of about 1.52.
+ + Other processes have a communication time of about 0.85 and a computation time of about 0.75.
+ With other topologies and/or matrix sizes, we still have this behavior (more or less pronounced).
+- If we change the order of the loops of the sequential matrix product from i-j-k to k-i-j:
+ + The execution time is shorter. Hypothesis: this solution makes better use of the cache.
+ + The computation times are decreased (expected), but the communication times are also decreased (unexpected).
+ + We still observe the same trend as above for processes 0 and 63.
+- Checked with some printf: all processes are the root of a line broadcast and of a column broadcast exactly once (expected).
+- Tried several broadcast algorithms (default, mpich, ompi), still the same behavior.
+- Adding a call to MPI_Barrier at the beginning of the for loop fixes the issue for the communication (all processes now
+ have a communication time of about 0.22) but not for the computation (still the same differences for processes 0 and
+ 63).
+- When using a smaller number of processes (16 or 4), communication times and computation times are more consistent
+ (with still some variability).
+- With one process and a matrix size of 250, we have a computation time of 0.10 to 0.12. When we have 64 processes and a
+ matrix size of 2000, each block has a size of 250. Thus, we can extrapolate that the “normal” computation time in this
+ case should be about 0.8 (8 iterations, so 8*0.10). Thus, processes 0 and 63 have an abnormal behavior, the others
+ are ok.
+- Also tried other topologies, e.g. a simple cluster. Still the same behavior (with different times).
+ + Again, normal behavior with fewer processes (e.g. 16).
+ + We get a normal behavior if we take hostfile_1600.txt, very strange.
+- Bug fixed, the problem came from the hostfile. For some unknown reason, it was missing an end-of-line character on the last
+ line. I suspect that two processes (0 and 63) were therefore mapped to the same host, because the last host was not
+ parsed correctly by smpi. The two versions of the file have been added to the repository.
+- Issue reported on [[https://github.com/simgrid/simgrid/issues/136][Github]].
+**** Try to optimize the matrix product code. :SMPI:C:
+- For the record, the following command yields communication times between 0.27 and 0.31 and computation times between
+ 0.78 and 0.83, for a total time of about 1.14: smpirun --cfg=smpi/bcast:mpich --cfg=smpi/running-power:6217956542.969
+ -np 64 -hostfile ./hostfile_64.txt -platform ./cluster_fat_tree_64.xml ./matmul 2000
+- Replaced malloc/free by SMPI_SHARED_MALLOC/SMPI_SHARED_FREE. Got approximately similar times.
+- Added SMPI_SAMPLE_GLOBAL(0.5*size, 0.01) to the outer loop of the sequential matrix product. Got approximately similar times.
+- Remark: we should verify more rigorously that these optimizations do not change the estimated time.
+- Greatly reduced the simulation time (from 8.2s to 0.5s).
+- Another optimization: stop initializing the content of the matrices (since we do not care about their content).
+**** Meeting with Arnaud. :MEETING:
+- There exist some visualization tools for Simgrid, to see the bandwidth that goes through some links. May be very useful
+ in the future, to have a better understanding of what is going on.
+- The characteristics of the jobs (number of nodes, communication patterns) have an important impact on performance.
+ However, it is difficult for us to have access to this, we do not own a supercomputer...
Maybe Matthieu can have more
+ information (e.g. from Bull's clients)?
+**** DONE Add supervisors on Github for the journal.
+**** Some quick performance tests. :SMPI:EXPERIMENTS:
+- Ran my matrix product code, with SMPI optimizations.
+- Used a 2-level fat-tree made with 48-port switches.
+- First case: non-tapered. We use all the switches. The fat-tree is 2;24,48;1,24;1,1 (total of 1152 nodes).
+ + Uses 1089 processes, matrix size of 4950.
+ + Time: 1.75s.
+ + Communication time: 0.94s.
+ + Computation time: 0.81s.
+- Second case: tapered. We remove half of the root switches. The fat-tree is 2;24,48;1,12;1,1 (still 1152 nodes).
+ + Still uses 1089 processes, matrix size of 4950.
+ + Time: 1.78s.
+ + Communication time: 0.94s.
+ + Computation time: 0.82s.
+- The observed difference does not seem significant, but we should check with a carefully designed experiment and
+ analysis.
+- For the record, running the same application on the same topology with only one process takes 3607s. Thus,
+ we have a speedup of about 2026, so an efficiency of 1.86. This is a very nice (superlinear) speedup, certainly due
+ to cache effects.
+- These quick tests suggest that we could remove root switches without impacting performance, even if we use nearly
+ the whole fat-tree (this is obvious if we use a small subtree).
+**** DONE Run another benchmark (e.g. HPL), with more carefully designed experiments.
+**** DONE The 3-level fat-tree took very long to load (aborted). Find out why.
+*** 2017-02-14 Tuesday
+**** Work on experiment automation. :PYTHON:
+- Added Python functions to generate topology and host files from a given fat-tree description.
+- Adapted the Python script and Jupyter notebook from the parallel systems course to run experiments.
+- The matrix size and the number of processes are fixed. We compute matrix products for various numbers of root switches
+ (we test fat-trees (2;24,48;1,n;1,1) for n in [1, 24]).
+- Results seem very promising.
For a matrix size of 6600, we can have as few as 10 root switches without an important
+ impact on performance (recall that a typical 2-level fat tree with 48-port switches would have 24 root switches). If
+ we keep removing switches, then performance is quickly impacted.
+- Repeated the experiment with the same topology and the same matrix size, but with only 576 processes. We observe the
+ same trend, we can remove a lot of root switches without having an impact.
+**** DONE Ask if it would be possible to have SSH access to some dedicated computer.
+- It does not need to have a lot of cores (Simgrid is not a parallel program), but it would be nice if it had a fast core.
+- It needs to be dedicated so as not to perturb the experiments.
+**** Webinar on reproducible research: [[https://github.com/alegrand/RR_webinars/blob/master/7_publications/index.org][Publication modes favoring reproducible research]] :MEETING:
+- Speakers: [[http://khinsen.net/][Konrad Hinsen]] and [[http://www.labri.fr/perso/nrougier/][Nicolas Rougier]].
+- Two parts in research: dissemination (of the results/ideas) and evaluation (of the researcher).
+- If we want reproducible research to become a norm, researchers should be rewarded for it (their reputation should
+ also depend on the reproducibility of their research, not only on the number of citations or the impact factor).
+- The speaker compares reproducible research from two points of view, the human part and the computer part, both for
+ dissemination and evaluation.
+***** [[http://www.activepapers.org/][ActivePapers]]
+- Not a tool that one should use (yet), nor a proposal for a new standard. It is mainly an idea for computer-aided
+ research.
+- How to have more trust in the software? The “ideal” approach is reimplementation (e.g. ReScience). The speaker tried this
+ on a dozen projects; he never got identical results.
Other good ideas: good practices like version control and
+ testing, and keeping track of the software stack (hardware, OS, tools, etc).
+- ActivePapers groups scripts, software dependencies and data into a single archive.
+***** [[http://rescience.github.io/][ReScience]]
+- Idea: replicate science.
+- For a great majority of papers, we can neither replicate them nor reuse their code.
+- It is hard to publish the replication of an original paper, most journals will reject it since it is not original.
+- This is why ReScience was born. It is (currently) hosted on Github.
+- To publish a new study, do a pull-request on the ReScience repository. Then it is openly reviewed by reviewers selected by
+ the editor. The replication is improved until it is publishable.
+*** 2017-02-15 Wednesday
+**** Use Christian’s config files for org mode :ORGMODE:
+**** Work on the experiment script
+ - Parsing more generic fat-tree descriptions. For instance, our current topology description would be
+ 2;24,48;1,1:24;1,1. It means that the L1 switches can have between 1 and 24 up ports.
+ - Modify the script for experiments to be more generic.
+ + Can give as command line arguments the fat-tree description, the (unique) matrix size and the (unique) number of processes.
+ + Use Python’s argparse for a cleaner interface.
+**** Re-run experiments with this new script
+ - Still observe the same trend: we can afford to remove a lot of up-ports from the L1 switches.
+ - Some points seem to be outliers. But we do not have a lot of points, so it is difficult to say. We should do more
+ experiments to see if these points are still significantly separated from the rest.
+*** 2017-02-16 Thursday
+**** DONE Enhance/fix Emacs configuration :ORGMODE:
+- Translate days and months into English.
+- Increase the line length limit (120 columns?).
+- Reformat the whole document with this limit.
+- Add tags where relevant.
+- Attach files, instead of putting a link.
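A sketch of how such generic fat-tree descriptions could be parsed; this is my own helper for illustration, not necessarily what the actual script does:

#+begin_src python
def parse_fat_tree(description):
    """Parse a fat-tree description like '2;24,48;1,24;1,1'.

    Returns (nb_levels, down, up, parallel), one list entry per level.
    A range like '1:24' (used in the generic descriptions above) is kept
    as a (low, high) tuple.
    """
    def parse_field(field):
        if ':' in field:
            low, high = field.split(':')
            return (int(low), int(high))
        return int(field)

    parts = description.split(';')
    nb_levels = int(parts[0])
    down, up, parallel = ([parse_field(f) for f in part.split(',')]
                          for part in parts[1:])
    assert len(down) == len(up) == len(parallel) == nb_levels
    return nb_levels, down, up, parallel

def nb_nodes(description):
    """Number of compute nodes: the product of the down-port counts of all levels."""
    _, down, _, _ = parse_fat_tree(description)
    result = 1
    for d in down:
        result *= d
    return result
#+end_src

For instance, =nb_nodes("2;24,48;1,24;1,1")= gives the 1152 nodes mentioned above.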
+**** Try to use even more SMPI optimizations :SMPI:
+- Currently, we use the macro SMPI_SAMPLE_GLOBAL only once: for the outer for loop of the sequential matrix product.
+- Maybe we can also use it for the two other loops? We could also reduce the number of iterations (currently, it is
+ 0.5*size). Let’s try.
+- Currently, we get communication times of about 0.14s and computation times of about 0.42s, for a total time of 0.57s,
+ with the following command:
+ smpirun --cfg=smpi/bcast:mpich --cfg=smpi/running-power:6217956542.969 -np 64 -hostfile ./hostfile_1152.txt -platform
+ ./big_tapered_fat_tree.xml ./matmul 1600
+- FAIL. It seems we cannot use nested sample blocks. Quite strange, I do not understand why...
+**** Try to run HPL with Simgrid :SMPI:HPL:
+- Copied from [[https://gitlab.inria.fr/fheinric/paper-simgrid-energy/tree/master/src/hpl-2.2][Christian’s repository]].
+- Compilation fails, don’t know why. But binaries are stored in the git repository (don’t know why either), so I can use
+ them to do some first tests.
+ In fact, the file Make.SMPI needed to be modified. Changed =mkdir= to =mkdir -p=, =ln= to =ln -f= and =cp= to =cp -f=.
+ Changed the top directory.
+ Also, the Makefile couldn’t find the shared library atlas. It was in /usr/lib, but named =libatlas.so.3=. Added a
+ symbolic link named =libatlas.so=.
+- Tested on my laptop (with MPI, not SMPI). With a problem size of 10000 and 12 processes, it corresponds to 16.51 Gflops.
+- Tested with SMPI, with a problem size of 10000 and 4 processes. Command:
+ smpirun --cfg=smpi/bcast:mpich --cfg=smpi/running-power:6217956542.969 -platform
+ ../../../small_tests/cluster_fat_tree_64.xml -hostfile ../../../small_tests/hostfile_64.txt -np 4 ./xhpl
+ Result: 1.849Gflops.
+- Same thing, with 12 processes. Very similar: 1.847Gflops. Why is it not faster?
+- Same thing, with 64 processes. Very similar: 1.858Gflops. Why is it not faster?
+- Retried with a freshly compiled program. Still the same thing.
+- Understood the issue: it is not enough to specify the number of processes with =-np 12=, we also have to set it in the
+ file =HPL.dat=.
+- Tried with =-np 4=, P=4 and Q=1. Now, 6.6224Gflops. We have a speedup of 3.59, which seems reasonable.
+- The number of processes given with =-np= must be greater than or equal to P \times Q.
+- Tried with =-np 4=, P=1 and Q=4. Did not have a noticeable impact on performance (in comparison with P=4, Q=1).
+- Tried with =-np 4=, P=2 and Q=2. Did not have a noticeable impact on performance (in comparison with P=4, Q=1).
+- Tried with =-np 64=, P=8 and Q=8. Now, 22.46Gflops. Speedup of 12, very disappointing.
+- Tried with =-np 64=, P=8 and Q=8 again, but with a problem size of 20000 (it was 10000). Now 52.2Gflops (speedup of 28.3).
+**** Comparison with the top 500
+- For the record, the order of magnitude for an Intel desktop CPU of today is between 10 and 100 Gflops, according to [[https://www.pugetsystems.com/labs/hpc/Linpack-performance-Haswell-E-Core-i7-5960X-and-5930K-594/][this
+ website]], [[https://setiathome.berkeley.edu/cpu_list.php][this website]] and [[https://asteroidsathome.net/boinc/cpu_list.php][this website]]. My laptop supposedly has a speed of 3.84 Gflops per core and 15.21 Gflops in
+ total according to the last two websites.
+- According to [[https://en.wikipedia.org/wiki/Raspberry_Pi#Performance][Wikipedia]], the first generation Raspberry Pi has a speed of 0.041 Gflops; a 64-node cluster made of
+ those has a speed of 1.14 Gflops.
+- The top-ranked supercomputer has a speed of about 93Pflops, or 93,000,000Gflops.
+- The last one (rank 500) has a speed of about 349Tflops, or 349,000Gflops.
+- In June 2005, the first one had a speed of about 136Tflops, the last one 1.2Tflops.
+- In our setting with 64 nodes, each node has one core that computes at 1Gflops. Thus, our Rpeak is 64Gflops. We have
+ an efficiency of 52.2/64 = 0.81.
This is not bad, compared to the first three supercomputers of the top 500
+ (respectively at 0.74, 0.61 and 0.63). But we should maybe not compare the efficiency of a 64-node cluster with these
+ supercomputers, since it becomes harder to be efficient with a large topology.
+**** DONE SMPI optimization of HPL :SMPI:HPL:
+:LOGBOOK:
+- State "DONE" from "TODO" [2017-03-22 Wed 17:08]
+- State "TODO" from [2017-02-16 Thu 16:17]
+:END:
+- It seems that no SMPI optimization is done in the code obtained from Christian’s repository. Maybe we could speed
+ things up?
+- Need to check what the algorithm behind HPL is, whether it is regular (to use SMPI_SAMPLE) and data independent (to
+ use SMPI_SHARED).
+**** DONE Adapt the experiment script to run HPL
+:LOGBOOK:
+- State "DONE" from "TODO" [2017-02-17 Fri 15:47]
+- State "TODO" from [2017-02-16 Thu 17:03]
+:END:
+- Parse the output (quite ugly to parse, but easy, using the methods str.split and list.index).
+- Run the same kind of experiments as for the matrix product. Will be much longer if we cannot use SMPI optimizations.
+*** 2017-02-17 Friday
+**** Refactor the experiment script :PYTHON:
+- Aim: reuse for HPL the code already written for the matrix product.
+- Now, we have a class =AbstractRunner=, which runs the common logic (e.g. some basic checks on the parameters, or running
+ the desired number of experiments).
+- We also have classes =MatrixProduct= and =HPL=, containing the pieces of code specific to the matrix product or HPL (e.g. running one experiment).
+**** Some strange things with HPL :SMPI:BUG:HPL:
+- The output has the following format:
+ #+begin_example
+ ================================================================================
+ T/V                N    NB     P     Q               Time                 Gflops
+ --------------------------------------------------------------------------------
+ WR00L2L2        2000   120     1     1               3.17              1.683e+00
+ #+end_example
+- Sometimes, the last line is missing, so we do not have any information on time and flops.
+- Quite often it is present, but with wrong values: the time is 0.00 and the Gflops are absurdly high (e.g. 2.302e+03
+ Gflops for a cluster made of 96 machines of 1 Gflops). It may come from an erroneous measurement of the time.
+- For instance, with the script of commit =dbdfeabbef3f90a3d4e2ecfbe5e8f505738cac23=, the following command line:
+ =./run_measures.py --global_csv /tmp/bla --nb_runs 10 --size 5000 --nb_proc 64 --fat_tree "2;24,48;1,24;1,1"
+ --experiment HPL=
+ + It may get this output in one experiment:
+ #+begin_example
+ ================================================================================
+ T/V                N    NB     P     Q               Time                 Gflops
+ --------------------------------------------------------------------------------
+ WR00L2L2        5000   120     8     8               0.00              1.108e+05
+ #+end_example
+ + And this output in another one:
+ #+begin_example
+ ================================================================================
+ T/V                N    NB     P     Q               Time                 Gflops
+ --------------------------------------------------------------------------------
+ WR00L2L2        5000   120     8     8               5.35              1.560e+01
+ #+end_example
+ Note that, between the two experiments, *nothing* has changed. The file =HPL.dat= is the same, the number of processes given
+ to the option =-np= is the same, and the topology file and the host file are the same.
+*** 2017-02-20 Monday
+**** Keep investigating the HPL anomaly
+**** Found the issue with HPL :SMPI:BUG:HPL:
+- Debugged with Christian, to understand what was going on.
+- This was a concurrency issue. The private variables of the processes were in fact not private. This caused two
+ processes to write to the same variable, which led to an inconsistent value when measuring time.
+- The function is =HPL_ptimer=, in file =testing/ptest/HPL_pdtest.c=.
+- When using simgrid, we need to use the option =--cfg=smpi/privatize-global-variables:yes= to fix this.
+- Used a tool to search for a word, looks nice: =cg= and =vg= (package =cgvg=).
+- Another nice thing: =ctags= (command =ctags --fields=+l -R -f ./ctags src testing=).
+*** 2017-02-21 Tuesday
+**** Test the experiment script for HPL :EXPERIMENTS:HPL:
+- It seems to work well, the bug is fixed.
+- Scalability issue. Testing for a size of 20k already takes a lot of time, and it is still too small to have a good
+ efficiency with 1000 processes (performance is worse than with 100 processes).
+- We definitely need to use SMPI optimizations if we want to do anything with HPL.
+**** Re-do experiments with the matrix product
+- Stuck with HPL...
+- We also output the speed of the computation, in Gflops (this is redundant with the time, but we can use it for
+ comparison with other algorithms like HPL).
+- The plot looks nice, but nothing new.
+**** Work on the drawing of fat-trees
+- Generate all nodes and edges of a fat-tree.
+- No drawing yet.
+- Will try to output Tikz code.
+**** DONE Look at where to put SMPI macros in HPL, with Christian
+:LOGBOOK:
+- State "DONE" from "TODO" [2017-02-22 Thu 13:17]
+- State "TODO" from [2017-02-21 Tue 15:03]
+:END:
+- Have a look at a trace, to see where most of the time is spent.
+**** Keep working on the drawing of fat-trees.
+- Now produces working Tikz code.
+- The figure quickly becomes unreadable for large fat-trees (not surprising).
+*** 2017-02-22 Wednesday
+**** Terminate the work on fat-tree drawing :PYTHON:
+- We can now do =./draw_topo.py bla.pdf "2;8,16;1,1:8;1,1" "2;4,8;1,1:4;1,1"= to draw all the fat-trees in the file
+ =bla.pdf=. Very useful to visualize the differences between the trees.
+- No limit on the fat-tree size, they should fit on the pdf (a very large page is generated, then cropped to the right
+ dimensions). However, a large fat-tree may not be very readable.
+**** Tried to move the SMPI_SAMPLE of the matrix product
+- Cannot use one SMPI_SAMPLE per loop (don’t know why, but it seems to be forbidden).
+- It was used for the outer loop.
Tried the inner loops, but performance was greatly degraded (about \times50 in simulation time).
+- Reverting the change.
+**** DONE Cannot use more than 1024 processes with Simgrid (need to fix) :SMPI:BUG:
+:LOGBOOK:
+- State "DONE" from "TODO" [2017-02-23 Thu 10:20]
+- State "TODO" from [2017-02-22 Wed 14:12]
+:END:
+- The =open()= system call fails with the =EMFILE= error code.
+- It used to work, I don’t understand what changed in the meantime.
+**** Talk with Christian about SMPI optimizations in HPL :PERFORMANCE:HPL:
+- He gave me a trace of an HPL execution obtained with Simgrid.
+- The parts taking most of the time are the following:
+ #+begin_example
+ 50 /home/cheinrich/src/hpl-2.2/src/pgesv/hpl_rollt.c 242 /home/cheinrich/src/hpl-2.2/src/comm/hpl_recv.c 136 190.785263 498
+ 51 /home/cheinrich/src/hpl-2.2/src/pgesv/hpl_rollt.c 242 /home/cheinrich/src/hpl-2.2/src/comm/hpl_sdrv.c 180 372.272945 996
+ 52 /home/cheinrich/src/hpl-2.2/src/pgesv/hpl_rollt.c 242 /home/cheinrich/src/hpl-2.2/src/comm/hpl_send.c 133 179.711679 498
+ #+end_example
+**** Let’s track these pieces of code :PERFORMANCE:HPL:
+- =HPL_rollT.c= has only one function: =HPL_rollT=.
+- This function is called only once: at the end of function =HPL_pdlaswp01T= (eponymous file).
+- That function is called once in function =HPL_pdupdateNT= and once in function =HPL_pdupdateTT= (eponymous files). There are
+ very few differences between these two functions (4 line changes are relevant, which are small variations in the
+ arguments of a function, =HPL_dtrsm=). These files have 443 lines: this is a huge copy-paste, very dirty.
+- A candidate for the long function we are looking for is =HPL_dlaswp10N= (found by Christian). It has two nested loops.
+ This function is also a good candidate for the most terrible piece of code ever written.
+- Added a =SMPI_SAMPLE_GLOBAL= after the outer loop, which did not reduce the simulation time.
Also tried to remove the whole
+ code of the function, which did not reduce the simulation time either. So we can say this function is not our big consumer.
+- Functions =HPL_recv= and =HPL_sdrv= are both called *only* in =HPL_pdmxswp= and =HPL_pdlaswp00N=.
+- Function =HPL_pdlaswp00N= is used only in =HPL_pdupdateTN= and =HPL_pdupdateNN=, which are nearly identical. These two
+ functions are then used in the =testing= folder, with something like =algo.upfun = HPL_pdupdateNN=. Might be hard to track...
+- Function =HPL_pdmxswp= is used in =HPL_pdpancrT=, =HPL_pdpanllT=, =HPL_pdpanllN=, =HPL_pdpanrlT=, =HPL_pdpanrlN=,
+ =HPL_pdpancrN=. These functions are used in the =testing= folder, with something like =algo.pffun = HPL_pdpancrN=.
+- Trying to put in some printf.
+ We use the command:
+ #+begin_src sh
+ smpirun --cfg=smpi/bcast:mpich --cfg=smpi/running-power:6217956542.969 --cfg=smpi/display-timing:yes
+ --cfg=smpi/privatize-global-variables:yes -np 16 -hostfile ../../../small_tests/hostfile_64.txt -platform
+ ../../../small_tests/cluster_fat_tree_64.xml ./xhpl
+ #+end_src
+ + Function =HPL_pdupdateNN= is never used.
+ + Function =HPL_pdupdateTN= is never used.
+ + Thus, function =HPL_pdlaswp00N= is also never used (verified with a printf in this function).
+ + Function =HPL_pdmxswp= is used and takes a significant (albeit not huge) amount of time (about 2 seconds when the
+ total time is 41 seconds (virtual time)).
+*** 2017-02-23 Thursday
+**** Try to increase the file limit :SMPI:BUG:
+- First try, following [[http://askubuntu.com/questions/162229/how-do-i-increase-the-open-files-limit-for-a-non-root-user][this question]] and [[http://stackoverflow.com/questions/21515463/how-to-increase-maximum-file-open-limit-ulimit-in-ubuntu][this question]] from Stackoverflow.
+ + Added the following to =/etc/security/limits.conf=:
+ #+begin_example
+ * soft nofile 40000
+ * hard nofile 40000
+ #+end_example
+ + Added the following to =/etc/pam.d/common-session=:
+ #+begin_example
+ session required pam_limits.so
+ #+end_example
+ + Rebooted.
+- Success, =ulimit -Sn= shows =40000= and we can now run experiments with more than 1024 processes.
+**** Keep tracking the time-consuming pieces of code in HPL :PERFORMANCE:HPL:
+- Function =HPL_pdmxswp= is used in some functions which are chosen with =algo.pffun= (see above).
+- They are then used (through a call to algo.pffun) in functions =HPL_pdrpancrN=, =HPL_pdrpanrlN=, =HPL_pdrpanllN=,
+ =HPL_pdrpanrlT=, =HPL_pdrpancrT= and =HPL_pdrpanllT=.
+- Again, these functions are not used directly in =src=, there is something like =algo.rffun = HPL_pdrpancrT= in the =testing= folder.
+- This =rffun= is used only once, in =HPL_pdfact=.
+- Function =HPL_pdfact= takes between 2.5 and 2.8 seconds when the total time is 41 seconds (virtual time). This time includes
+ the time spent in =HPL_pdmxswp=.
+- Function =HPL_pdfact= is used in functions =HPL_pdgesvK1=, =HPL_pdgesvK2= and =HPL_pdgesv0=. These functions are then called
+ in =HPL_pdgesv=.
+- Function =HPL_pdgesv= takes a time of about 3 seconds when the total time is 41 seconds (virtual time).
+- Strange thing: deleting the content of this function gives a very short run-time. Maybe the way I measured time (using
+ =MPI_WTIME=) is not consistent with the way HPL measures time.
+- Identified the long loop in =HPL_pdgesv0=. But we cannot put a =SMPI_SAMPLE= here, there are calls to MPI primitives in the block.
+- Found the right function to measure time: use =HPL_timer_walltime=, not =MPI_Wtime=.
+- Instrumented the code of =HPL_pdgesv0= to have an idea of what takes time. Measurements are taken with =HPL_timer_walltime=.
+ What takes time is the part “factor and broadcast current panel” in the loop.
+  Within this part, the calls to =HPL_bcast= and =HPL_pdupdate= are what take most of the (virtual) time. In an execution of
+  40.96 seconds:
+  #+begin_example
+  pdfact = 2.907908, binit = 0.002633, bcast = 11.013843, bwait = 0.000669, pdupdate = 26.709408
+  #+end_example
+  Obviously there is nothing to do for the broadcast, but there may be hope for =pdupdate=.
+- Several versions exist for this function:
+  + =HPL_pdupdateTN=
+  + =HPL_pdupdateNT=
+  + =HPL_pdupdateTT=
+  + =HPL_pdupdateNN=
+  Only =HPL_pdupdateTT= seems to be used (with our settings).
+  Removed the body of function =HPL_pdupdateTT=; the simulation time becomes about 8 seconds (was 69 seconds).
+- Might be tricky to optimize with SMPI macros, as this function mixes computations and communications.
+- Tried to insert a =return= at line 208 (before the comment “The panel has been forwarded at that point, finish the update”). The
+  time is not impacted and the correctness tests pass, so the part of the code after this point seems useless here.
+  Verified by inserting a =printf=: this part is never executed.
+- Line 143 is executed (just after the comment “1 x Q case”).
+- Adding a =return= statement at line 136 (just before the comment “Enable/disable the column panel probing mechanism”) gives a
+  simulation time of 8 seconds.
+  Same thing at line 140, after the broadcast.
+- The =if= block of lines 143-258 is never executed in our settings. This explains why acting on line 208 did not have any effect.
+- Adding a =return= statement at line 358 (just before the comment “The panel has been forwarded at that point, finish the
+  update”) gives a simulation time of 9.7 seconds.
+- The =if= block of lines 360-414 seems to be always executed. The =if= block of lines 366-390 is executed sometimes, but not
+  always.
+  In this block, we execute the =#else= part of the =#ifdef=.
+- In this block, removing the call to =HPL_dgemm= reduces the simulation time a lot (from 68s to 13s).
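As a quick sanity check of the breakdown above, the instrumented components can be summed and compared with the total virtual time (throwaway Python, using the numbers reported above):

```python
# Timing breakdown reported by the instrumented HPL_pdgesv0 (seconds of virtual time).
parts = {"pdfact": 2.907908, "binit": 0.002633, "bcast": 11.013843,
         "bwait": 0.000669, "pdupdate": 26.709408}
total = 40.96  # total virtual time of the run

accounted = sum(parts.values())
print(f"accounted for: {accounted:.2f}s / {total:.2f}s")
for name, t in sorted(parts.items(), key=lambda kv: -kv[1]):
    print(f"  {name:>8}: {100 * t / total:5.1f}%")
```

The instrumented loop accounts for about 99% of the run, and =pdupdate= alone is about 65%, which is consistent with the large effect of removing the =HPL_dgemm= call.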
+- Several definitions exist for =HPL_dgemm=: there is an implementation in =src/blas/HPL_dgemm.c=, but also a =#define
+  HPL_dgemm cblas_dgemm= in =include/hpl_blas.h=.
+- We can disable this =#define= by removing the line =HPL_OPTS = -DHPL_CALL_CBLAS= in the file =Make.SMPI=.
+  Then, =HPL_dgemm= is executed, but not the others (=HPL_dgemm0=, =HPL_dgemmTT=, =HPL_dgemmTN=, =HPL_dgemmNT=, =HPL_dgemmNN=). It
+  seems that =HPL_dgemm= can call =HPL_dgemm0=, which can itself call the four others, but this only happens when
+  =HPL_CALL_VSIPL= is defined.
+- In fact, there is maybe no need to insert the =SMPI_SAMPLE= macro in the =dgemm= function. We can put it inside
+  =HPL_pdupdateTT=, for instance at line 360, just above the big =if= block. However, this performs really badly. With
+  =SMPI_SAMPLE_GLOBAL(10, 0.1)=, the real time becomes about 10 seconds (speedup of \times4) but the virtual time becomes about
+  90 seconds (\times2 error). If we increase one of the two numbers, the real time quickly becomes as large as before.
+  Same thing with =SMPI_SAMPLE_LOCAL=. Maybe this code is too irregular? Or we should “zoom in” and insert the SMPI
+  optimizations in =dgemm= (which is in an external library, so not that easy).
+*** 2017-02-27 Monday
+**** Try running the matrix product experiment with big fat-trees :SMPI:BUG:
+- Run a medium number of processes on a big fat-tree.
+  #+begin_src sh
+  ./run_measures.py --global_csv big_global.csv --local_csv big_local.csv --nb_runs 3 --size 9300 --nb_proc 961 \
+    --fat_tree "3;24,24,48;1,24,1:24;1,1,1" --experiment matrix_product
+  #+end_src
+  Seems to work properly: one CPU core is quickly loaded at 100% and one experiment takes approximately two minutes.
+- Try a larger number of processes with the same topology and the same matrix size.
+ #+begin_src sh + ./run_measures.py --global_csv big_global.csv --local_csv big_local.csv --nb_runs 3 --size 9300 --nb_proc 8649 + --fat_tree "3;24,24,48;1,24,1:24;1,1,1" --experiment matrix_product + #+end_src + The CPU is loaded at about 3% for quite a long time with the script =smpirun=. It finally launches =matmul= and becomes + loaded at 100%. Then it quickly terminates with a non-null exit code: =Could not map fd 8652 with size 80000: Cannot + allocate memory=. The memory consumption was only 3% of the total memory, this is strange. + This happens in function =shm_map=, called by =SMPI_SHARED_MALLOC=. +- Retrying the same command, with =malloc= instead of =SMPI_SHARED_MALLOC= and =free= instead of =SMPI_SHARED_FREE=. + As expected, larger memory consumption (10.9% of total memory). There is no error this time. The first experiment + terminates in about 20min. For the record, it achieved 1525 Gflops, with communication time and computation time of + approximately 0.48 seconds. +- Revert the changes to get back =SMPI_SHARED= macros. Retry to run =smpirun= with the same settings, except the option + =--cfg=smpi/privatize-global-variables:yes= which is not passed here. No error either this time, run for 13 + minutes. Also a large memory consumption (13.5%), maybe the 3% we observed was not the final memory consumption, since + the process exited with an error? +- Remark: for matrix product, there is no global variable. So maybe we can safely remove this option in this case? + This does not solve the problem since we need it for HPL. +- Try the initial command with a smaller matrix size (size=93, i.e. all processes have a sub-matrix of size 1\times1). Observed the same error. 
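A back-of-the-envelope check of the 10.9% figure observed above for the =malloc= variant (hypothetical accounting: 8649 ranks with 5 buffers each of 100\times100 doubles, the per-rank share of a 9300\times9300 matrix on a 93\times93 grid):

```python
# When SMPI_SHARED_MALLOC is replaced by plain malloc, every simulated rank
# really allocates its buffers, so the footprint scales with the rank count.
nb_proc = 93 * 93       # 8649 simulated processes
sub = 9300 // 93        # 100x100 sub-matrix per rank
buffers = 5             # 3 matrices + 2 temporaries per rank
footprint = nb_proc * buffers * sub * sub * 8  # bytes, double precision

print(f"{footprint / 2**30:.2f} GiB")
```

About 3.2 GiB, which would indeed be roughly 11% of a machine with ~30 GiB of RAM, so the observed 10.9% is plausible.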
+- Also try to reproduce this with HPL, with this command:
+  #+begin_src sh
+  ./run_measures.py --global_csv big_global.csv --nb_runs 3 --size 5000 --nb_proc 8649 \
+    --fat_tree "3;24,24,48;1,24,1:24;1,1,1" --experiment HPL
+  #+end_src
+  No error, although we have a memory consumption of 71.2%.
+- Try the initial command, still with a size of 93, but commenting out the call to =matrix_product= in =matmul.c=. Thus, there
+  is no allocation of temporary buffers, only the initial matrices (3 allocations instead of 5). No error.
+- Same thing, with the call to =matrix_product= uncommented, but a =return= statement placed just after the temporary
+  buffer allocations. We get the =mmap= error.
+- Create a MWE from this, called =mmap_error.c=.
+**** Work on a MWE for the mmap error :SMPI:BUG:
+- File =mmap_error.c= is a MWE for the =mmap= error. It consists of 5 calls to =SMPI_SHARED_MALLOC= with a size of 1, and we
+  launch it with 8652 processes.
+  We also get an error if we do 100k calls to =SMPI_SHARED_MALLOC= with only one process. The total number of calls to
+  this macro seems to be the issue. We get the error with or without the option =smpi/privatize-global-variables:yes=.
+- The file =mmap_error.c= is the following:
+  #+begin_src c
+  #include <stdio.h>
+  #include <mpi.h>
+
+  #define N 65471
+
+  int main(int argc, char *argv[]) {
+      MPI_Init(&argc, &argv);
+      for(int i = 0; i < N; i++) {
+          float *a = SMPI_SHARED_MALLOC(1);
+      }
+      MPI_Barrier(MPI_COMM_WORLD);
+      printf("Success\n");
+      MPI_Finalize();
+      return 0;
+  }
+  #+end_src
+  With the following commands (commit =8eb0cf0b6993e174df58607e9492a134b85a4669= of Simgrid):
+  #+begin_src sh
+  smpicc -O4 mmap_error.c -o mmap_error
+  smpirun -np 1 -hostfile hostfile_64.txt -platform cluster_fat_tree_64.xml ./mmap_error
+  #+end_src
+  This yields an error. Note that the host and topology files are irrelevant here.
+  + For =N<65471=, we have no error (=Success= is printed).
+ + For =N>65471=, we have the error =Could not map fd 3 with size 1: Cannot allocate memory=. + + For =N=65471=, we have the error =Memory callocation of 524288 bytes failed=. +- Retried with latest version of Simgrid (commit =c8db21208f3436c35d3fdf5a875a0059719bff43=). Now have the + message: + #+begin_example + Could not map folded virtual memory (Cannot allocate memory). Do you perhaps need to increase + the STARPU_MALLOC_SIMULATION_FOLD environment variable or the sysctl vm.max_map_count? + #+end_example + Found the issue: + #+begin_src sh + $ sysctl vm.max_map_count + vm.max_map_count = 65530 + #+end_src + To modify the value of a =sysctl= variable, follow [[https://www.cyberciti.biz/faq/howto-set-sysctl-variables/][this link]]. + Temporary fix: + #+begin_src sh + sudo sysctl -w vm.max_map_count=100000 + #+end_src +**** Run the matrix product experiment with 8649 processes +- Using the command: + #+begin_src sh + ./run_measures.py --global_csv big_global.csv --local_csv big_local.csv --nb_runs 3 --size 9300 --nb_proc 8649 + --fat_tree "3;24,24,48;1,24,1:24;1,1,1" --experiment matrix_product + #+end_src +- The experiments are very long, about 30 minutes. The code is already optimized a lot (SMPI macros, no initialization + of the matrices), a large part of this time is spent outside of the application, so there is not much hope to run it + faster without modifying Simgrid. +- This shows that we *really* need to optimize HPL if we want to run it with a large number of processes. +- Anyway, without SMPI macros, every floating-point operation of the application is actually performed. Thus, if we are + simulating a computation made on a 1000 Gflops cluster, using a 1 Gflops laptop, the simulation should take *at least* + 1000 times longer than the same computation on a real 1000 Gflops cluster. +- First results show no large difference in the total time for small or large number of roots. 
The communication time is + about twice as large as the computation time, so maybe we should take a larger matrix. When we had 961 processes, each + one had a sub-matrix of size 300\times300. With 8649 processes, they have a sub-matrix of size 100\times100. + Problem: if we want to get back to the 300\times300 sub-matrices, we need to multiply the size by 3 and thus the memory + consumption by 9. It was already about 25%, so not feasible on this laptop. But this is strange, we should have the + memory of only one process and we successfully ran 300\times300 sub-matrices, need to check. +*** 2017-02-28 Tuesday +**** Other benchmarks on Simgrid :SMPI:EXPERIMENTS: +- The paper “Simulating MPI application: the SMPI approach” uses the benchmark [[https://www.nas.nasa.gov/publications/npb.html][NAS EP]] to demonstrate the scalability of + SMPI. With SMPI optimizations, they ran it with 16384 processes in 200 to 400 seconds (depending on the topology). + Where is the code for this? + + Found an [[https://github.com/sbadia/simgrid/tree/master/examples/smpi/NAS][old repository]]. Not clear if it is relevant. + + Also a (shorter) version in the [[https://github.com/simgrid/simgrid/tree/master/examples/smpi/NAS][official Simgrid repository]]. + Executable located in =simgrid/build/examples/smpi/NAS/=. + Launch with two arguments: number of processes (don’t know what it does, we already have =-np= option given to + =smpirun=) and the class to use (S, W, A, B, C, D, E, F). +- The NAS EP benchmark from Simgrid repository seems promising. Added a new class to have a larger problem (maybe we + could instead give the size as an argument). With a large enough size, we can go to about 3.5 Gflops per process, + i.e. an efficiency of 3.5 (recall that we use 1 Gflops nodes). It seems large, is it normal? +- Longer than the matrix product, 745 seconds for 1152 processes and class F (custom class with m=42). 
Only 93 seconds
+  were spent in the application, so the code is already correctly optimized (one call to =SMPI_SAMPLE_GLOBAL=).
+- Apparently not impacted by a tapered fat-tree. Roughly the same speed for =2;24,48;1,24;1,1= and =2;24,48;1,1;1,1=, with 1152
+  processes and class F: about 3.5 Gflops. The application is made of a computation followed by three =MPI_Allreduce=
+  of only one =double=, so very few communications (hence the name “embarrassingly parallel”).
+**** Talk with Christian about benchmarks
+- Get access to Grid'5000.
+- Profile the code, with something like =smpirun -wrapper "valgrind "=.
+- To use SMPI macros, run the =HPL_dgemm= implemented in HPL, not the one from the external library.
+** 2017-03 March
+*** 2017-03-01 Wednesday
+**** Trying to use HPL without an external BLAS library :HPL:
+- Failed.
+- It seems that three options are available for compilation, according to [[http://www.netlib.org/benchmark/hpl/software.html][this page]]:
+  + BLAS Fortran 77 interface (the default),
+  + BLAS C interface (option =-DHPL_CALL_CBLAS=),
+  + VSIPL library (option =-DHPL_CALL_VSIPL=).
+- We currently use the C interface, which relies on an external library (e.g. Atlas).
+- There is an implementation of =HPL_dgemm= in HPL, but it seems to need either code from Fortran 77 or from VSIPL.
+- According to the [[http://www.netlib.org/benchmark/hpl/][HPL homepage]]:
+  #+begin_example
+  The HPL software package requires the availability on your system of an implementation of the Message Passing
+  Interface MPI (1.1 compliant). An implementation of either the Basic Linear Algebra Subprograms BLAS or the Vector
+  Signal Image Processing Library VSIPL is also needed. Machine-specific as well as generic implementations of MPI, the
+  BLAS and VSIPL are available for a large variety of systems.
+  #+end_example
+  So it seems hopeless to get rid of a BLAS library.
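For reference, the =dgemm= kernel that all of this revolves around computes C \leftarrow \alpha\cdot{}A\cdot{}B + \beta\cdot{}C. A naive version (purely illustrative; real BLAS implementations are far more sophisticated):

```python
# Naive dgemm: C <- alpha * A @ B + beta * C, with A (m x k), B (k x n), C (m x n).
def dgemm(alpha, A, B, beta, C):
    m, k, n = len(A), len(B), len(B[0])
    for i in range(m):
        for j in range(n):
            acc = sum(A[i][p] * B[p][j] for p in range(k))
            C[i][j] = alpha * acc + beta * C[i][j]
    return C

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
C = [[0.0, 0.0], [0.0, 0.0]]
print(dgemm(1.0, A, B, 0.0, C))  # [[19.0, 22.0], [43.0, 50.0]]
```

The triple loop performs m\cdot{}n\cdot{}k multiply-adds, which is why regressing the measured durations against =m*n*k= is the natural first model.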
+**** Idea: trace calls to =HPL_dgemm= (Arnaud’s idea) :SMPI:TRACING:HPL:
+- To do so, surround them with calls to trivial MPI primitives (e.g. =MPI_Initialized=). For instance:
+  #+begin_src c
+  #define HPL_dgemm(...) ({int simgrid_test; MPI_Initialized(&simgrid_test); cblas_dgemm(__VA_ARGS__);\
+  MPI_Initialized(&simgrid_test);})
+  #+end_src
+- Then, trace the execution (output in =/tmp/trace=):
+  #+begin_src sh
+  smpirun -trace -trace-file /tmp/trace --cfg=smpi/trace-call-location:1 --cfg=smpi/bcast:mpich\
+  --cfg=smpi/running-power:6217956542.969 --cfg=smpi/display-timing:yes --cfg=smpi/privatize-global-variables:yes -np 16\
+  -hostfile ../../../small_tests/hostfile_64.txt -platform ../../../small_tests/cluster_fat_tree_64.xml ./xhpl
+  #+end_src
+- Finally, dump this trace in CSV format:
+  #+begin_src sh
+  pj_dump --user-defined --ignore-incomplete-links trace > trace.dump
+  #+end_src
+- Did not work, no =MPI_Initialized= in the trace. In fact, this primitive is currently not traced. We could modify SMPI
+  to achieve this behavior, or use another MPI primitive that is already traced.
+*** 2017-03-02 Thursday
+**** Keep trying to trace calls to =HPL_dgemm= :SMPI:TRACING:HPL:
+- An MPI primitive is traced \Leftrightarrow the functions =new_pajePushState= and =new_pajePopState= are called (not sure, this is an
+  intuition).
+- These functions are not called by =MPI_Initialized= or =MPI_Wtime=.
+- They are called by =MPI_Test=, but only if the =MPI_Request= object passed as argument is non-null, so we would need to do a
+  fake asynchronous communication just before, which is probably not a good idea.
+- Anyway, it looks dirty to use an MPI primitive like this. Wouldn’t it be better to have a custom no-op primitive that
+  forces the introduction of a trace entry?
For instance, something like
+  #+begin_src c
+  SMPI_Trace {
+      HPL_dgemm();
+  }
+  #+end_src
+  or like
+  #+begin_src c
+  SMPI_BeginTrace();
+  HPL_dgemm();
+  SMPI_EndTrace();
+  #+end_src
+- Every MPI primitive is defined by a =#define= with a call to =smpi_trace_set_call_location= followed by a call to the
+  function. For instance:
+  #+begin_src c
+  #define MPI_Test(...) ({ smpi_trace_set_call_location(__FILE__,__LINE__); MPI_Test(__VA_ARGS__); })
+  #+end_src
+  However, this only records the file name and the line number; I do not think it dumps anything in the trace.
+**** Arnaud’s keynote: reproducible research :MEETING:
+- Intro: the article we had in the exam, “Is everything we eat associated with cancer?”.
+- In most articles, we can read formulae and trust results, but much less often reproduce the results.
+- Reproducibility crisis, several scandals with falsified results (intentionally or not).
+- Video: Brendan Gregg, shouting in the data center.
+**** Discussion with Arnaud :MEETING:
+- Regarding the matrix product:
+  + Compare the (tapered) fat-tree with a “perfect” topology (cluster with no latency and infinite bandwidth).
+  + Run it with larger matrices for the same number of processes. Do not aim at spending as much time in communication
+    as in computation; we want the communication time to become nearly negligible. In practice, users of a supercomputer
+    try to fill the memory of their nodes.
+- Regarding HPL:
+  + As discussed yesterday, we want to trace the calls to =HPL_dgemm= by putting calls to an MPI primitive just before and after.
+  + The short-term goal is to have an idea of the behavior of HPL regarding this function. Are there a lot of different
+    calls to =HPL_dgemm= coming from different locations? Do these calls always take the same amount of time (i.e. do we
+    always multiply matrices of the same size)?
+  + It seems that there is some variability in the duration of =HPL_dgemm= (to be verified with the trace).
If HPL really
+  uses the function to multiply matrices of different sizes, we cannot do something like
+  =SMPI_SAMPLE(){HPL_dgemm()}=; it will not be precise. What we could do, however, is generalize =SMPI_SAMPLE=: we could
+  parametrize it by a number representing the size of the problem that is sampled. If this size is always the same,
+  then we could do what we are doing now, simply take the average. If this size changes over time, we could do
+  something more elaborate for the prediction, like a linear regression.
+  + Using MPI functions like =MPI_Test= is not very “clean”, but we do not want to waste time on this currently, so we stick
+    with existing MPI primitives. We could try to change this in the future.
+  + It is always safe to call =smpi_process_index=. Thus, we could modify =PMPI_Test= to call the =TRACE_smpi_testing= functions
+    even when the given request is =NULL=.
+*** 2017-03-03 Friday
+**** Tracing calls to =HPL_dgemm= :SMPI:C:PYTHON:R:EXPERIMENTS:TRACING:PERFORMANCE:HPL:
+- Modification of the function =PMPI_Test= of Simgrid so that =MPI_Test= is traced even when the =MPI_Request= handle is
+  =NULL=. To do that, we need to get the rank of the process, with =smpi_process_index=. The value returned is always 0 in
+  this case. This is a problem, since we could not distinguish between calls to =MPI_Test= from different processes, thus
+  it would be impossible to measure time. Reverting the changes.
+- To get a non-null =MPI_Request=, did an =MPI_Isend= followed by an =MPI_Recv=:
+  #+begin_src c
+  #define HPL_dgemm(...) ({\
+      int my_rank, buff=0;\
+      MPI_Request request;\
+      MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);\
+      MPI_Isend(&buff, 1, MPI_INT, my_rank, 0, MPI_COMM_WORLD, &request);\
+      MPI_Recv(&buff, 1, MPI_INT, my_rank, 0, MPI_COMM_WORLD, NULL);\
+      MPI_Wait(&request, MPI_STATUS_IGNORE);\
+      cblas_dgemm(__VA_ARGS__);\
+      MPI_Isend(&buff, 1, MPI_INT, my_rank, 0, MPI_COMM_WORLD, &request);\
+      MPI_Recv(&buff, 1, MPI_INT, my_rank, 0, MPI_COMM_WORLD, NULL);\
+      MPI_Wait(&request, MPI_STATUS_IGNORE);\
+  })
+  #+end_src
+- Forget this. HPL was executed with only one process (=-np 16= but P and Q were 1 in =HPL.dat=). This is why we only had
+  rank =0= when giving =NULL= as the =MPI_Request=. Let’s revert this and use a simple =MPI_Test= with =NULL=.
+- Calls to =MPI_Test= seem to be correctly traced, but the post-processing of the trace with =pj_dump= crashes:
+  #+begin_example
+  terminate called after throwing an instance of 'std::out_of_range'
+    what(): vector::_M_range_check: __n (which is 4) >= this->size() (which is 4)
+  #+end_example
+  It also happened with the more complex piece of code that is shown above (with =MPI_Test= instead of =MPI_Wait=).
+  Reverting again, to use the bigger piece of code above.
+- Now, the call to =pj_dump= succeeds, and we can see calls to =MPI_Wait= in the trace.
+- The call to =smpirun= was:
+  #+begin_src sh
+  smpirun -trace -trace-file /tmp/trace --cfg=smpi/trace-call-location:1 --cfg=smpi/bcast:mpich\
+  --cfg=smpi/running-power:6217956542.969 --cfg=smpi/display-timing:yes --cfg=smpi/privatize-global-variables:yes -np 16\
+  -hostfile ../../../small_tests/hostfile_64.txt -platform ../../../small_tests/cluster_fat_tree_64.xml ./xhpl
+  #+end_src
+- Processing of the trace.
+ Clean the file: +#+begin_src sh +pj_dump --user-defined --ignore-incomplete-links /tmp/trace > /tmp/trace.csv +grep "State," /tmp/trace.csv | grep MPI_Wait | sed -e 's/()//' -e 's/MPI_STATE, //ig' -e 's/State, //ig' -e 's/rank-//' -e\ +'s/PMPI_/MPI_/' | grep MPI_ | tr 'A-Z' 'a-z' > /tmp/trace_processed.csv +#+end_src + +Clean the paths: +#+begin_src python +import re +reg = re.compile('((?:[^/])*)(?:/[a-zA-Z0-9_-]*)*((?:/hpl-2.2(?:/[a-zA-Z0-9_-]*)*).*)') +def process(in_file, out_file): + with open(in_file, 'r') as in_f: + with open(out_file, 'w') as out_f: + for line in in_f: + match = reg.match(line) + out_f.write('%s%s\n' % (match.group(1), match.group(2))) +process('/tmp/trace_processed.csv', '/tmp/trace_cleaned.csv') + #+end_src + + #+RESULTS: + +#+begin_src R :results output :session *R* :exports both +df <- read.csv("/tmp/trace_cleaned.csv", header=F, strip.white=T, sep=","); +names(df) = c("rank", "start", "end", "duration", "level", "state", "Filename", "Linenumber"); +head(df) +#+end_src + +#+RESULTS: +#+begin_example + rank start end duration level state +1 8 2.743960 2.743960 0 0 mpi_wait +2 8 2.744005 2.744005 0 0 mpi_wait +3 8 2.744005 2.744005 0 0 mpi_wait +4 8 2.744005 2.744005 0 0 mpi_wait +5 8 2.744005 2.744005 0 0 mpi_wait +6 8 2.744005 2.744005 0 0 mpi_wait + Filename Linenumber +1 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 +2 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 +3 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 +4 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 +5 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 +6 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 +#+end_example + +#+BEGIN_SRC R :results output :session *R* :exports both +duration_compute = function(df) { + ndf = data.frame(); + df = df[with(df,order(rank,start)),]; + #origin = unique(df$origin) + for(i in (sort(unique(df$rank)))) { + start = df[df$rank==i,]$start; + end = df[df$rank==i,]$end; + l = length(end); + end = c(0,end[1:(l-1)]); # Computation starts at time 0 + + startline = c(0, 
df[df$rank==i,]$Linenumber[1:(l-1)]); + startfile = c("", as.character(df[df$rank==i,]$Filename[1:(l-1)])); + endline = df[df$rank==i,]$Linenumber; + endfile = df[df$rank==i,]$Filename; + + ndf = rbind(ndf, data.frame(rank=i, start=end, end=start, + duration=start-end, state="Computing", + startline=startline, startfile=startfile, endline=endline, + endfile=endfile)); + } + ndf$idx = 1:length(ndf$duration) + ndf; +} +durations = duration_compute(df); +durations = durations[durations["startfile"] == "/hpl-2.2/src/pgesv/hpl_pdupdatett.c" & durations["endfile"] == "/hpl-2.2/src/pgesv/hpl_pdupdatett.c" & + durations["startline"] == durations["endline"],] +#+END_SRC + +#+RESULTS: + +#+begin_src R :results output :session *R* :exports both +library(dplyr) +options(width=200) +group_by(durations, startfile, startline, endfile, endline) %>% summarise(duration=sum(duration), count=n()) %>% as.data.frame() +#+end_src + +#+RESULTS: +: startfile startline endfile endline duration count +: 1 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 387 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 387 683.6677 659 +: 2 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 411 2115.8129 1977 + +#+begin_src R :file images/trace1_16.png :results value graphics :session *R* :exports both +library(ggplot2) +ggplot(durations, aes(x=idx, y=duration, color=factor(rank))) + + geom_point(shape=1) + ggtitle("Durations of HPL_dgemm") +#+end_src + +#+RESULTS: +[[file:images/trace1_16.png]] + + +#+begin_src R :file images/trace2_16.png :results value graphics :session *R* :exports both +ggplot(durations, aes(x=start, y=duration, color=factor(rank))) + + geom_point(shape=1) + ggtitle("Durations of HPL_dgemm") +#+end_src + +#+RESULTS: +[[file:images/trace2_16.png]] + +Same results, with four processes: + +[[file:images/trace1_4.png]] + +[[file:images/trace2_4.png]] + +**** Seminaire :MEETING: +On the asymptotic behavior of the price of anarchy, how bad is selfish routing in highly congested networks? 
+- For instance, cars on a road make their own routing decisions, hence the “selfish” routing. This is not optimal (in
+  comparison with centralized routing).
+**** Discussion with Arnaud & Christian :MEETING:
+- According to the plots, it is impossible to use =SMPI_SAMPLE= as is, since there are huge variations in the duration of =HPL_dgemm=.
+- The idea of a parametrized =SMPI_SAMPLE= is also not great. Every process makes consecutive calls to =HPL_dgemm=, each call
+  being shorter than the previous ones. So we would still have to compute expensive calls.
+- A long-term idea may be to have a “SimBLAS” library that simulates the calls to =HPL_dgemm= (and other BLAS
+  primitives). Christian will work on this.
+- Answers to all my questions from the paper readings.
+**** TODO New tasks [3/4]
+:LOGBOOK:
+- State "TODO" from [2017-03-03 Fri 17:43]
+:END:
+- [X] Do the linear regression by hand, off-line. Output the sizes of the matrices given to =HPL_dgemm= (with =printf=).
+- [X] Register on Grid'5000. Compile HPL on one Grid'5000 machine.
+- [X] Try to run HPL with a very large matrix, by using =SMPI_SHARED_MALLOC= (thus look at where all the allocations of
+  matrices are done).
+- [ ] Have a look at the code of Simgrid, in particular the routing in fat-trees.
+*** 2017-03-06 Monday
+**** Output the matrix sizes :C:PYTHON:TRACING:HPL:
+- Add the following before the relevant calls to =HPL_dgemm=:
+  #+begin_src c
+  printf("line=%d rank=%d m=%d n=%d k=%d\n", __LINE__+3, rank, mp, nn, jb);
+  #+end_src
+  Then, run HPL by redirecting =stdout= to =/tmp/output=.
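The off-line regression mentioned in the task list is a one-parameter least-squares fit of the durations against =m*n*k=; a toy version on synthetic data (hypothetical numbers, just to fix the method; the real fit is done in R on the measured trace):

```python
# Ordinary least squares for duration ~ a * (m*n*k) + b, no external libraries.
def fit(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Synthetic samples following duration = 1e-9 s per multiply-add (made-up rate).
sizes = [(4920, 4920, 120), (4800, 4800, 120), (600, 600, 120), (200, 240, 120)]
xs = [m * n * k for (m, n, k) in sizes]
ys = [1e-9 * x for x in xs]
a, b = fit(xs, ys)
print(f"slope={a:.3e} intercept={b:.3e}")
```

On exactly linear data the fit recovers the slope; on the real measurements the slope plays the role of the per-operation cost that a parametrized =SMPI_SAMPLE= would need.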
+- Process the output, to get a CSV file: +#+begin_src python +import re +import csv +reg = re.compile('line=([0-9]+) rank=([0-9]+) m=([0-9]+) n=([0-9]+) k=([0-9]+)') + +def process(in_file, out_file): + with open(in_file, 'r') as in_f: + with open(out_file, 'w') as out_f: + csv_writer = csv.writer(out_f) + csv_writer.writerow(('line', 'rank', 'n', 'm', 'k')) + for line in in_f: + match = reg.match(line) + if match is not None: + csv_writer.writerow(tuple(match.group(i) for i in range(1,6))) +process('/tmp/output', '/tmp/sizes.csv') +#+end_src + +**** Merge the sizes with the durations :R:EXPERIMENTS:PERFORMANCE: +- Run =smpirun= as stated above, then process the output and the trace as before. +- Process the data: +#+begin_src R :results output :session *R* :exports both +df <- read.csv("/tmp/trace_cleaned.csv", header=F, strip.white=T, sep=","); +names(df) = c("rank", "start", "end", "duration", "level", "state", "Filename", "Linenumber"); +head(df) +#+end_src + +#+RESULTS: +: rank start end duration level state Filename Linenumber +: 1 8 2.743960 2.743960 0 0 mpi_wait /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 +: 2 8 2.744005 2.744005 0 0 mpi_wait /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 +: 3 8 2.744005 2.744005 0 0 mpi_wait /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 +: 4 8 2.744005 2.744005 0 0 mpi_wait /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 +: 5 8 2.744005 2.744005 0 0 mpi_wait /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 +: 6 8 2.744005 2.744005 0 0 mpi_wait /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 + +#+begin_src R :results output :session *R* :exports both +sizes <- read.csv("/tmp/sizes.csv"); +head(sizes) +#+end_src + +#+RESULTS: +: line rank n m k +: 1 411 12 4920 4920 120 +: 2 387 0 4920 4920 120 +: 3 411 8 5000 4920 120 +: 4 411 4 5040 4920 120 +: 5 411 13 4920 5040 120 +: 6 387 1 4920 5040 120 + +#+begin_src R :results output :session *R* :exports both +durations = duration_compute(df); # same function as above +durations = durations[durations["startfile"] == 
"/hpl-2.2/src/pgesv/hpl_pdupdatett.c" & durations["endfile"] == "/hpl-2.2/src/pgesv/hpl_pdupdatett.c" & + durations["startline"] == durations["endline"],] +head(durations) +#+end_src + +#+RESULTS: +: rank start end duration state startline startfile endline endfile idx +: 481 0 3.153899 6.271075 3.117176 Computing 387 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 387 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 481 +: 486 0 7.047247 10.063367 3.016120 Computing 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 486 +: 491 0 10.648367 13.716045 3.067678 Computing 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 491 +: 496 0 14.104534 17.155418 3.050884 Computing 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 496 +: 977 0 17.557080 20.430869 2.873789 Computing 387 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 387 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 977 +: 982 0 21.104026 24.044767 2.940741 Computing 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 982 + + +#+begin_src R :results output :session *R* :exports both +insert_sizes = function(durations, sizes) { + stopifnot(nrow(durations)==nrow(sizes)) + ndf = data.frame(); + for(i in (sort(unique(durations$rank)))) { + tmp_dur = durations[durations$rank == i,] + tmp_sizes = sizes[sizes$rank == i,] + stopifnot(nrow(tmp_dur) == nrow(tmp_sizes)) + stopifnot(tmp_dur$startline == tmp_sizes$line) + storage.mode(tmp_sizes$m) <- "double" # avoiding integer overflow when taking the product + storage.mode(tmp_sizes$n) <- "double" + storage.mode(tmp_sizes$k) <- "double" + tmp_dur$m = tmp_sizes$m + tmp_dur$n = tmp_sizes$n + tmp_dur$k = tmp_sizes$k + tmp_dur$size_product = tmp_sizes$m * tmp_sizes$n * tmp_sizes$k + ndf = rbind(ndf, tmp_dur) + } + return(ndf); +} +#+end_src + +#+RESULTS: + +#+begin_src R :results output :session *R* :exports both +result = insert_sizes(durations, sizes) +head(result) +#+end_src + +#+RESULTS: +: rank start end 
duration state startline startfile endline endfile idx m n k size_product +: 481 0 3.153899 6.271075 3.117176 Computing 387 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 387 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 481 4920 4920 120 2904768000 +: 486 0 7.047247 10.063367 3.016120 Computing 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 486 4920 4920 120 2904768000 +: 491 0 10.648367 13.716045 3.067678 Computing 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 491 4920 4920 120 2904768000 +: 496 0 14.104534 17.155418 3.050884 Computing 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 496 4920 4920 120 2904768000 +: 977 0 17.557080 20.430869 2.873789 Computing 387 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 387 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 977 4800 4800 120 2764800000 +: 982 0 21.104026 24.044767 2.940741 Computing 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 982 4800 4800 120 2764800000 + +**** Plot and linear regression :R:EXPERIMENTS:PERFORMANCE: + +#+begin_src R :file images/trace3_16.png :results value graphics :results output :session *R* :exports both +library(ggplot2) +ggplot(result, aes(x=size_product, y=duration, color=factor(rank))) + + geom_point(shape=1) + ggtitle("Durations of HPL_dgemm as a function of the sizes") +#+end_src + +#+RESULTS: +[[file:images/trace3_16.png]] + + +#+begin_src R :results output :session *R* :exports both +reg <- lm(duration~I(m*n*k), data=result) +summary(reg) +#+end_src + +#+RESULTS: +#+begin_example + +Call: +lm(formula = duration ~ I(m * n * k), data = result) + +Residuals: + Min 1Q Median 3Q Max +-0.10066 -0.01700 -0.00085 0.00351 0.57745 + +Coefficients: + Estimate Std. Error t value Pr(>|t|) +(Intercept) -2.476e-03 1.235e-03 -2.005 0.0451 * +I(m * n * k) 1.062e-09 9.220e-13 1151.470 <2e-16 *** +--- +Signif. 
codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 + +Residual standard error: 0.04205 on 2634 degrees of freedom +Multiple R-squared: 0.998, Adjusted R-squared: 0.998 +F-statistic: 1.326e+06 on 1 and 2634 DF, p-value: < 2.2e-16 +#+end_example + +#+begin_src R :file images/reg_16.png :results value graphics :results output :session *R* :exports both +layout(matrix(c(1,2,3,4),2,2)) +plot(reg) +#+end_src + +#+RESULTS: +[[file:images/reg_16.png]] +**** Comments on the linear regression :EXPERIMENTS: +- The plot of the duration as a function of =m*n*k= looks great. Maybe a bit of heteroscedasticity, but not so much. It is + clearly linear. +- The linear regression however is not so good. We have a high R-squared (0.998), but the plots look bad. The + residual-vs-fitted plot shows that the results are clearly heteroscedastic. The normal-QQ shows that they are not + linear (in =m*n*k=) but rather exponential. +- The plot of the linear regression seems to contradict the first plot, this is strange. +**** Investigating the linear regression :C: +- We can print other relevant parameters of =HPL_dgemm=: + #+begin_src c + printf("line=%d rank=%d m=%d n=%d k=%d a=%f lead_A=%d lead_B=%d lead_C=%d\n", __LINE__+3, + rank, mp, nn, jb, -HPL_rone, ldl2, LDU, lda); + #+end_src + Here, =a= is a scaling factor applied to the matrix, =lead_A=, =lead_B= and =lead_C= are the leading dimensions of matrices =A=, =B= and + =C=. 
+
+  A sample of what we get is (only some lines are reported here):
+  #+begin_example
+  line=411 rank=2 m=2240 n=2160 k=120 a=-1.000000 lead_A=2480 lead_B=2160 lead_C=2480
+  line=387 rank=3 m=1640 n=1641 k=120 a=-1.000000 lead_A=2480 lead_B=1641 lead_C=2480
+  line=387 rank=2 m=680 n=720 k=120 a=-1.000000 lead_A=680 lead_B=720 lead_C=2480
+  line=387 rank=2 m=200 n=240 k=120 a=-1.000000 lead_A=200 lead_B=240 lead_C=2480
+  line=411 rank=1 m=480 n=441 k=120 a=-1.000000 lead_A=2520 lead_B=441 lead_C=2520
+  #+end_example
+  This trend seems to roughly repeat: =a= is always -1, =lead_C= is always either 2480 or 2520. For small enough values,
+  =lead_A= is equal to =m= and =lead_B= is equal to =n=. For larger values, they are not equal anymore, but all are large.
+  However, there are still some noticeable variations. For instance:
+  #+begin_example
+  line=387 rank=0 m=600 n=600 k=120 a=-1.000000 lead_A=2520 lead_B=600 lead_C=2520
+  line=411 rank=0 m=600 n=600 k=120 a=-1.000000 lead_A=600 lead_B=600 lead_C=2520
+  #+end_example
+  In this last example, all parameters are equal, except =lead_A= which is more than four times larger in one case.
+- A small leading dimension means better locality and thus better performance. These differences in the leading
+  dimensions could explain the non-linearity and the heteroscedasticity.
+*** 2017-03-07 Tuesday
+**** And the leading dimensions? :C:PYTHON:R:EXPERIMENTS:TRACING:PERFORMANCE:
+- We have this =printf= before the calls to =HPL_dgemm= (same as before, except that =a= is removed):
+  #+begin_src c
+  printf("line=%d rank=%d m=%d n=%d k=%d lead_A=%d lead_B=%d lead_C=%d\n", __LINE__+3,
+         rank, mp, nn, jb, ldl2, LDU, lda);
+  #+end_src
+- The trace is in the file =/tmp/trace=; we process it as before. The output is redirected to the file =/tmp/output=.
+- Processing of the output: +#+begin_src python +import re +import csv +reg = re.compile('line=([0-9]+) rank=([0-9]+) m=([0-9]+) n=([0-9]+) k=([0-9]+) lead_A=([0-9]+) lead_B=([0-9]+) lead_C=([0-9]+)') + +def process(in_file, out_file): + with open(in_file, 'r') as in_f: + with open(out_file, 'w') as out_f: + csv_writer = csv.writer(out_f) + csv_writer.writerow(('line', 'rank', 'n', 'm', 'k', 'lead_A', 'lead_B', 'lead_C')) + for line in in_f: + match = reg.match(line) + if match is not None: + csv_writer.writerow(tuple(match.group(i) for i in range(1,9))) +process('/tmp/output', '/tmp/sizes.csv') +#+end_src + +We have the =durations= dataframe, obtained as before: + +#+begin_src R :results output :session *R* :exports both +head(durations) +#+end_src + +#+RESULTS: +: rank start end duration state startline startfile endline endfile idx +: 481 0 4.111176 7.158459 3.047283 Computing 387 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 387 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 481 +: 486 0 7.827329 10.848572 3.021243 Computing 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 486 +: 491 0 11.411456 14.445789 3.034333 Computing 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 491 +: 496 0 14.837377 17.868118 3.030741 Computing 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 496 +: 977 0 18.268679 21.142146 2.873467 Computing 387 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 387 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 977 +: 982 0 21.809954 24.699182 2.889228 Computing 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 982 + +Then we get the =sizes= dataframe: +#+begin_src R :results output :session *R* :exports both +sizes <- read.csv("/tmp/sizes.csv"); +head(sizes) +#+end_src + +#+RESULTS: +: line rank n m k lead_A lead_B lead_C +: 1 387 0 4920 4920 120 5040 4920 5040 +: 2 411 8 5000 4920 120 5000 4920 5000 +: 3 411 4 5040 4920 120 5040 4920 5040 +: 4 411 12 4920 4920 120 4920 
4920 4920 +: 5 387 1 4920 5040 120 4920 5040 5040 +: 6 411 5 5040 5040 120 5040 5040 5040 + +#+begin_src R :results output :session *R* :exports both +insert_sizes = function(durations, sizes) { + stopifnot(nrow(durations)==nrow(sizes)) + ndf = data.frame(); + for(i in (sort(unique(durations$rank)))) { + tmp_dur = durations[durations$rank == i,] + tmp_sizes = sizes[sizes$rank == i,] + stopifnot(nrow(tmp_dur) == nrow(tmp_sizes)) + stopifnot(tmp_dur$startline == tmp_sizes$line) + storage.mode(tmp_sizes$m) <- "double" # avoiding integer overflow when taking the product + storage.mode(tmp_sizes$n) <- "double" + storage.mode(tmp_sizes$k) <- "double" + storage.mode(tmp_sizes$lead_A) <- "double" + storage.mode(tmp_sizes$lead_B) <- "double" + storage.mode(tmp_sizes$lead_C) <- "double" + tmp_dur$m = tmp_sizes$m + tmp_dur$n = tmp_sizes$n + tmp_dur$k = tmp_sizes$k + tmp_dur$lead_A = tmp_sizes$lead_A + tmp_dur$lead_B = tmp_sizes$lead_B + tmp_dur$lead_C = tmp_sizes$lead_C + tmp_dur$lead_product = tmp_sizes$lead_A * tmp_sizes$lead_B * tmp_sizes$lead_C + tmp_dur$size_product = tmp_sizes$m * tmp_sizes$n * tmp_sizes$k + tmp_dur$ratio = tmp_dur$lead_product/tmp_dur$size_product + ndf = rbind(ndf, tmp_dur) + } + return(ndf); +} +#+end_src + +#+begin_src R :results output :session *R* :exports both +result = insert_sizes(durations, sizes) +head(result) +#+end_src + +#+RESULTS: +#+begin_example + rank start end duration state startline startfile endline endfile idx m n k lead_A lead_B lead_C lead_product +481 0 4.111176 7.158459 3.047283 Computing 387 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 387 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 481 4920 4920 120 5040 4920 5040 124975872000 +486 0 7.827329 10.848572 3.021243 Computing 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 486 4920 4920 120 4920 4920 5040 122000256000 +491 0 11.411456 14.445789 3.034333 Computing 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 491 4920 4920 120 4920 
4920 5040 122000256000 +496 0 14.837377 17.868118 3.030741 Computing 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 496 4920 4920 120 4920 4920 5040 122000256000 +977 0 18.268679 21.142146 2.873467 Computing 387 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 387 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 977 4800 4800 120 5040 4800 5040 121927680000 +982 0 21.809954 24.699182 2.889228 Computing 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 411 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 982 4800 4800 120 4800 4800 5040 116121600000 + size_product ratio +481 2904768000 43.02439 +486 2904768000 42.00000 +491 2904768000 42.00000 +496 2904768000 42.00000 +977 2764800000 44.10000 +982 2764800000 42.00000 +#+end_example + +#+begin_src R :file images/trace4_16.png :results value graphics :results output :session *R* :exports both +library(ggplot2) +ggplot(result, aes(x=lead_product, y=duration, color=factor(rank))) + + geom_point(shape=1) + ggtitle("Durations of HPL_dgemm as a function of the leading dimensions") +#+end_src + +#+RESULTS: +[[file:images/trace4_16.png]] + +#+begin_src R :file images/trace5_16.png :results value graphics :results output :session *R* :exports both +library(ggplot2) +ggplot(result, aes(x=lead_product, y=size_product, color=factor(rank))) + + geom_point(shape=1) + ggtitle("Size of the matrices of HPL_dgemm as a function of the leading dimensions") +#+end_src + +#+RESULTS: +[[file:images/trace5_16.png]] + +#+begin_src R :file images/trace6_16.png :results value graphics :results output :session *R* :exports both +ggplot(result, aes(x=idx, y=ratio, color=factor(rank))) + + geom_point(shape=1) + ggtitle("Ratios of the leading dimensions by the sizes over time") +#+end_src + +#+RESULTS: +[[file:images/trace6_16.png]] + +#+begin_src R :results output :session *R* :exports both +reg <- lm(duration~ I(m*n*k) + lead_A+lead_B+lead_C, data=result) +summary(reg) +#+end_src + +#+RESULTS: +#+begin_example + +Call: +lm(formula = duration ~ I(m * n * k) + 
lead_A + lead_B + lead_C,
+    data = result)
+
+Residuals:
+     Min       1Q   Median       3Q      Max
+-0.09477 -0.01804 -0.00439  0.00850  1.39992
+
+Coefficients:
+               Estimate Std. Error t value Pr(>|t|)
+(Intercept)  -7.741e-01  9.915e-02  -7.807 8.37e-15 ***
+I(m * n * k)  1.069e-09  4.431e-12 241.217  < 2e-16 ***
+lead_A        2.965e-06  7.744e-07   3.828 0.000132 ***
+lead_B       -7.048e-06  2.799e-06  -2.518 0.011863 *
+lead_C        1.547e-04  1.981e-05   7.810 8.16e-15 ***
+---
+Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+
+Residual standard error: 0.04981 on 2631 degrees of freedom
+Multiple R-squared: 0.9972, Adjusted R-squared: 0.9972
+F-statistic: 2.361e+05 on 4 and 2631 DF, p-value: < 2.2e-16
+#+end_example
+
+#+begin_src R :file images/reg2_16.png :results value graphics :results output :session *R* :exports both
+layout(matrix(c(1,2,3,4),2,2))
+plot(reg)
+#+end_src
+
+#+RESULTS:
+[[file:images/reg2_16.png]]
+
+**** Discussion about the leading dimensions :EXPERIMENTS:
+- In the three previous plots, we see that the leading dimensions have two modes, which are directly observable in the
+  durations of =HPL_dgemm=.
+  + One of the modes seems to be linear in the sizes: we observe a straight line.
+  + The other mode is clearly non-linear. Maybe quadratic? Exponential?
+- The linear regression shows that the variables =lead_A=, =lead_B= and =lead_C= have a non-negligible impact on
+  performance, albeit smaller than the sizes. We still have terrible plots; adding parameters to the model did not
+  change anything.
+- This could explain the “bad” plots of the linear regression.
+**** Performance analysis of =dgemm= outside of HPL :C:EXPERIMENTS:PERFORMANCE:
+- In the above analysis, the raw results come from a trace of HPL. Thus, we cannot control the sizes and/or leading
+  dimensions. We only have observational data, not experimental data.
+- To fix this, let’s write a short C code, called =dgemm_test=, that calls =cblas_dgemm= (the function to which =HPL_dgemm= is aliased).
+- Currently, this code takes six parameters as arguments: the three sizes and the three leading dimensions. Be careful,
+  the meaning of these sizes and leading dimensions changes depending on how =dgemm= is called: =CblasColMajor= or
+  =CblasRowMajor=, and =CblasNoTrans= or =CblasTrans=. In the current code, these are fixed to be the same as in HPL.
+- Then, a Python script (called =runner.py=) samples random sizes and leading dimensions (taking care of the constraints
+  between the sizes and dimensions) and calls =dgemm_test=. It then writes the results in a CSV file.
+- Quick analysis of these results in R:
+  + We get plots of the same shape (both the plot of the raw results and the plot of the linear regression).
+  + The call to =dgemm= is 10 times faster in =dgemm_test= than in HPL. We need to find out why. First, what is the time obtained in the HPL
+    traces? Is it virtual or real?
+  + As with HPL, the linear regression shows that the ratio has a significant impact, but lower than the
+    sizes.
+*** 2017-03-08 Wednesday
+**** Keep looking at =dgemm= outside of HPL :C:EXPERIMENTS:PERFORMANCE:
+- Use =dgemm_test= at commit =0455edcb0af1eb673725959d216137997fc40fd2=. Run 1000 experiments.
+- Here, the variable =product= is sampled randomly and uniformly in [1, 2000^3]. Then, the three sizes are set to \lfloor
+  product^(1/3) \rfloor.
+- The leading dimensions are equal to the sizes.
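+The sampling procedure described above can be sketched in Python (a hypothetical
+reconstruction in the spirit of =runner.py=, not the actual script; the name
+=dgemm_test= and the argument order are taken from the description above; here the
+leading dimensions are tied to the sizes, as in these runs):
+
+#+begin_src python
+import random
+
+def sample_size(rng, max_dim=2000):
+    # Sample the product of the three sizes uniformly in [1, max_dim^3],
+    # then take the integer cube root so that m = n = k.
+    product = rng.randint(1, max_dim ** 3)
+    size = round(product ** (1 / 3))
+    # Guard against floating-point rounding of the cube root.
+    while size ** 3 > product:
+        size -= 1
+    return size
+
+rng = random.Random(42)
+size = sample_size(rng)
+# dgemm_test takes the three sizes then the three leading dimensions;
+# the leading dimensions equal the sizes, hence six identical arguments.
+cmd = ["./dgemm_test"] + [str(size)] * 6
+#+end_src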
+- Analysis in R: + #+begin_src R :results output :session *R* :exports both + result <- read.csv('~/tmp/3/result.csv') + head(result) + #+end_src + + #+RESULTS: + : time size_product lead_product ratio m n k lead_A lead_B lead_C + : 1 0.160235 843908625 843908625 1 945 945 945 945 945 945 + : 2 0.719003 4298942376 4298942376 1 1626 1626 1626 1626 1626 1626 + : 3 0.783674 4549540393 4549540393 1 1657 1657 1657 1657 1657 1657 + : 4 0.472595 2656741625 2656741625 1 1385 1385 1385 1385 1385 1385 + : 5 0.319670 1874516337 1874516337 1 1233 1233 1233 1233 1233 1233 + : 6 1.131936 6676532387 6676532387 1 1883 1883 1883 1883 1883 1883 + + #+begin_src R :file images/dgemm_test_raw.png :results value graphics :results output :session *R* :exports both + library(ggplot2) + ggplot(result, aes(x=size_product, y=time)) + + geom_point(shape=1) + ggtitle("Durations of cblas_dgemm as a function of the sizes product.") + #+end_src + + #+RESULTS: + [[file:images/dgemm_test_raw.png]] + + #+begin_src R :results output :session *R* :exports both + reg <- lm(time ~ size_product, result) + summary(reg) + #+end_src + + #+RESULTS: + #+begin_example + + Call: + lm(formula = time ~ size_product, data = result) + + Residuals: + Min 1Q Median 3Q Max + -0.027295 -0.008640 -0.002781 0.005900 0.229935 + + Coefficients: + Estimate Std. Error t value Pr(>|t|) + (Intercept) 1.172e-02 1.087e-03 10.78 <2e-16 *** + size_product 1.666e-10 2.353e-13 707.87 <2e-16 *** + --- + Signif. 
codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+
+  Residual standard error: 0.01716 on 998 degrees of freedom
+  Multiple R-squared: 0.998, Adjusted R-squared: 0.998
+  F-statistic: 5.011e+05 on 1 and 998 DF, p-value: < 2.2e-16
+  #+end_example
+
+  #+begin_src R :file images/dgemm_test_lm.png :results value graphics :results output :session *R* :exports both
+  layout(matrix(c(1,2,3,4),2,2))
+  plot(reg)
+  #+end_src
+
+  #+RESULTS:
+  [[file:images/dgemm_test_lm.png]]
+
+- In the above plots, we can observe trends similar to those with =HPL_dgemm=, albeit less pronounced.
+  The data is slightly heteroscedastic and the residuals do not exactly follow a normal distribution. It seems that
+  there are several “outliers” where =dgemm= takes significantly more time, i.e. the distribution of the residuals is
+  skewed to the “right”.
+- For instance, entry n°208 was obtained with sizes of 1503 and took 0.807207 seconds.
+  Let’s run this experiment again 100 times (with the command =./dgemm_test 1503 1503 1503 1503 1503 1503=). The min and
+  the max over all observed times are respectively 0.5813 and 0.6494. The mean is 0.5897 and the standard deviation
+  is 0.0082.
+- Thus, it seems that this point is a real outlier. We can suppose that this is also true for the other similar points.
+- This outlier is 0.2 seconds larger than the average we got and 0.15 seconds larger than the max. This seems very
+  large. Maybe the process had a “bad” context switch (e.g. if it was moved to another core, but the execution time is
+  not that high, so this seems unlikely).
+- There seems to be a pattern: the outliers appear to happen at regular intervals.
+  #+begin_src R :results output :session *R* :exports both
+  x = result[abs(result$time - (1.666e-10*result$size_product + 1.172e-2)) > 5e-2, ]
+  x$id = which(abs(result$time - (1.666e-10*result$size_product + 1.172e-2)) > 5e-2)
+  x$prev_id = c(0, x$id[1:(length(x$id)-1)])
+  x$id_diff = x$id - x$prev_id
+  x
+  #+end_src
+
+  #+RESULTS:
+  #+begin_example
+          time size_product lead_product ratio    m    n    k lead_A lead_B
+  37  0.674633   3602686437   3602686437     1 1533 1533 1533   1533   1533
+  38  0.409866   2053225511   2053225511     1 1271 1271 1271   1271   1271
+  207 1.295097   7055792632   7055792632     1 1918 1918 1918   1918   1918
+  208 0.807207   3395290527   3395290527     1 1503 1503 1503   1503   1503
+  381 1.079795   5535839609   5535839609     1 1769 1769 1769   1769   1769
+  558 0.453775   1869959168   1869959168     1 1232 1232 1232   1232   1232
+  657 0.917557   4699421875   4699421875     1 1675 1675 1675   1675   1675
+  748 1.233466   6414120712   6414120712     1 1858 1858 1858   1858   1858
+  753 0.708934   3884701248   3884701248     1 1572 1572 1572   1572   1572
+  914 1.337868   7166730752   7166730752     1 1928 1928 1928   1928   1928
+      lead_C  id prev_id id_diff
+  37    1533  37       0      37
+  38    1271  38      37       1
+  207   1918 207      38     169
+  208   1503 208     207       1
+  381   1769 381     208     173
+  558   1232 558     381     177
+  657   1675 657     558      99
+  748   1858 748     657      91
+  753   1572 753     748       5
+  914   1928 914     753     161
+  #+end_example
+
+  We see here that the differences between the ids do not seem to be uniformly random. Some of them are small (1, 5),
+  others are large (161, 169, 173, 177), or in between (37, 91, 99).
+- This pattern has been reproduced by running 1000 experiments with a size of 1503. Among the results, 26 of them are
+  larger than 0.7 (mean of 0.6024, standard deviation of 0.0249, min of 0.5811, max of 0.8363).
+  Here is the list of the differences between the indices of these elements. The list has been sorted:
+  #+begin_example
+  [1, 1, 1, 1, 1, 1, 2, 4, 4, 5, 7, 7, 10, 15, 20, 25, 28, 32, 42, 42, 43, 53, 108, 200, 201]
+  #+end_example
+  A lot of them are small or medium, and two are much larger.
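+The gap structure can be cross-checked in Python from the outlier indices listed
+in the table above (a quick verification, not part of the original analysis):
+
+#+begin_src python
+# Row indices flagged as outliers by the residual threshold above.
+outlier_ids = [37, 38, 207, 208, 381, 558, 657, 748, 753, 914]
+
+# Differences between consecutive outlier indices: the id_diff column,
+# without the artificial first gap against prev_id = 0.
+gaps = [b - a for a, b in zip(outlier_ids, outlier_ids[1:])]
+print(gaps)  # [1, 169, 1, 173, 177, 99, 91, 5, 161]
+#+end_src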
+**** Time prediction in HPL :C:PYTHON:R:EXPERIMENTS:PERFORMANCE:HPL:
+- Let’s try to predict the time that will be spent in =HPL_dgemm= and compare it with the real time.
+  The aim is then to have a cheap SimBLAS: replacing calls to the function by a sleep of the predicted time.
+  We have this =printf= before the calls to =HPL_dgemm=:
+  #+begin_src c
+  printf("line=%d rank=%d m=%d n=%d k=%d lead_A=%d lead_B=%d lead_C=%d expected_time=%f\n",
+         __LINE__+3, rank, mp, nn, jb, ldl2, LDU, lda, expected_time);
+  #+end_src
+  We do as before: we run HPL with P=Q=4 and N=20000. The trace is dumped in =/tmp/trace= and =stdout= is redirected to
+  =/tmp/output=.
+- Processing of the output:
+#+begin_src python
+import re
+import csv
+reg = re.compile('line=([0-9]+) rank=([0-9]+) m=([0-9]+) n=([0-9]+) k=([0-9]+) lead_A=([0-9]+) lead_B=([0-9]+) lead_C=([0-9]+) expected_time=(-?[0-9]+\.[0-9]+)')
+
+def process(in_file, out_file):
+    with open(in_file, 'r') as in_f:
+        with open(out_file, 'w') as out_f:
+            csv_writer = csv.writer(out_f)
+            csv_writer.writerow(('line', 'rank', 'n', 'm', 'k', 'lead_A', 'lead_B', 'lead_C', 'expected_time'))
+            for line in in_f:
+                match = reg.match(line)
+                if match is not None:
+                    csv_writer.writerow(tuple(match.group(i) for i in range(1,10)))
+process('/tmp/output', '/tmp/sizes.csv')
+#+end_src
+
+#+RESULTS:
+: None
+- We process the trace as before and get a dataframe =durations=.
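+As a quick sanity check (not from the original notebook), the parsing regex can
+be exercised on a standalone line in the exact format produced by the =printf=
+(the sample values are taken from the first row of the =sizes= table below):
+
+#+begin_src python
+import re
+
+# Same pattern as in the processing script above, with the decimal point escaped.
+reg = re.compile('line=([0-9]+) rank=([0-9]+) m=([0-9]+) n=([0-9]+) k=([0-9]+) '
+                 'lead_A=([0-9]+) lead_B=([0-9]+) lead_C=([0-9]+) '
+                 'expected_time=(-?[0-9]+\\.[0-9]+)')
+
+sample = ('line=413 rank=8 m=5000 n=4920 k=120 '
+          'lead_A=5000 lead_B=4920 lead_C=5000 expected_time=3.132548')
+match = reg.match(sample)
+assert match is not None
+assert match.group(1) == '413' and match.group(9) == '3.132548'
+#+end_src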
+ #+begin_src R :results output :session *R* :exports both + head(durations) + #+end_src + + #+RESULTS: + : rank start end duration state startline startfile endline endfile idx + : 481 0 3.480994 6.54468 3.063686 Computing 388 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 388 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 481 + : 486 0 7.225255 10.24889 3.023633 Computing 413 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 413 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 486 + : 491 0 10.803780 13.82799 3.024215 Computing 413 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 413 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 491 + : 496 0 14.230774 17.26467 3.033897 Computing 413 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 413 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 496 + : 977 0 17.676746 20.58197 2.905229 Computing 388 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 388 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 977 + : 982 0 21.258337 24.16961 2.911277 Computing 413 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 413 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 982 + + #+begin_src R :results output :session *R* :exports both + sizes <- read.csv("/tmp/sizes.csv"); + head(sizes) + #+end_src + + #+RESULTS: + : line rank n m k lead_A lead_B lead_C expected_time + : 1 413 8 5000 4920 120 5000 4920 5000 3.132548 + : 2 413 12 4920 4920 120 4920 4920 4920 3.082388 + : 3 413 4 5040 4920 120 5040 4920 5040 3.157628 + : 4 388 0 4920 4920 120 5040 4920 5040 3.082388 + : 5 413 5 5040 5040 120 5040 5040 5040 3.234704 + : 6 413 9 5000 5040 120 5000 5040 5000 3.209012 + + #+begin_src R :results output :session *R* :exports both + insert_sizes = function(durations, sizes) { + stopifnot(nrow(durations)==nrow(sizes)) + ndf = data.frame(); + for(i in (sort(unique(durations$rank)))) { + tmp_dur = durations[durations$rank == i,] + tmp_sizes = sizes[sizes$rank == i,] + stopifnot(nrow(tmp_dur) == nrow(tmp_sizes)) + stopifnot(tmp_dur$startline == tmp_sizes$line) + storage.mode(tmp_sizes$m) <- "double" # avoiding integer overflow when taking the product + storage.mode(tmp_sizes$n) <- "double" + 
storage.mode(tmp_sizes$k) <- "double" + storage.mode(tmp_sizes$lead_A) <- "double" + storage.mode(tmp_sizes$lead_B) <- "double" + storage.mode(tmp_sizes$lead_C) <- "double" + tmp_dur$m = tmp_sizes$m + tmp_dur$n = tmp_sizes$n + tmp_dur$k = tmp_sizes$k + tmp_dur$lead_A = tmp_sizes$lead_A + tmp_dur$lead_B = tmp_sizes$lead_B + tmp_dur$lead_C = tmp_sizes$lead_C + tmp_dur$lead_product = tmp_sizes$lead_A * tmp_sizes$lead_B * tmp_sizes$lead_C + tmp_dur$size_product = tmp_sizes$m * tmp_sizes$n * tmp_sizes$k + tmp_dur$ratio = tmp_dur$lead_product/tmp_dur$size_product + tmp_dur$expected_time = tmp_sizes$expected_time + tmp_dur$absolute_time_diff = tmp_dur$expected_time - tmp_dur$duration + tmp_dur$relative_time_diff = (tmp_dur$expected_time - tmp_dur$duration)/tmp_dur$expected_time + ndf = rbind(ndf, tmp_dur) + } + return(ndf); + } + #+end_src + + #+begin_src R :results output :session *R* :exports both + result = insert_sizes(durations, sizes) + head(result) + #+end_src + + #+RESULTS: + #+begin_example + rank start end duration state startline startfile endline endfile idx m n k lead_A lead_B lead_C lead_product + 481 0 3.480994 6.54468 3.063686 Computing 388 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 388 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 481 4920 4920 120 5040 4920 5040 124975872000 + 486 0 7.225255 10.24889 3.023633 Computing 413 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 413 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 486 4920 4920 120 4920 4920 5040 122000256000 + 491 0 10.803780 13.82799 3.024215 Computing 413 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 413 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 491 4920 4920 120 4920 4920 5040 122000256000 + 496 0 14.230774 17.26467 3.033897 Computing 413 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 413 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 496 4920 4920 120 4920 4920 5040 122000256000 + 977 0 17.676746 20.58197 2.905229 Computing 388 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 388 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 977 4800 4800 120 5040 4800 5040 121927680000 + 982 0 21.258337 
24.16961 2.911277 Computing 413 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 413 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 982 4800 4800 120 4800 4800 5040 116121600000 + size_product ratio expected_time absolute_time_diff relative_time_diff + 481 2904768000 43.02439 3.082388 0.018702 0.006067374 + 486 2904768000 42.00000 3.082388 0.058755 0.019061520 + 491 2904768000 42.00000 3.082388 0.058173 0.018872705 + 496 2904768000 42.00000 3.082388 0.048491 0.015731634 + 977 2764800000 44.10000 2.933742 0.028513 0.009718987 + 982 2764800000 42.00000 2.933742 0.022465 0.007657456 +#+end_example + +#+begin_src R :file images/trace7_16.png :results value graphics :session *R* :exports both +ggplot(result, aes(x=idx, y=absolute_time_diff, color=factor(rank))) + + geom_point(shape=1) + ggtitle("Absolute difference between the expected time and the real time") +#+end_src + +#+RESULTS: +[[file:images/trace7_16.png]] + +#+begin_src R :file images/trace8_16.png :results value graphics :session *R* :exports both +ggplot(result, aes(x=start, y=absolute_time_diff, color=factor(rank))) + + geom_point(shape=1) + ggtitle("Absolute difference between the expected time and the real time") +#+end_src + +#+RESULTS: +[[file:images/trace8_16.png]] + +#+begin_src R :file images/trace9_16.png :results value graphics :session *R* :exports both +ggplot(result, aes(x=start, y=relative_time_diff, color=factor(rank))) + + geom_point(shape=1) + ggtitle("Relative difference between the expected time and the real time") +#+end_src + +#+RESULTS: +[[file:images/trace9_16.png]] + +#+begin_src R :file images/trace10_16.png :results value graphics :session *R* :exports both +ggplot(result[result$start < 200,], aes(x=start, y=relative_time_diff, color=factor(rank))) + + geom_point(shape=1) + ggtitle("Relative difference between the expected time and the real time\n“Large enough” matrices") +#+end_src + +#+RESULTS: +[[file:images/trace10_16.png]] + + #+begin_src R :results output :session *R* :exports both + for(i in 
(sort(unique(result$rank)))) {
+      print(sum(result[result$rank == i,]$absolute_time_diff))
+  }
+  #+end_src
+
+  #+RESULTS:
+  #+begin_example
+  [1] 1.494745
+  [1] 1.343339
+  [1] -2.940891
+  [1] -1.11672
+  [1] 0.466087
+  [1] 1.90049
+  [1] -3.441326
+  [1] -1.564635
+  [1] -2.708597
+  [1] -1.647053
+  [1] 0.027765
+  [1] -4.653833
+  [1] 2.878523
+  [1] 3.572304
+  [1] 1.124928
+  [1] 3.749203
+#+end_example
+
+- We can see several things.
+  + There are very large differences between the ranks. We could already see it in the first plots (=duration= vs
+    =size_product=), but it is even more obvious here. We should find out why.
+  + There are some outliers that may have a very significant impact on the aggregated difference between prediction and
+    reality.
+  + The prediction ability of this approach is better than =SMPI_Sample=, but still far from perfect.
+**** Let’s try a cheap SimBLAS :SMPI:C:PERFORMANCE:HPL:
+- We can replace the call to =HPL_dgemm= by the following:
+  #+begin_src c
+  double expected_time = (1.062e-09)*(double)mp*(double)nn*(double)jb - 2.476e-03;
+  if(expected_time > 0)
+      smpi_usleep((useconds_t)(expected_time*1e6));
+  #+end_src
+- First test: it works pretty well. We roughly get the same results as with the true call to =HPL_dgemm=: 2.329e+01
+  Gflops, against 2.332e+01, 2.305e+01 and 2.315e+01 Gflops. The simulation time is much shorter, about 46 seconds, against
+  about 495 seconds (8 minutes and 15 seconds). Note that with or without a real call to =HPL_dgemm=, the time spent
+  outside of the application is much lower: between 6 and 8 seconds. Thus, there is room for further optimizations.
+**** Tracking the other expensive BLAS functions :PERFORMANCE:HPL:
+- In the file =hpl_blas.h=, several functions are defined like =HPL_dgemm=, with =#define= aliasing them to the real =cblas= function.
+- We can try to replace them by a no-op, to see if it changes the simulation time significantly.
+- The following table sums up the (very approximate) gain in simulation time that we get by removing each of the
+  functions. We use the same parameters as above for HPL.
+
+  | Function     | time (s) |
+  |--------------+----------|
+  | =HPL_dswap=  |      0.5 |
+  | =HPL_dcopy=  |      N/A |
+  | =HPL_daxpy=  |        0 |
+  | =HPL_dscal=  |      N/A |
+  | =HPL_idamax= |      N/A |
+  | =HPL_dgemv=  |        1 |
+  | =HPL_dtrsv=  |        0 |
+  | =HPL_dger=   |      0.5 |
+  | =HPL_dtrsm=  |       10 |
+
+  + The function =HPL_idamax= cannot be removed, since it returns an integer used to index an array.
+  + The functions =HPL_dscal= and =HPL_dcopy= cannot be removed either, since removing them causes the following error:
+    #+begin_example
+    /home/tom/simgrid/src/simix/smx_global.cpp:557: [simix_kernel/CRITICAL] Oops ! Deadlock or code not perfectly clean.
+    #+end_example
+- It is clear that we should now focus on =HPL_dtrsm=. This function solves a triangular system of equations.
+- It is also clear that the time spent in the application is not entirely spent in the BLAS functions; we should look
+  for something else.
+**** Forgot a call to =HPL_dgemm= :PERFORMANCE:HPL:
+- I found out that I forgot a place where =HPL_dgemm= was used.
+- If we remove all additional occurrences of =HPL_dgemm=, we gain 6 seconds (in addition to the high gain we already had).
+- I thought that it was used only in =HPL_pdupdateTT=, but it appears that it is also used in =HPL_pdrpanllT=.
+- The call to =HPL_dgemm= was correctly traced. But I filtered the results in the R script and kept only the ones of =HPL_pdupdateTT=.
+- The =printf= with the parameters was only present in =HPL_pdupdateTT=.
+- Consequently, all the visualizations and linear regressions were done with missing data. We should redo them to check
+  whether this changes anything.
+**** Looking at =HPL_dtrsm= :PERFORMANCE:HPL:
+- This function is used in a lot of functions: =HPL_pdrpan***= and =HPL_pdupdate**= (each has several variants).
+- By aliasing this function to =printf("%s\n", __FILE__)= and filtering the output with =awk '!a[$0]++'= (remove duplicates),
+  we know that, in our settings, =HPL_dtrsm= is only used in =HPL_pdrpanllT= and =HPL_pdupdateTT=. By sorting with =sort= and
+  then counting duplicates with =uniq -dc=, we know that =HPL_pdrpanllT= (resp. =HPL_pdupdateTT=) calls our function 78664 times
+  (resp. 2636 times).
+*** 2017-03-09 Thursday
+**** Fix =HPL_dgemm= trace :C:TRACING:HPL:
+- In the old version, the calls to =MPI_Wait= were done in the =#define=, so we were sure that every call to =HPL_dgemm= was
+  traced by Simgrid. However, the =printf= for the parameters had to be done before every call to =HPL_dgemm=; this is why
+  I missed some of them.
+- Now, the =printf= is also done in the =#define=. Because we need the arguments given to =HPL_dgemm= here, we can no
+  longer use variadic arguments; we have to spell out all the parameters.
+- The code is now as follows:
+  #+begin_src c
+  #define HPL_dgemm(layout, TransA, TransB, M, N, K, alpha, A, lda, B, ldb, beta, C, ldc) ({\
+      int my_rank, buff=0;\
+      MPI_Request request;\
+      MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);\
+      double expected_time = (1.062e-09)*(double)M*(double)N*(double)K - 2.476e-03;\
+      printf("file=%s line=%d rank=%d m=%d n=%d k=%d lead_A=%d lead_B=%d lead_C=%d expected_time=%f\n", __FILE__, __LINE__+3, my_rank, M, N, K, lda, ldb, ldc, expected_time);\
+      MPI_Isend(&buff, 1, MPI_INT, my_rank, 0, MPI_COMM_WORLD, &request);\
+      MPI_Recv(&buff, 1, MPI_INT, my_rank, 0, MPI_COMM_WORLD, NULL);\
+      MPI_Wait(&request, MPI_STATUS_IGNORE);\
+      cblas_dgemm(layout, TransA, TransB, M, N, K, alpha, A, lda, B, ldb, beta, C, ldc);\
+      MPI_Isend(&buff, 1, MPI_INT, my_rank, 0, MPI_COMM_WORLD, &request);\
+      MPI_Recv(&buff, 1, MPI_INT, my_rank, 0, MPI_COMM_WORLD, NULL);\
+      MPI_Wait(&request, MPI_STATUS_IGNORE);\
+  })
+  #+end_src
+**** Attempted linear regression of =HPL_dgemm=: failed, there is a bug somewhere 
:PYTHON:R:EXPERIMENTS:PERFORMANCE:BUG:
+- In the other linear regressions, some calls to =HPL_dgemm= were missing. Thus, the analysis needs to be done again, just
+  to check whether it changes anything.
+- I tried to run roughly the same process as above, but failed; there seems to be a bug somewhere.
+- Every piece of code is written here. The trace and the output have been obtained with N=5000 and P=Q=4.
+Clean the file:
+#+begin_src sh
+pj_dump --user-defined --ignore-incomplete-links /tmp/trace > /tmp/trace.csv
+grep "State," /tmp/trace.csv | grep MPI_Wait | sed -e 's/()//' -e 's/MPI_STATE, //ig' -e 's/State, //ig' -e 's/rank-//' -e\
+'s/PMPI_/MPI_/' | grep MPI_ | tr 'A-Z' 'a-z' > /tmp/trace_processed.csv
+#+end_src
+
+Clean the paths:
+#+begin_src python
+import re
+reg = re.compile('((?:[^/])*)(?:/[a-zA-Z0-9_-]*)*((?:/hpl-2.2(?:/[a-zA-Z0-9_-]*)*).*)')
+def process(in_file, out_file):
+    with open(in_file, 'r') as in_f:
+        with open(out_file, 'w') as out_f:
+            for line in in_f:
+                match = reg.match(line)
+                out_f.write('%s%s\n' % (match.group(1), match.group(2)))
+process('/tmp/trace_processed.csv', '/tmp/trace_cleaned.csv')
+#+end_src
+
+#+RESULTS:
+: None
+
+#+begin_src R :results output :session *R* :exports both
+df <- read.csv("/tmp/trace_cleaned.csv", header=F, strip.white=T, sep=",");
+names(df) = c("rank", "start", "end", "duration", "level", "state", "Filename", "Linenumber");
+head(df)
+#+end_src
+
+#+RESULTS:
+#+begin_example
+  rank    start      end duration level    state
+1    8 0.207257 0.207257        0     0 mpi_wait
+2    8 0.207275 0.207275        0     0 mpi_wait
+3    8 0.207289 0.207289        0     0 mpi_wait
+4    8 0.207289 0.207289        0     0 mpi_wait
+5    8 0.207309 0.207309        0     0 mpi_wait
+6    8 0.207309 0.207309        0     0 mpi_wait
+                            Filename Linenumber
+1 /hpl-2.2/src/pfact/hpl_pdrpanllt.c        222
+2 /hpl-2.2/src/pfact/hpl_pdrpanllt.c        222
+3 /hpl-2.2/src/pfact/hpl_pdrpanllt.c        222
+4 /hpl-2.2/src/pfact/hpl_pdrpanllt.c        222
+5 /hpl-2.2/src/pfact/hpl_pdrpanllt.c        222
+6 /hpl-2.2/src/pfact/hpl_pdrpanllt.c        222
+#+end_example + +#+BEGIN_SRC R :results output :session *R* :exports both +duration_compute = function(df) { + ndf = data.frame(); + df = df[with(df,order(rank,start)),]; + #origin = unique(df$origin) + for(i in (sort(unique(df$rank)))) { + start = df[df$rank==i,]$start; + end = df[df$rank==i,]$end; + l = length(end); + end = c(0,end[1:(l-1)]); # Computation starts at time 0 + + startline = c(0, df[df$rank==i,]$Linenumber[1:(l-1)]); + startfile = c("", as.character(df[df$rank==i,]$Filename[1:(l-1)])); + endline = df[df$rank==i,]$Linenumber; + endfile = df[df$rank==i,]$Filename; + + ndf = rbind(ndf, data.frame(rank=i, start=end, end=start, + duration=start-end, state="Computing", + startline=startline, startfile=startfile, endline=endline, + endfile=endfile)); + } + ndf$idx = 1:length(ndf$duration) + ndf; +} +durations = duration_compute(df); +durations = durations[as.character(durations$startfile) == as.character(durations$endfile) & + durations$startline == durations$endline,] +#+END_SRC + +#+BEGIN_SRC R :results output :session *R* :exports both +head(durations) +#+END_SRC + +#+RESULTS: +#+begin_example + rank start end duration state startline +2 0 0.207097 0.207149 5.2e-05 Computing 222 +3 0 0.207149 0.207179 3.0e-05 Computing 222 +4 0 0.207179 0.207179 0.0e+00 Computing 222 +5 0 0.207179 0.207194 1.5e-05 Computing 222 +6 0 0.207194 0.207194 0.0e+00 Computing 222 +7 0 0.207194 0.207207 1.3e-05 Computing 222 + startfile endline endfile +2 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 /hpl-2.2/src/pfact/hpl_pdrpanllt.c +3 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 /hpl-2.2/src/pfact/hpl_pdrpanllt.c +4 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 /hpl-2.2/src/pfact/hpl_pdrpanllt.c +5 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 /hpl-2.2/src/pfact/hpl_pdrpanllt.c +6 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 /hpl-2.2/src/pfact/hpl_pdrpanllt.c +7 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 /hpl-2.2/src/pfact/hpl_pdrpanllt.c + idx +2 2 +3 3 +4 4 +5 5 +6 6 +7 7 +#+end_example + +#+BEGIN_SRC R 
:results output :session *R* :exports both +unique(durations[c("startfile", "startline")]) +#+END_SRC + +#+RESULTS: +: startfile startline +: 2 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 +: 14 /hpl-2.2/src/comm/hpl_sdrv.c 191 +: 478 /hpl-2.2/src/pgesv/hpl_rollt.c 242 +: 481 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 384 +: 486 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 407 + +We need to check each of these to see if this is indeed a call to =HPL_dgemm=, or something else. +It appears that =HPL_rollT= and =HPL_sdrv= are not calling =HPL_dgemm=, they are just calling =MPI_Wait=. Thus, we have to +remove them. + +#+BEGIN_SRC R :results output :session *R* :exports both +durations = durations[durations$startfile != "/hpl-2.2/src/comm/hpl_sdrv.c" & durations$startfile != "/hpl-2.2/src/pgesv/hpl_rollt.c",] +unique(durations[c("startfile", "startline")]) +#+END_SRC + + +#+RESULTS: +: startfile startline +: 2 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 +: 481 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 384 +: 486 /hpl-2.2/src/pgesv/hpl_pdupdatett.c 407 + +Now, let us get what was output by the =printf=. 
+ +Processing the output: +#+begin_src python +import re +import csv +reg = re.compile('file=([a-zA-Z0-9/_.-]+) line=([0-9]+) rank=([0-9]+) m=([0-9]+) n=([0-9]+) k=([0-9]+) lead_A=([0-9]+) lead_B=([0-9]+) lead_C=([0-9]+) expected_time=(-?[0-9]+.[0-9]+)') + +def process(in_file, out_file): + with open(in_file, 'r') as in_f: + with open(out_file, 'w') as out_f: + csv_writer = csv.writer(out_f) + csv_writer.writerow(('file', 'line', 'rank', 'n', 'm', 'k', 'lead_A', 'lead_B', 'lead_C', 'expected_time')) + for line in in_f: + match = reg.match(line) + if match is not None: + result = list(match.group(i) for i in range(1, 11)) + result[0] = result[0][result[0].index('/hpl'):].lower() + csv_writer.writerow(result) +process('/tmp/output', '/tmp/parameters.csv') +#+end_src + +#+begin_src R :results output :session *R* :exports both +parameters <- read.csv("/tmp/parameters.csv"); +head(parameters) +#+end_src + +#+RESULTS: +#+begin_example + file line rank n m k lead_A lead_B lead_C +1 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 0 1320 60 0 1320 120 1320 +2 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 8 1200 60 0 1200 120 1200 +3 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 0 1320 30 0 1320 120 1320 +4 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 4 1280 60 0 1280 120 1280 +5 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 0 1320 16 0 1320 120 1320 +6 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 0 1320 8 0 1320 120 1320 + expected_time +1 -0.002476 +2 -0.002476 +3 -0.002476 +4 -0.002476 +5 -0.002476 +6 -0.002476 +#+end_example + +A first remark: we see that some rows have k=0, which is a bit surprising. I double-checked by adding some =printf= in the +files, this is not a bug. This only happens in =HPL_pdrpanllT= so it was unnoticed until now. 
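The constant =expected_time= of -0.002476 in the rows above follows directly from the linear cost model used for =HPL_dgemm= in this notebook (duration = a·m·n·k + b, with coefficients from the earlier regression): when k=0 the cubic term vanishes and only the negative intercept remains. A minimal sketch, just to make this explicit:

#+begin_src python
# Cost model for HPL_dgemm used in this notebook: t = A*m*n*k + B,
# with the coefficients found by the earlier linear regression.
A = 1.062e-09
B = -2.476e-03

def expected_dgemm_time(m, n, k):
    """Predicted duration (seconds) of a dgemm call of size m x n x k."""
    return A * m * n * k + B

# With k = 0 the cubic term vanishes, so every such call is predicted to
# take the bare intercept, exactly what the expected_time column shows.
print(expected_dgemm_time(1320, 60, 0))  # -0.002476
#+end_src

This also explains why all the k=0 rows share the same (negative, hence meaningless) prediction regardless of m and n.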
+ +#+begin_src R :results output :session *R* :exports both +nrow(parameters) +nrow(durations) +nrow(parameters[parameters$file == "/hpl-2.2/src/pfact/hpl_pdrpanllt.c",]) +nrow(durations[durations$startfile == "/hpl-2.2/src/pfact/hpl_pdrpanllt.c",]) +#+end_src + +#+RESULTS: +: [1] 20300 +: [1] 29964 +: [1] 19664 +: [1] 29328 + +- There is obviously something wrong. We should have a one-to-one correspondence between the elements of the =parameters= + dataframe and the elements of the =durations= dataframe. It seems here that SMPI has produced additional entries in the + trace, or some of the =printf= I put disappeared. +- This is not an error in parsing the output (e.g. some lines not parsed because of a wrong format/regexp). The output + file has 20359 lines. +- Tried putting a =printf("blabla\n")= just before =HPL_dgemm= in the file =HPL_pdrpanllT.c= and counted the number of times it + appeared. Exactly the same number, so definitely not an issue with the parsing or the definition with the =#define=. +- Checked the =durations= dataframe. Nothing apparently wrong, all the entries for this file are at the same line, so I + did not miss a hidden =MPI_Wait= somewhere else in this same file. +**** Using another way to measure durations :C:PYTHON:R:EXPERIMENTS:TRACING:PERFORMANCE:HPL: +- Let’s use something other than the SMPI trace to measure durations. We will measure the time directly in the code. But + first we need to check that this new measure is consistent with what we got with the traces. 
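One way to localize a row-count mismatch like the one described above is to count entries per (file, line, rank) triple on both sides and diff the two counters; whichever call sites and ranks disagree are the ones producing extra or missing events. This is only a sketch of the idea (the helper below is hypothetical, not something that was actually run):

#+begin_src python
import csv
from collections import Counter

def count_by_site(path, cols, has_header=True):
    """Count CSV rows per tuple of values taken from the given column indices."""
    counts = Counter()
    with open(path) as f:
        reader = csv.reader(f)
        if has_header:
            next(reader)  # skip the header row
        for row in reader:
            counts[tuple(row[c].strip() for c in cols)] += 1
    return counts

# Counting per (file, line, rank) on both CSV files and subtracting the
# Counters would pinpoint exactly which call sites have mismatched counts:
#   dur = count_by_site('/tmp/trace_cleaned.csv', [6, 7, 0], has_header=False)
#   par = count_by_site('/tmp/parameters.csv', [0, 1, 2])
#   print((dur - par) + (par - dur))
#+end_src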
+- Now, =HPL_dgemm= is defined as: +#+begin_src c +#define HPL_dgemm(layout, TransA, TransB, M, N, K, alpha, A, lda, B, ldb, beta, C, ldc) ({\ + int my_rank, buff=0;\ + MPI_Request request;\ + MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);\ + double expected_time = (1.062e-09)*(double)M*(double)N*(double)K - 2.476e-03;\ + struct timeval before = {};\ + struct timeval after = {};\ + gettimeofday(&before, NULL);\ + MPI_Isend(&buff, 1, MPI_INT, my_rank, 0, MPI_COMM_WORLD, &request);\ + MPI_Recv(&buff, 1, MPI_INT, my_rank, 0, MPI_COMM_WORLD, NULL);\ + MPI_Wait(&request, MPI_STATUS_IGNORE);\ + cblas_dgemm(layout, TransA, TransB, M, N, K, alpha, A, lda, B, ldb, beta, C, ldc);\ + gettimeofday(&after, NULL);\ + MPI_Isend(&buff, 1, MPI_INT, my_rank, 0, MPI_COMM_WORLD, &request);\ + MPI_Recv(&buff, 1, MPI_INT, my_rank, 0, MPI_COMM_WORLD, NULL);\ + MPI_Wait(&request, MPI_STATUS_IGNORE);\ + double time_before = (double)(before.tv_sec) + (double)(before.tv_usec)*1e-6;\ + double time_after = (double)(after.tv_sec) + (double)(after.tv_usec)*1e-6;\ + double real_time = time_after-time_before;\ + printf("file=%s line=%d rank=%d m=%d n=%d k=%d lead_A=%d lead_B=%d lead_C=%d real_time=%f expected_time=%f\n", __FILE__, __LINE__, my_rank, M, N, K, lda, ldb, ldc, real_time, expected_time);\ +}) +#+end_src +- We run the same code as above to get the =durations= frame. 
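The elapsed-time arithmetic in the macro is simply tv_sec + tv_usec·1e-6 on each side of the call. A tiny re-statement of that conversion (the timeval values below are made up, just to make the arithmetic explicit):

#+begin_src python
def timeval_to_seconds(tv_sec, tv_usec):
    """Mirror of the macro's conversion: seconds + microseconds * 1e-6."""
    return tv_sec + tv_usec * 1e-6

# Hypothetical call starting at tv = (12 s, 50 us) and ending at (12 s, 131 us)
before = timeval_to_seconds(12, 50)
after = timeval_to_seconds(12, 131)
real_time = after - before
print(round(real_time * 1e6))  # 81 (microseconds)
#+end_src

Note that =gettimeofday= has microsecond resolution, which is consistent with the many =real_time=0.000000= rows that show up below for very small dgemm calls.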
+#+BEGIN_SRC R :results output :session *R* :exports both +head(durations) +#+END_SRC + +#+RESULTS: +#+begin_example + rank start end duration state startline +2 0 0.275856 0.275896 4.0e-05 Computing 224 +3 0 0.275896 0.275929 3.3e-05 Computing 224 +4 0 0.275929 0.275929 0.0e+00 Computing 224 +5 0 0.275929 0.275948 1.9e-05 Computing 224 +6 0 0.275948 0.275948 0.0e+00 Computing 224 +7 0 0.275948 0.275965 1.7e-05 Computing 224 + startfile endline endfile +2 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 224 /hpl-2.2/src/pfact/hpl_pdrpanllt.c +3 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 224 /hpl-2.2/src/pfact/hpl_pdrpanllt.c +4 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 224 /hpl-2.2/src/pfact/hpl_pdrpanllt.c +5 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 224 /hpl-2.2/src/pfact/hpl_pdrpanllt.c +6 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 224 /hpl-2.2/src/pfact/hpl_pdrpanllt.c +7 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 224 /hpl-2.2/src/pfact/hpl_pdrpanllt.c + idx +2 2 +3 3 +4 4 +5 5 +6 6 +7 7 +#+end_example + +Now, we process the parameters: +#+begin_src python +import re +import csv +reg = re.compile('file=([a-zA-Z0-9/_.-]+) line=([0-9]+) rank=([0-9]+) m=([0-9]+) n=([0-9]+) k=([0-9]+) lead_A=([0-9]+) lead_B=([0-9]+) lead_C=([0-9]+) real_time=(-?[0-9]+.[0-9]+) expected_time=(-?[0-9]+.[0-9]+)') + +def process(in_file, out_file): + with open(in_file, 'r') as in_f: + with open(out_file, 'w') as out_f: + csv_writer = csv.writer(out_f) + csv_writer.writerow(('file', 'line', 'rank', 'n', 'm', 'k', 'lead_A', 'lead_B', 'lead_C', 'real_time', 'expected_time')) + for line in in_f: + match = reg.match(line) + if match is not None: + result = list(match.group(i) for i in range(1, 12)) + result[0] = result[0][result[0].index('/hpl'):].lower() + csv_writer.writerow(result) +process('/tmp/output', '/tmp/parameters.csv') +#+end_src + +#+begin_src R :results output :session *R* :exports both +parameters <- read.csv("/tmp/parameters.csv"); +head(parameters) +#+end_src + +#+RESULTS: +#+begin_example + file line rank n m k lead_A lead_B 
lead_C +1 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 224 0 1320 60 0 1320 120 1320 +2 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 224 0 1320 30 0 1320 120 1320 +3 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 224 0 1320 16 0 1320 120 1320 +4 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 224 0 1320 8 0 1320 120 1320 +5 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 224 0 1320 4 0 1320 120 1320 +6 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 224 0 1320 2 0 1320 120 1320 + real_time expected_time +1 8.1e-05 -0.002476 +2 0.0e+00 -0.002476 +3 0.0e+00 -0.002476 +4 0.0e+00 -0.002476 +5 1.0e-06 -0.002476 +6 0.0e+00 -0.002476 +#+end_example + +We merge the =durations= and =parameters= dataframes, but only the entries for the file =hpl_pdupdatett.c= (we cannot do it +for the other file since we have a mismatch). +#+begin_src R :results output :session *R* :exports both +insert_sizes = function(durations, sizes) { + stopifnot(nrow(durations)==nrow(sizes)) + ndf = data.frame(); + for(i in (sort(unique(durations$rank)))) { + tmp_dur = durations[durations$rank == i,] + tmp_sizes = sizes[sizes$rank == i,] + stopifnot(nrow(tmp_dur) == nrow(tmp_sizes)) + stopifnot(tmp_dur$startline == tmp_sizes$line) + storage.mode(tmp_sizes$m) <- "double" # avoiding integer overflow when taking the product + storage.mode(tmp_sizes$n) <- "double" + storage.mode(tmp_sizes$k) <- "double" + storage.mode(tmp_sizes$lead_A) <- "double" + storage.mode(tmp_sizes$lead_B) <- "double" + storage.mode(tmp_sizes$lead_C) <- "double" + tmp_dur$m = tmp_sizes$m + tmp_dur$n = tmp_sizes$n + tmp_dur$k = tmp_sizes$k + tmp_dur$lead_A = tmp_sizes$lead_A + tmp_dur$lead_B = tmp_sizes$lead_B + tmp_dur$lead_C = tmp_sizes$lead_C + tmp_dur$lead_product = tmp_sizes$lead_A * tmp_sizes$lead_B * tmp_sizes$lead_C + tmp_dur$size_product = tmp_sizes$m * tmp_sizes$n * tmp_sizes$k + tmp_dur$ratio = tmp_dur$lead_product/tmp_dur$size_product + tmp_dur$real_time = tmp_sizes$real_time + tmp_dur$expected_time = tmp_sizes$expected_time + tmp_dur$absolute_time_diff = tmp_dur$expected_time - 
tmp_dur$duration + tmp_dur$relative_time_diff = (tmp_dur$expected_time - tmp_dur$duration)/tmp_dur$expected_time + ndf = rbind(ndf, tmp_dur) + } + return(ndf); +} +#+end_src + +#+begin_src R :results output :session *R* :exports both +result = insert_sizes(durations[durations$startfile == "/hpl-2.2/src/pgesv/hpl_pdupdatett.c",], parameters[parameters$file == "/hpl-2.2/src/pgesv/hpl_pdupdatett.c",]) +#+end_src + +Now we plot the time measured by SMPI traces against the time measured by =gettimeofday=. + +#+begin_src R :file images/gettimeofday.png :results value graphics :session *R* :exports both +library(ggplot2) +ggplot(result, aes(x=duration, y=real_time)) + + geom_point(shape=1) + ggtitle("Time measured by SMPI against time measured by gettimeofday") +#+end_src + +#+RESULTS: +[[file:images/gettimeofday.png]] + +Checking with a linear regression, just to be sure: +#+begin_src R :results output :session *R* :exports both +summary(lm(duration~real_time, data=result)) +#+end_src + +#+RESULTS: +#+begin_example + +Call: +lm(formula = duration ~ real_time, data = result) + +Residuals: + Min 1Q Median 3Q Max +-4.917e-05 -4.088e-06 1.075e-06 5.261e-06 6.181e-05 + +Coefficients: + Estimate Std. Error t value Pr(>|t|) +(Intercept) -2.617e-06 6.285e-07 -4.163 3.57e-05 *** +real_time 9.999e-01 7.058e-06 141678.252 < 2e-16 *** +--- +Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 + +Residual standard error: 1.034e-05 on 634 degrees of freedom +Multiple R-squared: 1, Adjusted R-squared: 1 +F-statistic: 2.007e+10 on 1 and 634 DF, p-value: < 2.2e-16 +#+end_example + +It is not perfect, but it looks pretty great. So, let’s use this to measure time. + +**** Now we can finally re-do the analysis of =HPL_dgemm= :R:EXPERIMENTS:PERFORMANCE:HPL: +- There are less things to do, since all the data come from the output file. +- Recall the aim of doing this again: in the previous analysis, some calls to =HPL_dgemm= were missing. 
Thus, it needs to + be done again, just to check if it changes anything. +- Generate the CSV file by runing the same Python script as in the previous section (the output format did not change). +- Then, analysis in R: +#+begin_src R :results output :session *R* :exports both +results <- read.csv("/tmp/parameters.csv"); +head(results) +#+end_src + +#+RESULTS: +#+begin_example + file line rank n m k lead_A lead_B lead_C +1 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 0 5040 60 0 5040 120 5040 +2 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 0 5040 30 0 5040 120 5040 +3 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 0 5040 16 0 5040 120 5040 +4 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 0 5040 8 0 5040 120 5040 +5 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 0 5040 4 0 5040 120 5040 +6 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 8 5000 60 0 5000 120 5000 + real_time expected_time +1 5.7e-05 -0.002476 +2 7.0e-06 -0.002476 +3 0.0e+00 -0.002476 +4 0.0e+00 -0.002476 +5 0.0e+00 -0.002476 +6 9.0e-06 -0.002476 +#+end_example + +#+begin_src R :results output :session *R* :exports both +process_results = function(results) { + storage.mode(results$m) <- "double" # avoiding integer overflow when taking the product + storage.mode(results$n) <- "double" + storage.mode(results$k) <- "double" + storage.mode(results$lead_A) <- "double" + storage.mode(results$lead_B) <- "double" + storage.mode(results$lead_C) <- "double" + results$lead_product = results$lead_A * results$lead_B * results$lead_C + results$size_product = results$m * results$n * results$k + results$ratio = results$lead_product/results$size_product + results$absolute_time_diff = results$expected_time - results$real_time + results$relative_time_diff = (results$expected_time - results$real_time)/results$expected_time + results$idx = 1:length(results$rank) + return(results); +} +#+end_src + +#+begin_src R :results output :session *R* :exports both +results = process_results(results) +head(results) +#+end_src + +#+RESULTS: +#+begin_example + file line rank n m k 
lead_A lead_B lead_C +1 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 0 5040 60 0 5040 120 5040 +2 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 0 5040 30 0 5040 120 5040 +3 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 0 5040 16 0 5040 120 5040 +4 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 0 5040 8 0 5040 120 5040 +5 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 0 5040 4 0 5040 120 5040 +6 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 222 8 5000 60 0 5000 120 5000 + real_time expected_time lead_product size_product ratio absolute_time_diff +1 5.7e-05 -0.002476 3048192000 0 Inf -0.002533 +2 7.0e-06 -0.002476 3048192000 0 Inf -0.002483 +3 0.0e+00 -0.002476 3048192000 0 Inf -0.002476 +4 0.0e+00 -0.002476 3048192000 0 Inf -0.002476 +5 0.0e+00 -0.002476 3048192000 0 Inf -0.002476 +6 9.0e-06 -0.002476 3000000000 0 Inf -0.002485 + relative_time_diff idx +1 1.023021 1 +2 1.002827 2 +3 1.000000 3 +4 1.000000 4 +5 1.000000 5 +6 1.003635 6 +#+end_example + +#+begin_src R :file images/trace_gettimeofday1_16.png :results value graphics :session *R* :exports both +library(ggplot2) +ggplot(results, aes(x=idx, y=real_time, color=factor(file))) + + geom_point(shape=1) + ggtitle("Durations of HPL_dgemm") +#+end_src + +#+RESULTS: +[[file:images/trace_gettimeofday1_16.png]] + + +This is the plot of the duration of =HPL_dgemm= over time (analogous to the plot =duration= vs =start= that we had). The part +for =hpl_pduptatett= looks exactly as before. We see that the calls to =HPL_dgemm= in =hpl_pdrpanllt= are always very short. + +#+begin_src R :file images/trace_gettimeofday2_16.png :results value graphics :session *R* :exports both +library(ggplot2) +ggplot(results, aes(x=size_product, y=real_time, color=factor(rank))) + + geom_point(shape=1) + ggtitle("Durations of HPL_dgemm") +#+end_src + +#+RESULTS: +[[file:images/trace_gettimeofday2_16.png]] + +Without surprise, we find exactly the same kind of plot as before, since all the new calls to =HPL_dgemm= are very short +and thus hidden in the left part of the graph. 
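The regression that follows is an ordinary least squares fit of the duration against the single predictor m·n·k, i.e. the same model as R's =lm(duration ~ I(m*n*k))=. The closed form for simple linear regression can be sketched in pure Python (the data below is synthetic, generated from the coefficients found in this notebook, not the actual measurements):

#+begin_src python
def fit_line(xs, ys):
    """Closed-form OLS for y = a*x + b (same model as R's lm(y ~ x))."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return a, b

# Synthetic sanity check: durations generated exactly from the fitted model
sizes = [m * n * k for (m, n, k) in [(100, 100, 100), (500, 500, 500), (1000, 1000, 1000)]]
durations = [1.064e-09 * s + 2.393e-04 for s in sizes]
a, b = fit_line(sizes, durations)
#+end_src

On noise-free data the fit recovers the generating coefficients; on the real measurements, the residual diagnostics below are what tell us whether the linear model is adequate.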
+ + +#+begin_src R :results output :session *R* :exports both +reg <- lm(duration~I(m*n*k), data=result) +summary(reg) +#+end_src + +#+RESULTS: +#+begin_example + +Call: +lm(formula = duration ~ I(m * n * k), data = result) + +Residuals: + Min 1Q Median 3Q Max +-0.004843 -0.001337 -0.000024 0.000280 0.055746 + +Coefficients: + Estimate Std. Error t value Pr(>|t|) +(Intercept) 2.393e-04 2.182e-04 1.097 0.273 +I(m * n * k) 1.064e-09 2.615e-12 406.932 <2e-16 *** +--- +Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 + +Residual standard error: 0.003594 on 634 degrees of freedom +Multiple R-squared: 0.9962, Adjusted R-squared: 0.9962 +F-statistic: 1.656e+05 on 1 and 634 DF, p-value: < 2.2e-16 +#+end_example + +#+begin_src R :file images/reg_gettimeofday_16.png :results value graphics :results output :session *R* :exports both +layout(matrix(c(1,2,3,4),2,2)) +plot(reg) +#+end_src + +#+RESULTS: +[[file:images/reg_gettimeofday_16.png]] + +The summary of the linear regression shows that the factor =m*n*k= barely changed. The intercept is very different, but +its t-value is too low, so it is not meaningful. +The residuals vs fitted plot seems to look better, with no more heteroscedasticity. My guess is that we added a lot of +points with very low values, so their weight hides the problem. +The QQ-plot still looks problematic. +**** Replacing =HPL_dgemm= by =smpi_usleep= again :SMPI:PERFORMANCE:HPL: +- As for the =printf=, we will put the =smpi_usleep= in the =#define=. We take the coefficients of the latest linear regression. +- Testing: we still get the same number of Gflops (about 23 Gflops) but the simulation runs in 41 seconds now. +*** 2017-03-10 Friday +**** Tracing =HPL_dtrsm= :C:PYTHON:R:EXPERIMENTS:TRACING:PERFORMANCE: +- The goal is to do something similar for =HPL_dtrsm=. First, we will trace the parameters used to call it and + its durations, then we will do a linear regression, and finally replace it with a =smpi_usleep=. 
+- Recall that this function solves a triangular set of equations. It takes as input two m \times n matrices. We expect the + complexity to be O(m*n). +- Replace the definition of =HPL_dtrsm= in =hpl_blas.h= by the following: +#+begin_src c +#define HPL_dtrsm(layout, Side, Uplo, TransA, Diag, M, N, alpha, A, lda, B, ldb) ({\ + int my_rank, buff=0;\ + MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);\ + struct timeval before = {};\ + struct timeval after = {};\ + gettimeofday(&before, NULL);\ + cblas_dtrsm(layout, Side, Uplo, TransA, Diag, M, N, alpha, A, lda, B, ldb);\ + gettimeofday(&after, NULL);\ + double time_before = (double)(before.tv_sec) + (double)(before.tv_usec)*1e-6;\ + double time_after = (double)(after.tv_sec) + (double)(after.tv_usec)*1e-6;\ + double real_time = time_after-time_before;\ + printf("file=%s line=%d rank=%d m=%d n=%d lead_A=%d lead_B=%d real_time=%f\n", __FILE__, __LINE__, my_rank, M, N, lda, ldb, real_time);\ +}) +#+end_src +- Run the simulation: +#+begin_src sh +smpirun --cfg=smpi/bcast:mpich --cfg=smpi/running-power:6217956542.969 --cfg=smpi/display-timing:yes\ +--cfg=smpi/privatize-global-variables:yes -np 16 -hostfile ../../../small_tests/hostfile_64.txt -platform\ +../../../small_tests/cluster_fat_tree_64.xml ./xhpl > /tmp/output +#+end_src +- Process the output file: +#+begin_src python +import re +import csv +reg = re.compile('file=([a-zA-Z0-9/_.-]+) line=([0-9]+) rank=([0-9]+) m=([0-9]+) n=([0-9]+) lead_A=([0-9]+) lead_B=([0-9]+) real_time=(-?[0-9]+.[0-9]+)') + +def process(in_file, out_file): + with open(in_file, 'r') as in_f: + with open(out_file, 'w') as out_f: + csv_writer = csv.writer(out_f) + csv_writer.writerow(('file', 'line', 'rank', 'n', 'm', 'lead_A', 'lead_B', 'real_time')) + for line in in_f: + match = reg.match(line) + if match is not None: + result = list(match.group(i) for i in range(1, 9)) + result[0] = result[0][result[0].index('/hpl'):].lower() + csv_writer.writerow(result) +process('/tmp/output', '/tmp/parameters.csv') 
+#+end_src + +- Analysis in R: +#+begin_src R :results output :session *R* :exports both +results <- read.csv("/tmp/parameters.csv"); +head(results) +#+end_src + +#+RESULTS: +: file line rank n m lead_A lead_B real_time +: 1 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 171 8 60 0 120 120 0.000102 +: 2 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 171 8 30 0 120 120 0.000013 +: 3 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 171 8 16 0 120 120 0.000000 +: 4 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 171 8 8 0 120 120 0.000000 +: 5 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 171 8 4 0 120 120 0.000000 +: 6 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 171 8 2 0 120 120 0.000000 + +#+begin_src R :results output :session *R* :exports both +process_results = function(results) { + storage.mode(results$m) <- "double" # avoiding integer overflow when taking the product + storage.mode(results$n) <- "double" + storage.mode(results$lead_A) <- "double" + storage.mode(results$lead_B) <- "double" + results$lead_product = results$lead_A * results$lead_B + results$size_product = results$m * results$n + results$ratio = results$lead_product/results$size_product + # results$absolute_time_diff = results$expected_time - results$real_time + # results$relative_time_diff = (results$expected_time - results$real_time)/results$expected_time + results$idx = 1:length(results$rank) + return(results); +} +#+end_src + +#+begin_src R :results output :session *R* :exports both +results = process_results(results) +head(results) +#+end_src + +#+RESULTS: +#+begin_example + file line rank n m lead_A lead_B real_time +1 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 171 8 60 0 120 120 0.000102 +2 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 171 8 30 0 120 120 0.000013 +3 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 171 8 16 0 120 120 0.000000 +4 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 171 8 8 0 120 120 0.000000 +5 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 171 8 4 0 120 120 0.000000 +6 /hpl-2.2/src/pfact/hpl_pdrpanllt.c 171 8 2 0 120 120 0.000000 + lead_product size_product ratio idx +1 14400 0 Inf 1 +2 
14400 0 Inf 2 +3 14400 0 Inf 3 +4 14400 0 Inf 4 +5 14400 0 Inf 5 +6 14400 0 Inf 6 +#+end_example + +#+begin_src R :file images/trace_dtrsm1_16.png :results value graphics :session *R* :exports both +library(ggplot2) +ggplot(results, aes(x=idx, y=real_time, color=factor(file))) + + geom_point(shape=1) + ggtitle("Durations of HPL_dtrsm") +#+end_src + +#+RESULTS: +[[file:images/trace_dtrsm1_16.png]] + +We can observe a trend similar to =HPL_dgemm=. The function is only used in two places, =HPL_pdrpanllT= and +=HPL_pdupdateTT=. In the former, all the calls are very short, whereas in the latter, the calls are long at the beginning +and become shorter throughout the execution. We also have some outliers. + +#+begin_src R :file images/trace_dtrsm2_16.png :results value graphics :session *R* :exports both +library(ggplot2) +ggplot(results, aes(x=size_product, y=real_time, color=factor(rank))) + + geom_point(shape=1) + ggtitle("Durations of HPL_dtrsm") +#+end_src + +#+RESULTS: +[[file:images/trace_dtrsm2_16.png]] + +As expected, the duration looks proportional to the product of the sizes. + + +#+begin_src R :results output :session *R* :exports both +reg <- lm(real_time~I(m*n), data=results) +summary(reg) +#+end_src + +#+RESULTS: +#+begin_example + +Call: +lm(formula = real_time ~ I(m * n), data = results) + +Residuals: + Min 1Q Median 3Q Max +-0.002999 0.000010 0.000010 0.000010 0.043651 + +Coefficients: + Estimate Std. Error t value Pr(>|t|) +(Intercept) -1.042e-05 2.445e-06 -4.263 2.02e-05 *** +I(m * n) 9.246e-08 3.915e-11 2361.957 < 2e-16 *** +--- +Signif. 
codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 + +Residual standard error: 0.0006885 on 81298 degrees of freedom +Multiple R-squared: 0.9856, Adjusted R-squared: 0.9856 +F-statistic: 5.579e+06 on 1 and 81298 DF, p-value: < 2.2e-16 +#+end_example + +#+begin_src R :file images/reg_dtrsm_16.png :results value graphics :results output :session *R* :exports both +layout(matrix(c(1,2,3,4),2,2)) +plot(reg) +#+end_src + +#+RESULTS: +[[file:images/reg_dtrsm_16.png]] + +The R-squared is high and both the intercept and sizes have a significant impact. +However, the outliers are even more concerning than with =HPL_dgemm=. The Q-Q plot shows a large tail, and the residuals vs +leverage plot shows that these outliers are non-negligible in the linear regression (i.e. if we removed them, the +coefficients would change significantly). +**** Replacing =HPL_dtrsm= by =smpi_usleep= :SMPI:PERFORMANCE:HPL: +- Similarly to what has been done with =HPL_dgemm=, we use the coefficients found with the linear regression to replace + the function by a sleep. +#+begin_src c +#define HPL_dtrsm(layout, Side, Uplo, TransA, Diag, M, N, alpha, A, lda, B, ldb) ({\ + double expected_time = (9.246e-08)*(double)M*(double)N - 1.024e-05;\ + if(expected_time > 0)\ + smpi_usleep((useconds_t)(expected_time*1e6));\ +}) +#+end_src +- Running HPL again. We get the expected speed (about 23 Gflops) and a simulation time of 29 seconds (gain of 12 seconds). +**** Having a look at =malloc= :PYTHON:R:PERFORMANCE:HPL: +- To run HPL with larger matrices, we need to replace some calls to =malloc= (resp. =free=) by =SMPI_SHARED_MALLOC= + (resp. =SMPI_SHARED_FREE=). +- Firstly, let’s see where the big allocations are. 
+- Define =MY_MALLOC= in =hpl.h= as follows: +#+begin_src c +#define MY_MALLOC(n) ({\ + int my_rank;\ + MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);\ + printf("file=%s line=%d rank=%d size=%lu\n", __FILE__, __LINE__, my_rank, n);\ + malloc(n);\ +}) +#+end_src +- Replace all the calls to =malloc= in the files by =MY_MALLOC=: +#+begin_src sh +grep -l malloc testing/**/*.c src/**/*.c | xargs sed -i 's/malloc/MY_MALLOC/g' +#+end_src +- Run =smpirun= (N=20000, P=Q=4) and redirect the output to =/tmp/output=. +- Process the output file: +#+begin_src python +import re +import csv +reg = re.compile('file=([a-zA-Z0-9/_.-]+) line=([0-9]+) rank=([0-9]+) size=([0-9]+)') + +def process(in_file, out_file): + with open(in_file, 'r') as in_f: + with open(out_file, 'w') as out_f: + csv_writer = csv.writer(out_f) + csv_writer.writerow(('file', 'line', 'rank', 'size')) + for line in in_f: + match = reg.match(line) + if match is not None: + result = list(match.group(i) for i in range(1, 5)) + result[0] = result[0][result[0].index('/hpl'):].lower() + csv_writer.writerow(result) +process('/tmp/output', '/tmp/malloc.csv') +#+end_src +- Analysis in R: +#+begin_src R :results output :session *R* :exports both +results <- read.csv("/tmp/malloc.csv"); +head(results) +#+end_src + +#+RESULTS: +: file line rank size +: 1 /hpl-2.2/src/grid/hpl_reduce.c 127 0 4 +: 2 /hpl-2.2/src/grid/hpl_reduce.c 127 1 4 +: 3 /hpl-2.2/src/grid/hpl_reduce.c 127 2 4 +: 4 /hpl-2.2/src/grid/hpl_reduce.c 127 3 4 +: 5 /hpl-2.2/src/grid/hpl_reduce.c 127 4 4 +: 6 /hpl-2.2/src/grid/hpl_reduce.c 127 5 4 + +#+begin_src R :file images/trace_malloc1_16.png :results value graphics :session *R* :exports both +library(ggplot2) +ggplot(results, aes(x=file, y=size)) + + geom_boxplot() + ggtitle("Sizes of malloc") + theme(axis.text.x = element_text(angle = 90, hjust = 1)) +#+end_src + +#+RESULTS: +[[file:images/trace_malloc1_16.png]] + +#+begin_src R :results output :session *R* :exports both +storage.mode(results$size) <- "double" # 
avoiding integer overflow when taking the product +aggregated_results = aggregate(results$size, by=list(file=results$file), FUN=sum) +head(aggregated_results) +#+end_src + +#+RESULTS: +: file x +: 1 /hpl-2.2/src/comm/hpl_packl.c 9034816 +: 2 /hpl-2.2/src/grid/hpl_reduce.c 3200736 +: 3 /hpl-2.2/src/panel/hpl_pdpanel_init.c 11592866048 +: 4 /hpl-2.2/src/panel/hpl_pdpanel_new.c 3456 +: 5 /hpl-2.2/src/pauxil/hpl_pdlange.c 2560032 +: 6 /hpl-2.2/src/pfact/hpl_pdfact.c 2645504 + +#+begin_src R :file images/trace_malloc2_16.png :results value graphics :session *R* :exports both +library(ggplot2) +ggplot(aggregated_results, aes(x=file, y=x)) + + geom_boxplot() + ggtitle("Sizes of malloc") + theme(axis.text.x = element_text(angle = 90, hjust = 1)) +#+end_src + +#+RESULTS: +[[file:images/trace_malloc2_16.png]] + + +There are several things to notice: +- The biggest chunks are allocated in =HPL_pdtest=. These are the local matrices of each process. +- However, regarding the total quantity of allocated memory, =HPL_pdpanel_init= is the clear winner. +- In these tests, =htop= reported that about 20% of the 16GB of my laptop’s memory was used, i.e. about 3.2GB. We use a + matrix of size 20000, each element is of type =double= (8 bytes), so the total amount of memory for the whole matrix is + 20000^2*8 bytes = 3.2GB. +- Thus, it seems that the =malloc= calls used in =HPL_pdpanel_init= are in fact negligible. A hypothesis is that they are quickly + followed by a =free=. 
+- Verifying that every process allocates the same thing: + +#+begin_src R :file images/trace_malloc3_16.png :results value graphics :session *R* :exports both +library(ggplot2) +ggplot(results[results$file == "/hpl-2.2/testing/ptest/hpl_pdtest.c",], aes(x="", y=size, fill=factor(rank))) + + coord_polar("y", start=0) + + geom_bar(width=1, stat="identity") + + ggtitle("Sizes of malloc in HPL_pdtest") +#+end_src + +#+RESULTS: +[[file:images/trace_malloc3_16.png]] + +#+begin_src R :results output :session *R* :exports both +res_pdtest = results[results$file == "/hpl-2.2/testing/ptest/hpl_pdtest.c",] +unique(res_pdtest[order(res_pdtest$size),]$size) +#+end_src + +#+RESULTS: +: [1] 193729992 196879432 198454152 200080072 201680392 203293512 + +- The different calls to =malloc= in =HPL_pdtest= have approximately the same size, but not exactly. This is understandable: P + and Q may not divide the matrix sizes. Maybe this could cause =SMPI_SHARED_MALLOC= to not work properly? +**** Attempt to use =SMPI_SHARED_MALLOC= and =SMPI_SHARED_FREE= in HPL :SMPI:PERFORMANCE:BUG:HPL: +- Revert the previous changes regarding =malloc=. +- In file =hpl_pdtest.c=, replace =malloc= by =SMPI_SHARED_MALLOC= and =free= by =SMPI_SHARED_FREE=. +- Run HPL with Simgrid. Two issues: + + The memory consumption stays the same, about 20% of my laptop’s memory. A first guess would be that the + =SHARED_MALLOC= did not work, a new allocation was made for every process. Maybe because different sizes were given? + + The execution time (both virtual and real) decreased significantly. The virtual time dropped from 233 to 223 seconds, + the real time from 28 to 15 seconds. If we forget the first point, a guess could be that =SHARED_MALLOC= worked + properly and resulted in a lower number of cache misses (since all processes share the same sub-matrix) and thus + improved performance. This is an experimental bias; we should avoid it. + The fact that we have these two issues combined is very surprising. 
+- Let’s try to see if the =SHARED_MALLOC= makes only one allocation or not, by adding some =printf= in its implementation. + + The path =shmalloc_global= is taken. + + The =bogusfile= is created only once, as expected. + + Then, every process maps the file in memory, chunk by chunk. The base address is not the same for every process, but + this is not an issue (we are speaking of virtual memory here). +- Tested my matrix product program. Got 34% memory utilization, 44 virtual seconds and 8 real seconds with =SMPI_SHARED_MALLOC=, but 11% memory utilization, 81 + virtual seconds and 7 real seconds with =malloc=. Very strange. +- Hypothesis: either the measure of the memory consumption is broken, or =SHARED_MALLOC= is broken. +- Try to use something other than =htop=: + #+begin_src sh + watch -n 0,1 cat /proc/meminfo + #+end_src + + With =malloc= and =free=, the available memory drops from 14.4 GB to 11.0 GB. + + With =SMPI_SHARED_MALLOC= and =SMPI_SHARED_FREE=, the available memory drops from 14.4 GB to 14.1 GB. + This seems more coherent, so =htop= would be a bad tool to measure memory consumption when using =SMPI_SHARED_MALLOC=. + But this does not solve the time issue. +*** 2017-03-12 Sunday +**** Experiment with SMPI macros in the matrix product code :C:R:EXPERIMENTS:PERFORMANCE: +- Use the matrix product code, at commit =91633ea99463109736b900c92f2eacc84630e5b5=. 
Run 10 tests with or without + =SMPI_SHARED_MALLOC= and =SMPI_SAMPLE= with a matrix size of 4000 and 64 processes, by running the command: + #+begin_src sh + ./smpi_macros.py 10 /tmp/results.csv + #+end_src +- Analysis, in R: +#+begin_src R :results output :session *R* :exports both +results <- read.csv("/tmp/results.csv"); +head(results) +#+end_src + +#+RESULTS: +: time size smpi_sample smpi_malloc +: 1 2.134820 4000 1 1 +: 2 2.608971 4000 0 0 +: 3 3.767625 4000 1 0 +: 4 2.412387 4000 0 1 +: 5 3.767162 4000 1 0 +: 6 2.497480 4000 0 0 + +We already see that the case where we use =SMPI_SAMPLE= but not =SMPI_SHARED_MALLOC= seems to be different from the others. + +#+begin_src R :results output :session *R* :exports both +res_aov = aov(time~(smpi_sample + smpi_malloc)^2, data=results) +summary(res_aov) +#+end_src + +#+RESULTS: +: Df Sum Sq Mean Sq F value Pr(>F) +: smpi_sample 1 1.202 1.202 9.227 0.00442 ** +: smpi_malloc 1 4.579 4.579 35.163 8.62e-07 *** +: smpi_sample:smpi_malloc 1 8.332 8.332 63.981 1.68e-09 *** +: Residuals 36 4.688 0.130 +: --- +: Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 + +#+begin_src R :file images/smpi_macros_1.png :results value graphics :results output :session *R* :exports both +suppressWarnings(suppressMessages(library(FrF2))) # FrF2 outputs a bunch of useless messages... 
+MEPlot(res_aov, abbrev=4, select=c(1, 2), response="time")
+#+end_src
+
+#+RESULTS:
+[[file:images/smpi_macros_1.png]]
+
+#+begin_src R :file images/smpi_macros_2.png :results value graphics :results output :session *R* :exports both
+IAPlot(res_aov, abbrev=4, show.alias=FALSE, select=c(1, 2))
+#+end_src
+
+#+RESULTS:
+[[file:images/smpi_macros_2.png]]
+
+#+begin_src R :results output :session *R* :exports both
+mean(results[results$smpi_sample == 0 & results$smpi_malloc == 0,]$time)
+mean(results[results$smpi_sample == 0 & results$smpi_malloc == 1,]$time)
+mean(results[results$smpi_sample == 1 & results$smpi_malloc == 0,]$time)
+mean(results[results$smpi_sample == 1 & results$smpi_malloc == 1,]$time)
+#+end_src
+
+#+RESULTS:
+: [1] 2.513953
+: [1] 2.750056
+: [1] 3.773385
+: [1] 2.183901
+
+- In this small experiment, we see that both macros have a non-negligible impact on the time estimated by SMPI. When
+ none of the optimizations is used, adding one of them decreases the application’s performance. When one of the
+ optimizations is already used, adding the other one increases the application’s performance.
+- When I added the SMPI macros in =matmul.c=, I first added =SMPI_SHARED_MALLOC= and then =SMPI_SAMPLE_GLOBAL=
+ (see the entry for 13/02/2017). According to the tests above, the variation here is not huge (I did not try the
+ configuration with =SMPI_SAMPLE_GLOBAL= and without =SMPI_SHARED_MALLOC=). Furthermore, I did not perform extensive
+ tests. This may explain why I did not notice this sooner.
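The 2^2 analysis above can be cross-checked without R: the four cell means and the interaction contrast are simple arithmetic. A minimal sketch with made-up timings of the same shape as the real data in =/tmp/results.csv=:

```python
# Sketch: cell means and interaction contrast for a 2x2 factorial design,
# mirroring what aov() and MEPlot/IAPlot report. The timings are made up.
from itertools import product
from statistics import mean

# rows: (time, smpi_sample, smpi_malloc) -- hypothetical measurements
rows = [
    (2.51, 0, 0), (2.49, 0, 0),
    (2.76, 0, 1), (2.74, 0, 1),
    (3.78, 1, 0), (3.76, 1, 0),
    (2.18, 1, 1), (2.19, 1, 1),
]

def cell_mean(sample, malloc):
    """Mean time for one (smpi_sample, smpi_malloc) configuration."""
    return mean(t for t, s, m in rows if s == sample and m == malloc)

means = {(s, m): cell_mean(s, m) for s, m in product((0, 1), repeat=2)}

# Interaction effect: half the difference-of-differences. A large magnitude
# means the effect of one macro depends on whether the other is enabled.
interaction = ((means[1, 1] - means[0, 1]) - (means[1, 0] - means[0, 0])) / 2

for k in sorted(means):
    print(k, round(means[k], 3))
print("interaction:", round(interaction, 3))
```

The large negative interaction is exactly the pattern described above: enabling one macro alone hurts, enabling both together helps.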
+*** 2017-03-13 Monday
+**** Let’s play with Grid 5000 :G5K:
+- Connect to Grenoble’s site:
+ #+begin_src sh
+ ssh tocornebize@access.grid5000.fr
+ ssh grenoble
+ #+end_src
+- Reserve a node and deploy:
+ #+begin_src sh
+ oarsub -I -l nodes=1,walltime=7 -t deploy
+ kadeploy3 -f $OAR_NODE_FILE -e jessie-x64-big -k
+ #+end_src
+- Connect as root on the new node:
+ #+begin_src sh
+ ssh root@genepi-33.grenoble.grid5000.fr
+ #+end_src
+- Install Simgrid:
+ #+begin_src sh
+ wget https://github.com/simgrid/simgrid/archive/c8db21208f3436c35d3fdf5a875a0059719bff43.zip -O simgrid.zip
+ unzip simgrid.zip
+ cd simgrid-*
+ mkdir build
+ cd build
+ cmake -Denable_documentation=OFF ..
+ make -j 8
+ make install
+ #+end_src
+- Copy HPL on the machine, with =scp=.
+- Change the variable =TOPdir= in the file =Make.SMPI=.
+- Do not forget to clean the HPL directory when copying it, otherwise the modification of the variable =TOPdir= will not be
+ propagated to the sub-makefiles.
+- Success of compilation and execution of HPL with Simgrid on one Grid5000 node.
+- Strange thing: the virtual time did not change much (228 seconds, or 23.3 Gflops), although the simulation time
+ changed a lot (50 seconds, against 15 seconds on my laptop) and I used the same value for the option =running-power=.
+**** Script for automatic installation :SHELL:G5K:
+- A small bash script to install Simgrid and compile HPL. Store it in file =deploy.sh=. It assumes that the archives for
+ Simgrid and HPL are located in =/home/tocornebize=.
+ #+begin_src sh
+function abort {
+ echo -e "\e[1;31m Error:" $1 "\e[0m"
+ exit 1
+}
+
+rm -rf hpl* simgrid*
+cp /home/tocornebize/{hpl,simgrid}.zip . &&\
+unzip hpl.zip &&\
+unzip simgrid.zip
+if [ $? -ne 0 ]
+then
+ abort "Could not copy or extract the archives."
+fi
+
+echo ""
+echo -e "\e[1;34m Installing Simgrid\e[0m"
+cd simgrid* &&\
+mkdir build &&\
+cd build &&\
+cmake -Denable_documentation=OFF .. &&\
+make -j 8 &&\
+make install &&\
+cd ../..
+if [ $? 
-ne 0 ]
+then
+ abort "Could not install Simgrid."
+fi
+
+echo ""
+echo -e "\e[1;34m Installing HPL\e[0m"
+cd hpl* &&\
+# fix the TOPdir variable
+sed -ri "s|TOPdir\s+=.+|TOPdir="`pwd`"|g" Make.SMPI &&\
+make startup -j 8 arch=SMPI &&\
+make -j 8 arch=SMPI &&\
+cd ..
+if [ $? -ne 0 ]
+then
+ abort "Could not compile HPL."
+fi
+
+echo ""
+echo -e "\e[1;32m Everything was ok\e[0m"
+ #+end_src
+- Given a node obtained with =oarsub= and =kadeploy3=, connect to it with ssh. Then, just run:
+ #+begin_src sh
+ /home/tocornebize/deploy.sh
+ #+end_src
+**** Recurrent failure in HPL with =SMPI_SHARED_MALLOC= :SMPI:BUG:HPL:
+- The following error often happens when running HPL with =SMPI_SHARED_MALLOC=:
+ #+begin_example
+ src/simix/smx_global.cpp:557: [simix_kernel/CRITICAL] Oops ! Deadlock or code not perfectly clean.
+ #+end_example
+- It does not seem to happen without =SMPI_SHARED_MALLOC=.
+- It does not always happen with =SMPI_SHARED_MALLOC=.
+- I do not understand what is happening.
+**** Another failure in HPL with =SMPI_SHARED_MALLOC= :SMPI:BUG:HPL:
+- Similarly, the tests on the matrix at the end of HPL are never computed when we use =SMPI_SHARED_MALLOC=, because of an
+ error. For instance:
+ #+begin_example
+ HPL ERROR from process # 0, on line 331 of function HPL_pdtest:
+ >>> Error code returned by solve is 1021, skip <<<
+ #+end_example
+- Examples of error codes: 1021, 1322, 1324, 1575... These values appear nowhere in the code.
+**** Tracking the error in HPL
+- Put some =printf= to track the error.
+*** 2017-03-14 Tuesday
+**** Keep tracking the error. :SMPI:GIT:TRACING:BUG:
+- Add the option =--cfg=smpi/simulate-computation:0= to have a deterministic execution.
+- The error code is the field =info= of the matrix. 
It is modified in the execution path =HPL_pdgesv= \to =HPL_pdgesv0= \to
+ =HPL_pdpanel_free=, by the following line:
+ #+begin_src c
+ if( PANEL->pmat->info == 0 ) PANEL->pmat->info = *(PANEL->DINFO);
+ #+end_src
+ Thus, we now have to track the values of the =DINFO= field in the panel.
+- Strange thing: the field =DINFO= is a pointer to a floating-point value.
+- To track this, use this function:
+ #+begin_src c
+ void print_info(HPL_T_panel *PANEL, int line) {
+ if(PANEL->grid->myrow == 0 && PANEL->grid->mycol == 0) {
+ printf("info = %f, line = %d\n", *PANEL->DINFO, line);
+ }
+ }
+ #+end_src
+ Put some calls to it at nearly every line of the target file (when you are done with a file, remove these calls).
+- Field =DINFO= is modified in the execution path =HPL_pdgesv0= \to =HPL_pdfact= \to =panel->algo->rffun=. The pointer =rffun= is
+ one of the functions =HPL_pdrpan***=. In our settings, =HPL_pdrpanllT= is used.
+- Field =DINFO= is modified by =PANEL->algo->pffun=, which is one of the functions =HPL_pdpan***=. In our settings,
+ =HPL_pdpanllT= is used.
+- Then it is modified by the first call to =HPL_dlocswpT=. This function directly modifies the value of =DINFO= with the
+ line:
+ #+begin_src c
+ if( *(PANEL->DINFO) == 0.0 )
+ *(PANEL->DINFO) = (double)(PANEL->ia + JJ + 1);
+ #+end_src
+- If we remove this line, as expected the message about the error code disappears. So it confirms the error code comes
+ from here.
+- Looking at =HPL_pdpanel_init.c=,
+ + =DINFO= is a pointer to a part of =DPIV=:
+ #+begin_src c
+ PANEL->DINFO = PANEL->DPIV + JB;
+ #+end_src
+ + =DPIV= is a pointer to a part of =L1=:
+ #+begin_src c
+ PANEL->DPIV = PANEL->L1 + JB * JB;
+ #+end_src
+ + =L1= is an (aligned) alias for =WORK=, which is itself a block of memory allocated with =malloc=:
+ #+begin_src c
+ PANEL->WORK = (void*) malloc((size_t)(lwork) * sizeof(double));
+ // [...] 
+ PANEL->L1 = (double *)HPL_PTR( PANEL->WORK, dalign );
+ #+end_src
+ L1 is the jb \times jb upper block of the local matrix. It is used for computations. Thus, it seems that HPL expects a
+ particular cell of this local matrix to have the value 0. This cell is not always the same.
+ Interpretation: HPL is checking that the matrix is correctly factorized (it uses LU factorization, so it computes L
+ and U such that A=LU, L is lower-triangular and U is upper-triangular). Since we use shared memory, it is not
+ surprising that the correctness check does not pass anymore.
+ What is more surprising is that this particular check was still passing when the two BLAS functions were replaced by
+ =smpi_usleep=. A guess: the fact that the resulting matrices are triangular only depends on the correctness of the
+ swapping of rows.
+- Thus, it seems that the error code is explained. This is normal behavior, considering what we are doing.
+- The deadlock happening in some executions is not explained, however.
+**** Webinar :MEETING:
+- Enabling open and reproducible research at computer system’s conferences: good, bad and ugly
+- Grigori Fursin
+- The speaker created an [[http://ctuning.org/][organization]] about reproducible research.
+- Artifact evaluation is about peer review of experiments.
+- How it works: papers accepted at a conference can ask for an artifact evaluation. If they pass it, they get a
+ nice stamp on the paper. If they fail it, nobody will know. View this as a bonus for a paper. For the evaluation of
+ the artifacts, the conference nominates several reviewers.
+- ACM conferences are also starting to use this kind of thing, with several different stamps.
+- But artifact evaluation is not easy to do. First, there are a lot of artifacts to evaluate, so it is hard to scale. Some
+ artifact evaluations require proprietary software and/or rare hardware (e.g. supercomputers). Also, hard to find a
+ reviewer with suitable skills for some cases. 
+- Also, it is difficult to reproduce empirical results (changing software and hardware). Everyone has their own scripts,
+ so it is hard to standardize a universal workflow.
+- Other [[http://cknowledge.org][website]].
+*** 2017-03-15 Wednesday
+**** Hunting the deadlock :SMPI:PYTHON:R:TRACING:BUG:HPL:
+- With N=40000, P=Q=4 and the option =--cfg=smpi/simulate-computation:0=, it seems we always have a deadlock.
+- Let’s trace it, with option =-trace -trace-file /tmp/trace --cfg=smpi/trace-call-location:1=.
+- Processing the trace file:
+#+begin_src sh
+pj_dump --user-defined --ignore-incomplete-links /tmp/trace > /tmp/trace.csv
+grep "State," /tmp/trace.csv | sed -e 's/()//' -e 's/MPI_STATE, //ig' -e 's/State, //ig' -e 's/rank-//' -e\
+'s/PMPI_/MPI_/' | grep MPI_ | tr 'A-Z' 'a-z' > /tmp/trace_processed.csv
+#+end_src
+
+Clean the paths:
+#+begin_src python
+import re
+reg = re.compile('((?:[^/])*)(?:/[a-zA-Z0-9_-]*)*((?:/hpl-2.2(?:/[a-zA-Z0-9_-]*)*).*)')
+def process(in_file, out_file):
+ with open(in_file, 'r') as in_f:
+ with open(out_file, 'w') as out_f:
+ for line in in_f:
+ match = reg.match(line)
+ out_f.write('%s%s\n' % (match.group(1), match.group(2)))
+process('/tmp/trace_processed.csv', '/tmp/trace_cleaned.csv')
+ #+end_src
+
+Analysis:
+#+begin_src R :results output :session *R* :exports both
+trace <- read.csv("/tmp/trace_cleaned.csv", header=F, strip.white=T, sep=",");
+names(trace) = c("rank", "start", "end", "duration", "level", "state", "Filename", "Linenumber");
+trace$idx = 1:length(trace$rank)
+head(trace)
+#+end_src
+
+#+RESULTS:
+#+begin_example
+ rank start end duration level state
+1 8 0.000000 0.000000 0.000000 0 mpi_init
+2 8 0.000000 0.000202 0.000202 0 mpi_recv
+3 8 0.000202 0.000403 0.000201 0 mpi_recv
+4 8 0.000403 0.000806 0.000403 0 mpi_recv
+5 8 0.000806 0.000806 0.000000 0 mpi_send
+6 8 0.000806 0.001612 0.000806 0 mpi_recv
+ Filename Linenumber idx
+1 /hpl-2.2/testing/ptest/hpl_pddriver.c 109 1
+2 /hpl-2.2/src/grid/hpl_reduce.c 165 2
+3 
/hpl-2.2/src/grid/hpl_reduce.c 165 3 +4 /hpl-2.2/src/grid/hpl_reduce.c 165 4 +5 /hpl-2.2/src/grid/hpl_reduce.c 159 5 +6 /hpl-2.2/src/grid/hpl_broadcast.c 130 6 +#+end_example + +#+begin_src R :results output :session *R* :exports both +get_last_event = function(df) { + result = data.frame() + for(rank in (sort(unique(trace$rank)))) { + tmp_trace = trace[trace$rank == rank,] + result = rbind(result, tmp_trace[which.max(tmp_trace$idx),]) + } + return(result) +} +get_last_event(trace)[c(1, 2, 3, 6, 7, 8)] +#+end_src + +#+RESULTS: +#+begin_example + rank start end state Filename +18756 0 67.01313 67.10575 mpi_recv /hpl-2.2/src/pgesv/hpl_spreadt.c +9391 1 66.84201 67.10575 mpi_recv /hpl-2.2/src/pgesv/hpl_spreadt.c +7865 2 66.92821 67.10575 mpi_recv /hpl-2.2/src/pgesv/hpl_spreadt.c +7048 3 67.01313 67.10575 mpi_recv /hpl-2.2/src/pgesv/hpl_spreadt.c +6242 4 67.08334 67.10575 mpi_send /hpl-2.2/src/pgesv/hpl_rollt.c +4699 5 66.93228 67.10575 mpi_wait /hpl-2.2/src/pgesv/hpl_rollt.c +3174 6 67.02313 67.10575 mpi_wait /hpl-2.2/src/pgesv/hpl_rollt.c +2358 7 67.08334 67.10575 mpi_send /hpl-2.2/src/pgesv/hpl_rollt.c +1554 8 67.08334 67.10575 mpi_recv /hpl-2.2/src/pgesv/hpl_spreadt.c +17201 9 66.93228 67.10575 mpi_send /hpl-2.2/src/pgesv/hpl_spreadt.c +15675 10 67.02313 67.10575 mpi_send /hpl-2.2/src/pgesv/hpl_spreadt.c +14858 11 67.08334 67.10575 mpi_recv /hpl-2.2/src/pgesv/hpl_spreadt.c +14053 12 67.06093 67.10575 mpi_recv /hpl-2.2/src/pgesv/hpl_spreadt.c +12516 13 66.88778 67.10575 mpi_recv /hpl-2.2/src/pgesv/hpl_spreadt.c +10998 14 66.97831 67.10575 mpi_recv /hpl-2.2/src/pgesv/hpl_spreadt.c +10189 15 67.06093 67.10575 mpi_recv /hpl-2.2/src/pgesv/hpl_spreadt.c + Linenumber +18756 321 +9391 321 +7865 321 +7048 321 +6242 235 +4699 242 +3174 242 +2358 235 +1554 321 +17201 351 +15675 351 +14858 321 +14053 321 +12516 321 +10998 321 +10189 321 +#+end_example + +If the trace is correct, the deadlock happens in functions =HPL_rollT= and =HPL_spreadT=. 
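Reading the deadlock off such a last-event table can be mechanized: treat each rank's last blocking call as a wait-for edge and follow the edges until a rank repeats. A small sketch (the edge map below is made up for illustration, not extracted from this trace):

```python
# Sketch: detect a circular wait from "rank X is blocked on rank Y" edges.
# In practice the edges would come from each rank's last blocking MPI call
# and its communication partner.
def find_cycle(waits_for, start):
    """Follow waits_for[rank] from start; return the cycle once a rank repeats."""
    seen = []
    rank = start
    while rank not in seen:
        seen.append(rank)
        rank = waits_for[rank]
    return seen[seen.index(rank):]  # drop the non-cyclic prefix

# Hypothetical wait-for edges among four ranks of one communicator:
edges = {12: 4, 4: 8, 8: 0, 0: 12}
print(find_cycle(edges, 12))  # → [12, 4, 8, 0]
```

The dependency diagrams recorded below for 2017-03-15 and 2017-03-16 are exactly such cycles, drawn by hand.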
+Some =printf= confirm that the deadlock is indeed happening in these places. +**** Found the deadlock :SMPI:C:BUG:HPL: +- Let’s add some =printf= in files =HPL_spreadT.c= and =HPL_rollT.c=. + First, add the functions: + #+begin_src c + int local_rank_to_global(int local_rank, MPI_Comm local_communicator) { + int result; + MPI_Group local_group, world_group; + MPI_Comm_group(local_communicator, &local_group); + MPI_Comm_group(MPI_COMM_WORLD, &world_group); + MPI_Group_translate_ranks(local_group, 1, &local_rank, world_group, &result); + return result; + } + void print_info(int src_rank, int dst_rank, char *function, int line, char *file) { + printf("src=%d dst=%d function=%s line=%d file=%s\n", src_rank, dst_rank, function, + line, file); + } + #+end_src + Then, add a call to =print_info= before each of the four lines we found: + + =HPL_spreadT.c=, line 321: + #+begin_src c + int local_rank = local_rank_to_global(IPMAP[SRCDIST+partner], comm); + print_info(my_rank, local_rank, "mpi_recv", __LINE__, __FILE__); + #+end_src + + =HPL_spreadT.c=, line 351: + #+begin_src c + int local_rank = local_rank_to_global(IPMAP[SRCDIST+partner], comm); + print_info(my_rank, local_rank, "mpi_send", __LINE__, __FILE__); + #+end_src + + =HPL_rollT.c=, line 235: + #+begin_src c + int local_rank = local_rank_to_global(partner, comm); + print_info(my_rank, local_rank, "mpi_send", __LINE__, __FILE__); + #+end_src + + =HPL_rollT.c=, line 242: + #+begin_src c + int local_rank = local_rank_to_global(partner, comm); + print_info(my_rank, local_rank, "mpi_wait", __LINE__, __FILE__); + #+end_src +- Then, run HPL with =stdout= redirected to a file =/tmp/output=. +- For each rank, look for the last time this rank was the caller of a blocking MPI primitive. For instance, for rank =15=: + #+begin_src sh + RANK="15 " && grep "src="$RANK /tmp/output | tail -n 1 + #+end_src + Observe the destination and the function. 
+ With P=Q=4, we had these dependencies:
+ #+begin_example
+ 12
+ |
+ mpi_recv |
+ |
+ v mpi_recv
+ 4 <——————————————+
+ | |
+ | |
+ mpi_wait | |
+ | |
+ v |
+ 8 —————————————> 0
+ mpi_send
+ #+end_example
+ There is the same pattern for {1, 5, 9, 13}, {2, 6, 10, 14} and {3, 7, 11, 15}.
+- This exact deadlock has been reproduced on Grid 5000, with the same parameters.
+*** 2017-03-16 Thursday
+**** Still looking for the deadlock :SMPI:BUG:HPL:
+- When HPL is run with =smpi_usleep= but without =SMPI_SHARED_{MALLOC,FREE}=, there is no deadlock, even with the same parameters (N=40000,
+ P=Q=4). Warning: testing with N=40000 requires a lot of memory, about 12 GB.
+- When HPL is run with =SMPI_SHARED_{MALLOC,FREE}= but without =smpi_usleep=, there is a deadlock. Note that we still use
+ the option =--cfg=smpi/simulate-computation:0=. It happens in the same location, but the deadlock is different. Now, it
+ is like this (and is located only in =HPL_spreadT=):
+ #+begin_example
+ 4
+ |
+ mpi_recv |
+ |
+ v mpi_recv
+ 0 <——————————————+
+ | |
+ | |
+ mpi_send | |
+ | |
+ v |
+ 12 —————————————> 8
+ mpi_send
+ #+end_example
+ There is the same pattern for {1, 5, 9, 13}, {2, 6, 10, 14} and {3, 7, 11, 15}.
+**** Understanding HPL code :SMPI:TRACING:BUG:HPL:
+- In file =HPL_spreadT.c=.
+- In our settings, the following =if= statement is never taken:
+ #+begin_src c
+ if(SIDE == HplLeft)
+ #+end_src
+- In the =else= part, there is a big =do while= loop. Some initializations happen before this loop.
+- =npm1=: initialized to =nprow - SRCDIST - 1=, not modified during the loop.
+- =ip2=: initialized to the biggest power of 2 smaller than or equal to =npm1=. Divided by =2= at each step. The loop stops when
+ =ip2= is =0=.
+- =mask=: initialized to =ip2*2-1= (=ip2= is a single bit set to =1= followed by a bunch of =0=, =mask= is the same bit set to =1=
+ followed by a bunch of =1=). 
At the beginning of each step, the first =1= of =mask= is flipped, so =mask= is =ip2-1= after this
+ statement.
+- =IPMAP=: mapping of the processes.
+- =IPMAPM1=: inverse mapping (=IPMAPM1[IPMAP[i]]= is equal to =i=).
+- =mydist=: initialized to =IPMAPM1[myrow]=, not modified after.
+- =partner=: at each step, set to =mydist^ip2=, i.e. we flip exactly one bit of =mydist=.
+- We do the communications only when =mydist & mask= is =0= and when =lbuf > 0=.
+ + If =mydist & ip2= is not =0=, we receive.
+ + If =mydist & ip2= is =0=, we send.
+- Print the content of =IPMAP=. Add the following line before the =do while=:
+ #+begin_src c
+ printf("IPMAP: my_rank=%d, %d %d %d %d \n", my_rank,
+ local_rank_to_global(IPMAP[0], comm), local_rank_to_global(IPMAP[1], comm),
+ local_rank_to_global(IPMAP[2], comm), local_rank_to_global(IPMAP[3], comm));
+ #+end_src
+ We get this output:
+ #+begin_example
+ IPMAP: my_rank=0, 0 4 12 8
+ IPMAP: my_rank=12, 0 4 12 8
+ IPMAP: my_rank=8, 0 4 8 12
+ IPMAP: my_rank=4, 0 4 12 8
+ IPMAP: my_rank=0, 0 4 12 8
+ IPMAP: my_rank=4, 0 4 12 8
+ IPMAP: my_rank=1, 1 5 13 9
+ IPMAP: my_rank=5, 1 5 13 9
+ IPMAP: my_rank=13, 1 5 13 9
+ IPMAP: my_rank=9, 1 5 9 13
+ IPMAP: my_rank=5, 1 5 13 9
+ IPMAP: my_rank=1, 1 5 13 9
+ IPMAP: my_rank=2, 2 6 14 10
+ IPMAP: my_rank=3, 3 7 15 11
+ IPMAP: my_rank=6, 2 6 14 10
+ IPMAP: my_rank=7, 3 7 15 11
+ IPMAP: my_rank=10, 2 6 10 14
+ IPMAP: my_rank=11, 3 7 11 15
+ IPMAP: my_rank=14, 2 6 14 10
+ IPMAP: my_rank=15, 3 7 15 11
+ IPMAP: my_rank=6, 2 6 14 10
+ IPMAP: my_rank=2, 2 6 14 10
+ IPMAP: my_rank=7, 3 7 15 11
+ IPMAP: my_rank=3, 3 7 15 11
+ #+end_example
+ Recall that our communicators are {n, n+4, n+8, n+12} for n in {0, 1, 2, 3}. We see a pattern here: when processes
+ have a local rank in {0, 1, 3}, their =IPMAP= is {0, 1, 3, 2} (local ranks), but when the local rank is 2, then =IPMAP= is
+ {0, 1, 2, 3}.
+- Now, let’s print the other parameters. 
Add the following line just after the modification of =mask= at the beginning of
+ the =do while=:
+ #+begin_src c
+ printf("### my_rank=%d (%d) id_func=%d mask=%d ip2=%d mydist=%d", my_rank,
+ my_local_rank, id_func, mask, ip2, mydist);
+ #+end_src
+ Here, =id_func= is a static variable initialized to =-1= and incremented at the beginning of every function call.
+ Later in the code, add these:
+ #+begin_src c
+ printf(" partner=%d", partner);
+ #+end_src
+ and
+ #+begin_src c
+ printf(" mpi_recv(%d)\n", IPMAP[SRCDIST+partner]);
+ #+end_src
+ or
+ #+begin_src c
+ printf(" mpi_send(%d)\n", IPMAP[SRCDIST+partner]);
+ #+end_src
+ (depending on whether we do a send or a receive).
+ We have this output for {0, 4, 8, 12} (this is similar for other communicators):
+ #+begin_src bash
+ grep "my_rank=0 " output | grep "###"
+ ### my_rank=0 (0) id_func=0 mask=1 ip2=2 mydist=0 partner=2 mpi_send(3)
+ ### my_rank=0 (0) id_func=0 mask=0 ip2=1 mydist=0 partner=1 mpi_send(1)
+ ### my_rank=0 (0) id_func=1 mask=1 ip2=2 mydist=0 partner=2 mpi_send(3)
+ grep "my_rank=4 " output | grep "###"
+ ### my_rank=4 (1) id_func=0 mask=1 ip2=2 mydist=1
+ ### my_rank=4 (1) id_func=0 mask=0 ip2=1 mydist=1 partner=0 mpi_recv(0)
+ ### my_rank=4 (1) id_func=1 mask=1 ip2=2 mydist=1
+ ### my_rank=4 (1) id_func=1 mask=0 ip2=1 mydist=1 partner=0 mpi_recv(0)
+ grep "my_rank=8 " output | grep "###"
+ ### my_rank=8 (2) id_func=0 mask=1 ip2=2 mydist=2 partner=0 mpi_recv(0)
+ grep "my_rank=12 " output | grep "###"
+ ### my_rank=12 (3) id_func=0 mask=1 ip2=2 mydist=2 partner=0 mpi_recv(0)
+ ### my_rank=12 (3) id_func=0 mask=0 ip2=1 mydist=2 partner=3 mpi_send(2)
+ #+end_src
+ We see that the pattern of communication looks like a binary tree. At each function call, in the first step 0 sends to
+ 12, in the second step 0 sends to 4 and 12 sends to 8. The problem is that all the =mpi_recv= match the =mpi_send= except
+ for the node 8. 
This node calls =mpi_recv= with node 0 as the source, but we would expect it to have 12 as the source.
+ The same pattern is observed for other communicators.
+- We saw that the nodes with local rank 2 call =MPI_Recv= with an unexpected source. These nodes also have a different
+ =IPMAP=. Hypothesis: these different =IPMAP= are a bug.
+- Doing the same experiment without =SMPI_SHARED_{MALLOC,FREE}= (the case where we do not have a deadlock). Here, we
+ observe that the values of =IPMAP= are the same in all processes. Also, there is a matching =MPI_Recv= for every =MPI_Send=,
+ as expected.
+- Thus, to fix the deadlock, we should search where =IPMAP= is defined.
+**** Seminar :MEETING:
+- Taking advantage of application structure for visual performance analysis
+- Lucas Mello Schnorr
+- Context: two models. Explicit programming (e.g. MPI) or task-based programming (e.g. Cilk).
+- In task-based programming, no clear phases (contrary to things like MPI, where we have communication phases and
+ computation phases). Thus, it is hard to understand the performance when visualizing a trace.
+- The scheduler has to assign tasks, anticipate the critical path and minimize data movements. The difficulty is that it
+ does not know the whole DAG at the beginning.
+- Workflow based on several tools: =pj_dump=, =R=, =tidyverse=, =ggplot2=, =plotly=. Everything can be done in org-mode. Agile
+ workflow, fail fast if the idea is not working, easily share experiments with colleagues.
+*** 2017-03-17 Friday
+**** Let’s look at =IPMAP= :SMPI:C:TRACING:BUG:HPL:
+- =IPMAP= is given as an argument to =HPL_spreadT=.
+- The function =HPL_spreadT= is used in =HPL_pdlaswp01T= and =HPL_equil=.
+- In our settings, all processes begin with a call to =HPL_pdlaswp01T=. Then, all processes with local ranks =0= and =1= do a
+ call to =HPL_equil= (local ranks =2= and =3= are already deadlocked). Values of =IPMAP= are the same between the two different
+ calls.
+ We thus have to look at =HPL_pdlaswp01T=. 
+- =IPMAP= is defined in this function with other variables. They are all a contiguous block in =PANEL->IWORK=:
+ #+begin_src c
+ iflag = PANEL->IWORK;
+ // [...]
+ k = (int)((unsigned int)(jb) << 1); ipl = iflag + 1; ipID = ipl + 1;
+ ipA = ipID + ((unsigned int)(k) << 1); lindxA = ipA + 1;
+ lindxAU = lindxA + k; iplen = lindxAU + k; ipmap = iplen + nprow + 1;
+ ipmapm1 = ipmap + nprow; permU = ipmapm1 + nprow; iwork = permU + jb;
+ #+end_src
+- =PANEL->IWORK= is allocated in =HPL_pdpanel_init= with a simple =malloc=. So the bug does not come from here.
+- The content of =IPMAP= is defined in the function =HPL_plindx10=.
+- Function =HPL_plindx10= first computes the content of array =IPLEN=, then calls function =HPL_logsort= to compute =IPMAP= (the
+ content of =IPMAP= depends on the content of =IPLEN=).
+- Printing the content of =IPLEN= just after its initialization.
+ Add this code just before the call to =HPL_logsort=:
+ #+begin_src c
+ int my_rank;
+ MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
+ printf(">> my_rank=%d, icurrow=%d, IPLEN =", my_rank, icurrow);
+ for(i = 0; i <= nprow; i++) {
+ printf(" %d", IPLEN[i]);
+ }
+ printf("\n");
+ #+end_src
+ Here are the contents of =IPLEN= for ranks {0, 4, 8, 12}.
+ + With =SMPI_SHARED_{MALLOC,FREE}=:
+ | Rank | IPLEN[0] | IPLEN[1] | IPLEN[2] | IPLEN[3] | IPLEN[4] |
+ |------+----------+----------+----------+----------+----------|
+ | 0 | 0 | 103 | 14 | 1 | 2 |
+ | 4 | 0 | 102 | 15 | 1 | 2 |
+ | 8 | 0 | 102 | 14 | 2 | 2 |
+ | 12 | 0 | 102 | 14 | 1 | 3 |
+ + Without =SMPI_SHARED_{MALLOC,FREE}=:
+ | Rank | IPLEN[0] | IPLEN[1] | IPLEN[2] | IPLEN[3] | IPLEN[4] |
+ |------+----------+----------+----------+----------+----------|
+ | 0 | 0 | 31 | 24 | 26 | 39 |
+ | 4 | 0 | 31 | 24 | 26 | 39 |
+ | 8 | 0 | 31 | 24 | 26 | 39 |
+ | 12 | 0 | 31 | 24 | 26 | 39 |
+ + We can note two things. First, without =SMPI_SHARED_{MALLOC,FREE}=, all processes have an =IPLEN= with the same
+ content. 
This is not the case with =SMPI_SHARED_{MALLOC,FREE}=. Furthermore, values in =IPLEN= are closer in the
+ =malloc/free= case.
+ Thus, the issue is very likely to come from =IPLEN=.
+**** Let’s look at =IPLEN= and =IPID= :SMPI:TRACING:BUG:HPL:
+- The content of =IPLEN= depends on the content of =IPID=.
+- Add a =printf= to get its content. Every element it contains is present exactly twice in the array.
+- With =SHARED_{MALLOC,FREE}=,
+ + =IPID= has a size of 300 for local rank =0=, 302 for the others.
+ + =IPID= of local rank =1= is equal to =IPID= of local rank =0= plus twice the element =120=.
+ + =IPID= of local rank =2= is equal to =IPID= of local rank =0= plus twice the element =240=.
+ + =IPID= of local rank =3= is equal to =IPID= of local rank =0= plus twice the element =360=.
+- Without =SHARED_{MALLOC,FREE}=,
+ + =IPID= has a size of 478 for all ranks.
+ + All =IPID= are equal.
+- =IPID= is computed in function =HPL_pipid=.
+- The content of =IPID= depends on the content of the array =PANEL->DPIV=. This array is made of =120= elements. These
+ elements are of type =double=. The function casts them to =int= and does some comparisons using them, which is strange.
+- Add a =printf= to get its content.
+- With =SHARED_{MALLOC,FREE}=,
+ + The =DPIV= of the processes having the same local rank are equal.
+ + The first 30 elements of the arrays =DPIV= of the processes of the same communicator are equal. The following elements
+ are different.
+ + For the processes of local rank =0=, these following elements are 30, 31, 32,..., 119. In other words, for =i > 29=, we
+ have =DPIV[i]= equal to =i=.
+ + For the processes of local rank =1=, these elements are all equal to 120. For local rank =2=, they are equal to 240. For
+ local rank =3=, they are equal to 360.
+- Without =SHARED_{MALLOC,FREE}=,
+ + All the =DPIV= of all processes are equal.
+ + All their elements are present exactly once, except =4143= which is present twice. 
+- Thus, it seems that the issue comes from =PANEL->DPIV=.
+**** Summing up :SMPI:BUG:HPL:
+- The values of =IPMAP= depend on the values of =IPLEN=.
+- The values of =IPLEN= depend on the values of =IPID=.
+- The values of =IPID= depend on the values of =PANEL->DPIV=.
+- For all these arrays, we can observe some strange things in the case =SMPI_SHARED_{MALLOC,FREE}= (compared to the case
+ =malloc/free=):
+ + The content of the arrays is not the same for different ranks.
+ + The content itself looks kind of strange (e.g. =DPIV= has a lot of identical values).
+**** So, why do we have these =DPIV=? :SMPI:BUG:HPL:
+- The content of =DPIV= is defined at the end of function =HPL_pdmxswp=, by the line:
+ #+begin_src c
+ (PANEL->DPIV)[JJ] = WORK[2];
+ #+end_src
+ With some =printf=, we see that =DPIV= is filled in order. The values are the same as the ones already observed in =DPIV=.
+*** 2017-03-20 Monday
+**** Write a small Python script to monitor memory usage :PYTHON:
+- Based on command =smemstat=.
+- Run the command every second in quiet mode with json output. Parse the json file and output the information on screen,
+ nicely formatted.
+- Future work:
+ + A different sampling rate passed as a command-line argument.
+ + Export to CSV. This would allow plotting memory consumption over time.
+**** Failed attempts for =DPIV= :SMPI:BUG:HPL:
+- Tried to hard-code the values of =DPIV= with something like this:
+ #+begin_src c
+ (PANEL->DPIV)[JJ] = 42;
+ #+end_src
+ Got a segmentation fault.
+**** Discussion with Arnaud :SMPI:BUG:HPL:MEETING:
+- Had a look at HPL code.
+- Next steps to try to find the issue:
+ + Try another block size for global =SMPI_SHARED_MALLOC=.
+ + Retry local =SMPI_SHARED_MALLOC=.
+ + Try other matrix sizes, other process grids.
+ + In =HPL_pdmxswp=, print the values of =WORK[{0,1,2,3}]= before and after the execution. 
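The monitoring script described in the 2017-03-20 entry could look roughly like this. The =smemstat= invocation and its JSON layout are assumptions here (check the smemstat man page for the actual flags and schema); only the pure formatting part is fully specified:

```python
# Rough sketch of the smemstat-based memory monitor described above.
# ASSUMPTIONS: smemstat can emit one JSON snapshot per invocation, and the
# snapshot contains a list of {"pid", "command", "rss"} entries. The real
# flags and schema may differ.
import json
import subprocess
import time

def format_sample(sample, top=5):
    """Return the `top` largest memory consumers as 'PID COMMAND RSS' lines."""
    procs = sorted(sample["processes"], key=lambda p: p["rss"], reverse=True)
    return ["%6d %-20s %10d" % (p["pid"], p["command"], p["rss"])
            for p in procs[:top]]

def monitor(period=1.0):
    """Poll smemstat every `period` seconds and print the formatted sample."""
    while True:
        out = subprocess.check_output(["smemstat", "-o", "/dev/stdout"])  # flags assumed
        for line in format_sample(json.loads(out)):
            print(line)
        time.sleep(period)

# Demo of the pure formatting part with synthetic data:
sample = {"processes": [{"pid": 4242, "command": "hpl", "rss": 123456},
                        {"pid": 1, "command": "init", "rss": 2048}]}
print("\n".join(format_sample(sample, top=1)))
```

Adding a =--period= argument and CSV output would cover the two future-work items listed in that entry.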
+**** Looking at =WORK[{0,1,2,3}]= :SMPI:C:BUG:HPL:
+- Meaning of the first values of this array:
+ + =WORK[0]= : local maximum absolute value scalar,
+ + =WORK[1]= : corresponding local row index,
+ + =WORK[2]= : corresponding global row index,
+ + =WORK[3]= : coordinate of process owning this max.
+- Just before the call to =HPL_pdmxswp=, these values are computed locally. Then, =HPL_pdmxswp= does some computations to get
+ the global values.
+- Adding some =printf=.
+- Without =SHARED_{MALLOC,FREE}=, the absolute value of =WORK[0]= increases at each call and quickly becomes very large. It
+ reaches =3.8e+302=, then it is =NaN=. This happens regardless of whether we replace BLAS operations by =smpi_usleep=.
+- With =SHARED_{MALLOC,FREE}=, the absolute value of =WORK[0]= is relatively small.
+- If we replace the value of =WORK[0]= by a (small) constant, the simulation terminates without deadlock.
+- Recall that we run the simulation with N=40000 and P=Q=4.
+- The simulation takes 197 seconds, where 170 seconds are actual computations of the application (thus, there is still
+ room for optimization).
+- The estimated performance is 27.5 Gflops. This is a bit higher than what we had before with the matrix of
+ size 20000. We need to check if this difference is due to the higher matrix size (expected and ok) or our dirty hack
+ (not ok).
+*** 2017-03-21 Tuesday
+**** Checking the consistency of =IPMAP= :SMPI:PYTHON:TRACING:BUG:HPL:
+- Before the modification of =WORK[0]=, the =IPMAP= were not consistent on the different processes of the same communicator
+ (see the entry of =16/03/2017=).
+- Let’s check if this issue is fixed.
+- Add the following in =HPL_spreadT.c=:
+ #+begin_src c
+ int my_rank;
+ MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
+ printf("my_rank=%d IPMAP=%d,%d,%d,%d\n", my_rank, IPMAP[0], IPMAP[1], IPMAP[2], IPMAP[3]);
+ #+end_src
+- Run HPL with =stdout= redirected to =/tmp/output=. 
+- Check that at each step, the values of =IPMAP= are the same for the processes of the same communicator. Recall that the
+ communicators are ={0,4,8,12}=, ={1,5,9,13}=, ={2,6,10,14}= and ={3,7,11,15}=.
+#+begin_src python :results output :exports both
+import re
+reg = re.compile('my_rank=([0-9]+) IPMAP=(.+)')
+
+def process(in_file):
+ results = {n: [] for n in range(16)}
+ with open(in_file, 'r') as in_f:
+ for line in in_f:
+ match = reg.match(line)
+ if match is not None:
+ n = int(match.group(1))
+ ipmap = match.group(2)
+ results[n].append(ipmap)
+ for comm in range(4):
+ print('Number of entries for communicator %d: %d' % (comm, len(results[comm])))
+ for rank in range(1, 4):
+ assert results[comm] == results[comm+4*rank]
+ print('OK')
+process('/tmp/output')
+#+end_src
+
+ #+RESULTS:
+ : Number of entries for communicator 0: 913
+ : Number of entries for communicator 1: 904
+ : Number of entries for communicator 2: 907
+ : Number of entries for communicator 3: 910
+ : OK
+- We see here that the values of =IPMAP= are consistent.
+**** Comparison with previous code :SMPI:HPL:
+- Let’s compare with the previous version of the code (without the modification of =WORK[0]=). We use N=20000, P=Q=4.
+ | Code | Virtual time | Gflops | Total simulation time | Time for application computations |
+ |--------+--------------+-----------+-----------------------+-----------------------------------|
+ | *Before* | 222.27 | 2.400e+01 | 19.2529 | 10.0526 |
+ | *After* | 258.28 | 2.065e+01 | 48.2851 | 41.7249 |
+- We find that both the virtual time and the real time are longer, due to a higher amount of time spent in the
+ application.
+- Do not forget to remove the option =--cfg=smpi/simulate-computation:0= when testing for things like that. At first, I
+ did not remove it. The real time was higher but the virtual time was unchanged. 
+- It seems that the modification of =WORK[0]= has led to a change in the behavior of the application, which yields
+  significant differences in terms of performance.
+**** Having a look at what takes time :SMPI:PYTHON:R:EXPERIMENTS:PERFORMANCE:HPL:
+- Using the settings N=20000, P=Q=4. Recall that with these settings, the simulation time was nearly 52 seconds.
+- Simulation time drops to 30 seconds if we disable the calls to =HPL_dgemv= (this was not the case before, according to
+  the experiments of =08/03/2017=).
+- There was no deadlock with N=20000, so we can compare the cases with and without the modification of =WORK[0]=.
+- Modify the definition of =HPL_dgemv= in =HPL_blas.h= for both cases:
+  #+begin_src c
+    #define HPL_dgemv(Order, TransA, M, N, alpha, A, lda, X, incX, beta, Y, incY) ({\
+      int my_rank, buff=0;\
+      MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);\
+      struct timeval before = {};\
+      struct timeval after = {};\
+      gettimeofday(&before, NULL);\
+      cblas_dgemv(Order, TransA, M, N, alpha, A, lda, X, incX, beta, Y, incY);\
+      gettimeofday(&after, NULL);\
+      double time_before = (double)(before.tv_sec) + (double)(before.tv_usec)*1e-6;\
+      double time_after = (double)(after.tv_sec) + (double)(after.tv_usec)*1e-6;\
+      double real_time = time_after-time_before;\
+      printf("file=%s line=%d rank=%d m=%d n=%d lead_A=%d inc_X=%d inc_Y=%d real_time=%f\n", __FILE__, __LINE__, my_rank, M, N, lda, incX, incY, real_time);\
+    })
+  #+end_src
+- Run HPL for both cases, with =stdout= redirected to some file (=/tmp/output_before= when =WORK[0]= is unmodified,
+  =/tmp/output_after= when it is modified).
+- Process the outputs: +#+begin_src python +import re +import csv +reg = re.compile('file=([a-zA-Z0-9/_.-]+) line=([0-9]+) rank=([0-9]+) m=([0-9]+) n=([0-9]+) lead_A=([0-9]+) inc_X=([0-9]+) inc_Y=([0-9]+) real_time=(-?[0-9]+.[0-9]+)') + +def process(in_file, out_file): + with open(in_file, 'r') as in_f: + with open(out_file, 'w') as out_f: + csv_writer = csv.writer(out_f) + csv_writer.writerow(('file', 'line', 'rank', 'm', 'n', 'lead_A', 'inc_X', 'inc_Y', 'real_time')) + for line in in_f: + match = reg.match(line) + if match is not None: + result = list(match.group(i) for i in range(1, 10)) + result[0] = result[0][result[0].index('/hpl'):].lower() + csv_writer.writerow(result) +process('/tmp/output_before', '/tmp/parameters_before.csv') +process('/tmp/output_after', '/tmp/parameters_after.csv') +#+end_src + +- Analysis with R: +#+begin_src R :results output :session *R* :exports both +parameters_before <- read.csv("/tmp/parameters_before.csv"); +parameters_after <- read.csv("/tmp/parameters_after.csv"); +head(parameters_before) +head(parameters_after) +#+end_src + +#+RESULTS: +#+begin_example + file line rank m n lead_A inc_X inc_Y +1 /hpl-2.2/src/pfact/hpl_pdpanllt.c 207 4 5040 1 5040 120 1 +2 /hpl-2.2/src/pfact/hpl_pdpanllt.c 207 12 4920 1 4920 120 1 +3 /hpl-2.2/src/pfact/hpl_pdpanllt.c 207 0 5039 1 5040 120 1 +4 /hpl-2.2/src/pfact/hpl_pdpanllt.c 207 8 5000 1 5000 120 1 +5 /hpl-2.2/src/pfact/hpl_pdpanllt.c 207 12 4920 1 4920 120 1 +6 /hpl-2.2/src/pfact/hpl_pdpanllt.c 207 4 5040 1 5040 120 1 + real_time +1 0.000034 +2 0.000034 +3 0.000030 +4 0.000156 +5 0.000043 +6 0.000031 + file line rank m n lead_A inc_X inc_Y +1 /hpl-2.2/src/pfact/hpl_pdpanllt.c 207 4 5040 1 5040 120 1 +2 /hpl-2.2/src/pfact/hpl_pdpanllt.c 207 12 4920 1 4920 120 1 +3 /hpl-2.2/src/pfact/hpl_pdpanllt.c 207 0 5039 1 5040 120 1 +4 /hpl-2.2/src/pfact/hpl_pdpanllt.c 207 8 5000 1 5000 120 1 +5 /hpl-2.2/src/pfact/hpl_pdpanllt.c 207 12 4920 1 4920 120 1 +6 /hpl-2.2/src/pfact/hpl_pdpanllt.c 207 8 5000 1 
5000   120     1
+   real_time
+ 1  0.000026
+ 2  0.000030
+ 3  0.000030
+ 4  0.000123
+ 5  0.000030
+ 6  0.000028
+#+end_example
+
+#+begin_src R :results output :session *R* :exports both
+sum(parameters_before$real_time)
+sum(parameters_after$real_time)
+#+end_src
+
+#+RESULTS:
+: [1] 0.127207
+: [1] 2.61086
+
+- There is a clear difference between the two cases. When =WORK[0]= is modified, the time spent in the function =HPL_dgemv=
+  is 20 times higher. However, this makes a difference of about 2.5 seconds, whereas a difference of 20 seconds was
+  observed when removing =HPL_dgemv=.
+- Therefore, it seems that removing the calls to =HPL_dgemv= triggers a change in the behavior of the application,
+  resulting in a lower time, but it is not this function itself that takes the time.
+- Note that this was not the case for the functions =HPL_dgemm= and =HPL_dtrsm=: it was the calls to these functions
+  themselves that took time, not a consequence of the calls (just tested: taking the sum of all the times gives a total of
+  about 75 seconds for =HPL_dtrsm= and about 2896 seconds for =HPL_dgemm=).
+- In the experiment of =08/03/2017=, removing =HPL_dgemv= only resulted in a drop of 1 second in the execution time.
+- Thus, it seems that modifying =WORK[0]= has increased the execution time, an increase which is cancelled if we then remove
+  =HPL_dgemv=. Therefore, we should not treat this function like =HPL_dgemm= and =HPL_dtrsm= (replacing it by a =smpi_usleep=); we
+  should simply remove it.
+- If we remove it, we get a virtual time of 226 seconds, i.e. 23.6 Gflops. This is much closer to what we used to
+  have. The simulation time is now 26 seconds; this is worse than what we used to have, but still better than what we
+  had after the modification of =WORK[0]=.
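+- Sanity check on the Gflops figures (a side computation, not part of the experiments): HPL derives the Gflops from the
+  solve time using the standard operation count 2/3·N³ + 2·N² for an N×N system. A small Python sketch (the function
+  name is mine) recovers the values of the table of [2017-03-21 Tue] from the virtual times:
+  #+begin_src python :results output :exports both
+    def hpl_gflops(n, seconds):
+        """HPL's standard operation count (2/3 N^3 + 2 N^2), divided by the time."""
+        return (2.0 / 3.0 * n ** 3 + 2.0 * n ** 2) / seconds / 1e9
+
+    # N=20000: 222.27 s and 258.28 s were the virtual times before/after the change.
+    print('%.3f' % hpl_gflops(20000, 222.27))  # close to the reported 2.400e+01
+    print('%.3f' % hpl_gflops(20000, 258.28))  # close to the reported 2.065e+01
+  #+end_src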
+*** 2017-03-22 Wednesday +**** Better usability of HPL :SMPI:C:HPL: +- Before, HPL code had to be changed by hand to enable or disable SMPI optimizations (=SHARED_{MALLOC,FREE}= and + =smpi_usleep=) and to enable or disable the tracing of BLAS function calls. +- Now, thanks to some preprocessor macros, these different settings can be configured on the command line when + compiling: + #+begin_src sh + # Compile vanilla HPL for SMPI + make arch=SMPI + # Compile HPL for SMPI with the tracing of BLAS function calls + make SMPI_OPTS=-DSMPI_MEASURE + # Compile HPL for SMPI with the SMPI optimizations (shared malloc/free, smpi_usleep) + make SMPI_OPTS=-DSMPI_OPTIMIZATION + #+end_src +- Next step: automation of the computation of the linear regression coefficients, to also pass these coefficients as + preprocessor variables. +**** Script to parse the output file and do the linear regression :SMPI:PYTHON:EXPERIMENTS:TRACING:HPL: +- Everything is done in Python (linear regression included) to simplify the procedure for the user. +- Given an output file =/tmp/output= as produced by a call to HPL (compiled with =-DSMPI_MEASURE= option), call the script: + #+begin_src sh :results output :exports both + python3 ../scripts/linear_regression.py /tmp/output + #+end_src + + #+RESULTS: + : -DSMPI_DGEMM_COEFF=1.097757e-09 -DSMPI_DTRSM_COEFF=9.134754e-08 + It outputs the list of the coefficients found by the linear regressions for the relevant BLAS functions. This list + should then be passed to the variable =SMPI_OPTS= when compiling with =-DSMPI_OPTIMIZATION=. +**** Discussion with Arnaud :SMPI:ORGMODE:PERFORMANCE:HPL:MEETING: +- Possible trip to Bordeaux on the week of [2017-04-10 Mon]-[2017-04-14 Fri]. The goal is to discuss with contributors + of Simgrid. 
+- Found the different times we get when we modify =WORK[0]= very strange, especially because it is computation time (it
+  would be more understandable if it were communication time, since the communication patterns for the pivot exchange are
+  very likely to be impacted). Should do some profiling.
+- Some tips regarding org-mode (tags).
+**** DONE Profile HPL
+:LOGBOOK:
+- State "DONE" from "TODO" [2017-03-27 Mon 17:56]
+- State "TODO" from [2017-03-22 Wed 18:08]
+:END:
+- Use Valgrind with [[http://valgrind.org/docs/manual/cl-manual.html][Callgrind]] and [[http://kcachegrind.sourceforge.net/html/Shot3.html][Kcachegrind]] or [[https://sourceware.org/binutils/docs/gprof/][Gprof]].
+- Do the profiling on unmodified HPL and modified HPL, to see if there is any obvious difference.
+*** 2017-03-23 Thursday
+**** Profiling vanilla HPL :EXPERIMENTS:PERFORMANCE:PROFILING:VALGRIND:SMPI:HPL:
+- Profiling with Valgrind of vanilla HPL (no time measurements nor SMPI optimizations). Add the option =-g= in the Makefile.
+- HPL commit: =4494976bc0dd67e04e54abec2520fd468792527a=.
+- Settings: N=5000, P=Q=4.
+- Compile with the command:
+  #+begin_src sh
+    make -j 4 arch=SMPI
+  #+end_src
+- Run with the command:
+  #+begin_src sh
+    smpirun -wrapper "valgrind --tool=callgrind" --cfg=smpi/bcast:mpich --cfg=smpi/running-power:6217956542.969
+    --cfg=smpi/display-timing:yes --cfg=smpi/privatize-global-variables:yes -np 16 -hostfile ./hostfile_64.txt -platform
+    ./cluster_fat_tree_64.xml ./xhpl
+  #+end_src
+- At first, the package =libatlas3-base= was used for the BLAS functions. The names of these functions were not shown in
+  Kcachegrind.
+  + [[file:callgrind/callgrind.out.19313][Output file]]
+  + Visualization:
+    [[file:callgrind/callgrind.19313.png]]
+- Then, removed this package and installed =libatlas-cpp-0.6-2-dbg=.
+  + [[file:callgrind/callgrind.out.26943][Output file]]
+  + Visualization:
+    [[file:callgrind/callgrind.26943.png]]
+- So now we have the names of the BLAS functions, but the layout is very different.
+- Also, the executions with this library take more time, especially with Valgrind. It also impacts the virtual time and
+  the Gflops.
+- What we observe with this new library seems to be consistent with what we observed previously: =dgemm= is the most time
+  consuming function (by far), =dtrsm= comes after. So maybe this library is good enough to understand what happens, and
+  then we could switch back to the previous library to have good performance.
+**** Profiling modified HPL :EXPERIMENTS:PERFORMANCE:PROFILING:VALGRIND:SMPI:HPL:
+- Profiling with Valgrind of modified HPL. Add the option =-g= in the Makefile.
+- HPL commit: =4494976bc0dd67e04e54abec2520fd468792527a=. Then for each case, a small piece of the code has been modified.
+- Settings: N=5000, P=Q=4.
+- Compile with the command:
+  #+begin_src sh
+    make SMPI_OPTS=-DSMPI_OPTIMIZATION -j 4 arch=SMPI
+  #+end_src
+- Run with the command:
+  #+begin_src sh
+    smpirun -wrapper "valgrind --tool=callgrind" --cfg=smpi/bcast:mpich --cfg=smpi/running-power:6217956542.969
+    --cfg=smpi/display-timing:yes --cfg=smpi/privatize-global-variables:yes -np 16 -hostfile ./hostfile_64.txt -platform
+    ./cluster_fat_tree_64.xml ./xhpl
+  #+end_src
+- Using the library from =libatlas-cpp-0.6-2-dbg=.
+- First experiment, the call to =HPL_dgemv= is a no-op and =WORK[0]= is set to a constant.
+  + [[file:callgrind/callgrind.out.5388][Output file]]
+  + Visualization:
+    [[file:callgrind/callgrind.5388.png]]
+- Second experiment, the call to =HPL_dgemv= is aliased to =cblas_dgemv= and =WORK[0]= is set to a constant.
+  + [[file:callgrind/callgrind.out.10043][Output file]]
+  + Visualization: similar.
+- Third experiment, the call to =HPL_dgemv= is aliased to =cblas_dgemv= and =WORK[0]= is not modified.
+  + [[file:callgrind/callgrind.out.12915][Output file]]
+  + Visualization: similar.
+- It is clear that we can shrink the simulation even further by removing the code that initializes the matrices (this is
+  the code that calls the function =HPL_rand=).
+- There is no explanation for the differences observed with =HPL_dgemv= and =WORK[0]=, the figures look similar. However, the
+  differences observed between the three cases are quite small (in terms of execution time or Gflops).
+**** Comparison of the code :SMPI:HPL:
+- Let’s compare again the different versions of the code, this time with the new CBLAS library (package
+  =libatlas-cpp-0.6-2-dbg=). We use N=20000, P=Q=4.
+  | Code                             | Virtual time | Gflops    | Total simulation time | Time for application computations |
+  |----------------------------------+--------------+-----------+-----------------------+-----------------------------------|
+  | *WORK[0] unmodified, real dgemv* | 223.81       | 2.383e+01 | 15.5049               | 9.5045                            |
+  | *WORK[0] modified, real dgemv*   | 223.74       | 2.384e+01 | 25.9935               | 20.0480                           |
+  | *WORK[0] modified, no-op dgemv*  | 226.28       | 2.357e+01 | 26.3907               | 20.3201                           |
+  Remark: for the first version of the code, the experiment had to be run twice, since the first run ended in a deadlock.
+- The first two rows correspond to the two rows of the table of [2017-03-21 Tue].
+- There is no significant difference in the virtual time and the Gflops.
+- There is a significant difference in the total simulation time and the time spent in application computations, but
+  it is less important than what we previously observed.
+- It is strange that this difference in the computation time does not appear in the virtual time. Note that the option
+  =--cfg=smpi/simulate-computation:0= was not used, so the difference does not come from there.
+**** Seminar :MEETING:
+- Decaf: Decoupled Dataflows for In Situ High-Performance Workflows
+- Mathieu Dreher
+- They do some physics experiment (with a particle collider). 
Then, they analyze the results and build a model. Thus,
+  the whole process has three major steps.
+- In current systems, the bottleneck is the I/O. It will be even worse for future systems (computation speed will be
+  increased, not I/O speed). This is why we should have in-situ workflows (less data to move).
+- In the “classical workflow”, we compute all the iterations, then we analyze them.
+- In the “in situ workflow”, two things are possible. Time partitioning: we compute one iteration, then analyze it,
+  then go back to the computation. Space partitioning: the analysis is done in parallel on other nodes.
+**** Profiling modified HPL, bigger matrices :EXPERIMENTS:PERFORMANCE:PROFILING:VALGRIND:SMPI:HPL:
+- Profiling with Valgrind of modified HPL. Add the option =-g= in the Makefile.
+- HPL commit: =4494976bc0dd67e04e54abec2520fd468792527a=. Then for each case, a small piece of the code has been modified.
+- Settings: N=20000, P=Q=4.
+- Compile with the command:
+  #+begin_src sh
+    make SMPI_OPTS=-DSMPI_OPTIMIZATION -j 4 arch=SMPI
+  #+end_src
+- Run with the command:
+  #+begin_src sh
+    smpirun -wrapper "valgrind --tool=callgrind" --cfg=smpi/bcast:mpich --cfg=smpi/running-power:6217956542.969
+    --cfg=smpi/display-timing:yes --cfg=smpi/privatize-global-variables:yes -np 16 -hostfile ./hostfile_64.txt -platform
+    ./cluster_fat_tree_64.xml ./xhpl
+  #+end_src
+- Using the library from =libatlas-cpp-0.6-2-dbg=.
+- First experiment, the call to =HPL_dgemv= is a no-op and =WORK[0]= is set to a constant.
+  + [[file:callgrind/callgrind.out.6159][Output file]]
+  + Visualization:
+    [[file:callgrind/callgrind.6159.png]]
+- Second experiment, the call to =HPL_dgemv= is aliased to =cblas_dgemv= and =WORK[0]= is set to a constant.
+  + [[file:callgrind/callgrind.out.2590][Output file]]
+  + Visualization:
+    [[file:callgrind/callgrind.2590.png]]
+- Third experiment, the call to =HPL_dgemv= is aliased to =cblas_dgemv= and =WORK[0]= is not modified.
+  + [[file:callgrind/callgrind.out.31804][Output file]]
+  + Visualization:
+    [[file:callgrind/callgrind.31804.png]]
+- The three figures have roughly the same pattern.
+- However, some numbers of the first two figures are twice as large as the corresponding numbers of the third
+  figure. For instance, the biggest =HPL_rand= has 70803540000 in the third figure and 141607080000 in the first two.
+- The reason is that, in the first two cases, =HPL_pdmatgen= is called 32 times, whereas in the last case it is
+  called only 16 times. In our settings, we would expect this function to be called 16 times, since we simulate 16 processes.
+- It is very strange that =WORK[0]= has an impact on the behavior of the matrix generation.
+*** 2017-03-24 Friday
+**** Why =WORK[0]= impacts the number of calls to =HPL_pdmatgen= :SMPI:HPL:
+- Everything happens in the file =HPL_pdtest.c=. This is related to the error code issue discussed on [2017-03-14 Tue].
+- When we use the SMPI optimizations (=smpi_usleep= and =SMPI_SHARED_MALLOC=) without modifying =WORK[0]=, HPL detects an error in
+  the data of the matrices and returns an error code. If we also fix =WORK[0]= to some constant, HPL does not detect this
+  error.
+- After doing the factorization, if no error code has been returned, HPL does some additional tests on the values of the
+  matrix. These tests are quite long, and involve re-generating the matrix by calling =HPL_pdmatgen=.
+- This explains why =WORK[0]= has an impact on the simulation time and the number of times =HPL_pdmatgen= is called.
+- This does not explain the difference in terms of virtual time observed on [2017-03-21 Tue], since the virtual time only
+  covers the factorization, not the initialization and the checks.
+- This difference in virtual time was not re-observed on [2017-03-23 Thu]. Note that another BLAS library was used.
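+- As a reminder of what these four values encode (see the =WORK[{0,1,2,3}]= entry above), the pivot selection can be
+  modeled in a few lines of Python. This is only a sketch with names of my own, not HPL's actual =HPL_pdxswp= code: each
+  process builds a local candidate, and the candidates are then reduced to the global maximum.
+  #+begin_src python :results output :exports both
+    def local_work(column, first_global_row, rank):
+        """Local candidate: entry of max absolute value, its local and global row, owner."""
+        local_row = max(range(len(column)), key=lambda i: abs(column[i]))
+        return [column[local_row], local_row, first_global_row + local_row, rank]
+
+    def pivot_reduce(candidates):
+        """Global reduction: keep the candidate with the largest absolute value."""
+        return max(candidates, key=lambda w: abs(w[0]))
+
+    # Two processes owning different row blocks of the same column:
+    w0 = local_work([1.5, -9.0, 2.0], first_global_row=0, rank=0)
+    w1 = local_work([4.0, 3.0], first_global_row=3, rank=1)
+    print(pivot_reduce([w0, w1]))  # the -9.0 entry wins
+  #+end_src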
+**** Comparison of the code :SMPI:HPL:
+- Let’s compare again the different versions of the code, with the “old” version of the CBLAS library (package
+  =libatlas3-base=). We use N=20000, P=Q=4.
+- This is the same experiment as on [2017-03-23 Thu], except for the BLAS library.
+- The first two rows correspond to the two rows of the table of [2017-03-21 Tue].
+  | Code                             | Virtual time | Gflops    | Total simulation time | Time for application computations |
+  |----------------------------------+--------------+-----------+-----------------------+-----------------------------------|
+  | *WORK[0] unmodified, real dgemv* | 223.68       | 2.385e+01 | 15.8909               | 9.5658                            |
+  | *WORK[0] modified, real dgemv*   | 257.79       | 2.069e+01 | 47.9488               | 41.5125                           |
+  | *WORK[0] modified, no-op dgemv*  | 225.91       | 2.361e+01 | 26.2768               | 20.1776                           |
+- The experiment of [2017-03-21 Tue] is replicated: the first two rows look similar.
+- There is still the big gap in terms of both simulation time and virtual time. The former could be explained by the
+  checks done at the end of HPL, but not the latter (see the previous entry of the journal).
+- Interestingly, the first and the last rows look very similar to the first and the last rows of the [2017-03-23 Thu]
+  experiment, although the BLAS library has changed. The middle row however is very different.
+- These gaps are not replicated when using Valgrind. For all simulations, we have a virtual time of about 262 to 263
+  seconds, which is about 2.03e+01 Gflops.
+**** Removing initialization and checks :SMPI:PERFORMANCE:PROFILING:HPL:
+- The previous experiments demonstrated that the initialization (done before the factorization) and the checks (done
+  after) take a significant amount of time. They are not accounted for in the estimation of the Gflops, so we can safely
+  remove them.
+- Quick experiment, with HPL at commit =cb54a92b8304e0cd2f1728b887cc4cc615334c2d=, N=20000 and P=Q=4, using the library
+  from the package =libatlas3-base=.
+- We get a virtual time of 227.35, which is 2.346e+01 Gflops. It confirms that the initialization and the checks are not
+  accounted for in this measure.
+- The simulation time is now 9.63 seconds, with 3.53 seconds spent in actual computations of the application.
+- We see here that the simulation is already well optimized, there is not much room for additional gains.
+- Profiling with Valgrind:
+  + [[file:callgrind/callgrind.out.5744][Output file]]
+  + Visualization:
+    [[file:callgrind/callgrind.5744.png]]
+- We see here that a large part of the time is spent in functions called by Simgrid (e.g. =memcpy=).
+**** Work on the experiment script :PYTHON:
+- Add three features:
+  + “Dump simulation and application times in the CSV.”
+  + “Dump physical and virtual memory in the CSV.”
+  + “Experiments with random sizes and number of processors.”
+- Example of usage:
+  #+begin_src sh
+    ./run_measures.py --global_csv /tmp/bla.csv --nb_runs 10 --size 1000:2000,4000:5000,20000:21000 --nb_proc 1:4,8,16,32,64
+    --fat_tree "2;8,8;1,1:4;1,1" --experiment HPL
+  #+end_src
+  This will run 10 times, in a random order, all combinations of the parameters:
+  + Matrix size in [1000,2000]\cup[4000,5000]\cup[20000,21000]
+  + Number of processes in {1,2,3,4,8,16,32,64}
+  + Fat-trees =2;8,8;1,1;1,1=, =2;8,8;1,2;1,1=, =2;8,8;1,3;1,1= and =2;8,8;1,4;1,1=.
+  The results are dumped in a CSV file. For each experiment, we store all the parameters (topology, size, number of
+  processes) as well as the interesting metrics (virtual time, Gflops, simulation time, time spent in the application,
+  peak physical and virtual memory used).
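+- The =--size= and =--nb_proc= arguments above use a small “ranges and values” syntax (e.g. =1:4,8,16=). A possible way
+  to parse it, as a sketch of the idea only (the actual =run_measures.py= implementation may differ):
+  #+begin_src python :results output :exports both
+    def parse_spec(spec):
+        """Parse e.g. '1:4,8,16' into the sorted list [1, 2, 3, 4, 8, 16]."""
+        values = set()
+        for part in spec.split(','):
+            if ':' in part:
+                lo, hi = map(int, part.split(':'))
+                values.update(range(lo, hi + 1))  # both bounds included
+            else:
+                values.add(int(part))
+        return sorted(values)
+
+    print(parse_spec('1:4,8,16,32,64'))  # [1, 2, 3, 4, 8, 16, 32, 64]
+  #+end_src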
+*** 2017-03-25 Saturday +**** Time and memory efficiency of HPL simulation :SMPI:R:EXPERIMENTS:PERFORMANCE:HPL: +- HPL commit: =cb54a92b8304e0cd2f1728b887cc4cc615334c2d= +- Script commit: =8af35470776a0b6f2041cf8e0121739f94fdc34d= +- Command line to run the experiment: + #+begin_src sh + ./run_measures.py --global_csv hpl2.csv --nb_runs 3 --size 100,5000,10000,15000,20000,25000,30000,35000,40000 + --nb_proc 1,8,16,24,32,40,48,56,64 --fat_tree "2;8,8;1,8;1,1" --experiment HPL + #+end_src +- Plots: + #+begin_src R :results output :session *R* :exports both + library(ggplot2) + do_plot <- function(my_plot, title) { + return(my_plot + + stat_summary(geom="line", fun.y=mean)+ + stat_summary(fun.data = mean_sdl)+ + ggtitle(title) + ) + } + results <- read.csv('hpl_analysis/hpl.csv') + head(results) + #+end_src + + #+RESULTS: + #+begin_example + topology nb_roots nb_proc size time Gflops simulation_time + 1 2;8,8;1,8;1,1 8 48 40000 593.10 71.940 60.75480 + 2 2;8,8;1,8;1,1 8 40 20000 144.88 36.820 24.53460 + 3 2;8,8;1,8;1,1 8 8 30000 1290.99 13.940 13.39820 + 4 2;8,8;1,8;1,1 8 56 10000 37.93 17.580 12.92780 + 5 2;8,8;1,8;1,1 8 1 30000 9609.94 1.873 3.67895 + 6 2;8,8;1,8;1,1 8 64 10000 27.20 24.510 9.96141 + application_time uss rss + 1 14.47840 701091840 13509701632 + 2 6.44959 327905280 3533713408 + 3 6.14242 217612288 7422472192 + 4 2.55716 211193856 1016156160 + 5 3.58312 5619712 7209476096 + 6 2.10660 179879936 984698880 + #+end_example + + #+begin_src R :file hpl_analysis/1.png :results value graphics :results output :session *R* :exports both + do_plot(ggplot(results, aes(x=size, y=simulation_time, group=nb_proc, color=nb_proc)), + "Simulation time vs size") + #+end_src + + #+RESULTS: + [[file:hpl_analysis/1.png]] + + #+begin_src R :file hpl_analysis/2.png :results value graphics :results output :session *R* :exports both + do_plot(ggplot(results, aes(x=nb_proc, y=simulation_time, group=size, color=size)), + "Simulation time vs number of processes") + #+end_src + + 
#+RESULTS:
+  [[file:hpl_analysis/2.png]]
+
+  #+begin_src R :file hpl_analysis/3.png :results value graphics :results output :session *R* :exports both
+    do_plot(ggplot(results, aes(x=size, y=uss, group=nb_proc, color=nb_proc)),
+            "Physical memory consumption vs size")
+  #+end_src
+
+  #+RESULTS:
+  [[file:hpl_analysis/3.png]]
+
+  #+begin_src R :file hpl_analysis/4.png :results value graphics :results output :session *R* :exports both
+    do_plot(ggplot(results, aes(x=nb_proc, y=uss, group=size, color=size)),
+            "Physical memory consumption vs number of processes")
+  #+end_src
+
+  #+RESULTS:
+  [[file:hpl_analysis/4.png]]
+
+- We see here that despite all the optimizations:
+  + The simulation time seems to be quadratic in the matrix size.
+  + The simulation time seems to be (roughly) linear in the number of processes.
+  + The memory consumption seems to be linear in the matrix size.
+  + The memory consumption seems to be (roughly) linear in the number of processes.
+- There are some irregularities regarding the time and memory vs the number of processes. A hypothesis is that they are
+  due to different virtual topologies. In this experiment, the numbers of processes are multiples of 8. So, some of these
+  numbers are square numbers, others are not. It seems that we achieve the best performance when the number of
+  processes is a square.
+  To generate P and Q, the sizes of the process grid, we try to find two divisors of the number of processes that are
+  reasonably close (if possible). Thus, when the number of processes is a square, we have P=Q.
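+- The choice of P and Q described above can be sketched as follows (a reconstruction of the idea, not the script's
+  actual code): take the largest divisor of the number of processes that is at most its square root.
+  #+begin_src python :results output :exports both
+    def process_grid(nb_proc):
+        """Return (P, Q) with P*Q == nb_proc and P, Q as close as possible."""
+        p = max(d for d in range(1, int(nb_proc ** 0.5) + 1) if nb_proc % d == 0)
+        return p, nb_proc // p
+
+    print([process_grid(n) for n in (8, 16, 24, 64)])  # [(2, 4), (4, 4), (4, 6), (8, 8)]
+  #+end_src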
+*** 2017-03-27 Monday
+**** DONE Remaining work on HPL (following discussion with Arnaud) [3/3] :SMPI:HPL:MEETING:
+:LOGBOOK:
+- State "DONE" from "TODO" [2017-03-27 Mon 17:01]
+- State "TODO" from "TODO" [2017-03-27 Mon 13:27]
+- State "TODO" from "TODO" [2017-03-27 Mon 12:33]
+- State "TODO" from "TODO" [2017-03-27 Mon 12:33]
+- State "TODO" from [2017-03-27 Mon 11:26]
+:END:
+- [X] Do not look further into the =WORK[0]= anomaly.
+- [X] Do careful experiments to validate the optimizations.
+- [X] Currently, the simulation will not scale in memory. Track the sizes of the =malloc= in =HPL_panel_init=.
+**** More detailed analysis of =malloc= :R:TRACING:PERFORMANCE:HPL:
+- We saw that the memory consumption is still too high, we need to reduce it.
+- Let’s go back to the results from [2017-03-17 Fri]. The corresponding CSV file has been copied into the repository
+  =hpl_malloc=.
+- Recall that this is a trace of all the =malloc=, with N=20000 and P=Q=4.
+- We will focus on the file =HPL_pdpanel_init.c= since we suppose that these are the biggest allocations (after the
+  allocation of the matrix).
+  #+begin_src R :results output :session *R* :exports both
+    results <- read.csv("hpl_malloc/malloc.csv");
+    results <- results[results$file == "/hpl-2.2/src/panel/hpl_pdpanel_init.c",]
+    results$idx <- 0:(length(results$size)-1)
+    head(results)
+  #+end_src
+
+  #+RESULTS:
+  :                                      file line rank    size idx
+  : 99  /hpl-2.2/src/panel/hpl_pdpanel_init.c  245    0 4839432   0
+  : 100 /hpl-2.2/src/panel/hpl_pdpanel_init.c  339    0    5344   1
+  : 101 /hpl-2.2/src/panel/hpl_pdpanel_init.c  245    0 4839432   2
+  : 102 /hpl-2.2/src/panel/hpl_pdpanel_init.c  339    0    5344   3
+  : 106 /hpl-2.2/src/panel/hpl_pdpanel_init.c  245    2 9640392   4
+  : 107 /hpl-2.2/src/panel/hpl_pdpanel_init.c  339    2    5344   5
+
+  #+begin_src R :file hpl_malloc/1.png :results value graphics :results output :session *R* :exports both
+    library(ggplot2)
+    ggplot(results, aes(x=idx, y=size, color=factor(line))) + geom_point(alpha=.2) + ggtitle("Sizes of malloc in HPL_pdpanel_init (N=20000, P=Q=4)")
+  #+end_src
+
+  #+RESULTS:
+  [[file:hpl_malloc/1.png]]
+
+- Now that we have removed the matrix allocation, the panel allocation is clearly the one responsible for the high memory
+  consumption. Here, for 16 processes and a matrix of size 20000, this allocation is responsible for 160MB of memory.
+- The =malloc= at line 245 is the one that is a concern. It is made for the =WORK= attribute.
+- The =malloc= at line 339 is not a concern. It is made for the =IWORK= attribute.
+- It is strange that all these allocations are made. Why not allocate the panel once, and then reuse it until the
+  end?
+- It may be difficult to split the panel in two parts (one =SHARED_MALLOC= and one classical =malloc=). In
+  =HPL_pdpanel_init.c=, we can find this comment:
+  #+begin_example
+    * L1: JB x JB in all processes
+    * DPIV: JB in all processes
+    * DINFO: 1 in all processes
+    * We make sure that those three arrays are contiguous in memory for the
+    * later panel broadcast. 
We also choose to put this amount of space + * right after L2 (when it exist) so that one can receive a contiguous + * buffer. + #+end_example +**** Validation of the optimizations :SMPI:R:EXPERIMENTS:HPL: +- Let’s compare vanilla HPL with optimized HPL, to see if the simulation is still faithful. +- Results for optimized HPL are those of [2017-03-25 Sat]. +- Results for vanilla HPL have been freshly generated: + + Using HPL commit =6cc643a5c2a123fa549d02a764bea408b5ad6114= + + Using script commit =7a9e467f9446c65a9dbc2a76c4dab7a3d8209148= + + Command: + #+begin_src sh + ./run_measures.py --global_csv hpl_vanilla.csv --nb_runs 1 --size 100,5000,10000,15000,20000 --nb_proc + 1,8,16,24,32,40,48,56,64 --fat_tree "2;8,8;1,8;1,1" --experiment HPL + #+end_src +- Analysis: + #+begin_src R :results output :session *R* :exports both + library(ggplot2) + optimized_results <- read.csv('hpl_analysis/hpl.csv') + vanilla_results <- read.csv('hpl_analysis/hpl_vanilla.csv') + optimized_results$hpl = 'optimized_hpl' + vanilla_results$hpl = 'vanilla_hpl' + results = rbind(optimized_results, vanilla_results) + head(results) + #+end_src + + #+RESULTS: + #+begin_example + topology nb_roots nb_proc size time Gflops simulation_time + 1 2;8,8;1,8;1,1 8 48 40000 593.10 71.940 60.75480 + 2 2;8,8;1,8;1,1 8 40 20000 144.88 36.820 24.53460 + 3 2;8,8;1,8;1,1 8 8 30000 1290.99 13.940 13.39820 + 4 2;8,8;1,8;1,1 8 56 10000 37.93 17.580 12.92780 + 5 2;8,8;1,8;1,1 8 1 30000 9609.94 1.873 3.67895 + 6 2;8,8;1,8;1,1 8 64 10000 27.20 24.510 9.96141 + application_time uss rss hpl + 1 14.47840 701091840 13509701632 optimized_hpl + 2 6.44959 327905280 3533713408 optimized_hpl + 3 6.14242 217612288 7422472192 optimized_hpl + 4 2.55716 211193856 1016156160 optimized_hpl + 5 3.58312 5619712 7209476096 optimized_hpl + 6 2.10660 179879936 984698880 optimized_hpl + #+end_example + + #+begin_src R :results output :session *R* :exports both + plot_results <- function(nb_proc) { + 
ggplot(results[results$nb_proc==nb_proc,], aes(x=size, y=Gflops, color=hpl)) + + geom_point() + geom_line() + + expand_limits(x=0, y=0) + + ggtitle(paste("Gflops vs size, nb_proc = ", nb_proc)) + } + #+end_src + + #+begin_src R :file hpl_analysis/5.png :results value graphics :results output :session *R* :exports both + plot_results(1) + #+end_src + + #+RESULTS: + [[file:hpl_analysis/5.png]] + + #+begin_src R :file hpl_analysis/6.png :results value graphics :results output :session *R* :exports both + plot_results(8) + #+end_src + + #+RESULTS: + [[file:hpl_analysis/6.png]] + + #+begin_src R :file hpl_analysis/7.png :results value graphics :results output :session *R* :exports both + plot_results(16) + #+end_src + + #+RESULTS: + [[file:hpl_analysis/7.png]] + + #+begin_src R :file hpl_analysis/8.png :results value graphics :results output :session *R* :exports both + plot_results(24) + #+end_src + + #+RESULTS: + [[file:hpl_analysis/8.png]] + + #+begin_src R :file hpl_analysis/9.png :results value graphics :results output :session *R* :exports both + plot_results(32) + #+end_src + + #+RESULTS: + [[file:hpl_analysis/9.png]] + + #+begin_src R :file hpl_analysis/10.png :results value graphics :results output :session *R* :exports both + plot_results(40) + #+end_src + + #+RESULTS: + [[file:hpl_analysis/10.png]] + + #+begin_src R :file hpl_analysis/11.png :results value graphics :results output :session *R* :exports both + plot_results(48) + #+end_src + + #+RESULTS: + [[file:hpl_analysis/11.png]] + + #+begin_src R :file hpl_analysis/12.png :results value graphics :results output :session *R* :exports both + plot_results(56) + #+end_src + + #+RESULTS: + [[file:hpl_analysis/12.png]] + + #+begin_src R :file hpl_analysis/13.png :results value graphics :results output :session *R* :exports both + plot_results(64) + #+end_src + + #+RESULTS: + [[file:hpl_analysis/13.png]] + +- From the above plots, it seems that optimized HPL is always too optimistic in terms of performances. 
However, the
+  difference is not that large.
+
+  #+begin_src R :file hpl_analysis/14.png :results value graphics :results output :session *R* :exports both
+    merged_results = merge(x=vanilla_results, y=optimized_results, by=c("nb_proc", "size"))
+    merged_results$error = abs((merged_results$Gflops.x - merged_results$Gflops.y)/merged_results$Gflops.y)
+    ggplot(merged_results, aes(x=factor(size), y=error)) +
+        geom_boxplot() + geom_jitter(aes(color=nb_proc)) +
+        ggtitle("Error vs size")
+  #+end_src
+
+  #+RESULTS:
+  [[file:hpl_analysis/14.png]]
+
+  #+begin_src R :file hpl_analysis/15.png :results value graphics :results output :session *R* :exports both
+    ggplot(merged_results, aes(x=factor(nb_proc), y=error)) +
+        geom_boxplot() + geom_jitter(aes(color=size)) +
+        ggtitle("Error vs nb_proc")
+  #+end_src
+
+  #+RESULTS:
+  [[file:hpl_analysis/15.png]]
+
+- We see here that the biggest errors made by the simulation are for a size of 100 and 1 process. For larger sizes and
+  numbers of processes, the error never goes above 10%. On average, it is lower than 5%.
+
+  #+begin_src R :file hpl_analysis/16.png :results value graphics :results output :session *R* :exports both
+    ggplot(results[results$nb_proc==64,], aes(x=size, y=simulation_time, color=hpl)) +
+        geom_point() + geom_line() +
+        expand_limits(x=0, y=0) +
+        ggtitle("Simulation time vs size, P=Q=8")
+  #+end_src
+
+  #+RESULTS:
+  [[file:hpl_analysis/16.png]]
+
+  #+begin_src R :file hpl_analysis/17.png :results value graphics :results output :session *R* :exports both
+    ggplot(results[results$nb_proc==64,], aes(x=size, y=uss, color=hpl)) +
+        geom_point() + geom_line() +
+        expand_limits(x=0, y=0) +
+        ggtitle("Real memory vs size, P=Q=8")
+  #+end_src
+
+  #+RESULTS:
+  [[file:hpl_analysis/17.png]]
+
+- There is a very significant gain in terms of memory consumption and simulation time.
+*** 2017-03-28 Tuesday +**** Booked the plane tickets for Bordeaux +**** Attempt at an allocation hack in =HPL_pdpanel_init= :SMPI:C:PERFORMANCE:HPL: +- Greatly inspired by what is done for the global =SMPI_SHARED_MALLOC=. +- The idea is to reserve a large block of virtual addresses. The first part is mapped to a (short) buffer in a cyclic + way. The second part is kept private. +- Currently some bugs (invalid writes, leading to a segmentation fault). +*** 2017-03-29 Wednesday +**** Keep trying to use some shared memory for =PANEL->WORK= :SMPI:C:PERFORMANCE:BUG:HPL: +- The invalid writes of yesterday were on accesses to the =WORK= buffer. I had forgotten the space needed for the buffer =U= at the end + of =WORK=. Now fixed. +- Add some =printf= to see the start and end addresses of the different buffers. Everything seems fine. +- Add a check. We fill the private zone (=DPIV= and =DINFO=) with 0. Then we fill the shared zone with garbage. Finally we + check that the private zone is still full of 0. +- Now, there is an invalid write of 4 bytes, by =HPL_plindx1=, located just after the buffer =IWORK= (the allocation of this + buffer did not change). +- Test for N=5000, P=Q=4. Found that in file =HPL_plindx1.c=, the variable =ipU= reaches 120 in the buggy case, but only 119 in + the normal case. So it is likely that the array is not too short, but rather that this variable is incremented too often. +- In the =for= loop where this happens, =ipU= is incremented when some conditions are fulfilled. One of these conditions is + the combination of these two =if=: + #+begin_src c + if( srcrow == icurrow ) { + if( ( dstrow == icurrow ) && ( dst - ia < jb ) ) + // [...] + #+end_src + When =ipU= reaches 120, the illegal write is: + #+begin_src c + iwork[ipU] = IPLEN[il]; + #+end_src + When this happens, the variable =dst= is 0 and thus the condition =dst - ia < jb= is fulfilled, which should not be the case. +**** Debugging =DPIV= (again) :SMPI:BUG:HPL: +- Add a =printf= in =HPL_pipid.c= (the function which computes =IPID=, using =DPIV=) to see the content of =DPIV=.
+- In the buggy case, the array =DPIV= is sometimes full of 0. This does not happen in the normal case. If we put something + else in =DPIV= when it is allocated, then this is shown instead of the zeroes (e.g. if we put 0, 1, 2...). Thus, in + these cases, =DPIV= is never filled after its initialization. +- Hypothesis: when the panels are sent with MPI, the size is too small and =DPIV= is not sent. +**** Discussion with Arnaud and Augustin :MEETING: +- Instead of putting an empty space between the shared block and the private block (for alignment), make them really + contiguous (and do not share the last page of the “shared” block). +**** Reimplement the shared allocation :SMPI:C:PERFORMANCE:HPL: +- The code was a mess; let’s restart with something better, using Augustin’s idea. +- The interface is as follows: + #+begin_src c + void *allocate_shared(int size, int start_private, int stop_private); + #+end_src + It allocates a contiguous block of virtual addresses of the given size that all fit in a small block of physical memory, + except for a contiguous block located between the indices =start_private= (included) and =stop_private= (excluded). + Calling =allocate_shared(size, size, size)= is (semantically) equivalent to calling =SMPI_SHARED_MALLOC(size)=. + Calling =allocate_shared(size, 0, size)= is (semantically) equivalent to calling =malloc(size)=. +- Similarly to =SHARED_MALLOC=, we map the shared zones by block, onto the same range of addresses. The “best” block size + is open to discussion. +- Since every call to =mmap= is a syscall, we should avoid too small a block size. We used 0x1000 at the beginning and + the performance was terrible. +- Still for performance reasons, if the size is too small, we should simply do a =malloc= (and thus not have any shared zone). +- Valgrind does not report any error (it did with the previous implementation). There are however some small memory + leaks. +- Performance is good. Tested with N=40000, P=Q=8.
Simulation time increased from 85 seconds to 112 seconds. Memory + consumption decreased from 675 MB to 95 MB. The virtual time and the Gflops were not impacted. +**** DONE Remaining work for shared allocation [4/4] +:LOGBOOK: +- State "DONE" from "TODO" [2017-04-05 Wed 17:24] +- State "TODO" from "TODO" [2017-03-30 Thu 09:52] +- State "TODO" from "TODO" [2017-03-30 Thu 09:52] +- State "TODO" from "TODO" [2017-03-30 Thu 09:18] +- State "TODO" from [2017-03-29 Wed 18:31] +:END: +- [X] Track the memory leaks (unclosed file?). +- [X] Clean the block size definition. Put it somewhere accessible by both =HPL_pdpanel_init= and =HPL_pdpanel_free=. Maybe + use two different values for the block size and the condition to switch to a simple =malloc=. +- [X] Find the best value(s) for the block size (and maybe the =malloc= condition). +- [X] Contribute this function to Simgrid. +*** 2017-03-30 Thursday +**** Quick work on shared allocations :SMPI:C:HPL: +- Clean the size definitions. + + Use a separate file that is imported in =HPL_pdpanel_init.c= and =HPL_pdpanel_free.c=. + + Use two different sizes: the block size, and the size at which we switch to =malloc=. +- Quick look at the possibilities for the sizes: + + Some quick experiments with N=40000, P=Q=8. + + With =BLOCK_SIZE= and =MALLOC_MAX_SIZE= equal to 0x10000: + - Simulation time: 112 seconds + - Physical memory: 96 MB + + With =BLOCK_SIZE= equal to 0x10000 and =MALLOC_MAX_SIZE= equal to 0 (never do a =malloc=): + - Simulation time: also 112 seconds + - Physical memory: also 96 MB + + With =BLOCK_SIZE= equal to 0x10000 and =MALLOC_MAX_SIZE= equal to 0x40000 (4 times greater): + - Simulation time: 137 seconds + - Physical memory: 93 MB + + Thus, it seems that the gain from using =malloc= is not significant. Worse: it can yield a significant loss. Let’s + remove it. + + With =BLOCK_SIZE= equal to 0x100000 and =malloc= removed: execution cancelled, all the physical memory was used. +- Stop using =malloc=.
Also move back the size definition to =HPL_pdpanel_init.c=. +- The code is simpler this way, and the =malloc= trick did not give better performance. +- Do not bother with the memory leak. It was already there before the shared allocation. +- *Warning:* calling =munmap= with a size of 0 does not unmap anything and thus gives a huge memory consumption. It should be called with the correct size. +**** Implement the partial =shared_malloc= in Simgrid +- Even more generic implementation than the one done in HPL. Now, we give a list of offsets of the blocks that should be + shared. Thus, we can have an arbitrary mix of shared and private zones inside an allocated block. +- Tests currently fail. To run a single test and see its output, run: + #+begin_src sh + ctest --verbose -R tesh-smpi-macro-shared-thread + #+end_src + I suspect (but did not check) that this is because we currently share only blocks aligned on the block size. + It would be better to share blocks aligned on the page size (need to fix it). But this does not change the fact that + some parts will not be shared. This is expected; we should modify the tests. +**** Time and memory efficiency of the partial =shared_malloc= :SMPI:R:EXPERIMENTS:PERFORMANCE:HPL: +- We switch back to the implementation of partial =shared_malloc= done in HPL, to measure its performance.
+- Simgrid commit: =c8db21208f3436c35d3fdf5a875a0059719bff43= (the same commit as for the previous performance analysis) +- HPL commit: =7af9eb0ec54418bf1521c5eafa9acda1b150446f= +- Script commit: =7a9e467f9446c65a9dbc2a76c4dab7a3d8209148= +- Command line to run the experiment: + #+begin_src sh + ./run_measures.py --global_csv hpl_partial_shared.csv --nb_runs 1 --size 100,5000,10000,15000,20000,25000,30000,35000,40000 + --nb_proc 1,8,16,24,32,40,48,56,64 --fat_tree "2;8,8;1,8;1,1" --experiment HPL + #+end_src +- Analysis: + #+begin_src R :results output :session *R* :exports both + library(ggplot2) + partial_shared_results <- read.csv('hpl_analysis/hpl_partial_shared.csv') + optimized_results <- read.csv('hpl_analysis/hpl.csv') + vanilla_results <- read.csv('hpl_analysis/hpl_vanilla.csv') + partial_shared_results$hpl = 'partial_shared_hpl' + optimized_results$hpl = 'optimized_hpl' + vanilla_results$hpl = 'vanilla_hpl' + results = rbind(partial_shared_results, optimized_results, vanilla_results) + head(results) + #+end_src + + #+RESULTS: + #+begin_example + topology nb_roots nb_proc size time Gflops simulation_time + 1 2;8,8;1,8;1,1 8 24 25000 319.37 32.620000 25.8119000 + 2 2;8,8;1,8;1,1 8 24 5000 13.03 6.399000 2.7273300 + 3 2;8,8;1,8;1,1 8 24 35000 781.76 36.570000 49.3234000 + 4 2;8,8;1,8;1,1 8 40 100 0.23 0.003028 0.0779319 + 5 2;8,8;1,8;1,1 8 1 35000 15257.68 1.873000 5.8686300 + 6 2;8,8;1,8;1,1 8 64 40000 488.99 87.260000 111.7290000 + application_time uss rss hpl + 1 8.0867100 55365632 5274730496 partial_shared_hpl + 2 0.6131710 14643200 257220608 partial_shared_hpl + 3 16.0733000 74350592 10180751360 partial_shared_hpl + 4 0.0196671 0 0 partial_shared_hpl + 5 5.7156200 4775936 9809465344 partial_shared_hpl + 6 29.3046000 95391744 13475909632 partial_shared_hpl +#+end_example + +#+begin_src R :results output :session *R* :exports both +plot_results <- function(nb_proc) { + ggplot(results[results$nb_proc==nb_proc,], aes(x=size, y=Gflops, color=hpl)) + +
geom_point() + geom_line() + + expand_limits(x=0, y=0) + + ggtitle(paste("Gflops vs size, nb_proc = ", nb_proc)) +} +#+end_src + +#+begin_src R :file hpl_analysis/18.png :results value graphics :results output :session *R* :exports both +plot_results(32) +#+end_src + +#+RESULTS: +[[file:hpl_analysis/18.png]] + +#+begin_src R :file hpl_analysis/19.png :results value graphics :results output :session *R* :exports both +plot_results(64) +#+end_src + +#+RESULTS: +[[file:hpl_analysis/19.png]] + +- It seems that this new optimization did not change the accuracy of the simulation. Let’s have a look at the time and + memory. + + #+begin_src R :file hpl_analysis/20.png :results value graphics :results output :session *R* :exports both + ggplot(results[results$nb_proc==64,], aes(x=size, y=simulation_time, color=hpl)) + + geom_point() + geom_line() + + expand_limits(x=0, y=0) + + ggtitle("Simulation time vs size, P=Q=8") + #+end_src + + #+RESULTS: + [[file:hpl_analysis/20.png]] + + #+begin_src R :file hpl_analysis/21.png :results value graphics :results output :session *R* :exports both + ggplot(results[results$nb_proc==64,], aes(x=size, y=uss, color=hpl)) + + geom_point() + geom_line() + + expand_limits(x=0, y=0) + + ggtitle("Real memory vs size, P=Q=8") + #+end_src + + #+RESULTS: + [[file:hpl_analysis/21.png]] + +- We see here that sharing some parts of the =PANEL->WORK= buffer has two effects. The simulation time is a bit larger, + but the memory consumption is much lower. +- Let’s have a look in more detail at this version of HPL.
+ + #+begin_src R :results output :session *R* :exports both + do_plot <- function(my_plot, title) { + return(my_plot + + geom_point() + geom_line() + + ggtitle(title) + ) + } + #+end_src + + #+begin_src R :file hpl_analysis/22.png :results value graphics :results output :session *R* :exports both + do_plot(ggplot(partial_shared_results, aes(x=size, y=simulation_time, group=nb_proc, color=nb_proc)), + "Simulation time vs size") + #+end_src + + #+RESULTS: + [[file:hpl_analysis/22.png]] + + #+begin_src R :file hpl_analysis/23.png :results value graphics :results output :session *R* :exports both + do_plot(ggplot(partial_shared_results, aes(x=nb_proc, y=simulation_time, group=size, color=size)), + "Simulation time vs number of processes") + #+end_src + + #+RESULTS: + [[file:hpl_analysis/23.png]] + + #+begin_src R :file hpl_analysis/24.png :results value graphics :results output :session *R* :exports both + do_plot(ggplot(partial_shared_results, aes(x=size, y=uss, group=nb_proc, color=nb_proc)), + "Physical memory consumption vs size") + #+end_src + + #+RESULTS: + [[file:hpl_analysis/24.png]] + + + #+begin_src R :file hpl_analysis/25.png :results value graphics :results output :session *R* :exports both + do_plot(ggplot(partial_shared_results, aes(x=nb_proc, y=uss, group=size, color=size)), + "Physical memory consumption vs number of processes") + #+end_src + + #+RESULTS: + [[file:hpl_analysis/25.png]] + +- The trend for the simulation time looks similar to what we got previously. +- The memory consumption still looks linear in both the size and the number of processes; the dependence on the number + of processes, however, is almost flat.
+**** Regression of Time and memory efficiency of the partial =shared_malloc= (Arnaud) :SMPI:R:EXPERIMENTS:PERFORMANCE:HPL: + +#+begin_src R :results output :session *R* :exports both +results$hpl=factor(results$hpl) +data = results[results$hpl=="partial_shared_hpl" & + results$nb_proc > 1 & results$size > 1000, # get rid of particularly small values + c("nb_proc","size","Gflops","simulation_time","uss")] +head(data) +#+end_src + +#+RESULTS: +: nb_proc size Gflops simulation_time uss +: 1 24 25000 32.620 25.81190 55365632 +: 2 24 5000 6.399 2.72733 14643200 +: 3 24 35000 36.570 49.32340 74350592 +: 6 64 40000 87.260 111.72900 95391744 +: 7 24 10000 16.600 6.22743 26472448 +: 8 40 40000 55.990 100.31300 91209728 + +#+begin_src R :results output graphics :file hpl_analysis/26.png :exports both :width 600 :height 400 :session *R* +plot(data) +#+end_src + +#+RESULTS: +[[file:hpl_analysis/26.png]] + +#+begin_src R :results output :session *R* :exports both +reg_rss = lm(data=data,uss ~ size+nb_proc) # Interactions do not bring much +summary(reg_rss) +#+end_src + +#+RESULTS: +#+begin_example + +Call: +lm(formula = uss ~ size + nb_proc, data = data) + +Residuals: + Min 1Q Median 3Q Max +-6941093 -1573650 -348763 1611008 8790400 + +Coefficients: + Estimate Std. Error t value Pr(>|t|) +(Intercept) 7.827e+05 1.030e+06 0.760 0.45 +size 2.054e+03 3.045e+01 67.449 < 2e-16 *** +nb_proc 1.717e+05 1.903e+04 9.022 7.85e-13 *** +--- +Signif. 
codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 + +Residual standard error: 2791000 on 61 degrees of freedom +Multiple R-squared: 0.987, Adjusted R-squared: 0.9866 +F-statistic: 2315 on 2 and 61 DF, p-value: < 2.2e-16 +#+end_example + +#+begin_src R :results output graphics :file hpl_analysis/27.png :exports both :width 600 :height 400 :session *R* +par(mfrow=c(2,3)) ; + plot(data=data,uss~size); + plot(data=data,uss~nb_proc); + plot(reg_rss); +par(mfrow=c(1,1)) +#+end_src + +#+RESULTS: +[[file:hpl_analysis/27.png]] + +The Stampede HPL output indicates: +#+BEGIN_EXAMPLE +The following parameter values will be used: + +N : 3875000 +NB : 1024 +PMAP : Column-major process mapping +P : 77 +Q : 78 +PFACT : Right +NBMIN : 4 +NDIV : 2 +RFACT : Crout +BCAST : BlongM +DEPTH : 0 +SWAP : Binary-exchange +L1 : no-transposed form +U : no-transposed form +EQUIL : no +ALIGN : 8 double precision words +#+END_EXAMPLE + +We aim at ~size=3875000~ and ~nb_proc=77*78~. + +#+begin_src R :results output :session *R* :exports both +data[data$nb_proc==64 & data$size==40000,] +data[data$nb_proc==64 & data$size==40000,]$uss/1E6 # in MB +example=data.frame(size=c(3875000,40000), nb_proc=c(77*78,64)); +predict(reg_rss, example, interval="prediction", level=0.95)/1E6 +#+end_src + +#+RESULTS: +: nb_proc size Gflops simulation_time uss +: 6 64 40000 87.26 111.729 95391744 +: [1] 95.39174 +: fit lwr upr +: 1 8991.32610 8664.69163 9317.96056 +: 2 93.93216 88.10931 99.75501 + +So we should need around 8 to 9 GB. Good. 
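As a sanity check of the prediction above, plugging the Stampede dimensions into the fitted coefficients by hand (rounded from the regression summary, with =uss= in bytes) gives the same order of magnitude as the =predict= output:

```latex
\widehat{\mathrm{uss}}
  \approx 7.8\times10^{5}
  + 2.054\times10^{3} \cdot 3.875\times10^{6}
  + 1.717\times10^{5} \cdot (77\cdot 78)
  \approx 7.96\times10^{9} + 1.03\times10^{9}
  \approx 9.0\times10^{9}\ \text{bytes}
  \approx 9\ \mathrm{GB}.
```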
+ +#+begin_src R :results output :session *R* :exports both +reg_time = lm(data=data,simulation_time ~ poly(size,3)*poly(nb_proc,2)) # Interactions do not bring much +summary(reg_time) +reg_time = lm(data=data,simulation_time ~ poly(size,3)+poly(nb_proc,2)+I(size*nb_proc)) # Interactions do not bring much +summary(reg_time) +reg_time = lm(data=data,simulation_time ~ poly(size,2)+poly(nb_proc,1)+I(size*nb_proc)) # Interactions do not bring much +summary(reg_time) +#+end_src + +#+RESULTS: +#+begin_example + +Call: +lm(formula = simulation_time ~ poly(size, 3) * poly(nb_proc, + 2), data = data) + +Residuals: + Min 1Q Median 3Q Max +-14.6972 -2.8188 0.1211 1.4618 23.6037 + +Coefficients: + Estimate Std. Error t value Pr(>|t|) +(Intercept) 34.3882 0.8715 39.458 < 2e-16 *** +poly(size, 3)1 200.7402 6.9721 28.792 < 2e-16 *** +poly(size, 3)2 37.6113 6.9721 5.395 1.71e-06 *** +poly(size, 3)3 0.9386 6.9721 0.135 0.8934 +poly(nb_proc, 2)1 110.2551 6.9721 15.814 < 2e-16 *** +poly(nb_proc, 2)2 -9.0383 6.9721 -1.296 0.2006 +poly(size, 3)1:poly(nb_proc, 2)1 619.6089 55.7771 11.109 2.43e-15 *** +poly(size, 3)2:poly(nb_proc, 2)1 101.1174 55.7771 1.813 0.0756 . +poly(size, 3)3:poly(nb_proc, 2)1 -2.3618 55.7771 -0.042 0.9664 +poly(size, 3)1:poly(nb_proc, 2)2 -54.5865 55.7771 -0.979 0.3323 +poly(size, 3)2:poly(nb_proc, 2)2 -13.4280 55.7771 -0.241 0.8107 +poly(size, 3)3:poly(nb_proc, 2)2 -6.7984 55.7771 -0.122 0.9035 +--- +Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 + +Residual standard error: 6.972 on 52 degrees of freedom +Multiple R-squared: 0.9597, Adjusted R-squared: 0.9511 +F-statistic: 112.5 on 11 and 52 DF, p-value: < 2.2e-16 + +Call: +lm(formula = simulation_time ~ poly(size, 3) + poly(nb_proc, + 2) + I(size * nb_proc), data = data) + +Residuals: + Min 1Q Median 3Q Max +-11.9992 -3.5157 0.0224 2.7090 25.8055 + +Coefficients: + Estimate Std. 
Error t value Pr(>|t|) +(Intercept) -2.954e+00 3.452e+00 -0.856 0.39567 +poly(size, 3)1 4.863e+01 1.527e+01 3.184 0.00236 ** +poly(size, 3)2 3.761e+01 6.930e+00 5.427 1.22e-06 *** +poly(size, 3)3 9.386e-01 6.930e+00 0.135 0.89275 +poly(nb_proc, 2)1 -4.186e+01 1.527e+01 -2.740 0.00818 ** +poly(nb_proc, 2)2 -9.038e+00 6.930e+00 -1.304 0.19742 +I(size * nb_proc) 4.610e-05 4.125e-06 11.176 5.47e-16 *** +--- +Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 + +Residual standard error: 6.93 on 57 degrees of freedom +Multiple R-squared: 0.9563, Adjusted R-squared: 0.9517 +F-statistic: 208 on 6 and 57 DF, p-value: < 2.2e-16 + +Call: +lm(formula = simulation_time ~ poly(size, 2) + poly(nb_proc, + 1) + I(size * nb_proc), data = data) + +Residuals: + Min 1Q Median 3Q Max +-11.8123 -3.6614 0.2628 2.4029 25.7019 + +Coefficients: + Estimate Std. Error t value Pr(>|t|) +(Intercept) -2.954e+00 3.444e+00 -0.858 0.39442 +poly(size, 2)1 4.863e+01 1.524e+01 3.191 0.00227 ** +poly(size, 2)2 3.761e+01 6.914e+00 5.440 1.07e-06 *** +poly(nb_proc, 1) -4.186e+01 1.524e+01 -2.747 0.00797 ** +I(size * nb_proc) 4.610e-05 4.115e-06 11.202 3.08e-16 *** +--- +Signif. 
codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 + +Residual standard error: 6.914 on 59 degrees of freedom +Multiple R-squared: 0.955, Adjusted R-squared: 0.952 +F-statistic: 313.1 on 4 and 59 DF, p-value: < 2.2e-16 +#+end_example + + +#+begin_src R :results output graphics :file hpl_analysis/28.png :exports both :width 600 :height 400 :session *R* +par(mfrow=c(2,3)) ; + plot(data=data,simulation_time~size); + plot(data=data,simulation_time~nb_proc); + plot(reg_time); +par(mfrow=c(1,1)) +#+end_src + +#+RESULTS: +[[file:hpl_analysis/28.png]] + + +#+begin_src R :results output :session *R* :exports both +data[data$nb_proc==64 & data$size==40000,] +predict(reg_time, example, interval="prediction", level=0.95)/3600 # in hours +#+end_src + +#+RESULTS: +: nb_proc size Gflops simulation_time uss +: 6 64 40000 87.26 111.729 95391744 +: fit lwr upr +: 1 467.31578577 385.82615026 548.80542127 +: 2 0.03431702 0.03008967 0.03854438 + +Ouch. This would be a three-week simulation. :( We need to speed things +up. +*** 2017-03-31 Friday +**** Found a bug in the latest commits of Simgrid :SMPI:BUG:HPL: +- Issue reported on [[https://github.com/simgrid/simgrid/issues/147][GitHub]]. +- Bug fixed. +- There are still some problems with HPL: some uninitialized values used in comparisons: + #+begin_example + ==3320== Memcheck, a memory error detector + ==3320== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
+ ==3320== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info + ==3320== Command: ./xhpl --cfg=surf/precision:1e-9 --cfg=network/model:SMPI --cfg=network/TCP-gamma:4194304 --cfg=smpi/bcast:mpich --cfg=smpi/running-power:6217956542.969 --cfg=smpi/display-timing:yes --cfg=smpi/privatize-global-variables:yes --cfg=smpi/shared-malloc:local --cfg=smpi/privatize-global-variables:1 ./cluster_fat_tree_64.xml smpitmp-apprXPdW8 + ==3320== + [0.000000] [xbt_cfg/INFO] Configuration change: Set 'surf/precision' to '1e-9' + [0.000000] [xbt_cfg/INFO] Configuration change: Set 'network/model' to 'SMPI' + [0.000000] [xbt_cfg/INFO] Configuration change: Set 'network/TCP-gamma' to '4194304' + [0.000000] [xbt_cfg/INFO] Configuration change: Set 'smpi/bcast' to 'mpich' + [0.000000] [xbt_cfg/INFO] Configuration change: Set 'smpi/running-power' to '6217956542.969' + [0.000000] [xbt_cfg/INFO] Option smpi/running-power has been renamed to smpi/host-speed. Consider switching. + [0.000000] [xbt_cfg/INFO] Configuration change: Set 'smpi/display-timing' to 'yes' + [0.000000] [xbt_cfg/INFO] Configuration change: Set 'smpi/privatize-global-variables' to 'yes' + [0.000000] [xbt_cfg/INFO] Configuration change: Set 'smpi/shared-malloc' to 'local' + [0.000000] [xbt_cfg/INFO] Configuration change: Set 'smpi/privatize-global-variables' to '1' + [0.000000] [smpi_coll/INFO] Switch to algorithm mpich for collective bcast + ================================================================================ + HPLinpack 2.2 -- High-Performance Linpack benchmark -- February 24, 2016 + Written by A. Petitet and R. Clint Whaley, Innovative Computing Laboratory, UTK + Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK + Modified by Julien Langou, University of Colorado Denver + ================================================================================ + + An explanation of the input/output parameters follows: + T/V : Wall time / encoded variant. 
+ N : The order of the coefficient matrix A. + NB : The partitioning blocking factor. + P : The number of process rows. + Q : The number of process columns. + Time : Time in seconds to solve the linear system. + Gflops : Rate of execution for solving the linear system. + + The following parameter values will be used: + + N : 29 30 34 35 + NB : 1 2 3 4 + PMAP : Row-major process mapping + P : 2 1 4 + Q : 2 4 1 + PFACT : Left Crout Right + NBMIN : 2 4 + NDIV : 2 + RFACT : Left Crout Right + BCAST : 1ring + DEPTH : 0 + SWAP : Mix (threshold = 64) + L1 : transposed form + U : transposed form + EQUIL : yes + ALIGN : 8 double precision words + + -------------------------------------------------------------------------------- + + - The matrix A is randomly generated for each test. + - The following scaled residual check will be computed: + ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N ) + - The relative machine precision (eps) is taken to be 1.110223e-16 + - Computational tests pass if scaled residuals are less than 16.0 + + ==3320== Conditional jump or move depends on uninitialised value(s) + ==3320== at 0x42447D: HPL_pipid (HPL_pipid.c:144) + ==3320== by 0x418ED8: HPL_pdlaswp00T (HPL_pdlaswp00T.c:171) + ==3320== by 0x40E878: HPL_pdupdateTT (HPL_pdupdateTT.c:271) + ==3320== by 0x41AF9F: HPL_pdgesv0 (HPL_pdgesv0.c:152) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + ==3320== Conditional jump or 
move depends on uninitialised value(s) + ==3320== at 0x42476D: HPL_plindx0 (HPL_plindx0.c:246) + ==3320== by 0x418EF6: HPL_pdlaswp00T (HPL_pdlaswp00T.c:172) + ==3320== by 0x40E878: HPL_pdupdateTT (HPL_pdupdateTT.c:271) + ==3320== by 0x41AF9F: HPL_pdgesv0 (HPL_pdgesv0.c:152) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + ==3320== Conditional jump or move depends on uninitialised value(s) + ==3320== at 0x4247A9: HPL_plindx0 (HPL_plindx0.c:250) + ==3320== by 0x418EF6: HPL_pdlaswp00T (HPL_pdlaswp00T.c:172) + ==3320== by 0x40E878: HPL_pdupdateTT (HPL_pdupdateTT.c:271) + ==3320== by 0x41AF9F: HPL_pdgesv0 (HPL_pdgesv0.c:152) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + ==3320== Use of uninitialised value of size 8 + ==3320== at 0x420413: HPL_dlaswp01T (HPL_dlaswp01T.c:240) + ==3320== by 0x418BDD: HPL_pdlaswp00T (HPL_pdlaswp00T.c:194) + ==3320== by 
0x40E878: HPL_pdupdateTT (HPL_pdupdateTT.c:271) + ==3320== by 0x41AF9F: HPL_pdgesv0 (HPL_pdgesv0.c:152) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + ==3320== Conditional jump or move depends on uninitialised value(s) + ==3320== at 0x4E779CC: idamax_ (in /usr/lib/libblas/libblas.so.3.6.0) + ==3320== by 0x4E779FA: idamaxsub_ (in /usr/lib/libblas/libblas.so.3.6.0) + ==3320== by 0x4E4796F: cblas_idamax (in /usr/lib/libblas/libblas.so.3.6.0) + ==3320== by 0x4134F0: HPL_dlocmax (HPL_dlocmax.c:125) + ==3320== by 0x40B277: HPL_pdpanllT (HPL_pdpanllT.c:167) + ==3320== by 0x4243C8: HPL_pdfact (HPL_pdfact.c:129) + ==3320== by 0x41AF61: HPL_pdgesv0 (HPL_pdgesv0.c:146) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== + ==3320== Conditional jump or move depends on uninitialised value(s) + ==3320== at 0x417083: HPL_pdmxswp (HPL_pdmxswp.c:238) + ==3320== by 0x40B4C2: HPL_pdpanllT (HPL_pdpanllT.c:221) + ==3320== by 0x4243C8: HPL_pdfact (HPL_pdfact.c:129) + ==3320== by 0x41AF61: HPL_pdgesv0 (HPL_pdgesv0.c:146) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + 
==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + ==3320== Conditional jump or move depends on uninitialised value(s) + ==3320== at 0x417098: HPL_pdmxswp (HPL_pdmxswp.c:238) + ==3320== by 0x40B4C2: HPL_pdpanllT (HPL_pdpanllT.c:221) + ==3320== by 0x4243C8: HPL_pdfact (HPL_pdfact.c:129) + ==3320== by 0x41AF61: HPL_pdgesv0 (HPL_pdgesv0.c:146) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + ==3320== Conditional jump or move depends on uninitialised value(s) + ==3320== at 0x4170A2: HPL_pdmxswp (HPL_pdmxswp.c:239) + ==3320== by 0x40B4C2: HPL_pdpanllT (HPL_pdpanllT.c:221) + ==3320== by 0x4243C8: HPL_pdfact (HPL_pdfact.c:129) + ==3320== by 0x41AF61: HPL_pdgesv0 (HPL_pdgesv0.c:146) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) 
+ ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + ==3320== Conditional jump or move depends on uninitialised value(s) + ==3320== at 0x4170A4: HPL_pdmxswp (HPL_pdmxswp.c:239) + ==3320== by 0x40B4C2: HPL_pdpanllT (HPL_pdpanllT.c:221) + ==3320== by 0x4243C8: HPL_pdfact (HPL_pdfact.c:129) + ==3320== by 0x41AF61: HPL_pdgesv0 (HPL_pdgesv0.c:146) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + ==3320== Conditional jump or move depends on uninitialised value(s) + ==3320== at 0x4170A6: HPL_pdmxswp (HPL_pdmxswp.c:239) + ==3320== by 0x40B4C2: HPL_pdpanllT (HPL_pdpanllT.c:221) + ==3320== by 0x4243C8: HPL_pdfact (HPL_pdfact.c:129) + ==3320== by 0x41AF61: HPL_pdgesv0 (HPL_pdgesv0.c:146) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: 
operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + ==3320== Conditional jump or move depends on uninitialised value(s) + ==3320== at 0x4150D5: HPL_dlocswpT (HPL_dlocswpT.c:134) + ==3320== by 0x40B4D2: HPL_pdpanllT (HPL_pdpanllT.c:222) + ==3320== by 0x4243C8: HPL_pdfact (HPL_pdfact.c:129) + ==3320== by 0x41AF61: HPL_pdgesv0 (HPL_pdgesv0.c:146) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + ==3320== Conditional jump or move depends on uninitialised value(s) + ==3320== at 0x4150D7: HPL_dlocswpT (HPL_dlocswpT.c:134) + ==3320== by 0x40B4D2: HPL_pdpanllT (HPL_pdpanllT.c:222) + ==3320== by 0x4243C8: HPL_pdfact (HPL_pdfact.c:129) + ==3320== by 0x41AF61: HPL_pdgesv0 (HPL_pdgesv0.c:146) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) 
(ContextRaw.cpp:303) + ==3320== + ==3320== Conditional jump or move depends on uninitialised value(s) + ==3320== at 0x40B4DF: HPL_pdpanllT (HPL_pdpanllT.c:223) + ==3320== by 0x4243C8: HPL_pdfact (HPL_pdfact.c:129) + ==3320== by 0x41AF61: HPL_pdgesv0 (HPL_pdgesv0.c:146) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + ==3320== Conditional jump or move depends on uninitialised value(s) + ==3320== at 0x40B4E1: HPL_pdpanllT (HPL_pdpanllT.c:223) + ==3320== by 0x4243C8: HPL_pdfact (HPL_pdfact.c:129) + ==3320== by 0x41AF61: HPL_pdgesv0 (HPL_pdgesv0.c:146) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + ==3320== Conditional jump or move depends on uninitialised value(s) + ==3320== at 0x42483B: HPL_plindx0 (HPL_plindx0.c:255) + ==3320== by 0x418EF6: HPL_pdlaswp00T (HPL_pdlaswp00T.c:172) + ==3320== by 0x40E878: HPL_pdupdateTT (HPL_pdupdateTT.c:271) + ==3320== by 
0x41AF9F: HPL_pdgesv0 (HPL_pdgesv0.c:152) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + ==3320== Conditional jump or move depends on uninitialised value(s) + ==3320== at 0x424877: HPL_plindx0 (HPL_plindx0.c:269) + ==3320== by 0x418EF6: HPL_pdlaswp00T (HPL_pdlaswp00T.c:172) + ==3320== by 0x40E878: HPL_pdupdateTT (HPL_pdupdateTT.c:271) + ==3320== by 0x41AF9F: HPL_pdgesv0 (HPL_pdgesv0.c:152) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + ==3320== Use of uninitialised value of size 8 + ==3320== at 0x420B90: HPL_dlaswp02N (HPL_dlaswp02N.c:199) + ==3320== by 0x418570: HPL_pdlaswp00T (HPL_pdlaswp00T.c:198) + ==3320== by 0x40E878: HPL_pdupdateTT (HPL_pdupdateTT.c:271) + ==3320== by 0x41AF9F: HPL_pdgesv0 (HPL_pdgesv0.c:152) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: 
smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + ==3320== Use of uninitialised value of size 8 + ==3320== at 0x422901: HPL_dlaswp04T (HPL_dlaswp04T.c:259) + ==3320== by 0x418CC3: HPL_pdlaswp00T (HPL_pdlaswp00T.c:329) + ==3320== by 0x40E878: HPL_pdupdateTT (HPL_pdupdateTT.c:271) + ==3320== by 0x41AF9F: HPL_pdgesv0 (HPL_pdgesv0.c:152) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + ==3320== Conditional jump or move depends on uninitialised value(s) + ==3320== at 0x41F06D: HPL_pdpanel_free (HPL_pdpanel_free.c:79) + ==3320== by 0x41AF31: HPL_pdgesv0 (HPL_pdgesv0.c:141) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: 
operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + ==3320== Conditional jump or move depends on uninitialised value(s) + ==3320== at 0x4248A5: HPL_plindx0 (HPL_plindx0.c:258) + ==3320== by 0x418EF6: HPL_pdlaswp00T (HPL_pdlaswp00T.c:172) + ==3320== by 0x40E878: HPL_pdupdateTT (HPL_pdupdateTT.c:271) + ==3320== by 0x41AF9F: HPL_pdgesv0 (HPL_pdgesv0.c:152) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + ==3320== Conditional jump or move depends on uninitialised value(s) + ==3320== at 0x4203FF: HPL_dlaswp01T (HPL_dlaswp01T.c:237) + ==3320== by 0x418BDD: HPL_pdlaswp00T (HPL_pdlaswp00T.c:194) + ==3320== by 0x40E878: HPL_pdupdateTT (HPL_pdupdateTT.c:271) + ==3320== by 0x41AF9F: HPL_pdgesv0 (HPL_pdgesv0.c:152) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: 
simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + ==3320== Use of uninitialised value of size 8 + ==3320== at 0x4205A0: HPL_dlaswp01T (HPL_dlaswp01T.c:245) + ==3320== by 0x418BDD: HPL_pdlaswp00T (HPL_pdlaswp00T.c:194) + ==3320== by 0x40E878: HPL_pdupdateTT (HPL_pdupdateTT.c:271) + ==3320== by 0x41AF9F: HPL_pdgesv0 (HPL_pdgesv0.c:152) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + ==3320== Conditional jump or move depends on uninitialised value(s) + ==3320== at 0x4170B5: HPL_pdmxswp (HPL_pdmxswp.c:240) + ==3320== by 0x40B4C2: HPL_pdpanllT (HPL_pdpanllT.c:221) + ==3320== by 0x4243C8: HPL_pdfact (HPL_pdfact.c:129) + ==3320== by 0x41AF61: HPL_pdgesv0 (HPL_pdgesv0.c:146) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + ==3320== Conditional jump or move depends on uninitialised value(s) + ==3320== at 0x41F06D: 
HPL_pdpanel_free (HPL_pdpanel_free.c:79) + ==3320== by 0x41F040: HPL_pdpanel_disp (HPL_pdpanel_disp.c:89) + ==3320== by 0x41AFCD: HPL_pdgesv0 (HPL_pdgesv0.c:161) + ==3320== by 0x40EFC4: HPL_pdgesv (HPL_pdgesv.c:103) + ==3320== by 0x406F64: HPL_pdtest (HPL_pdtest.c:197) + ==3320== by 0x401D38: smpi_simulated_main_ (HPL_pddriver.c:223) + ==3320== by 0x525BCDA: smpi_main_wrapper (smpi_global.cpp:366) + ==3320== by 0x5129B8D: operator() (functional.hpp:48) + ==3320== by 0x5129B8D: std::_Function_handler >::_M_invoke(std::_Any_data const&) (functional:1740) + ==3320== by 0x5151BB1: operator() (functional:2136) + ==3320== by 0x5151BB1: operator() (Context.hpp:92) + ==3320== by 0x5151BB1: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:303) + ==3320== + [0.884470] /home/degomme/simgrid/src/simix/smx_global.cpp:567: [simix_kernel/CRITICAL] Oops ! Deadlock or code not perfectly clean. + [0.884470] [simix_kernel/INFO] 16 processes are still running, waiting for something. + [0.884470] [simix_kernel/INFO] Legend of the following listing: "Process (@): " + [0.884470] [simix_kernel/INFO] Process 1 (0@host-0.hawaii.edu): waiting for communication synchro 0xfb4beb0 () in state 0 to finish + [0.884470] [simix_kernel/INFO] Process 2 (1@host-1.hawaii.edu): waiting for communication synchro 0xfb4b0c0 () in state 0 to finish + [0.884470] [simix_kernel/INFO] Process 3 (2@host-2.hawaii.edu): waiting for communication synchro 0xfb49760 () in state 0 to finish + [0.884470] [simix_kernel/INFO] Process 4 (3@host-3.hawaii.edu): waiting for communication synchro 0xfb47590 () in state 0 to finish + [0.884470] [simix_kernel/INFO] Process 5 (4@host-4.hawaii.edu): waiting for synchronization synchro 0xf8a1ae0 () in state 0 to finish + [0.884470] [simix_kernel/INFO] Process 6 (5@host-5.hawaii.edu): waiting for synchronization synchro 0xf8a1f10 () in state 0 to finish + [0.884470] [simix_kernel/INFO] Process 7 (6@host-6.hawaii.edu): waiting for synchronization synchro 0xf897500 
() in state 0 to finish + [0.884470] [simix_kernel/INFO] Process 8 (7@host-7.hawaii.edu): waiting for synchronization synchro 0xf89b190 () in state 0 to finish + [0.884470] [simix_kernel/INFO] Process 9 (8@host-8.hawaii.edu): waiting for synchronization synchro 0xf8a3680 () in state 0 to finish + [0.884470] [simix_kernel/INFO] Process 10 (9@host-9.hawaii.edu): waiting for synchronization synchro 0xf896280 () in state 0 to finish + [0.884470] [simix_kernel/INFO] Process 11 (10@host-10.hawaii.edu): waiting for synchronization synchro 0xf8970d0 () in state 0 to finish + [0.884470] [simix_kernel/INFO] Process 12 (11@host-11.hawaii.edu): waiting for synchronization synchro 0xf89b5c0 () in state 0 to finish + [0.884470] [simix_kernel/INFO] Process 13 (12@host-12.hawaii.edu): waiting for synchronization synchro 0xf89ce30 () in state 0 to finish + [0.884470] [simix_kernel/INFO] Process 14 (13@host-13.hawaii.edu): waiting for synchronization synchro 0xf89f530 () in state 0 to finish + [0.884470] [simix_kernel/INFO] Process 15 (14@host-14.hawaii.edu): waiting for synchronization synchro 0xf89f100 () in state 0 to finish + [0.884470] [simix_kernel/INFO] Process 16 (15@host-15.hawaii.edu): waiting for synchronization synchro 0xf8a0ca0 () in state 0 to finish + ==3320== + ==3320== Process terminating with default action of signal 6 (SIGABRT) + ==3320== at 0x5619428: raise (raise.c:54) + ==3320== by 0x561B029: abort (abort.c:89) + ==3320== by 0x52347B8: xbt_abort (xbt_main.cpp:167) + ==3320== by 0x52F4768: SIMIX_run.part.110 (smx_global.cpp:569) + ==3320== by 0x52F6204: SIMIX_run (stl_algobase.h:224) + ==3320== by 0x5263E66: smpi_main (smpi_global.cpp:474) + ==3320== by 0x560482F: (below main) (libc-start.c:291) + ==3320== + ==3320== HEAP SUMMARY: + ==3320== in use at exit: 136,159,788 bytes in 7,560 blocks + ==3320== total heap usage: 39,378 allocs, 31,818 frees, 140,230,437 bytes allocated + ==3320== + ==3320== LEAK SUMMARY: + ==3320== definitely lost: 321 bytes in 4 blocks + 
==3320== indirectly lost: 0 bytes in 0 blocks + ==3320== possibly lost: 134,294,280 bytes in 96 blocks + ==3320== still reachable: 1,865,187 bytes in 7,460 blocks + ==3320== suppressed: 0 bytes in 0 blocks + ==3320== Rerun with --leak-check=full to see details of leaked memory + ==3320== + ==3320== For counts of detected and suppressed errors, rerun with: -v + ==3320== Use --track-origins=yes to see where uninitialised values come from + ==3320== ERROR SUMMARY: 1147 errors from 24 contexts (suppressed: 0 from 0) + valgrind --track-origins:yes ./xhpl --cfg=surf/precision:1e-9 --cfg=network/model:SMPI --cfg=network/TCP-gamma:4194304 --cfg=smpi/bcast:mpich --cfg=smpi/running-power:6217956542.969 --cfg=smpi/display-timing:yes --cfg=smpi/privatize-global-variables:yes --cfg=smpi/shared-malloc:local --cfg=smpi/privatize-global-variables:1 ./cluster_fat_tree_64.xml smpitmp-apprXPdW8 + Execution failed with code 134. + #+end_example +- Note that this file has been obtained with a nearly-vanilla HPL (see the GitHub issue). No =smpi_usleep=, and shared + =malloc= only for the matrix (no partial shared =malloc= for =PANEL->WORK=). Thus, it is quite strange to see such errors. +- The first error (=HPL_pipid.c:144=) happens because =PANEL->ia= is uninitialized (checked by modifying the two operands one + after the other to see if the error persists).
+** 2017-04 April +** 2017-05 May +** 2017-06 June +*** 2017-06-01 Thursday +**** Redo validation of huge pages :SMPI:EXPERIMENTS:HPL:REPORT: +- Simgrid commit: =9a8e2f5bce8c6758d4367d21a66466a497d136fe= +- HPL commit: =41774905395aebcb73650defaa7e2aa462e6e1a3= +- Script commit: =eb071f09d822e1031ea0776949058bf2f55cb94a= +- Compilation and execution for optimized HPL (made on =nova-10= without the huge pages, =nova-11= with the huge pages) + #+begin_src sh + make SMPI_OPTS="-DSMPI_OPTIMIZATION_LEVEL=4 -DSMPI_DGEMM_COEFFICIENT=1.742435e-10 + -DSMPI_DTRSM_COEFFICIENT=8.897459e-11" arch=SMPI + #+end_src + #+begin_src sh + sysctl -w vm.overcommit_memory=1 && sysctl -w vm.max_map_count=40000000 + + mount none /root/huge -t hugetlbfs -o rw,mode=0777 && echo 1 >> /proc/sys/vm/nr_hugepages + #+end_src + #+begin_src sh + ./run_measures.py --global_csv result_size.csv --nb_runs 3 --size 50000,100000,150000,200000,250000,300000 --nb_proc + 64 --topo "2;16,32;1,16;1,1" --experiment HPL --running_power 5004882812.500 --nb_cpu 8 + + ./run_measures.py --global_csv result_size.csv --nb_runs 3 --size 50000,100000,150000,200000,250000,300000 --nb_proc + 64 --topo "2;16,32;1,16;1,1" --experiment HPL --running_power 5004882812.500 --nb_cpu 8 --hugepage /root/huge + #+end_src +- Analysis + #+begin_src R :results output :session *R* :exports both + library(ggplot2) + library(gridExtra) + old <- rbind(read.csv("validation/result_size_L4_big_nohugepage.csv"), read.csv("validation/result_size_L4_big_nohugepage_2.csv")) + new <- read.csv("validation/result_size_L4_big_hugepage.csv") + old$hugepage = FALSE + new$hugepage = TRUE + results = rbind(old, new) + #+end_src + + #+RESULTS: + + #+begin_src R :file validation/hugepage/1.pdf :results value graphics :results output :session *R* :exports both :width 6 :height 4 + do_plot(results, "size", "simulation_time", "hugepage", "Huge page", 64) + #+end_src + + #+RESULTS: + [[file:validation/hugepage/1.pdf]] + + #+begin_src R :file 
validation/hugepage/3.pdf :results value graphics :results output :session *R* :exports both :width 6 :height 4 + do_plot(results, "size", "memory_size", "hugepage", "Huge page", 64) + #+end_src + + #+results: + [[file:validation/hugepage/3.pdf]] + + #+begin_src R :file validation/hugepage/5.pdf :results value graphics :results output :session *R* :exports both :width 6 :height 4 + do_plot(results, "size", "Gflops", "hugepage", "Huge page", 64) + #+end_src + + #+results: + [[file:validation/hugepage/5.pdf]] + + #+begin_src R :file validation/hugepage/report_plot.pdf :results value graphics :results output :session *R* :exports both :width 12 :height 4 + grid_arrange_shared_legend( + do_plot(results, "size", "simulation_time", "hugepage", "Huge page", 64), + do_plot(results, "size", "memory_size", "hugepage", "Huge page", 64), + nrow=1, ncol=2 + ) + #+end_src + + #+RESULTS: + [[file:validation/hugepage/report_plot.pdf]] + + #+begin_src R :file validation/hugepage/2.png :results value graphics :results output :session *R* :exports both :width 800 :height 400 + plot1 = generic_do_plot(ggplot(results, aes(x=size, y=cpu_utilization, color=hugepage))) + + ggtitle("CPU utilization for different matrix sizes\nUsing 64 MPI processes") + plot2 = generic_do_plot(ggplot(results, aes(x=size, y=minor_page_fault, color=hugepage))) + + ggtitle("Number of page faults for different matrix sizes\nUsing 64 MPI processes") + grid.arrange(plot1, plot2, ncol=2) + #+end_src + + #+RESULTS: + [[file:validation/hugepage/2.png]] + + + #+begin_src R :results output :session *R* :exports both + library(data.table) + aggregate_results <- function(results) { + x = data.table(results) + x = as.data.frame(x[, list(simulation_time=mean(simulation_time), Gflops=mean(Gflops), application_time=mean(application_time)), by=c("size", "nb_proc")]) + return(x[with(x, order(size, nb_proc)),]) + } + aggr_old = aggregate_results(old) + aggr_new = aggregate_results(new) + aggr_new$Gflops_error = 
(aggr_new$Gflops - aggr_old$Gflops)/aggr_new$Gflops + #+end_src + + #+begin_src R :file validation/hugepage/3.png :results value graphics :results output :session *R* :exports both :width 800 :height 400 + generic_do_plot(ggplot(aggr_new, aes(x=size, y=Gflops_error))) + #+end_src + + #+RESULTS: + [[file:validation/hugepage/3.png]] + + - The Gflops error is negligible. + - The gain of using huge pages is pretty neat for both the simulation time and the memory consumption. + - Very large variability of the CPU utilization, something weird has happened. +**** Scalability test :SMPI:EXPERIMENTS:HPL:REPORT: +- Simgrid commit: =9a8e2f5bce8c6758d4367d21a66466a497d136fe= +- HPL commit: =41774905395aebcb73650defaa7e2aa462e6e1a3= +- Script commit: =8cfd8d16787f39a29342b64599cf02166af6d632= +- Compilation and execution for optimized HPL (made on =nova-10= and =nova-11=) + #+begin_src sh + make SMPI_OPTS="-DSMPI_OPTIMIZATION_LEVEL=4 -DSMPI_DGEMM_COEFFICIENT=1.742435e-10 + -DSMPI_DTRSM_COEFFICIENT=8.897459e-11" arch=SMPI + #+end_src + #+begin_src sh + sysctl -w vm.overcommit_memory=1 && sysctl -w vm.max_map_count=40000000 + + mount none /root/huge -t hugetlbfs -o rw,mode=0777 && echo 1 >> /proc/sys/vm/nr_hugepages + #+end_src + #+begin_src sh + ./run_measures.py --global_csv result_size_1000000_512.csv --nb_runs 1 --size 1000000 --nb_proc 512 --topo + "2;16,32;1,16;1,1" --experiment HPL --running_power 5004882812.500 --nb_cpu 8 --hugepage /root/huge + + ./run_measures.py --global_csv result_size_1000000_1024.csv --nb_runs 1 --size 1000000 --nb_proc 1024 --topo + "2;16,32;1,16;1,1" --experiment HPL --running_power 5004882812.500 --nb_cpu 8 --hugepage /root/huge + + ./run_measures.py --global_csv result_size_1000000_2048.csv --nb_runs 1 --size 1000000 --nb_proc 2048 --topo + "2;16,32;1,16;1,1" --experiment HPL --running_power 5004882812.500 --nb_cpu 8 --hugepage /root/huge + + ./run_measures.py --global_csv result_size_2000000_512.csv --nb_runs 1 --size 2000000 --nb_proc 512 
--topo + "2;16,32;1,16;1,1" --experiment HPL --running_power 5004882812.500 --nb_cpu 8 --hugepage /root/huge + + ./run_measures.py --global_csv result_size_2000000_1024.csv --nb_runs 1 --size 2000000 --nb_proc 1024 --topo + "2;16,32;1,16;1,1" --experiment HPL --running_power 5004882812.500 --nb_cpu 8 --hugepage /root/huge + + ./run_measures.py --global_csv result_size_2000000_2048.csv --nb_runs 1 --size 2000000 --nb_proc 2048 --topo + "2;16,32;1,16;1,1" --experiment HPL --running_power 5004882812.500 --nb_cpu 8 --hugepage /root/huge + #+end_src +- Results: + #+begin_src R :results output :session *R* :exports both + rbind( + read.csv('scalability/result_1000000_512.csv'), + read.csv('scalability/result_1000000_1024.csv'), + read.csv('scalability/result_1000000_2048.csv'), + read.csv('scalability/result_2000000_512.csv'), + read.csv('scalability/result_2000000_1024.csv'), + read.csv('scalability/result_2000000_2048.csv') + ) + #+end_src + + #+RESULTS: + #+begin_example + topology nb_roots nb_proc size full_time time Gflops + 1 2;16,32;1,16;1,1 16 512 1000000 716521 716521.0 930.4 + 2 2;16,32;1,16;1,1 16 1024 1000000 363201 363201.0 1836.0 + 3 2;16,32;1,16;1,1 16 2048 1000000 186496 186495.7 3575.0 + 4 2;16,32;1,16;1,1 16 512 2000000 5685080 5685077.7 938.1 + 5 2;16,32;1,16;1,1 16 1024 2000000 2861010 2861012.5 1864.0 + 6 2;16,32;1,16;1,1 16 2048 2000000 1448900 1448899.1 3681.0 + simulation_time application_time user_time system_time major_page_fault + 1 2635.10 500.97 2367.19 259.91 0 + 2 6037.89 1036.96 5515.36 515.05 0 + 3 12391.90 2092.95 11389.36 995.39 0 + 4 6934.86 1169.66 6193.80 683.73 0 + 5 15198.30 2551.10 13714.01 1430.93 0 + 6 32263.60 5236.56 29357.92 2844.89 0 + minor_page_fault cpu_utilization uss rss page_table_size + 1 1916208 0.99 153665536 2317279232 10600000 + 2 2002989 0.99 369676288 4837175296 21252000 + 3 2154982 0.99 1010696192 7774138368 42908000 + 4 3801905 0.99 150765568 2758770688 10604000 + 5 3872820 0.99 365555712 5273034752 21220000 + 
6 4038099 0.99 1009606656 7415914496 42884000 + memory_size + 1 894443520 + 2 1055309824 + 3 1581170688 + 4 3338420224 + 5 3497111552 + 6 4027408384 + #+end_example +**** Add the Stampede output file in the repository :HPL: +- File [[file:fullrun2.run1.notestmode.20000m.log]] +*** 2017-06-02 Friday +**** DONE New scalability tests to run [6/6] :SMPI:HPL: +:LOGBOOK: +- State "DONE" from "TODO" [2017-06-05 Mon 19:33] +- State "TODO" from "TODO" [2017-06-04 Sun 20:22] +- State "TODO" from "TODO" [2017-06-04 Sun 20:22] +- State "TODO" from "TODO" [2017-06-04 Sun 20:22] +- State "TODO" from "TODO" [2017-06-03 Sat 19:05] +- State "TODO" from "TODO" [2017-06-03 Sat 19:05] +- State "TODO" from [2017-06-02 Fri 09:48] +:END: +- [X] N=1000000, nbproc=4096, expected time \approx 206min \times 2.2 \approx 7.5h +- [X] N=2000000, nbproc=4096, expected time \approx 537min \times 2.2 \approx 19.7h +- [X] N=4000000, nbproc=512, expected time \approx 115min \times 2.6 \approx 5h +- [X] N=4000000, nbproc=1024, expected time \approx 253min \times 2.6 \approx 11h +- [X] N=4000000, nbproc=2048, expected time \approx 537min \times 2.6 \approx 23.3h +- [X] N=4000000, nbproc=4096, expected time \approx 537min \times 2.6 \times 2.2 \approx 51h +**** Cannot connect anymore on G5K nodes in Lyon :BUG:G5K: +- Reserved a job and made a deployment in =lyon=. Then, *cannot* connect to the node (both as =tocornebize= and as =root=). +- Reserved a job and made a deployment in =grenoble=. Then, *can* connect to the node (both as =tocornebize= and as =root=). +- Looked at the =.ssh= directories of =grenoble= and =lyon=, they look the same. +- Can =ssh= from =lyon= to =grenoble= (or any other site) but cannot =ssh= from =grenoble= (or any other site) to =lyon=. +- Fixed by replacing the =.ssh= folder from =lyon= by the =.ssh= folder from =grenoble= (might have messed up something...). 
+**** First capacity planning test :SMPI:EXPERIMENTS:HPL:REPORT: +- Simgrid commit: =9a8e2f5bce8c6758d4367d21a66466a497d136fe= +- HPL commit: =41774905395aebcb73650defaa7e2aa462e6e1a3= +- Script commit: =4ff3ccbcbb77e126e454a16dea0535493ff1ff0b= +- Compilation and execution (on =nova-6= and =nova-8=). + #+begin_src sh + make SMPI_OPTS="-DSMPI_OPTIMIZATION_LEVEL=4 -DSMPI_DGEMM_COEFFICIENT=1.742435e-10 + -DSMPI_DTRSM_COEFFICIENT=8.897459e-11" arch=SMPI + #+end_src + #+begin_src sh + sysctl -w vm.overcommit_memory=1 && sysctl -w vm.max_map_count=40000000 + + mount none /root/huge -t hugetlbfs -o rw,mode=0777 && echo 1 >> /proc/sys/vm/nr_hugepages + #+end_src + #+begin_src sh + ./run_measures.py --global_csv result_capacity_50000.csv --nb_runs 1 --size 50000 --nb_proc 512 --topo "2;16,32;1,1:16;1,1" + --experiment HPL --running_power 5004882812.500 --hugepage /root/huge + + ./run_measures.py --global_csv result_capacity_100000.csv --nb_runs 1 --size 100000 --nb_proc 512 --topo "2;16,32;1,1:16;1,1" + --experiment HPL --running_power 5004882812.500 --hugepage /root/huge + #+end_src +- Results: + #+begin_src R :results output :session *R* :exports both + library(ggplot2) + results <- rbind(read.csv("capacity_planning/result_capacity_50000.csv"), read.csv("capacity_planning/result_capacity_100000.csv")) + #+end_src + + #+begin_src R :file capacity_planning/1.png :results value graphics :results output :session *R* :exports both :width 800 :height 400 + ggplot(results, aes(x=nb_roots, y=Gflops, color=size, group=size)) + + stat_summary(fun.y = mean, geom="line")+ + stat_summary(fun.y = mean, geom="point")+ + expand_limits(x=0, y=0)+ + ggtitle("Gflops estimation for different number of root switches and matrix sizes\nUsing 512 MPI processes") + #+end_src + + #+RESULTS: + [[file:capacity_planning/1.png]] + +- In this experiment, we use a fat-tree which has a total of 512 nodes, all having only one core. We use 512 processes, + one per node. 
We change the number of up-ports of the L1 switches and therefore the number of L2 switches. +- Strangely, there is apparently no impact on the performance of HPL: we get the same performance with only one L2 + switch as with 16 L2 switches. +- We could try a bigger matrix, hoping to create some network contention, but the experiment might take some time. +- We could also try a randomly shuffled hostfile, to get a worse mapping and thus more traffic going + through the L2 switches. +- We could also try a “taller” and less “wide” fat-tree: add a third layer of switches, but decrease the + number of ports to keep the same number of nodes. For instance, =3;8,8,8;1,8,16;1,1,1= instead of =2;16,32;1,16;1,1= (both + have 512 nodes). But this is a bit artificial; such a topology would certainly never occur in “real life”. +*** 2017-06-03 Saturday +**** New scalability tests :SMPI:EXPERIMENTS:HPL:REPORT: +- Simgrid commit: =9a8e2f5bce8c6758d4367d21a66466a497d136fe= +- HPL commit: =41774905395aebcb73650defaa7e2aa462e6e1a3= +- Script commit: =4ff3ccbcbb77e126e454a16dea0535493ff1ff0b= +- Compilation and execution (made on =nova-5=, =nova-11=, =nova-13=, =nova-14=): + #+begin_src sh + make SMPI_OPTS="-DSMPI_OPTIMIZATION_LEVEL=4 -DSMPI_DGEMM_COEFFICIENT=1.742435e-10 + -DSMPI_DTRSM_COEFFICIENT=8.897459e-11" arch=SMPI + #+end_src + #+begin_src sh + sysctl -w vm.overcommit_memory=1 && sysctl -w vm.max_map_count=2000000000 + + mount none /root/huge -t hugetlbfs -o rw,mode=0777 && echo 1 >> /proc/sys/vm/nr_hugepages + #+end_src + #+begin_src sh + ./run_measures.py --global_csv result_size_1000000_4096.csv --nb_runs 1 --size 1000000 --nb_proc 4096 --topo + "2;16,32;1,16;1,1;8" --experiment HPL --running_power 5004882812.500 --hugepage /root/huge + + ./run_measures.py --global_csv result_size_4000000_512.csv --nb_runs 1 --size 4000000 --nb_proc 512 --topo + "2;16,32;1,16;1,1;8" --experiment HPL --running_power 5004882812.500 --hugepage
/root/huge + + ./run_measures.py --global_csv result_size_4000000_1024.csv --nb_runs 1 --size 4000000 --nb_proc 1024 --topo + "2;16,32;1,16;1,1;8" --experiment HPL --running_power 5004882812.500 --hugepage /root/huge + + ./run_measures.py --global_csv result_size_2000000_4096.csv --nb_runs 1 --size 2000000 --nb_proc 4096 --topo + "2;16,32;1,16;1,1;8" --experiment HPL --running_power 5004882812.500 --hugepage /root/huge + + ./run_measures.py --global_csv result_size_4000000_2048.csv --nb_runs 1 --size 4000000 --nb_proc 2048 --topo + "2;16,32;1,16;1,1;8" --experiment HPL --running_power 5004882812.500 --hugepage /root/huge + + ./run_measures.py --global_csv result_size_4000000_4096.csv --nb_runs 1 --size 4000000 --nb_proc 4096 --topo + "2;16,32;1,16;1,1;8" --experiment HPL --running_power 5004882812.500 --hugepage /root/huge + + #+end_src + + #+begin_src R :results output :session *R* :exports both + rbind( + read.csv('scalability/result_500000_512.csv'), + read.csv('scalability/result_500000_1024.csv'), + read.csv('scalability/result_500000_2048.csv'), + read.csv('scalability/result_500000_4096.csv'), + read.csv('scalability/result_1000000_4096.csv'), + read.csv('scalability/result_2000000_4096.csv'), + read.csv('scalability/result_4000000_512.csv'), + read.csv('scalability/result_4000000_1024.csv'), + read.csv('scalability/result_4000000_2048.csv'), + read.csv('scalability/result_4000000_4096.csv') + ) + #+end_src + + #+RESULTS: + #+begin_example + topology nb_roots nb_proc size full_time time Gflops + 1 2;16,32;1,16;1,1;8 16 512 500000 91246.1 91246.02 913.3 + 2 2;16,32;1,16;1,1;8 16 1024 500000 46990.1 46990.02 1773.0 + 3 2;16,32;1,16;1,1;8 16 2048 500000 24795.5 24795.50 3361.0 + 4 2;16,32;1,16;1,1;8 16 4096 500000 13561.0 13561.01 6145.0 + 5 2;16,32;1,16;1,1;8 16 4096 1000000 97836.6 97836.54 6814.0 + 6 2;16,32;1,16;1,1;8 16 4096 2000000 742691.0 742690.59 7181.0 + 7 2;16,32;1,16;1,1;8 16 512 4000000 45305100.0 45305083.56 941.8 + 8 2;16,32;1,16;1,1;8 16 1024 
4000000 22723800.0 22723820.45 1878.0 + 9 2;16,32;1,16;1,1;8 16 2048 4000000 11432900.0 11432938.62 3732.0 + 10 2;16,32;1,16;1,1;8 16 4096 4000000 5787160.0 5787164.09 7373.0 + simulation_time application_time user_time system_time major_page_fault + 1 1191.99 204.992 1098.25 93.12 0 + 2 2482.28 441.897 2296.51 184.70 0 + 3 5091.97 872.425 4741.26 349.79 0 + 4 11321.60 1947.320 10640.63 679.53 0 + 5 26052.50 4362.660 24082.38 1966.10 0 + 6 64856.30 10643.600 59444.40 5402.24 0 + 7 17336.50 3030.400 15090.31 1945.23 0 + 8 38380.90 6435.870 34249.71 3827.36 0 + 9 83535.20 13080.500 75523.95 7684.52 0 + 10 169659.00 26745.400 154314.76 15085.08 0 + minor_page_fault cpu_utilization uss rss page_table_size + 1 960072 0.99 155148288 2055086080 10604000 + 2 1054062 0.99 369696768 4383203328 21240000 + 3 1282294 0.99 1012477952 9367576576 42912000 + 4 1852119 0.99 3103875072 15318568960 87740000 + 5 2768705 0.99 3103895552 16934834176 87748000 + 6 4704339 0.99 3102445568 19464646656 87748000 + 7 7663911 0.98 151576576 2056916992 10604000 + 8 7725625 0.99 369872896 4120702976 21212000 + 9 7917525 0.99 1012191232 9221050368 42880000 + 10 8550745 0.99 3113381888 20408209408 87808000 + memory_size + 1 282558464 + 2 429948928 + 3 962826240 + 4 2814042112 + 5 3425406976 + 6 5910134784 + 7 13079060480 + 8 13275557888 + 9 13825183744 + 10 15763668992 +#+end_example +- +Memory measurement failed for the experiments with 4096 nodes (=smpimain= took too much time to start, so its PID was+ + +not found by =run_measure.py= at the beginning, so it assumed it was already terminated... 
really need to find something+ + +more robust).+ +- For the record, ran this command on the nodes (same command used in the script to estimate the memory consumption): + #+begin_src sh + python3 -c "import psutil; print(psutil.virtual_memory().available)" + #+end_src +- Result: + + For size=2000000 and nbproc=4096: 60468817920 + + For size=4000000 and nbproc=1024: 53105373184 + + For size=4000000 and nbproc=2048: 52539293696 + + For size=4000000 and nbproc=4096: 50614239232 +- On a freshly deployed node, the same command returns 66365100032 +*** 2017-06-04 Sunday +**** Investigate capacity planning: small test program :SMPI:HPL: +- As mentioned in [2017-06-02 Fri], the duration of HPL does not seem to be impacted by the topology, which is strange. +- Implemented a small test program, called =network_test=. It takes as arguments a size and a number of iterations. Every + process sends the given number of messages, each having the given size, to the next process (and thus receives from + the previous one). +- Tested with the following topology (only changing the fat-tree description): + #+begin_example + + + + + + + + + #+end_example +- Results for one iteration: + + With a size of =200000000= and the fat-tree =2;4,4;1,4;1,1=, it takes a time of 1.28 seconds. + + With a size of =200000000= and the fat-tree =2;4,4;1,1;1,1=, it takes a time of 2.69 seconds. + + With a size of =200000= and the fat-tree =2;4,4;1,4;1,1=, it takes a time of 0.0025 seconds. + + With a size of =200000= and the fat-tree =2;4,4;1,1;1,1=, it takes a time of 0.0040 seconds. + + With a size of =2000= and the fat-tree =2;4,4;1,4;1,1=, it takes a time of 0.0004 seconds. + + With a size of =2000= and the fat-tree =2;4,4;1,1;1,1=, it takes a time of 0.0004 seconds. +- Thus, for a large enough size, the difference is very clear: the topology *does* have a high impact. For small messages, + however, this is not the case. +- It does not seem to change for several iterations. 
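The ring timings above behave as a simple affine cost model predicts: one step costs roughly one latency plus the transfer time, so large messages are bandwidth-bound and small ones latency-bound. A minimal sketch of this model (the latency and bandwidth values are illustrative assumptions, not fitted to these runs):

```python
def ring_step_time(size_bytes, latency_s, bandwidth_Bps):
    """Predicted time for one ring step under an affine (alpha + size*beta)
    cost model: every process sends one message of `size_bytes` to its
    neighbour; on a homogeneous, contention-free network the sends proceed
    in parallel, so one step costs one latency plus one transfer time."""
    return latency_s + size_bytes / bandwidth_Bps

# Illustrative parameters only (24 us latency, 1.25 GB/s per link):
latency, bandwidth = 2.4e-5, 1.25e9
large = ring_step_time(200_000_000, latency, bandwidth)  # bandwidth-dominated
small = ring_step_time(2_000, latency, bandwidth)        # latency-dominated
```

Under such a model, reducing the available per-link bandwidth (e.g. by removing switches) lengthens bandwidth-bound steps while leaving latency-bound ones almost unchanged, which is consistent with the qualitative trend above: the topology change is only visible for large messages.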
+**** TODO Check what the sizes of the messages in HPL are. :SMPI:HPL: +:LOGBOOK: +- State "TODO" from [2017-06-04 Sun 19:44] +:END: +**** Investigate capacity planning: odd networks :SMPI:HPL: +- Simgrid commit: =9a8e2f5bce8c6758d4367d21a66466a497d136fe= +- HPL commit: =41774905395aebcb73650defaa7e2aa462e6e1a3= +- Script commit: =4ff3ccbcbb77e126e454a16dea0535493ff1ff0b= +- Try several topologies for HPL with absurdly good or bad networks (e.g. high/null bandwidth and/or high/null latency). +- The idea is that if doing so has little impact on performance, then it is hopeless to observe any impact from + adding/removing switches. +- Quick and dirty experiments: do not add any option to the script, just modify the values in =topology.py= (lines + 161-164). +- Note that in the previous experiments, where nearly no impact was observed, the different values were: + #+begin_src python + bw = '10Gbps' + lat = '2.4E-5s' + loopback_bw = '5120MiBps' + loopback_lat = '1.5E-9s' + #+end_src +- Run this command, which outputs the Gflops: + #+begin_src sh + ./run_measures.py --global_csv /tmp/bla.csv --nb_runs 1 --size 10000 --nb_proc 16 --topo "2;4,4;1,4;1,1" --experiment + HPL --running_power 6217956542.969 && tail -n 1 /tmp/bla.csv | cut -f10 -d',' + #+end_src +- Result for the same network characteristics: 21.96 +- Results with other characteristics: + + Very high bandwidth: 22.15 + #+begin_src python + bw = '1000000Gbps' + lat = '2.4E-5s' + loopback_bw = '1000000GBps' + loopback_lat = '1.5E-9s' + #+end_src + + Very low bandwidth: 1.505 + #+begin_src python + bw = '10Mbps' + lat = '2.4E-5s' + loopback_bw = '10Mbps' + loopback_lat = '1.5E-9s' + #+end_src + + Low bandwidth: 19.95 + #+begin_src python + bw = '1Gbps' + lat = '2.4E-5s' + loopback_bw = '512MiBps' + loopback_lat = '1.5E-9s' + #+end_src + + Very low latency: 25.95 + #+begin_src python + bw = '10Gbps' + lat = '0s' + loopback_bw = '5120MiBps' + loopback_lat = '0s' + #+end_src + + Very high latency: 0.1534 + 
#+begin_src python + bw = '10Gbps' + lat = '2.4E-2s' + loopback_bw = '5120MiBps' + loopback_lat = '1.5E-5s' + #+end_src + + High latency: 9.477 + #+begin_src python + bw = '10Gbps' + lat = '2.4E-4s' + loopback_bw = '5120MiBps' + loopback_lat = '1.5E-8s' + #+end_src +- Improving the network performance has a limited impact. Using a nearly infinite bandwidth increases the Gflops by + less than 1%. Using a null latency has more impact, but it is still limited: it increases the Gflops by 18%. +- Degrading the network performance has more impact. Using a bandwidth 1000 times lower divides the Gflops by 15, but + using a bandwidth 10 times lower decreases the Gflops by only 9%. Both the very high latency and the high latency have + a great impact. +- To sum up, the latency seems to have a higher impact on HPL performance than the bandwidth. +- It is not clear if the contention created by using fewer switches will only decrease the bandwidth, or also increase + the latency. It depends on whether the switches have one queue per port, or one queue for all the ports (in the former + case, contention will have a much lower impact on the latency than in the latter case). +- Hypothesis: in the case of a one-queue-per-port model, removing switches will not increase the latency too much and + will therefore have a very limited impact on HPL performance. +**** TODO Ask what model is used in Simgrid’s switches :SIMGRID: +:LOGBOOK: +- State "TODO" from [2017-06-04 Sun 19:40] +:END: +- Is it one queue per port, or one single queue for all the ports? +**** More thoughts on capacity planning :SMPI:HPL: +- The plot of the Gflops as a function of the bandwidth (resp. inverse of latency) seems to look like the plot of the + Gflops as a function of the number of processes or the size. It is a concave function converging to some finite limit. 
+- In the settings currently used for HPL, the bandwidth of 10Gbps seems to be already very close to the limit (since + using a bandwidth thousands of times larger has little to no impact). This is why slightly decreasing the bandwidth has + very little impact. If we want to observe something when we remove switches, we should use lower bandwidths. +- Quick test, using the same command as in the previous section and with these values: + #+begin_src python + bw = '10Mbps' + lat = '2.4E-5s' + loopback_bw = '5120MiBps' + loopback_lat = '1.5E-9s' + #+end_src + + With =2;4,4;1,4;1,1=, 1.505 Gflops. + + With =2;4,4;1,1;1,1=, 1.025 Gflops. + + With =2;4,4;1,4;1,1= and a random mapping, 1.268 Gflops. + + With =2;4,4;1,1;1,1= and a random mapping, 0.6464 Gflops. +- The hypothesis seems to be confirmed. With a lower bandwidth, a difference in bandwidth has much more impact. Thus, + removing a switch and/or using a random mapping also has much more impact. +*** 2017-06-05 Monday +**** Comparison with real Taurus experiment :SMPI:EXPERIMENTS:HPL:REPORT: +- File [[file:hpl_analysis/taurus/real.csv]] holds real experiment data. It has been created manually, thanks to the energy + paper [[https://gitlab.inria.fr/fheinric/paper-simgrid-energy/tree/master/experiments/mpi_hpl_dvfs/taurus_2017-01-17/original-data][repository]]. 
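The accuracy metric used in this comparison is the signed relative error between simulated and measured quantities (the same formula as in the R session of this entry); a minimal standalone sketch:

```python
def relative_error_percent(simulated, measured):
    """Signed relative error, in percent, of a simulated value against the
    corresponding real measurement. Positive means over-estimation."""
    return (simulated - measured) / measured * 100.0

# A simulated duration of 110 s against a measured 100 s is a +10% error;
# 90 s against 100 s is a -10% error.
```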
+ + #+begin_src R :results output :session *R* :exports both + library(ggplot2) + library(reshape2) + library(gridExtra) + + get_results <- function(nb_proc) { + result <- read.csv(paste('hpl_analysis/taurus/hpl_paper_', nb_proc, '.csv', sep='')) + result$full_time = max(result$time) + result$total_energy = sum(result$power_consumption) + + used_energy = 0 + result = result[with(result, order(-power_consumption)),] # sort by power consumption + result$used_energy = sum(head(result, nb_proc/12)$power_consumption) + result$nb_proc = nb_proc + return(unique(result[c('nb_proc', 'full_time', 'total_energy', 'used_energy')])) + } + simulation_vanilla_results = data.frame() + # for(i in (c(1,4,8,12,48,96,144))) { + for(i in (c(12,48,96,144))) { + simulation_vanilla_results = rbind(simulation_vanilla_results, get_results(i)) + } + simulation_vanilla_results$type = 'Vanilla simulation' + simulation_vanilla_results$time = -1 # do not have it + simulation_vanilla_results$Gflops = -1 # do not have it + + real_results = read.csv('hpl_analysis/taurus/real.csv') + real_results$type = 'Real execution' + real_results$used_energy = real_results$used_energy * 1e3 # kJ -> J + sim_results <- read.csv('hpl_analysis/taurus/hpl2.csv') + sim_results$type = 'Optimized simulation' + results = rbind(real_results[c('nb_proc', 'full_time', 'time', 'Gflops', 'used_energy', 'type')], + sim_results[c('nb_proc', 'full_time', 'time', 'Gflops', 'used_energy', 'type')], + simulation_vanilla_results[c('nb_proc', 'full_time', 'time', 'Gflops', 'used_energy', 'type')]) + results$type <- factor(results$type, levels = c('Optimized simulation', 'Vanilla simulation', 'Real execution')) + #+end_src + + #+RESULTS: + + #+begin_src R :file hpl_analysis/taurus/validation.pdf :results value graphics :results output :session *R* :exports both :width 12 :height 4 + p1 = generic_do_plot(ggplot(results, aes(x=nb_proc, y=full_time, color=type, shape=type)), fixed_shape=FALSE) + + xlab("Number of processes")+ + 
ylab("Duration (seconds)")+ + scale_shape_manual(values = c(0, 1, 2))+ + labs(colour="Experiment type")+ + labs(shape="Experiment type")+ + ggtitle("HPL duration for different numbers of processes\nMatrix size: 20,000") + p2 = generic_do_plot(ggplot(results, aes(x=nb_proc, y=used_energy, color=type, shape=type)), fixed_shape=FALSE) + + xlab("Number of processes")+ + ylab("Energy consumption (joules)")+ + scale_shape_manual(values = c(0, 1, 2))+ + labs(colour="Experiment type")+ + labs(shape="Experiment type")+ + ggtitle("HPL energy consumption for different numbers of processes\nMatrix size: 20,000") + grid_arrange_shared_legend(p1, p2, nrow=1, ncol=2) + #+end_src + + #+RESULTS: + [[file:hpl_analysis/taurus/validation.pdf]] + + #+begin_src R :file hpl_analysis/taurus/validation2.pdf :results value graphics :results output :session *R* :exports both :width 12 :height 4 + tmp_results = results[results$type != "Vanilla simulation",] + grid_arrange_shared_legend( + generic_do_plot(ggplot(tmp_results, aes(x=nb_proc, y=time, color=type))) + + xlab("Number of processes")+ + ylab("Duration (seconds)")+ + labs(colour="Simulated")+ + ggtitle("HPL “short” duration for different numbers of processes\nMatrix size: 20,000"), + generic_do_plot(ggplot(tmp_results, aes(x=nb_proc, y=Gflops, color=type))) + + xlab("Number of processes")+ + ylab("Performance (Gflops)")+ + labs(colour="Simulated")+ + ggtitle("HPL performance for different numbers of processes\nMatrix size: 20,000"), + nrow=1, ncol=2 + ) + #+end_src + + #+RESULTS: + [[file:hpl_analysis/taurus/validation2.pdf]] + + #+begin_src R :results output :session *R* :exports both + library(data.table) + aggregate_results <- function(results) { + x = data.table(results) + x = x[x$nb_proc %in% c(12, 48, 96, 144)] + x = as.data.frame(x[, list(time=mean(full_time), energy=mean(used_energy)), by=c("nb_proc")]) + return(x[with(x, order(nb_proc)),]) + } + aggr_real = aggregate_results(real_results) + aggr_sim = 
aggregate_results(sim_results) + aggr_vanilla = aggregate_results(simulation_vanilla_results) + aggr_sim$time_error = (aggr_sim$time - aggr_real$time)/aggr_real$time * 100 + aggr_sim$energy_error = (aggr_sim$energy - aggr_real$energy)/aggr_real$energy * 100 + aggr_sim$optimized = TRUE + aggr_vanilla$time_error = (aggr_vanilla$time - aggr_real$time)/aggr_real$time * 100 + aggr_vanilla$energy_error = (aggr_vanilla$energy - aggr_real$energy)/aggr_real$energy * 100 + aggr_vanilla$optimized = FALSE + aggr_results = rbind(aggr_vanilla, aggr_sim) + aggr_results$optimized <- factor(aggr_results$optimized, levels = c(TRUE, FALSE)) + #+end_src + + #+RESULTS: + +- Get the three colors used for the previous plots to use the ones corresponding to vanilla and optimized. + #+begin_src R :results output :session *R* :exports both + x = unique(ggplot_build(p1)$data[[1]]$colour) + x + colors = x[c(1, 2)] + colors + #+end_src + + #+RESULTS: + : [1] "#F8766D" "#00BA38" "#619CFF" + : [1] "#F8766D" "#00BA38" + + #+begin_src R :file hpl_analysis/taurus/errors.pdf :results value graphics :results output :session *R* :exports both :width 12 :height 4 + grid_arrange_shared_legend( + generic_do_plot(ggplot(aggr_results, aes(x=nb_proc, y=time_error, color=optimized))) + + geom_hline(yintercept=0) + + scale_color_manual(values=colors) + + xlab("Number of processes")+ + ylab("Relative error (percent)")+ + labs(colour="Optimized simulation")+ + ggtitle("Error on the duration prediction")+ + expand_limits(y=15)+ + expand_limits(y=-15), + generic_do_plot(ggplot(aggr_results, aes(x=nb_proc, y=energy_error, color=optimized))) + + geom_hline(yintercept=0) + + scale_color_manual(values=colors) + + xlab("Number of processes")+ + ylab("Relative error (percent)")+ + labs(colour="Optimized simulation")+ + ggtitle("Error on the energy consumption prediction")+ + expand_limits(y=15)+ + expand_limits(y=-15), + nrow=1, ncol=2 + ) + #+end_src + + #+RESULTS: + [[file:hpl_analysis/taurus/errors.pdf]] + +- The 
plots are funny. The shapes of the error plots for optimized and vanilla look similar, but shifted. They both + reach some high errors (~ 10%), but not for the same number of processes. Also, the optimized version is always above + 0 while the vanilla is below 0 for some points. +- There are some mismatches between the time prediction and the energy prediction. For instance, optimized has a large error for + the time prediction of 144 processes, but nearly no error for the energy prediction. Similarly, vanilla + over-estimates the duration for 48 processes but under-estimates the energy consumption, which seems odd. +**** Plots for scalability test :SMPI:EXPERIMENTS:HPL:REPORT: +#+begin_src R :results output :session *R* :exports both +library(ggplot2) +library(ggrepel) +library(reshape2) +library(gridExtra) +results = rbind( + read.csv('scalability/result_500000_512.csv'), + read.csv('scalability/result_500000_1024.csv'), + read.csv('scalability/result_500000_2048.csv'), + read.csv('scalability/result_500000_4096.csv'), + read.csv('scalability/result_1000000_512.csv'), + read.csv('scalability/result_1000000_1024.csv'), + read.csv('scalability/result_1000000_2048.csv'), + read.csv('scalability/result_1000000_4096.csv'), + read.csv('scalability/result_2000000_512.csv'), + read.csv('scalability/result_2000000_1024.csv'), + read.csv('scalability/result_2000000_2048.csv'), + read.csv('scalability/result_2000000_4096.csv'), + read.csv('scalability/result_4000000_512.csv'), + read.csv('scalability/result_4000000_1024.csv'), + read.csv('scalability/result_4000000_2048.csv'), + read.csv('scalability/result_4000000_4096.csv') +) +results$simulation_time = results$simulation_time/3600 +results$memory_size = results$memory_size * 1e-9 +number_verb <- function(n) { + return(format(n,big.mark=",",scientific=FALSE)) +} +results$size_verb = factor(unlist(lapply(results$size, number_verb)), levels = c('500,000','1,000,000','2,000,000','4,000,000')) +results$nb_proc_verb = 
factor(unlist(lapply(results$nb_proc, number_verb)), levels = c('512', '1,024', '2,048', '4,096')) +results +#+end_src + +#+RESULTS: +#+begin_example + topology nb_roots nb_proc size full_time time Gflops +1 2;16,32;1,16;1,1;8 16 512 500000 91246.1 91246.02 913.3 +2 2;16,32;1,16;1,1;8 16 1024 500000 46990.1 46990.02 1773.0 +3 2;16,32;1,16;1,1;8 16 2048 500000 24795.5 24795.50 3361.0 +4 2;16,32;1,16;1,1;8 16 4096 500000 13561.0 13561.01 6145.0 +5 2;16,32;1,16;1,1 16 512 1000000 716521.0 716521.00 930.4 +6 2;16,32;1,16;1,1 16 1024 1000000 363201.0 363201.04 1836.0 +7 2;16,32;1,16;1,1 16 2048 1000000 186496.0 186495.70 3575.0 +8 2;16,32;1,16;1,1;8 16 4096 1000000 97836.6 97836.54 6814.0 +9 2;16,32;1,16;1,1 16 512 2000000 5685080.0 5685077.72 938.1 +10 2;16,32;1,16;1,1 16 1024 2000000 2861010.0 2861012.55 1864.0 +11 2;16,32;1,16;1,1 16 2048 2000000 1448900.0 1448899.09 3681.0 +12 2;16,32;1,16;1,1;8 16 4096 2000000 742691.0 742690.59 7181.0 +13 2;16,32;1,16;1,1;8 16 512 4000000 45305100.0 45305083.56 941.8 +14 2;16,32;1,16;1,1;8 16 1024 4000000 22723800.0 22723820.45 1878.0 +15 2;16,32;1,16;1,1;8 16 2048 4000000 11432900.0 11432938.62 3732.0 +16 2;16,32;1,16;1,1;8 16 4096 4000000 5787160.0 5787164.09 7373.0 + simulation_time application_time user_time system_time major_page_fault +1 0.3311083 204.992 1098.25 93.12 0 +2 0.6895222 441.897 2296.51 184.70 0 +3 1.4144361 872.425 4741.26 349.79 0 +4 3.1448889 1947.320 10640.63 679.53 0 +5 0.7319722 500.970 2367.19 259.91 0 +6 1.6771917 1036.960 5515.36 515.05 0 +7 3.4421944 2092.950 11389.36 995.39 0 +8 7.2368056 4362.660 24082.38 1966.10 0 +9 1.9263500 1169.660 6193.80 683.73 0 +10 4.2217500 2551.100 13714.01 1430.93 0 +11 8.9621111 5236.560 29357.92 2844.89 0 +12 18.0156389 10643.600 59444.40 5402.24 0 +13 4.8156944 3030.400 15090.31 1945.23 0 +14 10.6613611 6435.870 34249.71 3827.36 0 +15 23.2042222 13080.500 75523.95 7684.52 0 +16 47.1275000 26745.400 154314.76 15085.08 0 + minor_page_fault cpu_utilization uss rss 
page_table_size +1 960072 0.99 155148288 2055086080 10604000 +2 1054062 0.99 369696768 4383203328 21240000 +3 1282294 0.99 1012477952 9367576576 42912000 +4 1852119 0.99 3103875072 15318568960 87740000 +5 1916208 0.99 153665536 2317279232 10600000 +6 2002989 0.99 369676288 4837175296 21252000 +7 2154982 0.99 1010696192 7774138368 42908000 +8 2768705 0.99 3103895552 16934834176 87748000 +9 3801905 0.99 150765568 2758770688 10604000 +10 3872820 0.99 365555712 5273034752 21220000 +11 4038099 0.99 1009606656 7415914496 42884000 +12 4704339 0.99 3102445568 19464646656 87748000 +13 7663911 0.98 151576576 2056916992 10604000 +14 7725625 0.99 369872896 4120702976 21212000 +15 7917525 0.99 1012191232 9221050368 42880000 +16 8550745 0.99 3113381888 20408209408 87808000 + memory_size size_verb nb_proc_verb +1 0.2825585 500,000 512 +2 0.4299489 500,000 1,024 +3 0.9628262 500,000 2,048 +4 2.8140421 500,000 4,096 +5 0.8944435 1,000,000 512 +6 1.0553098 1,000,000 1,024 +7 1.5811707 1,000,000 2,048 +8 3.4254070 1,000,000 4,096 +9 3.3384202 2,000,000 512 +10 3.4971116 2,000,000 1,024 +11 4.0274084 2,000,000 2,048 +12 5.9101348 2,000,000 4,096 +13 13.0790605 4,000,000 512 +14 13.2755579 4,000,000 1,024 +15 13.8251837 4,000,000 2,048 +16 15.7636690 4,000,000 4,096 +#+end_example + +#+begin_src R :file scalability/1.pdf :results value graphics :results output :session *R* :exports both :width 6 :height 4 +size_time = generic_do_plot(ggplot(results, aes(x=size, y=simulation_time, color=nb_proc_verb))) + + xlab("Matrix size") + + ylab("Simulation time (hours)") + + labs(colour="Number of processes")+ + ggtitle("Simulation time for different matrix sizes")+ + theme(legend.position = "none")+ + geom_text_repel( + data = subset(results, size == max(size)), + aes(label = nb_proc_verb), + nudge_x = 45, + segment.color = NA, + show.legend = FALSE + ) +size_time +#+end_src + +#+RESULTS: +[[file:scalability/1.pdf]] + +#+begin_src R :file scalability/2.pdf :results value graphics :results output 
:session *R* :exports both :width 6 :height 4 +nbproc_time = generic_do_plot(ggplot(results, aes(x=nb_proc, y=simulation_time, color=size_verb))) + + xlab("Number of processes") + + ylab("Simulation time (hours)") + + labs(colour="Matrix size")+ + ggtitle("Simulation time for different number of processes")+ + theme(legend.position = "none")+ + geom_text_repel( + data = subset(results, nb_proc == max(nb_proc)), + aes(label = size_verb), + nudge_x = 45, + segment.color = NA, + show.legend = FALSE + ) +nbproc_time +#+end_src + +#+RESULTS: +[[file:scalability/2.pdf]] + +#+begin_src R :file scalability/3.pdf :results value graphics :results output :session *R* :exports both :width 6 :height 4 +size_mem = generic_do_plot(ggplot(results, aes(x=size, y=memory_size, color=nb_proc_verb))) + + xlab("Matrix size") + + ylab("Memory consumption (gigabytes)") + + labs(colour="Number of processes")+ + ggtitle("Memory consumption for different matrix sizes")+ + theme(legend.position = "none")+ + geom_text_repel( + data = subset(results, size == max(size)), + aes(label = nb_proc_verb), + nudge_x = 45, + segment.color = NA, + show.legend = FALSE + ) +size_mem +#+end_src + +#+RESULTS: +[[file:scalability/3.pdf]] + +#+begin_src R :file scalability/4.pdf :results value graphics :results output :session *R* :exports both :width 6 :height 4 +nbproc_mem = generic_do_plot(ggplot(results, aes(x=nb_proc, y=memory_size, color=size_verb))) + + xlab("Number of processes") + + ylab("Memory consumption (gigabytes)") + + labs(colour="Matrix size")+ + ggtitle("Memory consumption for different number of processes")+ + theme(legend.position = "none")+ + geom_text_repel( + data = subset(results, nb_proc == max(nb_proc)), + aes(label = size_verb), + nudge_x = 45, + segment.color = NA, + show.legend = FALSE + ) +nbproc_mem +#+end_src + +#+RESULTS: +[[file:scalability/4.pdf]] + +#+begin_src R :file scalability/plot_size.pdf :results value graphics :results output :session *R* :exports both :width 12 
:height 4 +grid_arrange_shared_legend(size_time, size_mem, nrow=1, ncol=2) +#+end_src + +#+RESULTS: +[[file:scalability/plot_size.pdf]] + +#+begin_src R :file scalability/plot_nbproc.pdf :results value graphics :results output :session *R* :exports both :width 12 :height 4 +grid_arrange_shared_legend(nbproc_time, nbproc_mem, nrow=1, ncol=2) +#+end_src + +#+RESULTS: +[[file:scalability/plot_nbproc.pdf]] + +*** 2017-06-06 Tuesday +**** Discussion about the report :MEETING:REPORT: +***** State of the art +****** Important features +- offline vs. online, in particular for HPL (probes for the communication pipeline) +- if online: language scope, need to modify the code so that it goes through +- models: notion of topologies and accounting for contention (very important a priori), accounting for the + specificities of MPI communications (synchronization semantics, different performance ranges, probably not too + serious in the case of HPL), collectives (in the case of HPL, we do not care) + - the classical model in this context = LogP*, but it handles contention poorly (at the node level, and not at all + at the topology level) + - two main approaches: packet level and flow level +- scalability: this motivates the use of Parallel DES and of somewhat "system-level" MPI application emulation + techniques +****** Projects: +- Dimemas (Barcelona Supercomputing Center), offline (extrae/paraver), "performance debugging" (sensitivity analysis, what if, +performance prediction+) +- LogoPsim (Torsten Hoefler), offline (dag, GOAL), collective algorithms @ scale +- SST macro, online/offline (DUMPI), MPI only, skeletonization/templating, more robust but more specialized (C++) +- +BigSIM (?)+, offline, possibly PDES, dead project. Source-to-source transformation for privatization for CHARM++/AMPI +- xSim, online, PDES with underlying models of more than questionable validity, but scalable, privatization by copying + the data segment rather than using mmap +- CODES, offline, PDES, new kid on the block +***** Validation and capacity planning +- For the comparison with a real execution (Taurus), get the data for the real experiment by executing the org-file. Long (~ + 5 minutes). +- On capacity planning, it is expected that removing switches has little to no impact. Computation is in O(n^3) while + communication is in O(n^2) (and most of the communications are asynchronous, so they happen during computations). +**** Webinar :MEETING: +- [[https://github.com/alegrand/RR_webinars/blob/master/9_experimental_testbeds/index.org][Testbeds in computer science]] +- Lucas Nussbaum +*** 2017-06-07 Wednesday +**** DONE Some text is displayed in the pdf but not in the printed version :REPORT: +:LOGBOOK: +- State "DONE" from "TODO" [2017-06-07 Wed 09:43] +- State "TODO" from [2017-06-07 Wed 09:16] +:END: +- It seems that no text entered between === signs (translated to =\texttt= in LaTeX) appears in the printed version of the + report. It is displayed correctly in the pdf. Fix this. +- Reprinted the first page of the same file; it is fixed now. The difference is that I printed it over the network and not + by plugging a USB stick into the printer. 
+**** Network printer setup :TOOLS: +- Make sure the right package is installed: + #+begin_src sh + sudo aptitude install cups-browsed + #+end_src +- Add these lines to the file =/etc/cups/cups-browsed.conf=: + #+begin_example + BrowseRemoteProtocols cups + BrowsePoll print.imag.fr:631 + #+end_example +- Enable the service: + #+begin_src sh + sudo systemctl enable cups-browsed + #+end_src +- Restart the service: + #+begin_src sh + sudo service cups-browsed restart + #+end_src +*** 2017-06-08 Thursday +**** Capacity planning: components :SMPI:EXPERIMENTS:HPL:REPORT: +- Simgrid commit: =9a8e2f5bce8c6758d4367d21a66466a497d136fe= +- HPL commit: =41774905395aebcb73650defaa7e2aa462e6e1a3= +- Script commit: =c2d1d734c80f084157ad70d702e8c669772fb2e4= +- Command (used on =nova-21=, configured as above experiments): + #+begin_src sh + bash run_capacity_planning.sh 100000 512 + + bash run_capacity_planning.sh 50000 512 + #+end_src +- Results: + #+begin_src R :results output :session *R* :exports both + library(ggplot2) + library(reshape2) + library(gridExtra) + + get_results <- function(directory, name) { + result <- read.csv(paste('capacity_planning/', directory, '/', name, '.csv', sep='')) + result$name = name + return(result) + } + get_all_results <- function(directory) { + results = data.frame() + for(type in c('bandwidth', 'latency', 'speed')) { + for(subtype in c('high', 'low')) { + name = paste(type, subtype, sep='_') + tmp = get_results(directory, name) + tmp$type = type + if(type == 'latency'){ + if(subtype == 'high') + tmp$subtype = 'bad' + else + tmp$subtype = 'good' + } + else { + if(subtype == 'high') + tmp$subtype = 'good' + else + tmp$subtype = 'bad' + } + results = rbind(results, tmp) + } + default = get_results(directory, 'default') + default$type = type + default$subtype = 'default' + results = rbind(results, default) + } + return(results[c('size', 'Gflops', 'type', 'subtype')]) + } + results_1E5 = get_all_results('exp_100000_512') + results_5E4 = 
get_all_results('exp_50000_512') + results_1E5 + results_5E4 + #+end_src + + #+RESULTS: + #+begin_example + size Gflops type subtype + 1 100000 710.40 bandwidth good + 2 100000 702.20 bandwidth bad + 3 100000 722.70 bandwidth default + 4 100000 349.10 latency bad + 5 100000 823.70 latency good + 6 100000 722.70 latency default + 7 100000 3419.00 speed good + 8 100000 83.94 speed bad + 9 100000 722.70 speed default + size Gflops type subtype + 1 50000 458.80 bandwidth good + 2 50000 477.00 bandwidth bad + 3 50000 475.20 bandwidth default + 4 50000 127.30 latency bad + 5 50000 697.60 latency good + 6 50000 475.20 latency default + 7 50000 1346.00 speed good + 8 50000 71.95 speed bad + 9 50000 475.20 speed default +#+end_example + +#+begin_src R :results output :session *R* :exports both +do_plot <- function(results, type) { + tmp = results[results$type == type,] + title = paste('HPL performance estimation for different components\nMatrix size of', + format(unique(results$size),big.mark=",",scientific=FALSE)) + plot = ggplot(results, aes(x=type, y=Gflops, color=subtype, shape=subtype)) + + geom_point(size=4, stroke=1) + + scale_shape_manual(values = c(0, 1, 2))+ + theme_bw()+ + expand_limits(x=0, y=0)+ + ggtitle(title)+ + xlab('Component')+ + ylab('Performance estimation (Gflops)')+ + labs(colour='Metric')+ + labs(shape='Metric') + return(plot) +} +#+end_src + +#+RESULTS: + +#+begin_src R :file capacity_planning/components_perf.pdf :results value graphics :results output :session *R* :exports both :width 12 :height 4 +grid_arrange_shared_legend( + do_plot(results_5E4, 'bandwidth') + expand_limits(x=0, y=max(results_1E5$Gflops)), + do_plot(results_1E5, 'bandwidth') + expand_limits(x=0, y=max(results_1E5$Gflops)), + nrow=1, + ncol=2 +) +#+end_src + +#+RESULTS: +[[file:capacity_planning/components_perf.pdf]] +**** Capacity planning: topology :SMPI:EXPERIMENTS:HPL:REPORT: +- Simgrid commit: =9a8e2f5bce8c6758d4367d21a66466a497d136fe= +- HPL commit: 
=41774905395aebcb73650defaa7e2aa462e6e1a3= +- Script commit: =c2d1d734c80f084157ad70d702e8c669772fb2e4= +- Four series of experiments: + + Bandwidth of 10Gbps, sequential mapping of the processes + + Bandwidth of 10Gbps, random mapping of the processes + + Bandwidth of 10Mbps, sequential mapping of the processes + + Bandwidth of 10Mbps, random mapping of the processes +- For the series with a bandwidth of 10Mbps, the file =topology.py= has been locally modified to use a bandwidth 1000 + times lower: + #+begin_example + 176 % git diff -- INSERT -- 15:42:08 + diff --git a/topology.py b/topology.py + index 2d7d76c..1a3cd67 100644 + --- a/topology.py + +++ b/topology.py + @@ -158,7 +158,7 @@ class FatTree: + prefix = 'host-' + suffix = '.hawaii.edu' + speed = '1Gf' + - bw = '10Gbps' + + bw = '10Mbps' + lat = '2.4E-5s' + loopback_bw = '5120MiBps' + loopback_lat = '1.5E-9s' + #+end_example +- Command (used on =nova-2=, =nova-8=, =nova-15= and =nova-16=, configured as in the above experiments): + + For sequential mapping: + #+begin_src sh + ./run_measures.py --global_csv result_capacity_50000.csv --nb_runs 3 --size 50000 --nb_proc 512 --topo + "2;16,32;1,1:16;1,1" --experiment HPL --running_power 5004882812.500 --hugepage /root/huge + #+end_src + + For random mapping: + #+begin_src sh + ./run_measures.py --global_csv result_capacity_50000.csv --nb_runs 3 --size 50000 --nb_proc 512 --topo + "2;16,32;1,1:16;1,1" --experiment HPL --running_power 5004882812.500 --hugepage /root/huge --shuffle_hosts + #+end_src +- For the random mapping with 10Mbps bandwidth, more runs have been done (8 instead of 3) to get rid of any bias. 
+- Results: + #+begin_src R :results output :session *R* :exports both + library(ggplot2) + results_highbw_sequential <- read.csv("capacity_planning/exp_topo_50000_512/result_capacity_50000.csv") + results_highbw_random <- read.csv("capacity_planning/exp_topo_50000_512/result_capacity_50000_shuffled.csv") + results_lowbw_sequential <- read.csv("capacity_planning/exp_topo_50000_512/result_capacity_50000_lowbw.csv") + results_lowbw_random <- read.csv("capacity_planning/exp_topo_50000_512/result_capacity_50000_lowbw_shuffled.csv") + results_highbw_sequential$mapping = "Sequential" + results_highbw_random$mapping = "Random" + results_lowbw_sequential$mapping = "Sequential" + results_lowbw_random$mapping = "Random" + results_highbw = rbind(results_highbw_sequential, results_highbw_random) + results_highbw$bandwidth = '10Gbps' + results_lowbw = rbind(results_lowbw_sequential, results_lowbw_random) + results_lowbw$bandwidth = '10Mbps' + #+end_src + + #+RESULTS: + + #+begin_src R :results output :session *R* :exports both + do_plot <- function(results) { + title = paste('HPL performance estimation for different topologies\nBandwidth of', unique(results$bandwidth)) + plot = generic_do_plot(ggplot(results, aes(x=nb_roots, y=Gflops, color=mapping, shape=mapping)), fixed_shape=FALSE) + + ggtitle(title)+ + xlab('Number of L2 switches')+ + ylab('Performance estimation (Gflops)')+ + scale_shape_manual(values = c(1, 2))+ + labs(colour='Mapping')+ + labs(shape='Mapping') + return(plot) + } + #+end_src + + #+RESULTS: + + #+begin_src R :file capacity_planning/topology.pdf :results value graphics :results output :session *R* :exports both :width 12 :height 6 + grid_arrange_shared_legend( + do_plot(results_lowbw), + do_plot(results_highbw), + nrow=1, ncol=2 + ) + #+end_src + + #+RESULTS: + [[file:capacity_planning/topology.pdf]] + +- The results for 10Mbps are somewhat expected. 
Removing switches deteriorates the performance, and using a random mapping + of the processes makes things even worse. Also, we can observe some performance peaks for 4, 8 and 16 root + switches. Maybe this is due to the D mod K algorithm (TODO: check that this is indeed this algorithm). For instance, + 16 divides 512 but 15 does not. So the load of all messages should be spread more uniformly with 16 root switches than + with 15. +- For 10Gbps however, this is stranger. The number of switches has no impact, but this has already been observed in + previous experiments (see [2017-06-02 Fri]). What is more surprising, however, is that the random mapping yields + better performance than the sequential mapping. Is it a bug? + + #+begin_src R :file capacity_planning/topology_sim_time.pdf :results value graphics :results output :session *R* :exports both :width 6 :height 4 + tmp = rbind(results_lowbw, results_highbw) + tmp$bandwidth <- factor(tmp$bandwidth, levels = c('10Mbps', '10Gbps')) + generic_do_plot(ggplot(tmp, aes(x=nb_roots, y=simulation_time, color=mapping, shape=mapping, linetype=bandwidth)), fixed_shape=FALSE)+ + ggtitle('Simulation time for different networks')+ + xlab('Number of L2 switches')+ + ylab('Simulation time (seconds)')+ + scale_shape_manual(values = c(1, 2))+ + labs(colour='Mapping')+ + labs(shape='Mapping')+ + labs(linetype='Bandwidth') + #+end_src + + #+RESULTS: + [[file:capacity_planning/topology_sim_time.pdf]] + + #+begin_src R :file capacity_planning/2.png :results value graphics :results output :session *R* :exports both + results_lowbw$simgrid_time = results_lowbw$simulation_time - results_lowbw$application_time + generic_do_plot(ggplot(results_lowbw, aes(x=nb_roots, y=simgrid_time, color=mapping))) + #+end_src + + #+RESULTS: + [[file:capacity_planning/2.png]] + + #+begin_src R :results output :session *R* :exports both + library(data.table) + aggregate_results <- function(results) { + x = data.table(results) + x = as.data.frame(x[, 
list(Gflops=mean(Gflops)), by=c("nb_roots")])
+      return(x[with(x, order(nb_roots)),])
+    }
+    aggr_seq = aggregate_results(results_lowbw_sequential)
+    aggr_rand = aggregate_results(results_lowbw_random)
+    aggr_rand$gflops_ratio = aggr_seq$Gflops / aggr_rand$Gflops
+  #+end_src
+
+  #+begin_src R :file capacity_planning/3.png :results value graphics :results output :session *R* :exports both
+    generic_do_plot(ggplot(aggr_rand, aes(x=nb_roots, y=gflops_ratio)))
+  #+end_src
+
+  #+RESULTS:
+  [[file:capacity_planning/3.png]]
+
+- There are *huge* differences (factor 10) in the simulation time depending on the mapping and the number of root
+  switches. This time is spent in Simgrid. It certainly comes from more complex communication behaviors (congestion)
+  that give much more work to the network part of Simgrid.
+
+*** 2017-06-09 Friday
+**** TODO Work on =run_measure.py= script [0/5] :PYTHON:
+:LOGBOOK:
+- State "TODO" from [2017-06-09 Fri 15:47]
+:END:
+- [ ] Clean the code. In particular, remove the stuff related to the small matrix product test.
+- [ ] Write some unit tests.
+- [ ] Add options, e.g. to set the bandwidth or the latency without modifying the code.
+- [ ] Add flexibility in the way the series of experiments is described. Maybe describe them with Python code in a
+  separate file? Or a JSON file?
+- [ ] Parallelism: allow launching experiments on remote machines via SSH.
+*** 2017-06-12 Monday
+**** Add [[file:hplpaper.pdf][The LINPACK Benchmark: Past, Present and Future]] :PAPER:
+Bibtex: Dongarra03thelinpack
+*** 2017-06-14 Wednesday
+**** Add [[file:simgrid.pdf][Versatile, Scalable and Accurate Simulation of Distributed Applications and Platforms]] :PAPER:
+Bibtex: casanova:hal-01017319
+**** Add [[file:CKP93.pdf][LogP: Towards a Realistic Model of Parallel Computation]] :PAPER:
+Bibtex: Culler_1993
+**** Finally found a grammar checker \o/ :TOOLS:
+- Check https://www.languagetool.org/
+- It can be used in several ways, one of which is a command-line tool
+*** 2017-06-19 Monday
+**** Discussion about the slides :MEETING:
+- Large part on the context (~10 min).
+  + Top500 supercomputers (including Stampede), with their topology. Also Piz Daint (dragonfly, in
+    Switzerland). Show the variability of the topologies. Photos of the supercomputers and a diagram of the topology.
+  + Routing, workload, process mapping.
+  + HPL, HPL on Stampede, the fight against the n^3 term, and also against p (p*n^2 in the complexity).
+- No state of the art (or only on-line vs off-line).
+- Drawings: Inkscape or Xfig
+- Contribution: not too long (~7 min).
+- Validation: end goal, compare with Stampede.
+- Perspectives: capacity planning, topology study.
+*** 2017-06-20 Tuesday
+**** Tom's defense rehearsal :MEETING:
+:LOGBOOK:
+CLOCK: [2017-06-20 mar. 10:30]--[2017-06-20 mar. 10:51] =>  0:21
+:END:
+
+- Number the slides 1/18
+- Try to include the general diagram of the modifications
+- Sprinkle in information about the kind of gains.
+
+Slides:
+1. Rank ?
+   - +Processors cannot be made faster anymore?+
+   - Information about the scale, the topology, the diversity
+   #+BEGIN_QUOTE
+   As an answer to the power and heat challenges, processor
+   constructors have increased the amount of computing units (or
+   cores) per processor. Modern High Performance Computing (HPC)
+   systems comprise thousands of nodes, each of them holding several
+   multi-core processors.
For example, one of the world's fastest
+   computers, the IBM Sequoia system 1 at the Lawrence Livermore
+   National Laboratory (USA), contains 96 racks of 98,304 nodes
+   interconnected through a 5-dimensional torus and comprising
+   16 cores each, for a total of 1,572,864 cores. The Cray Titan
+   system 2 at Oak Ridge National Laboratory is made of 18,688 AMD
+   Opteron (16-core CPUs) and 18,688 Nvidia Tesla K20X GPUs
+   interconnected through a Gemini three-dimensional torus. Another
+   recent Cray machine, Piz Daint 3 at the Swiss National
+   Supercomputing Centre, comprises 5,272 nodes (with 8 cores and a
+   Nvidia Tesla K20X GPU each) interconnected through a custom Aries
+   dragonfly topology. More recently, the Tianhe-2 4 was built with
+   32,000 Intel Xeon (12 cores) and 48,000 Xeon Phi 31S1P
+   interconnected through a TH-Express fat tree. Finally, the Sunway
+   TaihuLight 5 (Jiangsu, China), which is currently the fastest
+   supercomputer in the world, is made of 40,950 nodes
+   interconnected through a custom five-level hierarchy of cabinets,
+   each comprising 260 custom RISC cores, for a total of 10,649,600
+   cores.
+   #+END_QUOTE
+2. HPL:
+   - where N is the order of the matrix
+   - it works like Gaussian elimination with pivoting
+   - search for the maximum, small factorization, broadcast, update,
+     and start over
+   - in the code, this is somewhat interleaved so as to properly
+     overlap computations and communications
+3. Link with slide 1...
+   - Transitions a bit clumsy. Explain that this is a very active
+     field.
+4. +SimGrid+ Simulation of HPC applications
+   - Trace.
+   - Two problems (size needed to obtain the trace, dynamic
+     applications -make the link with HPL-)
+   - SimGrid-style emulation: *mutual exclusion*. Advantage =
+     emulation without modification, but it does not scale. Hybrid
+     approaches are needed.
+   - Many projects. Mostly offline. SimGrid allows both.
+5. 10:36. Why Stampede? It does not matter.
We are only there to give an
+   order of magnitude.
+   - 500 days of computation, without even counting the simulation of
+     the application itself.
+6.
+   - Laboratory notebook and scripts
+   - Modified HPL
+   - Modifications to SimGrid
+7. Integrate the "To sum up" into this series of slides.
+   - T_dgemm = .4354*M*N*L
+   - T_dtrsm = .234234*M*N^2
+   Gain = ??? order of magnitude. Illustration on a given configuration
+   (a small one and a large one?).
+8.
+   - Negligible, but a significant gain?
+   - Amount of modifications to HPL?
+   - At this point, almost no computation is performed anymore, but
+     the memory consumption remains high.
+9.
+   - The application accesses these areas from time to time, so we
+     cannot simply remove these allocations...
+10. Panel = information exchanged between the processes during the
+    execution.
+11. 10:44
+    - This allocation causes a "problem".
+    - Modification of HPL?
+12.
+    - Consequences = observation at large scale.
+13.
+    -
+14.
+15.
+16. 10:50
+    Difficult case, error mainly on 1 node, decreasing afterwards,
+    *systematic underestimation*
+    - Small-scale experiment
+    - Systematic underestimation
+    - Factor 2 on the outliers?
+    + Optimistic after dgemm.
+17. Conclusion
+    - Lightweight modifications of HPL
+    - New features in SG
+    - Demonstrated that we could simulate at this scale while taking
+      into account the fine characteristics of the topology, the
+      mapping, ...
+18. Add capacity planning?
+**** Last remarks from Arnaud :MEETING:
+- Various functions → swap, max, ...
+- Simulation of HPC application → mention SimGrid
+- Slide 7: write that it is very optimistic (replace \approx with \geq)
+- Slide 18: add a word about the failure and energy aspects
+- Slide 16 → systematically
+- Slide 1: add name, rank, number of nodes and cores, topology
+*** 2017-06-21 Wednesday
+**** Tom's defense rehearsal V2 :MEETING:
+:LOGBOOK:
+CLOCK: [2017-06-21 mer. 10:43]--[2017-06-21 mer.
11:03] =>  0:20
+:END:
+
+Intro: large-scale MPI simulation... and capacity planning?
+1.
+2. Explain the algorithm before the animation
+   - mention the overlapping
+3. There are +questions+... There are several levers we can pull to
+   make it go faster.
+   - There are "recipes". People say "I want this", but it is based on
+     their experience/opinion and the argumentation is limited
+4.
+   - adaptive applications. This is actually the case of HPL
+   - advantage/drawback of the emulation approach?
+5.
+6.
+   - several optimizations (some of them quite logical and others
+     that were less obvious)
+7.
+8.
+9.
+   - why not simply remove the mallocs?
+10.
+11. Well, now that we have removed all the computations and all the
+    allocations, almost nothing but the control flow remains. And yet,
+    at large scale, it still does not work.
+12.
+13.
+    - You see, the quadratic effect in N and in P is still there, and
+      that is what was hard.
+14.
+    - Explain the curve! It is very small.
+    - No outliers. Rather say no variability, and therefore no
+      outliers. Problem: this has consequences because of the
+      synchronizations.
+    - Optimistic model (no injected variability, perfect bandwidth
+      sharing)
+15.
+*** 2017-06-23 Friday
+**** Trying to understand the low CPU utilization for large allocations :C:EXPERIMENTS:
+- According to Olivier, the low CPU utilization when doing large allocations (without huge pages) is not expected. Let’s
+  investigate.
+- Script commit: =80c6cd6f0853821a08da3994ce89572c9996b5ea=
+- Command (the size corresponds to an allocation of a matrix of size at most 600,000):
+  #+begin_src sh
+    ./cpu_utilization.py 8 2880000000000 /tmp/cpu_exp.csv
+  #+end_src
+- Analysis:
+  #+begin_src R :results output :session *R* :exports both
+    library(ggplot2)
+    results <- read.csv('cpu_utilization/cpu_exp.csv')
+  #+end_src
+
+  #+begin_src R :file cpu_utilization/1.png :results value graphics :results output :session *R*
+    ggplot(results, aes(x=size, y=cpu_utilization)) +
+      geom_point() + geom_line()
+  #+end_src
+
+  #+RESULTS:
+  [[file:cpu_utilization/1.png]]
+
+- So we reproduce this behavior outside of HPL and Simgrid.
+**** DONE Draw a flame graph with this small program and a large allocation.
+:LOGBOOK:
+- State "DONE" from "TODO" [2017-06-23 Fri 19:41]
+- State "TODO" from [2017-06-23 Fri 16:13]
+:END:
+**** Flame graph for the CPU utilization :C:EXPERIMENTS:
+- Script commit: =80c6cd6f0853821a08da3994ce89572c9996b5ea=
+- Command (the size corresponds to an allocation of a matrix of size 600,000):
+  #+begin_src sh
+    sudo perf record -F1000 --call-graph dwarf ./page_faults 1 2880000000000 1
+
+    sudo perf script | ~/Documents/FlameGraph/stackcollapse-perf.pl --kernel | ~/Documents/FlameGraph/flamegraph.pl > /tmp/flame_2880000000000.svg
+  #+end_src
+- Kernel version:
+  #+begin_src sh
+    uname -r
+  #+end_src
+
+  #+RESULTS:
+  : 4.4.0-81-generic
+
+- Result:
+  [[file:cpu_utilization/flame_2880000000000.svg]]
+- This flame graph is very interesting, although incomplete. First, note that the function =main= accounts for less than
+  40% of the samples, which is approximately equal to the CPU utilization. This means that this approach also captures
+  what is done when the process is *not* executing.
+- Most of the time spent in the function =main= is spent in a function =do_page_fault=.
+- The remaining 60% of the whole execution time is spent in two functions, one unknown, and one called =native_irq_return_iret=.
+- It is also strange to see this very large function =page_faults= located below the function =main= (and =_start=) and not on
+  the side, although these functions are (a priori) not called by the function =page_faults=. Maybe a bug in =perf=?
+**** TODO Next steps in the investigation of low CPU utilization [4/7]
+:LOGBOOK:
+- State "TODO" from "TODO" [2017-06-26 Mon 10:27]
+- State "TODO" from "TODO" [2017-06-26 Mon 10:27]
+- State "TODO" from "TODO" [2017-06-26 Mon 10:27]
+- State "TODO" from [2017-06-23 Fri 19:52]
+:END:
+- [X] Plot the CPU utilization for different numbers of calls to =memset= (including 0).
+- [ ] Draw the flame graph with more calls to =memset=.
+- [ ] Draw the flame graph with no call to =memset=.
+- [X] Try other flags for the =mmap=, try adding the flag =MAP_POPULATE=.
+- [X] Try with another kernel version.
+- [X] Try with huge pages, to see the difference.
+- [ ] Speak with someone (Olivier? Samuel? Vincent? Stack Overflow?).
+*** 2017-06-24 Saturday
+**** Small test: several calls to =memset= :C:EXPERIMENTS:
+- Script commit: =b8a110e9a57c821b37a3843738b97bc0affb52f6=
+- No call to =memset=:
+  #+begin_src sh
+    /usr/bin/time ./page_faults 1 2880000000000 0
+  #+end_src
+  #+begin_example
+    2.00202
+    0.04user 1.95system 0:02.00elapsed 99%CPU (0avgtext+0avgdata 5108maxresident)k
+    0inputs+4096outputs (0major+521minor)pagefaults 0swaps
+  #+end_example
+- One call to =memset=:
+  #+begin_src sh
+    /usr/bin/time ./page_faults 1 2880000000000 1
+  #+end_src
+  #+begin_example
+    2013.29
+    158.71user 604.73system 33:33.29elapsed 37%CPU (0avgtext+0avgdata 2812501956maxresident)k
+    0inputs+102400outputs (0major+703125270minor)pagefaults 0swaps
+  #+end_example
+- Ten calls to =memset=:
+  #+begin_src sh
+    /usr/bin/time ./page_faults 1 2880000000000 10
+  #+end_src
+  #+begin_example
+    23344.3
+    1622.97user 5224.14system 6:29:04elapsed 29%CPU (0avgtext+0avgdata 2812502520maxresident)k
+    0inputs+958464outputs (0major+7031250411minor)pagefaults 0swaps
+  #+end_example
+- No call to =memset=, but using the flag =MAP_POPULATE=:
+  #+begin_src sh
+    /usr/bin/time ./page_faults 1 2880000000000 0
+  #+end_src
+  #+begin_example
+    136.016
+    0.04user 103.22system 2:16.01elapsed 75%CPU (0avgtext+0avgdata 2812501680maxresident)k
+    0inputs+4096outputs (0major+43946592minor)pagefaults 0swaps
+  #+end_example
+- When no accesses are made and the flag =MAP_POPULATE= is not used, the execution is very fast, there are nearly no
+  page faults and the CPU utilization is high.
+- With one access, we get the very low CPU utilization and the very large time.
+- With ten accesses, the CPU utilization is even lower, the number of page faults is ten times higher and both user time
+  and system time are also about ten times higher. This is very strange.
+- With no access but with the flag =MAP_POPULATE=, the time and the number of page faults are much larger, but still about
+  ten times lower than with one access and no =MAP_POPULATE=.
+*** 2017-06-25 Sunday
+**** More experiments about the low CPU utilization for large allocations :C:R:EXPERIMENTS:
+- Script commit: =b8a110e9a57c821b37a3843738b97bc0affb52f6=, modified to have between 0 and 3 calls to =memset=.
+- For these results, the flag =MAP_POPULATE= is *not* used when calling =mmap=.
+- Command (the size corresponds to an allocation of a matrix of size at most 300,000):
+  #+begin_src sh
+    ./cpu_utilization.py 100 720000000000 cpu_exp2.csv
+  #+end_src
+- Run on =nova-17= with kernel =4.9.0-2-amd64=.
+- Analysis:
+  #+begin_src R :results output :session *R* :exports both
+    library(gridExtra)
+    library(ggplot2)
+    results <- read.csv('cpu_utilization/cpu_exp2.csv')
+  #+end_src
+
+  #+begin_src R :file cpu_utilization/2.png :results value graphics :results output :session *R* :width 800
+    p1 = ggplot(results, aes(x=size, y=cpu_utilization, color=factor(mem_access))) +
+      geom_point() + geom_line()
+    p2 = ggplot(results, aes(x=size, y=total_time, color=factor(mem_access))) +
+      geom_point() + geom_line()
+    grid.arrange(p1, p2, ncol=2)
+  #+end_src
+
+  #+RESULTS:
+  [[file:cpu_utilization/2.png]]
+
+  #+begin_src R :file cpu_utilization/3.png :results value graphics :results output :session *R* :width 800
+    p1 = ggplot(results, aes(x=size, y=user_time, color=factor(mem_access))) +
+      geom_point() + geom_line()
+    p2 = ggplot(results, aes(x=size, y=system_time, color=factor(mem_access))) +
+      geom_point() + geom_line()
+    grid.arrange(p1, p2, ncol=2)
+  #+end_src
+
+  #+RESULTS:
+  [[file:cpu_utilization/3.png]]
+
+  #+begin_src R :file cpu_utilization/4.png :results value graphics :results output :session *R* :width 800
+    p1 = ggplot(results, aes(x=size, y=memory_size, color=factor(mem_access))) +
+      geom_point() + geom_line()
+    p2 = ggplot(results, aes(x=size, y=nb_page_faults, color=factor(mem_access))) +
+      geom_point() + geom_line()
+    grid.arrange(p1, p2, ncol=2)
+  #+end_src
+
+  #+RESULTS:
+  [[file:cpu_utilization/4.png]]
+
+- Finally, the number of accesses
to the buffer does not seem to impact the CPU utilization. The difference we observed
+  on [2017-06-24 Sat] was probably only noise.
+- The user time seems to be proportional to both the allocation size and the number of calls to =memset=. This is expected.
+- The system time also seems to be proportional to them. We could expect the impact of the allocation size, but the
+  impact of the number of accesses is not trivial. It seems to come from the number of page faults (which is also
+  proportional to both the allocation size and the number of accesses). But the plot of the number of page faults is
+  hard to understand. Why would more accesses cause more page faults, when the page table is already initialized?
+- Another strange thing is that the memory consumption is lower with only one access than with two or three. They should
+  all have the same page table size and thus the same memory consumption.
+*** 2017-06-26 Monday
+**** Small test: several calls to =memset= with huge pages :C:EXPERIMENTS:
+- Script commit: =005461dad4c06a2e2463d54eec228e65c07b1015=
+- Compilation:
+  #+begin_src sh
+    gcc -DHUGEPAGE -std=gnu11 -ggdb3 -O3 -o page_faults page_faults.c -Wall
+  #+end_src
+- So, same experiment as on [2017-06-24 Sat], except that huge pages and the =MAP_POPULATE= flag are used.
+- No call to =memset=, but using the flag =MAP_POPULATE=:
+  #+begin_example
+    3.34278
+    0.04user 3.29system 0:03.34elapsed 99%CPU (0avgtext+0avgdata 1476maxresident)k
+    0inputs+0outputs (0major+65minor)pagefaults 0swaps
+  #+end_example
+  Much lower number of page faults and system time. Higher CPU utilization.
+- One call to =memset=:
+  #+begin_src sh
+    /usr/bin/time ./page_faults 1 2880000000000 1
+  #+end_src
+  #+begin_example
+    102.2
+    98.77user 3.26system 1:42.20elapsed 99%CPU (0avgtext+0avgdata 1492maxresident)k
+    0inputs+0outputs (0major+67minor)pagefaults 0swaps
+  #+end_example
+  In comparison with the case where no huge pages are used, the number of page faults and the time are much lower.
+  Also, the system time and the number of page faults are the same as in the previous test, where no =memset= was done;
+  only the user time increased.
+  It is strange that the number of page faults is so low. With such an allocation size, we have about 1.3M huge pages.
+- Ten calls to =memset=:
+  #+begin_src sh
+    /usr/bin/time ./page_faults 1 2880000000000 10
+  #+end_src
+  #+begin_example
+    988.682
+    984.74user 3.45system 16:28.68elapsed 99%CPU (0avgtext+0avgdata 1488maxresident)k
+    0inputs+0outputs (0major+66minor)pagefaults 0swaps
+  #+end_example
+  Same system time and number of page faults as with only one call to =memset=; only the user time increases. This is
+  the expected behavior.
+- Let’s try without the =MAP_POPULATE= flag.
+- One call to =memset=:
+  #+begin_src sh
+    /usr/bin/time ./page_faults 1 2880000000000 1
+  #+end_src
+  #+begin_example
+    102.302
+    99.10user 3.18system 1:42.30elapsed 99%CPU (0avgtext+0avgdata 1520maxresident)k
+    0inputs+0outputs (0major+1373356minor)pagefaults 0swaps
+  #+end_example
+  The number of page faults is now as expected, but this did not change the system time.
+- Ten calls to =memset=:
+  #+begin_src sh
+    /usr/bin/time ./page_faults 1 2880000000000 10
+  #+end_src
+  #+begin_example
+    1001.42
+    997.40user 3.30system 16:41.41elapsed 99%CPU (0avgtext+0avgdata 1572maxresident)k
+    0inputs+0outputs (0major+1373359minor)pagefaults 0swaps
+  #+end_example
+  We observe the same behavior as with the flag =MAP_POPULATE=: going from 1 call to =memset= to 10 does not impact the
+  number of page faults or the system time, it only changes the user time.
+***** Conclusion
+Using classical pages or huge pages does not only change the page size (and thus the page table size). It actually
+changes the *behavior* of the OS. With classical pages, the system time and the number of page faults are proportional to
+both the allocation size and the number of accesses, whereas with huge pages they are only proportional to the
+allocation size.
+**** Flame graph for the CPU utilization :C:EXPERIMENTS:
+- Script commit: =005461dad4c06a2e2463d54eec228e65c07b1015= (the file has been modified to remove the flag =MAP_POPULATE=).
+- Command (the size corresponds to an allocation of a matrix of size 600,000):
+  #+begin_src sh
+    sudo perf record -F1000 --call-graph dwarf ./page_faults 1 2880000000000 1
+
+    sudo perf script | ~/Documents/FlameGraph/stackcollapse-perf.pl --kernel | ~/Documents/FlameGraph/flamegraph.pl > /tmp/flame_2880000000000_hugepage.svg
+  #+end_src
+- Kernel version:
+  #+begin_src sh
+    uname -r
+  #+end_src
+
+  #+RESULTS:
+  : 4.4.0-81-generic
+
+- Result:
+  [[file:cpu_utilization/flame_2880000000000_hugepage.svg]]
+- This flame graph is hard to relate to the previous results.
+- We saw that there was a high CPU utilization (99%) and that most of the time was spent in user mode. But the graph
+  shows that a very large part of the time is spent in some other function, outside of the program scope. My guess would
+  be that such a function should not be counted in the program execution time and that we should therefore have a very
+  low CPU utilization.
+**** Segmented regression :R:
+- [[https://en.wikipedia.org/wiki/Segmented_regression][Wikipedia page]]
+- [[https://stats.stackexchange.com/questions/20890/how-to-use-segmented-package-to-fit-a-piecewise-linear-regression-with-one-break][Example on StackExchange]]
+- Let’s try with dummy data.
+ #+begin_src R :results output :session *R* :exports both + NB = 100 + A1 = 2 # coeff for first part + A2 = 1 # coeff for second part + B1 = 0 # intercept for first part + B2 = 100 # intercept for second part + df = data.frame(n=1:NB) + df$n = sample(500, size=NB, replace=TRUE) + df$noise = sample(20, size=NB, replace=TRUE)-10 + my_func <- function(n, noise) { + if(n < 100) { + return(A1*n+B1 + noise) + } + else { + return(A2*n+B2 + noise) + } + } + df$fn = mapply(my_func, df$n, df$noise) + #+end_src + + #+RESULTS: + + #+begin_src R :file segmented_regression/1.png :results value graphics :results output :session *R* + library(ggplot2) + ggplot(df, aes(x=n, y=fn)) + geom_point() + #+end_src + + #+RESULTS: + [[file:segmented_regression/1.png]] + +- The two modes are clearly visible, let’s try some regressions. + + #+begin_src R :results output :session *R* :exports both + library(segmented) + lm = segmented(lm(fn~n, data=df), seg.Z = ~ n) + summary(lm) + #+end_src + + #+RESULTS: + #+begin_example + + ***Regression Model with Segmented Relationship(s)*** + + Call: + segmented.lm(obj = lm(fn ~ n, data = df), seg.Z = ~n) + + Estimated Break-Point(s): + Est. St.Err + 99.197 3.361 + + Meaningful coefficients of the linear terms: + Estimate Std. Error t value Pr(>|t|) + (Intercept) 1.22041 4.02077 0.304 0.762 + n 1.99373 0.06389 31.208 <2e-16 *** + U1.n -0.98928 0.06420 -15.409 NA + --- + Signif. 
codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+
+    Residual standard error: 6.183 on 96 degrees of freedom
+    Multiple R-Squared: 0.9985, Adjusted R-squared: 0.9985
+
+    Convergence attained in 6 iterations with relative change 2.230614e-15
+  #+end_example
+
+  #+begin_src R :file segmented_regression/2.png :results value graphics :results output :session *R*
+    plot(lm)
+  #+end_src
+
+  #+RESULTS:
+  [[file:segmented_regression/2.png]]
+
+- Need to check, but it seems that:
+  + It expects the underlying “function” to be “continuous”, which is not the case for what we have with =dgemm= on
+    Stampede. If there is a discontinuity at the break point, the estimation fails.
+  + The intercept value is =B1=.
+  + The =n= coefficient is =A1=.
+  + The =U1.n= coefficient is =A2-A1=.
+*** 2017-06-27 Tuesday
+**** Keep trying the segmented regression :R:
+- Using code from [[https://stackoverflow.com/questions/8758646/piecewise-regression-with-r-plotting-the-segments][stackoverflow]]
+- Asked a question on [[https://stackoverflow.com/questions/44778954/segmented-linear-regression-with-discontinuous-data][stackoverflow]].
+- Let’s try with dummy data.
+  #+begin_src R :results output :session *R* :exports both
+    NB = 100
+    A1 = 2 # coeff for first part
+    A2 = 1 # coeff for second part
+    B1 = 0 # intercept for first part
+    B2 = 300 # intercept for second part
+    df = data.frame(n=1:NB)
+    df$n = sample(500, size=NB, replace=TRUE)
+    df$noise = sample(20, size=NB, replace=TRUE)-10
+    my_func <- function(n, noise) {
+      if(n < 100) {
+        return(A1*n+B1 + noise)
+      }
+      else {
+        return(A2*n+B2 + noise)
+      }
+    }
+    df$fn = mapply(my_func, df$n, df$noise)
+  #+end_src
+
+  #+RESULTS:
+
+  #+begin_src R :file segmented_regression/3.png :results value graphics :results output :session *R*
+    library(ggplot2)
+    ggplot(df, aes(x=n, y=fn)) + geom_point()
+  #+end_src
+
+  #+RESULTS:
+  [[file:segmented_regression/3.png]]
+
+- First, using the =segmented= package.
+ + #+begin_src R :results output :session *R* :exports both + library(segmented) + model_segmented = segmented(lm(fn~n, data=df), seg.Z = ~ n) + summary(model_segmented) + #+end_src + + #+RESULTS: + #+begin_example + + ***Regression Model with Segmented Relationship(s)*** + + Call: + segmented.lm(obj = lm(fn ~ n, data = df), seg.Z = ~n) + + Estimated Break-Point(s): + Est. St.Err + 136.566 5.677 + + Meaningful coefficients of the linear terms: + Estimate Std. Error t value Pr(>|t|) + (Intercept) -61.0463 11.7827 -5.181 1.22e-06 *** + n 3.6374 0.1534 23.706 < 2e-16 *** + U1.n -2.6332 0.1593 -16.525 NA + --- + Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 + + Residual standard error: 33.92 on 96 degrees of freedom + Multiple R-Squared: 0.9804, Adjusted R-squared: 0.9798 + + Convergence attained in 4 iterations with relative change -7.90412e-16 + #+end_example + + #+begin_src R :file segmented_regression/4.png :results value graphics :results output :session *R* + predict_segmented = data.frame(n = df$n, fn = broken.line(model_segmented)$fit) + ggplot(df, aes(x = n, y = fn)) + + geom_point() + geom_line(data = predict_segmented, color = 'blue') + #+end_src + + #+RESULTS: + [[file:segmented_regression/4.png]] + +- Then, doing the segmentation by hand. 
+
+  #+begin_src R :file segmented_regression/5.png :results value graphics :results output :session *R*
+    Break<-sort(unique(df$n))
+    Break<-Break[2:(length(Break)-1)]
+    d<-numeric(length(Break))
+    for (i in 1:length(Break)) {
+      model_manual<-lm(fn~(n<Break[i])*n, data=df)
+      d[i]<-summary(model_manual)[[6]]
+    }
+    plot(d)
+  #+end_src
+
+  #+RESULTS:
+  [[file:segmented_regression/5.png]]
+
+  #+begin_src R :results output :session *R* :exports both
+    # Smallest breakpoint
+    breakpoint = Break[which.min(d)]
+    breakpoint
+    df$group = df$n >= breakpoint
+    model_manual<-lm(fn~n*group, data=df)
+    summary(model_manual)
+  #+end_src
+
+  #+RESULTS:
+  #+begin_example
+    [1] 100
+
+    Call:
+    lm(formula = fn ~ n * group, data = df)
+
+    Residuals:
+        Min      1Q  Median      3Q     Max
+    -9.6223 -5.0330 -0.5436  4.7791 10.4031
+
+    Coefficients:
+                 Estimate Std. Error t value Pr(>|t|)
+    (Intercept)   1.02021    2.39788   0.425    0.671
+    n             1.98517    0.04128  48.090   <2e-16 ***
+    groupTRUE   300.21629    3.07455  97.646   <2e-16 ***
+    n:groupTRUE  -0.98826    0.04174 -23.678   <2e-16 ***
+    ---
+    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+
+    Residual standard error: 5.984 on 96 degrees of freedom
+    Multiple R-squared: 0.9994, Adjusted R-squared: 0.9994
+    F-statistic: 5.248e+04 on 3 and 96 DF, p-value: < 2.2e-16
+  #+end_example
+
+  #+begin_src R :file segmented_regression/6.png :results value graphics :results output :session *R*
+    dat_pred = data.frame(n = df$n, fn = predict(model_manual, df))
+    ggplot(df, aes(x = n, y = fn)) +
+      geom_point() +
+      geom_line(data=dat_pred[dat_pred$n < breakpoint,], color = 'blue')+
+      geom_line(data=dat_pred[dat_pred$n >= breakpoint,], color = 'blue')
+  #+end_src
+
+  #+RESULTS:
+  [[file:segmented_regression/6.png]]
+
+- The =segmented= package fails when the data is discontinuous.
+- The dirty method works great.
diff --git a/module2/ressources/video_examples/paper.org b/module2/ressources/video_examples/paper.org new file mode 100644 index 0000000..5eaaf6e --- /dev/null +++ b/module2/ressources/video_examples/paper.org @@ -0,0 +1,2634 @@ +# -*- coding: utf-8 -*- +# -*- org-confirm-babel-evaluate: nil -*- +# -*- mode: org -*- +#+TITLE: +#+LANGUAGE: en +#+OPTIONS: H:5 author:nil email:nil creator:nil timestamp:nil skip:nil toc:nil ^:nil +#+TAGS: ARNAUD(a) CHRISTIAN(c) TOM(T) +#+TAGS: noexport(n) DEPRECATED(d) ignore(i) +#+TAGS: EXPERIMENT(e) LU(l) EP(e) +#+STARTUP: overview indent inlineimages logdrawer hidestars +#+EXPORT_SELECT_TAGS: export +#+EXPORT_EXCLUDE_TAGS: noexport +#+SEQ_TODO: TODO(t!) STARTED(s!) WAITING(w@) | DONE(d!) CANCELLED(c@) DEFERRED(@) FLAWED(f@) +#+LATEX_CLASS: IEEEtran +#+LATEX_CLASS_OPTIONS: [nofonttune] +#+PROPERTY: header-args :eval never-export + +* LaTeX Preamble :ignore: +#+LATEX_HEADER: \usepackage{DejaVuSansMono} +#+LATEX_HEADER: \usepackage[T1]{fontenc} +#+LATEX_HEADER: \usepackage[utf8]{inputenc} +#+LATEX_HEADER: %\usepackage{fixltx2e} +#+LATEX_HEADER: \usepackage{ifthen,figlatex} +#+LATEX_HEADER: \usepackage{longtable} +#+LATEX_HEADER: \usepackage{float} +#+LATEX_HEADER: \usepackage{wrapfig} +#+LATEX_HEADER: \usepackage{subfigure} +#+LATEX_HEADER: \usepackage{graphicx} +#+LATEX_HEADER: \usepackage{color,soul} +#+LATEX_HEADER: \usepackage[export]{adjustbox} +#+LATEX_HEADER: \usepackage{xspace} +#+LATEX_HEADER: \usepackage{amsmath,amssymb} +#+LATEX_HEADER: \usepackage[american]{babel} +#+LATEX_HEADER: \usepackage{relsize} +#+LATEX_HEADER: \AtBeginDocument{ +#+LATEX_HEADER: \definecolor{pdfurlcolor}{rgb}{0,0,0.6} +#+LATEX_HEADER: \definecolor{pdfcitecolor}{rgb}{0,0.6,0} +#+LATEX_HEADER: \definecolor{pdflinkcolor}{rgb}{0.6,0,0} +#+LATEX_HEADER: \definecolor{light}{gray}{.85} +#+LATEX_HEADER: \definecolor{vlight}{gray}{.95} +#+LATEX_HEADER: } +#+LATEX_HEADER: %\usepackage[paper=letterpaper,margin=1.61in]{geometry} +#+LATEX_HEADER: 
\usepackage{url} \urlstyle{sf} +#+LATEX_HEADER: \usepackage[normalem]{ulem} +#+LATEX_HEADER: \usepackage{todonotes} +#+LATEX_HEADER: \usepackage{fancyvrb} +#+LATEX_HEADER: \usepackage[colorlinks=true,citecolor=pdfcitecolor,urlcolor=pdfurlcolor,linkcolor=pdflinkcolor,pdfborder={0 0 0}]{hyperref} +#+LATEX_HEADER: \usepackage{color,colortbl} +#+LATEX_HEADER: \definecolor{gray98}{rgb}{0.98,0.98,0.98} +#+LATEX_HEADER: \definecolor{gray20}{rgb}{0.20,0.20,0.20} +#+LATEX_HEADER: \definecolor{gray25}{rgb}{0.25,0.25,0.25} +#+LATEX_HEADER: \definecolor{gray16}{rgb}{0.161,0.161,0.161} +#+LATEX_HEADER: \definecolor{gray60}{rgb}{0.6,0.6,0.6} +#+LATEX_HEADER: \definecolor{gray30}{rgb}{0.3,0.3,0.3} +#+LATEX_HEADER: \definecolor{bgray}{RGB}{248, 248, 248} +#+LATEX_HEADER: \definecolor{amgreen}{RGB}{77, 175, 74} +#+LATEX_HEADER: \definecolor{amblu}{RGB}{55, 126, 184} +#+LATEX_HEADER: \definecolor{amred}{RGB}{228,26,28} +#+LATEX_HEADER: \definecolor{amdove}{RGB}{102,102,122} +#+LATEX_HEADER: \usepackage{xcolor} +#+LATEX_HEADER: \usepackage[procnames]{listings} +#+LATEX_HEADER: \lstset{ % +#+LATEX_HEADER: backgroundcolor=\color{gray98}, % choose the background color; you must add \usepackage{color} or \usepackage{xcolor} +#+LATEX_HEADER: basicstyle=\tt\scriptsize, % the size of the fonts that are used for the code +#+LATEX_HEADER: breakatwhitespace=false, % sets if automatic breaks should only happen at whitespace +#+LATEX_HEADER: breaklines=true, % sets automatic line breaking +#+LATEX_HEADER: showlines=true, % sets automatic line breaking +#+LATEX_HEADER: captionpos=b, % sets the caption-position to bottom +#+LATEX_HEADER: commentstyle=\color{gray30}, % comment style +#+LATEX_HEADER: extendedchars=true, % lets you use non-ASCII characters; for 8-bits encodings only, does not work with UTF-8 +#+LATEX_HEADER: frame=single, % adds a frame around the code +#+LATEX_HEADER: keepspaces=true, % keeps spaces in text, useful for keeping indentation of code (possibly needs columns=flexible) 
+#+LATEX_HEADER: keywordstyle=\color{amblu}, % keyword style +#+LATEX_HEADER: procnamestyle=\color{amred}, % procedures style +#+LATEX_HEADER: language=[95]fortran, % the language of the code +#+LATEX_HEADER: numbers=left, % where to put the line-numbers; possible values are (none, left, right) +#+LATEX_HEADER: numbersep=5pt, % how far the line-numbers are from the code +#+LATEX_HEADER: numberstyle=\tiny\color{gray20}, % the style that is used for the line-numbers +#+LATEX_HEADER: rulecolor=\color{gray20}, % if not set, the frame-color may be changed on line-breaks within not-black text (\eg comments (green here)) +#+LATEX_HEADER: showspaces=false, % show spaces everywhere adding particular underscores; it overrides 'showstringspaces' +#+LATEX_HEADER: showstringspaces=false, % underline spaces within strings only +#+LATEX_HEADER: showtabs=false, % show tabs within strings adding particular underscores +#+LATEX_HEADER: stepnumber=2, % the step between two line-numbers. If it's 1, each line will be numbered +#+LATEX_HEADER: stringstyle=\color{amdove}, % string literal style +#+LATEX_HEADER: tabsize=2, % sets default tabsize to 2 spaces +#+LATEX_HEADER: % title=\lstname, % show the filename of files included with \lstinputlisting; also try caption instead of title +#+LATEX_HEADER: procnamekeys={call} +#+LATEX_HEADER: } +#+LATEX_HEADER: \definecolor{colorfuncall}{rgb}{0.6,0,0} +#+LATEX_HEADER: \newcommand{\prettysmall}{\fontsize{6}{8}\selectfont} +#+LATEX_HEADER: \let\oldtexttt=\texttt +#+LATEX_HEADER: \renewcommand\texttt[1]{\oldtexttt{\smaller[1]{#1}}} +# #+LATEX_HEADER: \usepackage[round-precision=3,round-mode=figures,scientific-notation=true]{siunitx} +#+LATEX_HEADER: \usepackage[binary-units]{siunitx} +#+LATEX_HEADER: \DeclareSIUnit\flop{Flop} +#+LATEX_HEADER: \DeclareSIUnit\flops{\flop\per\second} +#+LATEX_HEADER:\usepackage{tikz} +#+LATEX_HEADER:\usetikzlibrary{arrows,shapes,positioning,shadows,trees,calc} +#+LATEX_HEADER:\usepackage{pgfplots} 
+#+LATEX_HEADER:\pgfplotsset{compat=1.13} + +#+LATEX_HEADER: \usepackage{enumitem} +#+LATEX_HEADER: \setlist[itemize,1]{leftmargin=\dimexpr 26pt-.2in} +#+LATEX_HEADER: \usepackage[mode=buildnew]{standalone} +#+LATEX_HEADER: \usepackage[ruled,vlined,english]{algorithm2e} +#+LATEX_HEADER: \DontPrintSemicolon + +#+LaTeX: \newcommand\myemph[1]{\color{colorfuncall}\textbf{#1}}% + +#+LaTeX: \newcommand\labspace[1][-0cm]{\vspace{#1}} +#+LaTeX: \renewcommand\O{\ensuremath{\mathcal{O}}\xspace}% + +#+BEGIN_EXPORT latex +\makeatletter +\newcommand{\removelatexerror}{\let\@latex@error\@gobble} +\makeatother +#+END_EXPORT + +* LaTeX IEEE title and authors :ignore: +#+BEGIN_EXPORT latex +\let\oldcite=\cite +\renewcommand\cite[2][]{~\ifthenelse{\equal{#1}{}}{\oldcite{#2}}{\oldcite[#1]{#2}}\xspace} +\let\oldref=\ref +\def\ref#1{~\oldref{#1}\xspace} +\def\eqref#1{~(\oldref{#1})\xspace} +\def\ie{i.e.,\xspace} +\def\eg{e.g.,\xspace} +\def\etal{~\textit{et al.\xspace}} +\newcommand{\AL}[2][inline]{\todo[caption={},color=green!50,#1]{\small\sf\textbf{AL:} #2}} +\newcommand{\TC}[2][inline]{\todo[caption={},color=blue!50,#1]{\small\sf\textbf{TOM:} #2}} +\newcommand{\CH}[2][inline]{\todo[color=red!30,#1]{\small\sf \textbf{CH:} #2}} +%\newcommand{\AL}[2][inline]{} +%\newcommand{\TC}[2][inline]{} +%\newcommand{\CH}[2][inline]{} + +%% Omit the copyright space. 
+%\makeatletter +%\def\@copyrightspace{} +%\makeatother + +%\def\IEEEauthorblockN#1{\gdef\IEEEauthorrefmark##1{\ensuremath{{}^{\textsf{##1}}}}#1} +%\newlength{\blockA} +%\setlength{\blockA}{.35\linewidth} +%\def\IEEEauthorblockA#1{ +% \scalebox{.9}{\begin{minipage}{\blockA}\normalsize\sf +% \def\IEEEauthorrefmark##1{##1: } +% #1 +% \end{minipage}} +%} +% \def\IEEEauthorrefmark#1{#1: } + +\title{Emulating High Performance Linpack on a Commodity Server at the Scale of a Supercomputer} +%\title{Simulating the Energy Consumption of MPI~Applications} +% Predicting the Performance and the Power Consumption of MPI Applications With SimGrid + %\titlerunning{Power-aware simulation for large-scale systems with SimGrid} + % + + \author{ + \begin{minipage}{.55\linewidth}\centering + \IEEEauthorblockN{Tom Cornebize, Franz C. Heinrich, Arnaud Legrand}\\ + \IEEEauthorblockA{Univ. Grenoble Alpes, CNRS, Inria, Grenoble INP, LIG\\ 38000 Grenoble, France\\ + firstname.lastname@inria.fr} + \end{minipage} + \begin{minipage}{.35\linewidth}\centering + \IEEEauthorblockN{Jérôme Vienne}\\ + \IEEEauthorblockA{Texas Advanced Computing Center\\Austin, Texas, USA\\ + viennej@tacc.utexas.edu} + \end{minipage} + + } + + + \maketitle % typeset the title of the contribution +#+END_EXPORT +* Abstract :ignore: +#+LaTeX: \begin{abstract} +The Linpack benchmark, in particular the High-Performance Linpack +(HPL) implementation, has emerged as the de-facto standard benchmark +to rank supercomputers in the TOP500. With a power consumption of +several MW on a TOP500 machine, test-running HPL on the whole +machine for hours is extremely expensive. With core-counts beyond the +100,000 cores threshold being common and sometimes even ranging into +the millions, an optimization of HPL parameters (problem size, grid +arrangement, granularity, collective operation algorithms, etc.) +specifically suited to the network topology and performance is +essential.
Such an optimization can be particularly time-consuming and +can hardly be done through simple mathematical performance models. In +this article, we explain how we both extended SimGrid's SMPI +simulator and slightly modified HPL to allow a fast emulation of HPL +on a single commodity computer at the scale of a supercomputer. More +precisely, we take as a motivating use case the large-scale run +performed on the Stampede cluster at TACC in 2013, when it got ranked +6th in the TOP500. While this qualification run required the +dedication of 6,006 computing nodes of the supercomputer and more than +120\nbsp{}TB of RAM for more than 2\nbsp{}hours, we managed to simulate a similar +configuration on a commodity computer with 19\nbsp{}GB of RAM in about +62\nbsp{}hours. Combined with a careful modeling of Stampede, this simulation +allows us to evaluate the performance that would have been obtained +using the freely available version of HPL. This performance turns out to be much +lower than the one reported, which was obtained using a +closed-source version specifically designed by Intel +engineers. Our simulation allows us to hint at where the main algorithmic +improvements must have been made in HPL. +#+LaTeX: \end{abstract} + + +#+BEGIN_EXPORT latex +% this is needed to trim the number of authors and et al. for more than 3 authors +\bstctlcite{IEEEexample:BSTcontrol} +#+END_EXPORT +* Introduction + +The world's largest and fastest machines are ranked twice a year in the so-called +TOP500 list. Among the benchmarks that are often used to evaluate +those machines, the Linpack benchmark, in particular the High-Performance Linpack (HPL) +implementation, has emerged as the de-facto standard benchmark, although +other benchmarks such as HPCG and HPGMG have recently been proposed to +become the new standard. Today, machines with 100,000\nbsp{}cores +and more are common and several machines beyond the 1,000,000\nbsp{}cores mark +are already in production.
This high density of computation units requires diligent optimization of application +parameters, such as problem size, process organization or choice of algorithm, as these +have an impact on load distribution and network utilization. +Furthermore, to yield the best benchmark results, +runtimes (such as OpenMPI) and supporting libraries (such as BLAS) need to be fine-tuned and adapted to the +underlying platform. + +Alas, it typically takes several hours to run HPL on the list's number one system. +This duration, combined with the power consumption that often reaches several MW +for TOP500 machines, makes it financially infeasible to test-run HPL on the whole +machine just to tweak parameters. +Yet, performance results of an already deployed, current-generation machine typically also +play a role in the funding process for future machines. Results near +the optimal performance for the current machine are hence considered critical for +HPC centers and vendors. These entities would benefit from being able to +tune parameters without actually running the benchmark for hours. +# This estimation can be done either via (mathematical) performance models (e.g., by +# estimating performance of specific functions) or by a simulation based approach. +# While performance models neglect the +# oftentimes serious impact of the network (\eg due to congestion, shared bandwidth, +# ...), this is not in general true for the simulation approach. + +# \CH{Furthermore, simulations can be used to validate/check that the execution went well (operated near the peak performance) but can also help to find the right parameters for the application, runtime and network.} + +In this article, we explain how to predict the performance of HPL +through simulation with the SimGrid/SMPI simulator.
We detail how we obtained +faithful models for several functions (\eg =DGEMM= and =DTRSM=) and how we managed +to reduce the memory consumption from more than a hundred terabytes to several +gigabytes, allowing us to emulate HPL on a commonly available server node. +We evaluate the effectiveness of our solution by +simulating a scenario similar to the run conducted on the Stampede +cluster (TACC) in 2013 for the TOP500. + +This article is organized as follows: +Section\ref{sec:con} presents the main characteristics of the HPL +application and provides details on the run that was conducted at TACC +in 2013. Section\ref{sec:relwork} discusses existing related work and +explains why emulation (or /online simulation/) is the only relevant +approach when studying an application as complex as HPL. In +Section\ref{sec:smpi}, we briefly present the simulator we used for +this work, SimGrid/SMPI, followed by an +extensive discussion in Section\ref{sec:em} about the +optimizations on all levels (\ie simulator, application, system) that +were necessary to make a large-scale run tractable. The scalability of +our approach is evaluated in Section\ref{sec:scalabilityevol}. The +modeling of the Stampede platform and the comparison of our simulation +with the 2013 execution are detailed in +Section\ref{sec:science}. Lastly, Section\ref{sec:cl} concludes this +article by summarizing our contributions. + +* Context +#+LaTeX: \label{sec:con} + +# The HPLinpack benchmark consists of a set of rules: A set of linear +# equations, $Ax = b$, needs to be solved and it requires furthermore that the input matrix can be of +# arbitrary dimension =n= and that O(n³) + O(n²) operations be used +# (hence, Strassen's matrix multiplication is prohibited).
+ +** High-Performance Linpack +\label{sec:hpl} +#+BEGIN_EXPORT latex +\begin{figure} + \newcommand{\mykwfn}[1]{{\bf\textsf{#1}}}% + \SetAlFnt{\sf}% + \SetKwSty{mykwfn}% + \SetKw{KwStep}{step}% + \centering + \begin{minipage}[m]{0.4\linewidth} + % \vspace{0.3cm} % ugly, could not align the drawing with the algorithm with minipages or tabular... + \begin{tikzpicture}[scale=0.23] + \draw (0, 0) -- (0, 12) -- (12, 12) -- (12, 0) -- cycle; + \foreach \i in {2}{ + \draw [fill=lightgray] (\i, 0) -- (\i, 12-\i) -- (12, 12-\i) -- (12, 0) -- cycle; + \draw [fill=gray] (\i, 12-\i) -- (\i, 12-\i-1) -- (\i+1, 12-\i-1) -- (\i+1, 12-\i) -- cycle; + \draw[very thick, -latex] (\i,12-\i) -- (\i+2,12-\i-2); + \draw[<->] (\i, 12-\i+0.5) -- (\i+1, 12-\i+0.5) node [pos=0.5, yshift=+0.15cm] {\scalebox{.8}{\texttt{NB}}}; + } + \foreach \i in {3}{ + \draw [fill=white] (\i, 0) -- (\i, 12-\i) -- (12, 12-\i) -- (12, 0) -- cycle; + \draw (\i,12-\i) -- (\i,0); + \draw[very thick, -latex] (\i,12-\i) -- (\i+2,12-\i-2); + } + \draw[dashed] (0, 12) -- (12, 0); + \node(L) at (2, 2) {\ensuremath{\boldsymbol{L}}}; + \node(U) at (10, 10) {\ensuremath{\boldsymbol{U}}}; + \node(A) at (8, 4) {\ensuremath{\boldsymbol{A}}}; + \draw[<->] (0, -0.5) -- (12, -0.5) node [pos=0.5, yshift=-0.3cm] {$N$}; + + \end{tikzpicture} + \end{minipage}% + \begin{minipage}[m]{0.6\linewidth} + \removelatexerror + \begin{algorithm}[H] + allocate and initialize $A$\; + \For{$k=N$ \KwTo $0$ \KwStep \texttt{NB}}{ + allocate the panel\; + factor the panel\; + broadcast the panel\; + update the sub-matrix; + } + \end{algorithm} + \vspace{1em} + \end{minipage} + + \caption{Overview of High Performance Linpack}\vspace{-1em} + \label{fig:hpl_overview} +\end{figure} +#+END_EXPORT + +For this work, we use the freely-available reference-implementation of +the High-Performance Linpack benchmark\cite{HPL}, HPL, which is +used to benchmark systems for the TOP500\cite{top500} list. 
HPL +requires MPI to be available and implements +an LU decomposition, \ie a factorization of a square matrix $A$ as the +product of a lower triangular matrix $L$ and an upper triangular +matrix $U$. HPL checks the correctness of this factorization by +solving a linear system $A\cdot{}x=b$, but only the factorization step is +benchmarked. The factorization is based on a right-looking variant of +the LU factorization with row partial pivoting and allows multiple +look-ahead depths. The working principle of the factorization is depicted in +Figure\ref{fig:hpl_overview} and consists of a series of panel +factorizations followed by an update of the trailing sub-matrix. +HPL uses a two-dimensional block-cyclic data distribution of $A$ and implements several custom +collective communication algorithms to efficiently overlap communication +with computation. +The main parameters of HPL are listed below: +- $N$ is the order of the square matrix $A$. +- =NB= is the ``blocking factor'', \ie the granularity at + which HPL operates when panels are distributed or worked on. +- $P$ and $Q$ denote the number of process rows and the + number of process columns, respectively. +- =RFACT= determines the panel factorization algorithm. Possible values are Crout, left- or right-looking. +- =SWAP= specifies the swapping algorithm used while pivoting. Two + algorithms are available: one based on /binary exchange/ (along a virtual tree topology) and the other one based on + a /spread-and-roll/ (with a higher number of parallel communications). HPL + also provides a panel-size threshold triggering a switch from one variant to the other. +- =BCAST= sets the algorithm used to broadcast the + panel of columns to the other process columns.
Legacy versions of + the MPI standard only supported non-blocking point-to-point communications but did + not support non-blocking collective communications, which is why HPL + ships with a total of six self-implemented variants to efficiently + overlap the time spent waiting for an incoming panel with updates to + the trailing matrix: =ring=, =ring-modified=, =2-ring=, =2-ring-modified=, + =long=, and =long-modified=. The =modified= versions guarantee that + the process right after the root (\ie the process that will become the root + in the next iteration) receives data first and does not participate + further in the broadcast. This process can thereby start working on the + panel as soon as possible. The =ring= and =2-ring= versions correspond + to the two virtual topologies they are named after, while the =long= version + is a /spread and roll/ algorithm where messages are chopped into $Q$ + pieces. This generally leads to better bandwidth exploitation. The =ring= and + =2-ring= variants rely on =MPI_Iprobe=, meaning they + return control if no message has been fully received yet and hence + facilitate partial overlapping of communication with computations. In HPL 2.2 and 2.1, this capability + has been deactivated for the =long= and =long-modified= algorithms. A comment in the source code states that some + machines apparently get stuck when there are too many ongoing messages. +- =DEPTH= controls how many iterations of the outer loop can overlap with each other. + +#+BEGIN_EXPORT latex +\begin{figure}[t] + \centering + \includegraphics[width=.95\linewidth,page=1]{./figures/stampede.pdf} + \caption{The fat-tree network topology of Stampede.} + \label{fig:fat_tree_topology} + \labspace +\end{figure} +#+END_EXPORT + +The sequential complexity of this factorization is +$\mathrm{flop}(N) = \frac{2}{3}N^3 + 2N^2 + \O(N)$ where $N$ is the +order of the matrix to factorize.
The time complexity can be +approximated by +$$T(N) \approx \frac{\left(\frac{2}{3}N^3 + 2N^2\right)}{P\cdot{}Q\cdot{}w} + \Theta((P+Q)\cdot{}N^2),$$ where +$w$ is the flop rate of a single node and +the second term corresponds to the communication overhead which is +influenced by the network capacity and by the previously listed parameters (=RFACT=, =SWAP=, =BCAST=, +=DEPTH=, \ldots). +After each run, HPL reports the overall flop +rate $\mathrm{flop}(N)/T(N)$ (expressed in \si{\giga\flops}) for +the given configuration. See Figure\ref{fig:hpl_output} for a (shortened) +example output. + +A large-scale execution of HPL on a real machine in order to submit to the TOP500 +can therefore be quite time consuming as all the BLAS kernels, the MPI runtime, and HPL's numerous parameters +need to be tuned carefully in order to reach optimal performance. +** A Typical Run on a Supercomputer +\label{sec:stampede} +In June 2013, the Stampede supercomputer at TACC was ranked 6th in the +TOP500 by achieving \SI{5168.1}{\tera\flops} and was still ranked 20th in +June 2017. In 2017, this machine got upgraded and renamed Stampede2. The Stampede platform +consisted of 6400 Sandy Bridge nodes, each with two 8-core Xeon E5-2680 and one +Intel Xeon Phi KNC MIC coprocessor. The nodes were interconnected +through a \SI{56}{\giga\bit\per\second} FDR InfiniBand 2-level Clos +fat-tree topology built on Mellanox switches. As can be seen in +Figure\ref{fig:fat_tree_topology}, the 6400 nodes are +divided into groups of 20, with each group being connected to one of the 320 36-port switches (\SI{4}{\tera\bit\per\second} +capacity), which are themselves connected to 8 648-port +``core\nbsp{}switches'' (each with a capacity of \SI{73}{\tera\bit\per\second}). +The peak performance of the 2 Xeon CPUs per node was approximately \SI{346}{\giga\flops}, +while the peak performance of the KNC co-processor was about +\SI{1}{\tera\flops}. 
The theoretical peak performance of the +platform was therefore \SI{8614}{\tera\flops}. However, in the TOP500, Stampede +was ranked with \SI{5168}{\tera\flops}. According to the log submitted +to the TOP500 (see Figure\ref{fig:hpl_output}) that was provided to us, +this execution took roughly two hours and used $77\times78 = 6,006$ +processes. The matrix of order $N = 3,875,000$ occupied approximately +\SI{120}{\tera\byte} of memory, \ie \SI{20}{\giga\byte} per node. +One MPI process per node was used and each node's +computational resources (the 16 CPU-cores and the Xeon Phi) must have +been controlled by OpenMP and/or Intel's MKL. + +#+BEGIN_EXPORT latex +\begin{figure}%[!htb] + \centering + \scalebox{.73}{\begin{minipage}[b]{.68\textwidth} + \lstset{frame=bt,language=html,numbers=none,escapechar=£}\lstinputlisting{fullrun_hpl.txt} + \end{minipage}} + \null\vspace{-2em}\caption{HPL output submitted in June 2013 for the ranking of Stampede in the TOP500.}\vspace{-1em} + \label{fig:hpl_output} +\end{figure} +#+END_EXPORT + +*** Hidden information about the Stampede execution :noexport: +#+BEGIN_SRC C :exports none :tangle fullrun_hpl.txt +================================================================================ +HPLinpack 2.1 -- High-Performance Linpack benchmark -- October 26, 2012 +Written by A. Petitet and R. 
Clint Whaley, Innovative Computing Laboratory, UTK +Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK +Modified by Julien Langou, University of Colorado Denver +================================================================================ + +The following parameter values will be used: + +£\myemph{N}£ : £\myemph{3875000}£ +£\myemph{NB}£ : £\myemph{1024}£ +PMAP : Column-major process mapping +£\myemph{P}£ : £\myemph{77}£ +£\myemph{Q}£ : £\myemph{78}£ +PFACT : Right +NBMIN : 4 +NDIV : 2 +RFACT : Crout +BCAST : BlongM +DEPTH : 0 +SWAP : Binary-exchange +L1 : no-transposed form +U : no-transposed form +EQUIL : no +ALIGN : 8 double precision words + +-------------------------------------------------------------------------------- + + +[...] + + +Peak Performance = 5172687.23 GFlops / 861.25 GFlops per node +================================================================================ +T/V N NB P Q Time Gflops +-------------------------------------------------------------------------------- +WC05C2R4 3875000 1024 77 78 7505.72 £\myemph{5.16811e+06}£ +HPL_pdgesv() start time Sun Jun 2 13:04:59 2013 + +HPL_pdgesv() end time Sun Jun 2 15:10:04 2013 + +-------------------------------------------------------------------------------- +||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0007822 ...... PASSED +#+END_SRC + +** Performance Evaluation Challenges +:LOGBOOK: +- State "TODO" from [2017-11-15 mer. 16:26] +:END: +#+LaTeX: \label{sec:con:diff} + +The performance achieved by Stampede, \SI{5168}{\tera\flops}, needs to +be compared to the peak performance of the 6,006 nodes, \ie +\SI{8084}{\tera\flops}. This difference may be attributed to the node +usage (\eg the MKL), to the MPI library, to the network topology that +may be unable to deal with the very intensive communication workload, to +load imbalance among nodes because some node happens to be slower for some +reason (defect, system noise, \ldots), to the algorithmic structure of +HPL, etc. 
All these factors make it difficult to know precisely what +performance to expect +without running the application at scale. + +It is clear that due to the level of complexity of both HPL and +the underlying hardware, simple performance models (analytic expressions based +on $N, P, Q$ and estimations of platform characteristics as presented in +Section\ref{sec:hpl}) may be able to provide trends but can by no means +predict the performance for each configuration (\ie consider the +exact effect of HPL's 6 different broadcast algorithms on network +contention). Additionally, these expressions do not allow +engineers to improve the performance through actively identifying performance bottlenecks. +For complex optimizations such as partially non-blocking +collective communication algorithms intertwined with computations, +very faithful modeling of both the application and the platform is +required. Given the scale of this scenario +(3,785\nbsp{}steps on 6,006 nodes in two hours), detailed +simulations quickly become intractable without significant effort. +* Related Work +#+LaTeX: \label{sec:relwork} + +Performance prediction of MPI applications through simulation has been +widely studied over the last decades, with today's literature distinguishing mainly +between two approaches: offline and online simulation. + +With the most common approach, /offline simulation/, a time-independent +trace of the application is first obtained on a real platform. This +trace comprises sequences of MPI operations and CPU bursts and can +be given as an input to a simulator that implements performance models +for the CPUs and the network to derive timings. Researchers +interested in finding out how their application reacts to changes to +the underlying platform can replay the trace on commodity hardware at +will with different platform models. +Most HPC simulators available today, notably BigSim\cite{bigsim_04}, +Dimemas\cite{dimemas} and CODES\cite{CODES}, rely on this approach.
+ +The main limitation of this approach comes from the trace +acquisition requirement. +Additionally, tracing an application provides only information about +its behavior at the time of the run. Even light modifications +(\eg to communication patterns) may make the trace inaccurate. For +simple applications (\eg =stencil=) it is sometimes +possible to extrapolate behavior from small-scale +traces\cite{scalaextrap,pmac_lspp13} but the execution is +non-deterministic whenever the application relies on +non-blocking communication patterns, which is unfortunately the +case for HPL. + +The second approach discussed in the literature is /online simulation/. +Here, the application is executed (emulated) on top of a simulator +that is responsible for determining when each process +is run. This approach allows researchers +to study directly the behavior of MPI applications but only a few +recent simulators such as SST Macro\cite{sstmacro}, +SimGrid/SMPI\cite{simgrid} +and the closed-source extreme-scale simulator xSim\cite{xsim} support +it. To the best of our knowledge, only SST Macro and +SimGrid/SMPI are not only mature enough to faithfully emulate +HPL but also free software. For our work, we relied on SimGrid as we +have an excellent knowledge of its internals although the developments we +propose would a priori also be possible with SST Macro. Emulation of +HPL comes with at least two challenges: +- Firstly, the time-complexity of the + algorithm is $\Theta(N^3)$. Furthermore, + $\Theta(N^2)$ communications are performed, with $N$ being very + large. The execution on the Stampede cluster took roughly two hours + on 6,006\nbsp{}compute nodes. Using only a single node, a naive + emulation of HPL at the scale of the Stampede run would take about + 500\nbsp{}days if perfect scaling is reached. Although the emulation could + be done in parallel, we want to use as few computing resources as possible.
+- Secondly, the tremendous memory consumption and consequent high + number of RAM accesses for read/write operations need to be dealt with. + +# Real execution: +# - Matrix of order 3,875,000 +# - Using 6,006 MPI processes +# - About 2 hours +# Requirement for the emulation of Stampede's execution: +# - $\ge 3,875,000^2 \times 8$ bytes \approx 120 terabytes of memory +# - $\ge 6,006 \times 2$ hours \approx 500 days (very optimistic) + +* SimGrid/SMPI in a nutshell +#+LATEX: \label{sec:smpi} + +SimGrid\cite{simgrid} is a flexible and open-source simulation +framework that was originally designed in 2000 to study scheduling +heuristics tailored to heterogeneous grid computing +environments. Since then, SimGrid has also been used to study +peer-to-peer systems with up to two million +peers\cite{simgrid_simix2_12} as well as cloud and HPC infrastructures. +To this end, SMPI, a simulator based on SimGrid, has been +developed and used to faithfully simulate unmodified MPI applications +written in C/C++ or FORTRAN\cite{smpi}. +A main development goal for SimGrid has been to provide validated +performance models particularly for scenarios leveraging the network. +Such a validation normally consists of comparing simulation +predictions with results from real experiments to confirm or debunk network and application models. +In\cite{heinrich:hal-01523608}, we have for instance validated +SimGrid's energy module by accurately and consistently predicting within a few +percent the performance and the energy consumption of HPL and some +other benchmarks on small-scale clusters (up to $12\times12$ cores +in\cite{heinrich:hal-01523608} and up to $128\times1$ cores +in\cite{smpi}). + +In this article, we aim to validate our approach through much larger experiments.
+This scale, however, comes at the cost of a much less controlled +scenario for real-life experiments since the Stampede run of HPL was done +in 2013 and we only have very limited information about the +setup (\eg software versions). + +** MPI Communication Modeling +The complex network optimizations done in real MPI implementations +need to be considered when predicting performance of MPI applications. +For instance, message size not only influences the network's latency +and bandwidth factors but also the protocol used, such as ``eager'' or +``rendez-vous'', as they are selected +based on the message size, with each protocol having its own +synchronization semantics. +To deal with this, SMPI relies on a generalization of the LogGPS +model\cite{smpi} and supports specifying synchronization and performance modes. This model +needs to be instantiated once per platform through a carefully controlled series of messages +(=MPI_Send= and =MPI_Recv=) between two nodes and through a set of +piece-wise linear regressions. +#+LABEL: \CH{This last sentence may be too long.} + +Modeling network topologies and contention is also difficult. SMPI +relies on SimGrid's communication models where each ongoing +communication is represented as a whole (as opposed to single packets) +by a /flow/. Assuming steady-state, contention between active +communications can be modeled as a bandwidth sharing problem that +accounts for non-trivial phenomena (\eg RTT-unfairness of TCP, +cross-traffic interference or network +heterogeneity\cite{Velho_TOMACS13}). Communications that start or end +trigger re-computation of the bandwidth sharing if needed. In this +model, the time to simulate a message passing through the network is +independent of its size, which is advantageous for large-scale +applications frequently sending large messages. SimGrid does not +model transient phenomena incurred by the network protocol but +accounts for network topology and heterogeneity. 
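
To illustrate the kind of computation such a bandwidth sharing model involves, the following standalone C program implements plain max-min fairness by progressive filling over a static set of flows and links. It is only a sketch of the principle: SimGrid's actual solver additionally accounts for RTT-unfairness, cross-traffic interference and protocol effects, and all names below (=maxmin=, =route=, =cap=) are ours, not part of SimGrid's API.

#+BEGIN_SRC c
#include <assert.h>
#include <math.h>

#define NL 8   /* maximum number of links */
#define NF 8   /* maximum number of flows */

/* route[f][l] == 1 iff flow f crosses link l; cap[l] is the capacity of
   link l.  On return, rate[f] holds the max-min fair rate of flow f. */
static void maxmin(int nf, int nl, const int route[NF][NL],
                   const double cap[NL], double rate[NF])
{
    double used[NL] = {0};
    int frozen[NF] = {0};
    int left = nf;
    for (int f = 0; f < nf; f++) rate[f] = 0.0;
    while (left > 0) {
        /* smallest fair share among links still carrying active flows */
        double eps = INFINITY;
        for (int l = 0; l < nl; l++) {
            int n = 0;
            for (int f = 0; f < nf; f++)
                if (!frozen[f] && route[f][l]) n++;
            if (n > 0 && (cap[l] - used[l]) / n < eps)
                eps = (cap[l] - used[l]) / n;
        }
        if (eps == INFINITY) break;  /* remaining flows cross no link */
        /* grow every active flow by eps and update link usage */
        for (int f = 0; f < nf; f++) {
            if (frozen[f]) continue;
            rate[f] += eps;
            for (int l = 0; l < nl; l++)
                if (route[f][l]) used[l] += eps;
        }
        /* freeze the flows crossing a saturated link */
        for (int f = 0; f < nf; f++) {
            if (frozen[f]) continue;
            for (int l = 0; l < nl; l++)
                if (route[f][l] && cap[l] - used[l] < 1e-9) {
                    frozen[f] = 1; left--; break;
                }
        }
    }
}

int main(void)
{
    /* two flows share link 0 (capacity 1); flow 1 also crosses link 1
       (capacity 10): both converge to 0.5, link 1 stays under-used */
    const int route[NF][NL] = {{1, 0}, {1, 1}};
    const double cap[NL] = {1.0, 10.0};
    double rate[NF];
    maxmin(2, 2, route, cap, rate);
    assert(fabs(rate[0] - 0.5) < 1e-6);
    assert(fabs(rate[1] - 0.5) < 1e-6);
    return 0;
}
#+END_SRC

Whenever a communication starts or ends, re-running such a fixed-point computation on the remaining flows is what keeps the cost of simulating a message independent of its size.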
+ +Finally, collective operations are also challenging, particularly since +these operations often play a key role in an application's performance. Consequently, performance optimization +of these operations has been studied intensively. As a result, MPI +implementations now commonly have several alternatives for each +collective operation and select one at runtime, depending on message size and communicator +geometry. SMPI implements collective +communication algorithms and the selection logic from several MPI implementations (\eg +Open MPI, MPICH), which helps to ensure that +simulations are as close as possible to real +executions. +Although SMPI supports these facilities, they are not required in the +case of HPL as it ships with its own implementation of collective +operations. + #+BEGIN_EXPORT latex + \tikzset{draw half paths/.style 2 args={% + % From https://tex.stackexchange.com/a/292108/71579 + decoration={show path construction, + lineto code={ + \draw [#1] (\tikzinputsegmentfirst) -- + ($(\tikzinputsegmentfirst)!0.5!(\tikzinputsegmentlast)$); + \draw [#2] ($(\tikzinputsegmentfirst)!0.5!(\tikzinputsegmentlast)$) + -- (\tikzinputsegmentlast); + } + }, decorate + }} + \begin{figure}[b]%[htbp] + \centering + \begin{tikzpicture}[yscale=0.7, scale=0.7] + \pgfmathtruncatemacro{\size}{4} + \pgfmathtruncatemacro{\width}{2} + \pgfmathtruncatemacro{\sizem}{\size-1} + \pgfmathtruncatemacro{\smallbasex}{4} + \pgfmathtruncatemacro{\smallbasey}{\size/2} + \pgfmathtruncatemacro{\smallstopx}{\smallbasex+\width} + \pgfmathtruncatemacro{\smallstopy}{\smallbasey+1} + \foreach \i in {0,\sizem}{ + \pgfmathtruncatemacro{\j}{\i+1} + \draw (0, \i) -- (0, \j); + \draw (\width, \i) -- (\width, \j); + \draw[dotted] (0, \i) -- (\width, \i); + \draw[dotted] (0, \j) -- (\width, \j); + } + \draw[dashed] (0, 1) -- (0, \sizem); + \draw[dashed] (\width, 1) -- (\width, \sizem); + \draw (0, 0) -- (\width, 0); + \draw (0, \size) -- (\width, \size); + \draw (\smallbasex,\smallbasey) --
(\smallstopx,\smallbasey) -- (\smallstopx,\smallstopy) -- (\smallbasex,\smallstopy) -- cycle; + \foreach \i in {0,\sizem}{ + \pgfmathtruncatemacro{\j}{\i+1} + \draw[dotted] (\width, \i) -- (\smallbasex, \smallbasey); + \draw[dotted] (\width, \j) -- (\smallbasex, \smallstopy); + \pgfmathsetmacro{\xleft}{\width} + \pgfmathsetmacro{\xright}{\smallbasex}%{\width/2.0+\smallbasex/2.0} + \pgfmathsetmacro{\yleft}{\i + 0.5} + \pgfmathsetmacro{\yright}{\smallbasey + 0.5} + \path [draw half paths={solid, -latex}{draw=none}] (\xleft, \yleft) -- (\xright, \yright); + } + \draw[decorate,line width=1pt,decoration={brace,raise=0.2cm}] (0, 0) -- (0, \size) node [pos=0.5, xshift=-1cm] {virtual}; + \draw[decorate,line width=1pt,decoration={brace,mirror,raise=0.2cm}] (\smallstopx, \smallbasey) -- (\smallstopx, \smallstopy) node [pos=0.5, xshift=1.2cm] {physical}; + \end{tikzpicture} + \caption{\label{fig:global_shared_malloc}SMPI shared malloc mechanism: large areas of virtual memory are cyclically mapped onto the same physical pages.}\vspace{-1em} + \end{figure} + #+END_EXPORT +** Application Behavior Modeling +#+LATEX: \label{sec:appmodeling} +In Section\ref{sec:relwork} we explained that SMPI relies on the /online/ simulation approach. +Since SimGrid is a sequential simulator, SMPI maps every MPI process of the application onto a +lightweight simulation thread. These threads are then run one at a +time, \ie in mutual exclusion. +Every time a thread enters an MPI call, +SMPI takes control and the time that was spent +computing (isolated from the other threads) since the previous +MPI call can be injected into the simulator as a virtual delay. + +Mapping MPI processes to threads of a single +process effectively folds them into the same address space. +Consequently, global variables in the MPI application are shared +between threads unless these variables are /privatized/ and the +simulated MPI ranks thus isolated from each other.
Several +technical solutions are possible to handle this issue\cite{smpi}. The +default strategy in SMPI consists of making a copy of the =data= +segment (containing all global variables) per MPI rank at startup and, +when context switching to another rank, remapping the =data= segment via =mmap= to the private copy of that rank. +SMPI also implements another mechanism relying on the =dlopen= +function that saves calls to =mmap= when context switching. + +This causes online simulation to be expensive in terms of both simulation time and memory +since the whole parallel application is executed on a single node. +To deal with this, SMPI provides two simple annotation mechanisms: +- *Kernel sampling*: Control flow is in many cases + independent of the computation results. This allows + computation-intensive kernels (\eg BLAS kernels for HPL) + to be skipped during the simulation. For this purpose, SMPI + supports annotation of regular kernels through several macros + such as =SMPI_SAMPLE_LOCAL= and =SMPI_SAMPLE_GLOBAL=. The regularity allows SMPI to execute these + kernels a few times, estimate their cost and skip the kernel in + the future by deriving its cost from these samples, hence cutting + simulation time significantly. Skipping kernels renders the + content of some variables invalid but in simulation, only the + behavior of the application and not the correctness of computation + results is of concern. +- *Memory folding*: SMPI provides the =SMPI_SHARED_MALLOC= (=SMPI_SHARED_FREE=) macros to + replace calls to =malloc= (=free=). They indicate that some data structures can safely be + shared between processes and that the data they contain is not + critical for the execution (\eg an input matrix) and may + even be overwritten. + =SMPI_SHARED_MALLOC= works as follows (see Figure\ref{fig:global_shared_malloc}): a single block of physical memory (of default size \SI{1}{\mega\byte}) for the whole + execution is allocated and shared by all MPI processes.
+ A range of virtual addresses corresponding to a specified size is reserved and cyclically mapped onto the previously obtained + physical address. + This mechanism allows applications to obtain a nearly constant memory + footprint, regardless of the size of the actual allocations. + + # At the first call to =SMPI_SHARED_MALLOC=, a temporary file is created. The file descriptor is a global variable, + # accessible by all the MPI processes, since they are implemented by POSIX threads. + + # At every call to =SMPI_SHARED_MALLOC=, a first call to =mmap= is done with the required size and the flag =MAP_ANONYMOUS= + # (thus without any file descriptor). The effect of this call is to reserve the whole interval of virtual + # addresses. Then, for each sub-interval, a new call to =mmap= is done with the temporary file. The address of the + # sub-interval itself is passed with the flag =MAP_FIXED=, which forces the mapping to keep the same virtual address. + # As a result, each of these sub-intervals of virtual addresses are mapped onto a same interval of physical + # addresses. We therefore have a block of virtual addresses of arbitrary size backed by a constant amount of physical + # memory. Since there are almost no computations left, this is harmless with respect to the simulation. Note that such + # allocations cannot be fully removed as many parts of the code + # still access it from time to time. + +* Improving SMPI Emulation Mechanisms and Preparing HPL +#+LaTeX: \label{sec:em} + +We now present our changes to SimGrid and HPL that were +required for a scalable and faithful simulation. We provide +only a brief evaluation of our modifications and refer the +reader interested in details to\cite{cornebize:hal-01544827} and our laboratory +#+LaTeX: notebook\footnote{See \texttt{journal.org} at \url{https://github.com/Ezibenroc/simulating_mpi_applications_at_scale/}}. 
+For our experiments in this section, we used a single core from nodes
+of the Nova cluster provided by the Grid'5000 testbed\cite{grid5000} with
+\SI{32}{\giga\byte} RAM, two 8-core Intel Xeon E5-2620 v4
+processors at \SI{2.1}{\GHz} and Debian Stretch (kernel 4.9).
+
+** Kernel modeling
+   As explained in Section\ref{sec:con:diff}, faithful prediction
+   of HPL necessitates emulation, \ie executing the code.
+   HPL relies heavily on BLAS kernels such as =dgemm= (for matrix-matrix multiplication) or =dtrsm= (for solving
+   an equation of the form $Ax=b$). An analysis of an HPL
+   simulation with $64$ processes and a very small matrix of order
+   $30,000$ showed that roughly \SI{96}{\percent} of
+   the time is spent in these two very regular kernels.
+   For larger matrices, these kernels consume
+   an even larger share of the computation time. Since these
+   kernels do not influence the control flow, simulation time can
+   be reduced by substituting =dgemm= and =dtrsm= function calls
+   with a performance model of the respective kernel.
+   Figure\ref{fig:macro_simple} shows an example of this
+   macro-based mechanism that allows us to keep HPL code modifications to an absolute
+   minimum. The value =(1.029e-11)= represents the inverse of the
+   flop rate for this computation kernel and was obtained
+   through calibration. The estimated time for the real
+   kernel is calculated based on the parameters and eventually
+   passed on to =smpi_execute_benched=, which advances the clock of the executing
+   rank by this estimate through entering a sleep state.
+   The effect on simulation time for a small scenario is depicted in Figure\ref{fig:kernel_sampling}.
+   On the one hand, this modification speeds up the simulation by
+   orders of magnitude, especially when the matrix order
+   grows. On the other hand, this kernel model leads to an
+   optimistic estimation of the flop rate.
This may
+   be caused by inaccuracies in our model as well as by the fact
+   that the initial emulation is generally more sensitive to pre-emptions,
+   \eg by the operating system, and therefore more likely to be
+   pessimistic compared to a real execution.
+   # #+LATEX: \CH{Re-work this. I don't like that we talk about inaccuracies in our model. Shouldn't the pre-emptions be modeled already? We did rely on measurements! "Absence of performance variability when kernel models are used."}
+   # #+LATEX: \TC{I don't get the explanation about the inaccuracies. I think OS preemptions is one of the smallest factors here, especially since in real executions they will certainly fix this issue (e.g. with chrt --fifo 99) whereas in the calibration I did not take care of that.}
+
+#+BEGIN_EXPORT latex
+\begin{figure}%[!htb]
+% \null\vspace{-1cm}
+  \centering
+  \subfigure[Non-intrusive macro replacement.\label{fig:macro_simple}]{
+    \begin{minipage}[b]{\linewidth}
+      \lstset{frame=bt,language=C,numbers=none,escapechar=|}\lstinputlisting{HPL_dgemm_macro_simple.c}
+    \end{minipage}}
+  \subfigure[Gain in terms of simulation time.\label{fig:kernel_sampling}]{
+    \begin{minipage}[b]{\linewidth}
+      \includegraphics[width=\linewidth,page=2]{figures/validation_kernel_modeling.pdf}
+    \end{minipage}}
+  \caption{Replacing the calls to computationally expensive functions by a model significantly reduces simulation time.}\vspace{-1em}
+\end{figure}
+#+END_EXPORT
+
+*** Hidden section with estimation of the quality/speed of the simulation :noexport:
+Inspired by the entry "2017-11-15 Wednesday:
+Regenerating the validation plot for smpi_execute" of [[file:~/Work/Journals/tom_cornebize/m2_internship_journal/journal.org::*2017-11-15%20Wednesday][Tom's journal]] ([[https://github.com/Ezibenroc/m2_internship_journal/tree/master/journal.org][Github version]]).
+ +#+begin_src R :results output :session *R* :exports both +library(ggplot2) +library(gridExtra) +library(grid) +old <- read.csv("/home/alegrand/Work/SimGrid/tom/m2_internship_journal/validation/result_size_L0.csv") +new <- read.csv("/home/alegrand/Work/SimGrid/tom/m2_internship_journal/validation/result_size_L1.csv") +old$kernel_sampling = FALSE +new$kernel_sampling = TRUE +results = rbind(old, new) +generic_do_plot <- function(plot, fixed_shape=TRUE) { +# For xrange, see https://stackoverflow.com/questions/7705345/how-can-i-extract-plot-axes-ranges-for-a-ggplot2-object +# old version for xrange (broken) +# xrange = ggplot_build(plot)$panel$ranges[[1]]$x.range +# new version for xrange (may break in the next ggplot update...) + xrange = ggplot_build(plot)$layout$panel_ranges[[1]]$x.range + xwidth = xrange[2] - xrange[1] + if(fixed_shape) { + point = stat_summary(fun.y = mean, geom="point", shape=21) + } + else { + point = stat_summary(fun.y = mean, geom="point") + } + return(plot + + stat_summary(fun.data = mean_se, geom = "errorbar", width=xwidth/20)+ + stat_summary(fun.y = mean, geom="line")+ + point+ + theme_bw()+ scale_color_brewer(palette="Set1") + + expand_limits(x=0, y=0)) +} + +# From https://stackoverflow.com/a/38420690/4110059 +grid_arrange_shared_legend <- function(..., nrow = 1, ncol = length(list(...)), position = c("bottom", "top", "right")) { + + plots <- list(...) 
+ position <- match.arg(position) + g <- ggplotGrob(plots[[1]] + theme(legend.position = position))$grobs + legend <- g[[which(sapply(g, function(x) x$name) == "guide-box")]] + lheight <- sum(legend$height) + lwidth <- sum(legend$width) + gl <- lapply(plots, function(x) x + theme(legend.position = "none")) + gl <- c(gl, nrow = nrow, ncol = ncol) + + combined <- switch(position, + "bottom" = arrangeGrob(do.call(arrangeGrob, gl), + legend, + ncol = 1, + heights = unit.c(unit(1, "npc") - lheight, lheight)), + "top" = arrangeGrob(legend, do.call(arrangeGrob,gl), + ncol = 1, + heights = unit.c(lheight, unit(1, "npc") - lheight)), + "right" = arrangeGrob(do.call(arrangeGrob, gl), + legend, + ncol = 2, + widths = unit.c(unit(1, "npc") - lwidth, lwidth))) + grid.newpage() + grid.draw(combined) + +} +#+end_src + +#+RESULTS: + +#+begin_src R :file figures/validation_kernel_modeling.pdf :results value graphics :results output :session *R* :exports both :width 6.2 :height 3.5 +plot1 = generic_do_plot(ggplot(results, aes(x=size, y=Gflops, color=kernel_sampling, linetype=kernel_sampling))) + + labs(colour="Kernel modeling") + + labs(linetype="Kernel modeling") + + xlab('Matrix order') + + ylab('Performance [Gflop/s]') + + ggtitle("Performance estimation\n(P=Q=8, i.e., 64 MPI process)") +plot2 = generic_do_plot(ggplot(results, aes(x=size, y=simulation_time, color=kernel_sampling, linetype=kernel_sampling))) + + labs(colour="Kernel modeling") + + labs(linetype="Kernel modeling") + + xlab('Matrix order') + + ylab('Time [seconds]') + + ggtitle("Simulation time\n(P=Q=8, i.e., 64 MPI process)") + +grid_arrange_shared_legend(plot2, plot1, ncol=2, position="top") +#+end_src + +#+RESULTS: +[[file:figures/validation_kernel_modeling.pdf]] + + + +*** Hidden section with macro code :noexport: +#+BEGIN_SRC C :exports none :tangle HPL_dtrsm_macro_real.c +#define |\color{colorfuncall}HPL\_dtrsm|(layout, Side, Uplo, TransA, Diag, M, N, alpha, A, lda, B, ldb) ({ \ + double expected_time; \ + 
double coefficient, intercept; \
+  if((M) > 512 && (N) > 512) { \
+    coefficient = (double)SMPI_DTRSM_PHI_COEFFICIENT; \
+    intercept = (double)SMPI_DTRSM_PHI_INTERCEPT; \
+  } else { \
+    coefficient = (double)SMPI_DTRSM_CPU_COEFFICIENT; \
+    intercept = (double)SMPI_DTRSM_CPU_INTERCEPT; \
+  } \
+  if((Side) == HplLeft) { \
+    expected_time = coefficient*((double)(M))*((double)(M))*((double)(N)); \
+  } else { \
+    expected_time = coefficient*((double)(M))*((double)(N))*((double)(N)); \
+  } \
+  expected_time += intercept; \
+  if(expected_time > 0) \
+    |\color{colorfuncall}smpi\_execute\_benched|(expected_time); \
+})
+#+END_SRC
+
+#+BEGIN_SRC C :exports none :tangle HPL_dtrsm_macro_simple_old.c
+#define |\color{colorfuncall}HPL\_dtrsm|(layout, Side, Uplo, TransA, Diag, M, N, alpha, A, lda, B, ldb) ({ \
+  double expected_time = (9.882e-12)*((double)M)*((double)M)*((double)N) + 4.329e-02; \
+  if(expected_time > 0) \
+    |\color{colorfuncall}smpi\_execute\_benched|(expected_time); \
+})
+#+END_SRC
+
+#+BEGIN_SRC C :exports none :tangle HPL_dtrsm_macro_simple.c
+#define |\color{colorfuncall}HPL\_dtrsm|(layout, Side, Uplo, TransA, Diag, \
+                  M, N, alpha, A, lda, B, ldb) ({ \
+  double expected_time = (9.882e-12)*((double)M)* \
+                         ((double)M)*((double)N) + 4.329e-02; \
+  if(expected_time > 0) \
+    |\color{colorfuncall}smpi\_execute\_benched|(expected_time); \
+})
+#+END_SRC
+
+#+BEGIN_SRC C :exports none :tangle HPL_dgemm_macro_simple.c
+#define |\color{colorfuncall}HPL\_dgemm|(layout, TransA, TransB, \
+                  M, N, K, alpha, A, lda, B, ldb, beta, C, ldc) ({ \
+  double expected_time = (1.029e-11)*((double)M)* \
+                         ((double)N)*((double)K) + 2.737e-02; \
+  if(expected_time > 0) \
+    |\color{colorfuncall}smpi\_execute\_benched|(expected_time); \
+})
+#+END_SRC
+
+#+BEGIN_EXPORT latex
+\CH{Found this in Tom's logbook. Check if this is the final version. Also, we can apparently just call \texttt{make SMPI\_OPTS=-DSMPI\_OPTIMIZATION} (what about \texttt{arch=SMPI}?).
See his logbook}
+#+END_EXPORT
+** Adjusting the behavior of HPL
+#+LaTeX: \label{sec:hplchanges}
+
+HPL uses pseudo-randomly generated
+matrices that need to be set up every time HPL is executed. Neither the time
+spent on this setup nor the validation of the computed result is
+considered in the reported \si{\giga\flops} performance.
+Since we replaced the
+computations by kernel models, result validation is
+meaningless anyway. As neither
+phase has an impact on the reported performance, we can safely
+skip both.
+
+In addition to the main computation kernels =dgemm= and =dtrsm=,
+we identified seven other BLAS functions through
+profiling as computationally expensive enough to justify specific
+handling: =dgemv=, =dswap=, =daxpy=,
+=dscal=, =dtrsv=, =dger= and =idamax=. Similarly, a significant amount of time was
+spent in fifteen functions implemented in HPL:
+=HPL_dlaswp*N=, =HPL_dlaswp*T=, =HPL_dlacpy= and =HPL_dlatcpy=.
+# =HPL_dlaswp00N=, =HPL_dlaswp01N=, =HPL_dlaswp01T=, =HPL_dlaswp02N=, =HPL_dlaswp03N=,
+# =HPL_dlaswp03T=, =HPL_dlaswp04N=, =HPL_dlaswp04T=, =HPL_dlaswp05N=, =HPL_dlaswp05T=,
+# =HPL_dlaswp06N=, =HPL_dlaswp06T=, =HPL_dlaswp10N=, =HPL_dlacpy= and =HPL_dlatcpy=.
+
+All of these functions are called during the
+LU factorization and hence impact the performance measured by HPL; however, because of
+the removal of the =dgemm= and =dtrsm= computations, they all operate on
+bogus data and hence also produce bogus data. We also determined
+through experiments that their impact on the performance prediction is
+minimal and hence, for the sake of simplicity, modeled them as being instantaneous.
+
+Note that HPL
+implements an LU factorization with partial pivoting and a special
+treatment of the =idamax= function that returns the index of the first
+element equaling the maximum absolute value.
Although we ignored the +cost of this function as well, we set its return value to an arbitrary +value to make the simulation fully deterministic. +We confirmed that this modification is harmless in terms of performance prediction while it +speeds up the simulation by an additional factor of $\approx3$ to $4$ +on small ($N=30,000$) and even more on large scenarios. +** Memory folding +As explained in Section\ref{sec:smpi}, when emulating an application +with SMPI, all MPI processes are run within the same simulation process on a single +node. The memory consumption of the simulation can therefore quickly reach +several \si{\tera\byte} of RAM. + +Yet, as we no longer operate on real data, storing the whole +input matrix $A$ is needless. However, since only a minimal portion of the code was +modified, some functions may still read or write some parts of the matrix. +It is thus not possible to simply remove the memory allocations of +large data structures altogether. Instead, SMPI's =SHARED_MALLOC= mechanism can be used +to share unimportant data structures between all ranks, minimizing the memory footprint. 
+ +#+BEGIN_EXPORT latex +\tikzstyle{switch}=[draw, circle, minimum width=1cm, minimum height = 1cm] +\tikzstyle{compute}=[draw, rectangle, minimum width=0.5cm, minimum height = 0.5cm, node distance=0.5cm] +\tikzstyle{base}=[ellipse, minimum width=2cm, minimum height = 0.5cm, node distance = 0.5cm] +\tikzstyle{bigswitch}=[base, draw] +\begin{figure}%[htbp] + \centering + {\begin{minipage}{1.0\linewidth} + \subfigure[Structure of the panel in HPL.\label{fig:panel_structure}]{\small + \begin{minipage}[b]{\linewidth}\centering + \begin{tikzpicture}[yscale=.6,scale=0.8] + \draw [fill=gray] (3, 2) -- (6, 2) -- (6, 3) -- (3, 3) -- cycle; + \draw (0, 2) -- (9, 2) -- (9, 3) -- (0, 3) -- cycle; + \draw[dashed] (3, 2) -- (3, 3); + \draw[dashed] (6, 2) -- (6, 3); + \node(1) at (1.5, 2.5) {matrix parts}; + \node(2) at (4.5, 2.5) {indices}; + \node(3) at (7.5, 2.5) {matrix parts}; + \draw[decorate,line width=1pt,decoration={brace,raise=0.2cm}] (0, 3) -- (3, 3) node [pos=0.5, yshift=0.5cm] {can be shared}; + \draw[decorate,line width=1pt,decoration={brace,raise=0.2cm}] (6, 3) -- (9, 3) node [pos=0.5, yshift=0.5cm] {can be shared}; + \draw[decorate,line width=1pt,decoration={brace,raise=0.2cm, mirror}] (3, 2) -- (6, 2) node [pos=0.5, yshift=-0.5cm] {must not be shared}; + \end{tikzpicture} + \end{minipage}} + \subfigure[Reusing panel allocation from an iteration to another.\label{fig:panel_reuse}]{\small + \begin{minipage}[b]{\linewidth}\centering + \begin{tikzpicture}[yscale=.6] + \draw [fill=gray] (2, 1) -- (4, 1) -- (4, 1.5) -- (2, 1.5) --cycle; + \draw (0, 1) -- (6, 1) -- (6, 1.5) -- (0, 1.5) -- cycle; + \draw[dashed] (2, 1) -- (2, 1.5); + \draw[dashed] (4, 1) -- (4, 1.5); + + \draw [fill=gray] (2, 0) -- (3, 0) -- (3, .5) -- (2, .5) --cycle; + \draw (1, 0) -- (4, 0) -- (4, .5) -- (1, .5) -- cycle; + \draw[dashed] (2, 0) -- (2, .5); + \draw[dashed] (3, 0) -- (3, .5); + + \draw[-latex] (2, 1) -- (2, .5); + \draw[decorate,line width=1pt,decoration={brace,raise=0.2cm}] (0, 1.5) -- 
(6, 1.5) node [pos=0.5, yshift=0.5cm] {initial buffer};
+      \draw[decorate,line width=1pt,decoration={brace,raise=0.2cm, mirror}] (1, 0) -- (4, 0) node [pos=0.5, yshift=-0.5cm] {current buffer};
+    \end{tikzpicture}
+  \end{minipage}
+  }
+  \end{minipage}}
+  \caption{Panel structure and allocation strategy when simulating.\label{fig:panel}}\vspace{-1em}
+\end{figure}
+#+END_EXPORT
+
+The two largest allocated data structures in HPL are the input matrix =A=
+(with a size of typically several \si{\giga\byte} per process) and the =panel= which contains
+information about the sub-matrix currently being factorized. This sub-matrix
+typically occupies a few hundred \si{\mega\byte} per process.
+Although using the default =SHARED_MALLOC= mechanism works flawlessly
+for =A=, a more careful strategy needs to be used for the
+=panel=. Indeed, the =panel= is an intricate data structure with both \texttt{int}s
+(accounting for matrix indices, error codes, MPI tags, and pivoting information)
+and \texttt{double}s (corresponding to a copy of a sub-matrix of =A=). To
+optimize data transfers, HPL flattens this structure into a single
+allocation of \texttt{double}s (see
+Figure\ref{fig:panel_structure}). Using a fully shared memory
+allocation for the =panel= therefore leads to index corruption that results in
+classic invalid memory accesses as well as communication
+deadlocks, as processes may not send to or receive from the correct
+process. Since \texttt{int}s and \texttt{double}s are stored in
+non-contiguous parts of this flat allocation, it is
+essential to have a mechanism that preserves the process-specific
+content. We have thus introduced the macro
+=SMPI_PARTIAL_SHARED_MALLOC= that works as follows:
+~mem = SMPI_PARTIAL_SHARED_MALLOC(500, {27,42, 100,200}, 2)~.
+In this example, 500 bytes are allocated in =mem= with the elements
+=mem[27]=, ..., =mem[41]= and =mem[100]=, ..., =mem[199]= being shared between
+processes (they are therefore generally completely corrupted) while all other
+elements remain private. To apply this to HPL's =panel= data
+structure and partially share it between processes, we only had to modify a few lines.
+
+Designating memory explicitly as private, shared or partially shared
+helps with both memory management and overall performance.
+As SMPI is internally aware of the memory's
+visibility, it can avoid calling =memcpy= when large messages
+containing shared segments are sent from one MPI rank to another.
+For fully private or partially shared segments, SMPI
+identifies and copies only those parts that are process-dependent
+(private) into the corresponding buffers on the receiver side.
+
+HPL simulation times were considerably improved in our experiments because
+the =panel=, the most frequently transferred data structure,
+is partially shared, with only a small part being private.
+The additional error introduced by this technique was negligible (below \SI{1}{\percent}) while the
+memory consumption was lowered significantly: for a matrix of order $40,000$ and $64$ MPI processes, the memory consumption
+decreased from about \SI{13.5}{\giga\byte} to less than \SI{40}{\mega\byte}.
+** Panel reuse
+HPL \texttt{malloc}s/\texttt{free}s panels in each
+iteration, with the size of the panel strictly decreasing from
+iteration to iteration. As we explained above, the partial sharing of panels requires
+many calls to =mmap= and introduces an overhead that makes these repeated
+allocations/frees a bottleneck. Since
+the very first allocation can fit all subsequent panels, we modified
+HPL to allocate only the first panel and reuse it for subsequent
+iterations (see Figure\ref{fig:panel_reuse}).
+
+We consider this optimization harmless with respect to simulation
+accuracy as the maximum additional error that we observed was always less than \SI{1}{\percent}. Simulation
+time is reduced significantly, although the achieved speed-up is less impressive than for the previous
+optimizations: for a very small matrix of order $40,000$ and $64$ MPI processes,
+the simulation time decreases by four seconds, from \SI{20.5}{\sec} to
+\SI{16.5}{\sec}. This is mainly due to a reduction of system time,
+namely from \SI{5.9}{\sec} to \SI{1.7}{\sec}. The number of page faults decreased from $2$ million to
+$0.2$ million, confirming the devastating effect these allocations/deallocations would have at scale.
+** MPI process representation (mmap vs. dlopen)
+We already explained in Section\ref{sec:appmodeling} that SMPI
+supports two mechanisms to keep local static and global variables
+private to each rank, even though they run in the same process. In
+this section, we discuss the impact of this choice.
+
+- *mmap* When =mmap= is used, SMPI copies the =data= segment on startup for
+  each rank into the heap. When control is transferred from one rank
+  to another, the =data= segment is =mmap='ed to the location of the other
+  rank's copy on the heap. Hence, all ranks have the same addresses in
+  the virtual address space at their disposal, although =mmap= ensures
+  they point to different physical addresses. This inevitably means
+  that caches must be flushed to ensure that no data of one
+  rank leaks into the other rank, making =mmap= a rather expensive
+  operation.
+
+# \TOM{Can you tell me how often these operations were executed, as
+# you've already done in your journal on 2017-04-11 ("Looking at the
+# syscalls")?}
+- *dlopen* With =dlopen=, copies of the global variables are still made
+  but they are stored inside the =data= segment as opposed to the
+  heap.
When switching from one rank to another, the starting virtual
+  address for the storage is readjusted rather than the target of the
+  addresses. This means that each rank has distinct addresses for
+  global variables. The main advantage of this approach is that caches
+  do not need to be flushed as is the case for the =mmap= approach,
+  because data consistency can always be guaranteed.
+\noindent
+*Impact of the choice of mmap/dlopen*
+The choice of =mmap= or =dlopen= influences the simulation time indirectly
+through its impact on system/user time and page faults, \eg for a
+matrix of order $80,000$ and $32$ MPI processes, the number
+of minor page faults drops from \num{4412047} (with =mmap=) to
+\num{6880} (with =dlopen=). This results in a reduction of system time from
+\SI{10.64}{\sec} (out of \SI{51.47}{\sec} in total) to
+\SI{2.12}{\sec}. Obviously, the larger the matrix and the number of
+processes, the larger the number of context switches during the
+simulation, and thus the higher the gain.
+
+# See Tom's journal (Performance evaluation of the privatization
+# mechanism: =dlopen= vs =mmap= ) ; there are some graphs that we might be
+# able to use, such as in
+# https://github.com/Ezibenroc/m2_internship_journal/blob/master/simgrid_privatization/
+
+** Huge pages
+For larger matrix orders (\ie $N$ larger than a few hundred thousand), the performance of the simulation quickly
+deteriorates as the memory consumption rises rapidly.
+
+We explained already how we fold the memory in order to reduce the /physical/
+memory usage. The /virtual/ memory, on the other hand, is still
+allocated for every process since the allocation calls are still executed.
+Without a reduction of allocated virtual addresses, the page table
+rapidly becomes too large to fit in the memory of a single node.
More +precisely, the size of the page table containing pages of size \SI{4}{\kibi\byte} can be computed as: + + #+LATEX: \[ PT_{size}(N) = \frac{N^2 \cdot \texttt{sizeof(double)}}{4,096} \cdot \texttt{sizeof(pointer)} \] + +This means that the addresses in the page table for a matrix of order $N=4,000,000$ +consume $PT_{size}(4,000,000) = \num{2.5e11}$ bytes, \ie +\SI{250}{\giga\byte} on a system where double-precision floating-point numbers +and addresses take 8 bytes. Thankfully, the x86-64 architecture supports several page +sizes, known as ``huge pages'' in Linux. Typically, these pages are +around \SI{2}{\mebi\byte} (instead of \SI{4}{\kibi\byte}), although other sizes +(\SIrange{2}{256}{\mebi\byte}) are possible as well. +Changing the page size requires administrator (root) privileges as the +Linux kernel support for /hugepages/ needs to be activated and a +=hugetlbfs= file system must be mounted. After at least one huge +page has been allocated, the path of the allocated file system can then be +passed on to SimGrid. +Setting the page size to \SI{2}{\mebi\byte} reduces drastically the page table size. +For example, for a matrix of order $N=4,000,000$, it shrinks from \SI{250}{\giga\byte} +to \SI{0.488}{\giga\byte}. + +# Unfortunately, changing the page size requires administrator (root) privileges as the +# Linux kernel support for /hugepages/ needs to be activated and a +# =hugetlbfs= file system must be mounted. After at least one huge +# page was allocated, the path of the allocated file system can then be +# passed on to SimGrid that will then pass the flag =MAP_HUGETLB= +# to =mmap= in =SMPI_SHARED_MALLOC= and replace the file given to =mmap= by +# a file opened in the =hugetlbfs= file system. +# #+LATEX: \CH{I think this is too detailed. 
Who cares if we pass MAP\_HUGETLB?} +* Scalability Evaluation +#+LaTeX: \label{sec:scalabilityevol} + +#+BEGIN_EXPORT latex +\begin{figure}[t] + \centering + \includegraphics[width=\linewidth,page=2]{./figures/scalability_plot_size.pdf} +% \includegraphics[width=\linewidth,page=2]{./figures/scalability_plot_nbproc.pdf} + \caption{Time complexity and memory consumption are linear in the number of processes but remain mildly quadratic with matrix rank.}\vspace{-1em} + \label{fig:hpl_scalability} + \labspace +\end{figure} +#+END_EXPORT + +#+BEGIN_EXPORT latex +\begin{figure*}%[!htb] + \centering + % \begin{minipage}[b]{.27\textwidth} + % \includegraphics[width=\linewidth,page=2]{./figures/stampede_knc_model.pdf} + % \vspace{-2em} + % \caption{Automatic offloading on the KNC depends on matrix dimensions.} + % \vspace{-1em} + % \label{fig:hpl_mkl} + % \end{minipage}~~~ + \begin{minipage}[b]{.7\textwidth}\centering + \scalebox{.88}{\begin{tabular}{l|r|r|r|r} + & \multicolumn{2}{c|}{CPU (\texttt{CPU})} & \multicolumn{2}{c}{KNC (\texttt{PHI}) }\\ + & Coefficient $[\si{\sec\per\flop}]$& Intercept $[\sec]$ & Coefficient $[\si{\sec\per\flop}]$& Intercept $[\sec]$ \\ + \hline + \texttt{DGEMM} & \num{1.029e-11} & \num{2.737e-02} & \num{1.981e-12} & \num{6.316e-01} \\ + \texttt{DTRSM} & \num{9.882e-12} & \num{4.329e-02} & \num{1.954e-12} & \num{5.222e-01} + \end{tabular}}\medskip\\ + \lstset{frame=bt,language=C,numbers=none,escapechar=|}\lstinputlisting{HPL_dtrsm_macro_real.c} + \caption{Modeling automatic offloading on KNC in MKL BLAS kernels.} + \vspace{-1em} + \label{fig:macro_real} + \end{minipage}~~~\begin{minipage}[b]{.27\textwidth} + \centering + \includegraphics[width=\linewidth,page=1]{./figures/stampede_calibration_send.png} + \caption{Modeling communication time on stampede. Each color is manually adjusted and + corresponds to a different synchronization mode + (eager, rendez-vous,...). 
}\vspace{-1em} + \label{fig:stampede_calibration} + \labspace + \end{minipage} +\end{figure*} +#+END_EXPORT + +# SMPI_DGEMM_COEFFICIENT=1.029e-11 SMPI_DGEMM_INTERCEPT=2.737e-02 SMPI_DGEMM_PHI_COEFFICIENT=1.981e-12 SMPI_DGEMM_PHI_INTERCEPT=6.316e-01 \ +# SMPI_DTRSM_COEFFICIENT=9.882e-12 SMPI_DTRSM_INTERCEPT=4.329e-02 SMPI_DTRSM_PHI_COEFFICIENT=1.954e-12 SMPI_DTRSM_PHI_INTERCEPT=5.222e-01" + +In Section\ref{sec:em} we explained the problems we encountered when trying +to run a large-scale simulation on a single node and how we solved them. +For the most part, we identified and eliminated bottlenecks one after +another while simultaneously making sure that the accuracy of our performance prediction was +not impacted. Certainly, the main goal was to reduce the +complexity from $\O(N^3) + \O(N^2\cdot{}P\cdot{}Q)$ to something more reasonable. +The $\O(N^3)$ was removed through skipping most computations. +Ideally, since there are $N/NB$ iterations (steps), +the complexity of simulating one step should be decreased to something independent of +$N$. SimGrid's fluid models, used to simulate communications, do not +depend on $N$. Therefore, the time to simulate a step of HPL should mostly depend on $P$ and +$Q$. Yet, some memory operations on the panel that are related to pivoting +are intertwined in HPL with collective communications, meaning that it +is impossible to completely get rid of the $\O(N)$ complexity without +modifying HPL more profoundly. + +Although our goal was to model and simulate HPL on the Stampede +platform, we decided to conduct a first evaluation on a +similar, albeit non-existing, platform comprising 4,096 8-core nodes +interconnected through a $\langle2;16,32;1,16;1,1\rangle$ fat-tree topology +built on ideal network links with a bandwidth of +\SI{50}{\giga\byte\per\sec} and a latency of \SI{5}{\micro\sec}. 
We ran
+simulations with $512$; $1,024$; $2,048$ or $4,096$ MPI processes and
+with matrices of orders \num{5e5}, \num{1e6}, \num{2e6} or \num{4e6}.
+The impact of the matrix order on total makespan and memory is illustrated in Figure\ref{fig:hpl_scalability}.
+With all previously described
+optimizations enabled, the simulation with the largest matrix took close to $47$ hours and consumed
+\SI{16}{\giga\byte} of memory whereas the smallest one took $20$ minutes and \SI{282}{\mega\byte} of memory.
+One can also see that, when the matrix order ($N$) is increased, memory consumption and
+simulation time both grow slightly faster than quadratically, as the number of matrix
+elements is $N^{2}$ and the number of steps of the algorithm also grows linearly with $N$.
+
+Moreover, all the simulations spend less than \SI{10}{\percent} of their execution time in kernel
+mode, which means the number of system calls is reasonably low.
+** Hidden section :noexport:
+Got data and code from the "2017-06-05 Monday: Plots for scalability
+test" section of Tom's journal:
+
+#+begin_src R :results output :session *R* :exports both
+library(ggplot2)
+library(ggrepel)
+library(reshape2)
+library(gridExtra)
+results = rbind(
+    read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/scalability/result_500000_512.csv'),
+    read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/scalability/result_500000_1024.csv'),
+    read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/scalability/result_500000_2048.csv'),
+    read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/scalability/result_500000_4096.csv'),
+    read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/scalability/result_1000000_512.csv'),
+    read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/scalability/result_1000000_1024.csv'),
+    read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/scalability/result_1000000_2048.csv'),
+
read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/scalability/result_1000000_4096.csv'), + read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/scalability/result_2000000_512.csv'), + read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/scalability/result_2000000_1024.csv'), + read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/scalability/result_2000000_2048.csv'), + read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/scalability/result_2000000_4096.csv'), + read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/scalability/result_4000000_512.csv'), + read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/scalability/result_4000000_1024.csv'), + read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/scalability/result_4000000_2048.csv'), + read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/scalability/result_4000000_4096.csv') +) +results$simulation_time = results$simulation_time/3600 +results$memory_size = results$memory_size * 1e-9 +number_verb <- function(n) { + return(format(n,big.mark=",",scientific=FALSE)) +} +results$size_verb = factor(unlist(lapply(results$size, number_verb)), levels = c('500,000','1,000,000','2,000,000','4,000,000')) +results$nb_proc_verb = factor(unlist(lapply(results$nb_proc, number_verb)), levels = c('512', '1,024', '2,048', '4,096')) +results +#+end_src + +#+RESULTS: +#+begin_example + topology nb_roots nb_proc size full_time time Gflops +1 2;16,32;1,16;1,1;8 16 512 500000 91246.1 91246.02 913.3 +2 2;16,32;1,16;1,1;8 16 1024 500000 46990.1 46990.02 1773.0 +3 2;16,32;1,16;1,1;8 16 2048 500000 24795.5 24795.50 3361.0 +4 2;16,32;1,16;1,1;8 16 4096 500000 13561.0 13561.01 6145.0 +5 2;16,32;1,16;1,1 16 512 1000000 716521.0 716521.00 930.4 +6 2;16,32;1,16;1,1 16 1024 1000000 363201.0 363201.04 1836.0 +7 2;16,32;1,16;1,1 16 2048 1000000 186496.0 186495.70 3575.0 +8 2;16,32;1,16;1,1;8 16 4096 1000000 97836.6 97836.54 6814.0 +9 2;16,32;1,16;1,1 
16 512 2000000 5685080.0 5685077.72 938.1 +10 2;16,32;1,16;1,1 16 1024 2000000 2861010.0 2861012.55 1864.0 +11 2;16,32;1,16;1,1 16 2048 2000000 1448900.0 1448899.09 3681.0 +12 2;16,32;1,16;1,1;8 16 4096 2000000 742691.0 742690.59 7181.0 +13 2;16,32;1,16;1,1;8 16 512 4000000 45305100.0 45305083.56 941.8 +14 2;16,32;1,16;1,1;8 16 1024 4000000 22723800.0 22723820.45 1878.0 +15 2;16,32;1,16;1,1;8 16 2048 4000000 11432900.0 11432938.62 3732.0 +16 2;16,32;1,16;1,1;8 16 4096 4000000 5787160.0 5787164.09 7373.0 + simulation_time application_time user_time system_time major_page_fault +1 0.3311083 204.992 1098.25 93.12 0 +2 0.6895222 441.897 2296.51 184.70 0 +3 1.4144361 872.425 4741.26 349.79 0 +4 3.1448889 1947.320 10640.63 679.53 0 +5 0.7319722 500.970 2367.19 259.91 0 +6 1.6771917 1036.960 5515.36 515.05 0 +7 3.4421944 2092.950 11389.36 995.39 0 +8 7.2368056 4362.660 24082.38 1966.10 0 +9 1.9263500 1169.660 6193.80 683.73 0 +10 4.2217500 2551.100 13714.01 1430.93 0 +11 8.9621111 5236.560 29357.92 2844.89 0 +12 18.0156389 10643.600 59444.40 5402.24 0 +13 4.8156944 3030.400 15090.31 1945.23 0 +14 10.6613611 6435.870 34249.71 3827.36 0 +15 23.2042222 13080.500 75523.95 7684.52 0 +16 47.1275000 26745.400 154314.76 15085.08 0 + minor_page_fault cpu_utilization uss rss page_table_size +1 960072 0.99 155148288 2055086080 10604000 +2 1054062 0.99 369696768 4383203328 21240000 +3 1282294 0.99 1012477952 9367576576 42912000 +4 1852119 0.99 3103875072 15318568960 87740000 +5 1916208 0.99 153665536 2317279232 10600000 +6 2002989 0.99 369676288 4837175296 21252000 +7 2154982 0.99 1010696192 7774138368 42908000 +8 2768705 0.99 3103895552 16934834176 87748000 +9 3801905 0.99 150765568 2758770688 10604000 +10 3872820 0.99 365555712 5273034752 21220000 +11 4038099 0.99 1009606656 7415914496 42884000 +12 4704339 0.99 3102445568 19464646656 87748000 +13 7663911 0.98 151576576 2056916992 10604000 +14 7725625 0.99 369872896 4120702976 21212000 +15 7917525 0.99 1012191232 9221050368 42880000 
+16 8550745 0.99 3113381888 20408209408 87808000 + memory_size size_verb nb_proc_verb +1 0.2825585 500,000 512 +2 0.4299489 500,000 1,024 +3 0.9628262 500,000 2,048 +4 2.8140421 500,000 4,096 +5 0.8944435 1,000,000 512 +6 1.0553098 1,000,000 1,024 +7 1.5811707 1,000,000 2,048 +8 3.4254070 1,000,000 4,096 +9 3.3384202 2,000,000 512 +10 3.4971116 2,000,000 1,024 +11 4.0274084 2,000,000 2,048 +12 5.9101348 2,000,000 4,096 +13 13.0790605 4,000,000 512 +14 13.2755579 4,000,000 1,024 +15 13.8251837 4,000,000 2,048 +16 15.7636690 4,000,000 4,096 +#+end_example + +#+begin_src R :results output :session *R* :exports both + library(ggplot2) + library(gridExtra) + library(grid) + + generic_do_plot <- function(plot, fixed_shape=TRUE) { + # For xrange, see https://stackoverflow.com/questions/7705345/how-can-i-extract-plot-axes-ranges-for-a-ggplot2-object + # old version for xrange (broken) + # xrange = ggplot_build(plot)$panel$ranges[[1]]$x.range + # new version for xrange (may break in the next ggplot update...) 
+ xrange = ggplot_build(plot)$layout$panel_ranges[[1]]$x.range + xwidth = xrange[2] - xrange[1] + if(fixed_shape) { + point = stat_summary(fun.y = mean, geom="point", shape=21) + } + else { + point = stat_summary(fun.y = mean, geom="point") + } + return(plot + + stat_summary(fun.data = mean_se, geom = "errorbar", width=xwidth/20)+ + stat_summary(fun.y = mean, geom="line")+ + point+ + theme_bw()+ + expand_limits(x=0, y=0)) + } + do_plot <- function(df, x, y, color, color_title, fixed_val, other_fixed_val=-1) { + if(y == "simulation_time") { + y_title = "Simulation time (seconds)" + title = "Simulation time" + } + else if(y == "memory_size") { + y_title = "Memory consumption (bytes)" + title = "Memory consumption" + } + else { + stopifnot(y == "Gflops") + y_title = "Performance estimation (Gflops)" + title = "Performance estimation" + } + if(x == "size") { + fixed_arg = "nb_proc" + x_title = "Matrix size" + title = paste(title, "for different matrix sizes\nUsing", fixed_val, "MPI processes") + } + else { + stopifnot(x == "nb_proc") + fixed_arg = "size" + x_title = "Number of processes" + title = paste(title, "for different number of processes\nUsing a matrix size of", format(fixed_val,big.mark=",",scientific=FALSE)) + } + sub_df = df[df[fixed_arg] == fixed_val,] + p = generic_do_plot(ggplot(sub_df, aes_string(x=x, y=y, linetype=color, color=color, group=color))) + + ggtitle(title)+ + xlab(x_title)+ + ylab(y_title)+ + labs(colour=color_title)+ + labs(linetype=color_title) + if(other_fixed_val != -1) { + rect <- data.frame(xmin=-Inf, xmax=Inf, ymin=-Inf, ymax=Inf) + my_xmin = other_fixed_val * 0.9 + my_xmax = other_fixed_val * 1.1 + my_ymax = max(sub_df[sub_df[x] == other_fixed_val,][y]) + y_delta = my_ymax * 0.1 + my_ymax = my_ymax + y_delta + my_ymin = min(sub_df[sub_df[x] == other_fixed_val,][y]) - y_delta + p = p + geom_rect(data=rect, aes(xmin=my_xmin, xmax=my_xmax, ymin=my_ymin, ymax=my_ymax),color="grey20", alpha=0.1, inherit.aes=FALSE) + } + return(p) + } + + # 
From https://stackoverflow.com/a/38420690/4110059 + grid_arrange_shared_legend <- function(..., nrow = 1, ncol = length(list(...)), position = c("bottom", "right")) { + + plots <- list(...) + position <- match.arg(position) + g <- ggplotGrob(plots[[1]] + theme(legend.position = position))$grobs + legend <- g[[which(sapply(g, function(x) x$name) == "guide-box")]] + lheight <- sum(legend$height) + lwidth <- sum(legend$width) + gl <- lapply(plots, function(x) x + theme(legend.position = "none")) + gl <- c(gl, nrow = nrow, ncol = ncol) + + combined <- switch(position, + "bottom" = arrangeGrob(do.call(arrangeGrob, gl), + legend, + ncol = 1, + heights = unit.c(unit(1, "npc") - lheight, lheight)), + "right" = arrangeGrob(do.call(arrangeGrob, gl), + legend, + ncol = 2, + widths = unit.c(unit(1, "npc") - lwidth, lwidth))) + grid.newpage() + grid.draw(combined) + + } + + do_multiple_plot <- function(df, x1, x2, y, color, color_title, fixed_x1, fixed_x2) { + my_ymax = max(df[y]) + return( + grid_arrange_shared_legend( + do_plot(df, x1, y, color, color_title, fixed_x1, fixed_x2) + expand_limits(x=0, y=my_ymax), + do_plot(df, x2, y, color, color_title, fixed_x2, fixed_x1) + expand_limits(x=0, y=my_ymax), + nrow=1, ncol=2 + )) + } + + do_four_plot <- function(df, x1, x2, y1, y2, color, color_title, fixed_x1, fixed_x2) { + my_y1max = max(df[y1]) + my_y2max = max(df[y2]) + return( + grid_arrange_shared_legend( + do_plot(df, x1, y1, color, color_title, fixed_x1, fixed_x2) + expand_limits(x=0, y=my_y1max), + do_plot(df, x2, y1, color, color_title, fixed_x2, fixed_x1) + expand_limits(x=0, y=my_y1max), + do_plot(df, x1, y2, color, color_title, fixed_x1, fixed_x2) + expand_limits(x=0, y=my_y2max), + do_plot(df, x2, y2, color, color_title, fixed_x2, fixed_x1) + expand_limits(x=0, y=my_y2max), + nrow=2, ncol=2 + )) + } +#+end_src + +#+RESULTS: + +#+begin_src R :file figures/scalability_2.pdf :results value graphics :results output :session *R* :exports both :width 4 :height 2.5 
+library(ggrepel) # provides geom_text_repel, used in the plots below +nbproc_time = generic_do_plot(ggplot(results, aes(x=nb_proc, y=simulation_time, color=size_verb))) + + xlab("Number of processes") + + ylab("Simulation time (hours)") + + labs(colour="Matrix size")+ + ggtitle("Simulation time for different number of processes")+ + theme(legend.position = "none")+ + geom_text_repel( + data = subset(results, nb_proc == max(nb_proc)), + aes(label = size_verb), + nudge_x = 45, + segment.color = NA, + show.legend = FALSE + ) +nbproc_time +#+end_src + +#+RESULTS: +[[file:figures/scalability_2.pdf]] + +#+begin_src R :file figures/scalability_4.pdf :results value graphics :results output :session *R* :exports both :width 4 :height 2.5 +nbproc_mem = generic_do_plot(ggplot(results, aes(x=nb_proc, y=memory_size, color=size_verb))) + + xlab("Number of processes") + + ylab("Memory consumption (gigabytes)") + + labs(colour="Matrix size")+ + ggtitle("Memory consumption for different number of processes")+ + theme(legend.position = "none")+ + geom_text_repel( + data = subset(results, nb_proc == max(nb_proc)), + aes(label = size_verb), + nudge_x = 45, + segment.color = NA, + show.legend = FALSE + ) +nbproc_mem +#+end_src + +#+RESULTS: +[[file:figures/scalability_4.pdf]] + + +#+begin_src R :file figures/scalability_1.pdf :results value graphics :results output :session *R* :exports both :width 4 :height 2.5 +size_time = generic_do_plot(ggplot(results, aes(x=size, y=simulation_time, color=nb_proc_verb))) + + xlab("Matrix rank") + + ylab("Simulation time (hours)") + + labs(colour="Number of processes")+ scale_color_brewer(palette="Set1")+ +# ggtitle("Simulation time for different matrix sizes")+ + theme(legend.position = "none")+ + geom_text_repel( + data = subset(results, size == max(size)), + aes(label = nb_proc_verb), + nudge_x = 45, + segment.color = NA, + show.legend = FALSE + ) +size_time +#+end_src + +#+RESULTS: +[[file:figures/scalability_1.pdf]] + +#+begin_src R :file figures/scalability_3.pdf :results value graphics :results output :session 
*R* :exports both :width 4 :height 2.5 +size_mem = generic_do_plot(ggplot(results, aes(x=size, y=memory_size, color=nb_proc_verb))) + + xlab("Matrix rank") + + ylab("Memory consumption (gigabytes)") + + labs(colour="Number of processes")+ +# ggtitle("Memory consumption for different matrix sizes")+ + theme(legend.position = "none")+scale_color_brewer(palette="Set1")+ + geom_text_repel( + data = subset(results, size == max(size)), + aes(label = nb_proc_verb), + nudge_x = 45, + segment.color = NA, + show.legend = FALSE + ) +size_mem +#+end_src + +#+RESULTS: +[[file:figures/scalability_3.pdf]] + +#+begin_src R :file figures/scalability_plot_size.pdf :results value graphics :results output :session *R* :exports both :width 7 :height 4 +grid_arrange_shared_legend(size_time, size_mem, nrow=1, ncol=2) +#+end_src + +#+RESULTS: +[[file:figures/scalability_plot_size.pdf]] + +#+begin_src R :file figures/scalability_plot_nbproc.pdf :results value graphics :results output :session *R* :exports both :width 8 :height 3.5 +grid_arrange_shared_legend(nbproc_time, nbproc_mem, nrow=1, ncol=2) +#+end_src + +#+RESULTS: +[[file:figures/scalability_plot_nbproc.pdf]] + + +#+begin_src R :results output :session *R* :exports both +fit_sim = lm(data=results, simulation_time ~ nb_proc*(size+I(size^2))) +summary(fit_sim) +#+end_src + +#+RESULTS: +#+begin_example + +Call: +lm(formula = simulation_time ~ nb_proc * (size + I(size^2)), + data = results) + +Residuals: + Min 1Q Median 3Q Max +-0.192256 -0.050079 -0.004809 0.045721 0.231054 + +Coefficients: + Estimate Std. Error t value Pr(>|t|) +(Intercept) -1.522e-01 1.866e-01 -0.815 0.4339 +nb_proc -1.162e-04 7.907e-05 -1.469 0.1725 +size 6.919e-08 2.214e-07 0.313 0.7610 +I(size^2) -8.691e-14 4.689e-14 -1.853 0.0935 . +nb_proc:size 1.608e-09 9.379e-11 17.142 9.64e-09 *** +nb_proc:I(size^2) 3.450e-16 1.987e-17 17.366 8.49e-09 *** +--- +Signif. 
codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 + +Residual standard error: 0.1343 on 10 degrees of freedom +Multiple R-squared: 0.9999, Adjusted R-squared: 0.9999 +F-statistic: 2.46e+04 on 5 and 10 DF, p-value: < 2.2e-16 +#+end_example + +#+begin_src R :results output :session *R* :exports both +grid.lines = 26 +x.pred <- seq(min(results$nb_proc), max(results$nb_proc), length.out = grid.lines) +y.pred <- seq(min(results$size), max(results$size), length.out = grid.lines) +xy <- expand.grid( nb_proc = x.pred, size = y.pred) +z.pred <- matrix(predict(fit_sim, newdata = xy), + nrow = grid.lines, ncol = grid.lines) +# fitted points for droplines to surface +fitpoints <- predict(fit_sim) +#+end_src + +#+RESULTS: + +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session *R* +library("plot3D") +scatter3D( + results$nb_proc, results$size, results$simulation_time, ticktype = "detailed", phi = 20, theta = -50, bty ="g", + pch = 20, cex = 2, type="l", r=10, + surf = list(x = x.pred, y = y.pred, z = z.pred, + facets = NA, fit = fitpoints),colvar=NULL) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-23284Iao/figure23284S2p.png]] + +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session *R* +library("plot3D") +scatter3D(results$nb_proc, results$size, results$simulation_time, ticktype = "detailed", phi = 0, theta = -50, bty ="g", + surf = list(x = unique(results$nb_proc), y = unique(results$size), z = matrix(results$simulation_time, nrow=length(unique(results$nb_proc))), facets = NA)) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-23284Iao/figure23284QCE.png]] + + + +#+begin_src R :results output :session *R* :exports both +fit_sim = lm(data=results, memory_size ~ (nb_proc + I(nb_proc^2)) + I(size^2)) +summary(fit_sim) +#+end_src + +#+RESULTS: +#+begin_example + +Call: +lm(formula = memory_size ~ (nb_proc + I(nb_proc^2)) + I(size^2), + 
data = results) + +Residuals: + Min 1Q Median 3Q Max +-0.046408 -0.005840 0.001738 0.011710 0.058452 + +Coefficients: + Estimate Std. Error t value Pr(>|t|) +(Intercept) -3.785e-02 2.247e-02 -1.685 0.1179 +nb_proc 1.264e-04 2.519e-05 5.019 0.0003 *** +I(nb_proc^2) 1.288e-07 5.211e-09 24.712 1.17e-11 *** +I(size^2) 8.166e-13 1.063e-15 767.967 < 2e-16 *** +--- +Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 + +Residual standard error: 0.02691 on 12 degrees of freedom +Multiple R-squared: 1, Adjusted R-squared: 1 +F-statistic: 2.043e+05 on 3 and 12 DF, p-value: < 2.2e-16 +#+end_example + +#+begin_src R :results output :session *R* :exports both +grid.lines = 26 +x.pred <- seq(min(results$nb_proc), max(results$nb_proc), length.out = grid.lines) +y.pred <- seq(min(results$size), max(results$size), length.out = grid.lines) +xy <- expand.grid( nb_proc = x.pred, size = y.pred) +z.pred <- matrix(predict(fit_sim, newdata = xy), + nrow = grid.lines, ncol = grid.lines) +# fitted points for droplines to surface +fitpoints <- predict(fit_sim) +#+end_src + +#+RESULTS: + +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session *R* +library("plot3D") +scatter3D( + results$nb_proc, results$size, results$memory_size, ticktype = "detailed", phi = -10, theta = -50, bty ="g", + pch = 18, cex = 2, + surf = list(x = x.pred, y = y.pred, z = z.pred, + facets = NA, fit = fitpoints),colvar=NULL) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-23284Iao/figure23284e_o.png]] + +* Modeling Stampede and Simulating HPL +#+LaTeX: \label{sec:science} + +** Modeling Stampede +*** Computations +Each node of the Stampede cluster comprises two 8-core Intel Xeon +E5-2680 8C \SI{2.7}{\GHz} CPUs and one 61-core Intel Xeon Phi SE10P +(KNC) \SI{1.1}{\GHz} accelerator that is roughly three times more +powerful than the two CPUs and can be used in two ways: +either as a classical accelerator, \ie for offloading expensive 
+computations from the CPU, or by compiling +binaries specifically for, and executing them directly on, the Xeon Phi. +While the accelerator's \SI{8}{\gibi\byte} of RAM are rather +small, the main advantage of the second approach is that data does not +need to be transferred back and forth between the node's CPUs and the +accelerator via the x16 PCIe bus. + +The HPL output submitted to the TOP500 (Figure\ref{fig:hpl_output}) +does not indicate how the KNC was used. However, because of the values assigned +to $P$ and $Q$, we are certain that only a single MPI process per node +was run. For this reason, it is likely that the KNC was used as an accelerator. +With Intel's Math Kernel Library (MKL), this is effortless as the MKL comes with +support for automatic offloading for selected BLAS functions. +Unfortunately, we do not know which MKL version was used in 2013 and therefore decided to +use the default version used on Stampede at the beginning of 2017, \ie +version 11.1.1. The MKL documentation states +that, depending on the matrix geometry, the computation will run on +either all the cores of the CPU or exclusively on the KNC. In the case of +=DGEMM=, the computation of $C=\alpha\cdot{}A\times{}B+\beta\cdot{}C$ with $A, B, C$ of +dimensions $M\times{}K$, $K\times{}N$ and $M\times{}N$, respectively, is offloaded onto the KNC whenever $M$ +and $N$ are both larger than $1280$ while $K$ is simultaneously larger +than $256$. Similarly, offloading for =DTRSM= is used when both $M$ and $N$ +are larger than $512$, which results in a +better throughput but incurs a higher latency. The complexity for =DGEMM= is always of the order +of $M\cdot{}N\cdot{}K$ ($M\cdot{}N^2$ for =DTRSM=) but the model that describes the time it +takes to run =DGEMM= (=DTRSM=) is very different for small and large +matrices. The table in Figure\ref{fig:macro_real} indicates the +parameters of the linear regression for the four scenarios (=DGEMM= +or =DTRSM= and CPU or Phi). 
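+ +The offloading rule above can be encoded as a small predicate; this is only an illustrative sketch (the helper =is_offloaded= is hypothetical and was not used in the experiments): +#+begin_src R :results output :session *R* :exports both +## Sketch of the MKL automatic-offload rule quoted above (illustrative only). +## DGEMM operands have dimensions MxK and KxN; DTRSM is offloaded based on M and N. +is_offloaded <- function(kernel, M, N, K = NA) { + switch(kernel, + dgemm = M > 1280 && N > 1280 && K > 256, + dtrsm = M > 512 && N > 512, + stop("unknown kernel")) +} +## With HPL's granularity NB = 1024, trailing-matrix DGEMM calls are offloaded: +is_offloaded("dgemm", M = 5000, N = 5000, K = 1024) +#+end_src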
The measured performance was close to the +peak performance: \eg =DGEMM= on the Phi reached +$2/\num{1.981e-12} = \SI{1.009}{\tera\flops}$. Since the granularity +used in HPL (see Figure\ref{fig:hpl_output}) is 1024, all calls (except +maybe for the very last iteration) are offloaded to the KNC. +In any case, this behavior can easily be accounted for by replacing the +macro in Figure\ref{fig:macro_simple} by the one in Figure\ref{fig:macro_real}. + +# The accelerators are essential to the performance of the cluster, +# delivering \SI{7}{\peta\flops} of sustainable performance whereas +# the CPUs are only capable of delivering \SI{2}{\peta\flops}. On +# matrices of the size used for this work, however, CPUs are barely +# used. + +# See CH's journal from [2017-10-03 Tue] to see how the version was determined +**** R figures :noexport: +#+begin_src R :results output :session *R* :exports both + library(gridExtra) + library(ggplot2) + dgemm <- read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/stampede/offloading_dgemm.csv') + dgemm$m = as.double(dgemm$m) + dtrsm <- read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/stampede/offloading_dtrsm.csv') + dtrsm$m = as.double(dtrsm$m) + + dgemm_new = dgemm[dgemm$automatic_offload == 'True',] + dtrsm_new = dtrsm[dtrsm$automatic_offload == 'True',] + dgemm_new$offload = (dgemm_new$m > 1280 & dgemm_new$n > 1280 & dgemm_new$k > 256) + dtrsm_new$offload = (dtrsm_new$m > 512 & dtrsm_new$n > 512) + + dgemm_new$flops = dgemm_new$m * dgemm_new$n * dgemm_new$k; + dtrsm_new$flops = dtrsm_new$m * dtrsm_new$n^2; + + dgemm_new$type = "dgemm"; + dtrsm_new$type = "dtrsm"; + + df = dtrsm_new + df$k = NA; + df$lead_C = NA; + df = rbind(df,dgemm_new) + head(df) + tail(df) +#+end_src + +#+RESULTS: +#+begin_example + time m n lead_A lead_B automatic_offloading offload flops +1 0.029975 7251 261 7251 7251 True FALSE 493945371 +4 2.227428 578 4619 4619 4619 True TRUE 12331723058 +5 0.042097 4424 420 4424 4424 True 
FALSE 780393600 +8 0.018786 3115 305 3115 3115 True FALSE 289772875 +10 2.931274 650 5466 5466 5466 True TRUE 19420151400 +12 3.240624 5606 6490 6490 6490 True TRUE 236125280600 + type k lead_C +1 dtrsm NA NA +4 dtrsm NA NA +5 dtrsm NA NA +8 dtrsm NA NA +10 dtrsm NA NA +12 dtrsm NA NA + time m n lead_A lead_B automatic_offloading offload flops +89 0.083594 244 5757 5757 5757 True FALSE 847038924 +91 4.932572 5527 6493 6493 6493 True TRUE 189518248891 +931 2.943795 1425 6127 6127 6127 True TRUE 33954761775 +96 0.262358 62 6621 6621 6621 True FALSE 2151851484 +981 2.749753 4991 2256 4991 4991 True TRUE 17002140960 +1001 2.383139 1421 1348 1646 1646 True TRUE 3152926168 + type k lead_C +89 dgemm 603 5757 +91 dgemm 5281 6493 +931 dgemm 3889 6127 +96 dgemm 5242 6621 +981 dgemm 1510 4991 +1001 dgemm 1646 1646 +#+end_example + +#+begin_src R :results output :session *R* :exports both +get_legend<-function(myggplot){ + tmp <- ggplot_gtable(ggplot_build(myggplot)) + leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box") + legend <- tmp$grobs[[leg]] + return(legend) +} +#+end_src + +#+RESULTS: + +#+begin_src R :results output graphics :file figures/stampede_knc_model.pdf :exports both :width 3.5 :height 5.3 :session *R* + labels = rbind( + data.frame(x = 2E11, y=3.5, label="Offloading on the KNC", offload = T), + data.frame(x = 1.1E11, y=1, label="Computation on the CPU", offload = F)); + + p1 = ggplot(dgemm_new, aes_string(x='m*n*k', y='time', color='offload')) + + geom_point() + geom_smooth(method="lm",fullrange=T) + theme(legend.position="top") + + geom_text(data = labels, aes(x=x, y=y, color=offload, label=label)) + ylim(0,1.1*max(dgemm_new$time)) + + ggtitle('Duration of DGEMM') + theme_bw() + + xlab("M.N.K [Flop]") + ylab("Duration [s]") + scale_color_brewer(palette="Set1") + + p1_legend = get_legend(p1); + p1 = p1 + theme(legend.position="none") + labels = rbind( + data.frame(x = 3E11, y=2.8, label="Offloading on the KNC", offload = T), + data.frame(x = 
1.6E11, y=.6, label="Computation on the CPU", offload = F)); + p2 = ggplot(dtrsm_new, aes_string(x='m*n*n', y='time', color='offload')) + + geom_point()+ geom_smooth(method="lm",fullrange=T) + ylim(0,1.1*max(dtrsm_new$time)) + + geom_text(data = labels, aes(x=x, y=y, color=offload, label=label)) + + ggtitle('Duration of DTRSM') + theme_bw() + + xlab("M.N² [Flop]") + ylab("Duration [s]") + scale_color_brewer(palette="Set1") + p2 = p2 + theme(legend.position="none") + + lay <- rbind(c(1), c(2)); + grid.arrange(p1,p2, layout_matrix = lay,widths=c(1), heights=c(2,2)); +#+end_src + +#+RESULTS: +[[file:figures/stampede_knc_model.pdf]] + +#+begin_src sh :results output :exports both +pdfcrop figures/stampede_knc_model.pdf figures/stampede_knc_model.pdf +#+end_src + +#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session *R* + ggplot(df, aes(x=flops, y=time, color=offload)) + + geom_point() + geom_smooth(method="lm") + + facet_wrap(~type) +#+end_src + +#+RESULTS: +[[file:/tmp/babel-1674kfe/figure1674sWN.png]] + +*** Communications +# #+BEGIN_EXPORT latex +# \begin{figure}[t] +# \centering +# \includegraphics[width=\linewidth,page=1]{./figures/stampede_calibration_send.png} +# \caption{Modeling communication time on stampede. Each color corresponds to a +# manually adjusted breakpoint corresponding to a synchronization mode +# (eager, rendez-vous,...). }\vspace{-1em} +# \label{fig:stampede_calibration} +# \labspace +# \end{figure} +# #+END_EXPORT + +We unfortunately do not know for sure which version of Intel MPI was used in +2013, so we decided to use the default one on Stampede +in May 2017, \ie version 3.1.4. As explained in +Section\ref{sec:smpi}, SMPI's communication model is a hybrid model +between the LogP family and a fluid model. For each message, the send mode +(\eg fully asynchronous, detached or eager) is determined solely by the +message size. 
It is hence possible to model the resulting performance +of communication operations through a piece-wise linear model, as depicted in +Figure\ref{fig:stampede_calibration}. For a thorough discussion of +the calibration techniques used to obtain this model, +see\cite{smpi}. As illustrated, the results for +=MPI_Send= are quite stable and piece-wise regular, but the behavior of +=MPI_Recv= is surprising: for small messages with a size of less than \SI{17420}{\byte} +(represented by purple, blue and red dots), one can observe two modes, +namely ``slow'' and ``fast'' communications. ``Slow'' +operations take twice as long and are much more common than the +``fast'' ones. We observed this behavior in several experiments even though both MPI +processes that were used in the calibration were connected through +the same local switch. When observed, this ``perturbation'' was present throughout the execution of that +calibration. +Since small messages are scarce in HPL, we eventually decided to +ignore this phenomenon and opted to use the more favorable scenario (fast +communications) for small messages. We believe that the impact of +our choice on the simulation accuracy is minimal as mostly large, +bulk messages are sent and these make use of the /rendez-vous/ mode (depicted in dark green). + +Furthermore, we configured SMPI to use Stampede's network topology, +\ie Mellanox FDR InfiniBand technology with \SI{56}{\giga\bit\per\second}, set up in +a fat-tree topology (see Figure\ref{fig:fat_tree_topology}). We +assumed the routing was done through D-mod-K\cite{dmodk} as it is +commonly used on this topology. +**** Stampede network calibration figures :noexport: +This figure is generated in [[file:~/Work/SimGrid/platform-calibration/data/stampede_17_06_01-17:14/calibration/analysis.org][the platform calibration repository]]. Data +should be read from there. 
Final adjustments (in the "Combined plot +section") were done here: + +#+begin_src R :results output :session *R* :exports both +library(gridExtra) +get_legend<-function(myggplot){ + tmp <- ggplot_gtable(ggplot_build(myggplot)) + leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box") + legend <- tmp$grobs[[leg]] + return(legend) +} + +p1 = eth$p_send + theme(legend.position="top", legend.background = element_rect(fill = "white", colour = NA)) + guides(colour = guide_legend(override.aes = list(alpha = 1))) +# + annotate("text",x=1E2,y=2.3E-6, label="40GB IB model",color="black") + + +p2 = eth$p_recv + theme(legend.position="top", legend.background = element_rect(fill = "white", colour = NA)) +# + annotate("text",x=1E2,y=2.3E-6, label="40GB IB model",color="black") + +p1_legend = get_legend(p1); +p1 = p1 + theme(legend.position="none") +p2 = p2 + theme(legend.position="none") + +# lay <- rbind(c(1,1), +# c(2,3)); +# p = grid.arrange(p1_legend,p1,p2, layout_matrix = lay,widths=c(2,2), heights=c(1.3,4)); +# ggsave(filename="/tmp/taurus_send_recv.pdf",plot=p,width = 6, height = 4) + +lay <- rbind(c(1,2)); +p = grid.arrange(p1,p2, layout_matrix = lay,widths=c(2,2), heights=c(4)); +ggsave(filename="/tmp/stampede_send_recv_eth.pdf",plot=p,width = 6, height = 3) +#+end_src + +#+RESULTS: +: Warning message: +: Transformation introduced infinite values in continuous x-axis +: Warning messages: +: 1: Transformation introduced infinite values in continuous x-axis +: 2: Transformation introduced infinite values in continuous x-axis + +#+begin_src R :results output :session *R* :exports both +ggsave(filename="/tmp/stampede_send_recv_eth.png",plot=p,width = 6, height = 3) +#+end_src + +#+RESULTS: + +#+begin_src sh :results output :exports both +cp /tmp/stampede_send_recv_eth.png ./figures/stampede_calibration_send.png +#+end_src + +#+RESULTS: +*** Summary of modeling uncertainties +For the compiler, Intel MPI and MKL, we were unable to determine +which version was used 
in 2013, but decided to go for rather optimistic +choices. The models for the MKL and for Intel MPI are close to the peak +performance. It is plausible that the compiler managed to optimize +computations in HPL. Most of these computations +are actually executed in our simulations, but their duration is not accounted for. This +allows us to obtain fully deterministic simulations without harming the +simulation outcome, as these parts only represent a tiny fraction of +the total execution time of HPL. A few HPL compilation flags (\eg +=HPL_NO_MPI_DATATYPE= and =HPL_COPY_L= that control whether MPI datatypes +should be used and how, respectively) could not be deduced from +HPL's original output on Stampede but we believe their impact to be +minimal. Finally, the HPL output reports the use of HPL v2.1 but the +main difference between v2.1 and v2.2 is the option to +continuously report factorization progress. We hence decided to apply +our modifications to the latter version of HPL. + +With all these modifications in place, we expected the prediction of +our simulations to be optimistic yet close to results obtained by a real-life execution. +# - iMPI version ??? +# - HPL compilation ? Possible modifications s.a. using openMP to have thread taking care of MPI communications and progressions. +** Simulating HPL +*** Performance Prediction +Figure\ref{fig:stampede_prediction} compares two simulation scenarios +with the original result from 2013. The solid red line represents the HPL +performance prediction as obtained with SMPI with the Stampede model +that we described in the previous section. Although we expected SMPI to be +optimistic, the prediction was surprisingly much lower than the TOP500 result. +We verified that no part of HPL was left unmodeled and decided to +investigate whether a flaw in our network model resulting in +too much congestion could explain the gap. 
+Alas, even a congestion-free network model +(represented by the dashed blue line in Figure\ref{fig:stampede_prediction}) only +results in minor improvements. In our experiments to model =DGEMM= and =DTRSM=, +either the CPU or the KNC seemed to be used at any given time; moreover, a specifically +optimized version of the MKL may have been used in 2013. +Yet even removing the offloading latency and modeling each node as a +single \SI{1.2}{\tera\flops} node does not suffice to explain the +gap between our results and reality. + +#+BEGIN_EXPORT latex +\begin{figure}[t] + \centering + \includegraphics[width=\linewidth,page=1]{./figures/stampede_simgrid.pdf} + \caption{Performance prediction of HPL on Stampede using SimGrid.}\vspace{-1em} + \label{fig:stampede_prediction} + \labspace +\end{figure} +#+END_EXPORT + +**** HPL prediction :noexport: + +Starting from Tom's journal, entry "2017-09-27 Wednesday : Complete experiments with the crosstraffic desabled" +#+begin_src R :results output :session *R* :exports both +library(ggplot2) +# files <- dir('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/stampede/simulation/8', pattern = '\\.csv', full.names = TRUE) +# tables <- lapply(files, read.csv) +# results = do.call(rbind, tables) + +classical = rbind(read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/stampede/simulation/8/result_1000000.csv'), + read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/stampede/simulation/8/result_2000000.csv'), + read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/stampede/simulation/8/result_3875000.csv')) +classical$mode = 'classical' + +highbw = rbind(read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/stampede/simulation/8/result_highbw_1000000.csv'), + read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/stampede/simulation/8/result_highbw_2000000.csv'), + read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/stampede/simulation/8/result_highbw_3875000.csv')) +highbw$mode = 
'highbw' + +fatpipe = rbind(read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/stampede/simulation/8/result_fatpipe_1000000.csv'), + read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/stampede/simulation/8/result_fatpipe_2000000.csv'), + read.csv('/home/alegrand/Work/SimGrid/tom/m2_internship_journal/stampede/simulation/8/result_fatpipe_3875000.csv')) +fatpipe$mode = 'fatpipe' + +results = rbind(classical, highbw, fatpipe) +#+end_src + +#+RESULTS: + +#+begin_src R :results output graphics :file figures/stampede_simgrid.pdf :exports both :width 5 :height 3.5 :session *R* +res_lab = results[results$mode != 'highbw' & results$size > 3E6,]; +res_lab$x=res_lab$size; +res_lab$y=res_lab$Gflops; +res_lab$xl=res_lab$x*.9; +res_lab$yl=res_lab$y*.9; + + +res_lab$xl=res_lab$x*.8; +res_lab$yl=res_lab$y*.9; + +res_lab[res_lab$mode=='classical',]$xl=3.4e6 +res_lab[res_lab$mode=='classical',]$yl=2.5e6 +res_lab[res_lab$mode=='fatpipe',]$xl=2.e6 +res_lab[res_lab$mode=='fatpipe',]$yl=4.1e6 + +res_lab$label = NA; +res_lab[res_lab$mode=='fatpipe',]$label = "Simulation\n (No Contention)"; +res_lab[res_lab$mode=='classical',]$label = "Simulation\n (Fat Tree)"; + +ggplot(results[results$mode != 'highbw',], aes(x=size, y=Gflops, color=mode, linetype=mode)) + + # Inner labels + geom_segment(data=res_lab, + aes(x=xl, xend=x, y=yl, yend=y), linetype="solid", + color="black") + + geom_label(data=res_lab, + aes(label = factor(label), x=xl, y=yl, fill=mode), + colour = "white", fontface = "bold") + + # SMPI lines + geom_point() + geom_line() + + # Top500 perf + geom_hline(yintercept=5.16811e+06) + + annotate("text",x=3875000,y=5.4e+06, hjust="right", + label="Top 500 performance",color="black") + + annotate("text",x=3875000,y=4.96e+06, hjust="right", + label="(5.168 TeraFlop/s)",color="black") + + annotate("point",x=3875000,y=5.16811e+06,color="black") + + # Cosmetics + guides(fill=FALSE, color=FALSE, linetype=FALSE) + ylab("GFlop/s") + xlab("Matrix 
rank") +
+    theme_bw() + scale_color_brewer(palette="Set1") + scale_fill_brewer(palette="Set1") +
+    theme(legend.position="top") +
+    ggtitle('Performance of HPL')
+#+end_src
+
+#+RESULTS:
+[[file:figures/stampede_simgrid.pdf]]
+
+
+#+begin_src R :results output graphics :file (org-babel-temp-file "figure" ".png") :exports both :width 600 :height 400 :session *R*
+ggplot(results, aes(x=size, y=simulation_time, color=mode, linetype=mode)) + geom_point() + geom_line() + ggtitle('Simulation time')
+#+end_src
+
+#+RESULTS:
+[[file:/tmp/babel-23284Iao/figure232842Rn.png]]
+
+*** Performance Gap Investigation
+# The simulation time to get the full-scale trace was:
+#  - 420 seconds for two iterations (250 seconds spent in HPL),
+#  - 609 seconds for five iterations (265 seconds spent in HPL).
+In this section, we explain our investigation and give possible reasons for
+the aforementioned mismatch (apparent in Figure~\ref{fig:stampede_prediction}). With SMPI, it is simple to trace
+the first iterations of HPL to get an idea of what could be
+improved (the trace for the first five iterations can be obtained in
+about 609 seconds on a commodity computer and is about
+\SI{175}{\mega\byte} once compressed). Figure~\ref{fig:hpl_gantt} illustrates the
+very synchronous and iterative nature of the first iterations: one can
+first identify a factorization of the panel, then a broadcast to all the
+nodes, and finally an update of the trailing matrix.
+More than one fifth of each iteration is spent communicating (although the first
+iterations are the ones with the lowest communication-to-computation ratio),
+which prevents HPL from reaching the Top500 performance.
+Overlapping these heavy communication phases with computation would improve
+performance significantly. The fact that hardly any
+overlap occurs can be explained by the look-ahead ~DEPTH~
+parameter, which was supposedly set to =0= (see
+Figure~\ref{fig:hpl_output}). 
This is quite surprising as even
+the tuning section of the HPL documentation indicates that a depth of
+1 is supposed to yield the best results, even though a large problem size could
+be needed to see some performance gain. We discussed this
+surprising behavior with the Stampede team and were informed that the
+run in 2013 was executed with an HPL binary provided by Intel
+and probably specifically modified for Stampede. We
+believe that some configuration values have been hardcoded to enforce an
+overlap of successive iterations. Indeed, the shortened part (marked ``[...]'') in
+Figure~\ref{fig:hpl_output} provides information about the progress of
+HPL throughout iterations and statistics for the panel-owning process
+about the time spent in the most important parts.
+According to these statistics, the total time
+spent in the =Update= section was \SI{9390}{\sec} whereas the total
+execution time was \SI{7505}{\sec}, which is impossible unless iterations have overlapped.
+
+The broadcast and swapping algorithms use very heavy
+communication patterns. This is not at all surprising since for a matrix of
+this order, several hundred megabytes need to be broadcast.
+Although the output states that the =blongM= algorithm was
+used, another algorithm may in fact have been employed.
+We tried the five other broadcast algorithms HPL comes with but
+did not achieve significantly better overall performance.
+An analysis of the symbols in the Intel binary
+revealed that another broadcast algorithm named
+=HPL_bcast_bpush= was available. Unlike the others, this new algorithm relies on non-blocking sends,
+which could contribute to the performance obtained in 2013.
+Likewise, the swapping algorithm that was used (~SWAP=Binary-exchange~) involves communications that are rather long and
+organized in trees, which is surprising as the ~spread-roll~ algorithm
+is recommended for large matrices. 
+
+#+BEGIN_EXPORT latex
+\begin{figure}[t]
+  \centering
+  \includegraphics[width=\linewidth,page=1]{./figures/fullscale_unzoomed.png}
+  \caption{Gantt chart of the first two iterations of HPL. Communication states are in
+    red while computations are in cyan. Each communication between two processes
+    is represented with a white arrow, which results in very cluttered white areas.}\vspace{-1em}
+  \label{fig:hpl_gantt}
+  \labspace
+\end{figure}
+#+END_EXPORT
+
+We do not aim to reverse engineer the Intel HPL code. We can, however,
+already draw two conclusions from our simple analysis: 1) it is apparent that many optimizations have been done on
+the communication side and 2) it is very likely that the reported
+parameters are not the ones used in the real execution, probably because
+these values were hardcoded and the configuration output file was not updated accordingly.
+*** Gantt charts :noexport:
+#+begin_src sh :results output :exports both
+cp /home/alegrand/Work/SimGrid/tom/m2_internship_journal/stampede/communications/fullscale_unzoomed.png figures/
+#+end_src
+
+#+RESULTS:
+
+* Conclusions
+#+LaTeX: \label{sec:cl}
+
+Studying HPC applications at scale can be very time- and
+resource-consuming. Simulation is often an effective approach in this
+context and SMPI has previously been successfully validated in several small-scale
+studies with standard HPC applications\cite{smpi,heinrich:hal-01523608}. In this
+article, we proposed and evaluated extensions to the SimGrid/SMPI
+framework that allowed us to emulate HPL at the scale of a
+supercomputer. Our application of choice, HPL, is particularly challenging in terms of simulation
+as it implements its own set of non-blocking collective operations
+that rely on =MPI_Iprobe= in order to facilitate overlapping with computations. 
+
+More specifically, we tried to reproduce the execution of HPL on the
+Stampede supercomputer conducted in $2013$ for the TOP500, which
+involved a \SI{120}{\tera\byte} matrix and took two hours on 6,006\nbsp{}nodes.
+Our emulation of a similar configuration ran on a single machine for
+about $62$ hours and required less than \SI{19}{\giga\byte} of RAM. This emulation
+employed several non-trivial operating-system level optimizations
+(memory mapping, dynamic library loading, huge pages) that have since been
+integrated into the latest version of SimGrid/SMPI.
+
+The downside of scaling this high is a less well-controlled scenario.
+The reference run of HPL on Stampede was done several years ago and we only
+have very limited information about the setup (\eg software versions
+and configuration); moreover, a reservation and re-execution on the whole
+machine was impossible for us. We nevertheless modeled Stampede carefully, which
+allowed us to predict the performance that would
+have been obtained using an unmodified, freely available version of HPL.
+Unfortunately, despite all our efforts, the predicted performance
+was much lower than what was reported in 2013. We determined that this
+discrepancy comes from the fact that a modified, closed-source version of HPL
+supplied by Intel was used in 2013.
+We believe that some of the HPL configuration parameters were
+hardcoded and therefore misreported in the output. A quick analysis of the optimized
+HPL binary confirmed that algorithmic differences were likely to be the
+reason for the performance differences.
+
+We conclude that a large-scale (in)validation is unfortunately not
+possible due to the modified source code being unavailable to us.
+We claim that the modifications we made are
+minor and are applicable to that optimized version. In fact, while HPL
+comprises 16K lines of ANSI C over 149 files, our modifications only
+changed 14 files with 286 line insertions and 18 deletions. 
+
+We believe that being able to precisely predict an application's
+performance on a given platform will become
+invaluable in the future to aid compute centers with the decision of
+whether a new machine (and what technology) will work best for a given
+application or whether an upgrade of the current machine should be
+considered. As future work, we intend to conduct similar studies
+with other HPC benchmarks (\eg HPCG or HPGMG) and with other Top500
+machines. From our experience, we believe that a faithful and public
+reporting of the experimental conditions (compiler options, library
+versions, HPL output, etc.) is invaluable and allows researchers
+to better understand how these platforms actually behave.
+
+# This goal will be subject to a more thorough investigation
+# in the very near future.
+
+# As we saw in Section\ref{sec:hplchanges}, two BLAS functions (=dgemm=
+# and =dtrsm=) were the dominating factor with regards to the runtime although other BLAS
+# functions were called as well. For this study, we neglected the other
+# functions but with a fully automatic calibration procedure for any
+# BLAS function results could effortlessly become more precise as the
+# application could just be linked against a BLAS-replacement
+# library.
+# #+LaTeX: \CH{Problem here: HPL uses \texttt{HPL\_dtrsm()} wrappers.}
+
+# #+LaTeX: \CH{I like the idea of pointing out again that our simulation takes much longer (48 hours instead of 2?) but that we use 1/6000 of the ressources}
+
+* Acknowledgements
+
+Experiments presented in this paper were carried out using the Grid'5000 testbed, supported by a scientific interest group hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations (see https://www.grid5000.fr).
+We warmly thank our TACC colleagues for their support in this study and
+for providing us with as much information as they could.
+** References :ignore:
+
+# See next section to understand how refs.bib file is created. 
+ +#+LATEX: \bibliographystyle{IEEEtran} +#+LATEX: \bibliography{refs} + +* Bib file is here :noexport: + +Tangle this file with C-c C-v t + +#+begin_src bib :tangle refs.bib +@IEEEtranBSTCTL{IEEEexample:BSTcontrol, + CTLuse_article_number = "yes", + CTLuse_paper = "yes", + CTLuse_url = "yes", + CTLuse_forced_etal = "yes", + CTLmax_names_forced_etal = "6", + CTLnames_show_etal = "3", + CTLuse_alt_spacing = "yes", + CTLalt_stretch_factor = "4", + CTLdash_repeated_names = "yes", + CTLname_format_string = "{f. ~}{vv ~}{ll}{, jj}", + CTLname_latex_cmd = "", + CTLname_url_prefix = "[Online]. Available:" +} + +@mastersthesis{cornebize:hal-01544827, + TITLE = {{Capacity Planning of Supercomputers: Simulating MPI Applications at Scale}}, + AUTHOR = {Cornebize, Tom}, + URL = {https://hal.inria.fr/hal-01544827}, + SCHOOL = {{Grenoble INP ; Universit{\'e} Grenoble - Alpes}}, + YEAR = {2017}, + MONTH = Jun, + KEYWORDS = {Simulation ; MPI runtime and applications ; Performance prediction and extrapolation ; High Performance LINPACK}, + PDF = {https://hal.inria.fr/hal-01544827/file/report.pdf}, + HAL_ID = {hal-01544827}, + HAL_VERSION = {v1}, +} + +@incollection{grid5000, + title = {Adding Virtualization Capabilities to the {Grid'5000} Testbed}, + author = {Balouek, Daniel and Carpen-Amarie, Alexandra and Charrier, Ghislain and Desprez, Fr{\'e}d{\'e}ric and Jeannot, Emmanuel and Jeanvoine, Emmanuel and L{\`e}bre, Adrien and Margery, David and Niclausse, Nicolas and Nussbaum, Lucas and Richard, Olivier and P{\'e}rez, Christian and Quesnel, Flavien and Rohr, Cyril and Sarzyniec, Luc}, + booktitle = {Cloud Computing and Services Science}, + publisher = {Springer International Publishing}, + OPTpages = {3-20}, + volume = {367}, + editor = {Ivanov, IvanI. 
and Sinderen, Marten and Leymann, Frank and Shan, Tony }, + series = {Communications in Computer and Information Science }, + isbn = {978-3-319-04518-4 }, + doi = {10.1007/978-3-319-04519-1\_1 }, + year = {2013}, +} + +%%% Online simulation of MPI applications +@article{xsim, + author = "Christian Engelmann", + title = {{Scaling To A Million Cores And Beyond: {Using} Light-Weight + Simulation to Understand The Challenges Ahead On The Road To + Exascale}}, + journal = "FGCS", + volume = 30, + pages = "59--65", + month = jan, + year = 2014, + publisher = "Elsevier"} + +@Article{sstmacro, + author = {Curtis L. Janssen and Helgi Adalsteinsson and Scott Cranford and Joseph P. Kenny and Ali Pinar and David A. Evensky and Jackson Mayo}, + journal = {International Journal of Parallel and Distributed Systems}, + title = {A Simulator for Large-scale Parallel Architectures}, + volume = {1}, + number = {2}, + pages = {57--73}, + year = {2010}, + note = "\url{http://dx.doi.org/10.4018/jdst.2010040104}", + doi = {10.4018/jdst.2010040104} +} + +@article{SST, + author = {Rodrigues, Arun and Hemmert, Karl and Barrett, Brian + and Kersey, Chad and Oldfield, Ron and Weston, Marlo + and Riesen, Rolf and Cook, Jeanine and Rosenfeld, + Paul and CooperBalls, Elliot and Jacob, Bruce }, + title = {{The Structural Simulation Toolkit}}, + journal = {{SIGMETRICS} Performance Evaluation Review}, + volume = 38, + number = 4, + pages = {37--42}, + year = 2011 +} + +@article{dickens_tpds96, + title={{Parallelized Direct Execution Simulation of Message-Passing + Parallel Programs}}, + author={Dickens, Phillip and Heidelberger, Philip and Nicol, David}, + journal={IEEE Transactions on Parallel and Distributed Systems}, + volume=7, + number=10, + year=1996, + pages={1090--1105} +} + +@ARTICLE{bagrodia_ijhpca01, + author={Bagrodia, Rajive and Deelman, Ewa and Phan, Thomas}, + title={{Parallel Simulation of Large-Scale Parallel Applications}}, + journal={International Journal of High Performance 
Computing and + Applications}, + volume=15, + number=1, + year=2001, + pages={3--12} +} + +%%% Offline simulation of MPI applications +@INPROCEEDINGS{loggopsim_10, + title={{LogGOPSim - Simulating Large-Scale Applications in + the LogGOPS Model}}, + author={Hoefler, Torsten and Siebert, Christian and Lumsdaine, Andrew}, + month=Jun, + year={2010}, + pages = {597--604}, + booktitle={Proc. of the LSAP Workshop}, +} + +@inproceedings{hoefler-goal, + author={T. Hoefler and C. Siebert and A. Lumsdaine}, + title={{Group Operation Assembly Language - A Flexible Way to Express Collective Communication}}, + year={2009}, + booktitle={Proc. of the 38th ICPP} +} + +@inproceedings{bigsim_04, + author={Zheng, Gengbin and Kakulapati, Gunavardhan and Kale, + Laxmikant}, + title={{BigSim: A Parallel Simulator for Performance Prediction of + Extremely Large Parallel Machines}}, + year=2004, + booktitle={Proc. of the 18th IPDPS}, +} + +@inproceedings{dimemas, + title = {{Dimemas: Predicting MPI Applications Behaviour in Grid Environments}}, + year = {2003}, + month = jun, + booktitle = {Proc. of the Workshop on Grid Applications and + Programming Tools}, + author = {Rosa M. Badia and Jes{\'u}s Labarta and Judit Gim{\'e}nez and Francesc Escal{\'e}} +} + +@article {CODES, + title = {Enabling Parallel Simulation of Large-Scale {HPC} Network Systems}, + journal = {IEEE Transactions on Parallel and Distributed Systems}, + year = {2016}, + author = {Mubarak, M. and C. D. Carothers and Robert B. Ross and Philip H. Carns} +} + +@article{ROSS_SC12, +author = {Misbah Mubarak and Christopher D. 
Carothers and Robert Ross and Philip Carns}, +title = {{Modeling a Million-Node Dragonfly Network Using Massively Parallel Discrete-Event Simulation}}, +journal ={SC Companion}, +year = {2012}, +pages = {366-376}, +} + +%%% Self citations on previous work +@Article{simgrid, + title = {{Versatile, Scalable, and Accurate Simulation of Distributed Applications and Platforms}}, + author = {Casanova, Henri and Giersch, Arnaud and Legrand, Arnaud and Quinson, Martin and Suter, Fr{\'e}d{\'e}ric}, + publisher = {Elsevier}, + pages = {2899-2917}, + journal = {Journal of Parallel and Distributed Computing}, + volume = {74}, + number = {10}, + year = {2014} +} + +@InProceedings{simetierre, + author = {Bobelin, Laurent and Legrand, Arnaud and + M{\'a}rquez, David Alejandro Gonz{\'a}lez and Navarro, + Pierre and Quinson, Martin and Suter, + Fr{\'e}d{\'e}ric and Thiery, Christophe}, + title = {{Scalable Multi-Purpose Network Representation for + Large Scale Distributed System Simulation}}, + booktitle = {Proc. of the 12th IEEE/ACM International + Symposium on Cluster, Cloud and Grid Computing}, + year = 2012, + pages = {220--227}, + address = {Ottawa, Canada} +} + +@InProceedings {simgrid_simix2_12, + author = {Martin Quinson and Cristian Rosa and Christophe Thi{\'e}ry}, + title = {Parallel Simulation of Peer-to-Peer Systems}, + booktitle = {{P}roc. of the 12th {IEEE/ACM} {I}ntl. {S}ymposium on {C}luster, Cloud and Grid {C}omputing}, + year = {2012}, + address = {Ottawa, Canada} +} + +@InProceedings {DCLV_LSAP_10, + title = {{Fast and Scalable Simulation of Volunteer Computing Systems + Using SimGrid}}, + booktitle = {Proc. 
of the Workshop on Large-Scale System and Application + Performance}, + year = {2010}, + month = Jun, + address = {Chicago, IL}, + author = {Donassolo, Bruno and Casanova, Henri and Legrand, Arnaud + and Velho, Pedro}, + category = {core} +} + +@InProceedings{SMPI_IPDPS, + author = {Clauss, Pierre-Nicolas and Stillwell, Mark and Genaud, + St\'ephane and Suter, Fr\'ed\'eric and Casanova, Henri and + Quinson, Martin}, + title = {{Single Node On-Line Simulation of MPI Applications with + SMPI}}, + booktitle= {Proc. of the 25th IEEE Intl. Parallel and + Distributed Processing Symposium}, + year = 2011, + address = {Anchorage, AK} +} + + +@Article{Velho_TOMACS13, + author = {Velho, Pedro and Schnorr, Lucas and Casanova, Henri and Legrand, Arnaud}, + title = {{On the Validity of Flow-level {TCP} Network Models for Grid and Cloud Simulations}}, + journal = {ACM Transactions on Modeling and Computer Simulation}, + year = {2013}, + PUBLISHER = {ACM}, + VOLUME = 23, + NUMBER = 4, + pages = 23, + MONTH = Oct +} + +@article{smpi, + TITLE = {Simulating {MPI} applications: the {SMPI} approach}, + AUTHOR = {Degomme, Augustin and Legrand, Arnaud and Markomanolis, Georges and Quinson, Martin and Stillwell, Mark S and Suter, Frédéric}, + JOURNAL = {{IEEE Transactions on Parallel and Distributed Systems}}, + PUBLISHER = {{Institute of Electrical and Electronics Engineers}}, + volume = "28", + number = "8", + pages = "2387--2400", + PAGES = {14}, + YEAR = {2017}, + MONTH = Feb, + DOI = {10.1109/TPDS.2017.2669305}, + KEYWORDS = {Simulation ; MPI runtime and applications ; Performance prediction and extrapolation}, + PDF = {https://hal.inria.fr/hal-01415484/file/smpi_article.pdf}, + HAL_ID = {hal-01415484}, + HAL_VERSION = {v2}, + category = "core", +} + +@InProceedings{heinrich:hal-01523608, + title = "{Predicting the Energy Consumption of {MPI} Applications + at Scale Using a Single Node}", + author = "Franz C. 
Heinrich and Tom Cornebize and Augustin + Degomme and Arnaud Legrand and Alexandra Carpen-Amarie + and Sascha Hunold and Anne-Cécile Orgerie and Martin + Quinson", + URL = "https://hal.inria.fr/hal-01523608", + booktitle = "Proc. of the 19th IEEE Cluster Conference", + year = "2017", + keywords = "simulation ; HPC ; energy ; platform modeling", + pdf = "https://hal.inria.fr/hal-01523608/file/predicting-energy-consumption-at-scale.pdf", + hal_id = "hal-01523608", + category = "core", +} + +% Trace extrapolation +@InProceedings{scalaextrap, + author = {Xing Wu and Frank Mueller}, + title = {{S}cala{E}xtrap: Trace-Based Communication Extrapolation + for {SPMD} Programs}, + booktitle = {Proc. of the 16th ACM Symp. on Principles and + Practice of Parallel Programming}, + year = {2011}, + pages = {113--122}, +} + +@InProceedings{pmac_lspp13, + author = {Laura Carrington and Michael Laurenzano and Ananta Tiwari}, + title = {Inferring Large-scale Computation Behavior via Trace Extrapolation}, + booktitle = {Proc. 
of the Workshop on Large-Scale Parallel Processing}, + year = {2013}, +} + +@Misc{hpl, + author = {Antoine Petitet and Clint Whaley and Jack Dongarra and Andy Cleary and Piotr Luszczek}, + title = {{HPL} - A Portable Implementation of the {High-Performance Linpack} Benchmark for Distributed-Memory Computers}, + howpublished = {\url{http://www.netlib.org/benchmark/hpl}}, + month = {February}, + year = {2016}, + note = {Version 2.2} +} + +@book{top500, + author = {Meuer, Hans Werner and Strohmaier, Erich and Dongarra, Jack and Simon, Horst D.}, + title = {The {TOP500}: History, Trends, and Future Directions in {High Performance Computing}}, + year = {2014}, + isbn = {143981595X, 9781439815953}, + edition = {1st}, + publisher = {Chapman \& Hall/CRC}, +} + +@techreport{dmodk, + author = {Eitan Zahavi}, + title = {{D-Mod-K} Routing Providing Non-Blocking Traffic for Shift Permutations on Real Life Fat Trees}, + institution = {Technion Israel Institute of Technology}, + year = {2010}, +} +#+end_src + +* Emacs Setup :noexport: +# Local Variables: +# eval: (require 'org-install) +# eval: (org-babel-do-load-languages 'org-babel-load-languages '( (shell . t) (R . t) (perl . t) (ditaa . t) )) +# eval: (setq org-confirm-babel-evaluate nil) +# eval: (unless (boundp 'org-latex-classes) (setq org-latex-classes nil)) +# eval: (add-to-list 'org-latex-classes '("IEEEtran" +# "\\documentclass[conference, 10pt]{IEEEtran}\n \[NO-DEFAULT-PACKAGES]\n \[EXTRA]\n \\usepackage{graphicx}\n \\usepackage{hyperref}" ("\\section{%s}" . "\\section*{%s}") ("\\subsection{%s}" . "\\subsection*{%s}") ("\\subsubsection{%s}" . "\\subsubsection*{%s}") ("\\paragraph{%s}" . "\\paragraph*{%s}") ("\\subparagraph{%s}" . "\\subparagraph*{%s}"))) +# eval: (add-to-list 'org-latex-classes '("llncs" "\\documentclass{llncs2e/llncs}\n \[NO-DEFAULT-PACKAGES]\n \[EXTRA]\n" ("\\section{%s}" . "\\section*{%s}") ("\\subsection{%s}" . "\\subsection*{%s}") ("\\subsubsection{%s}" . 
"\\subsubsection*{%s}") ("\\paragraph{%s}" . "\\paragraph*{%s}") ("\\subparagraph{%s}" . "\\subparagraph*{%s}"))) +# eval: (add-to-list 'org-latex-classes '("acm-proc-article-sp" "\\documentclass{acm_proc_article-sp}\n \[NO-DEFAULT-PACKAGES]\n \[EXTRA]\n" ("\\section{%s}" . "\\section*{%s}") ("\\subsection{%s}" . "\\subsection*{%s}") ("\\subsubsection{%s}" . "\\subsubsection*{%s}") ("\\paragraph{%s}" . "\\paragraph*{%s}") ("\\subparagraph{%s}" . "\\subparagraph*{%s}"))) +# eval: (add-to-list 'org-latex-classes '("sig-alternate" "\\documentclass{sig-alternate}\n \[NO-DEFAULT-PACKAGES]\n \[EXTRA]\n" ("\\section{%s}" . "\\section*{%s}") ("\\subsection{%s}" . "\\subsection*{%s}") ("\\subsubsection{%s}" . "\\subsubsection*{%s}") ("\\paragraph{%s}" . "\\paragraph*{%s}") ("\\subparagraph{%s}" . "\\subparagraph*{%s}"))) +# eval: (setq org-alphabetical-lists t) +# eval: (setq org-src-fontify-natively t) +# eval: (setq ispell-local-dictionary "american") +# eval: (eval (flyspell-mode t)) +# eval: (setq org-todo-keyword-faces '(("FLAWED" . 
(:foreground "RED" :weight bold))))
+# eval: (custom-set-variables '(org-babel-shell-names (quote ("sh" "bash" "csh" "ash" "dash" "ksh" "mksh" "posh" "zsh"))))
+# eval: (add-to-list 'load-path ".")
+# eval: (require 'ox-extra)
+# eval: (setq org-latex-tables-centered nil)
+# eval: (ox-extras-activate '(ignore-headlines))
+# End:
diff --git a/module2/ressources/video_examples/technical_report.org b/module2/ressources/video_examples/technical_report.org
new file mode 100644
index 0000000..fb13e53
--- /dev/null
+++ b/module2/ressources/video_examples/technical_report.org
@@ -0,0 +1,477 @@
+# -*- coding: utf-8 -*-
+# -*- mode: org -*-
+
+#+TITLE: A reproducible comparison between @@latex:\\@@ GNU MPFR and machine double-precision
+#+AUTHOR: Paul Zimmermann (reproduction with org-mode by Arnaud Legrand)
+#+STARTUP: overview indent inlineimages logdrawer
+#+LANGUAGE: en
+#+LATEX_CLASS: IEEEtran
+#+LaTeX_CLASS_OPTIONS: [onecolumn]
+# #+HTML_HEAD: 
+#+HTML_HEAD: 
+#+HTML_HEAD: 
+#+HTML_HEAD: 
+#+HTML_HEAD: 
+#+HTML_HEAD: 
+#+HTML_HEAD: 
+#+PROPERTY: header-args :eval never-export
+
+Several authors claim that GNU MPFR [1] is $x$ times slower than
+double-precision floating-point numbers, for various values of $x$,
+without any way for the reader to reproduce their claim. For example
+in [2], Joris van der Hoeven writes “the MPFR library for arbitrary
+precision and IEEE-style standardized floating-point arithmetic is
+typically about a factor 100 slower than double precision machine
+arithmetic”. Such a claim typically: (i) does not say which version of
+MPFR was used (and which version of GMP since, MPFR being based on
+GMP, its efficiency also depends on GMP); (ii) does not detail the
+environment used (processor, compiler, operating system); (iii) does
+not explain which application was used for the comparison. Therefore
+it cannot be reproduced by the reader, who can thus have no
+confidence in the claimed factor of 100. 
In this short note we provide
+reproducible figures that can be checked by the reader.
+** Reproducible Experimental Setup
+We use the programs in appendix to multiply two $1000 × 1000$
+matrices. The matrix $A$ has coefficients $1/(i + j + 1)$ for $0 ≤ i,
+j < 1000$, and matrix $B$ has coefficients $1/(ij + 1)$. Both programs
+print the time for the matrix product (not counting the time to
+initialize the matrix), and the sum of coefficients of the product
+matrix (used as a simple checksum between both programs).
+
+We used MPFR version 3.1.5, configured with GMP 6.1.2 (both are the
+latest releases as of the date of this document).
+
+We used as test machine =gcc12.fsffrance.org=, a machine from
+the GCC Compile Farm, a set of machines available to developers of
+free software. The compiler used was GCC 4.5.1, which is installed in
+~/opt/cfarm/release/4.5.1~ on this machine, with optimization level
+~-O3~. Both GMP and MPFR were also compiled with this compiler, and the
+GMP and MPFR libraries were linked statically with the application
+programs (given in appendix).
+
+** Experimental Results From Arnaud Legrand
+*** Code
+The program (=a.c=) using the C double-precision type is the
+following. It takes as command-line argument the matrix dimension. 
+#+BEGIN_SRC C :tangle /tmp/a.c
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/time.h>
+#include <sys/resource.h>
+
+static int cputime()
+{
+    struct rusage rus;
+    getrusage(0, &rus);
+    return rus.ru_utime.tv_sec * 1000 + rus.ru_utime.tv_usec / 1000;
+}
+
+int main(int argc, char *argv[])
+{
+    double **a;
+    double **b;
+    double **c;
+    double t = 0.0;
+    int i, j, k, st;
+    int N = atoi(argv[1]);
+    st = cputime();
+    a = malloc(N * sizeof(double *));
+    b = malloc(N * sizeof(double *));
+    c = malloc(N * sizeof(double *));
+    for (i = 0; i < N; i++) {
+        a[i] = malloc(N * sizeof(double));
+        b[i] = malloc(N * sizeof(double));
+        c[i] = malloc(N * sizeof(double));
+        for (j = 0; j < N; j++) {
+            a[i][j] = 1.0 / (1.0 + i + j);
+            b[i][j] = 1.0 / (1.0 + i * j);
+        }
+    }
+    st = cputime();
+    for (i = 0; i < N; i++)
+        for (j = 0; j < N; j++)
+            c[i][j] = 0.0;
+    for (i = 0; i < N; i++)
+        for (k = 0; k < N; k++)
+            for (j = 0; j < N; j++)
+                c[i][j] += a[i][k] * b[k][j];
+    for (i = 0; i < N; i++)
+        for (j = 0; j < N; j++)
+            t += c[i][j];
+    printf("matrix product took %dms\n", cputime() - st);
+    printf("t=%f\n", t);
+    for (i = 0; i < N; i++) {
+        free(a[i]);
+        free(b[i]);
+        free(c[i]);
+    }
+    free(a);
+    free(b);
+    free(c);
+    return 0;
+}
+#+END_SRC
+
+The program (=d.c=) using GNU MPFR is the following. It takes as
+command-line argument the matrix dimension and the MPFR precision (in
+bits). 
+
+#+BEGIN_SRC C :tangle /tmp/d.c
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/time.h>
+#include <sys/resource.h>
+#include <mpfr.h>
+
+static int cputime()
+{
+    struct rusage rus;
+    getrusage(0, &rus);
+    return rus.ru_utime.tv_sec * 1000 + rus.ru_utime.tv_usec / 1000;
+}
+
+int main(int argc, char *argv[])
+{
+    mpfr_t **a;
+    mpfr_t **b;
+    mpfr_t **c;
+    mpfr_t s;
+    double t = 0.0;
+    int i, j, k, st;
+    int N = atoi(argv[1]);
+    int prec = atoi(argv[2]);
+    printf("MPFR library: %-12s\nMPFR header: %s (based on %d.%d.%d)\n",
+           mpfr_get_version(), MPFR_VERSION_STRING, MPFR_VERSION_MAJOR,
+           MPFR_VERSION_MINOR, MPFR_VERSION_PATCHLEVEL);
+    st = cputime();
+    a = malloc(N * sizeof(mpfr_t *));
+    b = malloc(N * sizeof(mpfr_t *));
+    c = malloc(N * sizeof(mpfr_t *));
+    mpfr_init2(s, prec);
+    for (i = 0; i < N; i++) {
+        a[i] = malloc(N * sizeof(mpfr_t));
+        b[i] = malloc(N * sizeof(mpfr_t));
+        c[i] = malloc(N * sizeof(mpfr_t));
+        for (j = 0; j < N; j++) {
+            mpfr_init2(a[i][j], prec);
+            mpfr_init2(b[i][j], prec);
+            mpfr_init2(c[i][j], prec);
+            mpfr_set_ui(a[i][j], 1, MPFR_RNDN);
+            mpfr_div_ui(a[i][j], a[i][j], i + j + 1, MPFR_RNDN);
+            mpfr_set_ui(b[i][j], 1, MPFR_RNDN);
+            mpfr_div_ui(b[i][j], b[i][j], i * j + 1, MPFR_RNDN);
+        }
+    }
+    st = cputime();
+    for (i = 0; i < N; i++)
+        for (j = 0; j < N; j++)
+            mpfr_set_ui(c[i][j], 0, MPFR_RNDN);
+    for (i = 0; i < N; i++)
+        for (k = 0; k < N; k++)
+            for (j = 0; j < N; j++) {
+                mpfr_mul(s, a[i][k], b[k][j], MPFR_RNDN);
+                mpfr_add(c[i][j], c[i][j], s, MPFR_RNDN);
+            }
+    for (i = 0; i < N; i++)
+        for (j = 0; j < N; j++)
+            t += mpfr_get_d(c[i][j], MPFR_RNDN);
+    printf("matrix product took %dms\n", cputime() - st);
+    printf("t=%f\n", t);
+    for (i = 0; i < N; i++) {
+        for (j = 0; j < N; j++) {
+            mpfr_clear(a[i][j]);
+            mpfr_clear(b[i][j]);
+            mpfr_clear(c[i][j]);
+        }
+        free(a[i]);
+        free(b[i]);
+        free(c[i]);
+    }
+    mpfr_clear(s);
+    free(a);
+    free(b);
+    free(c);
+    return 0;
+}
+#+END_SRC
+
+*** Setup
+- Name of the machine and OS version:
+  #+begin_src shell :results output :exports results :tangle 
get_info.sh + uname -a + #+end_src + + #+RESULTS: + : Linux sama 4.2.0-1-amd64 #1 SMP Debian 4.2.6-1 (2015-11-10) x86_64 GNU/Linux + +- CPU/architecture information: + #+begin_src shell :results output :exports both :tangle get_info.sh + cat /proc/cpuinfo + #+end_src + + #+RESULTS: + #+begin_example + processor : 0 + vendor_id : GenuineIntel + cpu family : 6 + model : 58 + model name : Intel(R) Core(TM) i7-3687U CPU @ 2.10GHz + stepping : 9 + microcode : 0x15 + cpu MHz : 2165.617 + cache size : 4096 KB + physical id : 0 + siblings : 4 + core id : 0 + cpu cores : 2 + apicid : 0 + initial apicid : 0 + fpu : yes + fpu_exception : yes + cpuid level : 13 + wp : yes + flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt + bugs : + bogomips : 5182.68 + clflush size : 64 + cache_alignment : 64 + address sizes : 36 bits physical, 48 bits virtual + power management: + + processor : 1 + vendor_id : GenuineIntel + cpu family : 6 + model : 58 + model name : Intel(R) Core(TM) i7-3687U CPU @ 2.10GHz + stepping : 9 + microcode : 0x15 + cpu MHz : 3140.515 + cache size : 4096 KB + physical id : 0 + siblings : 4 + core id : 1 + cpu cores : 2 + apicid : 2 + initial apicid : 2 + fpu : yes + fpu_exception : yes + cpuid level : 13 + wp : yes + flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid 
sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt + bugs : + bogomips : 5182.68 + clflush size : 64 + cache_alignment : 64 + address sizes : 36 bits physical, 48 bits virtual + power management: + + processor : 2 + vendor_id : GenuineIntel + cpu family : 6 + model : 58 + model name : Intel(R) Core(TM) i7-3687U CPU @ 2.10GHz + stepping : 9 + microcode : 0x15 + cpu MHz : 2860.000 + cache size : 4096 KB + physical id : 0 + siblings : 4 + core id : 0 + cpu cores : 2 + apicid : 1 + initial apicid : 1 + fpu : yes + fpu_exception : yes + cpuid level : 13 + wp : yes + flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt + bugs : + bogomips : 5182.68 + clflush size : 64 + cache_alignment : 64 + address sizes : 36 bits physical, 48 bits virtual + power management: + + processor : 3 + vendor_id : GenuineIntel + cpu family : 6 + model : 58 + model name : Intel(R) Core(TM) i7-3687U CPU @ 2.10GHz + stepping : 9 + microcode : 0x15 + cpu MHz : 2813.585 + cache size : 4096 KB + physical id : 0 + siblings : 4 + core id : 1 + cpu cores : 2 + apicid : 3 + initial apicid : 3 + fpu : yes + fpu_exception : yes + cpuid level : 13 + wp : yes + flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 
cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt
+ bugs :
+ bogomips : 5182.68
+ clflush size : 64
+ cache_alignment : 64
+ address sizes : 36 bits physical, 48 bits virtual
+ power management:
+
+#+end_example
+
+- Compiler version
+ #+begin_src shell :results output :exports both :tangle get_info.sh
+ gcc --version
+ #+end_src
+
+ #+RESULTS:
+ : gcc (Debian 5.3.1-6) 5.3.1 20160114
+ : Copyright (C) 2015 Free Software Foundation, Inc.
+ : This is free software; see the source for copying conditions. There is NO
+ : warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ :
+
+- Libmpfr version:
+ #+begin_src shell :results output :exports both :tangle get_info.sh
+ apt-cache show libmpfr-dev
+ #+end_src
+
+ #+RESULTS:
+ #+begin_example
+ Package: libmpfr-dev
+ Source: mpfr4
+ Version: 3.1.5-1
+ Installed-Size: 1029
+ Maintainer: Debian GCC Maintainers
+ Architecture: amd64
+ Replaces: libgmp3-dev (<< 4.1.4-3)
+ Depends: libgmp-dev, libmpfr4 (= 3.1.5-1)
+ Suggests: libmpfr-doc
+ Breaks: libgmp3-dev (<< 4.1.4-3)
+ Description-en: multiple precision floating-point computation developers tools
+  This development package provides the header files and the symbolic
+  links to allow compilation and linking of programs that use the libraries
+  provided in the libmpfr4 package.
+  .
+  MPFR provides a library for multiple-precision floating-point computation
+  with correct rounding. The computation is both efficient and has a
+  well-defined semantics. It copies the good ideas from the
+  ANSI/IEEE-754 standard for double-precision floating-point arithmetic
+  (53-bit mantissa).
+ Description-md5: a2580b68a7c6f1fcadeefc6b17102b32
+ Multi-Arch: same
+ Homepage: http://www.mpfr.org/
+ Tag: devel::lang:c, devel::library, implemented-in::c, role::devel-lib,
+ suite::gnu
+ Section: libdevel
+ Priority: optional
+ Filename: pool/main/m/mpfr4/libmpfr-dev_3.1.5-1_amd64.deb
+ Size: 207200
+ MD5sum: e5c7872461f263e27312c9ef4f4218b9
+ SHA256: 279970e210c7db4e2550f5a3b7abb2674d01e9f0afd2a4857f1589a6947e0cbd
+
+#+end_example
+
+*** A first measurement
+#+begin_src shell :results output :exports both :tangle measure.sh
+cd /tmp/
+gcc -O3 a.c -o a
+./a 1000
+#+end_src
+
+#+RESULTS:
+: matrix product took 680ms
+: t=9062.368470
+
+#+begin_src shell :results output :exports both :tangle measure.sh
+cd /tmp/
+gcc -O3 d.c -o d -lmpfr
+./d 1000 53
+#+end_src
+
+#+RESULTS:
+: MPFR library: 3.1.5
+: MPFR header: 3.1.5 (based on 3.1.5)
+: matrix product took 74460ms
+: t=9062.368470
+
+So, on my machine, the ratio is more like
+#+begin_src R :results output :session *R* :exports both
+74460/844
+#+end_src
+
+#+RESULTS:
+: [1] 88.22275
+
+*** A second measurement
+That being said, if I re-run these two programs:
+
+#+begin_src shell :results output :exports both
+cd /tmp/
+gcc -O3 a.c -o a
+./a 1000
+#+end_src
+
+#+RESULTS:
+: matrix product took 676ms
+: t=9062.368470
+
+#+begin_src shell :results output :exports both
+cd /tmp/
+gcc -O3 d.c -o d -lmpfr
+./d 1000 53
+#+end_src
+
+#+RESULTS:
+: MPFR library: 3.1.5
+: MPFR header: 3.1.5 (based on 3.1.5)
+: matrix product took 68732ms
+: t=9062.368470
+
+I get a rather different value, which this time would give me a ratio of
+#+begin_src R :results output :session *R* :exports both
+68732/676
+#+end_src
+
+#+RESULTS:
+: [1] 101.6746
+
+that is, "closer" to what is reported in [2], but that is sheer luck: I
+could just as well have obtained 120! In short, this is not the same
+setup as yours but, statistically speaking, there should also be
+something to investigate here, shouldn't there?
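
Given the run-to-run variability above, a single ratio is not very informative. As a rough illustration of what "statistically speaking" could look like, the sketch below computes a mean ratio with a naive 95% confidence interval from repeated paired timings. Only the two timing pairs reported above are real; the other three pairs are made-up placeholders standing in for extra repetitions one would obtain by re-running measure.sh.

```python
import statistics

# Paired timings in ms for the double-precision (a.c) and MPFR (d.c)
# versions. The first two pairs come from the runs reported above; the
# remaining three are HYPOTHETICAL placeholders for extra repetitions.
a_ms = [680, 676, 690, 671, 685]
d_ms = [74460, 68732, 71210, 69980, 73150]

# One slowdown ratio per paired run.
ratios = [d / a for d, a in zip(d_ms, a_ms)]
mean = statistics.mean(ratios)
sd = statistics.stdev(ratios)

# Naive 95% confidence interval for the mean (normal approximation).
half_width = 1.96 * sd / len(ratios) ** 0.5
print(f"ratio = {mean:.1f} +/- {half_width:.1f}")
```

With only a handful of runs, a Student-t quantile would be more appropriate than 1.96, but the point stands: reporting a mean and a dispersion from several runs is far more convincing than a single ratio.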
+
+** References
+[1] Fousse, L., Hanrot, G., Lefèvre, V., Pélissier, P., and
+Zimmermann, P. MPFR: A multiple-precision binary floating-point
+library with correct rounding. ACM Trans. Math. Softw. 33, 2 (2007),
+article 13.
+
+[2] van der Hoeven, J. Multiple precision floating-point arithmetic on
+SIMD processors. In Proceedings of Arith’24 (2017), IEEE, pp. 2–9.
+
+Entered on [2017-09-01 ven. 17:12]
+
+* Emacs Setup :noexport:
+ This document has local variables in its postamble, which should
+ allow Org-mode (9) to work seamlessly without any setup. If you are
+ uncomfortable using such variables, you can safely ignore them at
+ startup. Exporting may require that you copy them into your .emacs.
+
+# Local Variables:
+# eval: (require 'org-install)
+# eval: (org-babel-do-load-languages 'org-babel-load-languages '((sh . t) (R . t) (perl . t) (python . t)))
+# eval: (setq org-confirm-babel-evaluate nil)
+# eval: (unless (boundp 'org-latex-classes) (setq org-latex-classes nil))
+# eval: (add-to-list 'org-latex-classes '("IEEEtran"
+# "\\documentclass[conference, 10pt, compsocconf]{IEEEtran}\n \[NO-DEFAULT-PACKAGES]\n \[EXTRA]\n \\usepackage{graphicx}\n \\usepackage{hyperref}" ("\\section{%s}" . "\\section*{%s}") ("\\subsection{%s}" . "\\subsection*{%s}") ("\\subsubsection{%s}" . "\\subsubsection*{%s}") ("\\paragraph{%s}" . "\\paragraph*{%s}") ("\\subparagraph{%s}" . "\\subparagraph*{%s}")))
+# eval: (setq org-alphabetical-lists t)
+# eval: (setq org-src-fontify-natively t)
+# eval: (add-to-list 'load-path ".")
+# eval: (add-to-list 'org-latex-packages-alist '("" "minted"))
+# eval: (setq org-latex-listings 'minted)
+# eval: (setq org-latex-pdf-process '("pdflatex -shell-escape -interaction nonstopmode -output-directory %o %f" "pdflatex -shell-escape -interaction nonstopmode -output-directory %o %f" "pdflatex -shell-escape -interaction nonstopmode -output-directory %o %f"))
+# End:
-- 
2.18.1