@licstart The following is the entire license notice for the
JavaScript code in this tag.
Copyright (C) 2012-2019 Free Software Foundation, Inc.
The JavaScript code in this tag is free software: you can
redistribute it and/or modify it under the terms of the GNU
General Public License (GNU GPL) as published by the Free Software
Foundation, either version 3 of the License, or (at your option)
any later version. The code is distributed WITHOUT ANY WARRANTY;
without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE. See the GNU GPL for more details.
As additional permission under GNU GPL version 3 section 7, you
may distribute non-source (e.g., minimized or compacted) forms of
that code without the copy of the GNU GPL normally required by
section 4, provided you include this license notice and a URL
through which recipients can access the Corresponding Source.
@licend The above is the entire license notice
for the JavaScript code in this tag.
*/
<!--/*--><![CDATA[/*><!--*/
function CodeHighlightOn(elem, id)
{
  var target = document.getElementById(id);
  if (null != target) {
    elem.cacheClassElem = elem.className;
    elem.cacheClassTarget = target.className;
    target.className = "code-highlighted";
    elem.className = "code-highlighted";
  }
}
function CodeHighlightOff(elem, id)
{
  var target = document.getElementById(id);
  if (elem.cacheClassElem)
    elem.className = elem.cacheClassElem;
  if (elem.cacheClassTarget)
    target.className = elem.cacheClassTarget;
}
/*]]>*///-->
</script>
</head>
<body>
<div id="content">
<h1 class="title">Org document examples</h1>
<p>
In the MOOC video, I quickly demo how org-mode can be used in various
contexts. Here are the (sometimes trimmed) corresponding
org files. These documents depend on many external data files
and are not meant to be reproducible, but they will give
you an idea of how such documents can be organized:
</p>
<ol class="org-ol">
<li><a href="journal.html">journal.org</a>: an excerpt (I've only left a few code samples and links
to some resources on R, Stats, …) from my own journal. This is a
personal document where everything (meeting notes, hacking, random
thoughts, …) goes by default. Entries are created with the <code>C-c c</code>
shortcut.</li>
<li><a href="labbook_single.html">labbook_single.org</a>: this is an excerpt from the laboratory notebook
<a href="https://cornebize.net/">Tom Cornebize</a> wrote during his Master's thesis internship under my
supervision. This is a personal labbook. I consider this notebook
excellent: it had the ideal level of detail for us to communicate
without any ambiguity and for him to move forward with confidence.</li>
<li><a href="paper.html">paper.org</a>: this is an ongoing paper based on the previous labbook of
Tom Cornebize. As such, it is not reproducible, as there are hardcoded
paths and uncleaned dependencies, but writing it from the labbook was
very easy as we just had to cut and paste the parts we
needed. What may be interesting is the organization and the org
tricks to export to the right LaTeX style. As you may notice, at
the end of the document there is a commented section with Emacs
commands that are automatically executed when opening the file. It
is an effective way to depend less on the <code>.emacs/init.el</code>, which
everyone generally customizes differently.</li>
<li><a href="labbook_several.html">labbook_several.org</a>: this is a labbook for a specific project shared
by several people. As a consequence, it starts with information
about installation and common scripts, and has a section with notes about all
our meetings, a section with information about experiments, and
another one about analysis. Entries could have been labeled by who
wrote them, but there were only a few of us and this information was
available in git, so we did not bother. In such a labbook, it is common
to find annotations indicating that a given experiment was <code>:FLAWED:</code> because
it had some issues.</li>
<li><a href="technical_report.html">technical_report.org</a>: this is a short technical document I wrote
after a colleague sent me a PDF describing an experiment he was
conducting and asked me how reproducible I felt it was. It
turned out I had to cut and paste the C code from the PDF, then
remove all the line numbers and fix syntax, etc. Obviously I got
quite different performance results but writing everything in
org-mode made it very easy to generate both HTML and PDF and to
explicitly explain how the measurements were done.</li>
</ol>
<p>
Here are a few links to other kinds of examples:
</p>
<ul class="org-ul">
<li>Slides: all my slides for a series of lectures are available here:
<a href="https://github.com/alegrand/SMPE">https://github.com/alegrand/SMPE</a>. Here is a <a href="https://raw.githubusercontent.com/alegrand/SMPE/master/lectures/lecture_central_limit_theorem.org">typical source</a> and the
<h2 id="org5865c53"><span class="section-number-2">2</span> LaTeX IEEE title and authors   <span class="tag"><span class="ignore">ignore</span></span></h2>
In Section \ref{sec:relwork}, we explained that SMPI relies on the <i>online</i> simulation approach.
Since SimGrid is a sequential simulator, SMPI maps every MPI process of the application onto a
lightweight simulation thread. These threads are then run one at a
time, i.e., in mutual exclusion.
Every time a thread enters an MPI call,
SMPI takes control and the time that was spent
computing (isolated from the other threads) since the previous
MPI call can be injected into the simulator as a virtual delay.
</p>
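<p>
The accounting described above can be sketched in a few lines of C. This is only an illustration under simplifying assumptions, not SMPI's code: the <code>rank_state</code> structure and the <code>enter_mpi_call</code> and <code>leave_mpi_call</code> functions are hypothetical names, and timestamps are passed in explicitly instead of being read from a real clock.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-rank bookkeeping; these names are illustrative and
 * are not part of the SMPI API. */
typedef struct {
    double last_call_end;   /* time at which the previous MPI call returned */
    double simulated_clock; /* virtual time accumulated for this rank */
} rank_state;

/* Called when a rank enters an MPI call at time `now`.
 * The time spent computing since the previous MPI call is injected
 * into the simulated clock as a virtual delay, which is returned. */
static double enter_mpi_call(rank_state *r, double now)
{
    assert(r != NULL && now >= r->last_call_end);
    double delay = now - r->last_call_end;
    r->simulated_clock += delay;
    return delay;
}

/* Called when the MPI call returns at time `now`.
 * Restarting the measurement here keeps the time spent inside the
 * simulator out of the next computation delay. */
static void leave_mpi_call(rank_state *r, double now)
{
    assert(r != NULL);
    r->last_call_end = now;
}
```

<p>
Because the threads run in mutual exclusion, each measured interval covers the computation of a single rank only, which is what makes it meaningful as a per-rank delay.
</p>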
<p>
Mapping MPI processes to threads of a single
process effectively folds them into the same address space.
Consequently, global variables in the MPI application are shared
between threads unless these variables are <i>privatized</i> and the
simulated MPI ranks thus isolated from each other. Several
technical solutions are possible to handle this issue \cite{smpi}. The
default strategy in SMPI consists of making a copy of the <code>data</code>
segment (containing all global variables) per MPI rank at startup and,
when context switching to another rank, remapping the <code>data</code> segment via <code>mmap</code> to the private copy of that rank.
SMPI also implements another mechanism relying on the <code>dlopen</code>
function that saves calls to <code>mmap</code> when context switching.
</p>
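<p>
To make the remapping idea concrete, here is a minimal sketch for POSIX systems. It does not touch the real <code>data</code> segment: a single page stands in for it, and the <code>private_segment</code> type and the <code>segment_init</code> and <code>segment_switch</code> functions are hypothetical names introduced for this illustration. Each rank owns one page of a temporary file, and a context switch remaps the same virtual address onto that page with <code>mmap</code> and <code>MAP_FIXED</code>.
</p>

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Illustrative stand-in for a privatized data segment (one page only). */
typedef struct {
    FILE  *backing; /* temporary file holding one private page per rank */
    size_t page;    /* page size in bytes */
    int   *window;  /* fixed virtual address used by the code of every rank */
} private_segment;

/* Create one zero-filled private page per rank and map rank 0.
 * Returns 0 on success, -1 on failure. */
static int segment_init(private_segment *s, int nranks)
{
    assert(s != NULL && nranks > 0);
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    s->page = (size_t)page;
    s->backing = tmpfile();
    if (s->backing == NULL)
        return -1;
    int fd = fileno(s->backing);
    if (ftruncate(fd, (off_t)(s->page * (size_t)nranks)) != 0) {
        fclose(s->backing);
        return -1;
    }
    void *p = mmap(NULL, s->page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        fclose(s->backing);
        return -1;
    }
    s->window = p;
    return 0;
}

/* Context switch: remap the window onto the private copy of `rank`.
 * The virtual address stays the same; only the backing page changes. */
static int segment_switch(private_segment *s, int rank)
{
    void *p = mmap(s->window, s->page, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_FIXED, fileno(s->backing),
                   (off_t)(s->page * (size_t)rank));
    return p == (void *)s->window ? 0 : -1;
}
```

<p>
After <code>segment_switch(&amp;s, r)</code>, reads and writes through <code>s.window</code> only affect the copy of rank <code>r</code>, so a value written by one rank is not seen by the others. Every switch costs one <code>mmap</code> call, which is the overhead the <code>dlopen</code>-based mechanism mentioned above is meant to avoid.
</p>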
<p>
Since the whole parallel application is executed on a single node,
online simulation is expensive in terms of both simulation time and memory.
To deal with this, SMPI provides two simple annotation mechanisms:
</p>
<ul class="org-ul">
<li><b>Kernel sampling</b>: Control flow is in many cases
independent of the computation results. This allows
computation-intensive kernels (e.g., BLAS kernels for HPL)
to be skipped during the simulation. For this purpose, SMPI
supports annotation of regular kernels through several macros
such as <code>SMPI_SAMPLE_LOCAL</code> and <code>SMPI_SAMPLE_GLOBAL</code>. The regularity allows SMPI to execute these
kernels a few times, estimate their cost, and skip the kernel in
the future by deriving its cost from these samples, hence cutting
simulation time significantly. Skipping kernels renders the
content of some variables invalid, but in simulation, only the
behavior of the application, not the correctness of computation
results, is of concern.</li>
<li><b>Memory folding</b>: SMPI provides the <code>SMPI_SHARED_MALLOC</code> (<code>SMPI_SHARED_FREE</code>) macro to
replace calls to <code>malloc</code> (<code>free</code>). These macros indicate that some data structures can safely be
shared between processes, that the data they contain is not
critical for the execution (e.g., an input matrix), and that it may
even be overwritten.
<code>SMPI_SHARED_MALLOC</code> works as follows (see Figure \ref{fig:global_shared_malloc}): a single block of physical memory (of default size 1 MB) for the whole
execution is allocated and shared by all MPI processes.
A range of virtual addresses corresponding to the requested size is reserved and cyclically mapped onto this
physical block.
This mechanism allows applications to obtain a nearly constant memory
footprint, regardless of the size of the actual allocations.</li>
</ul>
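<p>
The cyclic mapping used for memory folding can be illustrated with a small POSIX sketch. It is written under simplifying assumptions and is not the SMPI implementation: <code>folded_alloc</code> is a hypothetical helper, the block size is a parameter instead of the 1 MB default, and the requested size must be a multiple of the block size.
</p>

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not the SMPI implementation: returns `size`
 * bytes of virtual memory that are all backed by one physical block of
 * `block` bytes. `block` must be a multiple of the page size and
 * `size` a nonzero multiple of `block`. Returns NULL on failure. */
static void *folded_alloc(size_t size, size_t block)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || block == 0 || size == 0 ||
        block % (size_t)page != 0 || size % block != 0)
        return NULL;

    /* The single physical block lives in an unnamed temporary file. */
    FILE *f = tmpfile();
    if (f == NULL)
        return NULL;
    int fd = fileno(f);
    if (ftruncate(fd, (off_t)block) != 0) {
        fclose(f);
        return NULL;
    }

    /* Reserve the whole virtual range without committing memory. */
    char *base = mmap(NULL, size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        fclose(f);
        return NULL;
    }

    /* Cyclically map every block-sized slice onto the same file block. */
    for (size_t off = 0; off < size; off += block) {
        void *p = mmap(base + off, block, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_FIXED, fd, 0);
        if (p == MAP_FAILED) {
            munmap(base, size);
            fclose(f);
            return NULL;
        }
    }

    fclose(f); /* the existing mappings keep the block alive */
    return base;
}
```

<p>
With this layout, a write at offset <code>k</code> is visible at every offset <code>k + i * block</code>, so the content of the buffer is meaningless. That is acceptable only for data whose values do not influence the control flow, as stated above.
</p>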
Experiments presented in this paper were carried out using the Grid'5000 testbed, supported by a scientific interest group hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations (see <a href="https://www.grid5000.fr">https://www.grid5000.fr</a>).
We warmly thank our TACC colleagues for their support in this study and
for providing us with as much information as they could.