# -*- coding: utf-8 -*- # -*- mode: org -*- #+TITLE: Org Mode Template for an IEEE Conference Article #+AUTHOR: John Doe, John Smith, and Jane Doe #+STARTUP: overview indent inlineimages logdrawer #+LANGUAGE: en #+OPTIONS: num:nil toc:t \n:nil @:t ::t |:t ^:nil -:t f:t *:t <:t #+OPTIONS: TeX:t LaTeX:t skip:nil d:nil todo:t pri:nil tags:not-in-toc #+OPTIONS: email:nil creator:nil timestamp:t #+TAGS: noexport(n) deprecated(d) #+EXPORT_SELECT_TAGS: export #+EXPORT_EXCLUDE_TAGS: noexport # # Default org-mode HTML style # #+HTML_HEAD: # # Shiny readthedocs HTML style #+HTML_HEAD: #+HTML_HEAD: #+HTML_HEAD: #+HTML_HEAD: #+HTML_HEAD: #+HTML_HEAD: # ### By default, all code chunks are being run when exporting. To # ### avoid this, simply remove the "# " of the next line. # #+PROPERTY: header-args :eval never-export #+LATEX_CLASS: IEEEtran #+LATEX_HEADER: \usepackage[T1]{fontenc} #+LATEX_HEADER: \usepackage[utf8]{inputenc} #+LATEX_HEADER: \usepackage{graphicx} #+LATEX_HEADER: \usepackage{xspace,ifthen} #+LATEX_HEADER: \usepackage{amsmath,amssymb} #+LATEX_HEADER: \usepackage[american]{babel} #+LATEX_HEADER: \usepackage{url} \urlstyle{sf} #+BEGIN_EXPORT latex \let\oldcite=\cite \renewcommand\cite[2][]{~\ifthenelse{\equal{#1}{}}{\oldcite{#2}}{\oldcite[#1]{#2}}\xspace} \let\oldref=\ref \def\ref#1{~\oldref{#1}\xspace} \sloppy #+END_EXPORT #+BEGIN_abstract Reproducible research has become increasingly important, just like parallel architectures so we propose a novel experimental study of a parallel implementation of the quicksort algorithm that builds on reproducible research technology. #+END_abstract * Getting the LaTeX packages :noexport: #+begin_src python :session python :results output :exports both import urllib.request import os.path def download_urls(url_list): for url,filename in url_list: if(not(os.path.isfile(filename))): urllib.request.urlretrieve(url, filename) download_urls([("http://mirrors.ctan.org/macros/latex/contrib/IEEEtran/IEEEtran.cls","IEEEtran.cls"), ("http://mirrors.ctan.org/macros/latex/contrib/IEEEtran/bibtex/IEEEtran.bst","IEEEtran.bst")]) #+end_src #+RESULTS: * Getting the data :noexport: #+begin_src python :session python :results output :exports both download_urls([("https://raw.githubusercontent.com/alegrand/M2R-ParallelQuicksort/master/data/sama_2014-10-13/measurements_03%3A47.txt","measurements.txt")]) #+end_src #+RESULTS: * Extracting traces from data files :noexport: #+begin_src python :results output raw :exports both :var input="measurements.txt" output="data.csv" import re,sys INPUT = open(input, "r") OUTPUT = open(output, "w") OUTPUT.writelines("Size, Seq, Par, Libc\n") for line in INPUT.readlines(): m = re.match('^Size: ([\d\.]*)$',line) if(m): size = m.group(1) continue m = re.match('^Sequential quicksort.*: ([\d\.]*) sec.$',line) if(m): seq = m.group(1) continue m = re.match('^Parallel quicksort.*: ([\d\.]*) sec.$',line) if(m): par = m.group(1) continue m = re.match('^Built-in quicksort.*: ([\d\.]*) sec.$',line) if(m): libc = m.group(1) OUTPUT.writelines(size+", "+seq+", "+par+", "+libc+"\n") print("[[file:"+output+"]]") #+end_src #+RESULTS: [[file:data.csv]] * R functions :noexport: Initialization and reading of the data. #+begin_src R :results output :session *R* :exports both df=read.csv("data.csv",strip.white=T,header=T) #+end_src #+RESULTS: Let's compute some (crappy) statistics. #+begin_src R :results output :session *R* :exports both sdf = data.frame() for(size in unique(df$Size)) { seq = mean(df[df$Size == size,]$Seq) par = mean(df[df$Size == size,]$Par) libc = mean(df[df$Size == size,]$Libc) sdf = rbind(sdf,data.frame(Size=size,Seq=seq,Par=par,Libc=libc)) } #+end_src #+RESULTS: * Introduction With the advent of parallel architecture, it is tempting to propose parallel implementations of classical operations. In this wonderful article, we report the performance gain of the well known quicksort algorithm, whose parallelization seems natural. * Context Quicksort (sometimes called partition-exchange sort) is an efficient sorting algorithm, serving as a systematic method for placing the elements of an array in order. Developed by Tony Hoare in 1959 with his work published in 1961\cite{qsort}, it is still a commonly used algorithm for sorting. Quicksort is a divide and conquer algorithm. Quicksort first divides a large array into two smaller sub-arrays: the low elements and the high elements. Quicksort can then recursively sort the sub-arrays. We propose to sort the sub-arrays in parallel on two threads and to bound the recursion level. * Related Work Well, this is such a brilliant and novel idea that none has worked on this cutting-edge topic. Parallel architecture barely existed in 1959, when Hoare proposed the sequential version of this algorithm\cite{qsort}. Some colleagues mentioned some obscure parallelization of the partitioning phase but it is too complex to implement and certainly too costly to provide any performance gain. * Methodology We used a Dell Latitude 6430u with 16Gb of RAM running a Debian with Linux 3.14.15. The CPU is an Intel(R) Core(TM) i7-3687U CPU @ 2.10GHz comprising two physical cores and hyperthreading. The =performance= frequency governor was used. We used FCC 5.3.1 with the following compilation flags: =-g -Wall -O2 -pthread -lrt=. Since we care a lot about reproducibility of our research, all the sources and detailed information regarding our experiments are provided on [[https://github.com/alegrand/M2R-ParallelQuicksort/blob/master/journal.org][Github]] #+LaTeX:\footnote{Or figshare, or zenodo, or whatever platform you prefer!}. We used the wonderful Org-mode format\cite{schulte11:_activ_docum_org_mode} to both document our experiments and create a replicable article that can be obtained on [[https://github.com/alegrand/RR_webinars/blob/master/1_replicable_article_laboratory_notebook/replicable/article.org][Github]] as well. We could have used the Ipython approach\cite{PER-GRA:2007} but we dishonestly ruled it out because we wanted to be in full control of our C experimental setup and did not know how to properly use it to write a replicable article. Finally, it is important to explain that each measurement was repeated five times and the set of experiments was absolutely not randomized, which may compromise the validity of our observations and conclusions. * Experimental results #+begin_src R :results graphics :file "figure.pdf" :exports none :width 6 :height 4 :session *R* # This code will not appear. It is just here to generate a nice figure to include later on. linetype <- c(1:3) plotchar <- c(1:3) plot(sdf$Size,sdf$Seq,log="x", type="b", lty=linetype[1], pch=plotchar[1]) legend( x="topleft", legend=c("Sequential execution","Parallel execution", "Default Libc execution"), lwd=1, lty=linetype, pch=plotchar) lines(sdf$Size,sdf$Par, type="b", lty=linetype[2], pch=plotchar[2]) lines(sdf$Size,sdf$Libc, type="b", lty=linetype[3], pch=plotchar[3]) #+end_src #+RESULTS: [[file:figure.pdf]] As we can see in Figure\ref{fig.comparison}, to benefit from the parallel version of our quicksort implementation, significantly large arrays (over a million entry) are required. Of course, it is difficult to conclude from so few measurements and a better experiment design would be useful. #+LaTeX: \vspace{-1cm} #+CAPTION: Comparing performances of several implementation of the quicksort algorithm\label{fig.comparison} #+ATTR_LaTeX: :width .8\linewidth file:figure.pdf #+LaTeX: \vspace{-.3cm} * Conclusion Exploiting parallel machine can be quite difficult and tedious. From our experience, such architecture are useful only when processing sufficiently large data sets. As a future work, we intend to consolidate our study with more experiments. #+Latex:\section*{Acknowledgments} This work is partially supported by the FOO and BAR projects. Experiments presented in this paper were carried out using our own experimental testbed with support of our university. #+LaTeX: \nocite{*} #+LaTeX: \def\raggedright{} #+LaTeX: \bibliographystyle{IEEEtran} #+LaTeX: \bibliography{biblio} * Emacs Setup :noexport: This document has local variables in its postembule, which should allow Org-mode to work seamlessly without any setup. If you're uncomfortable using such variables, you can safely ignore them at startup. Exporting may require that you copy them in your .emacs. # Local Variables: # eval: (add-to-list 'load-path ".") # eval: (require 'org-install) # eval: (org-babel-do-load-languages 'org-babel-load-languages '((R . t) (python . t) )) # eval: (setq org-babel-python-command (if (memq system-type '(windows-nt ms-dos)) "Python" "python3")) # eval: (require 'ess-site) # eval: (setq org-confirm-babel-evaluate nil) # eval: (unless (boundp 'org-latex-classes) (setq org-latex-classes nil)) # eval: (add-to-list 'org-latex-classes '("IEEEtran" # "\\documentclass[conference, 10pt]{IEEEtran}\n \[NO-DEFAULT-PACKAGES]\n \[EXTRA]\n \\usepackage{graphicx}\n \\usepackage{hyperref}" ("\\section{%s}" . "\\section*{%s}") ("\\subsection{%s}" . "\\subsection*{%s}") ("\\subsubsection{%s}" . "\\subsubsection*{%s}") ("\\paragraph{%s}" . "\\paragraph*{%s}") ("\\subparagraph{%s}" . "\\subparagraph*{%s}"))) # eval: (setq org-alphabetical-lists t) # eval: (setq org-src-fontify-natively t) # eval: (setq ess-ask-for-ess-directory nil) # eval: (setq org-latex-pdf-process '("%latex -interaction nonstopmode -output-directory %o %f" "bibtex %b" "%latex -interaction nonstopmode -output-directory %o %f" "%latex -interaction nonstopmode -output-directory %o %f")) # End: