system to make everyone's life easier. All these design choices tend
to make reproducibility somewhat painful with Python, even though
the community is slowly taking this issue into account.
[[https://en.wikipedia.org/wiki/R_(programming_language)][R]], in comparison, is much closer (in terms of developer community) to
languages like [[https://en.wikipedia.org/wiki/SAS_(software)][SAS]], which is heavily used in the pharmaceutical
industry, where statistical procedures need to be standardized and rock
solid. R is of course not immune to changes that break older
code and hinder reproducibility and backward compatibility. Here is a
relatively recent [[http://members.cbio.mines-paristech.fr/~thocking/HOCKING-reproducible-research-with-R.html][true story about this]]. In addition, some colleagues who worked
on the [[https://www.fun-mooc.fr/courses/UPSUD/42001S06/session06/about][introductory statistics course with R on FUN]] reported
several issues with a few functions (=plotmeans= from
=gplots=, =survfit= from =survival=, or =hclust=) whose default
parameters had changed over the years. It is thus probably good
practice to state default values explicitly in your code (which
can be cumbersome) and to restrict your dependencies as much as
possible.
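The advice about default values can be illustrated with a minimal sketch, written here in Python although the same idea applies to R. It assumes a hypothetical function =hierarchical_clustering= (a stand-in, not a real library call) and uses the standard =inspect= module to record the defaults currently in effect, then calls the function with every option spelled out.

```python
import inspect


def hierarchical_clustering(points, method="complete", metric="euclidean"):
    """Hypothetical stand-in for a library function whose defaults may change."""
    return {"n_points": len(points), "method": method, "metric": metric}


def default_arguments(func):
    """Return the parameters of func that have a default value, with that value."""
    signature = inspect.signature(func)
    return {
        name: parameter.default
        for name, parameter in signature.parameters.items()
        if parameter.default is not inspect.Parameter.empty
    }


# Record the defaults in effect today, e.g. in a log or a notebook cell.
recorded_defaults = default_arguments(hierarchical_clustering)

# Call with every option spelled out, so that a future change of defaults
# in the library cannot silently alter the result.
result = hierarchical_clustering(
    [1.0, 2.5, 4.0], method="complete", metric="euclidean"
)
```

Keeping the recorded defaults next to the results makes it easier to spot, years later, which default changed between two versions of a dependency.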
This being said, the R development community is generally quite
careful about stability. We (the authors of this MOOC) think that open
source (which makes it possible to inspect how a computation is done and to
identify both mistakes and sources of non-reproducibility) is more
important than the rock-solid stability of SAS, which is proprietary
software. Yet, if you really need to stay with SAS (similar solutions
probably exist for other languages as well), you should know that SAS
can be used within Jupyter using either the [[https://sassoftware.github.io/sas_kernel/][Python SASKernel]] or the
[[https://sassoftware.github.io/saspy/][Python SASPy]] package (step-by-step explanations are given
[[https://app-learninglab.inria.fr/gitlab/85bc36e0a8096c618fbd5993d1cca191/mooc-rr/blob/master/documents/tuto_jupyter_windows/tuto_jupyter_windows.md][here]]). Combining such a literate programming approach with systematic
version control and environment control will help in any case.
** Controlling your software environment
As we mentioned in the video sequences, there are several solutions to
control your environment:
- The easy (preserve the mess) ones: [[http://www.pgbovine.net/cde.html][CDE]] or [[https://vida-nyu.github.io/reprozip/][ReproZip]]
- The more demanding (encourage cleanliness) ones, where you start with a
clean environment and install only what's strictly necessary (and document it):