When taking notes, it may be difficult to remember which version of the code or of a file was used. This is what version control is useful for. Here are a few useful commands that we typically insert in shell cells at the top of our notebooks:
git log -1
commit 741b0088af5b40588493c23c46d6bab5d0adeb33
Author: Arnaud Legrand <arnaud.legrand@imag.fr>
Date:   Tue Sep 4 12:45:43 2018 +0200

    Fix a few typos and provide information on jupyter-git plugins.
git status
On branch master
Your branch is ahead of 'origin/master' by 4 commits.
  (use "git push" to publish your local commits)

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   resources.org

Untracked files:
  (use "git add <file>..." to include in what will be committed)

        ../../module2/ressources/replicable_article/IEEEtran.bst
        ../../module2/ressources/replicable_article/IEEEtran.cls
        ../../module2/ressources/replicable_article/article.bbl
        ../../module2/ressources/replicable_article/article.tex
        ../../module2/ressources/replicable_article/data.csv
        ../../module2/ressources/replicable_article/figure.pdf
        ../../module2/ressources/replicable_article/logo.png
        .#resources.org

no changes added to commit (use "git add" and/or "git commit -a")
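If the notebook uses a Python kernel rather than shell cells, the same information can be collected from Python. The following is only a minimal sketch, assuming that the git executable is on the PATH; the helper names (run_git, parse_porcelain, describe_checkout) are made up for this example, and it relies on git status --porcelain, which prints a two-character status code followed by the path.

```python
import subprocess


def run_git(args, cwd="."):
    """Return the stdout of `git <args>`, or None if the command fails."""
    try:
        result = subprocess.run(
            ["git", *args], cwd=cwd, capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        # git is missing, cwd does not exist, or git returned an error
        return None
    return result.stdout


def parse_porcelain(text):
    """Split `git status --porcelain` output into (changed, untracked) paths."""
    changed, untracked = [], []
    for line in text.splitlines():
        if not line.strip():
            continue
        code, path = line[:2], line[3:]  # "XY <path>"
        if code == "??":
            untracked.append(path)
        else:
            changed.append(path)
    return changed, untracked


def describe_checkout(cwd="."):
    """Summarize the current commit and how far the working tree is from it."""
    commit = run_git(["rev-parse", "HEAD"], cwd)
    status = run_git(["status", "--porcelain"], cwd)
    if commit is None or status is None:
        return None  # not a git repository, no commit yet, or git not installed
    changed, untracked = parse_porcelain(status)
    return "commit {} ({} changed, {} untracked)".format(
        commit.strip(), len(changed), len(untracked)
    )


print(describe_checkout())
```

A summary that reports zero changed files is a hint that the notes were produced from exactly the commit that is printed.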
I also often include commands at the end of my notebook that show how to commit the results (adding the new files, committing with a clear message, and pushing), e.g.:
git add resources.org
git commit -m "Completing the section on getting Git information"
git push
[master 514fe2c1] Completing the section on getting Git information
 1 file changed, 61 insertions(+)
Counting objects: 25, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (20/20), done.
Writing objects: 100% (25/25), 7.31 KiB | 499.00 KiB/s, done.
Total 25 (delta 11), reused 0 (delta 0)
To ssh://app-learninglab.inria.fr:9418/learning-lab/mooc-rr-ressources.git
   6359f8c..1f8a567  master -> master
Obviously, in this case you need to save the notebook before running this cell, so the output of this final command (with the new git hash) will not be stored in the committed version of the notebook. This is not really a problem and is the price to pay for running git from within the notebook itself.
https://stackoverflow.com/questions/20180543/how-to-check-version-of-python-modules
pip3 freeze
asn1crypto==0.24.0
attrs==17.4.0
bcrypt==3.1.4
beautifulsoup4==4.6.0
bleach==2.1.3
...
pandas==0.22.0
pandocfilters==1.4.2
paramiko==2.4.0
patsy==0.5.0
pexpect==4.2.1
...
traitlets==4.3.2
tzlocal==1.5.1
urllib3==1.22
wcwidth==0.1.7
webencodings==0.5
pip3 show pandas
echo " "
pip3 show statsmodels
Name: pandas
Version: 0.22.0
Summary: Powerful data structures for data analysis, time series,and statistics
Home-page: http://pandas.pydata.org
Author: None
Author-email: None
License: BSD
Location: /usr/lib/python3/dist-packages
Requires:

Name: statsmodels
Version: 0.9.0
Summary: Statistical computations and models for Python
Home-page: http://www.statsmodels.org/
Author: None
Author-email: None
License: BSD License
Location: /home/alegrand/.local/lib/python3.6/site-packages
Requires: patsy, pandas
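The version that pip3 show reports can also be queried from Python itself, which is convenient in a notebook without shell cells. This is a small sketch, assuming Python 3.8 or later (importlib.metadata is not available in the Python 3.6 used for the outputs above); installed_version and report are names chosen for this example only.

```python
from importlib import metadata


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def report(names):
    """Map each requested distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in names}


for name, version in report(["pandas", "statsmodels"]).items():
    print(name, version if version is not None else "(not installed)")
```

Unlike the __version__ attribute, this reads the metadata recorded at installation time, so it does not require importing the package.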
Inspired by StackOverflow, here is a simple function that lists the loaded packages that have a __version__ attribute (which is unfortunately not completely standard).
import sys

def print_imported_modules():
    # Print name and version of every loaded module that exposes __version__
    for name, val in sorted(sys.modules.items()):
        if hasattr(val, '__version__'):
            print(val.__name__, val.__version__)

print("**** Package list in the beginning ****")
print_imported_modules()
print("**** Package list after loading pandas ****")
import pandas
print_imported_modules()
**** Package list in the beginning ****
**** Package list after loading pandas ****
_csv 1.0
_ctypes 1.1.0
decimal 1.70
argparse 1.1
csv 1.0
ctypes 1.1.0
cycler 0.10.0
dateutil 2.7.3
decimal 1.70
distutils 3.6.5rc1
ipaddress 1.0
json 2.0.9
logging 0.5.1.2
matplotlib 2.1.1
numpy 1.14.5
numpy.core 1.14.5
numpy.core.multiarray 3.1
numpy.core.umath b'0.4.0'
numpy.lib 1.14.5
numpy.linalg._umath_linalg b'0.1.5'
pandas 0.22.0
_libjson 1.33
platform 1.0.8
pyparsing 2.2.0
pytz 2018.5
re 2.2.1
six 1.11.0
urllib.request 3.6
zlib 1.0
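When many modules are already loaded, it may be easier to look only at what a given import added. The variant below is a sketch of that idea and not part of the StackOverflow answer: it returns dictionaries instead of printing, keys them by the name used in sys.modules (not by the module's own __name__), and compares two snapshots. The names module_versions and added_versions are invented for this example, and json is only a stand-in for whichever package you load.

```python
import sys


def module_versions(modules=None):
    """Map module name -> version string for modules exposing __version__."""
    if modules is None:
        modules = sys.modules
    versions = {}
    for name, module in sorted(modules.items()):
        try:
            version = getattr(module, "__version__", None)
        except Exception:
            continue  # some lazily loaded modules fail on attribute access
        if version is not None:
            versions[name] = str(version)
    return versions


def added_versions(before, after):
    """Return the entries of `after` whose module name is absent from `before`."""
    return {name: version for name, version in after.items() if name not in before}


before = module_versions()
import json  # stand-in for the package of interest
after = module_versions()
for name, version in added_versions(before, after).items():
    print(name, version)
```

If the package was already loaded before the first snapshot, nothing is printed, which is the expected behaviour of a difference.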
The easiest way to save a Python environment and restore it later is as follows:
pip3 freeze > requirements.txt     # obtain the list of packages with their versions
pip3 install -r requirements.txt   # install that list of packages, possibly on another machine
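Once a requirements.txt exists, it can be useful to check, at the top of a notebook, whether the running environment still matches it. The following sketch assumes Python 3.8 or later (for importlib.metadata) and only understands simple name==version lines such as those written by pip3 freeze; parse_pins and find_mismatches are hypothetical helper names, and the sample pins are taken from the pip3 freeze output shown earlier.

```python
from importlib import metadata


def parse_pins(text):
    """Extract {name: version} from the `name==version` lines of a requirements file."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        line = line.split(";", 1)[0].strip()  # drop environment markers
        if "==" not in line:
            continue  # blank line, comment, or unpinned requirement
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (wanted, found)} for packages whose installed version differs."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


sample = "pandas==0.22.0\npatsy==0.5.0\n"
for name, (wanted, found) in find_mismatches(parse_pins(sample)).items():
    print("{}: wanted {}, found {}".format(name, wanted, found))
```

In practice the text would be read from the requirements.txt file itself; an empty result means that every pinned package is installed with the recorded version.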
If you want to have several Python environments installed side by side, you may want to use Pipenv. I doubt, however, that it correctly tracks the FORTRAN or C dynamic libraries that are wrapped by Python packages.
In R, the best way to list the loaded packages and their versions seems to be to rely on the devtools package.
sessionInfo()
devtools::session_info()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux buster/sid

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0

locale:
 [1] LC_CTYPE=fr_FR.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=fr_FR.UTF-8        LC_COLLATE=fr_FR.UTF-8
 [5] LC_MONETARY=fr_FR.UTF-8    LC_MESSAGES=fr_FR.UTF-8
 [7] LC_PAPER=fr_FR.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.5.1

Session info ------------------------------------------------------------------
 setting  value
 version  R version 3.5.1 (2018-07-02)
 system   x86_64, linux-gnu
 ui       X11
 language (EN)
 collate  fr_FR.UTF-8
 tz       Europe/Paris
 date     2018-08-01

Packages ----------------------------------------------------------------------
 package   * version date       source
 base      * 3.5.1   2018-07-02 local
 compiler    3.5.1   2018-07-02 local
 datasets  * 3.5.1   2018-07-02 local
 devtools    1.13.6  2018-06-27 CRAN (R 3.5.1)
 digest      0.6.15  2018-01-28 CRAN (R 3.5.0)
 graphics  * 3.5.1   2018-07-02 local
 grDevices * 3.5.1   2018-07-02 local
 memoise     1.1.0   2017-04-21 CRAN (R 3.5.1)
 methods   * 3.5.1   2018-07-02 local
 stats     * 3.5.1   2018-07-02 local
 utils     * 3.5.1   2018-07-02 local
 withr       2.1.2   2018-03-15 CRAN (R 3.5.0)
Some actually advocate that a reproducible research compendium can be written as an R package. Those of you who want clean R dependency management should thus have a look at Packrat.