Coding

General

Python

Jupyter notebook tricks

  • Magics are great: %env lists environment variables, ! runs shell commands, and %lsmagic lists all available magics. You can even do profiling. Another nice one is %pastebin, with which you can select the line numbers that you want to paste to a pastebin.
  • ipython widgets can be nice for making simple interactive plots (e.g. for educational purposes). The Domino Data Lab blog has a nice overview and an interactive ping plot.
  • numpy-html is quite neat for rendering numpy arrays in notebooks.

Jupyter notebooks on HPC environments

Using a configuration file

In some cases it can be useful to submit the ipython kernel to a compute node of the cluster instead of running it on the head node, and then access it from the browser (either from the head node or from your local machine).

You can achieve this by adjusting a few settings in your ~/.jupyter/jupyter_notebook_config.py file. You can create this file using jupyter notebook --generate-config.

In this file you then can modify the following:

c.NotebookApp.ip = '*'
c.NotebookApp.open_browser = False
c.NotebookApp.port = XXXXX

Here XXXXX is a port number greater than 8888. You might also want to set a password instead of a token (you can do this in the config file or by running jupyter-notebook password). In your submission script you can then add hostname to print the hostname of the compute node, and then access the notebook at hostname:XXXXX. In some cases you might also want to hardcode c.NotebookApp.ip to the IP of a particular compute node and simply bookmark this address.

Using tunneling

Add the following lines to your submission script (in this case setting a password is really useful):

unset XDG_RUNTIME_DIR
NODEIP=$(hostname -i)
NODEPORT=$(shuf -i 8888-9999 -n 1)
echo $NODEIP:$NODEPORT

jupyter-notebook --ip=$NODEIP --port=$NODEPORT --no-browser

shuf -i 8888-9999 -n 1 simply picks a random port (binding to port 0 and letting the OS choose is better practice, but this is easier). The output file produced by your scheduler should then contain NODEIP and NODEPORT, which you can use to establish a tunnel connection from your local machine using

ssh -N -L 8888:$NODEIP:$NODEPORT <username>@<machinename>

On your local machine you can now access the Jupyter server at http://localhost:8888. If you have to use Windows on your local machine, you can set up the tunnel using MobaXterm. On some schedulers you may want to enable direct writing of output files; in the most recent version of PBS Pro this is possible with qsub -koed.
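
If you do not want to copy the address by hand, the lookup in the scheduler output can be scripted. The following is only a minimal sketch: the function name tunnel_command and the login string are made up for illustration, and it assumes that hostname -i printed a single IPv4 address, so that the output file contains a line of the form IP:PORT.

```python
import re
import shlex

# Matches a line that consists only of "<IPv4>:<port>", as printed by
# `echo $NODEIP:$NODEPORT` in the submission script (assumption: one IPv4 address).
_ADDRESS_LINE = re.compile(r"^\s*(\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5})\s*$")


def tunnel_command(scheduler_output, login, local_port=8888):
    """Return the ssh tunnel command for the first IP:PORT line found.

    `login` is the hypothetical "<username>@<machinename>" of the head node.
    """
    for line in scheduler_output.splitlines():
        match = _ADDRESS_LINE.match(line)
        if match:
            node_ip, node_port = match.groups()
            forward = f"{local_port}:{node_ip}:{node_port}"
            return " ".join(["ssh", "-N", "-L", forward, shlex.quote(login)])
    raise ValueError("no IP:PORT line found in the scheduler output")
```

You would pass it the text of the scheduler output file and paste the returned command into a terminal on your local machine.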

Note that in both approaches the kernel will of course die after the walltime is exceeded.
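
The remark about binding to port 0 can be sketched in a few lines of Python. This is an illustration, not what the submission script above does, and find_free_port is a hypothetical helper name. Keep in mind that the port is released again when the socket is closed, so in principle another process could grab it before Jupyter starts.

```python
import socket


def find_free_port(host=""):
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))          # port 0: the OS picks a free port
        return sock.getsockname()[1]  # the port that was actually assigned


if __name__ == "__main__":
    print(find_free_port())
```

Assuming the sketch is saved under the (made-up) name free_port.py, something like NODEPORT=$(python free_port.py) could then replace the shuf call in the submission script.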

Speaking of Jupyter notebooks, I like the jupyterthemes package.

Online

Faster Python

  • Consider trying PyPy instead of CPython (check the benchmarks).
  • Nice introduction to vectorization, numpy and numba (and to the bottlenecks in Python) by Donald Whyte.

C

Online

Editors/IDEs

VIM

Vim is a really powerful editor, but you need to spend some time learning and configuring it.

Configuration

Some useful settings for the .vimrc file are:

  • Syntax highlighting: syntax enable is probably self-explanatory

  • search

    set incsearch           " lookahead search
    set ignorecase          " in most cases I want to be case-insensitive
    set smartcase           " unless I explicitly use uppercase
    set hlsearch            " highlight matches
    
  • Indentation

    set tabstop=4           " number of spaces per <TAB>
    set expandtab           " convert <TAB> key-presses to spaces in insert mode
    set shiftwidth=4        " number of spaces used for each step of (auto)indent
    
    set autoindent          " copy indent from current line when starting a new line
    set smartindent         " even better autoindent ('smart' insert after e.g. {)
    
  • Persistent undo

    if has('persistent_undo')
      " Save all undo files in a single location (less messy, more risky)...
      set undodir=$HOME/.VIM_UNDO_FILES
    
      " Save a lot of back-history...
      set undolevels=5000
    
      " Actually switch on persistent undo
      set undofile
    
    endif
    
  • I am paranoid, I want to lose at most 10 keystrokes

    set updatecount=10
    
  • If you do not want to type the full search/replace syntax (vide infra), remap it

    nmap  S  :%s//g<LEFT><LEFT>
    

    Now you only need to type

    SX/Y<CR>
    

    for global search/replace on all lines.

If you want to see a really crazy setup, check out Damian Conway’s vim setup. There you can also find how to create the Star Wars intro in vim.

Plugins

Commands

  • Use $ to get to the end of the line

  • Use the different navigation levels: b and w (words), ( (sentences) and { (paragraphs)

  • Search/Replace (g means global)

    • all lines :%s/foo/bar/g
    • this line :s/foo/bar/g

PyCharm

PyCharm is the IDE I use for larger python projects, some useful features are:

Sublime

Sublime is a lot faster than PyCharm and supports basically all languages. For setting it up, the Real Python blog has some useful package recommendations (the package manager in particular is really good). In addition to that, I would recommend PyYapf and the Flake8 linter.

Development process

Starting a project

The easiest way to start a (Python) project is to use a cookiecutter that creates the basic project structure and some configuration files for you. A nice one in the field of molecular simulations is the cookiecutter for computational molecular sciences Python packages.

CI/CD

Docker

On HPC environments, where you don’t have root rights, Singularity might be the way to go. There is also an image to convert singularity images to docker images.

Git(hub)

Pre-Commit

Documentation

Schedulers

SGE

  • Kill all jobs:

    squeue -u $USER | awk '$1 ~ /^5/ {print $1}' | xargs -n 1 scancel
    

    Replace 5 with the number all your job IDs start with. Note that squeue and scancel are Slurm commands; on SGE the counterparts are qstat and qdel.
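
The selection step of this pipeline can also be written out in Python, which makes it explicit that only the job ID column should be compared. A plain text match on the digit would also hit lines where it appears in another column, such as the job name or the runtime. This is just a sketch: job_ids_with_prefix is a made-up name, and it assumes the default squeue layout with the job ID in the first column.

```python
def job_ids_with_prefix(squeue_output, prefix):
    """Return job IDs from the first column that start with `prefix`."""
    job_ids = []
    for line in squeue_output.splitlines():
        fields = line.split()
        # Skip blank lines and the header line (first column "JOBID").
        if not fields or not fields[0][0].isdigit():
            continue
        if fields[0].startswith(prefix):
            job_ids.append(fields[0])
    return job_ids
```

The returned IDs could then be passed to scancel one by one, as xargs does above.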