=================
Coding
=================

General
-------

* Software developers who train for interviews use `Hackerrank `_, but it also has nice problems if you just want to improve your computational thinking and coding skills in general
* The `Common-Sense Guide to Data Structures and Algorithms: Level Up Your Core Programming Skills `_, written by Jay Wengrow, is a fast and easy read that introduces data structures, time complexity, and some important sorting and searching algorithms
* For file management I sometimes find it useful to use

  * `tree `_
  * and for copying files etc. `midnightcommander `_

* `Visualization of latency numbers every programmer should know `_ (why you want to be in L1 cache; also nice is the `humanized version with SSD random read as a "normal weekend" and L1 cache as a heartbeat `_)
* Bloom filters are interesting data structures for constant-time, probabilistic membership lookups in a compact way. I found this blog article instructive: https://prakhar.me/articles/bloom-filters-for-dummies/
* For SQL design and code the `WWW SQL Designer tool `_ is pretty useful
* `Best Practices for Scientific Computing `_ is a good introduction to software development for scientists

Python
-------

* `Why scientists should learn to program in Python `_ is a neat introduction to Python for natural scientists

Jupyter notebook tricks
```````````````````````

* Magics are great: :code:`%env` to list environment variables, :code:`!` to run shell commands, and :code:`%lsmagic` to get a list of all of them. You can even do `profiling `_. Another nice one is :code:`%pastebin`, with which you can select the line numbers you want to paste to pastebin
* ipython widgets can be nice for making simple interactive plots (e.g. for educational purposes). The `dominodatalab blog has a nice overview and an interactive ping plot `_
* `numpy-html `_ is quite neat for rendering numpy arrays in notebooks.

Jupyter notebooks on HPC environments
`````````````````````````````````````

Using configuration file
*************************

In some cases it can be useful, instead of running the ipython kernel on the head node, to submit it to a compute node of the cluster and then access the kernel from the browser (either from the head node or from your local machine). You can achieve this by changing some settings in your :code:`~/.jupyter/jupyter_notebook_config.py` file. You can create this file using :code:`jupyter notebook --generate-config`. In this file you can then modify the following: ::

    c.NotebookApp.ip = '*'
    c.NotebookApp.open_browser = False
    c.NotebookApp.port = XXXXX

where :code:`XXXXX` is a number greater than 8888. You might also want to set a password instead of a token (you can do this in the config file or by running :code:`jupyter-notebook password`). In your submission script you can then add :code:`hostname` to print the hostname, which you can then use to access the notebook at :code:`hostname:XXXXX`. In some cases you might also want to hardcode :code:`c.NotebookApp.ip` to the IP of a particular compute node and then simply bookmark this address.

Using tunneling
***************

Add the following lines to your submission script (in this case setting a password is really useful) ::

    unset XDG_RUNTIME_DIR
    NODEIP=$(hostname -i)
    NODEPORT=$(shuf -i 8888-9999 -n 1)
    echo $NODEIP:$NODEPORT
    jupyter-notebook --ip=$NODEIP --port=$NODEPORT --no-browser

:code:`shuf -i 8888-9999 -n 1` just picks a random port (binding to port 0 and letting the OS choose is better practice, but this is easier). You can then look into the output file that your scheduler produced, as it should contain :code:`NODEIP` and :code:`NODEPORT`, which you can then use to establish a tunnel connection from your local machine using ::

    ssh -N -L 8888:$NODEIP:$NODEPORT <user>@<headnode>

where :code:`<user>@<headnode>` stands for the login you normally use to reach the cluster. On your local machine you can now access the jupyter server at :code:`http://localhost:8888`.

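
As a rough sketch of that last step, the snippet below pulls the :code:`NODEIP:NODEPORT` line out of the scheduler output and assembles the matching ssh command. The function name :code:`build_tunnel_command`, the login :code:`alice@cluster.example.org` and the sample output are made up for illustration, and the sketch assumes that :code:`hostname -i` printed a single IPv4 address.

```python
import re
import shlex


def build_tunnel_command(scheduler_output, login, local_port=8888):
    """Build the ssh tunnel command from the first IP:PORT line in the output.

    `login` is a hypothetical user@host string for the cluster login node.
    """
    # Look for a line that consists only of an IPv4 address, a colon and a port,
    # which is what `echo $NODEIP:$NODEPORT` writes in the submission script.
    match = re.search(
        r"^(\d{1,3}(?:\.\d{1,3}){3}):(\d+)\s*$", scheduler_output, re.MULTILINE
    )
    if match is None:
        raise ValueError("no NODEIP:NODEPORT line found in scheduler output")
    node_ip, node_port = match.group(1), int(match.group(2))
    # Forward the local port to the notebook server on the compute node.
    return ["ssh", "-N", "-L", f"{local_port}:{node_ip}:{node_port}", login]


# Made-up scheduler output, only for illustration.
example_output = "job started\n10.0.3.17:9123\n[I NotebookApp] serving notebooks\n"
print(shlex.join(build_tunnel_command(example_output, "alice@cluster.example.org")))
```

Returning the command as a list keeps it ready for :code:`subprocess.run` without any shell quoting; :code:`shlex.join` (Python 3.8 or newer) is only used to print it in a copy-and-paste friendly form.
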
If you have to use Windows on your local machine, you can set up tunneling using MobaXTerm.

On some schedulers you may want to enable direct writing of output files. On the most recent version of PBS Pro, this is possible with :code:`qsub -koed`.

Note that in both approaches the kernel will of course die after the walltime is exceeded.

Speaking of jupyter notebooks, I like the `jupyterthemes package `_.

Online
``````

* The `talks by Raymond Hettinger `_ are always valuable and fun
* Great information is in `The Hitchhiker's Guide to Python `_
* `Bernd Klein `_ also has good information on advanced topics such as metaclasses or memoization with decorators

Faster Python
`````````````

* Consider trying `PyPy `_ instead of CPython (check the `benchmarks `_).
* Nice introduction to vectorization, numpy and numba (and to what the bottlenecks in Python are) by `Donald Whyte `_

C
--

Online
``````

* `Build your own lisp `_ is a nice way to get started with C and learn about lisps

Editors/IDEs
------------

VIM
```

Vim is a really powerful editor, but you need to spend some time learning and configuring it.

Configuration
*************

Some useful settings for the :code:`.vimrc` file are:

* syntax highlighting :code:`syntax enable` is probably self-explanatory
* search ::

    set incsearch     " lookahead search
    set ignorecase    " in most cases I want to be case-insensitive
    set smartcase     " unless I explicitly use uppercase
    set hlsearch      " highlight matches

* indentation ::

    set tabstop=4     " number of spaces per tab
    set expandtab     " convert tab key-presses to spaces in insert mode
    set shiftwidth=4  " indent by 4 spaces per shift (>>, <<, autoindent)
    set autoindent    " copy indent from current line when starting a new line
    set smartindent   " even better autoindent ('smart' insert after e.g. {)

* Persistent undo ::

    if has('persistent_undo')
        " Save all undo files in a single location (less messy, more risky)...
        set undodir=$HOME/.VIM_UNDO_FILES
        " Save a lot of back-history...
        set undolevels=5000
        " Actually switch on persistent undo
        set undofile
    endif

* I am paranoid, I want to lose at most 10 keystrokes ::

    set updatecount=10

* If you do not want to type all the search/replace syntax (vide infra), remap it ::

    nmap S :%s//g<LEFT><LEFT>

  now you need to type only ::

    SX/Y

  for global search/replace on all lines.

If you want to see a really crazy setup, check out `Damian Conway's vim setup `_. There you can also find how to create the `Star Wars intro in vim `_.

Plugins
*******

* `schlepp `_: makes it easier to move stuff in visual block mode
* `fatfinger `_: corrects common misspellings
* `python syntax highlighting `_
* `flake8 `_ for PEP8 style and error checking
* if you are used to :code:`<Tab>` completion, you might like `supertab `_
* `jedi-vim `_ for some nice Python autocompletion

Commands
*********

* Use :code:`$` to get to the end of the line
* Use different navigation levels :code:`b`, :code:`w`, :code:`{` and :code:`(`
* Search/Replace (:code:`g` replaces every occurrence in a line, not just the first)

  * all lines :code:`:%s/foo/bar/g`
  * this line :code:`:s/foo/bar/g`

PyCharm
```````

PyCharm is the IDE I use for larger Python projects, some useful features are:

Sublime
```````

Sublime is a lot faster than PyCharm and supports basically all languages. For setting it up, the `realpython blog `_ has some useful package recommendations (the package manager in particular is really good). In addition, I would recommend `PyYapf `_ and the `Flake8 linter `_

Development process
-------------------

Starting a project
``````````````````

The easiest way to start a (Python) project is to use a `cookiecutter `_ that creates the basic project structure and some configuration files for you. A nice one in the field of molecular simulations is the `cookiecutter for computational molecular sciences python packages `_

CI/CD
`````

Docker
******

On HPC environments, where you don't have root rights, `singularity `_ might be the way to go.
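There is also an `image to convert singularity images to docker images `_

Git(hub)
********

Pre-Commit
``````````

Documentation
`````````````

* `ReStructured Text Quickreference `_: useful when writing sphinx docs

Schedulers
``````````

Slurm
*****

* Kill all jobs: ::

    squeue -u $USER | awk '$1 ~ /^5/ {print $1}' | xargs -n 1 scancel

  replace :code:`5` with the number all your jobids start with. Note that :code:`squeue` and :code:`scancel` are Slurm commands (the SGE counterparts are :code:`qstat` and :code:`qdel`). If you really want to cancel every job you own, :code:`scancel -u $USER` is enough.
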
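
If you prefer to do the filtering in Python, a minimal sketch could look like the following. It matches the prefix against the start of the first column only, so a job ID that merely contains the digit somewhere is not selected. The function name :code:`select_job_ids` and the sample :code:`squeue` output are invented for illustration; the sketch assumes the default layout in which the job ID is the first column.

```python
def select_job_ids(squeue_output, prefix):
    """Return the job IDs (first column) that start with `prefix`."""
    job_ids = []
    for line in squeue_output.splitlines():
        fields = line.split()
        # The header line ("JOBID ...") and empty lines never match a numeric prefix.
        if fields and fields[0].startswith(prefix):
            job_ids.append(fields[0])
    return job_ids


# Made-up squeue output, only for illustration.
sample = """\
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
            512345     batch   md_run    alice  R      10:05      1 node017
            498765     batch   md_run    alice PD       0:00      1 (Priority)
            512399     batch  analyze    alice  R       1:15      1 node005
"""
print(select_job_ids(sample, "5"))
```

The returned IDs can then be passed to :code:`scancel`, for example through :code:`subprocess.run(["scancel", *job_ids])`.
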