This guide explains how to submit jobs from a local R session to the GWDG High Performance Cluster (HPC) and retrieve the results in the same R session (a GWDG account is required). The main advantage is that working locally in an IDE (e.g. RStudio) is far more convenient than working in a Linux shell on the HPC. It is also unnecessary to copy data manually to the HPC and back. The structure of this guide is heavily influenced by our own mistakes, and we hope it makes things somewhat easier for future HPC users.
In general, our setup should also work with any other HPC, and with slight modifications the code snippets should be usable with scheduling systems other than SLURM (which the GWDG uses). An outdated version of this guide for LSF is available in the GitHub branch called LSF in this repo.
The user account must be activated before the HPC can be used. To request activation, send an e-mail to firstname.lastname@example.org.
The Linux shell must be set to /bin/ksh (and not /bin/sh). It is possible to set this in the user settings at www.gwdg.de.
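To confirm which login shell is active after changing the setting, a quick check can be run on the HPC login node. This is only a sketch: it assumes the SHELL environment variable reflects the login shell (which is usual for SSH logins), and is_ksh is a hypothetical helper name, not part of any GWDG tooling.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given path points to a ksh binary.
is_ksh() {
  case "$1" in
    */ksh) return 0 ;;
    *)     return 1 ;;
  esac
}

# Assumption: SHELL holds the login shell of the current session.
current_shell="${SHELL:-unknown}"

if is_ksh "$current_shell"; then
  echo "Login shell is ksh: $current_shell"
else
  echo "Login shell is $current_shell; change it to /bin/ksh in the user settings"
fi
```

If the check reports a different shell shortly after the change, logging out and back in may be needed before the new setting is picked up.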
For a convenient login and to use R via an SSH connector, an SSH key is needed (private key on the local computer, public key on the HPC).
On Linux, macOS, and Windows 10, it is straightforward to generate an SSH key and copy it to the HPC using the shell/terminal.
ssh-keygen -t rsa -b 2048 -f <yourkey>
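The command above writes two files: the private key at the path given to -f, and the public key at the same path with a .pub suffix. The following sketch illustrates this in a throwaway directory. The name demo_key, the helper pubkey_path, and the empty passphrase (-N "") are assumptions for illustration only; a real key should normally be protected with a passphrase.

```shell
#!/bin/sh
# Hypothetical helper: ssh-keygen stores the public key next to the
# private key, with ".pub" appended to the file name.
pubkey_path() {
  printf '%s.pub\n' "$1"
}

# Work in a temporary directory so no existing keys are touched.
keydir="$(mktemp -d)"
key="$keydir/demo_key"   # illustrative name, stands in for <yourkey>

if command -v ssh-keygen >/dev/null 2>&1; then
  # -N "" sets an empty passphrase and -q suppresses output (demo only).
  ssh-keygen -t rsa -b 2048 -f "$key" -N "" -q
  ls "$keydir"            # lists demo_key and demo_key.pub
fi

echo "Public key to place on the HPC: $(pubkey_path "$key")"
```

Only the .pub file is meant to leave the local computer; the private key stays where it was generated.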