Cluster

The sun HPC cluster consists of 24 computing nodes. Two different types of nodes are used:

This makes a total of 1280 processors and 1280 GB of RAM.

The operating system used is the Ubuntu-based Qlustar. Qlustar has a very useful usage manual here.

Both the cluster and the workstations boot OS images over the network. The base for both images is the same; for workstations, additional packages (e.g. for the desktop environment) are added to the OS images.

Both the cluster and workstation OSes mount central file systems over the network, making them available to the user in exactly the same way from every workstation as well as from the computing nodes.

Login Nodes

sun - Spectacular Userlogin Node

This is the main access point to the cluster. Log in using ssh at

username@sun.iek.fz-juelich.de

Use it to manage your jobs in the queue and for similar management tasks.
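
For example, from a terminal on your local machine (replace username with your own account name):

ssh username@sun.iek.fz-juelich.de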

To transfer data to /data/ please use dam! To transfer data to /home/ please use fire!

dam - Data Access and analysis Machine

This is direct access to the machine that holds the /data folder. Log in via

username@dam.iek.fz-juelich.de

Use this node if you

  • want to run analysis directly on the data in /data/ or
  • want to upload/download a lot of data to/from /data/ or generally for data transfer (see the example below), or
  • want to use SFTP to access your data!

Don't use this for building or running simulation software. Development libraries are not available.
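
A minimal transfer sketch using rsync/scp over ssh (the directory and file names are only placeholders; for /home/ use fire instead of dam):

# upload a results directory to your folder in /data/
rsync -av ./results/ username@dam.iek.fz-juelich.de:/data/username/results/
# download a single file from /data/
scp username@dam.iek.fz-juelich.de:/data/username/output.dat .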

fire - Also a machine

This is direct access to the machine that holds the /home and /apps folders. Log in via

username@fire.iek.fz-juelich.de

Don't use this for building or running simulation software. Development libraries are not available.

Folders

There are three special folders on all the login nodes and workstations in Nürnberg:

folder | served by | use for | user-quota
$WORK or /data/$USER | dam | simulation/analysis data | 200G
$HOME or /home/$USER | fire | your regular home folder: documents, day-to-day, etc. | 20G
/apps/local | fire | applications, binaries, compilers not in the OS repo 1) (e.g. anaconda, TotalView, Mathematica, MATLAB, Maple) | -

Be aware that, by default, all other users can read and list the contents of your directories. If you don't want this, you have to change the permissions of your files yourself using chmod.
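
A minimal sketch for locking down a directory with chmod (the directory name is just an example):

# remove all permissions for group and others, recursively
chmod -R go-rwx $HOME/private-project
# verify the result
ls -ld $HOME/private-project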

Backup snapshots

Under /homesnaps/ you find 8 subfolders containing snapshots of the /home folder:

  • 4 from the last 24 hours (hourly-1 to hourly-4, i.e. at 8:00, 12:00, 16:00 and 20:00) and
  • 4 from the last 4 days (daily-1 to daily-4, i.e. at 22:00).

The higher the number, the older the snapshot.
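
A hedged example of restoring an accidentally deleted file, assuming the snapshots mirror the layout of /home (adjust the path to your case):

cp /homesnaps/hourly-1/$USER/notes.txt $HOME/notes.txt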

Job Queue

slurm is used as the job queue manager. You get the installed version using srun --version. For users, slurm has 4 central commands:

  • srun – run a parallel job directly (i.e. replacement for mpirun)
  • sbatch – submit a jobscript to the queue
  • squeue – get information about the queue
  • sinfo – get information about slurm nodes and partitions

Then there are some other commands:

  • salloc – allocate resources and open an interactive shell within the allocation
  • smap – get a graphical representation of queue and cluster
  • scontrol – control utility (especially useful are `scontrol update`, `scontrol show` or `scontrol help`)
  • sprio – show priority of pending jobs. *The larger the number, the higher the job will be positioned in the queue, and the sooner the job will be scheduled.*

man-pages for slurm commands are installed on sun and the workstations.
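
A few typical invocations (standard slurm options; the job id is a placeholder):

# show only your own jobs
squeue -u $USER
# show the state of nodes and partitions
sinfo
# show full details of one job
scontrol show job <jobid>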

Example Jobscript

This is a nonsensical example jobscript that you would submit with sbatch. The lines starting with #SBATCH specify parameters for sbatch that can be overridden by command-line parameters. The most important ones are

  • -o – output file (%j and %N are replaced with the job id and the master node, respectively)
  • -J – job name
  • -n – number of tasks (the equivalent of -np for mpirun)

You must use srun inside the jobscript instead of mpirun!

jobscript.sh
#!/bin/bash
#SBATCH -o job.%j.%N.out
#SBATCH -J YourJobName
#SBATCH --get-user-env
#SBATCH -n 64
#SBATCH --time=08:00:00
 
srun executable 
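
Submitting and monitoring the jobscript could then look like this (the printed job id is just an example):

sbatch jobscript.sh
# Submitted batch job 12345
squeue -u $USER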

Bridging wall-time

Every queue has a wall-time. After that run-time, your job is cancelled automatically. This is not a problem but a necessity! Here is an example of a script that submits your jobscript 10 times, so that your job is picked up again once it has been stopped by the wall-time.

Note: Your jobscript must be able, and correctly configured, to restart your simulation automatically from an existing checkpoint!

This script assumes your real jobscript is in run.batch and then submits 10 jobs that all run that script successively.

jobseries.sh
#!/bin/bash
# Submit NJOBS copies of SCRIPT; each job starts only after the previous one has ended.
NJOBS=10
ID=""
SCRIPT="run.batch"
for JOB in $(seq 1 "$NJOBS"); do
    CMD="sbatch $SCRIPT"
    if [ -n "$ID" ]; then
        # chain to the previous job: start after it has finished in any state
        CMD="sbatch --dependency=afterany:$ID $SCRIPT"
    fi
    OUT=$($CMD)
    echo "$OUT"
    # sbatch prints "Submitted batch job <jobid>"; keep the last field as the job id
    ID=$(echo "$OUT" | awk '{print $NF}')
done
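
After submitting the series, one way to check the dependency chain (the job id is a placeholder):

squeue -u $USER
scontrol show job <jobid> | grep -i dependency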

Partitions

Select the partition you want with --partition=<name> or -p <name> (see the example after the table below). The default is long. In total, every user can occupy at most 928 cores at a time.

Partition | Max nodes per user | Max CPUs per user | Wall time | Notes
long | 12 | | 24h |
big | 24 | | 6h |
small | 3 | 160 | 3 days | max 2 jobs per user
dev | 3 | | 1h | elevated priority, only for group hi-ern
gold | - | | 24h |
gpu | | | 24h | access to GPU machines
gpu-fau | | | 24h | access to the FAU GPU node with elevated priority + preemption for group fau-puls
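
For example, to pick a partition from the table above (the jobscript name is just an example):

# on the command line
sbatch -p big jobscript.sh
# or inside the jobscript itself
#SBATCH --partition=small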

Accounting

Slurm keeps track of the processing time used by your jobs. Related commands are

  • sacct – displays accounting data for all jobs and job steps in the Slurm job accounting database
  • sreport – generates reports from slurm accounting data

For example

sreport user top start=1/1/16 end=12/31/16

gives you a nice ranking of all cluster users' usage for the year 2016.
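
To look only at your own jobs, a typical sacct call is (the start date is just an example):

sacct -u $USER --starttime=2019-01-01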


Compilers

The following compilers are installed:

  • intel compilers (icc, ifort, icpc)
  • gcc (gcc, gfortran, g++)
  • PGI Compilers (pgcc, pgfortran, pgc++)

All of them are installed via Qlustar packages in the default path, so you can use them from the command line without further ado.

MPI

The MPI wrappers are installed via Qlustar packages as well. We use the OpenMPI implementation. The compilers are invoked by appending .openmpi-$CC 2) to the desired mpi* compiler executable.

$CC \ language | C | Fortran | C++ 3)
intel | mpicc.openmpi-icc | mpifort.openmpi-icc | mpicxx.openmpi-icc
gcc | mpicc.openmpi-gcc | mpifort.openmpi-gcc | mpicxx.openmpi-gcc
pgi | mpicc.openmpi-pgi | mpifort.openmpi-pgi | mpicxx.openmpi-pgi

You will most likely need to adjust your Makefile(s), run ./configure with suitable options, or perhaps just call make with adequate parameters (e.g. set CC=mpicc.openmpi-icc and the like) in order to compile successfully.

If needed, you can find the OpenMPI root directory for your compiler collection $CC and version $VER at /usr/lib/openmpi/$VER/$CC. See also the -show switch of the MPI compiler wrappers.
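
A hedged compile sketch with the gcc wrappers (the source file and make variables are placeholders, not a fixed recipe):

# print the full compiler command hidden behind the wrapper
mpicc.openmpi-gcc -show
# compile a single MPI source file
mpicc.openmpi-gcc -O2 -o my_solver my_solver.c
# or pass the wrappers to an existing Makefile
make CC=mpicc.openmpi-gcc FC=mpifort.openmpi-gcc CXX=mpicxx.openmpi-gcc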

To run your parallel executable locally, you need to use mpirun.openmpi. In your slurm batch scripts always use srun!
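
For a quick local test (the process count and executable name are examples):

mpirun.openmpi -np 4 ./my_solver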

FAQ

  • My job has an S next to it and all the queues are down. What's happening?

It got too warm in the server room; as a first precaution, jobs are suspended (S) and the queues are disabled.

  • The queueing system is unfair!

Raise awareness or start a discussion at compflu_iek11@fz-juelich.de or create an issue here.


1) must be a member of the group softadm to change things here
2) $CC is the compiler collection: gcc, icc, or pgi
3) mpiCC and mpicxx both compile C++