
Gust

 

GUST USERS:

There is a bug in the version of Slurm running on Gust that prevents conda information from propagating correctly to the nodes when running an interactive or batch job. To work around this, please be sure to explicitly load your anaconda module within your batch script or interactive session. Please send questions to hpc-help@wm.edu.
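
For example, a minimal batch script that reloads the module inside the job (the module name anaconda3/2021.05 and the script name are placeholders; check 'module avail' for the versions actually installed on Gust):

#!/bin/tcsh
#SBATCH --job-name=conda-job
#SBATCH -N 1 -n 1
#SBATCH -t 30:00

# workaround: load the anaconda module explicitly so it is available on the compute node
module load anaconda3/2021.05
python my_script.py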

The Gust subcluster of SciClone contains 2 nodes, each with 128 AMD EPYC (Rome) cores. The front-end is gust.sciclone.wm.edu and the startup module file is named .cshrc.gust.

 

Hardware

                      Front-end (gust / gt00)        Parallel nodes (gt01-gt02)
Model                 HP DL385 Gen10                 HP DL385 Gen10
Processor(s)          2 x 16-core AMD EPYC 7703      2 x 64-core AMD EPYC 7702
                      (32 cores total)               (128 cores/node, 256 cores total)
Clock speed           3.0 GHz                        2.0 GHz
Memory                32 GB                          512 GB
Network interfaces
    Application       EDR IB (gt00-ib)               EDR IB (gt??-ib)
    System            GbE (gt00)                     1 GbE (gt??)
OS                    CentOS 7.9                     CentOS 7.9


Slurm

The Gust subcluster is our public offering of the Slurm batch system. Over the next year, we will be converting all subclusters from Torque/Maui to Slurm.

Until more documentation can be written, here are some things that will get most users up and running. For now, please see the official Slurm documentation for more information:

Important differences between Torque and Slurm:

1) The Slurm startup environment is different from Torque's. By default, Slurm uses the environment/modules loaded when the batch script is submitted, while Torque gives you a fresh startup environment with your default modules loaded. To simulate the Torque behavior in Slurm, add #SBATCH --export=NONE to your batch script (see the sketch after this list).

2) Slurm batch jobs start with the current directory set to the directory from which the job was submitted. Torque would always place you in your home directory, requiring you to cd to the submission directory.

3) The mvp2run script will not be used on Slurm subclusters. Its main functionality is superseded by 'srun'; see 'srun -h' for help with options for MPI jobs. Most MPI jobs should be fine using 'srun ./a.out' in the batch script. One other function of mvp2run was to check the load on each node before running the job; this can now be done with 'ckload'. See 'ckload -h' and the example below.
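
As mentioned in 1) above, here is a minimal sketch of a batch script that restores the Torque-style default environment (the job name and module are placeholders):

#!/bin/tcsh
#SBATCH --job-name=clean-env
#SBATCH -N 1 -n 1
#SBATCH -t 30:00
#SBATCH --export=NONE    # do not inherit the submission environment; start from your default login environment

# reload any modules the job needs, since nothing is carried over from the submitting shell
module load anaconda3/2021.05
./a.out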

Basic commands:

squeue -- show list of jobs

squeue -u <USER>  -- show list of jobs owned by <USER>

sbatch <script> -- submit script as batch job

 

Requesting nodes/cores/threads:

-N <#> / --nodes=<#> -- request nodes

-n <#> / --ntasks=<#> -- request tasks (e.g. cores/MPI processes)

-c <#> / --cpus-per-task=<#> -- request # cpus per task

--ntasks-per-node=<#> -- request # tasks/cores per node

For instance:

You want to request 64 cores on 2 nodes:

--nodes=2 --ntasks-per-node=32

You want to request 64 cores on 2 nodes with 1 MPI process per node and 32 OpenMP threads on each node:

--nodes=2 --ntasks-per-node=1 --cpus-per-task=32
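
These options can be given either as #SBATCH directives inside the batch script or directly on the sbatch/salloc command line. For example (the script name job.sh is a placeholder):

>> sbatch --nodes=2 --ntasks-per-node=32 job.sh

is equivalent to putting '#SBATCH --nodes=2 --ntasks-per-node=32' in job.sh itself.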

 

Interactive jobs:

An interactive job is started with salloc:

>> salloc -N 1 -n 32

will give you a session on 1 node with 32 cores available.

To get 2 nodes with 32 cores each, specify

>> salloc -N 2 -n 64

or

>> salloc -N 2 --ntasks-per-node=32
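
Once the allocation is granted you get a shell (typically on the front-end) with the requested nodes reserved; use srun inside that session to launch work on the allocated compute nodes. A minimal sketch (./a.out is a placeholder executable):

>> salloc -N 1 -n 32
>> srun ./a.out
>> exit     # release the allocation when finished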

 

Example batch scripts

1. 64-core parallel job:

#!/bin/tcsh
#SBATCH --job-name=parallel
#SBATCH --nodes=2 --ntasks-per-node=32
#SBATCH --constraint=gust
#SBATCH -t 30:00

srun ./a.out

2. 2-node hybrid parallel job:

#!/bin/tcsh
#SBATCH --job-name=hybrid
#SBATCH --nodes=2 --ntasks=2 --cpus-per-task=32
#SBATCH --constraint=gust
#SBATCH -t 30:00

ckload 0.05 # report if any node usage is greater than 0.05
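# (assumption) set the OpenMP thread count to match --cpus-per-task; adjust if your code determines it differently
setenv OMP_NUM_THREADS 32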
srun ./a.out

 


User Environment

To log in, use SSH from any host on the William & Mary or VIMS networks and connect to gust.sciclone.wm.edu with your HPC username (usually the same as your WMuserid) and W&M password.

Your home directory on Gust is the same as everywhere else on SciClone, and all of the usual filesystems (/sciclone/homeXX, /sciclone/dataXX, /sciclone/scrXX, /local/scr, etc.) are available throughout the cluster.

SciClone uses Environment Modules (a.k.a. Modules) to automatically configure the user's shell environment across multiple computing platforms, as well as to organize the dozens of different software packages available on the system. We support tcsh as the primary shell environment for user accounts and applications.

The file which controls startup modules for Gust is .cshrc.gust. The most recent version of this file can be found in /usr/local/etc/templates on any of the front-end servers (including gust.sciclone.wm.edu).
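
A few common module commands (the anaconda3/2021.05 module name below is only a placeholder):

>> module list                         # show the modules currently loaded
>> module avail                        # list the modules available on Gust
>> module load anaconda3/2021.05       # load a package into your environment
>> module unload anaconda3/2021.05     # remove it again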


Preferred filesystems

All of the nodes are equipped with a 700 GB HDD. Every user has a directory on this filesystem at /local/scr/$USER. This should be the preferred filesystem if your code can use it effectively.

The preferred global file system for all work on Gust is the parallel scratch file system available at /sciclone/pscr/$USER on the front-end and compute nodes. /sciclone/scr10/$USER is a good alternative (NFS, but connected to the same InfiniBand switch).
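
A minimal sketch of a single-node batch job that stages work through the node-local scratch disk (file names are placeholders; /local/scr is not shared between nodes, so results must be copied back to a global filesystem):

#!/bin/tcsh
#SBATCH --job-name=local-scr
#SBATCH -N 1 -n 32
#SBATCH -t 30:00

# stage input to the node-local disk, run there, then copy results back
set workdir = /local/scr/$USER/$SLURM_JOB_ID
mkdir -p $workdir
cp input.dat $workdir/
cd $workdir
srun ./a.out >& LOG
cp LOG /sciclone/scr10/$USER/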


Compiler flags

Gust has the Intel Parallel Studio XE compiler suite as well as version 11.2.0 of the GNU compiler suite. Here are suggested compiler flags which should result in fairly well-optimized code on the AMD EPYC (Rome) architecture:

Intel   C        icc -O3 -xCORE-AVX2 -fma
        C++      icpc -std=c++11 -O3 -xCORE-AVX2 -fma
        Fortran  ifort -O3 -xCORE-AVX2 -fma
GNU     C        gcc -O3 -mavx2 -mfma
        C++      g++ -std=c++11 -O3 -mavx2 -mfma
        Fortran  gfortran -O3 -mavx2 -mfma

MPI

Currently there are three versions of MPI available on the subcluster: openmpi (v3.1.4), intel-mpi (v2018 and v2019), and mvapich2 (v2.3.1). The preferred way to run any MPI code under Slurm is to use srun:


#!/bin/tcsh 
#SBATCH --job-name=test 
#SBATCH --nodes=2 --ntasks-per-node=32
#SBATCH --constraint=gust
#SBATCH -t 30:00

srun ./a.out >& LOG
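
Save the script to a file (e.g. test.job, a placeholder name) and submit it with:

>> sbatch test.job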