
Femto

The Femto subcluster of SciClone contains 30 nodes, each with 32 Intel Skylake cores. The front-end is femto.sciclone.wm.edu and the startup module file is named .cshrc.femto.

Hardware

                         Front-end (femto / fm00)        Parallel nodes (fm01-fm30)

Model                    Dell PowerEdge R440             Dell PowerEdge R440
Processor(s)             2×16-core Intel Xeon Gold 6130  2×16-core Intel Xeon Gold 6130
Clock speed              2.1 GHz                         2.1 GHz
Memory                   96 GB                           96 GB
Network interfaces
  Application            EDR IB (fm00-ib)                EDR IB (fm??-ib)
  System                 GbE (fm00)                      1 GbE (fm??)
OS                       CentOS 7.6                      CentOS 7.6

Slurm

The Femto subcluster is the first SciClone subcluster to use the Slurm batch system. Over the next year, we will be converting all subclusters from Torque/Maui to Slurm.

Until more documentation can be written, here are some things that will get most users up and running. For now, please see the official Slurm documentation for more information.

Important differences between Torque and Slurm:

1) The Slurm startup environment is different from Torque's. By default, Slurm uses the environment and modules loaded when the batch script is submitted, whereas Torque gives you a fresh startup environment with your default modules loaded. To simulate the Torque behavior in Slurm, add #SBATCH --export=NONE to your batch script (see the example preamble after this list).

2) Slurm batch jobs start in the directory from which the job was submitted, whereas Torque always placed you in your home directory, requiring a cd to the submission directory.

3) The mvp2run script is not used on Slurm subclusters. Its main functionality is superseded by 'srun'; see 'srun -h' for help on options for MPI jobs. Most MPI jobs should be fine using 'srun ./a.out' in your batch script. The other function of mvp2run, checking the load on each node before running the job, can be done with 'ckload'. See 'ckload -h' and the example below.
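For example, a minimal batch-script preamble along these lines (the job name and executable are placeholders) requests the Torque-like behavior described in 1) and relies on the submission-directory behavior described in 2):

#!/bin/tcsh
#SBATCH --job-name=myjob               # placeholder job name
#SBATCH --nodes=1 --ntasks-per-node=32
#SBATCH -t 30:00
#SBATCH --export=NONE                  # discard the submission environment; start with your default modules, as Torque did

# The job starts in the directory from which it was submitted,
# so files there can be referenced directly without a cd.
module list                            # record which modules the job actually sees
srun ./a.out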

Basic commands:

squeue -- show list of jobs

squeue -u <USER>  -- show list of jobs owned by <USER>

sbatch <script> -- submit script as batch job

 

Requesting nodes/cores/threads:

-N <#> / --nodes=<#>  -- request nodes

-n <#> / --ntasks=<#>  -- request tasks (e.g. cores)

-c <#> / --cpus-per-task=<#> -- request # cpus per task

--ntasks-per-node=<#> -- request # tasks/cores per node

For instance:

You want to request 64 cores on 2 nodes:

--nodes=2 --ntasks-per-node=32

You want to request 64 cores on 2 nodes with 1 MPI process per node and 32 OpenMP threads on each node:

--nodes=2 --ntasks-per-node=1 --cpus-per-task=32

 

Interactive jobs:

An interactive job is run with srun:

>> srun -N 1 -n 32 --pty tcsh

will give you a tcsh on 1 node with 32 cores available.

To get 2 nodes with 32 cores each, specify

>> srun -N 2 -n 64 --pty tcsh

or

>> srun -N 2 --ntasks-per-node=32 --pty tcsh
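srun can also launch a program directly rather than an interactive shell, which is handy for quick tests (a.out is a placeholder executable):

>> srun -N 1 -n 32 ./a.out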

 

Example batch scripts

1. 64-core parallel job:

#!/bin/tcsh
#SBATCH --job-name=parallel
#SBATCH --nodes=2 --ntasks-per-node=32
#SBATCH --constraint=femto
#SBATCH -t 30:00

srun ./a.out

2. 2-node hybrid parallel job:

#!/bin/tcsh
#SBATCH --job-name=hybrid
#SBATCH --nodes=2 --ntasks=2 --cpus-per-task=32
#SBATCH --constraint=femto
#SBATCH -t 30:00

ckload 0.05 # report if any node usage is greater than 0.05
srun ./a.out
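Note that for hybrid jobs the OpenMP thread count is not set by these options alone. Assuming your code reads OMP_NUM_THREADS, a common pattern is to set it from Slurm's per-task CPU count before the srun line:

setenv OMP_NUM_THREADS $SLURM_CPUS_PER_TASK   # 32 threads per MPI task in the example above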

 


User Environment

To login, use SSH from any host on the William & Mary or VIMS networks and connect to femto.sciclone.wm.edu with your HPC username (usually the same as your WMuserid) and W&M password.
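For example, from a machine on one of those networks (replace WMuserid with your own HPC username):

ssh WMuserid@femto.sciclone.wm.edu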

Your home directory on Femto is the same as everywhere else on SciClone, and all of the usual filesystems (/sciclone/homeXX, /sciclone/dataXX, /sciclone/scrXX, /local/scr, etc.) are available throughout the Femto subcluster.

SciClone uses Environment Modules (a.k.a. Modules) to automatically configure the user's shell environment across multiple computing platforms, as well as to organize the dozens of different software packages which are available on the system. We support tcsh as the primary shell environment for user accounts and applications.

The file which controls startup modules for Femto is .cshrc.femto. The most recent version of this file can be found in /usr/local/etc/templates on any of the front-end servers (including femto.sciclone.wm.edu).
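A few common Modules commands are shown below; the module name used here is only a hypothetical example, so run 'module avail' to see what is actually installed:

module list                    # show the modules currently loaded in your shell
module avail                   # list all modules available on this platform
module load openmpi/3.1.4      # load a package (hypothetical module name)
module unload openmpi/3.1.4    # remove it again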


Preferred filesystems

All of the Femto nodes are equipped with a 2 TB SSD. Every user has a directory on this filesystem at /local/scr/$USER. This should be the preferred filesystem if your code can use it effectively.

The preferred global file system for all work on Femto is the parallel scratch file system available at /sciclone/pscr/$USER on the front-end and compute nodes. /sciclone/scr10/$USER is a good alternative (NFS, but connected to the same InfiniBand switch).
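As a sketch (input/output file names are placeholders), a single-node job that stages its data to the node-local SSD and copies results back to the submission directory might look like:

#!/bin/tcsh
#SBATCH --job-name=localscr
#SBATCH --nodes=1 --ntasks-per-node=32
#SBATCH --constraint=femto
#SBATCH -t 30:00

set workdir = /local/scr/$USER/$SLURM_JOB_ID   # per-job directory on the node-local SSD
mkdir -p $workdir
cp input.dat $workdir/                         # placeholder input file
cd $workdir
srun $SLURM_SUBMIT_DIR/a.out
cp output.dat $SLURM_SUBMIT_DIR/               # placeholder output file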


Compiler flags

Femto has the Intel Parallel Studio XE 2018/2019 compiler suite as well as version 9.1.0 of the GNU compiler suite. Here are suggested compiler flags which should result in fairly well-optimized code on the Skylake architecture:

Intel   C        icc -O3 -xCORE-AVX2 -fma -align -finline-functions
        C++      icpc -std=c++11 -O3 -xCORE-AVX2 -fma -align -finline-functions
        Fortran  ifort -O3 -xCORE-AVX2 -fma -align array64byte -finline-functions
GNU     C        gcc -march=skylake -O3 -mfma -malign-data=cacheline -finline-functions
        C++      g++ -std=c++11 -march=skylake -O3 -mfma -malign-data=cacheline -finline-functions
        Fortran  gfortran -march=skylake -O3 -mfma -malign-data=cacheline -finline-functions
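As an illustration (source and executable names are placeholders), a C code could be built with the Intel compiler roughly as follows; add -qopenmp (Intel) or -fopenmp (GNU) if the code uses OpenMP:

icc -O3 -xCORE-AVX2 -fma -align -finline-functions -o mycode mycode.c
icc -qopenmp -O3 -xCORE-AVX2 -fma -align -finline-functions -o mycode_omp mycode.c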

MPI

Currently there are three versions of MPI available on the Femto subcluster: openmpi (v3.1.4), intel-mpi (2018 and 2019), and mvapich2 (v2.3.1). The preferred way to run any MPI code under Slurm is to use srun:


#!/bin/tcsh 
#SBATCH --job-name=test 
#SBATCH --nodes=2 --ntasks-per-node=32
#SBATCH --constraint=femto
#SBATCH -t 30:00

srun ./a.out >& LOG
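To build against one of these MPI stacks, load the corresponding module and compile with its wrapper. As a hedged example (source and executable names are placeholders; OpenMPI and MVAPICH2 provide mpicc/mpif90, while Intel MPI's Intel-compiler wrappers are mpiicc/mpiifort):

mpicc -O3 -o mycode mycode.c       # C, OpenMPI or MVAPICH2 wrapper
mpif90 -O3 -o mycode mycode.f90    # Fortran, OpenMPI or MVAPICH2 wrapper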