Compiling

SciClone offers multiple compiler suites and tools for code development and testing. In addition to the GNU Compiler Collection (GCC), the supported compilers include the Intel and Portland Group (PGI) compiler suites. Because the sub-clusters differ in architecture and in the provisioned operating system, each sub-cluster has its own default GCC version, which supports the compilation flags and directives specific to that sub-cluster's architecture. The default and available compiler suite versions for each sub-cluster are tabulated below:

| Sub-cluster | GCC Versions | Intel Compiler Versions | PGI Compiler Versions |
|---|---|---|---|
| Rain | gcc/7.2.0 | | |
| Hurricane/Whirlwind | 4.7.0 (default), 4.7.3, 4.8.4, 5.2.0 | 2017 (default), 2016 | 11.10 (default), 14.3, 17.7 |
| Vortex/Vortex-alpha | 4.7.3 (default), 4.8.4, 5.2.0 | 2017 | 14.3 (default), 16.3, 17.7 |
| Bora/Hima | 4.9.4 (default), 7.2.0 | 2017 (default), 2016 | 16.3 (default), 17.3, 17.7 |
| Storm | 4.7.3 (default), 5.2.0 | 2017 | 14.3 (default), 17.7 |
| Meltemi | 6.3.0 (default) | 2016, 2017 (default) | |
| Potomac & Pamunkey | 4.7.3 (default), 5.3.0 | 2016 | 14.3 (default), 15.4 |
| James | 4.8.5, 6.3.0 (default) | 2016, 2017, 2018 | 15.4, 18.7 (default) |

The versions listed above are not comprehensive; only the recommended versions for each sub-cluster are shown. For a comprehensive list, refer here.

All compiler paths, include paths, library paths, and compiler conflicts are managed using module files. (For more information on using module files, refer here.)

Code Optimization

Compilers optimize code according to the optimization flags passed during compilation. In automated build systems, these flags can be supplied by setting the CFLAGS and CXXFLAGS environment variables. The most commonly used optimization flags that are common across platforms are tabulated below. They can be used in addition to the CPU architecture-specific optimization flags.

| GNU Compiler Chain | Intel Parallel Studio | PGI Compilers | Remarks |
|---|---|---|---|
| --help=optimizers (-Q) | -help opt | -help=opt | Display optimization options. |
| | -help advanced | | Display advanced optimization options that allow fine tuning of compilation. |
| -O0, -O1, -O2, -O3 | -O0, -O1, -O2, -O3 | -O0, -O1, -O2, -O3, -O4 | Levels of optimization. The default is -O2 for the Intel compilers; GCC applies no optimization (-O0) unless a level is given. The aggressive optimization option -O3 may change numerical results. |
| -Ofast | -fast | -fast | Choose generally optimal flags for the target platform. |
| -ipa | -ipo | -Mipa=fast | Interprocedural analysis / interprocedural optimization. |
| -malign-data=cacheline | -align | -Mcache_align | Align long objects on cache-line boundaries. |
| -finline | -inline-level=N (N = 0, 1, or 2) | -Minline | Control inline expansion. |
| -fprefetch-loop-arrays | -qopt-prefetch[=N] (N = 0 to 5, default is 2) | -Mprefetch | Generate prefetch instructions. |
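As a minimal sketch of supplying optimization flags through the environment, the snippet below combines generic GCC flags from the table above and exports them as CFLAGS and CXXFLAGS. The particular flag choice and the empty ARCH_FLAGS placeholder are assumptions made for illustration, and whether a given build system actually reads these variables depends on that build system.

```shell
# Generic GCC optimization flags from the table above (illustrative choice).
OPT_FLAGS="-O2 -fprefetch-loop-arrays"

# Placeholder for CPU architecture specific flags; left empty here by assumption.
ARCH_FLAGS=""

# Export the variables that many automated build systems read.
# ARCH_FLAGS is appended, with a separating space, only when it is non-empty.
export CFLAGS="${OPT_FLAGS}${ARCH_FLAGS:+ ${ARCH_FLAGS}}"
export CXXFLAGS="${CFLAGS}"

echo "CFLAGS=${CFLAGS}"
echo "CXXFLAGS=${CXXFLAGS}"
```

With these variables exported, a build driven by configure scripts or makefiles will usually pick the flags up, although it is worth confirming in the build log that they appear on the compile lines.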

 

Suitable flags for each sub-cluster's processor architecture and compiler suite are tabulated below, with references to optimization guides where relevant. These flags are recommended for use with the compiler versions listed above for each sub-cluster.

| Sub-cluster | GCC Flags | PGI Flags | Intel Flags | Reference Optimization Guide |
|---|---|---|---|---|
| Rain | -march=k8 | -tp amd64 | -xHost | |
| Hurricane | -march=westmere / -march=corei7 | -tp nehalem | -msse4.2 | |
| Vortex | -march=bdver2 | -tp piledriver | -msse4.2 | |
| Bora/Hima | -march=haswell -mfma | -tp haswell | -xCORE-AVX2 -fma -std=c11 | Best Practice Guide - Haswell |
| Storm (except Ice) | -march=barcelona | -tp shanghai -fastsse | -msse3 | |
| Ice | -march=barcelona | -tp istanbul -fastsse | -msse3 | Compiler Options Quick Reference - Magny-Cours |
| Meltemi | -march=knl -mavx512f -mavx512pf -mavx512er -mavx512cd -mfma | N/A | -xCORE-AVX512 -fma -std=c11 | KNL Best Practices Guide |
| Potomac | -march=bdver1 | -tp piledriver | -xCORE-AVX | |
| Pamunkey | -march=bdver2 | -tp piledriver | -xCORE-AVX2 | Compiler Options Quick Reference - Abu-Dhabi |
| James | -march=skylake | -tp=skylake | -mtune=skylake | |

Users are encouraged to explicitly use the architecture flags (above) for their choice of sub-cluster when compiling, since the front-end may not be of the same architecture as the nodes (as is the case with Meltemi).
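One way to make the target explicit is to look the flags up by sub-cluster name instead of letting the compiler detect the CPU of the machine doing the build. The sketch below uses a hypothetical helper, gcc_arch_flags, filled with the GCC values from the table above; the helper name and the lowercase sub-cluster keys are assumptions for illustration, and Hurricane is left out because the table lists two alternative values for it.

```shell
# Hypothetical helper: print the GCC architecture flags for a sub-cluster,
# using the GCC column of the table above.
gcc_arch_flags() {
    case "$1" in
        rain)       echo "-march=k8" ;;
        vortex)     echo "-march=bdver2" ;;
        bora|hima)  echo "-march=haswell -mfma" ;;
        storm|ice)  echo "-march=barcelona" ;;
        meltemi)    echo "-march=knl -mavx512f -mavx512pf -mavx512er -mavx512cd -mfma" ;;
        potomac)    echo "-march=bdver1" ;;
        pamunkey)   echo "-march=bdver2" ;;
        james)      echo "-march=skylake" ;;
        *)          echo "unknown sub-cluster: $1" >&2; return 1 ;;
    esac
}

# Name the target explicitly rather than relying on the front-end's own CPU.
TARGET="meltemi"
CFLAGS="-O2 $(gcc_arch_flags "$TARGET")"
echo "$CFLAGS"
```

By contrast, a flag such as GCC's -march=native targets the CPU of the machine running the compiler, which is the front-end rather than the nodes when the two differ.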

Choosing Compilers

Three different compiler suites are available on SciClone, often in multiple versions. These include the open source GNU Compiler Collection (GCC) and the commercial Portland Group (PGI) and Intel Parallel Studio XE (Cluster or Composer) suites. In addition, packages such as MPI and CUDA provide their own compilation commands which are implemented on top of one or more of these base compiler suites.

For any given application, the choice of compiler and compiler options can have a major impact on performance. Generally speaking, the commercial compiler suites (PGI and Intel) will produce better results than GCC, and we therefore recommend their use whenever the code base permits. There are exceptions, however, so some experimentation may be in order, particularly for applications with long runtimes. Compiler optimization guidelines are provided above, tailored to SciClone's various hardware platforms.
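The experimentation suggested above can be organized as a small matrix of builds. The sketch below only prints candidate build commands for one sub-cluster (Bora/Hima, using the flags from the table above) at two optimization levels; it does not compile anything. The source file bench.c and the output names are hypothetical, and gcc, icc, and pgcc are the usual C driver names for the three suites, whose availability depends on the modules that are loaded.

```shell
SRC="bench.c"   # hypothetical benchmark source file

# One entry per line, "compiler driver:architecture flags"
# (Bora/Hima values from the table above).
CANDIDATES="gcc:-march=haswell -mfma
icc:-xCORE-AVX2 -fma
pgcc:-tp haswell"

# Print one build command per compiler and optimization level (dry run only).
list_builds() {
    echo "$CANDIDATES" | while IFS=: read -r cc arch; do
        for opt in -O2 -O3; do
            echo "$cc $opt $arch -o bench.$cc$opt $SRC"
        done
    done
}

list_builds
```

Each printed command can then be run and the resulting binaries timed on the target nodes, checking that the numerical results still agree, before settling on a compiler and flag set.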

SciClone features a large collection of third-party application software, and the compiler requirements vary from one package to another. Some packages are highly portable and have been compiled with multiple compiler suites, while others require a very specific compiler version. Where applicable, compiler information is included in the local documentation pages for individual software packages. As a rule, application software should be linked to libraries which have been compiled with the same compiler suite, and should use compatible compiler options.