Files & Filesystems

W&M's HPC systems have several types of user filesystems, intended for different purposes. These can be grouped into two broad categories:

  • Global filesystems are identified with the prefix /sciclone/ or /ches/ and are accessible on every node within a particular cluster (SciClone or Chesapeake, respectively), by communicating over one of the cluster's networks.
  • Local filesystems begin with the prefix /local/ and are physically resident on one or more disks directly attached to a node. Since operations on local filesystems never have to travel over a network, they may provide better, more consistent performance than global filesystems (depending on the application and the load on the cluster), but their contents are only accessible on the local node.
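
As a rough illustration of the prefix convention above, the following Python sketch classifies a path by its top-level directory. The function name classify_path and the returned labels are hypothetical choices made for this example, not part of any site-provided tooling.

```python
from pathlib import PurePosixPath

# Prefixes taken from the description above; the labels are illustrative only.
GLOBAL_PREFIXES = {"sciclone": "SciClone", "ches": "Chesapeake"}


def classify_path(path):
    """Return (category, cluster) based on the path's top-level directory.

    category is "global", "local", or "other"; cluster is set only for
    global filesystems.
    """
    parts = PurePosixPath(path).parts
    if len(parts) < 2 or parts[0] != "/":
        return ("other", None)  # relative path or bare "/"
    top = parts[1]
    if top in GLOBAL_PREFIXES:
        return ("global", GLOBAL_PREFIXES[top])
    if top == "local":
        return ("local", None)
    return ("other", None)
```

For example, classify_path("/ches/data10") returns ("global", "Chesapeake"), while classify_path("/local/scr") returns ("local", None).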

Additional distinctions are made between home filesystems, scratch filesystems, and data filesystems.

  • Home filesystems (e.g. /sciclone/home2) contain users' home directories: the starting point when a user logs in, and where a user's environment configuration is stored. A user's home directory is appropriate for source code, executables, configuration files, scripts, and small (less than a gigabyte or so) data files. Unless you have been directed otherwise, your jobs should not read or write any substantial amount of data on home filesystems, as doing so is extremely likely to impact others' interactive work.
  • Scratch filesystems (e.g. /sciclone/pscr and /local/scr) provide high-performance storage for large amounts of working data that are needed only on a short-term basis, typically a few days or weeks. They are appropriate for files which are not needed on an ongoing basis, which can easily be re-created, or which will be copied to a remote system for long-term storage. Scratch filesystems are not backed up, and files which have not been accessed for 90 days are automatically deleted.
  • Data filesystems (e.g. /ches/data10) provide storage for data that are needed on an ongoing basis for active projects on the cluster and cannot be easily re-created.
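
Because scratch files that go unaccessed for 90 days are deleted, it can be useful to check which files are approaching that limit. The Python sketch below lists regular files whose last access time is older than a given number of days. It assumes that the file's access time (atime) is what the purge considers, which this page does not spell out; the function name files_not_accessed_since is hypothetical.

```python
import os
import stat
import time

SECONDS_PER_DAY = 86400


def files_not_accessed_since(root, days=90, now=None):
    """Yield (path, idle_days) for regular files under root whose last
    access time is more than `days` days old.

    Assumption: atime is the relevant timestamp for the scratch purge.
    """
    now = time.time() if now is None else now
    cutoff = now - days * SECONDS_PER_DAY
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat does not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_atime < cutoff:
                yield path, (now - st.st_atime) / SECONDS_PER_DAY
```

Passing a smaller value such as days=75 gives some warning before the 90-day limit is reached. Inspecting a file with lstat reads only its metadata and does not itself count as an access.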

When a user account is set up, a home directory for that user is created on each cluster in one of its home filesystems. Additionally, subdirectories with the user's login name are created in each data, global scratch, and local scratch filesystem. As a convenience, symlinks in each user's home directory point to the preconfigured user directories in the /{sciclone,ches}/{data,scr}* and /local/scr filesystems. After a user's account has expired, all of these directories become subject to deletion.
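
To see where the convenience symlinks in a home directory point, a short Python sketch such as the following can be used. The helper name list_symlinks is hypothetical, and the names of the links themselves are not specified on this page, so the function simply reports whatever symlinks it finds.

```python
import os


def list_symlinks(directory):
    """Return {link_name: target} for symlinks directly inside `directory`."""
    links = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_symlink():
                # readlink returns the stored target without resolving it
                links[entry.name] = os.readlink(entry.path)
    return links
```

Calling list_symlinks(os.path.expanduser("~")) on a login node would return the links in your own home directory together with their targets.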

Filesystem numbering

Global filesystem names have traditionally carried a two-digit suffix (e.g., /ches/scr00 or /sciclone/data10) which served not only to distinguish each filesystem from others of the same type, but also to indicate the underlying storage architecture. Suffixes beginning with "0" typically indicated a single internal disk drive within a server, while those beginning with "1", "2", etc. indicated a filesystem spanning one or more disk arrays, each consisting of multiple drives, usually in a RAID configuration. This allowed users to easily distinguish array-based filesystems, which were larger and faster, from their single-drive counterparts.

Since we no longer install single-disk global filesystems (and did not make this distinction at all for local filesystems), this numbering scheme has been deprecated.
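
For older filesystem names that still follow the legacy convention, the suffix rule can be sketched in Python as follows. This is illustrative only: the function name legacy_storage_hint and the returned strings are hypothetical, and since the scheme is deprecated and was only ever "typical", the result should be treated as a hint rather than a guarantee.

```python
import re

# A filesystem name made of letters followed by exactly two digits, e.g. "scr00".
_LEGACY_NAME = re.compile(r"^[a-z]+(?P<suffix>\d{2})$")


def legacy_storage_hint(fs_path):
    """Guess the storage type implied by a legacy two-digit suffix.

    Returns None when the last path component has no two-digit suffix.
    """
    name = fs_path.rstrip("/").rsplit("/", 1)[-1]
    match = _LEGACY_NAME.match(name)
    if match is None:
        return None
    if match["suffix"].startswith("0"):
        return "single internal disk"
    return "disk array"
```

Here legacy_storage_hint("/ches/scr00") returns "single internal disk", legacy_storage_hint("/sciclone/data10") returns "disk array", and a name without a two-digit suffix, such as /sciclone/pscr, yields None.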

Miscellanea

From time to time, additional "project" filesystems may be provisioned for specific projects or research groups, e.g. /sciclone/baby10 and /sciclone/aiddata10.

Several system filesystems are also present throughout the clusters. /, /boot, /usr, /var, and /tmp are local to individual nodes; /usr/local and /import are hosted on the respective platform servers and exported to their client nodes via NFS. Note that on our systems, the /tmp filesystem is of very limited size, its public permissions leave files relatively unsecured, and its contents are often wiped clean on a reboot. Users should not explicitly store files in /tmp; use /local/scr/$USER instead. The default login scripts set your TMPDIR environment variable accordingly.
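
A program can follow this advice by creating its temporary files under TMPDIR rather than hard-coding /tmp. The Python sketch below prefers TMPDIR, falls back to /local/scr/$USER if that directory exists and is writable, and only then uses the interpreter's default. The function names pick_temp_root and make_work_dir are hypothetical, and the fallback order is an assumption made for this example.

```python
import os
import tempfile


def pick_temp_root():
    """Choose a base directory for temporary files.

    Order (an assumption for this sketch): $TMPDIR, then /local/scr/$USER,
    then whatever tempfile.gettempdir() reports.
    """
    tmpdir = os.environ.get("TMPDIR")
    if tmpdir and os.path.isdir(tmpdir):
        return tmpdir
    user = os.environ.get("USER")
    if user:
        candidate = os.path.join("/local/scr", user)
        if os.path.isdir(candidate) and os.access(candidate, os.W_OK):
            return candidate
    return tempfile.gettempdir()


def make_work_dir(prefix="work-"):
    """Create and return a private temporary directory under the chosen root."""
    return tempfile.mkdtemp(prefix=prefix, dir=pick_temp_root())
```

Directories created by tempfile.mkdtemp are readable and writable only by the creating user, and the caller is responsible for removing them when the work is done.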