W&M's HPC systems have several types of user filesystems, intended for different purposes. These can be grouped into two broad categories:
- Global filesystems are identified with the prefix /sciclone/ or /ches/ and are accessible on every node within SciClone or Chesapeake, respectively.
- Local filesystems begin with the prefix /local/ and are physically resident on one or more disks directly attached to a node. Because operations on local filesystems never travel over one of the cluster's networks, and are less subject to competition from other users' jobs, they may provide better and more consistent performance than global filesystems; however, their contents are accessible only on the local node.
An additional distinction is made between home, data, and scratch filesystems, according to their intended use and corresponding backup policy:
| Type | Intended use | Backup | Deletion |
|------|--------------|--------|----------|
| Home | Source code, executables, configuration files, scripts, and small (<1 GB) data files. Unless you have been directed otherwise, you should not have a job read or write any substantial amount of data to your home directory, as doing so is extremely likely to impact others' interactive work. | Weeknightly, on-site only | After account expiration. |
| Data | Data that are needed on an ongoing basis for active projects on the cluster and cannot be easily re-created or re-uploaded. | | |
| Scratch | Scratch space: job outputs and working data that can be easily re-created or re-uploaded, or which will be copied elsewhere for longer-term storage. | Never | Any files not accessed for 90 days, and after account expiration. |
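Because scratch files unaccessed for 90 days are subject to automatic deletion, it can be useful to check which of your files are at risk. A minimal sketch using `find`; the scratch path in the example invocation is an assumption, so substitute your own directory:

```shell
#!/bin/sh
# List files under a directory that have not been accessed in more than
# 90 days, i.e. candidates for the automatic scratch purge.
stale_files() {
    find "$1" -type f -atime +90 -print
}

# Example invocation; /sciclone/scr10 is an assumed scratch filesystem name.
if [ -d "/sciclone/scr10/$USER" ]; then
    stale_files "/sciclone/scr10/$USER"
fi
```

Note that `-atime +90` matches files whose last access was strictly more than 90 days ago.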
When a user account is installed, a home directory for that user is created on each cluster (SciClone and Chesapeake) in one of its home filesystems. Additionally, subdirectories with the user's login name are created in each data, global scratch, and local scratch filesystem. As a convenience, symlinks in each user's home directory point to the preconfigured user directories. After a user's account has expired, all of these directories become subject to deletion.
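You can inspect these convenience symlinks from a login shell. A small sketch (the link names themselves vary by cluster and are not assumed here):

```shell
#!/bin/sh
# Show the symbolic links at the top level of a directory; in your home
# directory on these clusters they point at your data and scratch areas.
list_links() {
    find "$1" -maxdepth 1 -type l -exec ls -ld {} \;
}

list_links "$HOME"
```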
Global filesystem names have a two-digit suffix (e.g., /sciclone/data10) which serves not only to distinguish a filesystem from others of the same type, but also to indicate the underlying storage architecture. Suffixes beginning with "0" typically indicate a single internal disk drive within a server, while those beginning with "1", "2", etc. indicate a filesystem that spans one or more disk arrays, each consisting of multiple drives, usually in a RAID configuration. This allows users to easily distinguish array-based filesystems, which are larger and faster, from their single-drive counterparts.
Local scratch filesystems were formerly labeled /local/scrX, where the suffix distinguished local scratch space on different disk drives; this let I/O-intensive applications place files in different filesystems to minimize head movement. Nowadays, for ease of use, if a compute node has more than one local scratch disk, we generally stripe a single /local/scr filesystem across all of them.
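A common job pattern is therefore to stage working files into /local/scr, run there, and copy results back to a global filesystem when the job finishes. A sketch of that pattern; the default paths are assumptions, so override LOCAL_SCR and GLOBAL_DATA to match your own directories:

```shell
#!/bin/sh
# Stage an input file into node-local scratch, run there, then copy the
# results back out to a global filesystem. Default paths are assumptions.
LOCAL_SCR="${LOCAL_SCR:-/local/scr/$USER}"
GLOBAL_DATA="${GLOBAL_DATA:-/sciclone/data10/$USER}"

stage_and_run() {
    workdir="$LOCAL_SCR/job_$$"
    mkdir -p "$workdir" || return 1
    cp "$1" "$workdir/" || return 1              # stage in
    ( cd "$workdir" && : "run your application here" )
    mkdir -p "$GLOBAL_DATA" || return 1
    cp "$workdir"/* "$GLOBAL_DATA/"              # stage out
}
```

Staging this way keeps the job's heavy I/O off the global filesystems while still leaving the results somewhere durable.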
All backups remain on the same campus as the corresponding cluster. Chesapeake's backups reside on a file server that is part of the cluster and sits in the same room as the rest of it. SciClone writes its backups to a tape library in another building on the same campus, less than half a mile away. Both schemes protect against accidental deletions, filesystem corruption, and hardware failures, and SciClone's additionally protects against loss (e.g. in a fire) of the room and building housing the cluster, but neither scheme is as secure as off-line, off-site backups would be. If your data require off-site backup, you must provide for it yourself.
Furthermore, backup capacity allows us to keep only about one month of backups, so it is important that you let us know as soon as possible if you need something restored. If your data need protection against loss that goes undiscovered for longer than that, you should make at least one additional backup of your own.
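One way to maintain an additional copy of your own is to mirror a data directory to a machine you control with rsync. A sketch; the hostname and both paths are placeholder assumptions:

```shell
#!/bin/sh
# Mirror a project directory to another host with rsync.
# Hostname and paths are placeholders -- substitute your own.
src="/sciclone/data10/$USER/project"
dest="username@backuphost.example.edu:archive/project"

if [ -d "$src" ]; then
    rsync -a --delete "$src/" "$dest/"
fi
```

Here `-a` preserves permissions and timestamps, while `--delete` makes the destination an exact mirror (removing files deleted at the source), so use it with care.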
From time to time, additional "project" filesystems may be provisioned for specific projects or research groups.
Several system filesystems are also present throughout the clusters. /tmp is local to individual nodes; /import is hosted on the respective platform servers and exported to their client nodes via NFS. Note that on our systems the /tmp filesystem is of very limited size, its public permissions leave files relatively unsecured, and its contents are often wiped clean on a reboot. Users should not explicitly store files in /tmp; use /local/scr/$USER instead. The default login scripts set your TMPDIR environment variable accordingly.
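Programs that honor TMPDIR, such as mktemp, then create their temporary files under local scratch automatically. A small sketch; the fallback path is an assumption for illustration:

```shell
#!/bin/sh
# mktemp places files under $TMPDIR when it is set; on these systems the
# default login scripts point TMPDIR at local scratch. The fallback below
# is an assumption for illustration.
: "${TMPDIR:=/local/scr/$USER}"
export TMPDIR

if [ -d "$TMPDIR" ]; then
    tmpfile=$(mktemp)        # created under $TMPDIR
    echo "temporary file: $tmpfile"
    rm -f "$tmpfile"
fi
```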