Storage

User Account Space

Each user account has a quota of 50GB in the home directory. If you are in need of more space, please use your group space.

Group Space

Each research group has a quota of 10TB in /central/groups/<groupname>. Additional space up to 30TB is available at no charge; anything above 30TB is charged at standard storage rates. Please have your PI send an email to help-hpc@caltech.edu for information.

Scratch Space

There are two scratch directories available: 500TB of standard, high-speed disk mounted on /central/scratch, and a 30TB high-IO disk mounted on /central/scratchio. Best practice is to create a directory for yourself (e.g. /central/scratch/<username>) and work with files inside that directory. The /central/scratch partition has a default quota of 20TB and 15M files. The /central/scratchio partition has a default quota of 2TB and 3M files. Depending on the IO properties of your code and the size of your jobs, /central/scratch can be faster in aggregate bandwidth and /central/scratchio faster in file operations per second. If there is any question about the performance difference between the two scratch options, it may make sense to profile your code against both.
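As a minimal sketch of the per-user directory practice described above: the helper name make_scratch_dir is hypothetical (not a cluster-provided command), and the optional arguments exist only so the same function can target either scratch partition.

```shell
# Hypothetical helper: create (if needed) and print a personal scratch directory.
# $1 = scratch root (defaults to /central/scratch)
# $2 = username (defaults to $USER, falling back to `id -un`)
make_scratch_dir() {
    root=${1:-/central/scratch}
    user=${2:-${USER:-$(id -un)}}
    dir="$root/$user"
    # mkdir -p succeeds silently if the directory already exists
    mkdir -p "$dir" && printf '%s\n' "$dir"
}
```

For example, `cd "$(make_scratch_dir)"` would move into /central/scratch/<username>, and `make_scratch_dir /central/scratchio` would do the same on the high-IO partition.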

The quota can be extended to 50TB for 30 days upon request. Please send an email to help-hpc@caltech.edu for information.

These disks are truly meant as scratch space. Any files not accessed in 14 days will be automatically purged. Any method of artificially changing the date/time stamps of a file is strictly prohibited and subject to Caltech's Honor Code.
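To see which of your scratch files are approaching the purge window, something like the following may help. It is a sketch only: list_stale_files is a hypothetical helper, and it assumes the purge is driven by last-access time as reported by find. The exact criteria used by the cluster's purge job are not documented here, so treat the output as an estimate.

```shell
# Hypothetical helper: list regular files under a directory whose last access
# is at least DAYS days old (default 14).
# $1 = directory to scan, $2 = age threshold in days
list_stale_files() {
    dir=$1
    days=${2:-14}
    # find discards fractional days when comparing, so "-atime +13"
    # matches files last accessed 14 or more whole days ago.
    find "$dir" -type f -atime "+$((days - 1))"
}
```

For example, `list_stale_files /central/scratch/<username> 10` would list files that have gone 10 or more days without being accessed, leaving time to copy anything important to group space.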

Checking Quotas for User and Group

To see how much storage you are using, use the mmlsquota command:

mmlsquota -u username --block-size auto central:home

To check your group storage:

mmlsquota -j groupname --block-size auto central
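The two checks can be wrapped in one small function. This is a sketch under assumptions: show_quotas and the DRY_RUN switch are hypothetical conveniences, and the mmlsquota arguments are copied unchanged from the commands above.

```shell
# Hypothetical wrapper around the two quota commands shown above.
# $1 = username, $2 = group name
# Set DRY_RUN=1 to print the commands instead of running them.
show_quotas() {
    user=$1
    group=$2
    if [ "${DRY_RUN:-0}" = "1" ]; then
        prefix=echo
    else
        prefix=
    fi
    # $prefix is intentionally unquoted: when empty it expands to nothing,
    # so mmlsquota is executed directly.
    $prefix mmlsquota -u "$user" --block-size auto central:home
    $prefix mmlsquota -j "$group" --block-size auto central
}
```

For example, `show_quotas "$USER" groupname` reports home and group usage in one call; with DRY_RUN=1 it only prints the two command lines.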

To see how much space each group member is using in your group area, see ...

/central/groups/imss_admin/group_usage/XYZ_usage

... where XYZ is the name of your group. The information is sorted by usage, highest at the top of the list and lowest at the bottom. The usage file is updated automatically once per day, shortly after 4:00am. Lines that have a string of digits in place of a username, followed by a username in square brackets, report usage by group members who are no longer at Caltech. (The name in square brackets after the numeric string is the former user's access.caltech username.)
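A short sketch for reading the usage file follows. The helper names and the USAGE_ROOT override are hypothetical, and the column layout of the file is not specified on this page, so the pattern below only relies on the one documented feature: digits followed by a username in square brackets.

```shell
# Hypothetical helpers for the per-group usage file described above.
# USAGE_ROOT is an assumed override (useful for testing); by default the
# documented location is used.
usage_file() {
    printf '%s/%s_usage\n' "${USAGE_ROOT:-/central/groups/imss_admin/group_usage}" "$1"
}

# Show the first N lines (default 10); the file is sorted with the
# heaviest users at the top.
top_users() {
    head -n "${2:-10}" "$(usage_file "$1")"
}

# Show lines that look like former members: a run of digits followed by
# a name in square brackets. The exact file layout is an assumption.
former_members() {
    grep -E '[0-9]+[[:space:]]*\[[^]]+\]' "$(usage_file "$1")"
}
```

For example, `top_users groupname 5` prints the five largest consumers in the group area, and `former_members groupname` lists space still held by departed users.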

Snapshots

The GPFS-based file system uses snapshot technology that captures file changes on the following schedule:

  • Every 4 hours for 1 day
  • Every day for 1 week
  • Every week for 2 weeks

The snapshot directory is not listable, but can be reached by changing directory to ".snapshots".
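Recovering a file from a snapshot might look like the sketch below. Several things here are assumptions: restore_from_snapshot is a hypothetical helper, the page does not say which directories contain ".snapshots" or how individual snapshots are named, and the layout .snapshots/<snapshot>/<relative path> is the usual GPFS arrangement rather than something confirmed for this cluster.

```shell
# Hypothetical helper: copy one file out of a snapshot.
# $1 = directory that contains .snapshots (assumed location)
# $2 = snapshot name (as seen after `cd .snapshots; ls`)
# $3 = path of the file relative to $1
# $4 = destination path for the recovered copy
restore_from_snapshot() {
    base=$1
    snap=$2
    rel=$3
    dest=$4
    src="$base/.snapshots/$snap/$rel"
    if [ ! -f "$src" ]; then
        echo "not found in snapshot: $src" >&2
        return 1
    fi
    # -p preserves mode and timestamps of the snapshot copy
    cp -p -- "$src" "$dest"
}
```

Writing the recovered copy to a new name (for example, results.csv.restored) rather than over the current file makes it easy to compare the two before replacing anything.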

Backup and Archive

There is no managed BCP/DR-style backup or archival system in place on the central HPC cluster. Please be sure to migrate any critical data to systems or services outside of the cluster storage on a routine basis. For information on running backups using the Duplicity client, see this page. (Duplicity supports saving backups to AWS, Google, Backblaze B2, SSH-based hosts and others.)