Quick Start Guide
Get up and running on the cluster in about 15 minutes.
Tip
This is the express lane. For the long-form version see Getting Started. For the full reference, follow links in each section.
1. Get an account
Before you can log in, three things must happen — the first is the slowest, so start now:
Your PI emails help-hpc@caltech.edu asking to add you to their group. (New groups: see Account Information.)
Set up Multi-Factor Authentication at access.caltech.edu/my_duo. Duo Mobile on your phone is the easiest option.
Complete eligibility certification at access.caltech.edu/hpc_portal. This is the export-control acknowledgment — see Policies.
You’ll get a confirmation email when your account is ready.
2. Connect via SSH
Important
You must be on the Caltech network — either physically on-campus or connected to Caltech VPN. SSH attempts from anywhere else will hang or be refused. See Common Problems → Connection Refused if the login host is unreachable.
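If you are unsure whether your machine can reach the login host, a short port check can tell a network problem apart from a credentials problem. The sketch below assumes bash and the coreutils `timeout` command (not installed by default on macOS); `can_reach` is a hypothetical helper name, not a cluster tool.

```shell
#!/bin/bash
# can_reach HOST [PORT]: succeed if a TCP connection opens within 5 seconds.
# Uses bash's built-in /dev/tcp redirection, so the only external tool
# needed is coreutils `timeout`.
can_reach() {
    local host="$1" port="${2:-22}"
    timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Example: check the SSH port on the login host.
if can_reach login.hpc.caltech.edu 22; then
    echo "Login host reachable: try ssh."
else
    echo "Login host unreachable: check that you are on campus or on the VPN."
fi
```

If the check fails, fix the network path first; no password or Duo prompt will appear until the host is reachable.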
Open a terminal and connect:
ssh username@login.hpc.caltech.edu
You’ll be prompted for your access.caltech password and a Duo push.
On Windows, use the built-in OpenSSH client from PowerShell or Windows Terminal:
ssh username@login.hpc.caltech.edu
Or use MobaXterm for a graphical SSH/SFTP client.
If you prefer a browser, use Open OnDemand, a browser-based interface for shells, file management, and interactive apps (Jupyter, RStudio, MATLAB).
Important
You must SSH at least once before using Open OnDemand — the first SSH login creates your home directory.
Recommended SSH config
Save typing and avoid re-authenticating constantly. Add to ~/.ssh/config on your local machine:
Host hpc
    HostName login.hpc.caltech.edu
    User YOUR_USERNAME
    ServerAliveInterval 60
    ControlMaster auto
    ControlPath ~/.ssh/sockets/%r@%h:%p
    ControlPersist 10m
Then create the sockets directory once: mkdir -p ~/.ssh/sockets. After this, ssh hpc is enough.
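The one-time setup can also be scripted. This is a minimal sketch, assuming OpenSSH on your local machine; `SSH_DIR` is a variable introduced here for illustration. It also tightens permissions, since OpenSSH expects the SSH directory and config file to be private to your user.

```shell
#!/bin/bash
# Create the directory that ControlPath points to, with owner-only access.
SSH_DIR="${SSH_DIR:-$HOME/.ssh}"   # illustrative variable; defaults to ~/.ssh
mkdir -p "$SSH_DIR/sockets"
chmod 700 "$SSH_DIR" "$SSH_DIR/sockets"

# Make sure the config file exists and is readable and writable only by you.
touch "$SSH_DIR/config"
chmod 600 "$SSH_DIR/config"
```

Once a shared connection is open, `ssh -O check hpc` reports whether it is still alive and `ssh -O exit hpc` closes it.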
See also
SSH Password FAQ for SSH-key setup.
Common Problems → Connection Refused if you’re off-campus (VPN required).
3. Move some data over
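# Note: $USER is expanded by your local shell. If your local and cluster
# usernames differ, type your cluster username instead.
# Upload a single file
scp myfile.txt hpc:~/
# Upload a directory
scp -r mydata/ hpc:/resnick/groups/yourgroup/$USER/
# Download results
scp hpc:/resnick/scratch/$USER/results.txt ./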
For large transfers (more than roughly 100 GB), use Globus instead; see Transferring Files.
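Directories with many small files can transfer slowly over scp, because each file is handled separately. One common workaround is to bundle the directory into a single archive first. The sketch below assumes `tar` is available locally and uses the placeholder directory `mydata`; the upload lines are commented out so the snippet runs without cluster access.

```shell
#!/bin/bash
SRC_DIR="mydata"                 # placeholder directory to upload
ARCHIVE="${SRC_DIR}.tar.gz"

mkdir -p "$SRC_DIR"              # only so the sketch runs standalone

# Bundle the directory into one compressed archive.
tar -czf "$ARCHIVE" "$SRC_DIR"

# List the archive contents to confirm it is readable before uploading.
tar -tzf "$ARCHIVE" > /dev/null && echo "Archive OK: $ARCHIVE"

# Upload, then unpack on the cluster (uses the 'hpc' alias from the SSH config):
# scp "$ARCHIVE" hpc:/resnick/groups/yourgroup/
# ssh hpc 'cd /resnick/groups/yourgroup && tar -xzf mydata.tar.gz'
```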
4. Find and load software
module avail # List everything available
module spider pytorch # Search for a specific package
module load python3/3.10.12
module list # Show what's loaded
If something’s missing, request it via help-hpc@caltech.edu — or install it yourself with conda or pip.
5. Submit your first job
Save the following as hello.sh:
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --output=hello-%j.out
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=4G
#SBATCH --time=00:05:00
echo "Hello from $(hostname) at $(date)"
echo "Allocated $SLURM_CPUS_PER_TASK CPUs and $SLURM_MEM_PER_NODE MB RAM"
sleep 30
echo "Done."
Submit, watch, and inspect:
sbatch hello.sh # → "Submitted batch job 12345"
squeue -u $USER # Watch it run
cat hello-12345.out # See the output
seff 12345 # Check how efficient the request was
Tip
seff tells you how much of your requested CPU/memory you actually used. Right-sizing your next job by that ratio is the single biggest thing you can do to improve queue times.
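As a worked example of right-sizing, suppose seff reports that a job peaked at 1300 MB of the 4 GB it requested (hypothetical numbers). A common rule of thumb, not a site policy, is to request the observed peak plus some headroom. The helper name `next_mem_request` and the 20% default margin below are assumptions for illustration.

```shell
#!/bin/bash
# next_mem_request PEAK_MB [HEADROOM_PERCENT]
# Print a memory request in MB: the observed peak plus a safety margin,
# rounded up to the next whole megabyte.
next_mem_request() {
    local peak_mb="$1" headroom="${2:-20}"
    echo $(( (peak_mb * (100 + headroom) + 99) / 100 ))
}

# Hypothetical peak of 1300 MB with the default 20% margin.
echo "#SBATCH --mem=$(next_mem_request 1300)M"   # prints: #SBATCH --mem=1560M
```

The same idea applies to CPUs: if a job kept only one of two requested cores busy, request one core next time.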
6. Cheat sheet
| Command | What it does |
|---|---|
|  | Submit a batch job |
|  | Show your jobs |
|  | Cancel a job |
|  | Cancel all your jobs |
|  | Quick interactive shell on a compute node |
|  | Reserve resources for an interactive session |
|  | Efficiency report after a job finishes |
|  | Detailed job accounting |
|  | List installed software |
|  | Check storage usage |
|  | Show GPU partition state |
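For reference, most of the tasks in the table map onto standard Slurm and environment-module commands roughly as follows. This is a sketch based on the generic tools, not on this cluster's own documentation: the wrapper function names are hypothetical, the resource values are placeholders, the `gpu` partition name is an assumption, and the site-specific storage-usage command is left out because it is not stated here.

```shell
#!/bin/bash
# Hypothetical convenience wrappers around standard Slurm/module commands.
# Defining them runs nothing; call them from a login node.

submit()     { sbatch "$@"; }                        # submit a batch job script
myjobs()     { squeue -u "$USER"; }                  # show your jobs
cancel()     { scancel "$1"; }                       # cancel one job by ID
cancel_all() { scancel -u "$USER"; }                 # cancel all your jobs
ishell()     { srun --pty bash; }                    # interactive shell on a compute node
reserve()    { salloc --nodes=1 --time=01:00:00; }   # reserve resources (placeholder values)
efficiency() { seff "$1"; }                          # efficiency report for a finished job
accounting() { sacct -j "$1" --format=JobID,JobName,State,Elapsed,MaxRSS; }
software()   { module avail; }                       # list installed software
gpu_state()  { sinfo -p gpu; }                       # 'gpu' partition name is an assumption
```

For example, `cancel 12345` cancels job 12345 and `accounting 12345` prints its state, elapsed time, and peak memory.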
7. Where to put your files
| Location | Quota | Backed up? | Use for |
|---|---|---|---|
|  | 50 GB | No | Scripts, configs, dotfiles |
|  | 20 TB | No | Project data, results to keep |
|  | Large, shared | No | Active computations, temp files |
Warning
Nothing on the cluster is backed up. Files on /resnick/scratch are purged after 14 days without access. Move important results to group storage and copy critical data offsite — see Backups.
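To see which scratch files are approaching the purge window, you can list files by last access time. This is a sketch, assuming GNU `find` and that the purge is keyed on access time as stated above; `stale_files` is a hypothetical helper, and the 10-day default is an arbitrary early-warning threshold.

```shell
#!/bin/bash
# stale_files DIR [DAYS]: list regular files under DIR whose last access
# was more than DAYS full days ago (default 10).
stale_files() {
    local dir="$1" days="${2:-10}"
    find "$dir" -type f -atime +"$days" -print
}

# On the cluster you might run:
#   stale_files "/resnick/scratch/$USER" 10
```

Anything the listing shows that you still need is a candidate for copying to group storage before the 14-day limit.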
Next steps
Copy-paste templates: serial, MPI, GPU, arrays, MATLAB, R, GROMACS, AlphaFold
H100/H200, CUDA, deep learning
PyTorch, TensorFlow, LLMs, distributed training
Right-sizing requests, checkpointing, I/O patterns
Stuck?
Troubleshooting — symptom-based index
Common Problems — fixes for the usual culprits
Submit a ticket: Caltech Help System
Email: help-hpc@caltech.edu