Common Problems

Authentication Issues

Password Not Working

  1. Verify your credentials at https://access.caltech.edu

  2. If that login succeeds but cluster login still fails, your account most likely lacks cluster entitlement

  3. Contact your PI or group admin to request access via the HPC admin console

  4. Alternatively, email help-hpc@caltech.edu

Connection Refused

The cluster accepts connections only from the campus network or over the Caltech VPN, so connections from anywhere else are refused.

Tip

Connect to Caltech VPN first, or access an on-campus machine as an intermediary.
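
To tell a network block apart from an authentication problem, it can help to test whether the SSH port is reachable at all. The following is a minimal sketch, not an official tool: the function name port_reachable is hypothetical, and it assumes that bash and the GNU coreutils timeout command are installed on your local machine.

```shell
# Check whether a TCP port accepts connections within a time limit.
# Relies on bash's /dev/tcp pseudo-device and the coreutils `timeout` command.
# Usage: port_reachable HOST [PORT] [TIMEOUT_SECONDS]
port_reachable() {
    host="$1"
    port="${2:-22}"      # default to the SSH port
    wait_s="${3:-5}"     # give up after this many seconds
    # The inner bash exits non-zero if the connection is refused;
    # `timeout` exits non-zero if nothing answers in time.
    timeout "$wait_s" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```

For example, running port_reachable login.hpc.caltech.edu 22 || echo "not reachable: connect to the VPN first" prints the message only when the port cannot be reached.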

Network & Connectivity

Frequent SSH Disconnections During Idle Periods

Add to your local ~/.ssh/config:

Host hpc
    Hostname login.hpc.caltech.edu
    ServerAliveInterval 60

Or pass the option on the command line when you connect:

ssh -o "ServerAliveInterval 60" login.hpc.caltech.edu
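
If you prefer to script the config change, the sketch below appends the same Host block only when no hpc alias exists yet, so it is safe to run more than once. The function name add_keepalive_host is hypothetical, and the default path assumes a standard OpenSSH client layout.

```shell
# Append the keepalive Host block to an SSH config file, unless an
# "hpc" alias is already defined there.
# Usage: add_keepalive_host [CONFIG_FILE]
add_keepalive_host() {
    config="${1:-$HOME/.ssh/config}"
    mkdir -p "$(dirname "$config")"
    touch "$config"
    # Skip the append if a "Host hpc" line already exists.
    if grep -Eq '^[[:space:]]*Host[[:space:]]+hpc[[:space:]]*$' "$config"; then
        echo "Host hpc is already defined in $config"
        return 0
    fi
    cat >> "$config" <<'EOF'

Host hpc
    Hostname login.hpc.caltech.edu
    ServerAliveInterval 60
EOF
    # OpenSSH expects the config file to be writable only by its owner.
    chmod 600 "$config"
}
```

After running add_keepalive_host once, ssh hpc connects with the keepalive setting applied.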

Alternative: Mosh (Mobile Shell)

Mosh is resilient to network interruptions and IP changes.

On the cluster:

module load mosh/1.4.0-gcc-11.3.1-72uzmod

Then connect from your local machine to an individual login node:

mosh username@login3.hpc.caltech.edu

Note

Mosh must target individual login nodes (login3 or login4) instead of the load balancer.

Computation Issues

Requested Cores Not Being Used

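A non-MPI application uses more than one core only if it is multithreaded, and many multithreaded applications must be told explicitly how many threads to start.

For OpenMP/MKL applications, set the thread count to match the number of cores you requested (32 in this example):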

export OMP_NUM_THREADS=32
export MKL_NUM_THREADS=32
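
Rather than hard-coding the value, a job script can derive it from the allocation. This is a minimal sketch: the function name set_thread_counts is hypothetical, and it relies on Slurm setting SLURM_CPUS_PER_TASK, which happens only when the job requests --cpus-per-task.

```shell
# Set OpenMP and MKL thread counts from the Slurm allocation.
# Falls back to 1 thread when SLURM_CPUS_PER_TASK is not set
# (for example, when the script runs outside a job).
set_thread_counts() {
    threads="${SLURM_CPUS_PER_TASK:-1}"
    export OMP_NUM_THREADS="$threads"
    export MKL_NUM_THREADS="$threads"
}

set_thread_counts
```

With this in place, changing --cpus-per-task in the job request is enough to keep the thread count and the allocated cores in step.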

Nested SRUNs Hanging on GPU Nodes

Add to your environment:

export SLURM_STEP_GRES=none

Out of Memory on Login Nodes

Cgroup limits cap memory use on the login and vislogin nodes at 8 GB.

Warning

Run memory-intensive processes on compute nodes only, not login nodes.
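
One way to avoid starting a heavy process on a login node by accident is a guard at the top of the script. The sketch below is illustrative only: both function names are hypothetical, and the host name patterns (login*, vislogin*) are assumptions based on the node names mentioned on this page, so adjust them to the actual names.

```shell
# Return success if the given host name looks like a login node.
# ASSUMPTION: login nodes are named login* or vislogin*.
is_login_node() {
    # Strip the domain part, then match on the short host name.
    case "${1%%.*}" in
        login*|vislogin*) return 0 ;;
        *) return 1 ;;
    esac
}

# Refuse to continue when the current host is a login node.
require_compute_node() {
    if is_login_node "$(uname -n)"; then
        echo "Refusing to run on login node $(uname -n); submit this as a job instead." >&2
        return 1
    fi
}
```

A script can then begin with require_compute_node || exit 1 before launching the memory-intensive step.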

Access & Display Issues

Home Directory Not Found in Open OnDemand

Your home directory is created the first time you log in over SSH. If Open OnDemand cannot find it, connect via SSH once:

ssh username@login.hpc.caltech.edu

After that, OnDemand sessions will find your home directory.

Python/Conda Segfaults from WSL

Override the LANG variable in your WSL shell before connecting:

export LANG=en_US.UTF-8
ssh login.hpc.caltech.edu

X11 Forwarding GLX Errors

On a Linux desktop, create or edit /usr/share/X11/xorg.conf.d/50-iglx.conf on your local machine:

Section "ServerFlags"
    Option "AllowIndirectGLX" "on"
    Option "IndirectGLX" "on"
EndSection

Then reboot.

On macOS with XQuartz, run:

defaults write org.macosforge.xquartz.X11 enable_iglx -bool true

Then restart XQuartz.

Still Having Issues?

Contact help-hpc@caltech.edu or submit a ticket through the Caltech Help System.