My password doesn't work but I know it is the correct one.
First, try to log in to https://access.caltech.edu with the same username and password. If you can log in to Access Caltech, then you have the correct username and password for the cluster. If you still cannot log in to the cluster, you probably do not have the entitlement to do so. If your group has not yet been set up to access the cluster, that should be done first. The PI or admin of your group should be able to add you on the HPC admin console. Your PI can also contact us at email@example.com and we can do it for them.
I am getting a "connection refused" error message when trying to connect to the cluster.
The cluster is accessible only from the campus network or via VPN, and you are likely on neither. If you already have VPN access, connect to the VPN first and then to the cluster. If you have a machine on campus that you can reach remotely, connect to it first and then to the cluster from there.
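If you have a campus machine that accepts SSH, one way to make both hops in a single step is OpenSSH's -J (ProxyJump) option. The sketch below only prints the command so you can check it first; both host names are placeholders, not real Caltech addresses, so substitute your own.

```shell
# Hypothetical host names: replace with your campus machine and the cluster login node.
JUMP_HOST="username@campus-machine.example.edu"
CLUSTER_HOST="username@cluster-login.example.edu"

# -J (ProxyJump) connects to the campus machine first, then on to the cluster.
# The command is printed rather than executed; remove "echo" to run it.
echo ssh -J "$JUMP_HOST" "$CLUSTER_HOST"
```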
I requested a lot of cores on a computer, but it is only using one.
If it is not an MPI job, then your executable may not be multithreaded, or you did not specify the number of threads. If you know your application is multithreaded with OpenMP but it is not using the additional cores, you may need to set the appropriate environment variable, typically OMP_NUM_THREADS, to the number of cores you requested.
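For an OpenMP program run under Slurm, a minimal sketch might look like this. It assumes the job was submitted with --cpus-per-task, since that is what makes Slurm set SLURM_CPUS_PER_TASK; the fallback of 1 is an illustrative choice for when the variable is absent.

```shell
# Match the OpenMP thread count to the cores Slurm granted to this task.
# SLURM_CPUS_PER_TASK is only set when --cpus-per-task was requested,
# so fall back to a single thread otherwise (assumed default).
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
echo "OpenMP threads: ${OMP_NUM_THREADS}"
```

Place these lines in the job script before the line that launches your application.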
Nested srun calls fail on GPU nodes.
If srun calls hang on GPU nodes after you start an interactive session, add this variable to your environment.
"Home directory not found" while connecting via Open OnDemand.
This usually occurs when a new user attempts their initial login via the graphical Open OnDemand front end rather than terminal-based SSH. Open OnDemand currently has no mechanism to create home directories, but SSH does. Once you log in via SSH for the first time, your home directory will be created and subsequent Open OnDemand sessions will function as intended.
Out of Memory while running a process on the login or vislogin nodes.
Cgroup memory limits are in effect on both the login and vislogin interactive nodes. To avoid out-of-memory errors, run processes that use more than 8GB of memory on compute nodes only. If the application requires that the job run on a login node, please contact HPC support.
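One way to move a memory-heavy process off the login node is to request an interactive shell on a compute node through Slurm. This is a sketch under assumptions: the 16G and one-hour values are illustrative, and your group may need additional options such as a partition or account.

```shell
# Illustrative values; choose memory and time limits that suit your job.
MEM="16G"
TIME_LIMIT="01:00:00"

if command -v srun >/dev/null 2>&1; then
    # Request an interactive login shell on a compute node with the memory above.
    srun --mem="$MEM" --time="$TIME_LIMIT" --pty /bin/bash -l
else
    # srun is only available where Slurm is installed.
    echo "srun not found: run this from a cluster login node"
fi
```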