Nvidia NGC is a hub for GPU-optimized software for deep learning, machine learning, and high-performance computing (HPC). These instructions use Singularity to pull and launch GPU-enabled containers on the central cluster.
Set up an Nvidia NGC account for use with Singularity
- Create an account on the NGC signup page
- Log in and navigate to "Setup" under your account name in the top right corner of the website.
- Under "Generate API Key", click "Get API Key".
- Click "Generate API Key" and confirm. Your API key is then displayed.
- Edit your ~/.bash_profile to add the following line, replacing the placeholder with your API key:
export SINGULARITY_DOCKER_PASSWORD='Your Nvidia NGC API Key'
- Create your scratch directory: mkdir -p /central/scratch/$USER
- Log out and back in, or source your ~/.bash_profile
- Load the Singularity module: module load singularity/3.5.2
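The ~/.bash_profile additions from the steps above might look like the following sketch. The SINGULARITY_DOCKER_USERNAME line does not appear in the steps above; it is an assumption based on NGC's convention of authenticating with the literal username $oauthtoken, so verify it against the current NGC documentation. The password value is a placeholder.

```shell
# ~/.bash_profile additions for pulling NGC images with Singularity.

# Assumption (not stated in the steps above): NGC expects the literal
# username $oauthtoken; single quotes stop the shell from expanding it.
export SINGULARITY_DOCKER_USERNAME='$oauthtoken'

# Placeholder: replace with the API key generated on the NGC website.
export SINGULARITY_DOCKER_PASSWORD='Your Nvidia NGC API Key'
```

Because the file then contains a credential, it may be worth restricting its permissions with chmod 600 ~/.bash_profile.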
Pull an image from NGC
- Check whether you can pull an image into Singularity from NGC.
- As a test, run the container by first gaining interactive access to a GPU-enabled node (or the Nvidia DGX) and then running the singularity exec command:
singularity exec --bind /central/scratch/$USER --nv tensorflow-20.02-tf1-py3.sif python -c 'import tensorflow as tf; print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices("GPU")))'
If all goes well, Python will print 'Num GPUs Available: 1', assuming one GPU was requested in the srun command used to obtain the interactive session.
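The pull and interactive-access steps could be sketched as follows. The image reference nvcr.io/nvidia/tensorflow:20.02-tf1-py3 is an assumption inferred from the .sif file name in the exec command, and the srun options shown in the comment are illustrative rather than the cluster's required settings. The output file name is passed explicitly so that it matches the name used in the exec command.

```shell
# Assumed image reference, inferred from the .sif file name used in the exec command.
IMAGE_TAG="20.02-tf1-py3"
IMAGE_URI="docker://nvcr.io/nvidia/tensorflow:${IMAGE_TAG}"
SIF_FILE="tensorflow-${IMAGE_TAG}.sif"

if command -v singularity >/dev/null 2>&1; then
    # Pull the image from NGC into the current directory, authenticating with
    # the SINGULARITY_DOCKER_* variables set in ~/.bash_profile.
    singularity pull "$SIF_FILE" "$IMAGE_URI"
else
    echo "singularity not found; run 'module load singularity/3.5.2' first" >&2
fi

# Next step, run by hand: request an interactive shell on a node with one GPU.
# The time limit and --gres value below are illustrative assumptions.
#   srun --pty -t 00:30:00 --gres=gpu:1 /bin/bash -l
```

Running the pull from /central/scratch/$USER keeps the large image file out of your home directory.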
Using the NGC tool to view the available images
- Load the NGC module with 'module load ngc/1.4.0'
- Set the NGC configuration with the command 'ngc config set'
- Enter the Nvidia NGC API key used previously
- Accept the defaults by pressing Enter unless other settings are required
- To test, list all images with the command 'ngc registry image list'
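Since the full image list can be long, one option is to filter it with grep once 'ngc config set' has been completed. This is a minimal sketch: the FILTER variable and its value are hypothetical, and only the 'ngc registry image list' command from the steps above is used.

```shell
# Hypothetical search term; change it to the framework you are looking for.
FILTER="tensorflow"

if command -v ngc >/dev/null 2>&1; then
    # List all images visible to the configured API key and keep matching rows.
    ngc registry image list | grep -i "$FILTER" \
        || echo "No images matching '$FILTER'" >&2
else
    echo "ngc not found; run 'module load ngc/1.4.0' first" >&2
fi
```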