Example Job Scripts

These are working SLURM templates to copy as a starting point — each one notes what it’s for and the lines worth changing. Drop in your own command, tune the resource requests (--mem, --time, --cpus-per-task, and the partition or GPU type) to match your job, then submit with sbatch your_script.sh.

Tip

Start with small resource requests and scale up based on what your completed jobs actually used.
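
One way to see what a finished job actually used is SLURM's accounting command, sacct. The sketch below wraps it in a small helper. The function name job_usage is made up for illustration, and it assumes job accounting is enabled on the cluster.

```shell
# Hypothetical helper: compare what a finished job requested with what it used.
# Assumes SLURM accounting is enabled so sacct has a record of the job.
job_usage() {
    local jobid="$1"
    if [ -z "$jobid" ]; then
        echo "usage: job_usage <jobid>" >&2
        return 1
    fi
    # Elapsed vs. Timelimit shows walltime headroom;
    # MaxRSS vs. ReqMem shows memory headroom.
    sacct -j "$jobid" \
          --format=JobID,JobName,Elapsed,Timelimit,ReqMem,MaxRSS,AllocCPUS,TotalCPU,State
}
```

Run job_usage with your own job ID once the job has ended. If MaxRSS sits far below ReqMem, or Elapsed far below Timelimit, the next submission can ask for less.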

Basic Examples

Serial Job

The simplest case — one task on a single core. Use it for scripts that aren’t parallelized, and raise --mem and --time to match what your code actually needs.

#!/bin/bash
#SBATCH --job-name=serial_job
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G
#SBATCH --time=01:00:00

module load python3/3.10.12
python my_script.py

Multi-threaded (OpenMP)

For code that spreads across the cores of a single node, such as OpenMP programs or threaded libraries like NumPy/MKL. --cpus-per-task sets how many cores you get, and the export line sets OMP_NUM_THREADS from $SLURM_CPUS_PER_TASK, so the thread count always matches the request and you don’t have to hard-code it.

#!/bin/bash
#SBATCH --job-name=openmp_job
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem=32G
#SBATCH --time=04:00:00

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
module load gcc/11.3.0
./my_openmp_program
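
If a script like this is ever run without --cpus-per-task (or outside SLURM, for a quick test), $SLURM_CPUS_PER_TASK is unset and the export leaves OMP_NUM_THREADS empty. A minimal sketch of a more defensive version that falls back to one thread. The MKL_NUM_THREADS and OPENBLAS_NUM_THREADS lines are an assumption that your code uses those libraries.

```shell
# Fall back to 1 thread when SLURM_CPUS_PER_TASK is unset
# (SLURM only sets it when --cpus-per-task is given).
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"

# Assumption: the program links MKL or OpenBLAS. Both libraries read
# these variables; setting them is harmless if they are not used.
export MKL_NUM_THREADS="$OMP_NUM_THREADS"
export OPENBLAS_NUM_THREADS="$OMP_NUM_THREADS"

echo "Running with $OMP_NUM_THREADS thread(s)"
```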

MPI (Multi-node)

For programs that scale beyond a single node with MPI. This requests 2 nodes of 32 tasks each (64 ranks total); srun launches one rank per task and spreads them across the nodes for you.

#!/bin/bash
#SBATCH --job-name=mpi_job
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=32
#SBATCH --mem-per-cpu=4G
#SBATCH --time=08:00:00

module load openmpi/4.1.0
srun ./my_mpi_program
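
The resource arithmetic behind this request, written out as a quick check. It assumes the default of one CPU per task, which applies here because the script does not set --cpus-per-task. The variable names are illustrative only.

```shell
# Values copied from the #SBATCH lines of the MPI template.
nodes=2
tasks_per_node=32
mem_per_cpu_gb=4

# One CPU per task (the SLURM default), so ranks == CPUs.
total_ranks=$(( nodes * tasks_per_node ))
mem_per_node_gb=$(( tasks_per_node * mem_per_cpu_gb ))
total_mem_gb=$(( nodes * mem_per_node_gb ))

echo "MPI ranks:       $total_ranks"
echo "Memory per node: ${mem_per_node_gb}G"
echo "Memory in total: ${total_mem_gb}G"
```

Each node therefore needs 32 free cores and 128G of memory before the job can start. Redo the arithmetic whenever you change --ntasks-per-node or --mem-per-cpu.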

GPU Jobs

Reach for these whenever the work actually leans on a GPU: training or fine-tuning a model, a CUDA program, GPU-accelerated MD or genomics. The templates below cover the usual shapes; the one rule that trips almost everyone up is that you have to ask for a GPU type, not just a count (see the list of valid type tokens at the end of this section).

Single GPU

The everyday GPU job — one card with a few CPU cores to keep it fed. This is the one you want for inference, prototyping, or any model that fits on a single GPU. Swap p100 for whatever type suits the work.

#!/bin/bash
#SBATCH --job-name=gpu_job
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=32G
#SBATCH --gres=gpu:p100:1
#SBATCH --time=04:00:00

module load cuda/11.8
nvidia-smi
./my_gpu_program

Multi-GPU

For when one GPU isn’t enough and your code can actually split the work, typically distributed or data-parallel training. This grabs four GPUs and 32 CPU cores on a single node, so each GPU gets 8 cores to keep data loading from leaving it idle.

#!/bin/bash
#SBATCH --job-name=multi_gpu
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=32
#SBATCH --mem=128G
#SBATCH --gres=gpu:p100:4
#SBATCH --time=24:00:00

module load cuda/11.8
python train_distributed.py
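
To keep the CPU-to-GPU ratio explicit as you change the request, the job can compute it at run time. A sketch under a few assumptions: SLURM sets SLURM_GPUS_ON_NODE and SLURM_CPUS_PER_TASK inside the job (the defaults after :- mirror this template and only apply outside a job), and WORKERS_PER_GPU is a hypothetical variable that your training script would have to read itself.

```shell
# GPUs and CPU cores allocated to this job; defaults mirror the template
# and are only used when the script runs outside SLURM.
num_gpus="${SLURM_GPUS_ON_NODE:-4}"
num_cpus="${SLURM_CPUS_PER_TASK:-32}"

# Integer share of CPU cores available to feed each GPU.
cpus_per_gpu=$(( num_cpus / num_gpus ))

# Hypothetical variable: nothing reads this unless your own code does.
export WORKERS_PER_GPU="$cpus_per_gpu"

echo "GPUs: $num_gpus, CPUs: $num_cpus, CPU cores per GPU: $cpus_per_gpu"
```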

Specific GPU Type

When the job really needs a particular card — a large model that only fits on an H100, or a benchmark you want pinned to one architecture. The type token in --gres is how you ask for exactly that and nothing else.

#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gres=gpu:h100:1
#SBATCH --cpus-per-task=16
#SBATCH --mem=64G
#SBATCH --time=12:00:00

module load cuda/12.0
python train_large_model.py

Valid gres type tokens: p100, v100, h100, nvidia_h200, nvidia_l40s. Note the nvidia_ prefix on the H200 and L40S tokens, which the cluster’s SLURM config requires.
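
A small check before submitting can catch a mistyped token, such as h200 without the nvidia_ prefix. A minimal sketch: the function name is hypothetical, and the token list is copied from the line above, so it needs updating if the cluster adds GPU types.

```shell
# Tokens copied from the list of valid gres type tokens.
VALID_GPU_TYPES="p100 v100 h100 nvidia_h200 nvidia_l40s"

# Hypothetical helper: succeeds only if $1 is exactly one of the valid tokens.
is_valid_gpu_type() {
    case " $VALID_GPU_TYPES " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

for t in h100 h200 nvidia_h200; do
    if is_valid_gpu_type "$t"; then
        echo "$t: ok, use --gres=gpu:$t:1"
    else
        echo "$t: not a valid type token"
    fi
done
```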

Python & Conda

Two everyday ways to bring your Python setup into a job — activating a Conda environment you manage yourself, or running a notebook unattended.

Conda Environment

The go-to when your project lives in its own Conda environment with its own pinned packages. Activate it at the top of the job exactly as you would in a terminal — just point the source line at your own miniconda install.

#!/bin/bash
#SBATCH --job-name=conda_job
#SBATCH --nodes=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=02:00:00

source /resnick/groups/mygroup/$USER/miniconda3/etc/profile.d/conda.sh
conda activate myenv
python analysis.py

Jupyter Batch

For when you’re done poking at a notebook interactively and want to run the whole thing as a job — a long sweep, or something you’d rather start and walk away from. nbconvert --execute runs every cell top to bottom and saves the executed copy.

#!/bin/bash
#SBATCH --job-name=jupyter_batch
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=01:00:00

module load python3/3.10.12
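# --to notebook keeps the result a notebook; nbconvert's default output format is HTML
jupyter nbconvert --to notebook --execute my_notebook.ipynb --output executed.ipynb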

Job Arrays

Job arrays are the answer to “I need to run the same thing hundreds of times, just with a different input each time.” One script, one submission, and SLURM fans it out into numbered tasks — each one gets a $SLURM_ARRAY_TASK_ID it can use to grab its own slice of the work.

Parameter Sweep

The classic array job — one task per input file (100 of them here), with each task using its index to pick the file it processes.

#!/bin/bash
#SBATCH --job-name=array_job
#SBATCH --array=1-100
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G
#SBATCH --time=01:00:00

python process.py --input data_${SLURM_ARRAY_TASK_ID}.csv
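
When the inputs are not conveniently named data_1.csv through data_100.csv, a common pattern is to list them in a text file and let each task read the line matching its index. A sketch, assuming a hypothetical inputs.txt with one path per line and an --array range that runs from 1 to the number of lines:

```shell
# Hypothetical helper: print line N (1-based) of a file.
nth_line() {
    sed -n "${2}p" "$1"
}

# Inside the array job: pick this task's input from the list.
# inputs.txt is an assumed file name; the :-1 default only matters
# when testing the script outside SLURM.
if [ -f inputs.txt ]; then
    input_file=$(nth_line inputs.txt "${SLURM_ARRAY_TASK_ID:-1}")
    echo "Task ${SLURM_ARRAY_TASK_ID:-1} will process: $input_file"
    # python process.py --input "$input_file"
fi
```

The design choice here is that the list file, not the file naming scheme, defines the work, so inputs can be added or reordered without renaming anything.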

Limit Concurrent Jobs

Tack %N onto the array range to throttle how many tasks run at once — a courtesy when you’d otherwise swamp the queue, and sometimes a necessity when you’re sharing a limited pool of software licenses:

#SBATCH --array=1-500%10

All 500 tasks still queue up, but no more than 10 run at the same time.
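
The limit can also be changed after submission with scontrol, which helps if the queue turns out to be emptier or busier than expected. A sketch: the wrapper name is hypothetical, while ArrayTaskThrottle is the scontrol field that corresponds to %N.

```shell
# Hypothetical helper: change the %N limit of an array job already in the queue.
set_array_throttle() {
    local jobid="$1" limit="$2"
    if [ -z "$jobid" ] || [ -z "$limit" ]; then
        echo "usage: set_array_throttle <jobid> <max_running_tasks>" >&2
        return 1
    fi
    scontrol update JobId="$jobid" ArrayTaskThrottle="$limit"
}

# Example (with your own job ID): allow 25 tasks at once instead of 10.
#   set_array_throttle 123456 25
```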

Applications

Starting points for some of the software we get asked about most. The shape is always the same — load the module, then run the tool in batch (non-interactive) mode — but the exact invocation differs per app, so here are the ones worth copying.

MATLAB

Running MATLAB with no desktop. -nodisplay -nosplash keeps it headless, and the trailing exit makes sure the job ends instead of hanging at an open MATLAB prompt.

#!/bin/bash
#SBATCH --cpus-per-task=8
#SBATCH --mem=32G
#SBATCH --time=04:00:00

module load matlab/r2023a
matlab -nodisplay -nosplash -r "run('my_script.m'); exit"

R

A plain Rscript run. If your analysis leans on a parallel backend like future or doParallel, bump --cpus-per-task to give it the cores to work with.

#!/bin/bash
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=02:00:00

module load R/4.2.0
Rscript analysis.R

GROMACS

A GPU-accelerated molecular-dynamics run. -nb gpu hands the heavy nonbonded work to the GPU, which is where most of the speedup comes from.

#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gres=gpu:p100:1
#SBATCH --cpus-per-task=8
#SBATCH --mem=32G
#SBATCH --time=24:00:00

module load gromacs/2023.1
gmx mdrun -deffnm production -nb gpu

AlphaFold

A full AlphaFold 3 structure prediction on an H100 — these are big, slow jobs, which is why the walltime runs to a full day. Note it reads inputs from /home and writes results out to /resnick/scratch.

#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gres=gpu:h100:1
#SBATCH --cpus-per-task=16
#SBATCH --mem=64G
#SBATCH --time=1-00:00:00

export MODEL_DIR=/home/$USER/alphafold3_models/
module load alphafold/3.0.3
alphafold --output_dir=/resnick/scratch/$USER/alphafold/out \
          --json_path=/home/$USER/alphafold/input.json

Job Dependencies

When jobs have to run in a certain order — this one has to finish before that one can start — dependencies wire them together so you’re not sitting there launching each stage by hand. --parsable hands you back just the job ID to hang the next step on, and afterok means “only if it finished successfully.”

Sequential Pipeline

A straight chain — each step waits for the one before it to finish cleanly before it starts.

JOB1=$(sbatch --parsable step1.sh)
JOB2=$(sbatch --parsable --dependency=afterok:$JOB1 step2.sh)
sbatch --dependency=afterok:$JOB2 step3.sh
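
For longer chains, the same pattern can be written as a loop. A minimal sketch with a hypothetical helper name, submit_chain; it stops at the first submission that fails, so later steps are never queued against a missing job ID.

```shell
# Hypothetical helper: submit scripts in order, each depending on the previous one.
submit_chain() {
    local prev="" jid script
    for script in "$@"; do
        if [ -z "$prev" ]; then
            jid=$(sbatch --parsable "$script") || return 1
        else
            jid=$(sbatch --parsable --dependency=afterok:"$prev" "$script") || return 1
        fi
        # --parsable prints "jobid" or "jobid;cluster"; keep only the ID.
        prev=${jid%%;*}
        echo "$script -> job $prev"
    done
}

# Equivalent to the three lines above:
#   submit_chain step1.sh step2.sh step3.sh
```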

Fan-out, Fan-in

One prep step feeds three analyses that run side by side, then a final job holds until all three are done before it pulls the results together.

PREP=$(sbatch --parsable preprocess.sh)
A1=$(sbatch --parsable --dependency=afterok:$PREP analyze1.sh)
A2=$(sbatch --parsable --dependency=afterok:$PREP analyze2.sh)
A3=$(sbatch --parsable --dependency=afterok:$PREP analyze3.sh)
sbatch --dependency=afterok:$A1:$A2:$A3 aggregate.sh
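
With more than a handful of parallel analyses, the fan-in dependency string is easier to build than to type out. A sketch with a hypothetical helper, afterok_all, that joins any number of job IDs with colons:

```shell
# Hypothetical helper: join job IDs into "afterok:ID1:ID2:...".
afterok_all() {
    local IFS=:
    echo "afterok:$*"
}

# Usage inside a submit script (script names follow the example above):
#   PREP=$(sbatch --parsable preprocess.sh)
#   ids=""
#   for s in analyze1.sh analyze2.sh analyze3.sh; do
#       ids="$ids $(sbatch --parsable --dependency=afterok:$PREP "$s")"
#   done
#   sbatch --dependency="$(afterok_all $ids)" aggregate.sh

afterok_all 101 102 103   # prints afterok:101:102:103
```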

Email Notifications

Have SLURM email you when a job starts, finishes, or dies — well worth it for the long runs you’d rather not keep checking on by hand.

#SBATCH --mail-user=you@caltech.edu
#SBATCH --mail-type=BEGIN,END,FAIL

Generate a Custom Script

Prefer to fill in a form? Use the interactive job script generator to produce a ready-to-submit SBATCH file.