Example Job Scripts
These are working SLURM templates to copy as a starting point — each one notes what it’s for and the lines worth changing. Drop in your own command, tune the resource requests (--mem, --time, --cpus-per-task, and the partition or GPU type) to match your job, then submit with sbatch your_script.sh.
Tip
Start with a small request, check what the finished job actually used (elapsed time and peak memory), and scale up from there.
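One way to see what a finished job actually used is SLURM's sacct accounting command. The sketch below wraps it in a small helper; the function name usage_report and the job ID 123456 are made up for illustration, and it assumes job accounting is enabled on the cluster.

```shell
#!/bin/bash
# Hypothetical helper: print elapsed time and peak memory for a finished job.
usage_report() {
  local jobid="$1"
  if ! command -v sacct >/dev/null 2>&1; then
    echo "sacct not found; run this on a cluster login node" >&2
    return 1
  fi
  # Compare Elapsed with your --time request and MaxRSS with your --mem request
  sacct -j "$jobid" --format=JobID,JobName,Elapsed,MaxRSS,ReqMem,State
}

# Example call (123456 is a placeholder job ID):
# usage_report 123456
```

If MaxRSS sits far below ReqMem, or Elapsed far below the requested walltime, the next submission can probably ask for less.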
Basic Examples
Serial Job
The simplest case — one task on a single core. Use it for scripts that aren’t parallelized, and raise --mem and --time to match what your code actually needs.
#!/bin/bash
#SBATCH --job-name=serial_job
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G
#SBATCH --time=01:00:00
module load python3/3.10.12
python my_script.py
Multi-threaded (OpenMP)
For code that spreads across the cores of a single node: OpenMP programs or threaded libraries like NumPy/MKL. --cpus-per-task sets how many cores you get, and the export line sets OMP_NUM_THREADS from $SLURM_CPUS_PER_TASK so you don’t have to hard-code the thread count.
#!/bin/bash
#SBATCH --job-name=openmp_job
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem=32G
#SBATCH --time=04:00:00
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
module load gcc/11.3.0
./my_openmp_program
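SLURM only sets SLURM_CPUS_PER_TASK when --cpus-per-task is given, so if that line is ever dropped, or the script is run outside a job for a quick test, the export above leaves OMP_NUM_THREADS empty. A defensive variant, offered as a sketch rather than a required change, falls back to one thread:

```shell
#!/bin/bash
# Use the SLURM allocation when present; otherwise fall back to a single thread
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
echo "Running with ${OMP_NUM_THREADS} OpenMP thread(s)"
```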
MPI (Multi-node)
For programs that scale beyond a single node with MPI. This requests 2 nodes with 32 tasks each (64 ranks total) and 4G of memory per core; srun launches one rank per task and spreads them across the nodes for you.
#!/bin/bash
#SBATCH --job-name=mpi_job
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=32
#SBATCH --mem-per-cpu=4G
#SBATCH --time=08:00:00
module load openmpi/4.1.0
srun ./my_mpi_program
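The totals implied by this request can be checked with a little shell arithmetic. This is only a sketch: it assumes SLURM's default of one core per task (the template sets no --cpus-per-task), and the variable names are illustrative.

```shell
#!/bin/bash
# Values copied from the #SBATCH lines above
nodes=2
tasks_per_node=32
mem_per_cpu_gb=4
cpus_per_task=1   # assumption: SLURM default, since --cpus-per-task is not set

total_ranks=$(( nodes * tasks_per_node ))
mem_per_node_gb=$(( tasks_per_node * cpus_per_task * mem_per_cpu_gb ))
echo "MPI ranks: ${total_ranks}"
echo "Memory requested per node: ${mem_per_node_gb} GB"
```

Checking the per-node memory figure against what the nodes actually have is worthwhile before submitting, since --mem-per-cpu scales with the task count.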
GPU Jobs
Reach for these whenever the work actually leans on a GPU — training or fine-tuning a model, a CUDA program, GPU-accelerated MD or genomics. The templates below cover the usual shapes; the one rule that trips almost everyone up is that you have to ask for a GPU type, not just a count (see the note).
See also
For detailed GPU guidance, see GPU Computing and AI/ML Guide.
Important
GPU jobs must specify both --partition=gpu and a typed gres of the form --gres=gpu:<type>:<count>. Bare --gres=gpu:N (no type) is rejected by the scheduler. Valid type tokens: p100, v100, h100, nvidia_h200, nvidia_l40s.
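The typed-gres rule can be checked before submitting. The helper below is a hypothetical pre-submit check, not part of SLURM or the cluster tooling; it only encodes the form and the type tokens listed above.

```shell
#!/bin/bash
# Hypothetical pre-submit check: accept only gpu:<type>:<count> with a listed type.
check_gres() {
  local rest="${1#gpu:}" gpu_type count
  [ "$rest" != "$1" ] || return 1        # must start with "gpu:"
  gpu_type="${rest%%:*}"
  count="${rest#*:}"
  case "$gpu_type" in
    p100|v100|h100|nvidia_h200|nvidia_l40s) ;;
    *) return 1 ;;                       # missing or unknown type token
  esac
  case "$count" in
    ''|*[!0-9]*) return 1 ;;             # count must be a whole number
  esac
  [ "$count" -ge 1 ]
}

check_gres "gpu:h100:1" && echo "ok: gpu:h100:1"
check_gres "gpu:2" || echo "rejected: gpu:2 (no type)"
```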
Single GPU
The everyday GPU job — one card with a few CPU cores to keep it fed. This is the one you want for inference, prototyping, or any model that fits on a single GPU. Swap p100 for whatever type suits the work.
#!/bin/bash
#SBATCH --job-name=gpu_job
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=32G
#SBATCH --gres=gpu:p100:1
#SBATCH --time=04:00:00
module load cuda/11.8
nvidia-smi
./my_gpu_program
Multi-GPU
For when one GPU isn’t enough and your code can actually split the work, typically distributed or data-parallel training. This grabs four GPUs on a single node; give each one a healthy share of CPU cores (8 each here, from 32 cores across 4 GPUs) so feeding data doesn’t leave them idle.
#!/bin/bash
#SBATCH --job-name=multi_gpu
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=32
#SBATCH --mem=128G
#SBATCH --gres=gpu:p100:4
#SBATCH --time=24:00:00
module load cuda/11.8
python train_distributed.py
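The per-GPU core share can also be derived inside the job instead of being hard-coded. This sketch assumes a SLURM version that exports SLURM_GPUS_ON_NODE; the function name cores_per_gpu is illustrative, and the fallback values simply mirror the template above.

```shell
#!/bin/bash
# Hypothetical helper: integer share of CPU cores per GPU.
cores_per_gpu() {
  echo $(( $1 / $2 ))
}

# Fall back to the template's numbers when the SLURM variables are not set
share=$(cores_per_gpu "${SLURM_CPUS_PER_TASK:-32}" "${SLURM_GPUS_ON_NODE:-4}")
echo "CPU cores available per GPU: ${share}"
```

A value like this is often used as the number of data-loading workers per GPU; how it gets passed to the training script depends on that script.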
Specific GPU Type
When the job really needs a particular card — a large model that only fits on an H100, or a benchmark you want pinned to one architecture. The type token in --gres is how you ask for exactly that and nothing else.
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gres=gpu:h100:1
#SBATCH --cpus-per-task=16
#SBATCH --mem=64G
#SBATCH --time=12:00:00
module load cuda/12.0
python train_large_model.py
Valid gres type tokens: p100, v100, h100, nvidia_h200, nvidia_l40s (note the nvidia_ prefix on H200 and L40s — required by the cluster’s SLURM config).
Python & Conda
Two everyday ways to bring your Python setup into a job — activating a Conda environment you manage yourself, or running a notebook unattended.
Conda Environment
The go-to when your project lives in its own Conda environment with its own pinned packages. Activate it at the top of the job exactly as you would in a terminal — just point the source line at your own miniconda install.
#!/bin/bash
#SBATCH --job-name=conda_job
#SBATCH --nodes=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=02:00:00
source /resnick/groups/mygroup/$USER/miniconda3/etc/profile.d/conda.sh
conda activate myenv
python analysis.py
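If the path to conda.sh is wrong, the plain script above carries on and runs python from whatever happens to be on the PATH. A guarded variant can stop early instead. This is a sketch: activate_env is a made-up helper name, and the path and environment name are the placeholders from the template.

```shell
#!/bin/bash
# Hypothetical helper: source conda.sh and activate an environment,
# returning non-zero if conda.sh is missing or activation fails.
activate_env() {
  local conda_sh="$1" env_name="$2"
  if [ ! -f "$conda_sh" ]; then
    echo "conda.sh not found at $conda_sh" >&2
    return 1
  fi
  source "$conda_sh"
  conda activate "$env_name"
}

# In the job script (placeholders as in the template above):
# activate_env "/resnick/groups/mygroup/$USER/miniconda3/etc/profile.d/conda.sh" myenv || exit 1
# python analysis.py
```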
Jupyter Batch
For when you’re done poking at a notebook interactively and want to run the whole thing as a job — a long sweep, or something you’d rather start and walk away from. nbconvert --execute runs every cell top to bottom and saves the executed copy.
#!/bin/bash
#SBATCH --job-name=jupyter_batch
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=01:00:00
module load python3/3.10.12
jupyter nbconvert --to notebook --execute my_notebook.ipynb --output executed.ipynb
Job Arrays
Job arrays are the answer to “I need to run the same thing hundreds of times, just with a different input each time.” One script, one submission, and SLURM fans it out into numbered tasks — each one gets a $SLURM_ARRAY_TASK_ID it can use to grab its own slice of the work.
Parameter Sweep
The classic array job — one task per input file (100 of them here), with each task using its index to pick the file it processes.
#!/bin/bash
#SBATCH --job-name=array_job
#SBATCH --array=1-100
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G
#SBATCH --time=01:00:00
python process.py --input data_${SLURM_ARRAY_TASK_ID}.csv
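When the inputs are not conveniently numbered, a common alternative is to list them in a text file and let each task read the line that matches its index. The sketch below assumes a file named inputs.txt with one path per line; both that file and the helper name pick_input are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print line N of a list file (N is 1-based).
pick_input() {
  local list_file="$1" index="$2"
  sed -n "${index}p" "$list_file"
}

# Only act inside an array task, and only if the assumed list file exists
if [ -n "${SLURM_ARRAY_TASK_ID:-}" ] && [ -f inputs.txt ]; then
  input=$(pick_input inputs.txt "$SLURM_ARRAY_TASK_ID")
  python process.py --input "$input"
fi
```

With this approach the --array range should match the number of lines in the list file.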
Limit Concurrent Jobs
Tack %N onto the array range to throttle how many tasks run at once — a courtesy when you’d otherwise swamp the queue, and sometimes a necessity when you’re sharing a limited pool of software licenses:
#SBATCH --array=1-500%10
All 500 tasks still queue up, but no more than 10 run at the same time.
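Throttling stretches the total run time, and a rough upper bound is easy to work out. The arithmetic below is a sketch that assumes every task uses a full one-hour walltime and that all 10 slots start immediately; real arrays usually finish sooner.

```shell
#!/bin/bash
tasks=500
limit=10
hours_per_task=1   # assumption: worst case, each task runs for a full hour

waves=$(( (tasks + limit - 1) / limit ))   # ceiling division
worst_case_hours=$(( waves * hours_per_task ))
echo "waves: ${waves}, worst-case run time: ${worst_case_hours} h"
```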
Applications
Starting points for some of the software we get asked about most. The shape is always the same — load the module, then run the tool in batch (non-interactive) mode — but the exact invocation differs per app, so here are the ones worth copying.
MATLAB
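Running MATLAB with no desktop. -nodisplay -nosplash keeps it headless, and the trailing exit makes sure the job ends instead of hanging at an open MATLAB prompt. If the script throws an error, that trailing exit is never reached; on R2019a and later, matlab -batch "my_script" is an alternative that exits with a non-zero status on error.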
#!/bin/bash
#SBATCH --cpus-per-task=8
#SBATCH --mem=32G
#SBATCH --time=04:00:00
module load matlab/r2023a
matlab -nodisplay -nosplash -r "run('my_script.m'); exit"
R
A plain Rscript run. If your analysis leans on a parallel backend like future or doParallel, bump --cpus-per-task to give it the cores to work with.
#!/bin/bash
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=02:00:00
module load R/4.2.0
Rscript analysis.R
GROMACS
A GPU-accelerated molecular-dynamics run. -nb gpu hands the heavy nonbonded work to the GPU, which is where most of the speedup comes from.
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gres=gpu:p100:1
#SBATCH --cpus-per-task=8
#SBATCH --mem=32G
#SBATCH --time=24:00:00
module load gromacs/2023.1
gmx mdrun -deffnm production -nb gpu
AlphaFold
A full AlphaFold 3 structure prediction on an H100 — these are big, slow jobs, which is why the walltime runs to a full day. Note it reads inputs from /home and writes results out to /resnick/scratch.
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gres=gpu:h100:1
#SBATCH --cpus-per-task=16
#SBATCH --mem=64G
#SBATCH --time=1-00:00:00
export MODEL_DIR=/home/$USER/alphafold3_models/
module load alphafold/3.0.0
alphafold --output_dir=/resnick/scratch/$USER/alphafold/out \
--json_path=/home/$USER/alphafold/input.json
Job Dependencies
When jobs have to run in a certain order — this one has to finish before that one can start — dependencies wire them together so you’re not sitting there launching each stage by hand. --parsable hands you back just the job ID to hang the next step on, and afterok means “only if it finished successfully.”
Sequential Pipeline
A straight chain — each step waits for the one before it to finish cleanly before it starts.
JOB1=$(sbatch --parsable step1.sh)
JOB2=$(sbatch --parsable --dependency=afterok:$JOB1 step2.sh)
sbatch --dependency=afterok:$JOB2 step3.sh
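For longer pipelines the same pattern can be written as a loop. This is a sketch under the same assumptions as the three lines above; submit_chain is a made-up helper name, and the step scripts are the placeholders already used.

```shell
#!/bin/bash
# Hypothetical helper: submit scripts in order, each depending on the previous one.
submit_chain() {
  local prev="" script
  for script in "$@"; do
    if [ -z "$prev" ]; then
      prev=$(sbatch --parsable "$script") || return 1
    else
      prev=$(sbatch --parsable --dependency=afterok:"$prev" "$script") || return 1
    fi
    prev="${prev%%;*}"   # --parsable may append ";cluster"; keep only the job ID
  done
  echo "$prev"           # ID of the last job in the chain
}

# Only submit when sbatch is actually available
if command -v sbatch >/dev/null 2>&1; then
  submit_chain step1.sh step2.sh step3.sh
fi
```

The helper stops at the first failed submission, so a typo in one script name does not leave later stages queued behind a job that never existed.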
Fan-out, Fan-in
One prep step feeds three analyses that run side by side, then a final job holds until all three are done before it pulls the results together.
PREP=$(sbatch --parsable preprocess.sh)
A1=$(sbatch --parsable --dependency=afterok:$PREP analyze1.sh)
A2=$(sbatch --parsable --dependency=afterok:$PREP analyze2.sh)
A3=$(sbatch --parsable --dependency=afterok:$PREP analyze3.sh)
sbatch --dependency=afterok:$A1:$A2:$A3 aggregate.sh
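When the number of parallel analyses varies, the colon-separated dependency list can be built rather than typed out. The sketch below uses a hypothetical join_deps helper and the same placeholder script names as above.

```shell
#!/bin/bash
# Hypothetical helper: join job IDs into "afterok:ID1:ID2:..."
join_deps() {
  local IFS=:
  echo "afterok:$*"
}

# Only submit when sbatch is actually available
if command -v sbatch >/dev/null 2>&1; then
  prep=$(sbatch --parsable preprocess.sh)
  set --   # collect the analysis job IDs in the positional parameters
  for script in analyze1.sh analyze2.sh analyze3.sh; do
    set -- "$@" "$(sbatch --parsable --dependency=afterok:"$prep" "$script")"
  done
  sbatch --dependency="$(join_deps "$@")" aggregate.sh
fi
```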
Email Notifications
Have SLURM email you when a job starts, finishes, or dies — well worth it for the long runs you’d rather not keep checking on by hand.
#SBATCH --mail-user=you@caltech.edu
#SBATCH --mail-type=BEGIN,END,FAIL
Generate a Custom Script
Prefer to fill in a form? Use the interactive job script generator to produce a ready-to-submit SBATCH file.
See Also
SLURM Commands — full command reference
Best Practices — tips for efficiency
GPU Computing — detailed GPU guidance
AI/ML Guide — PyTorch, TensorFlow, LLMs
Common Problems — troubleshooting