Dependencies and pipelines

Using dependencies in Slurm

Slurm lets you make a job wait on the state of other jobs. Dependencies are set when you submit the job and can be used to build pipelines. A single job can depend on more than one other job.

To use dependencies, submit the job with the following switch. If you use multiple dependency types, separate them with commas:

-d, --dependency=<dependency_list>
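As an illustration of the syntax, here is a minimal sketch that assembles a dependency list combining two types. The job IDs (1001, 1002, 1003) and the script name postprocess.sh are hypothetical placeholders. The command is only printed, not executed, so it can be inspected first.

```shell
#!/bin/bash
# Hypothetical IDs of jobs that are already in the queue.
ok_jobs="1001:1002"   # these must finish successfully
any_jobs="1003"       # this one only has to finish

# Job IDs within one type are joined with colons;
# different dependency types are joined with commas.
dep_list="afterok:${ok_jobs},afterany:${any_jobs}"

# Dry run: print the command. Remove "echo" to actually submit.
# postprocess.sh is a placeholder script name.
echo sbatch --dependency="$dep_list" postprocess.sh
```

With the comma separator, all listed dependencies must be satisfied before the job can start.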

Types of dependencies

Here are the most common types of dependencies you can specify:

after:job_id[:jobid...]

This job can begin execution after the specified jobs have begun execution.

afterany:job_id[:jobid...]

This job can begin execution after the specified jobs have terminated.

afternotok:job_id[:jobid...]

This job can begin execution after the specified jobs have terminated in some failed state (non-zero exit code, node failure, timed out, etc).

afterok:job_id[:jobid...]

This job can begin execution after the specified jobs have successfully executed (ran to completion with an exit code of zero).

singleton

This job can begin execution after any previously launched jobs sharing the same job name and user have terminated. In other words, only one job by that name and owned by that user can be running or suspended at any point in time.
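The dependency types above can be combined to branch on success or failure. The following is a sketch only: the helper dep_cmd, the job ID 2001, the job name nightly, and the script names report.sh, cleanup.sh, and nightly.sh are all hypothetical. It prints the sbatch commands instead of running them.

```shell
#!/bin/bash
# Print the sbatch command for a job that waits on other jobs.
# Usage: dep_cmd <type> <script> <job_id> [<job_id>...]
# (dep_cmd is a hypothetical helper, not part of Slurm.)
dep_cmd() {
    local dep_type=$1 script=$2
    shift 2
    local IFS=:   # "$*" now joins the remaining job IDs with colons
    echo "sbatch --dependency=${dep_type}:$* ${script}"
}

main_job=2001   # hypothetical ID of a job submitted earlier

dep_cmd afterok    report.sh  "$main_job"   # starts only if 2001 succeeds
dep_cmd afternotok cleanup.sh "$main_job"   # starts only if 2001 fails

# singleton takes no job ID; it keys on the job name and the user.
echo "sbatch --dependency=singleton --job-name=nightly nightly.sh"
```

Only one of report.sh and cleanup.sh would ever start, since the two conditions are mutually exclusive.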

Setting up a pipeline of dependencies

You can write a script that submits the jobs one after another, passing each job the dependencies you want it to have. Here is an example of such a script:

#!/bin/bash

# first job - no dependencies
jid1=$(sbatch -t 00:10:00 job1.sh | sed 's/Submitted batch job //')

# multiple jobs can depend on a single job
jid2=$(sbatch -t 00:10:00 --dependency=afterany:$jid1 job2.sh | sed 's/Submitted batch job //')
jid3=$(sbatch -t 00:10:00 --dependency=afterany:$jid1 job3.sh | sed 's/Submitted batch job //')

In this case, job1.sh, job2.sh, and job3.sh all had the following contents:

#!/bin/bash
#SBATCH --job-name=dependency_test
#SBATCH --output=../out_err/depend-test-%j.out
#SBATCH --error=../out_err/depend-test-%j.err
#SBATCH --partition=any
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH -t 0:10:00

echo "Starting at $(date)"
echo "Running on hosts: $SLURM_JOB_NODELIST"
echo "Running on $SLURM_NNODES nodes."
echo "Running on $SLURM_NPROCS processors."
echo "Current working directory is $(pwd)"
echo "Running a dependency test"

sleep 300
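The submission script shown earlier could be extended so that a final job waits for both job2.sh and job3.sh, which illustrates a job depending on more than one other job. This is a sketch under a few assumptions: job4.sh is a hypothetical fourth script, submit and run_pipeline are illustrative helper names, and afterok is used instead of afterany so that later stages only start when earlier ones succeed. It uses sbatch's --parsable option, which prints only the job ID (followed by a semicolon and the cluster name on multi-cluster setups), so sed is not needed.

```shell
#!/bin/bash
# Submit one job and print only its numeric job ID.
submit() {
    local out
    out=$(sbatch --parsable "$@") || return 1
    echo "${out%%;*}"   # strip a trailing ";cluster" if present
}

# Pipeline shape: job1 -> (job2, job3) -> job4
run_pipeline() {
    local jid1 jid2 jid3 jid4

    # first job - no dependencies
    jid1=$(submit -t 00:10:00 job1.sh) || return 1

    # two jobs that each wait for job1 to succeed
    jid2=$(submit -t 00:10:00 --dependency=afterok:"$jid1" job2.sh) || return 1
    jid3=$(submit -t 00:10:00 --dependency=afterok:"$jid1" job3.sh) || return 1

    # one job that waits for BOTH job2 and job3 (job4.sh is hypothetical)
    jid4=$(submit -t 00:10:00 --dependency=afterok:"$jid2":"$jid3" job4.sh) || return 1

    echo "Submitted jobs: $jid1 $jid2 $jid3 $jid4"
}

# Only submit when Slurm is actually available on this machine.
if command -v sbatch >/dev/null 2>&1; then
    run_pipeline
fi
```

Each submission is checked with "|| return 1", so the script stops at the first failed sbatch call instead of passing an empty job ID on to the next stage.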