Job Submission Limits
Maximum Concurrent Jobs
Each user may have at most 10,000 jobs submitted to the cluster at any one time.
Why This Limit Exists
This limit keeps the cluster's scheduling system responsive. Very large numbers of queued jobs can overwhelm the SLURM scheduler and cause it to hang.
Workarounds
If you need to exceed this threshold:
Manual Batching
Submit jobs in batches of up to 10,000, and wait until enough of them have completed before submitting more, so that your total stays within the limit.
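Before submitting the next batch by hand, it helps to know how much room is left under the limit. The following is a minimal sketch, not an official tool: the function name free_slots is hypothetical, and it assumes that every line printed by squeue for your user counts as one job against the limit.

```shell
#!/bin/bash
# Sketch: how many more jobs fit under the per-user limit right now?
LIMIT=10000   # per-user limit stated in this document

# free_slots CURRENT_COUNT -> prints the remaining capacity (never negative)
free_slots() {
    local current=$1
    local free=$(( LIMIT - current ))
    if [ "$free" -lt 0 ]; then
        free=0
    fi
    echo "$free"
}

# On the cluster, the current count could be taken from squeue, for example:
#   free_slots "$(squeue -u "$USER" -h | wc -l)"
```

If the printed number is smaller than the size of your next batch, wait and check again later.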
Custom Job Management
Implement a wrapper script that monitors your job count and submits new jobs as others complete:
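#!/bin/bash
# Stay below the 10,000-job limit, leaving some headroom.
MAX_JOBS=9000
TOTAL_JOBS=50000

for i in $(seq 1 "$TOTAL_JOBS"); do
    # Wait while too many of your jobs are queued or running
    while [ "$(squeue -u "$USER" -h | wc -l)" -ge "$MAX_JOBS" ]; do
        sleep 60
    done
    # Pass the job index to the job script as its first argument
    sbatch my_job.sh "$i"
done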
Job Arrays
For many similar jobs, use a SLURM job array, which is more efficient than submitting the same number of individual jobs:
#SBATCH --array=1-10000
Then submit another array after the first completes.
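Submitting one array after another can be scripted. The sketch below is one possible approach under stated assumptions, not a site-provided tool: the job script name my_array_job.sh and the function names chunk_plan and submit_all are hypothetical, and it relies on sbatch --wait, which blocks until the submitted job (here, the whole array) has finished. Each array reuses indices starting at 1 and receives an offset as its first argument, so the job script can compute its overall task number as offset + SLURM_ARRAY_TASK_ID.

```shell
#!/bin/bash
# Sketch: cover a large workload with consecutive job arrays of at most 10,000 tasks.
ARRAY_SIZE=10000   # matches --array=1-10000

# chunk_plan TOTAL -> prints one "OFFSET COUNT" line per array needed to cover TOTAL tasks
chunk_plan() {
    local total=$1
    local offset=0
    local count
    while [ "$offset" -lt "$total" ]; do
        count=$(( total - offset ))
        if [ "$count" -gt "$ARRAY_SIZE" ]; then
            count=$ARRAY_SIZE
        fi
        echo "$offset $count"
        offset=$(( offset + ARRAY_SIZE ))
    done
}

# submit_all TOTAL -> submit the arrays one at a time, each after the previous one ends
submit_all() {
    chunk_plan "$1" | while read -r offset count; do
        # --wait makes sbatch return only once every task of this array has terminated.
        # Inside my_array_job.sh (hypothetical): TASK=$(( $1 + SLURM_ARRAY_TASK_ID ))
        sbatch --wait --array=1-"$count" my_array_job.sh "$offset" < /dev/null
    done
}

# Example (not run here): submit_all 50000
```

Because submit_all blocks for the lifetime of every array, it is best started inside a terminal multiplexer or a similar long-lived session.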