Use a job array when you need to run the same program over a number of files. In non-supercomputing environments, you might use a loop or GNU parallel. However, we can make Slurm perform the parallelisation for us with minimal effort. This has been tested on Zeus. The maximum number of jobs that can be created with a single array is 1000. However, the workq partition on Zeus is limited to a maximum of 512 jobs in the queue and 16 jobs running concurrently. Therefore, do not create an array that will spawn more than 512 jobs. With the limit on concurrent jobs, a maximum of 16 will run at a time, with a new one spawning as an older one completes.
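
As a rough sketch of how a submission could respect these limits, the helper below builds an `--array` specification from a file count. The function name `array_spec` and its default cap values are illustrative assumptions, not part of the original script; the `%N` suffix is Slurm's syntax for limiting how many array tasks run at once. The helper only prints the specification, so it can be checked without a scheduler.

```shell
#!/bin/bash
# Hypothetical helper: print an --array specification for a given number of files.
# The defaults (16 running, 512 queued) mirror the workq limits described above.
array_spec() {
    local n_files=$1
    local max_running=${2:-16}
    local max_tasks=${3:-512}

    if [ "$n_files" -lt 1 ]; then
        echo "no input files" >&2
        return 1
    fi
    if [ "$n_files" -gt "$max_tasks" ]; then
        echo "too many files (${n_files}) for one array; limit is ${max_tasks}" >&2
        return 1
    fi
    # Indices are zero-based so they line up with bash array indexing.
    echo "0-$(( n_files - 1 ))%${max_running}"
}

# Example: 40 BAM files -> tasks 0..39, at most 16 running at once
array_spec 40
```

The printed value could then be passed on the command line, for example `sbatch --array="$(array_spec 40)" array_job.sh`.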

...

array_job.sh (bash)
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --cpus-per-task=1
#SBATCH --time=1:00:00
#SBATCH --job-name=MYJOB
#SBATCH --partition=workq 
#SBATCH --account=MYACCOUNT
#SBATCH --output=MYJOB-%j.log
#SBATCH --error=MYJOB-%J.log

#Do not edit the echo sections
echo "All jobs in this array have:"
echo "- SLURM_ARRAY_JOB_ID=${SLURM_ARRAY_JOB_ID}"
echo "- SLURM_ARRAY_TASK_COUNT=${SLURM_ARRAY_TASK_COUNT}"
echo "- SLURM_ARRAY_TASK_MIN=${SLURM_ARRAY_TASK_MIN}"
echo "- SLURM_ARRAY_TASK_MAX=${SLURM_ARRAY_TASK_MAX}"
echo "This job in the array has:"
echo "- SLURM_JOB_ID=${SLURM_JOB_ID}"
echo "- SLURM_ARRAY_TASK_ID=${SLURM_ARRAY_TASK_ID}"


# Alter the following line to suit your files. It collects all files in the current directory that match the glob pattern you provide.
FILES=(*.bam)

# Select this task's file: the array task ID is used as a zero-based index into FILES
FILENAME=${FILES[$SLURM_ARRAY_TASK_ID]}
echo "My input file is ${FILENAME}"

#load modules
module load singularity

#set variables
basedir=$(pwd)
container=/group/$MYGROUP/$USER/expansion-hunter-denovo_v0.8.7.sif
ref=Homo_sapiens_assembly38.fasta

#job script
singularity exec "${container}" /ExpansionHunterDenovo profile \
        --reads "${basedir}/${FILENAME}" \
        --reference "${basedir}/${ref}" \
        --output-prefix "${basedir}/str-profiles/${FILENAME}" \
        --min-anchor-mapq 50 \
        --max-irr-mapq 40
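
To see how the task ID selects a file, the following sketch mimics the indexing step outside of Slurm. The file names and the `pick_file` function are made up for illustration; on the cluster, Slurm sets `SLURM_ARRAY_TASK_ID` for each task, and the range check is an addition that the script above does not perform.

```shell
#!/bin/bash
# Hypothetical stand-in for the directory listing in array_job.sh
FILES=(sampleA.bam sampleB.bam sampleC.bam)

# Print the file for a given task ID, or fail if the ID is out of range
pick_file() {
    local task_id=$1
    if (( task_id < 0 || task_id >= ${#FILES[@]} )); then
        echo "task ID ${task_id} has no matching file" >&2
        return 1
    fi
    echo "${FILES[$task_id]}"
}

# Simulate the three tasks of an array submitted with --array=0-2
for SLURM_ARRAY_TASK_ID in 0 1 2; do
    echo "task ${SLURM_ARRAY_TASK_ID} -> $(pick_file "${SLURM_ARRAY_TASK_ID}")"
done
```

Because bash arrays start at 0, an array submitted as `--array=1-3` for three files would skip the first file and leave the last task without input, which is why the range should run from 0 to the file count minus one.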

...