This tutorial illustrates how to set up a workflow that makes use of both CPU and GPU applications. It leverages a custom Nextflow build to provide support for multiple Slurm clusters, a feature that is currently not supported by the Nextflow development team.
Context
Suppose you have a Nextflow pipeline where some processes make use of CPU applications, while others make use of GPU applications. Suppose also that you are using containers to deploy these applications by means of Singularity. There are a few aspects that need to be addressed to run this type of pipeline at Pawsey:
- CPU applications need to run on Setonix Phase 1, whereas GPU applications need to run on Topaz
- Nextflow needs to request GPU resources from Slurm for GPU-enabled processes
- Nextflow needs to enable GPU usage for Singularity
Details
- To submit tasks to both Setonix and Topaz, Nextflow needs a custom build to enable multi-cluster support. Detailed steps and a template script to build a customised Nextflow in the current context are provided in the Appendix on this page. In this example of customised Nextflow, two new executors are made available, `slurm_setonix` and `slurm_topaz`, which assign jobs to Setonix and Topaz, respectively.
- Nextflow processes that make use of a CPU require these configuration parameters:
  - `executor = 'slurm_setonix'` (valid for this example of customised Nextflow)
  - `queue = 'work'`
- Nextflow processes that make use of a GPU require these configuration parameters:
  - `executor = 'slurm_topaz'` (valid for this example of customised Nextflow)
  - `queue = 'gpuq'`
  - `clusterOptions += " --gpus-per-node=1"`
- Special care is needed in the setup of the Singularity environment variables `SINGULARITY_BINDPATH`, `SINGULARITYENV_LD_LIBRARY_PATH` and `SINGULARITYENV_LD_PRELOAD`, inherited from the Singularity module, because the master pipeline running on Setonix will execute tasks on distinct clusters with distinct installation paths. As a result, it is best NOT to pass the module variables at all, that is, not to include them in the `envWhitelist` parameter in the Nextflow configuration.
- The Singularity configuration in Nextflow requires the following: `runOptions = "-B /scratch --nv"`
  - Here, `--nv` enables the use of GPUs (this is compatible with CPU tasks, too; the only side effect is that a warning will be printed)
  - Here, `-B /scratch` is also added, because `SINGULARITY_BINDPATH` is not whitelisted from the Singularity module (see point above)
Example
```groovy
setonixtopaz {
    process.container = 'marcodelapierre/toy-gpu-nf:latest'

    singularity {
        enabled = true
        runOptions = "-B /scratch --nv"
    }

    params.slurm_account = 'pawsey0001'

    process {
        clusterOptions = "--account=${params.slurm_account}"
        executor = 'slurm_setonix'
        queue = 'work'

        withName: 'proc_gpu' {
            executor = 'slurm_topaz'
            queue = 'gpuq'
            clusterOptions += " --gpus-per-node=1"
        }
    }
}
```
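For reference, here is a minimal sketch of a pipeline that this profile could drive. The process names, commands and outputs are illustrative assumptions; only the name `proc_gpu`, targeted by the `withName:` selector above, is implied by the configuration.

```groovy
// main.nf - illustrative sketch only; process bodies are assumptions.
nextflow.enable.dsl = 2

process proc_cpu {
    // Picks up the default process settings: executor 'slurm_setonix', queue 'work'
    output:
    path 'cpu.log'

    script:
    """
    hostname > cpu.log
    """
}

process proc_gpu {
    // Matched by the withName: 'proc_gpu' selector, hence submitted to Topaz
    // with '--gpus-per-node=1' and run by Singularity with the '--nv' flag
    output:
    path 'gpu.log'

    script:
    """
    nvidia-smi > gpu.log
    """
}

workflow {
    proc_cpu()
    proc_gpu()
}
```

Assuming the profile block above sits inside the `profiles` scope of `nextflow.config`, it would be selected at runtime with `nextflow run main.nf -profile setonixtopaz`.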
Appendix
Details on building Nextflow with multi-cluster support
The idea is to duplicate the existing `slurm` executor, to create additional ones that submit to specific Pawsey supercomputers, Setonix and Topaz in this case. The new executor will have modified `sbatch`, `squeue` and `scancel` commands, using e.g. the `--clusters topaz` flag. In this way, two Slurm executors are available to the Nextflow user, for submitting processes either to the current cluster or to Topaz. The source directory `modules/nextflow/src/main/groovy/nextflow/executor/` contains the relevant source files, in particular:

- `ExecutorFactory.groovy` - has the list of available executors; it is possible to add new ones (a sketch follows this list)
- `SlurmExecutor.groovy` - codes the standard `slurm` executor; e.g. it can be duplicated into a modified `SlurmTopazExecutor.groovy`
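As an indication of the kind of change involved in `ExecutorFactory.groovy`, the factory holds a map from executor names to executor classes; its exact declaration varies across Nextflow versions, and the class name `SlurmSetonixExecutor` below is an assumption for this sketch.

```groovy
// Sketch only: locate the map of built-in executors in ExecutorFactory.groovy
// and mirror the existing 'slurm' entry for the new executor names.
[
        'local'         : LocalExecutor,
        'slurm'         : SlurmExecutor,          // existing entry
        'slurm_setonix' : SlurmSetonixExecutor,   // new: submits to Setonix
        'slurm_topaz'   : SlurmTopazExecutor,     // new: submits to Topaz
        // ... remaining entries unchanged ...
]
```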
Here are the detailed steps:
- Clone the Nextflow GitHub repository, and check out the desired version
- Edit the file `modules/nextflow/src/main/groovy/nextflow/executor/ExecutorFactory.groovy`. Locate the class `ExecutorFactory`, then add additional Slurm executors to the list of executors, each naming a new class, e.g. `SlurmTopazExecutor`
- In the same directory, duplicate the class file `SlurmExecutor.groovy` into a new file, e.g. `SlurmTopazExecutor.groovy`. In the new file, locate the occurrences of `sbatch`, `squeue` and `scancel` (one each), and add the clusters flag with the appropriate Groovy syntax; in this example, `'--clusters', 'topaz'` (see the sketch after this list)
- Compile your customised Nextflow with the sequence `make compile; make pack; make install`
- Retrieve the compiled executable from `nextflow/build/releases/nextflow-*-all`
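As a further illustration, the fragment below sketches how the three Slurm commands inside the duplicated class might be amended. The method names and signatures are assumptions based on a typical Nextflow release and may differ in the version you check out.

```groovy
// SlurmTopazExecutor.groovy - sketch only, inside a copy of SlurmExecutor.groovy
// with the class renamed; exact method signatures vary by Nextflow version.

List<String> getSubmitCommandLine(TaskRun task, Path scriptFile) {
    // route the submission to the Topaz cluster
    ['sbatch', '--clusters', 'topaz', scriptFile.getName()]
}

protected List<String> queueStatusCommand(Object queue) {
    // query job status on Topaz rather than on the local cluster
    ['squeue', '--clusters', 'topaz', '--noheader', '-o', '%i %t', '-t', 'all']
}

protected List<String> getKillCommand() {
    // cancel jobs on Topaz
    ['scancel', '--clusters', 'topaz']
}
```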
Notes on the Build
- The build with Gradle requires file locks, so it won't work on Lustre or NFS; use a dedicated directory under `/tmp` instead
- To comply with Pawsey best practices, edit the `Makefile` so that it stores dependencies under `$MYSOFTWARE` rather than `$HOME`