Nextflow
ICA supports running pipelines defined using Nextflow.
In order to run Nextflow pipelines, the following process-level attributes within the Nextflow definition must be considered.
Nextflow version: 20.10.0 (deprecated *), 22.04.3, 24.10.2 (Experimental)
Executor: Kubernetes
(*) Pipelines that use 20.10.0 will still run after it is deprecated, but you will no longer be able to choose that version when creating new pipelines.
You can select the Nextflow version while building a pipeline as follows:
GUI
Select the Nextflow version at Projects > your_project > flow > pipelines > your_pipeline > Details tab.
API
Select the Nextflow version by setting it in the optional field "pipelineLanguageVersionId".
When not set, a default Nextflow version will be used for the pipeline.
For each compute type, you can choose between the scheduler.illumina.com/lifecycle: standard (default - AWS on-demand) and scheduler.illumina.com/lifecycle: economy (AWS spot instance) tiers.
To specify a compute type for a Nextflow process, use the pod directive within each process. Set the annotation to scheduler.illumina.com/presetSize and the value to the desired compute type. See the ICA documentation for the list of available compute types. The default compute type, when this directive is not specified, is standard-small (2 CPUs and 8 GB of memory).
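As an illustration of the directive described above, a minimal sketch of a process that requests a compute type might look like the following. The process name and script body are placeholders. The commented-out lifecycle line is an assumption about how the tier annotation would be set, inferred from the key and value shown in the text.

```groovy
process example_task {
    // Request a compute type through a pod annotation.
    // 'standard-small' is the default named in the text.
    pod annotation: 'scheduler.illumina.com/presetSize', value: 'standard-small'

    // Assumption: the lifecycle tier is set with a second pod annotation.
    // pod annotation: 'scheduler.illumina.com/lifecycle', value: 'economy'

    script:
    """
    echo "running on the requested compute type"
    """
}
```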
Syntax highlighting is determined by the file type, but you can select alternative syntax highlighting with the drop-down selection list.
The following configuration settings will be ignored if provided as they are overridden by the system:
Often, the compute size for a process needs to be selected dynamically based on user input and other factors. The Kubernetes executor used on ICA does not use the cpu and memory directives, so instead you can set the pod directive dynamically.
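One way to do this, sketched under the assumption that the pipeline exposes a parameter holding the compute type name, is to interpolate that parameter into the annotation value. The parameter name `compute_size` and the process name are hypothetical.

```groovy
// Hypothetical parameter; assumed to hold a valid compute type name.
params.compute_size = 'standard-small'

process example_task {
    // The annotation value is interpolated from the parameter,
    // so the compute type follows the user input.
    pod annotation: 'scheduler.illumina.com/presetSize', value: "${params.compute_size}"

    script:
    """
    echo "compute type: ${params.compute_size}"
    """
}
```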
The compute type can also be specified in the Nextflow configuration file.
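A configuration file that sets the compute type could look roughly like this sketch. It uses the standard Nextflow `process` scope and a `withName` selector. The process name `big_task` and the compute type `standard-medium` are placeholders, not values taken from this page.

```groovy
process {
    // Default compute type for every process.
    pod = [annotation: 'scheduler.illumina.com/presetSize', value: 'standard-small']

    // Override for a single process ('big_task' and 'standard-medium' are placeholders).
    withName: 'big_task' {
        pod = [annotation: 'scheduler.illumina.com/presetSize', value: 'standard-medium']
    }
}
```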
Inputs are specified via the XML input form or the JSON-based input form. The code specified in the XML corresponds to the field of the same name in the params object that is available in the workflow.
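To illustrate the mapping, the sketch below assumes the input form defines a field with the code `input_file` (a hypothetical name). The workflow then reads it as `params.input_file`. The process is only a stand-in for real work.

```groovy
nextflow.enable.dsl = 2

process count_lines {
    input:
    path sample

    output:
    stdout

    script:
    """
    wc -l < ${sample}
    """
}

workflow {
    // 'input_file' is the hypothetical code from the input form.
    samples = Channel.fromPath(params.input_file)
    count_lines(samples)
    count_lines.out.view()
}
```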
Outputs for Nextflow pipelines are uploaded from the out directory in the attached shared filesystem. The publishDir directive can be used to symlink (recommended), copy, or move data to the correct folder. Data is uploaded to the ICA project after the pipeline execution completes.
Use "symlink" instead of "copy" in the publishDir directive. Symlinking creates a link to the original file rather than copying it, so it does not consume additional disk space. This can prevent silent file upload failures caused by disk space limitations.
Use Nextflow 22.04.0 or later and enable the "failOnError" publishDir option. This option ensures that the workflow fails with an error message if there is an issue with publishing files, rather than completing silently without all expected outputs.
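The two recommendations above can be combined in a single directive. The following is a minimal sketch: the process name and output file are placeholders, and the relative path 'out' is an assumption about how the out directory is addressed from the process.

```groovy
process write_report {
    // Publish results into the 'out' directory as symlinks,
    // and abort the run if a file cannot be published.
    publishDir 'out', mode: 'symlink', failOnError: true

    output:
    path 'report.txt'

    script:
    """
    echo "done" > report.txt
    """
}
```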
During execution, the Nextflow pipeline runner determines the environment settings based on values passed via the command line or via a configuration file. When creating a Nextflow pipeline, use the nextflow.config tab in the UI (or the API) to specify the Nextflow configuration file to be used when launching the pipeline.
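As a small illustration, a nextflow.config could supply default parameter values like this. The parameter names are hypothetical. In standard Nextflow, a parameter passed on the command line overrides the value set here.

```groovy
// nextflow.config
params {
    // Hypothetical defaults used when no value is supplied at launch.
    compute_size = 'standard-small'
    input_file   = 'sample.txt'
}
```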