# Nextflow DRAGEN Pipeline

In this tutorial, we will demonstrate how to create and launch a simple DRAGEN pipeline using the Nextflow language in the ICA GUI. More information about Nextflow on ICA can be found [here](https://help.ica.illumina.com/project/p-flow/f-pipelines/pi-nextflow). For this example, we will implement the alignment and variant calling example for Paired-End FASTQ Inputs from this [DRAGEN support page](https://support-docs.illumina.com/SW/DRAGEN_v40/Content/SW/DRAGEN/AligningVariantCallingExamples_fDG_dtREF.htm).

## Linking a DRAGEN bundle

You need a project in which the pipeline will reside. You can choose an existing project or create a new one. See the [Projects](https://help.ica.illumina.com/home/h-projects) page for information on how to create a project. For this tutorial, we will use a project called *Getting Started*.

Once you have selected or created your project, you need to link a DRAGEN bundle to it to give the project access to the DRAGEN docker image. Open your project and navigate to **Projects > your\_project > Project settings > Details > Edit**. From here, select the + symbol next to linked bundles and select a *DRAGEN Demo Tool* bundle to add to the project. For this tutorial, link *DRAGEN Demo Bundle 4.0.3*.

Once the bundle has been linked to your project, you can access the docker image by navigating to the main level and opening **System Settings > Docker Repository**. There, click the docker image *dragen-ica-4.0.3*. At the bottom of the screen, you will see the regions where this image is available.

{% hint style="info" %}
The URL presented here will be used later in the `container` directive for your Nextflow DRAGEN process.
{% endhint %}

## Creating the pipeline

Select **Projects > your\_project > Flow > Pipelines**. From the **Pipelines** view, click **+Create > Nextflow > JSON based** to start creating a Nextflow pipeline.

<figure><img src="/files/K0QJSteKOz3hJpkS7HIY" alt="" width="375"><figcaption></figcaption></figure>

### Details

In the Nextflow pipeline creation view, use the **Details** tab to add information about the pipeline. Add values for the required *Code* (pipeline name) and *Description* fields. *Nextflow Version* and *Storage size* default to preassigned values.

<figure><img src="/files/MNxClnNKVvzpGnKQEmH1" alt=""><figcaption></figcaption></figure>

### Main.nf

Next, add the Nextflow pipeline definition by navigating to **Nextflow files > main files > main.nf**. You will see a text editor. Copy and paste the following definition into the text editor. **Modify the `container` directive by replacing the current URL with the URL found in the docker image** *dragen-ica-4.0.3* **(System Settings > Docker Repository > your\_docker\_image > Regions).**

This pipeline performs the following actions:

1. Accepts one pair of FASTQ files, one compressed reference file and a sample name
2. Schedules an **FPGA‑backed Kubernetes pod on ICA**
3. Unpacks the reference to local scratch
4. Runs DRAGEN with variant calling
5. Uploads **all outputs** to ICA cloud storage

```groovy
nextflow.enable.dsl = 2

process DRAGEN {

    // The container must be a DRAGEN image that is included in an accepted bundle and will determine the DRAGEN version
    container '079623148045.dkr.ecr.us-east-1.amazonaws.com/cp-prod/7ecddc68-f08b-4b43-99b6-aee3cbb34524:latest'
    pod annotation: 'scheduler.illumina.com/presetSize', value: 'fpga2-medium'
    pod annotation: 'volumes.illumina.com/scratchSize', value: '1TiB'

    // ICA will upload everything in the "out" folder to cloud storage 
    publishDir 'out', mode: 'symlink'

    input:
        tuple path(read1), path(read2)
        val sample_id
        path ref_tar

    output:
        stdout emit: result
        path '*', emit: output

    script:
        """
        set -ex
        mkdir -p /scratch/reference
        tar -C /scratch/reference -xf ${ref_tar}
        
        /opt/edico/bin/dragen --partial-reconfig HMM --ignore-version-check true
        /opt/edico/bin/dragen --lic-instance-id-location /opt/instance-identity \\
            --output-directory ./ \\
            -1 ${read1} \\
            -2 ${read2} \\
            --intermediate-results-dir /scratch \\
            --output-file-prefix ${sample_id} \\
            --RGID ${sample_id} \\
            --RGSM ${sample_id} \\
            --ref-dir /scratch/reference \\
            --enable-variant-caller true
        """
}

workflow {
    DRAGEN(
        Channel.of([file(params.read1), file(params.read2)]),
        Channel.of(params.sample_id),
        Channel.fromPath(params.ref_tar)
    )
}
```

Refer to the [ICA help page](https://help.ica.illumina.com/project/p-flow/f-pipelines/pi-nextflow) for details on ICA specific attributes within the Nextflow definition.

* To specify a compute type for a Nextflow process, use the [pod](https://www.nextflow.io/docs/latest/process.html#process-pod) directive within each process.
* Outputs for Nextflow pipelines are uploaded from the `out` folder in the attached shared filesystem. The [publishDir](https://www.nextflow.io/docs/latest/process.html#publishdir) directive specifies the output folder for a given process. Only data moved to the out folder using the `publishDir` directive will be uploaded to the ICA project after the pipeline finishes executing.
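
The two points above can be illustrated with a much smaller process than the DRAGEN one. The following is a minimal sketch, not part of this tutorial's pipeline: the process name `SAY_HELLO`, the file name `hello.txt` and the preset size `standard-small` are assumptions chosen for illustration, so check the ICA help page for the compute types available to you. It shows a `pod` annotation selecting a compute type and a `publishDir` directive placing the process output in the `out` folder.

```groovy
nextflow.enable.dsl = 2

// Hypothetical process, used only to illustrate the two directives
process SAY_HELLO {

    // Assumption: 'standard-small' is a valid preset size in your ICA environment
    pod annotation: 'scheduler.illumina.com/presetSize', value: 'standard-small'

    // Files declared as outputs are published to the "out" folder,
    // which is the folder ICA uploads from
    publishDir 'out', mode: 'symlink'

    // On ICA a `container` directive would normally be set here as well;
    // it is omitted because no image URL is assumed in this sketch

    output:
        path 'hello.txt'

    script:
        """
        echo "hello" > hello.txt
        """
}

workflow {
    SAY_HELLO()
}
```

A file that the process writes but does not declare in its `output:` block is not published, and so would not be expected in the uploaded results.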

### Input Form

Next, we create the input form used for the pipeline. This is done on the **Inputform files** tab. More information on the specifications for the input form can be found on the [Input Form](/connected-analytics/project/p-flow/f-pipelines/pi-inputform.md) page.

This pipeline takes two FASTQ files, one *reference file* and one *sample\_id* parameter as input.

Paste the following JSON input form into the **inputForm.json** text editor.

```json
{
  "fields": [
    {
      "id": "read1",
      "label": "FASTQ read 1",
      "type": "data",
      "dataFilter": {
        "dataType": "file",
        "dataFormat": ["FASTQ"]
      },
      "maxValues": 1,
      "minValues": 1
    },
    {
      "id": "read2",
      "label": "FASTQ read 2",
      "type": "data",
      "dataFilter": {
        "dataType": "file",
        "dataFormat": ["FASTQ"]
      },
      "maxValues": 1,
      "minValues": 1
    },
    {
      "id": "ref_tar",
      "label": "Reference TAR",
      "type": "data",
      "dataFilter": {
        "dataType": "file",
        "dataFormat": ["TAR"]
      },
      "maxValues": 1,
      "minValues": 1
    },
    {
      "id": "sample_id",
      "type": "textbox",
      "label": "Sample ID"
    }
  ]
}
```

Click the **Simulate** button (bottom left) to preview the launch form fields.

<figure><img src="/files/h5BC0hj1Z0Azvys9TY9w" alt=""><figcaption></figcaption></figure>

Click the **Save** button (top right) to save the changes.
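
The field `id` values in the input form (`read1`, `read2`, `ref_tar`, `sample_id`) are the same names that the `workflow` block in main.nf reads from `params`. Because the *Sample ID* textbox does not set `minValues`, it may be worth checking for blank values before DRAGEN starts. The snippet below is a minimal sketch in plain Groovy; `missingParams` is a hypothetical helper that is not part of the tutorial's main.nf, and the example file names are made up.

```groovy
// Hypothetical helper: returns the names of required parameters
// that are absent or blank in the given map.
def missingParams(Map p, List<String> required) {
    required.findAll { name ->
        p[name] == null || p[name].toString().trim().isEmpty()
    }
}

// Field ids taken from the input form in this tutorial
def required = ['read1', 'read2', 'ref_tar', 'sample_id']

// A plain map stands in for Nextflow's `params`; the values are illustrative only
def example = [
    read1    : 'sample_R1.fastq.gz',
    read2    : 'sample_R2.fastq.gz',
    ref_tar  : 'reference.tar',
    sample_id: ' '
]

// The blank sample_id is reported as missing
assert missingParams(example, required) == ['sample_id']
```

In a Nextflow script, the same function could presumably be called as `missingParams(params, required)` at the start of the `workflow` block, stopping the run with Nextflow's `error` function when the returned list is not empty.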

## Running the pipeline

{% hint style="info" %}
If you have no test data available, you need to link the *DRAGEN Demo Bundle* to your project at **Projects > your\_project > Project Settings > Details > Linked Bundles**.
{% endhint %}

Go to **Projects > your\_project > Flow > Pipelines > your\_pipeline** and click **Start Analysis**.

Fill in the required fields and click the **Start Analysis** button.

<figure><img src="/files/CNRoqzODQoZwz9wcZzyJ" alt=""><figcaption></figcaption></figure>

### Results

You can monitor the run from the **Projects > your\_project > Flow > Analyses** page. Once the status changes to *Succeeded*, you can click the run to access the results.

## Useful Links

* [Illumina DRAGEN Documentation](https://support-docs.illumina.com/SW/DRAGEN_v40/Content/SW/DRAGEN/GettingStarted_fDG.htm)
* [Nextflow's official documentation](https://www.nextflow.io/)


