# 5 Base DNA Germline WGS UMI

A DRAGEN recipe, like this one, is a predefined set of analysis parameters and workflow settings tailored to a specific type of genomic analysis. For clarity, some default parameters are explicitly included and annotated with comments.

```
  
/opt/dragen/$VERSION/bin/dragen         #DRAGEN install path 
--ref-dir $REF_DIR                      #path to DRAGEN pangenome hashtable 
--output-directory $OUTPUT 
--intermediate-results-dir $PATH        #e.g. SSD /staging 
--output-file-prefix $PREFIX 
# Inputs 
--fastq-list $PATH                      #see 'Input Options' for FQ, BAM or CRAM 
--fastq-list-sample-id $STRING 
# Mapper 
--enable-map-align true                 #optional with BAM/CRAM input 
--enable-map-align-output true          #optionally save the output BAM 
--enable-sort true                      #default=true 
# UMI 
--umi-enable true 
--umi-min-supporting-reads 1            #default=2 
# 5-Base 
--methylation-conversion illumina 
--methylation-generate-cytosine-report true 
--methylation-compress-cx-report true 
# Small variant caller 
--enable-variant-caller true 
# Annotation 
--variant-annotation-data $PATH 
--enable-variant-annotation true 
# SV 
--enable-sv true 
# CNV 
--enable-cnv true 
--cnv-enable-self-normalization true 
```
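
The recipe above is an annotated template rather than a paste-ready command: it has no line continuations, and inline comments cannot follow a trailing backslash. One way to keep the per-group comments in a real script is to build the argument list step by step, as in the minimal sketch below. All paths and IDs are hypothetical placeholders, and the variable names (`DRAGEN_BIN`, `STAGING_DIR`, `ANNOTATION_DIR`, and so on) are illustrative choices, not DRAGEN conventions. The sketch deliberately avoids assigning to `PATH`, because that would override the shell's own command search path.

```shell
#!/bin/sh
# Sketch only: every value below is a hypothetical placeholder.
DRAGEN_BIN="/opt/dragen/VERSION/bin/dragen"   # replace VERSION with the installed release
REF_DIR="/path/to/pangenome_hashtable"
OUTPUT_DIR="/path/to/output"
STAGING_DIR="/staging/tmp"                    # e.g. a fast local SSD
PREFIX="sample1"
FASTQ_LIST="/path/to/fastq_list.csv"
SAMPLE_ID="sample1"
ANNOTATION_DIR="/path/to/nirvana_data"

# Build the argument list group by group so each group can keep its comment.
set -- "$DRAGEN_BIN"
set -- "$@" --ref-dir "$REF_DIR" --output-directory "$OUTPUT_DIR"
set -- "$@" --intermediate-results-dir "$STAGING_DIR" --output-file-prefix "$PREFIX"
# Inputs
set -- "$@" --fastq-list "$FASTQ_LIST" --fastq-list-sample-id "$SAMPLE_ID"
# Mapper
set -- "$@" --enable-map-align true --enable-map-align-output true --enable-sort true
# UMI
set -- "$@" --umi-enable true --umi-min-supporting-reads 1
# 5-Base
set -- "$@" --methylation-conversion illumina
set -- "$@" --methylation-generate-cytosine-report true --methylation-compress-cx-report true
# Small variant caller and annotation
set -- "$@" --enable-variant-caller true
set -- "$@" --variant-annotation-data "$ANNOTATION_DIR" --enable-variant-annotation true
# SV and CNV
set -- "$@" --enable-sv true
set -- "$@" --enable-cnv true --cnv-enable-self-normalization true

# Dry run: print the assembled command instead of launching it.
DRAGEN_CMD="$*"
echo "$DRAGEN_CMD"
# To launch for real, run: "$@"
```

The printed string is for inspection only. Launching with `"$@"` preserves each argument exactly, including paths that contain spaces.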

## Notes and additional options

### Hashtable

The pangenome hashtable is recommended for DRAGEN germline runs.

See: [Product Files](https://support.illumina.com/sequencing/sequencing_software/dragen-bio-it-platform/product_files.html)

### Input options

DRAGEN accepts the following input sources: FASTQ list, FASTQ, BAM, or CRAM. For BCL input, first create FASTQ files using [BCL conversion](https://help.connected.illumina.com/dragen/dragen-v4.4/product-guide/dragen-v4.4/bcl-conversion).

FASTQ List Input

```
--fastq-list $PATH 
--fastq-list-sample-id $STRING 
```
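
As an illustration of what the FASTQ list might contain, the sketch below writes a two-lane CSV and lists the sample IDs found in it. The column header follows the DRAGEN FASTQ list layout (`RGID`, `RGSM`, `RGLB`, `Lane`, `Read1File`, `Read2File`); the read group names, sample name, and file paths are hypothetical. The value passed to `--fastq-list-sample-id` is expected to match an `RGSM` entry.

```shell
#!/bin/sh
# Hypothetical sample, library, and file names; replace with your own.
FASTQ_LIST="fastq_list.csv"

cat > "$FASTQ_LIST" <<'EOF'
RGID,RGSM,RGLB,Lane,Read1File,Read2File
flowcellA.1,sample1,lib1,1,/data/sample1_S1_L001_R1_001.fastq.gz,/data/sample1_S1_L001_R2_001.fastq.gz
flowcellA.2,sample1,lib1,2,/data/sample1_S1_L002_R1_001.fastq.gz,/data/sample1_S1_L002_R2_001.fastq.gz
EOF

# List the distinct RGSM values (column 2), skipping the header row.
# One of these is what --fastq-list-sample-id should be set to.
SAMPLE_IDS=$(tail -n +2 "$FASTQ_LIST" | cut -d, -f2 | sort -u)
echo "$SAMPLE_IDS"
```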

FASTQ Input

```
--fastq-file1 $PATH 
--fastq-file2 $PATH 
--RGSM $STRING 
--RGID $STRING 
```

BAM Input

```
--bam-input $PATH 
```

CRAM Input

```
--cram-input $PATH 
```
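
When a wrapper script has to handle more than one input type, the primary input option can be chosen from the file name. The helper below is a hypothetical sketch, not part of DRAGEN; the mapping from file extension to option (for example, treating any `.csv` as a FASTQ list) is an assumption made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the primary DRAGEN input option for a file,
# chosen by file extension. The extension mapping is an assumption.
dragen_input_args() {
  case "$1" in
    *.csv)
      # FASTQ list; also needs --fastq-list-sample-id
      printf '%s\n' "--fastq-list $1" ;;
    *.fastq.gz|*.fq.gz|*.fastq|*.fq)
      # Single FASTQ; also needs --fastq-file2 (paired-end), --RGSM, --RGID
      printf '%s\n' "--fastq-file1 $1" ;;
    *.bam)
      printf '%s\n' "--bam-input $1" ;;
    *.cram)
      printf '%s\n' "--cram-input $1" ;;
    *)
      echo "unsupported input: $1" >&2
      return 1 ;;
  esac
}

dragen_input_args sample1.cram
```

The function only emits the primary option. The companion options noted in the comments still have to be supplied by the caller.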

### Mapping and Aligning

| Option                           | Description                                                                                          |
| -------------------------------- | ---------------------------------------------------------------------------------------------------- |
| `--enable-map-align true`        | Enable or disable mapping and aligning (default=true).                                               |
| `--enable-map-align-output true` | Optionally save the output BAM (default=false).                                                      |
| `--Aligner.clip-pe-overhang 2`   | Clean up any unwanted UMI indexes. Only use when reads contain UMIs, but UMI collapsing was not run. |

### UMI

| Option                             | Description                                                                                                                                                                                                                                                                                                                      |
| ---------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `--umi-nonrandom-whitelist $PATH`  | If UMI is nonrandom, either a whitelist or correction table is required. The whitelist includes a valid UMI sequence per line.                                                                                                                                                                                                   |
| `--umi-correction-table $PATH`     | If UMI is nonrandom, either a whitelist or correction table is required. The correction table defaults to the table used by TruSight Oncology: \<INSTALL\_PATH>/resources/umi/umi\_correction\_table.txt.gz.                                                                                                                     |
| `--umi-min-supporting-reads INT`   | Specify the number of matching UMI input reads required to generate a consensus read. Any family with insufficient supporting reads is discarded. The default is 2, but most pipelines perform better with a value of 1. A value of 2 may be appropriate for samples with ultra-deep coverage (e.g. ctDNA). |
| `--umi-metrics-interval-file $BED` | Target region in BED format.                                                                                                                                                                                                                                                                                                     |
| `--umi-emit-multiplicity both`     | Set the consensus sequence type to output. DRAGEN UMI allows collapsing duplex sequences from the two strands of the original molecules. For more information, see [Merge Duplex UMIs](https://help.connected.illumina.com/dragen/dragen-v4.4/product-guide/dragen-dna-pipeline/unique-molecular-identifiers#merge-duplex-umis). |
| `--umi-start-mask-length INT`      | Number of additional bases to ignore from the start of the read. The default is 0. To reduce false positives, optionally set to 1. |
| `--umi-end-mask-length INT`        | Number of additional bases to ignore from the end of the read. The default is 0. To reduce false positives, optionally set to 3. |

For more information see: [UMI Options](https://help.connected.illumina.com/dragen/dragen-v4.4/product-guide/dragen-dna-pipeline/unique-molecular-identifiers#umi-options).

### 5-Base Methylation

| Option                                        | Description                                                                                                                                                                                                                       |
| --------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `--methylation-conversion STRING`             | Library conversion for methylation analysis. Options: `none`, `c_t`, `mc_t`, `illumina` (default=none).                                                                                                                           |
| `--methylation-protocol STRING`               | Library protocol for methylation analysis. Options: `none`, `directional`, `non-directional`, `directional-complement`, `pbat`. When `--methylation-conversion illumina` is set, the default is `directional`; otherwise it is `none`. |
| `--methylation-mapq-threshold INT`            | Only reads with MAPQ greater than or equal to the threshold are included in methyl-seq analysis (default=0). |
| `--methylation-generate-mbias-report true`    | Whether to generate a per-sequencer-cycle methylation bias report (default=true).                                                                                                                                                 |
| `--mbias-report-include-overlaps`             | Calculate methylation stats for overlapping bases between mates (default=false).                                                                                                                                                  |
| `--methylation-generate-cytosine-report true` | Whether to generate a genome-wide cytosine methylation CX\_report file (default=false).                                                                                                                                           |
| `--methylation-compress-cx-report true`       | Set to true to enable compression of the CX\_report (default=true).                                                                                                                                                               |
| `--methylation-keep-ref-cytosine true`        | Set to true to keep all reference cytosines in the CX\_report file, even if they don't appear in the input reads (default=false).                                                                                                 |
| `--enable-cpg-methylated-mapping true`        | Enable methylated mapping with base conversions restricted to CpG context (default=true). When false, runs DRAGEN Methylation 3-base map/align instead.                                                                           |
| `--methylation-report-to-vcf`                 | Methylation type (`none`, `cg`, or `c`) to report in VCF files (default=c). |
| `--methylation-report-to-gvcf`                | Methylation type (`none`, `cg`, or `c`) to report in gVCF files (default=cg). |

For more information see: [5-Base Pipeline](https://help.connected.illumina.com/dragen/dragen-v4.4/product-guide/dragen-v4.4/dragen-methylation-pipeline/dragen-5base-pipeline).

### SNV

| Option                                      | Description                                                                                                                                  |
| ------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |
| `--vc-target-bed`                           | Limit variant calling to a region of interest (BED file).                                                                                    |
| `--vc-combine-phased-variants-distance INT` | Maximum distance in base pairs (bp) over which phased variants are combined. Set to 0 to disable. Valid range is \[0; 15] bp (default=2).     |
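
Options such as `--vc-target-bed` take a BED file. As a minimal sketch, the snippet below writes a small BED file and runs a basic sanity check on it before it is handed to DRAGEN. The file name and intervals are hypothetical, and the contig names (`chr1`, `chr2`) are assumed to match the naming used by the reference hashtable. BED coordinates are 0-based with an exclusive end position.

```shell
#!/bin/sh
# Hypothetical target regions; contig names must match the reference in use.
TARGET_BED="targets.bed"
printf 'chr1\t100000\t200000\n' >  "$TARGET_BED"
printf 'chr2\t500000\t750000\n' >> "$TARGET_BED"

# Count lines that are not valid BED intervals:
# fewer than 3 tab-separated fields, non-numeric coordinates, or start >= end.
BAD_LINES=$(awk -F'\t' '
  NF < 3 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ || $2 + 0 >= $3 + 0 { n++ }
  END { print n + 0 }
' "$TARGET_BED")

echo "invalid BED lines: $BAD_LINES"
```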

For more detail on the small variant caller in somatic mode, please refer to [Somatic Mode](https://help.connected.illumina.com/dragen/dragen-v4.4/product-guide/dragen-v4.4/dragen-dna-pipeline/small-variant-calling/somatic-mode).

### Annotation

For instructions on how to download the Nirvana annotation database, please refer to [Nirvana](https://help.connected.illumina.com/dragen/dragen-v4.4/product-guide/dragen-v4.4/nirvana).

### CNV

| Option                                | Description                                                                                                                                               |
| ------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `--cnv-enable-gcbias-correction true` | Enable or disable GC bias correction when generating target counts.                                                                                       |
| `--cnv-segmentation-mode $SEG_MODE`   | Option to override the default segmentation algorithm. Defaults include `slm` for germline WGS, `aslm` for somatic WGS, and `hslm` for targeted analysis. |

For more information, see [CNV Calling](https://help.connected.illumina.com/dragen/dragen-v4.4/product-guide/dragen-v4.4/dragen-dna-pipeline/cnv-calling).
