DRAGEN Recipes

Overview

The following sub-pages contain recommended command line options for specific DRAGEN pipelines. For an overview of DRAGEN command line parsing, also see

Germline WGS

DRAGEN Recipe - Germline WGS

Overview

This recipe is for processing whole genome sequencing data for germline workflows.

Example Command Line

For most scenarios, simply creating the union of the command line options from the single caller scenarios will work.

Configure the INPUT options
Configure the OUTPUT options
Configure MAP/ALIGN depending on if realignment is desired or not
Configure the VARIANT CALLERs based on the application
Configure any additional options
Build up the necessary options for each component separately, so that they can be re-used in the final command line.

We highly recommend using a pangenome reference for human samples (excluding RNA). For more details, refer to Dragen Reference Support.

The following are partial templates that can be used as starting points. Adjust them accordingly for your specific use case.

#!/bin/bash
set -euo pipefail

# Path to DRAGEN hashtable
DRAGEN_HASH_TABLE=<REF_DIR>  # pangenome reference for human samples

# Path to output directory for the DRAGEN run
OUTPUT=<OUT_DIR>

# File prefix for DRAGEN output files
PREFIX=<OUT_PREFIX>

# Define the input sources, select fastq list, fastq, bam, or cram.
INPUT_FASTQ_LIST="
  --fastq-list $FASTQ_LIST \
  --fastq-list-sample-id $FASTQ_LIST_SAMPLE_ID \
"

INPUT_FASTQ="
  --fastq-file1 $FASTQ1 \
  --fastq-file2 $FASTQ2 \
  --RGSM $RGSM \
  --RGID $RGID \
"

INPUT_BAM="
  --bam-input $BAM \
"

INPUT_CRAM="
  --cram-input $CRAM \
"

# Select input source, here in this example we use INPUT_FASTQ_LIST
INPUT_OPTIONS="
  --ref-dir $DRAGEN_HASH_TABLE \
  $INPUT_FASTQ_LIST \
"

OUTPUT_OPTIONS="
  --output-directory $OUTPUT \
  --output-file-prefix $PREFIX \
"

MA_OPTIONS="
  --enable-map-align true \
  --enable-sort true \
  --enable-duplicate-marking true \
"

CNV_OPTIONS="
  --enable-cnv true \
  --cnv-enable-self-normalization true \
"

SNV_OPTIONS="
  --enable-variant-caller true \
"

SV_OPTIONS="
  --enable-sv true \
"

TARGETED_OPTIONS="
  --enable-targeted true \
"

PGX_OPTIONS="
  --enable-pgx true \
"

STR_OPTIONS="
  --repeat-genotype-enable true \
"

# Automatic merging of VNTR calls into SV VCF disabled with the second option
# See the VNTR calling page for more details
VNTR_OPTIONS="
  --enable-vntr true \
  --sv-vntr-merge false \
"

HLA_OPTIONS="
  --enable-hla=true \
  --hla-enable-class-2=true \
"

# Construct final command line
CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $MA_OPTIONS \
  $CNV_OPTIONS \
  $SNV_OPTIONS \
  $SV_OPTIONS \
  $TARGETED_OPTIONS \
  $PGX_OPTIONS \
  $STR_OPTIONS \
  $VNTR_OPTIONS \
  $HLA_OPTIONS \
"

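# Execute. $CMD is left unquoted on purpose so that word splitting turns the
# string into separate arguments; paths containing spaces are not supported.
echo $CMD
$CMD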
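The string-based template above relies on unquoted word splitting, so any value containing spaces breaks it. As a hedged alternative, the sketch below assembles a subset of the same options in Bash arrays, which keep each argument intact. The paths and sample ID are hypothetical placeholders, and the execution line is commented out so the script only prints the command.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical placeholder values; replace with real paths.
DRAGEN_HASH_TABLE="/path/to/ref dir"
OUTPUT="/path/to/out"
PREFIX="sample1"
FASTQ_LIST="/path/to/fastq_list.csv"
FASTQ_LIST_SAMPLE_ID="sample1"

# One array per component, mirroring the string variables in the template.
INPUT_OPTIONS=(
  --ref-dir "$DRAGEN_HASH_TABLE"
  --fastq-list "$FASTQ_LIST"
  --fastq-list-sample-id "$FASTQ_LIST_SAMPLE_ID"
)
OUTPUT_OPTIONS=(
  --output-directory "$OUTPUT"
  --output-file-prefix "$PREFIX"
)
MA_OPTIONS=(
  --enable-map-align true
  --enable-sort true
  --enable-duplicate-marking true
)
SNV_OPTIONS=(--enable-variant-caller true)

# Construct the final command line as a single array.
CMD=(dragen "${INPUT_OPTIONS[@]}" "${OUTPUT_OPTIONS[@]}" "${MA_OPTIONS[@]}" "${SNV_OPTIONS[@]}")

# Print the command with shell quoting so it can be reviewed or copied.
printf '%q ' "${CMD[@]}"; echo

# Uncomment to run on a DRAGEN server.
# "${CMD[@]}"
```

Further components (CNV, SV, and so on) can be added as extra arrays in the same way.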

Additional Notes and Options

Optional settings per component are listed below. Full option list at this page.

SNV

Note that we do not recommend changing the default QUAL thresholds of 3 for DRAGEN-ML and 10 for DRAGEN without ML. These values differ from each other because DRAGEN-ML improves the calibration of QUAL scores, leading to a change in the scoring range (see QUAL, QD, and GQ Formulation).

HLA

Germline WES

DRAGEN Recipe - Germline WES

Overview

This recipe is for processing whole exome sequencing data for germline workflows.

Example Command Line

For most scenarios, simply creating the union of the command line options from the single caller scenarios will work.

Configure the INPUT options
Configure the OUTPUT options
Configure MAP/ALIGN depending on if realignment is desired or not
Configure the VARIANT CALLERs based on the application
Configure any additional options
Build up the necessary options for each component separately, so that they can be re-used in the final command line.

We highly recommend using a pangenome reference for human samples (excluding RNA). For more details, refer to Dragen Reference Support.

The following are partial templates that can be used as starting points. Adjust them accordingly for your specific use case.

#!/bin/bash
set -euo pipefail

# Path to DRAGEN hashtable
DRAGEN_HASH_TABLE=<REF_DIR> # pangenome reference for human samples

# Path to output directory for the DRAGEN run
OUTPUT=<OUT_DIR>

# File prefix for DRAGEN output files
PREFIX=<OUT_PREFIX>

# Define the input sources, select fastq list, fastq, bam, or cram.
INPUT_FASTQ_LIST="
  --fastq-list $FASTQ_LIST \
  --fastq-list-sample-id $FASTQ_LIST_SAMPLE_ID \
"

INPUT_FASTQ="
  --fastq-file1 $FASTQ1 \
  --fastq-file2 $FASTQ2 \
  --RGSM $RGSM \
  --RGID $RGID \
"

INPUT_BAM="
  --bam-input $BAM \
"

INPUT_CRAM="
  --cram-input $CRAM \
"

# Select input source, here in this example we use INPUT_FASTQ_LIST
INPUT_OPTIONS="
  --ref-dir $DRAGEN_HASH_TABLE \
  $INPUT_FASTQ_LIST \
"

OUTPUT_OPTIONS="
  --output-directory $OUTPUT \
  --output-file-prefix $PREFIX \
"

MA_OPTIONS="
  --enable-map-align true \
  --enable-sort true \
  --enable-duplicate-marking true \
"

CNV_OPTIONS="
  --enable-cnv true \
  --cnv-target-bed $CNV_TARGET_BED \
  --cnv-combined-counts $CNV_PANEL_OF_NORMALS \
"

SNV_OPTIONS="
  --enable-variant-caller true \
  --vc-target-bed $VC_TARGET_BED \
"

SV_OPTIONS="
  --enable-sv true \
  --sv-exome true \
  --sv-call-regions-bed $SV_TARGET_BED \
"

# Enable --hla-enable-class-2 only if the panel has sufficient coverage for
# class II HLA typing
HLA_OPTIONS="
  --enable-hla=true \
  --hla-enable-class-2=true \
"

# Construct final command line
CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $MA_OPTIONS \
  $CNV_OPTIONS \
  $SNV_OPTIONS \
  $SV_OPTIONS \
  $HLA_OPTIONS \
"

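# Execute. $CMD is left unquoted on purpose so that word splitting turns the
# string into separate arguments; paths containing spaces are not supported.
echo $CMD
$CMD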
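Because the WES template depends on several external files (the target BEDs and the CNV panel of normals), a pre-flight check can catch a missing path before a long run starts. The following is a minimal sketch; the `require_files` helper is hypothetical and not part of DRAGEN.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: report every argument that is not a readable regular
# file and return non-zero if any was missing.
require_files() {
  local missing=0 path
  for path in "$@"; do
    if [ ! -f "$path" ] || [ ! -r "$path" ]; then
      echo "Missing or unreadable input: $path" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage, once the variables from the template are defined:
# require_files "$CNV_TARGET_BED" "$CNV_PANEL_OF_NORMALS" "$VC_TARGET_BED" "$SV_TARGET_BED"
```

With `set -e` active, a failed check stops the script before the DRAGEN command is launched.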

Additional Notes and Options

Optional settings per component are listed below. Full option list at this page.

CNV

Please include the matched normal sample in the CNV panel of normals.

Generating Panel of Normals (PON)

WES CNV requires PON files. Follow the two steps below to generate CNV PON:

Target counts generation (per normal sample): Generate target counts for each individual normal sample to serve as the baseline. Any options used for panel of normals generation (BED file, GC bias correction, etc.) should match those used when processing the case sample.

CNV_PON_OPTIONS="
  --enable-cnv true \
  --cnv-target-bed $CNV_TARGET_BED \
"

CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $CNV_PON_OPTIONS \
"

Combined counts generation: The individual PON counts can be merged into a single <prefix>.combined.counts.txt.gz file.

CNV_COMBINED_COUNTS_OPTIONS="
  --enable-cnv true \
  --cnv-generate-combined-counts true \
  --cnv-normals-list $CNV_NORMALS_LIST \
"

CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $CNV_COMBINED_COUNTS_OPTIONS \
"

$CNV_NORMALS_LIST is a single text file containing the path to each target counts file generated in step 1 (either .target.counts.gz or .target.counts.gc-corrected.gz). The output is a PON file with the suffix .combined.counts.txt.gz. Use this PON file with the --cnv-combined-counts option when running DRAGEN CNV on case samples.
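As a hedged illustration, the normals list could be assembled with `find`. The helper name, the directory layout, and the one-path-per-line format are assumptions of this sketch, not statements from the recipe.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: write the absolute path of every file under directory $1
# whose name ends in suffix $2 to list file $3, one path per line (assumed format).
make_normals_list() {
  local counts_dir=$1 suffix=$2 list_file=$3
  local abs_dir
  abs_dir=$(cd "$counts_dir" && pwd)
  find "$abs_dir" -type f -name "*${suffix}" | sort > "$list_file"
  if [ ! -s "$list_file" ]; then
    echo "No files ending in ${suffix} found under ${counts_dir}" >&2
    return 1
  fi
}

# Example usage (PON_DIR is a hypothetical directory holding the step 1 outputs):
# make_normals_list "$PON_DIR" .target.counts.gz "$CNV_NORMALS_LIST"
```

Passing the suffix explicitly keeps the list to one kind of counts file, either raw or GC-corrected.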

For more information, see Panel of Normals.

SNV

HLA

Somatic Tumor Normal with UMI

DRAGEN Recipe - Somatic UMI Tumor Normal

Overview

This recipe is for processing sequencing data with unique molecular identifier (UMI) for somatic tumor normal workflows.

Example Command Line

For Somatic UMI Tumor Normal inputs, the tumor and normal samples must be run separately through the Map/Align stage. Variant Calling is then started from the tumor and normal UMI-collapsed BAMs.

For Map/Align stage:

Configure the INPUT options
Configure the OUTPUT options
Configure MAP/ALIGN
Configure UMI options

For Variant Calling stage:

Configure the INPUT options
Configure the OUTPUT options
Configure the VARIANT CALLERs based on the application
Configure any additional options
Build up the necessary options for each component separately, so that they can be re-used in the final command line.

We recommend using a linear (non-pangenome) reference for somatic analysis. For more details, refer to Dragen Reference Support.

The following are partial templates that can be used as starting points. Adjust them accordingly for your specific use case.

Map/Align stage

#!/bin/bash
set -euo pipefail

# Path to DRAGEN hashtable
DRAGEN_HASH_TABLE=<REF_DIR>

# Path to output directory for the DRAGEN run
OUTPUT=<OUT_DIR>

# File prefix for DRAGEN output files
PREFIX=<OUT_PREFIX>

# Define the input sources, select fastq list, fastq, bam, or cram. Please select either tumor or normal input with UMI to generate collapsed BAM. In this example, we use tumor input option.
INPUT_FASTQ_LIST="
  --tumor-fastq-list $TUMOR_FASTQ_LIST \
  --tumor-fastq-list-sample-id $TUMOR_FASTQ_LIST_SAMPLE_ID \
"

INPUT_FASTQ="
  --tumor-fastq1 $TUMOR_FASTQ1 \
  --tumor-fastq2 $TUMOR_FASTQ2 \
  --RGSM-tumor $RGSM_TUMOR \
  --RGID-tumor $RGID_TUMOR \
"

INPUT_BAM="
  --tumor-bam-input $TUMOR_BAM \
"

INPUT_CRAM="
  --tumor-cram-input $TUMOR_CRAM \
"

# Select input source, here in this example we use INPUT_FASTQ_LIST
INPUT_OPTIONS="
  --ref-dir $DRAGEN_HASH_TABLE \
  $INPUT_FASTQ_LIST \
"

OUTPUT_OPTIONS="
  --output-directory $OUTPUT \
  --output-file-prefix $PREFIX \
"

MA_OPTIONS="
  --enable-map-align true \
  --enable-sort true \
"

UMI_OPTIONS="
  --enable-umi true \
  --umi-source $UMI_SOURCE \
  --umi-library-type $UMI_LIBRARY_TYPE \
"

# Construct final command line
CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $MA_OPTIONS \
  $UMI_OPTIONS \
"

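# Execute. $CMD is left unquoted on purpose so that word splitting turns the
# string into separate arguments; paths containing spaces are not supported.
echo $CMD
$CMD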

Variant Calling (and optional biomarkers) stage:

#!/bin/bash
set -euo pipefail

# Path to DRAGEN hashtable
DRAGEN_HASH_TABLE=<REF_DIR>

# Path to output directory for the DRAGEN run
OUTPUT=<OUT_DIR>

# File prefix for DRAGEN output files
PREFIX=<OUT_PREFIX>

# Path to VC systematic noise BED file. In tumor-normal variant calling, this filter
# is recommended for removing systematic noise observed in normal samples. Prebuilt
# systematic noise files are available for download on the DRAGEN Software
# Support Site page:
# https://support.illumina.com/sequencing/sequencing_software/dragen-bio-it-platform/product_files.html
# Alternatively, running the somatic TO pipeline on normal samples can generate a
# systematic noise file. We recommend using a systematic noise file based on normal
# samples that match the library prep of the tumor samples.
VC_SYSTEMATIC_NOISE_FILE=<VC_SYSTEMATIC_NOISE_BED_FILE_PATH>

INPUT_BAM="
  --tumor-bam-input $TUMOR_BAM \
  --bam-input $BAM \
"

INPUT_OPTIONS="
  --ref-dir $DRAGEN_HASH_TABLE \
  $INPUT_BAM \
"

OUTPUT_OPTIONS="
  --output-directory $OUTPUT \
  --output-file-prefix $PREFIX \
"

# Set exactly one of --vc-enable-umi-solid or --vc-enable-umi-liquid to match
# the sample type. This example uses the solid setting.
SNV_OPTIONS="
  --enable-variant-caller true \
  --vc-enable-umi-solid true \
  --vc-target-bed $VC_TARGET_BED \
  --vc-systematic-noise $VC_SYSTEMATIC_NOISE_FILE \
  --vc-systematic-noise-method mean \
"

CNV_OPTIONS="
  --enable-cnv true \
  --cnv-target-bed $CNV_TARGET_BED \
  --cnv-combined-counts $CNV_PANEL_OF_NORMALS \
  --cnv-use-somatic-vc-baf true \
"

# HRD requires enabling CNV
HRD_OPTIONS="
--enable-hrd=true \
"

SV_OPTIONS="
  --enable-sv true \
  --sv-exome true \
  --sv-call-regions-bed $SV_TARGET_BED \
"

# Nirvana annotation settings are required for TMB.
# Set REF_TYPE to GRCh37 or GRCh38.
TMB_OPTIONS="
  --enable-tmb=true \
  --enable-variant-annotation=true \
  --variant-annotation-data=$NIRVANA_ANNOTATION_FOLDER \
  --variant-annotation-assembly=$REF_TYPE \
"

MSI_OPTIONS="
--msi-command=tumor-normal \
--msi-coverage-threshold=60 \
--msi-microsatellites-file=$MSI_MICROSATELLITES_FILE \
"

# Enable --hla-enable-class-2 only if the panel has sufficient coverage for
# class II HLA typing
HLA_OPTIONS="
  --enable-hla=true \
  --hla-as-filter-min-threshold=29.0 \
  --hla-as-filter-ratio-threshold=0.85 \
  --hla-enable-class-2=true \
"

# Construct final command line
CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $SNV_OPTIONS \
  $CNV_OPTIONS \
  $HRD_OPTIONS \
  $SV_OPTIONS \
  $TMB_OPTIONS \
  $MSI_OPTIONS \
  $HLA_OPTIONS \
"

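# Execute. $CMD is left unquoted on purpose so that word splitting turns the
# string into separate arguments; paths containing spaces are not supported.
echo $CMD
$CMD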

Additional Notes and Options

Optional settings per component are listed below. Full option list at this page.

UMI

SNV

SNV library specific settings

SNV systematic noise file

Generic SNV noise files can be downloaded here: DRAGEN Software Support Site page

However, for UMI samples and panels, it is strongly recommended to build a custom systematic noise file as follows:

Step 1. Run DRAGEN somatic tumor-only on each of approximately 20-50 normal samples:

### Choose one input source and leave the other commented out
### i) BAM
INPUT="--tumor-bam-input ${NORMAL_BAM}"
### ii) FASTQs
# INPUT="--tumor-fastq-list ${NORMAL_FASTQ_LIST} \
#   --tumor-fastq-list-sample-id ${NORMAL_FASTQ_LIST_SAMPLE_ID}"
###

# --vc-detect-systematic-noise-mode=UMI detects the ultra low noise levels
# relevant for UMI panels. REF_TYPE is GRCh37 or GRCh38.
dragen \
-r ${REFERENCE} \
${INPUT} \
--vc-detect-systematic-noise=true \
--vc-detect-systematic-noise-mode=UMI \
--vc-enable-germline-tagging=true \
--enable-variant-annotation=true \
--variant-annotation-data ${NIRVANA_ANNOTATION_FOLDER} \
--variant-annotation-assembly ${REF_TYPE} \
--intermediate-results-dir ${TMP} \
--output-directory ${DIR} \
--output-file-prefix ${PREFIX}

Gather the full paths to the VCFs from step 1 in ${VCF_LIST} by specifying 1 file per line.
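One way to gather the paths is sketched below. It assumes, without confirmation from this recipe, that every step 1 run wrote its VCF to a common directory as `<prefix>.hard-filtered.vcf.gz`; the helper name, `VCF_DIR`, and `VCF_SUFFIX` are hypothetical and should be adjusted to the actual outputs.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical output directory of the step 1 runs (must be an absolute path).
VCF_DIR="${VCF_DIR:-$PWD/noise_step1}"
# Assumed file name suffix of the step 1 VCFs; adjust to your outputs.
VCF_SUFFIX="${VCF_SUFFIX:-.hard-filtered.vcf.gz}"

# Hypothetical helper: write one VCF path per line to list file $1 for each
# sample prefix given after it, failing if an expected VCF is absent.
write_vcf_list() {
  local list_file=$1 prefix vcf
  shift
  : > "$list_file"
  for prefix in "$@"; do
    vcf="${VCF_DIR}/${prefix}${VCF_SUFFIX}"
    if [ ! -f "$vcf" ]; then
      echo "Step 1 VCF not found: $vcf" >&2
      return 1
    fi
    echo "$vcf" >> "$list_file"
  done
}

# Example usage:
# write_vcf_list "$VCF_LIST" normal01 normal02 normal03
```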

Step 2. Generate the final noise file with:

# --build-sys-noise-method=mean sets the default noise mode for this noise file
# by tagging the noise file header with '##NoiseMethod=mean'
dragen \
-r ${REF_DIR} \
--build-sys-noise-vcfs-list ${VCF_LIST} \
--build-sys-noise-method=mean \
--output-directory ${DIR} \
--output-file-prefix ${PREFIX}

To download a SINE/ALU regions bed for SNV excluded regions

ALUs comprise approximately 11% of the genome and are common in introns. High rates of deamination FP calls have been observed in some FFPE libraries. If the ALU regions are not clinically significant for a specific analysis, then it is recommended to simply filter out the entire ALU region using the DRAGEN excluded regions filter: --vc-excluded-regions-bed $BED.

The ALU bed file can be downloaded as part of the Bed File Collection: DRAGEN Software Support Site page
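A minimal sketch of adding the excluded regions filter to the SNV options only when an ALU BED file is supplied. The `snv_options` helper and the `ALU_BED` variable are hypothetical names used for illustration.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: print the SNV options, appending the excluded regions
# filter when a BED path is passed as the first argument.
snv_options() {
  local excluded_bed=${1:-}
  local opts="--enable-variant-caller true"
  if [ -n "$excluded_bed" ]; then
    opts="$opts --vc-excluded-regions-bed $excluded_bed"
  fi
  echo "$opts"
}

# ALU_BED is a hypothetical variable; leave it unset or empty to skip the filter.
SNV_OPTIONS=$(snv_options "${ALU_BED:-}")
echo "$SNV_OPTIONS"
```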

CNV

Please include the matched normal sample in the CNV panel of normals.

Generating Panel of Normals (PON)

Somatic WES CNV requires PON files. Follow the two steps below to generate CNV PON:

Target counts generation (per normal sample): Generate target counts for each individual normal sample to serve as the baseline. Any options used for panel of normals generation (BED file, GC bias correction, etc.) should match those used when processing the case sample.

CNV_PON_OPTIONS="
  --enable-cnv true \
  --cnv-target-bed $CNV_TARGET_BED \
"

CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $CNV_PON_OPTIONS \
"

Combined counts generation: The individual PON counts can be merged into a single <prefix>.combined.counts.txt.gz file.

CNV_COMBINED_COUNTS_OPTIONS="
  --enable-cnv true \
  --cnv-generate-combined-counts true \
  --cnv-normals-list $CNV_NORMALS_LIST \
"

CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $CNV_COMBINED_COUNTS_OPTIONS \
"

For more information, see Panel of Normals.

SV

TMB library specific settings

MSI

Microsatellite sites file

Microsatellite sites file can be downloaded here: DRAGEN Software Support Site page

For panels, it is recommended to post-process the file by intersecting the WES or WGS sites with the manifest, which avoids using off-target reads in the MSI analysis. For small panels, it may be necessary to generate custom site files to ensure the panel covers at least 2000 sites. To generate custom MSI site files, refer to the MSI Biomarker section in the user guide.

MSI library specific settings

HLA

Somatic Tumor Only with UMI

DRAGEN Recipe - Somatic UMI Tumor Only

Overview

This recipe is for processing sequencing data with unique molecular identifier (UMI) for somatic tumor only workflows.

Example Command Line

For most scenarios, simply creating the union of the command line options from the single caller scenarios will work.

Configure the INPUT options
Configure the OUTPUT options
Configure MAP/ALIGN depending on if realignment is desired or not
Configure the VARIANT CALLERs based on the application
Configure any additional options
Build up the necessary options for each component separately, so that they can be re-used in the final command line.

We recommend using a linear (non-pangenome) reference for somatic analysis. For more details, refer to Dragen Reference Support.

The following are partial templates that can be used as starting points. Adjust them accordingly for your specific use case.

Additional Notes and Options

UMI

SNV

SNV library specific settings

SNV systematic noise file

However, for UMI samples and panels, it is strongly recommended to build a custom systematic noise file as follows:

Step 1. Run DRAGEN somatic tumor-only on each of approximately 20-50 normal samples:

Gather the full paths to the VCFs from step 1 in ${VCF_LIST} by specifying 1 file per line.

Step 2. Generate the final noise file with:

To download a SINE/ALU regions bed for SNV excluded regions

CNV

Generating Panel of Normals (PON)

Somatic WES CNV requires PON files. Follow the two steps below to generate CNV PON:

Target counts generation (per normal sample): Generate target counts for each individual normal sample to serve as the baseline. Any options used for panel of normals generation (BED file, GC bias correction, etc.) should match those used when processing the case sample.

Combined counts generation: The individual PON counts can be merged into a single <prefix>.combined.counts.txt.gz file.

SV

TMB library specific settings

MSI

Microsatellite sites file

Build Normal references of microsatellite repeat distribution

Normal reference files can be generated by running collect-evidence mode on a panel of normal samples. This ONLY works with DRAGEN germline mode.

The --msi-microsatellites-file should be the same file used for running tumor-only mode. --msi-coverage-threshold should also be the same value used for running tumor-only mode.

A minimum of 20 normal samples is required for tumor-only mode.

MSI library specific settings

HLA

Somatic WES Tumor Normal

DRAGEN Recipe - Somatic WES Tumor Normal

Overview

This recipe is for processing whole exome sequencing data for somatic tumor normal workflows.

Example Command Line

For most scenarios, simply creating the union of the command line options from the single caller scenarios will work.

Configure the INPUT options
Configure the OUTPUT options
Configure MAP/ALIGN depending on if realignment is desired or not
Configure the VARIANT CALLERs based on the application
Configure any additional options
Build up the necessary options for each component separately, so that they can be re-used in the final command line.

We recommend using a linear (non-pangenome) reference for somatic analysis. For more details, refer to Dragen Reference Support.

The following are partial templates that can be used as starting points. Adjust them accordingly for your specific use case.

#!/bin/bash
set -euo pipefail

# Path to DRAGEN hashtable
DRAGEN_HASH_TABLE=<REF_DIR>

# Path to output directory for the DRAGEN run
OUTPUT=<OUT_DIR>

# File prefix for DRAGEN output files
PREFIX=<OUT_PREFIX>

# Path to VC systematic noise BED file. In tumor-normal variant calling, this filter
# is recommended for removing systematic noise observed in normal samples. Prebuilt
# systematic noise files are available for download on the DRAGEN Software
# Support Site page:
# https://support.illumina.com/sequencing/sequencing_software/dragen-bio-it-platform/product_files.html
# Alternatively, running the somatic TO pipeline on normal samples can generate a
# systematic noise file. We recommend using a systematic noise file based on normal
# samples that match the library prep of the tumor samples.
VC_SYSTEMATIC_NOISE_FILE=<VC_SYSTEMATIC_NOISE_BED_FILE_PATH>

# Define the input sources, select fastq list, fastq, bam, or cram.
INPUT_FASTQ_LIST="
  --tumor-fastq-list $TUMOR_FASTQ_LIST \
  --tumor-fastq-list-sample-id $TUMOR_FASTQ_LIST_SAMPLE_ID \
  --fastq-list $FASTQ_LIST \
  --fastq-list-sample-id $FASTQ_LIST_SAMPLE_ID \
"

INPUT_FASTQ="
  --tumor-fastq1 $TUMOR_FASTQ1 \
  --tumor-fastq2 $TUMOR_FASTQ2 \
  --RGSM-tumor $RGSM_TUMOR \
  --RGID-tumor $RGID_TUMOR \
  --fastq-file1 $FASTQ1 \
  --fastq-file2 $FASTQ2 \
  --RGSM $RGSM \
  --RGID $RGID \
"

INPUT_BAM="
  --tumor-bam-input $TUMOR_BAM \
  --bam-input $BAM \
"

INPUT_CRAM="
  --tumor-cram-input $TUMOR_CRAM \
  --cram-input $CRAM \
"

# Select input source, here in this example we use INPUT_FASTQ_LIST
INPUT_OPTIONS="
  --ref-dir $DRAGEN_HASH_TABLE \
  $INPUT_FASTQ_LIST \
"

OUTPUT_OPTIONS="
  --output-directory $OUTPUT \
  --output-file-prefix $PREFIX \
"

MA_OPTIONS="
  --enable-map-align true \
  --enable-sort true \
  --enable-duplicate-marking true \
"

CNV_OPTIONS="
  --enable-cnv true \
  --cnv-target-bed $CNV_TARGET_BED \
  --cnv-combined-counts $CNV_PANEL_OF_NORMALS \
  --cnv-use-somatic-vc-baf true \
"

# HRD requires enabling CNV
HRD_OPTIONS="
--enable-hrd=true \
"

SNV_OPTIONS="
  --enable-variant-caller true \
  --vc-target-bed $VC_TARGET_BED \
  --vc-systematic-noise $VC_SYSTEMATIC_NOISE_FILE \
  --vc-systematic-noise-method mean \
"

SV_OPTIONS="
  --enable-sv true \
  --sv-exome true \
  --sv-call-regions-bed $SV_TARGET_BED \
"

SNV_SV_DEDUPLICATION_OPTIONS="
  --enable-variant-deduplication true \
"

# Nirvana annotation settings are required for TMB.
# Set REF_TYPE to GRCh37 or GRCh38.
TMB_OPTIONS="
  --enable-tmb=true \
  --enable-variant-annotation=true \
  --variant-annotation-data=$NIRVANA_ANNOTATION_FOLDER \
  --variant-annotation-assembly=$REF_TYPE \
"

MSI_OPTIONS="
--msi-command=tumor-normal \
--msi-coverage-threshold=60 \
--msi-microsatellites-file=$MSI_MICROSATELLITES_FILE \
"

# Enable --hla-enable-class-2 only if the panel has sufficient coverage for
# class II HLA typing
HLA_OPTIONS="
  --enable-hla=true \
  --hla-enable-class-2=true \
"

# Construct final command line
CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $MA_OPTIONS \
  $CNV_OPTIONS \
  $SNV_OPTIONS \
  $SV_OPTIONS \
  $SNV_SV_DEDUPLICATION_OPTIONS \
  $HRD_OPTIONS \
  $TMB_OPTIONS \
  $MSI_OPTIONS \
  $HLA_OPTIONS \
"

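# Execute. $CMD is left unquoted on purpose so that word splitting turns the
# string into separate arguments; paths containing spaces are not supported.
echo $CMD
$CMD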
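The template above selects the input source by hand. As a hedged convenience, the sketch below picks the tumor-normal alignment options from the file extensions. The helper is hypothetical, and requiring both files to be of the same type is a simplification of this sketch rather than a documented DRAGEN restriction.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: print the tumor-normal input options that match the
# extensions of the tumor ($1) and normal ($2) alignment files.
tumor_normal_input_options() {
  local tumor=$1 normal=$2
  case "$tumor:$normal" in
    *.bam:*.bam)   echo "--tumor-bam-input $tumor --bam-input $normal" ;;
    *.cram:*.cram) echo "--tumor-cram-input $tumor --cram-input $normal" ;;
    *)
      echo "Expected two BAM or two CRAM files, got: $tumor $normal" >&2
      return 1
      ;;
  esac
}

# Example usage:
# INPUT_OPTIONS="--ref-dir $DRAGEN_HASH_TABLE $(tumor_normal_input_options "$TUMOR_BAM" "$BAM")"
```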

Additional Notes and Options

Optional settings per component are listed below. Full option list at this page.

CNV

Please include the matched normal sample in the CNV panel of normals.

Generating Panel of Normals (PON)

Somatic WES CNV requires PON files. Follow the two steps below to generate CNV PON:

Target counts generation (per normal sample): Generate target counts for each individual normal sample to serve as the baseline. Any options used for panel of normals generation (BED file, GC bias correction, etc.) should match those used when processing the case sample.

CNV_PON_OPTIONS="
  --enable-cnv true \
  --cnv-target-bed $CNV_TARGET_BED \
"

CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $CNV_PON_OPTIONS \
"

Combined counts generation: The individual PON counts can be merged into a single <prefix>.combined.counts.txt.gz file.

CNV_COMBINED_COUNTS_OPTIONS="
  --enable-cnv true \
  --cnv-generate-combined-counts true \
  --cnv-normals-list $CNV_NORMALS_LIST \
"

CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $CNV_COMBINED_COUNTS_OPTIONS \
"

For more information, see Panel of Normals.

SNV

SNV library specific settings

SNV systematic noise file

Generic SNV noise files can be downloaded here: DRAGEN Software Support Site page

When possible, it is recommended to build a pipeline-specific systematic noise file that matches the library prep and sequencer of interest:

Step 1. Run DRAGEN somatic tumor-only on each of approximately 20-50 normal samples:

### Choose one input source and leave the other commented out
### i) BAM
INPUT="--tumor-bam-input ${NORMAL_BAM}"
### ii) FASTQs
# INPUT="--tumor-fastq-list ${NORMAL_FASTQ_LIST} \
#   --tumor-fastq-list-sample-id ${NORMAL_FASTQ_LIST_SAMPLE_ID}"
###

# REF_TYPE is GRCh37 or GRCh38
dragen \
-r ${REFERENCE} \
${INPUT} \
--vc-detect-systematic-noise=true \
--vc-enable-germline-tagging=true \
--enable-variant-annotation=true \
--variant-annotation-data ${NIRVANA_ANNOTATION_FOLDER} \
--variant-annotation-assembly ${REF_TYPE} \
--intermediate-results-dir ${TMP} \
--output-directory ${DIR} \
--output-file-prefix ${PREFIX}

Gather the full paths to the VCFs from step 1 in ${VCF_LIST} by specifying 1 file per line.
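Before building the noise file, it may help to confirm that the list is well formed. The sketch below uses a hypothetical `check_vcf_list` helper that verifies every listed file exists and warns when the number of normals falls outside the approximately 20-50 range suggested in step 1.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: fail if list file $1 names a missing file, warn when the
# count is outside 20-50, and print the number of listed VCFs.
check_vcf_list() {
  local list_file=$1 count=0 vcf
  while IFS= read -r vcf || [ -n "$vcf" ]; do
    [ -n "$vcf" ] || continue   # skip blank lines
    if [ ! -f "$vcf" ]; then
      echo "Listed VCF does not exist: $vcf" >&2
      return 1
    fi
    count=$((count + 1))
  done < "$list_file"
  if [ "$count" -lt 20 ] || [ "$count" -gt 50 ]; then
    echo "Warning: $count VCFs listed; approximately 20-50 normals are suggested" >&2
  fi
  echo "$count"
}

# Example usage:
# check_vcf_list "$VCF_LIST"
```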

Step 2. Generate the final noise file with:

# --build-sys-noise-method=mean sets the default noise mode for this noise file
# by tagging the noise file header with '##NoiseMethod=mean'
dragen \
-r ${REF_DIR} \
--build-sys-noise-vcfs-list ${VCF_LIST} \
--build-sys-noise-method=mean \
--output-directory ${DIR} \
--output-file-prefix ${PREFIX}

To download a SINE/ALU regions bed for SNV excluded regions

The ALU bed file can be downloaded as part of the Bed File Collection: DRAGEN Software Support Site page

SV

SNV-SV deduplication

We recommend using --enable-variant-deduplication true to filter out small indels in the structural variant VCF that also appear, with PASS in the FILTER column, in the small variant VCF. With this feature enabled, DRAGEN creates a new VCF containing the variants from the SV VCF that do not match a variant in the SNV VCF. The deduplicated SV VCF has the prefix passed by --output-file-prefix, followed by sv.small_indel_dedup. DRAGEN normalizes variants by trimming and left shifting by up to 500 bases. This feature is useful, for example, when both the SV and SNV callers are used in a somatic workflow: combining the callers can increase sensitivity, and deduplication prevents duplicated variants in genes such as FLT3 and KMT2A.

MSI

Microsatellite sites file

Microsatellite sites file can be downloaded here: DRAGEN Software Support Site page

HLA

Somatic WES Tumor Only

DRAGEN Recipe - Somatic WES Tumor Only

Overview

This recipe is for processing whole exome sequencing data for somatic tumor only workflows.

Example Command Line

For most scenarios, simply creating the union of the command line options from the single caller scenarios will work.

Configure the INPUT options
Configure the OUTPUT options
Configure MAP/ALIGN depending on if realignment is desired or not
Configure the VARIANT CALLERs based on the application
Configure any additional options
Build up the necessary options for each component separately, so that they can be re-used in the final command line.

We recommend using a linear (non-pangenome) reference for somatic analysis. For more details, refer to Dragen Reference Support.

The following are partial templates that can be used as starting points. Adjust them accordingly for your specific use case.

Additional Notes and Options

CNV

Generating Panel of Normals (PON)

Somatic WES CNV requires PON files. Follow the two steps below to generate CNV PON:

Target counts generation (per normal sample): Generate target counts for each individual normal sample to serve as the baseline. Any options used for panel of normals generation (BED file, GC bias correction, etc.) should match those used when processing the case sample.

Combined counts generation: The individual PON counts can be merged into a single <prefix>.combined.counts.txt.gz file.

SNV

SNV library specific settings

SNV systematic noise file

When possible, it is recommended to build a pipeline-specific systematic noise file that matches the library prep and sequencer of interest:

Step 1. Run DRAGEN somatic tumor-only on each of approximately 20-50 normal samples:

Gather the full paths to the VCFs from step 1 in ${VCF_LIST} by specifying 1 file per line.

Step 2. Generate the final noise file with:

To download a SINE/ALU regions bed for SNV excluded regions

SNV-SV deduplication

MSI

Microsatellite sites file

Build Normal references of microsatellite repeat distribution

Normal reference files can be generated by running collect-evidence mode on a panel of normal samples. This ONLY works with DRAGEN germline mode.

The --msi-microsatellites-file should be the same file used for running tumor-only mode. --msi-coverage-threshold should also be the same value used for running tumor-only mode.

A minimum of 20 normal samples is required for tumor-only mode.

HLA

Somatic WGS Tumor Normal

DRAGEN Recipe - Somatic WGS Tumor Normal

Overview

This recipe is for processing whole genome sequencing data for somatic tumor normal workflows.

Example Command Line

For most scenarios, simply creating the union of the command line options from the single caller scenarios will work.

Configure the INPUT options
Configure the OUTPUT options
Configure MAP/ALIGN depending on if realignment is desired or not
Configure the VARIANT CALLERs based on the application
Configure any additional options
Build up the necessary options for each component separately, so that they can be re-used in the final command line.

We recommend using a linear (non-pangenome) reference for somatic analysis. For more details, refer to Dragen Reference Support.

The following are partial templates that can be used as starting points. Adjust them accordingly for your specific use case.

Additional Notes and Options

CNV

SNV

SNV library specific settings

SNV systematic noise file

When possible, it is recommended to build a pipeline-specific systematic noise file that matches the library prep and sequencer of interest:

Step 1. Run DRAGEN somatic tumor-only on each of approximately 20-50 normal samples:

Gather the full paths to the VCFs from step 1 in ${VCF_LIST} by specifying 1 file per line.

Step 2. Generate the final noise file with:

To download a SINE/ALU regions bed for SNV excluded regions

SV

Generating SV systematic noise BEDPE file

You can generate systematic noise BEDPE files from normal samples collected using library prep, sequencing system, and panels.

To build the SV systematic noise file

Run DRAGEN somatic tumor-only on normal samples with --sv-detect-systematic-noise set to true to generate VCF output per normal sample.
Build the BEDPE file from the VCFs using --sv-build-systematic-noise-vcfs-list, which takes a list of the input VCFs from the previous step with one VCF per line. Example command line is provided below
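The two steps above can be outlined as a dry run that only prints the commands. This is an assumption-laden sketch, not the reference command line: only --sv-detect-systematic-noise and --sv-build-systematic-noise-vcfs-list come from this section, the remaining options are borrowed from the other templates on this page, and a real run may need additional options. The helper names and paths are hypothetical.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical placeholder values; replace before use.
REF_DIR="${REF_DIR:-/path/to/ref}"
OUT_DIR="${OUT_DIR:-/path/to/out}"

# Step 1: print a tumor-only SV command for one normal BAM ($1) and prefix ($2).
sv_noise_step1_cmd() {
  local normal_bam=$1 prefix=$2
  echo "dragen --ref-dir $REF_DIR --tumor-bam-input $normal_bam" \
       "--enable-sv true --sv-detect-systematic-noise true" \
       "--output-directory $OUT_DIR --output-file-prefix $prefix"
}

# Step 2: print the command that builds the BEDPE file from the VCF list ($1).
sv_noise_step2_cmd() {
  local vcf_list=$1 prefix=$2
  echo "dragen --ref-dir $REF_DIR --sv-build-systematic-noise-vcfs-list $vcf_list" \
       "--output-directory $OUT_DIR --output-file-prefix $prefix"
}

# Dry run: print the commands for review instead of executing them.
sv_noise_step1_cmd /path/to/normal01.bam normal01
sv_noise_step2_cmd /path/to/sv_noise_vcfs.txt sv_noise
```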

SNV-SV deduplication

MSI

Microsatellite sites file

HLA

Somatic WGS Tumor Only

DRAGEN Recipe - Somatic WGS Tumor Only

Overview

This recipe is for processing whole genome sequencing data for somatic tumor only workflows.

Example Command Line

For most scenarios, simply creating the union of the command line options from the single caller scenarios will work.

Configure the INPUT options
Configure the OUTPUT options
Configure MAP/ALIGN depending on if realignment is desired or not
Configure the VARIANT CALLERs based on the application
Configure any additional options
Build up the necessary options for each component separately, so that they can be re-used in the final command line.

We recommend using a linear (non-pangenome) reference for somatic analysis. For more details, refer to Dragen Reference Support.

The following are partial templates that can be used as starting points. Adjust them accordingly for your specific use case.

Additional Notes and Options

In general, we recommend the default values for most libraries and sample types. For specific libraries or sample types where different values may be advisable, those values are listed under "library specific settings" in each variant caller section below.

CNV

SNV

SNV library specific settings

SNV systematic noise file

When possible, it is recommended to build a pipeline-specific systematic noise file that matches the library prep and sequencer of interest:

Step 1. Run DRAGEN somatic tumor-only on each of approximately 20-50 normal samples:

Gather the full paths to the VCFs from step 1 in ${VCF_LIST} by specifying 1 file per line.

Step 2. Generate the final noise file with:

To download a SINE/ALU regions bed for SNV excluded regions

SV

SV library-specific settings

To build the SV systematic noise file

You can generate systematic noise BEDPE files from normal samples collected using the same library prep, sequencing system, and panels as the samples to be analyzed.

To generate a BEDPE file, do as follows.

Run DRAGEN somatic tumor-only on normal samples with --sv-detect-systematic-noise set to true to generate VCF output per normal sample.
Build the BEDPE file from those VCFs with --sv-build-systematic-noise-vcfs-list, which takes a list of the input VCFs from the previous step, one VCF per line. An example command line is provided below.
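
The two steps above can be sketched as a pair of helper functions that only print the commands, so they can be reviewed before anything is run. This is an illustration under assumptions: the function names are hypothetical, and pairing --tumor-bam-input with --enable-sv true is assumed to be a sufficient step 1 invocation for BAM input. All options used appear elsewhere in these recipes.

```shell
#!/bin/bash
set -euo pipefail

# Step 1 (once per normal sample): print a tumor-only SV run that emits a
# systematic noise VCF for that sample.
# Arguments: <ref_dir> <normal_bam> <out_dir> <out_prefix>
sv_noise_detect_cmd() {
  printf '%s\n' "dragen -r $1 --tumor-bam-input $2 --enable-sv true --sv-detect-systematic-noise true --output-directory $3 --output-file-prefix $4"
}

# Step 2 (once): print the build command that combines the VCFs named in the
# list file (one path per line) into a BEDPE file.
# Arguments: <ref_dir> <vcf_list_file> <out_dir> <out_prefix>
sv_noise_build_cmd() {
  printf '%s\n' "dragen -r $1 --sv-build-systematic-noise-vcfs-list $2 --output-directory $3 --output-file-prefix $4"
}
```

For example, sv_noise_detect_cmd can be called in a loop over the normal BAMs, and sv_noise_build_cmd once after the resulting VCF paths have been written to the list file.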

SNV-SV deduplication

MSI

Microsatellite sites file

Build normal references of microsatellite repeat distribution

Normal reference files can be generated by running collect-evidence mode on a panel of normal samples. This ONLY works with DRAGEN germline mode.

The --msi-microsatellites-file should be the same file used for running tumor-only mode. --msi-coverage-threshold should also be the same value used for running tumor-only mode.

A minimum of 20 normal samples is required for tumor-only mode.
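
One way to guarantee that the normal references and the tumor-only run share the same microsatellites file and coverage threshold is to define both values once and derive the options for each mode from them. The sketch below is illustrative only: the function names and the default file name are hypothetical, the threshold of 60 is simply the value used in the templates, and the value collect-evidence for --msi-command is assumed from the mode name given above.

```shell
#!/bin/bash
set -euo pipefail

# Shared MSI settings, defined once so that both modes always agree.
MSI_MICROSATELLITES_FILE="${MSI_MICROSATELLITES_FILE:-microsatellites.list}"  # placeholder path
MSI_COVERAGE_THRESHOLD="${MSI_COVERAGE_THRESHOLD:-60}"

# Options common to the collect-evidence and tumor-only runs.
msi_shared_options() {
  printf '%s' "--msi-microsatellites-file=$MSI_MICROSATELLITES_FILE --msi-coverage-threshold=$MSI_COVERAGE_THRESHOLD"
}

# Options for a germline-mode run on one normal sample
# (assumed mode value: collect-evidence).
msi_collect_evidence_options() {
  printf '%s\n' "--msi-command=collect-evidence $(msi_shared_options)"
}

# Options for the tumor-only run that consumes the normal reference files.
# Argument: <normal_reference_dir>
msi_tumor_only_options() {
  printf '%s\n' "--msi-command=tumor-only $(msi_shared_options) --msi-ref-normal-dir=$1"
}
```

The printed option strings can be appended to the germline and tumor-only command lines respectively.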

HLA

Somatic WGS Heme Tumor Only

DRAGEN Recipe - Somatic WGS Heme Tumor Only

Overview

This recipe is for processing whole genome sequencing data for somatic heme tumor only workflows.

Example Command Line

For most scenarios, simply creating the union of the command line options from the single caller scenarios will work.

Configure the INPUT options
Configure the OUTPUT options
Configure MAP/ALIGN depending on if realignment is desired or not
Configure the VARIANT CALLERs based on the application
Configure any additional options
Build up the necessary options for each component separately, so that they can be re-used in the final command line.

We recommend using a linear (non-pangenome) reference for somatic analysis. For more details, refer to Dragen Reference Support.

The following are partial templates that can be used as starting points. Adjust them accordingly for your specific use case.

#!/bin/bash
set -euo pipefail

# Path to DRAGEN hashtable
DRAGEN_HASH_TABLE=<REF_DIR>

# Path to output directory for the DRAGEN run
OUTPUT=<OUT_DIR>

# File prefix for DRAGEN output files
PREFIX=<OUT_PREFIX>

# Population SNP VCF. It can be retrieved from catalogs of population variation
# such as the 1000 genome project or other large cohort discovery efforts.
# Only high-frequency SNPs should be included. A suitable file can be retrieved
# from the GATK resource bundle: 1000G_phase1.snps.high_confidence.vcf.gz
CNV_POP_VCF=<POPULATION_VCF_PATH>

# Path to VC systematic noise BED file. In tumor-only variant calling, this filter
# is essential for removing systematic noise observed in normal samples. Prebuilt
# systematic noise files are available for download on the DRAGEN Software
# Support Site page. Alternatively, running the somatic TO pipeline on
# normal samples can generate a systematic noise file. We recommend using a
# systematic noise file based on normal samples that match the library prep of
# the tumor samples. A prebuilt systematic noise BED file can be downloaded from
# https://support.illumina.com/sequencing/sequencing_software/dragen-bio-it-platform/product_files.html
VC_SYSTEMATIC_NOISE_FILE=<VC_SYSTEMATIC_NOISE_BED_FILE_PATH>

# The Nirvana annotation database is downloadable at 
# https://support-docs.illumina.com/SW/DRAGEN_v310/Content/SW/DRAGEN/IAE_DownloadData.htm
NIRVANA_ANNOTATION_FOLDER=<NIRVANA_ANNOTATION_FOLDER_PATH>

# Define the input sources, select fastq list, fastq, bam, or cram.
INPUT_FASTQ_LIST="
  --tumor-fastq-list $TUMOR_FASTQ_LIST \
  --tumor-fastq-list-sample-id $TUMOR_FASTQ_LIST_SAMPLE_ID \
"

INPUT_FASTQ="
  --tumor-fastq1 $TUMOR_FASTQ1 \
  --tumor-fastq2 $TUMOR_FASTQ2 \
  --RGSM-tumor $RGSM_TUMOR \
  --RGID-tumor $RGID_TUMOR \
"

INPUT_BAM="
  --tumor-bam-input $TUMOR_BAM \
"

INPUT_CRAM="
  --tumor-cram-input $TUMOR_CRAM \
"

# Select input source, here in this example we use INPUT_FASTQ_LIST
INPUT_OPTIONS="
  --ref-dir $DRAGEN_HASH_TABLE \
  $INPUT_FASTQ_LIST \
"

OUTPUT_OPTIONS="
  --output-directory $OUTPUT \
  --output-file-prefix $PREFIX \
"

MA_OPTIONS="
  --enable-map-align true \
  --enable-sort true \
  --enable-duplicate-marking true \
"

CNV_OPTIONS="
  --heme-cnv true \
  --cnv-population-b-allele-vcf $CNV_POP_VCF \
"

QC_OPTIONS="
  --gc-metrics-enable=true \
"

# Set REF_TYPE to the annotation assembly: GRCh37 or GRCh38.
SNV_OPTIONS="
  --enable-variant-caller true \
  --vc-systematic-noise $VC_SYSTEMATIC_NOISE_FILE \
  --vc-enable-germline-tagging true \
  --enable-variant-annotation true \
  --variant-annotation-data $NIRVANA_ANNOTATION_FOLDER \
  --variant-annotation-assembly $REF_TYPE \
"

SV_OPTIONS="
  --heme-sv true \
  --sv-systematic-noise $SV_SYSTEMATIC_NOISE_BEDPE \
"

DUX4_OPTIONS="
  --enable-dux4-caller true \
"

SNV_SV_DEDUPLICATION_OPTIONS="
  --enable-variant-deduplication true \
"

# Construct final command line
CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $MA_OPTIONS \
  $QC_OPTIONS \
  $CNV_OPTIONS \
  $SNV_OPTIONS \
  $SV_OPTIONS \
  $DUX4_OPTIONS \
  $SNV_SV_DEDUPLICATION_OPTIONS \
"

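# Execute. $CMD is intentionally unquoted so that the shell splits it into
# separate arguments; paths therefore must not contain whitespace.
echo $CMD
$CMD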

Additional Notes and Options

Optional settings per component are listed below. Full option list at this page.

CNV

SNV

SNV systematic noise file

Generic SNV noise files (including a HEME specific WGS noise file) can be downloaded here: DRAGEN Software Support Site page

When possible it is recommended to build a pipeline specific systematic noise file that matches the library prep and sequencer of interest:

Step 1. Run DRAGEN somatic tumor-only on each of approximately 50 normal samples:

### Choose exactly one input source and comment out the other.
### i) BAM
INPUT="--tumor-bam-input ${NORMAL_BAM}"
### ii) FASTQs
# INPUT="--tumor-fastq-list ${NORMAL_FASTQ_LIST} \
#   --tumor-fastq-list-sample-id ${NORMAL_FASTQ_LIST_SAMPLE_ID}"

# REF_TYPE is the annotation assembly: GRCh37 or GRCh38.
dragen \
-r ${REFERENCE} \
${INPUT} \
--enable-variant-caller true \
--vc-detect-systematic-noise true \
--build-sys-noise-germline-vaf-threshold=1 \
--vc-enable-germline-tagging true \
--enable-variant-annotation true \
--variant-annotation-data ${NIRVANA_ANNOTATION_FOLDER} \
--variant-annotation-assembly ${REF_TYPE} \
--intermediate-results-dir ${TMP} \
--output-directory ${DIR} \
--output-file-prefix ${PREFIX}

Gather the full paths to the VCFs from step 1 in ${VCF_LIST} by specifying 1 file per line.
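
Collecting the paths can be scripted. The following is a minimal sketch, assuming all step 1 runs were written below a common root directory; the function name make_vcf_list and the default filename pattern *.vcf.gz are hypothetical, so narrow the pattern if the output directories also contain VCFs from other callers.

```shell
#!/bin/bash
set -euo pipefail

# Write the absolute path of every matching VCF under a directory to a list
# file, one path per line, sorted so that the result is reproducible.
# Arguments: <step1_output_root> <vcf_list_file> [filename_pattern]
make_vcf_list() {
  local root=$1 list=$2 pattern=${3:-*.vcf.gz}
  local abs_root
  abs_root=$(cd "$root" && pwd)
  find "$abs_root" -type f -name "$pattern" | sort > "$list"
  # An empty list almost certainly means a wrong directory or pattern.
  if [ ! -s "$list" ]; then
    echo "No VCFs matching '$pattern' under $abs_root" >&2
    return 1
  fi
}
```

For example, make_vcf_list "${DIR}" "${VCF_LIST}" writes the list that step 2 consumes.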

Step 2. Generate the final noise file with:

dragen \
-r ${REF_DIR} \
--build-sys-noise-vcfs-list ${VCF_LIST} \
--build-sys-noise-method max \
--intermediate-results-dir ${TMP} \
--output-directory ${DIR} \
--output-file-prefix ${PREFIX}

SV

To build the SV systematic noise file

You can generate systematic noise BEDPE files from normal samples collected using the same library prep, sequencing system, and panels as the samples to be analyzed.

To generate a BEDPE file, do as follows.

Run DRAGEN somatic tumor-only on normal samples with --sv-detect-systematic-noise set to true to generate VCF output per normal sample.
Build the BEDPE file from those VCFs with --sv-build-systematic-noise-vcfs-list, which takes a list of the input VCFs from the previous step, one VCF per line. An example command line is provided below.

dragen \
-r <HASHTABLE> \
--sv-build-systematic-noise-vcfs-list <LIST OF VCF FILES> \
--output-directory <OUTPUT> \
--output-file-prefix <SAMPLE>

You can also build systematic noise BEDPE files in the cloud using the DRAGEN Baseline Builder App on BaseSpace.

Pre-built SV systematic noise file

Prebuilt systematic noise files for WGS are available for download on the DRAGEN Software Support Site page. These noise files were generated from 46 unrelated normal samples.

SNV-SV deduplication

We recommend using --enable-variant-deduplication true to filter out small indels in the structural variant VCF that also appear as passing calls in the small variant VCF (PASS in the FILTER column of the small variant VCF file). With this feature, DRAGEN creates a new VCF that contains the variants from the SV VCF that do not match a variant in the SNV VCF. The deduplicated SV VCF file is named with the prefix passed by --output-file-prefix followed by sv.small_indel_dedup.vcf.gz. DRAGEN normalizes variants by trimming and left shifting by up to 500 bases. This feature is useful when a somatic workflow runs both the SV and SNV callers: using both callers can increase sensitivity, and deduplication prevents the same variant from being reported twice in genes such as FLT3 and KMT2A.
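
Downstream steps can locate the deduplicated file from the output directory and prefix. The sketch below assumes the prefix and the sv.small_indel_dedup.vcf.gz suffix are joined with a dot; the function names are hypothetical.

```shell
#!/bin/bash
set -euo pipefail

# Print the expected path of the deduplicated SV VCF for a run.
# Assumption: prefix and suffix are joined with a dot.
# Arguments: <out_dir> <out_prefix>
dedup_sv_vcf_path() {
  printf '%s/%s.sv.small_indel_dedup.vcf.gz\n' "$1" "$2"
}

# Print the path if the file exists and is non-empty, otherwise fail.
# Arguments: <out_dir> <out_prefix>
require_dedup_sv_vcf() {
  local path
  path=$(dedup_sv_vcf_path "$1" "$2")
  if [ ! -s "$path" ]; then
    echo "Deduplicated SV VCF not found: $path" >&2
    return 1
  fi
  printf '%s\n' "$path"
}
```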

RNA WTS

DRAGEN Recipe - RNA Whole Transcriptome Sequencing (WTS)

Overview

This recipe is for processing Whole Transcriptome Sequencing data for RNA workflows.

Example Command Line

For most scenarios, simply creating the union of the command line options from the single caller scenarios will work.

Configure the INPUT options
Configure the OUTPUT options
Configure the RNA MAP/ALIGN options
Configure the QUANT options
Configure the SPLICE options
Configure the FUSION options
Configure the VARIANT options

We recommend using a linear (non-pangenome) reference for RNA analysis. For more details, refer to Dragen Reference Support.

The following are partial templates that can be used as starting points. Adjust them accordingly for your specific use case.

#!/bin/bash
set -euo pipefail

# Path to DRAGEN hashtable
DRAGEN_HASH_TABLE=<REF_DIR>

# Path to output directory for the DRAGEN run
OUTPUT=<OUT_DIR>

# File prefix for DRAGEN output files
PREFIX=<OUT_PREFIX>

# Define the input sources, select fastq list, fastq, bam, or cram.
INPUT_FASTQ_LIST="
  --fastq-list $FASTQ_LIST \
  --fastq-list-sample-id $FASTQ_LIST_SAMPLE_ID \
"

INPUT_FASTQ="
  --fastq-file1 $FASTQ1 \
  --fastq-file2 $FASTQ2 \
  --RGSM $RGSM \
  --RGID $RGID \
"

# You could use the tumor fastq options to provide the FASTQ files.
INPUT_TUMOR_FASTQ="
  --tumor-fastq1 $FASTQ1 \
  --tumor-fastq2 $FASTQ2 \
  --RGSM $RGSM \
  --RGID $RGID \
"

INPUT_BAM="
  --bam-input $BAM \
"

INPUT_CRAM="
  --cram-input $CRAM \
"

# Select input source, here in this example we use INPUT_FASTQ_LIST
INPUT_OPTIONS="
  --ref-dir $DRAGEN_HASH_TABLE \
  $INPUT_FASTQ_LIST \
"

OUTPUT_OPTIONS="
  --output-directory $OUTPUT \
  --output-file-prefix $PREFIX \
"

# RNA aligner requires an annotation file in GTF or GFF3 format.
GTF=<GTF_PATH>

# RNA pipeline requires map-align to be true.
RNA_MAP_OPTIONS="
  --enable-rna true \
  --enable-map-align true \
  --annotation-file $GTF \
"

# Set the library type according to the read orientation.
# The options are IU, ISR, ISF, U, SR, or SF. Set it to A to detect the
# correct read orientation automatically.
QUANT_OPTIONS="
  --enable-rna-quantification true \
  --rna-library-type IU \
  --rna-quantification-gc-bias true \
"

SPLICE_OPTIONS="
  --enable-rna-splice-variant true \
"

FUSION_OPTIONS="
  --enable-rna-gene-fusion true \
"

# To call variants, you need to set a BED file with the target regions to call.
# This BED file could contain all exons.
VARIANT_OPTIONS="
  --enable-variant-caller true \
  --vc-target-bed $TARGET_BED
"

# Construct final command line
CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $RNA_MAP_OPTIONS \
  $QUANT_OPTIONS \
  $SPLICE_OPTIONS \
  $FUSION_OPTIONS \
  $VARIANT_OPTIONS 
"

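# Execute. $CMD is intentionally unquoted so that the shell splits it into
# separate arguments; paths therefore must not contain whitespace.
echo $CMD
$CMD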

Additional Notes and Options

For SPLICE options, you can provide a list of normal splice variants to reduce noisy calls. The file should be a tab-separated file with the following first four columns:

contig name
first base of the splice junction (1-based)
last base of the splice junction (1-based)
strand (0: undefined, 1: +, 2: -)

Use the optional --rna-splice-variant-normals <SPLICE_NORMAL_FILE_PATH> option to provide the normal splice variants.
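
A quick format check before passing the file to DRAGEN can catch column mistakes early. The following is a minimal sketch based only on the four columns described above; the function name is hypothetical, and it assumes the file has no header line and that the first base is not greater than the last base.

```shell
#!/bin/bash
set -euo pipefail

# Validate the first four columns of a splice normals file.
# Prints the first problem found and returns non-zero if a record is invalid.
# Argument: <splice_normals_file>
validate_splice_normals() {
  awk -F '\t' '
    NF < 4        { print "line " NR ": fewer than 4 tab-separated columns"; bad = 1; exit }
    $1 == ""      { print "line " NR ": empty contig name"; bad = 1; exit }
    $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ {
                    print "line " NR ": positions must be integers"; bad = 1; exit }
    ($2 + 0) < 1 || ($3 + 0) < ($2 + 0) {
                    print "line " NR ": positions must be 1-based with first <= last"; bad = 1; exit }
    $4 !~ /^[012]$/ { print "line " NR ": strand must be 0, 1, or 2"; bad = 1; exit }
    END           { exit bad + 0 }
  ' "$1"
}
```

Additional columns after the fourth are ignored by this check.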

RNA Panel

DRAGEN Recipe - RNA Panel Sequencing

Overview

This recipe is for processing panel data for RNA workflows.

Example Command Line

For most scenarios, simply creating the union of the command line options from the single caller scenarios will work.

Configure the INPUT options
Configure the OUTPUT options
Configure the RNA MAP/ALIGN options
Configure the QUANT options
Configure the SPLICE options
Configure the FUSION options
Configure the VARIANT options

We recommend using a linear (non-pangenome) reference for RNA analysis. For more details, refer to Dragen Reference Support.

The following are partial templates that can be used as starting points. Adjust them accordingly for your specific use case.

#!/bin/bash
set -euo pipefail

# Path to DRAGEN hashtable
DRAGEN_HASH_TABLE=<REF_DIR>

# Path to output directory for the DRAGEN run
OUTPUT=<OUT_DIR>

# File prefix for DRAGEN output files
PREFIX=<OUT_PREFIX>

# Define the input sources, select fastq list, fastq files, bam, or cram.
INPUT_FASTQ_LIST="
  --fastq-list $FASTQ_LIST \
  --fastq-list-sample-id $FASTQ_LIST_SAMPLE_ID \
"

INPUT_FASTQ="
  --fastq-file1 $FASTQ1 \
  --fastq-file2 $FASTQ2 \
  --RGSM $RGSM \
  --RGID $RGID \
"

# You could use the tumor fastq options to provide the FASTQ files.
INPUT_TUMOR_FASTQ="
  --tumor-fastq1 $FASTQ1 \
  --tumor-fastq2 $FASTQ2 \
  --RGSM $RGSM \
  --RGID $RGID \
"

INPUT_BAM="
  --bam-input $BAM \
"

INPUT_CRAM="
  --cram-input $CRAM \
"

# Select input source, here in this example we use INPUT_FASTQ_LIST.
INPUT_OPTIONS="
  --ref-dir $DRAGEN_HASH_TABLE \
  $INPUT_FASTQ_LIST \
"


OUTPUT_OPTIONS="
  --output-directory $OUTPUT \
  --output-file-prefix $PREFIX \
"

# When running a panel, you need to set the target regions accordingly.
# This file should be in BED format with the region name in the fourth column.
INPUT_PANEL_BED=<TARGET_REGIONS_PATH>

# RNA aligner requires an annotation file in GTF or GFF3 format.
GTF=<GTF_PATH>

# RNA pipeline requires map-align to be true.
RNA_MAP_OPTIONS="
  --enable-rna true \
  --enable-map-align true \
  --annotation-file $GTF \
"

# Set the library type according to the read orientation.
# The options for orientation are IU, ISR, ISF, U, SR, or SF. To detect the
# correct read orientation automatically, set this option to A.
QUANT_OPTIONS="
  --enable-rna-quantification true \
  --rna-library-type IU \
  --rna-quantification-gc-bias true \
"

# For panels, you can set a target region BED file.
# This BED file has the name of the region in the fourth column.
SPLICE_OPTIONS="
  --enable-rna-splice-variant true \
  --rna-splice-variant-regions $INPUT_PANEL_BED \
"

# For panels, the list of enriched genes should be set. 
# You can select a list of genes or a list of regions in BED format. 
FUSION_OPTIONS="
  --enable-rna-gene-fusion true \
  --rna-gf-enriched-regions $INPUT_PANEL_BED \
"

VARIANT_OPTIONS="
  --enable-variant-caller true \
  --vc-target-bed $INPUT_PANEL_BED
"

# Construct final command line
CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $RNA_MAP_OPTIONS \
  $QUANT_OPTIONS \
  $SPLICE_OPTIONS \
  $FUSION_OPTIONS \
  $VARIANT_OPTIONS 
"

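# Execute. $CMD is intentionally unquoted so that the shell splits it into
# separate arguments; paths therefore must not contain whitespace.
echo $CMD
$CMD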

Additional Notes and Options

Amplicon data

If you are running amplicon data, set --enable-rna-amplicon true --amplicon-target-bed <AMPLICON_BED_PATH>.

If RNA amplicon mode is enabled and the amplicon BED file already includes the gene names, you do not need to set the ENRICH options; DRAGEN reads the enriched gene names from the fifth column of the amplicon BED file.
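
Whether the amplicon BED file carries gene names can be checked before deciding to omit the enriched-regions setting. The sketch below is an illustration only: the function name is hypothetical, and it assumes a tab-separated BED file in which comment, track, and browser lines may be skipped.

```shell
#!/bin/bash
set -euo pipefail

# Succeed only if every record in the BED file has a non-empty fifth column.
# Argument: <amplicon_bed_file>
amplicon_bed_has_gene_names() {
  awk -F '\t' '
    /^#/ || /^track/ || /^browser/ || NF == 0 { next }   # skip headers and blank lines
    NF < 5 || $5 == ""                        { missing = 1; exit }
    END                                       { exit missing + 0 }
  ' "$1"
}
```

If the check fails, keep the enriched genes or regions setting from the FUSION options in the command line.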

SPLICE options

You can provide a list of normal splice variants to reduce noisy calls. The file should be a tab-separated file with the following first four columns:

contig name
first base of the splice junction (1-based)
last base of the splice junction (1-based)
strand (0: undefined, 1: +, 2: -)

Use the optional --rna-splice-variant-normals <SPLICE_NORMAL_FILE_PATH> option to provide the normal splice variants.

#!/bin/bash
set -euo pipefail

# Path to DRAGEN hashtable
DRAGEN_HASH_TABLE=<REF_DIR>

# Path to output directory for the DRAGEN run
OUTPUT=<OUT_DIR>

# File prefix for DRAGEN output files
PREFIX=<OUT_PREFIX>

# Path to VC systematic noise BED file. In tumor-only variant calling, this filter
# is essential for removing systematic noise observed in normal samples. Prebuilt
# systematic noise files are available for download on the DRAGEN Software
# Support Site page. Alternatively, running the somatic TO pipeline on
# normal samples can generate a systematic noise file. We recommend using a
# systematic noise file based on normal samples that match the library prep of
# the tumor samples. A prebuilt systematic noise BED file can be downloaded from
# https://support.illumina.com/sequencing/sequencing_software/dragen-bio-it-platform/product_files.html
VC_SYSTEMATIC_NOISE_FILE=<VC_SYSTEMATIC_NOISE_BED_FILE_PATH>

# Define the input sources, select fastq list, fastq, bam, or cram.
INPUT_FASTQ_LIST="
  --tumor-fastq-list $TUMOR_FASTQ_LIST \
  --tumor-fastq-list-sample-id $TUMOR_FASTQ_LIST_SAMPLE_ID \
"

INPUT_FASTQ="
  --tumor-fastq1 $TUMOR_FASTQ1 \
  --tumor-fastq2 $TUMOR_FASTQ2 \
  --RGSM-tumor $RGSM_TUMOR \
  --RGID-tumor $RGID_TUMOR \
"

INPUT_BAM="
  --tumor-bam-input $TUMOR_BAM \
  --bam-input $BAM \
"

INPUT_CRAM="
  --tumor-cram-input $TUMOR_CRAM \
  --cram-input $CRAM \
"

# Select input source, here in this example we use INPUT_FASTQ_LIST
INPUT_OPTIONS="
  --ref-dir $DRAGEN_HASH_TABLE \
  $INPUT_FASTQ_LIST \
"

OUTPUT_OPTIONS="
  --output-directory $OUTPUT \
  --output-file-prefix $PREFIX \
"

MA_OPTIONS="
  --enable-map-align true \
  --enable-sort true \
"

UMI_OPTIONS="
  --enable-umi true \
  --umi-source $UMI_SOURCE \
  --umi-library-type $UMI_LIBRARY_TYPE \
"

# Use either --vc-enable-umi-solid true or --vc-enable-umi-liquid true, not both.
# This example uses --vc-enable-umi-solid.
# Set REF_TYPE to the annotation assembly: GRCh37 or GRCh38.
SNV_OPTIONS="
  --enable-variant-caller true \
  --vc-enable-umi-solid true \
  --vc-target-bed $VC_TARGET_BED \
  --vc-systematic-noise $VC_SYSTEMATIC_NOISE_FILE \
  --vc-systematic-noise-method mean \
  --vc-enable-germline-tagging true \
  --enable-variant-annotation true \
  --variant-annotation-data $NIRVANA_ANNOTATION_FOLDER \
  --variant-annotation-assembly $REF_TYPE \
"

CNV_OPTIONS="
  --enable-cnv true \
  --cnv-target-bed $CNV_TARGET_BED \
  --cnv-combined-counts $CNV_PANEL_OF_NORMALS \
  --cnv-population-b-allele-vcf $CNV_POP_VCF \
"

# HRD requires enabling CNV
HRD_OPTIONS="
  --enable-hrd=true \
"

SV_OPTIONS="
  --enable-sv true \
  --sv-exome true \
  --sv-call-regions-bed $SV_TARGET_BED \
"

# TMB requires the Nirvana annotation settings.
TMB_OPTIONS="
  --enable-tmb=true \
  --enable-variant-annotation=true \
  --variant-annotation-data=$NIRVANA_ANNOTATION_FOLDER \
  --variant-annotation-assembly=$REF_TYPE \
"

MSI_OPTIONS="
  --msi-command=tumor-only \
  --msi-coverage-threshold=60 \
  --msi-microsatellites-file=$MSI_MICROSATELLITES_FILE \
  --msi-ref-normal-dir=$MSI_REFERENCE_NORMAL_FOLDER \
"

# Enable --hla-enable-class-2 only if the panel has sufficient coverage for
# class II HLA typing.
HLA_OPTIONS="
  --enable-hla=true \
  --hla-as-filter-min-threshold=29.0 \
  --hla-as-filter-ratio-threshold=0.85 \
  --hla-enable-class-2=true \
"

# Construct final command line
CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $MA_OPTIONS \
  $UMI_OPTIONS \
  $SNV_OPTIONS \
  $CNV_OPTIONS \
  $HRD_OPTIONS \
  $SV_OPTIONS \
  $TMB_OPTIONS \
  $MSI_OPTIONS \
  $HLA_OPTIONS \
"

# Execute. $CMD is intentionally unquoted so that the shell splits it into
# separate arguments; paths therefore must not contain whitespace.
echo $CMD
$CMD

#!/bin/bash
set -euo pipefail

# Path to DRAGEN hashtable
DRAGEN_HASH_TABLE=<REF_DIR>

# Path to output directory for the DRAGEN run
OUTPUT=<OUT_DIR>

# File prefix for DRAGEN output files
PREFIX=<OUT_PREFIX>

# Path to VC systematic noise BED file. In tumor-only variant calling, this filter
# is essential for removing systematic noise observed in normal samples. Prebuilt
# systematic noise files are available for download on the DRAGEN Software
# Support Site page. Alternatively, running the somatic TO pipeline on
# normal samples can generate a systematic noise file. We recommend using a
# systematic noise file based on normal samples that match the library prep of
# the tumor samples. A prebuilt systematic noise BED file can be downloaded from
# https://support.illumina.com/sequencing/sequencing_software/dragen-bio-it-platform/product_files.html
VC_SYSTEMATIC_NOISE_FILE=<VC_SYSTEMATIC_NOISE_BED_FILE_PATH>

# Define the input sources, select fastq list, fastq, bam, or cram.
INPUT_FASTQ_LIST="
  --tumor-fastq-list $TUMOR_FASTQ_LIST \
  --tumor-fastq-list-sample-id $TUMOR_FASTQ_LIST_SAMPLE_ID \
"

INPUT_FASTQ="
  --tumor-fastq1 $TUMOR_FASTQ1 \
  --tumor-fastq2 $TUMOR_FASTQ2 \
  --RGSM-tumor $RGSM_TUMOR \
  --RGID-tumor $RGID_TUMOR \
"

INPUT_BAM="
  --tumor-bam-input $TUMOR_BAM \
  --bam-input $BAM \
"

INPUT_CRAM="
  --tumor-cram-input $TUMOR_CRAM \
  --cram-input $CRAM \
"

# Select input source, here in this example we use INPUT_FASTQ_LIST
INPUT_OPTIONS="
  --ref-dir $DRAGEN_HASH_TABLE \
  $INPUT_FASTQ_LIST \
"

OUTPUT_OPTIONS="
  --output-directory $OUTPUT \
  --output-file-prefix $PREFIX \
"

MA_OPTIONS="
  --enable-map-align true \
  --enable-sort true \
  --enable-duplicate-marking true \
"

CNV_OPTIONS="
  --enable-cnv true \
  --cnv-target-bed $CNV_TARGET_BED \
  --cnv-combined-counts $CNV_PANEL_OF_NORMALS \
  --cnv-population-b-allele-vcf $CNV_POP_VCF \
"

# HRD requires enabling CNV
HRD_OPTIONS="
  --enable-hrd=true \
"

SNV_OPTIONS="
  --vc-target-bed $VC_TARGET_BED \
  --vc-systematic-noise $VC_SYSTEMATIC_NOISE_FILE \
  --vc-systematic-noise-method mean \
  --vc-enable-germline-tagging true \
  --enable-variant-annotation true \
  --variant-annotation-data $NIRVANA_ANNOTATION_FOLDER \
  --variant-annotation-assembly $REFERENCE \
"

SV_OPTIONS="
  --enable-sv true \
  --sv-exome true \
  --sv-call-regions-bed $SV_TARGET_BED \
"

SNV_SV_DEDUPLICATION_OPTIONS="
  --enable-variant-deduplication true \
"

# TMB requires the Nirvana annotation settings.
# Set REF_TYPE to the annotation assembly: GRCh37 or GRCh38.
TMB_OPTIONS="
  --enable-tmb=true \
  --enable-variant-annotation=true \
  --variant-annotation-data=$NIRVANA_ANNOTATION_FOLDER \
  --variant-annotation-assembly=$REF_TYPE \
"

MSI_OPTIONS="
  --msi-command=tumor-only \
  --msi-coverage-threshold=60 \
  --msi-microsatellites-file=$MSI_MICROSATELLITES_FILE \
  --msi-ref-normal-dir=$MSI_REFERENCE_NORMAL_FOLDER \
"

# Enable --hla-enable-class-2 only if the panel has sufficient coverage for
# class II HLA typing.
HLA_OPTIONS="
  --enable-hla=true \
  --hla-enable-class-2=true \
"

# Construct final command line
CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $MA_OPTIONS \
  $CNV_OPTIONS \
  $HRD_OPTIONS \
  $SNV_OPTIONS \
  $SV_OPTIONS \
  $SNV_SV_DEDUPLICATION_OPTIONS \
  $TMB_OPTIONS \
  $MSI_OPTIONS \
  $HLA_OPTIONS \
"

# Execute. $CMD is intentionally unquoted so that the shell splits it into
# separate arguments; paths therefore must not contain whitespace.
echo $CMD
$CMD

#!/bin/bash
set -euo pipefail

# Path to DRAGEN hashtable
DRAGEN_HASH_TABLE=<REF_DIR>

# Path to output directory for the DRAGEN run
OUTPUT=<OUT_DIR>

# File prefix for DRAGEN output files
PREFIX=<OUT_PREFIX>

# Population SNP VCF. It can be retrieved from catalogs of population variation
# such as the 1000 genome project or other large cohort discovery efforts.
# Only high-frequency SNPs should be included. A suitable file can be retrieved
# from the GATK resource bundle: 1000G_phase1.snps.high_confidence.vcf.gz
CNV_POP_VCF=<POPULATION_VCF_PATH>

# Path to VC systematic noise BED file. In tumor-only variant calling, this filter
# is essential for removing systematic noise observed in normal samples. Prebuilt
# systematic noise files are available for download on the DRAGEN Software
# Support Site page. Alternatively, running the somatic TO pipeline on
# normal samples can generate a systematic noise file. We recommend using a
# systematic noise file based on normal samples that match the library prep of
# the tumor samples. A prebuilt systematic noise BED file can be downloaded from
# https://support.illumina.com/sequencing/sequencing_software/dragen-bio-it-platform/product_files.html
VC_SYSTEMATIC_NOISE_FILE=<VC_SYSTEMATIC_NOISE_BED_FILE_PATH>

# The Nirvana annotation database is downloadable at
# https://support-docs.illumina.com/SW/DRAGEN_v310/Content/SW/DRAGEN/IAE_DownloadData.htm
NIRVANA_ANNOTATION_FOLDER=<NIRVANA_ANNOTATION_FOLDER_PATH>

# Define the input sources, select fastq list, fastq, bam, or cram.
INPUT_FASTQ_LIST="
  --tumor-fastq-list $TUMOR_FASTQ_LIST \
  --tumor-fastq-list-sample-id $TUMOR_FASTQ_LIST_SAMPLE_ID \
"

INPUT_FASTQ="
  --tumor-fastq1 $TUMOR_FASTQ1 \
  --tumor-fastq2 $TUMOR_FASTQ2 \
  --RGSM-tumor $RGSM_TUMOR \
  --RGID-tumor $RGID_TUMOR \
"

INPUT_BAM="
  --tumor-bam-input $TUMOR_BAM \
"

INPUT_CRAM="
  --tumor-cram-input $TUMOR_CRAM \
"

# Select input source, here in this example we use INPUT_FASTQ_LIST
INPUT_OPTIONS="
  --ref-dir $DRAGEN_HASH_TABLE \
  $INPUT_FASTQ_LIST \
"

OUTPUT_OPTIONS="
  --output-directory $OUTPUT \
  --output-file-prefix $PREFIX \
"

MA_OPTIONS="
  --enable-map-align true \
  --enable-sort true \
  --enable-duplicate-marking true \
"

CNV_OPTIONS="
  --enable-cnv true \
  --cnv-population-b-allele-vcf $CNV_POP_VCF \
"

# HRD requires enabling CNV
HRD_OPTIONS="
  --enable-hrd=true \
"

# Set REF_TYPE to the annotation assembly: GRCh37 or GRCh38.
SNV_OPTIONS="
  --enable-variant-caller true \
  --vc-systematic-noise $VC_SYSTEMATIC_NOISE_FILE \
  --vc-enable-germline-tagging true \
  --enable-variant-annotation true \
  --variant-annotation-data $NIRVANA_ANNOTATION_FOLDER \
  --variant-annotation-assembly $REF_TYPE \
"

SV_OPTIONS="
  --enable-sv true \
  --sv-systematic-noise $SV_SYSTEMATIC_NOISE_BEDPE \
"

SNV_SV_DEDUPLICATION_OPTIONS="
  --enable-variant-deduplication true \
"

TMB_OPTIONS="
  --enable-tmb=true \
"

MSI_OPTIONS="
  --msi-command=tumor-only \
  --msi-coverage-threshold=60 \
  --msi-microsatellites-file=$MSI_MICROSATELLITES_FILE \
  --msi-ref-normal-dir=$MSI_REFERENCE_NORMAL_FOLDER \
"

# Enable --hla-enable-class-2 only if the panel has sufficient coverage for
# class II HLA typing.
HLA_OPTIONS="
  --enable-hla=true \
  --hla-enable-class-2=true \
"

# Construct final command line
CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $MA_OPTIONS \
  $CNV_OPTIONS \
  $SNV_OPTIONS \
  $SV_OPTIONS \
  $SNV_SV_DEDUPLICATION_OPTIONS \
  $HRD_OPTIONS \
  $TMB_OPTIONS \
  $MSI_OPTIONS \
  $HLA_OPTIONS \
"

# Execute. $CMD is intentionally unquoted so that the shell splits it into
# separate arguments; paths therefore must not contain whitespace.
echo $CMD
$CMD

#!/bin/bash
set -euo pipefail

# Path to DRAGEN hashtable
DRAGEN_HASH_TABLE=<REF_DIR>

# Path to output directory for the DRAGEN run
OUTPUT=<OUT_DIR>

# File prefix for DRAGEN output files
PREFIX=<OUT_PREFIX>

# Path to VC systematic noise BED file. In tumor-normal variant calling, this filter
# is recommended for removing systematic noise observed in normal samples. Prebuilt
# systematic noise files are available for download on the DRAGEN Software
# Support Site page. Alternatively, running the somatic TO pipeline on
# normal samples can generate a systematic noise file. We recommend using a
# systematic noise file based on normal samples that match the library prep of
# the tumor samples. A prebuilt systematic noise BED file can be downloaded from
# https://support.illumina.com/sequencing/sequencing_software/dragen-bio-it-platform/product_files.html
VC_SYSTEMATIC_NOISE_FILE=<VC_SYSTEMATIC_NOISE_BED_FILE_PATH>

# Define the input sources, select fastq list, fastq, bam, or cram.
INPUT_FASTQ_LIST="
  --tumor-fastq-list $TUMOR_FASTQ_LIST \
  --tumor-fastq-list-sample-id $TUMOR_FASTQ_LIST_SAMPLE_ID \
  --fastq-list $FASTQ_LIST \
  --fastq-list-sample-id $FASTQ_LIST_SAMPLE_ID \
"

INPUT_FASTQ="
  --tumor-fastq1 $TUMOR_FASTQ1 \
  --tumor-fastq2 $TUMOR_FASTQ2 \
  --RGSM-tumor $RGSM_TUMOR \
  --RGID-tumor $RGID_TUMOR \
  --fastq-file1 $FASTQ1 \
  --fastq-file2 $FASTQ2 \
  --RGSM $RGSM \
  --RGID $RGID \
"

INPUT_BAM="
  --tumor-bam-input $TUMOR_BAM \
  --bam-input $BAM \
"

INPUT_CRAM="
  --tumor-cram-input $TUMOR_CRAM \
  --cram-input $CRAM \
"

# Select input source, here in this example we use INPUT_FASTQ_LIST
INPUT_OPTIONS="
  --ref-dir $DRAGEN_HASH_TABLE \
  $INPUT_FASTQ_LIST \
"

OUTPUT_OPTIONS="
  --output-directory $OUTPUT \
  --output-file-prefix $PREFIX \
"

MA_OPTIONS="
  --enable-map-align true \
  --enable-sort true \
  --enable-duplicate-marking true \
"

CNV_OPTIONS="
  --enable-cnv true \
  --cnv-use-somatic-vc-baf true \
"

# HRD requires enabling CNV
HRD_OPTIONS="
  --enable-hrd=true \
"

SNV_OPTIONS="
  --enable-variant-caller true \
  --vc-systematic-noise $VC_SYSTEMATIC_NOISE_FILE \
"

SV_OPTIONS="
  --enable-sv true \
"

SNV_SV_DEDUPLICATION_OPTIONS="
  --enable-variant-deduplication true \
"

# TMB requires the Nirvana annotation settings.
# Set REF_TYPE to the annotation assembly: GRCh37 or GRCh38.
TMB_OPTIONS="
  --enable-tmb=true \
  --enable-variant-annotation=true \
  --variant-annotation-data=$NIRVANA_ANNOTATION_FOLDER \
  --variant-annotation-assembly=$REF_TYPE \
"

MSI_OPTIONS="
  --msi-command=tumor-normal \
  --msi-coverage-threshold=60 \
  --msi-microsatellites-file=$MSI_MICROSATELLITES_FILE \
"

# Enable --hla-enable-class-2 only if the panel has sufficient coverage for
# class II HLA typing.
HLA_OPTIONS="
  --enable-hla=true \
  --hla-enable-class-2=true \
"

# Construct final command line
CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $MA_OPTIONS \
  $CNV_OPTIONS \
  $SNV_OPTIONS \
  $SV_OPTIONS \
  $SNV_SV_DEDUPLICATION_OPTIONS \
  $HRD_OPTIONS \
  $TMB_OPTIONS \
  $MSI_OPTIONS \
  $HLA_OPTIONS \
"

# Execute. $CMD is intentionally unquoted so that the shell splits it into
# separate arguments; paths therefore must not contain whitespace.
echo $CMD
$CMD

Somatic Tumor Normal with UMI

DRAGEN Recipe - Somatic UMI Tumor Normal

Overview

This recipe is for processing sequencing data with unique molecular identifier (UMI) for somatic tumor normal workflows.

Example Command Line

For Somatic UMI Tumor Normal inputs, the tumor and normal samples must be run separately through the Map/Align stage. Variant Calling is then started from the tumor and normal UMI-collapsed BAMs.

For Map/Align stage:

Configure the INPUT options
Configure the OUTPUT options
Configure MAP/ALIGN
Configure UMI options

For Variant Calling stage:

Configure the INPUT options
Configure the OUTPUT options
Configure the VARIANT CALLERs based on the application
Configure any additional options
Build up the necessary options for each component separately, so that they can be re-used in the final command line.

We recommend using a linear (non-pangenome) reference for somatic analysis. For more details, refer to Dragen Reference Support.

The following are partial templates that can be used as starting points. Adjust them accordingly for your specific use case.

Map/Align stage

#!/bin/bash
set -euo pipefail

# Path to DRAGEN hashtable
DRAGEN_HASH_TABLE=<REF_DIR>

# Path to output directory for the DRAGEN run
OUTPUT=<OUT_DIR>

# File prefix for DRAGEN output files
PREFIX=<OUT_PREFIX>

# Define the input sources, select fastq list, fastq, bam, or cram.
# Select either the tumor or the normal input with UMI to generate the
# collapsed BAM. This example uses the tumor input options.
INPUT_FASTQ_LIST="
  --tumor-fastq-list $TUMOR_FASTQ_LIST \
  --tumor-fastq-list-sample-id $TUMOR_FASTQ_LIST_SAMPLE_ID \
"

INPUT_FASTQ="
  --tumor-fastq1 $TUMOR_FASTQ1 \
  --tumor-fastq2 $TUMOR_FASTQ2 \
  --RGSM-tumor $RGSM_TUMOR \
  --RGID-tumor $RGID_TUMOR \
"

INPUT_BAM="
  --tumor-bam-input $TUMOR_BAM \
"

INPUT_CRAM="
  --tumor-cram-input $TUMOR_CRAM \
"

# Select input source, here in this example we use INPUT_FASTQ_LIST
INPUT_OPTIONS="
  --ref-dir $DRAGEN_HASH_TABLE \
  $INPUT_FASTQ_LIST \
"

OUTPUT_OPTIONS="
  --output-directory $OUTPUT \
  --output-file-prefix $PREFIX \
"

MA_OPTIONS="
  --enable-map-align true \
  --enable-sort true \
"

UMI_OPTIONS="
  --enable-umi true \
  --umi-source $UMI_SOURCE \
  --umi-library-type $UMI_LIBRARY_TYPE \
"

# Construct final command line
CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $MA_OPTIONS \
  $UMI_OPTIONS \
"

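# Execute. $CMD is intentionally unquoted so that the shell splits it into
# separate arguments; paths therefore must not contain whitespace.
echo $CMD
$CMD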

Variant Calling (and optional biomarkers) stage:

#!/bin/bash
set -euo pipefail

# Path to DRAGEN hashtable
DRAGEN_HASH_TABLE=<REF_DIR>

# Path to output directory for the DRAGEN run
OUTPUT=<OUT_DIR>

# File prefix for DRAGEN output files
PREFIX=<OUT_PREFIX>

# Path to VC systematic noise BED file. In tumor-normal variant calling, this filter
# is recommended for removing systematic noise observed in normal samples. Prebuilt
# systematic noise files are available for download on the DRAGEN Software 
# Support Site page. Alternatively, running the somatic TO pipeline on
# normal samples can generate a systematic noise file. We recommend using a
# systematic noise file based on normal samples that match the library prep of
# the tumor samples. A prebuilt systematic noise BED file can be downloaded from
# https://support.illumina.com/sequencing/sequencing_software/dragen-bio-it-platform/product_files.html
VC_SYSTEMATIC_NOISE_FILE=<VC_SYSTEMATIC_NOISE_BED_FILE_PATH>

INPUT_BAM="
  --tumor-bam-input $TUMOR_BAM \
  --bam-input $BAM \
"

INPUT_OPTIONS="
  --ref-dir $DRAGEN_HASH_TABLE \
  $INPUT_BAM \
"

OUTPUT_OPTIONS="
  --output-directory $OUTPUT \
  --output-file-prefix $PREFIX \
"

# Use either --vc-enable-umi-solid true or --vc-enable-umi-liquid true, not both.
# This example uses --vc-enable-umi-solid.
SNV_OPTIONS="
  --enable-variant-caller true \
  --vc-enable-umi-solid true \
  --vc-target-bed $VC_TARGET_BED \
  --vc-systematic-noise $VC_SYSTEMATIC_NOISE_FILE \
  --vc-systematic-noise-method mean \
"

CNV_OPTIONS="
  --enable-cnv true \
  --cnv-target-bed $CNV_TARGET_BED \
  --cnv-combined-counts $CNV_PANEL_OF_NORMALS \
  --cnv-use-somatic-vc-baf true \
"

# HRD requires enabling CNV
HRD_OPTIONS="
  --enable-hrd=true \
"

SV_OPTIONS="
  --enable-sv true \
  --sv-exome true \
  --sv-call-regions-bed $SV_TARGET_BED \
"

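# Nirvana annotation settings are required for TMB.
# $NIRVANA_ANNOTATION_FOLDER is the path to the Nirvana annotation data;
# $REF_TYPE is GRCh37 or GRCh38.
TMB_OPTIONS="
  --enable-tmb=true \
  --enable-variant-annotation=true \
  --variant-annotation-data=$NIRVANA_ANNOTATION_FOLDER \
  --variant-annotation-assembly=$REF_TYPE \
"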

MSI_OPTIONS="
  --msi-command=tumor-normal \
  --msi-coverage-threshold=60 \
  --msi-microsatellites-file=$MSI_MICROSATELLITES_FILE \
"

# Set --hla-enable-class-2=true only if the panel has sufficient coverage
# for class II HLA typing.
HLA_OPTIONS="
  --enable-hla=true \
  --hla-as-filter-min-threshold=29.0 \
  --hla-as-filter-ratio-threshold=0.85 \
  --hla-enable-class-2=true \
"

# Construct final command line
CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $SNV_OPTIONS \
  $CNV_OPTIONS \
  $HRD_OPTIONS \
  $SV_OPTIONS \
  $TMB_OPTIONS \
  $MSI_OPTIONS \
  $HLA_OPTIONS \
"

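# Execute
# $CMD is deliberately left unquoted: word splitting collapses the embedded
# newlines and passes each option to dragen as a separate argument.
echo $CMD
$CMD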
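The templates above assemble the command as one whitespace-delimited string, which stops working as soon as a path contains a space. As a sketch of an alternative, the same options can be collected in bash arrays so that every value stays a single argument. All paths below are hypothetical placeholders, and only options already shown in the templates are used.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical placeholder values; substitute real paths for an actual run.
DRAGEN_HASH_TABLE="/path/to/ref dir"   # contains a space on purpose
OUTPUT="/path/to/out"
PREFIX="sample1"
TUMOR_BAM="/path/to/tumor.bam"
NORMAL_BAM="/path/to/normal.bam"

# Each option and each value is its own array element, so no value is
# split again on whitespace when the command is run.
INPUT_OPTIONS=(
  --ref-dir "$DRAGEN_HASH_TABLE"
  --tumor-bam-input "$TUMOR_BAM"
  --bam-input "$NORMAL_BAM"
)
OUTPUT_OPTIONS=(
  --output-directory "$OUTPUT"
  --output-file-prefix "$PREFIX"
)
SNV_OPTIONS=(
  --enable-variant-caller true
)

# Construct the final command line as an array.
CMD=(dragen "${INPUT_OPTIONS[@]}" "${OUTPUT_OPTIONS[@]}" "${SNV_OPTIONS[@]}")

# Print the command with shell quoting so it can be inspected or copied.
printf '%q ' "${CMD[@]}"; echo

# Execute (left commented out so the sketch runs without DRAGEN installed):
# "${CMD[@]}"
```

With arrays, comments can also sit between options without becoming part of the command, which is not possible inside a quoted string.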

Additional Notes and Options

Optional settings for each component are listed below. For the full option list, see this page.

UMI

Option

Description

SNV

Option

Description

SNV library specific settings

Option

FFPE

SNV systematic noise file

Generic SNV noise files can be downloaded here: DRAGEN Software Support Site page

However, for UMI and panels, it is strongly recommended to build a custom systematic noise file as follows:

Step 1. Run DRAGEN somatic tumor-only on each of approximately 20-50 normal samples:

### Choose ONE input source and comment out the other.
### i) BAM
INPUT="--tumor-bam-input ${NORMAL_BAM}"
### ii) FASTQs
# INPUT="--tumor-fastq-list ${NORMAL_FASTQ_LIST} \
#   --tumor-fastq-list-sample-id ${NORMAL_FASTQ_LIST_SAMPLE_ID}"
###

# --vc-detect-systematic-noise-mode=UMI detects the ultra-low noise levels relevant for UMI panels.
# ${REF_TYPE} is GRCh37 or GRCh38.
dragen \
-r ${REFERENCE} \
${INPUT} \
--vc-detect-systematic-noise=true \
--vc-detect-systematic-noise-mode=UMI \
--vc-enable-germline-tagging=true \
--enable-variant-annotation=true \
--variant-annotation-data ${NIRVANA_ANNOTATION_FOLDER} \
--variant-annotation-assembly ${REF_TYPE} \
--intermediate-results-dir ${TMP} \
--output-directory ${DIR} \
--output-file-prefix ${PREFIX}

Gather the full paths to the VCFs from step 1 into ${VCF_LIST}, one file per line.
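One way to assemble that list is sketched below. The helper name write_vcf_list and the file name pattern are illustrative assumptions, not part of DRAGEN; choose a pattern that matches exactly one VCF per normal sample in your step 1 output.

```shell
#!/bin/bash
set -euo pipefail

# write_vcf_list DIR PATTERN LIST_FILE
# Writes the absolute path of every file under DIR whose name matches PATTERN
# to LIST_FILE, one path per line, sorted for reproducibility.
# (Hypothetical helper for illustration only.)
write_vcf_list() {
  local dir pattern list_file
  dir=$(cd "$1" && pwd)   # resolve DIR to an absolute path
  pattern=$2
  list_file=$3
  find "$dir" -type f -name "$pattern" | sort > "$list_file"
}
```

For example, `write_vcf_list "${DIR}" '*.vcf.gz' "${VCF_LIST}"` would collect every file ending in .vcf.gz below ${DIR}. If step 1 writes more than one VCF per sample, a narrower pattern is needed.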

Step 2. Generate the final noise file with:

# --build-sys-noise-method=mean sets the default noise mode for this noise file
# by tagging the noise file header with '##NoiseMethod=mean'.
dragen \
-r ${REF_DIR} \
--build-sys-noise-vcfs-list ${VCF_LIST} \
--build-sys-noise-method=mean \
--output-directory ${DIR} \
--output-file-prefix ${PREFIX}

To download a SINE/ALU regions bed for SNV excluded regions

The ALU bed file can be downloaded as part of the Bed File Collection: DRAGEN Software Support Site page

CNV

Please include the matched normal sample in the CNV panel of normals.

Option

Description

Generating Panel of Normals (PON)

Somatic WES CNV requires PON files. Follow the two steps below to generate the CNV PON:

Target counts generation (per normal sample): Generate target counts for each individual normal sample to serve as the baseline. Any options used for panel of normals generation (BED file, GC bias correction, etc.) must match the options used when processing the case sample.

CNV_PON_OPTIONS="
  --enable-cnv true \
  --cnv-target-bed $CNV_TARGET_BED \
"

CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $CNV_PON_OPTIONS \
"

Combined counts generation: Merge the individual PON counts into a single <prefix>.combined.counts.txt.gz file.

CNV_COMBINED_COUNTS_OPTIONS="
  --enable-cnv true \
  --cnv-generate-combined-counts true \
  --cnv-normals-list $CNV_NORMALS_LIST \
"

CMD="
  dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $CNV_COMBINED_COUNTS_OPTIONS \
"

For more information, see Panel of Normals.
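Before running the combined counts step, it can help to confirm that every entry in $CNV_NORMALS_LIST points to a file that exists. The sketch below assumes the list holds one target counts path per line; the helper name check_list_file is hypothetical and not part of DRAGEN.

```shell
#!/bin/bash
set -euo pipefail

# check_list_file LIST_FILE
# Returns 0 if every non-empty line of LIST_FILE names an existing, non-empty
# file. Otherwise prints each offending path to stderr and returns 1.
# (Hypothetical helper for illustration only.)
check_list_file() {
  local list_file=$1 path status=0 count=0
  # The "|| [ -n ... ]" clause also handles a last line without a newline.
  while IFS= read -r path || [ -n "$path" ]; do
    if [ -z "$path" ]; then
      continue
    fi
    count=$((count + 1))
    if [ ! -s "$path" ]; then
      echo "missing or empty: $path" >&2
      status=1
    fi
  done < "$list_file"
  echo "checked $count entries in $list_file" >&2
  return "$status"
}
```

A call such as `check_list_file "$CNV_NORMALS_LIST" || exit 1` placed ahead of the dragen command stops the script early when a listed file is absent. The same check applies to any other list file with one path per line.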

SV

Option

Description

TMB library specific settings

Option

Description

Solid

Liquid

MSI

Microsatellite sites file

Microsatellite sites file can be downloaded here: DRAGEN Software Support Site page

MSI library specific settings

Option

Description

Solid

Liquid (cfDNA)

HLA

Option

Description