Tumor Mutational Burden

DRAGEN supports Tumor Mutational Burden (TMB) calculation in Tumor-Only (T/O) or Tumor-Normal (T/N) mode.

In T/O mode, germline variants must be identified and filtered using database information and, optionally, allele frequency information. These germline filtering techniques are generally not as accurate as tumor-normal subtraction. When only databases are used to subtract germline variants, the TMB may be slightly higher than the more accurate T/N estimate. When both database and allele frequency information are used to remove germline variants, the TMB may be slightly underestimated for high purity tumor samples.

TMB Processing Flow

DRAGEN TMB comprises the following steps:

1. Variant calling

Please refer to "Somatic mode" for detailed variant calling options.

2. Eligible region detection

TMB is computed over protein coding regions with sufficient coverage. If DRAGEN detects an hg19, hg38, GRCh37, GRCh38, or hs37d5 reference, it automatically selects the appropriate coding regions from the BED files available in "<INSTALL_PATH>/resources/tmb/". By default, the coverage threshold for eligible regions is 50.

The protein coding region BED file and the coverage settings can be specified explicitly using the qc-coverage options listed below in [QC coverage settings to override the default eligible region]. If DRAGEN does not automatically detect the reference, these settings must be specified.
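As an illustration of how an eligible region size could be derived, the following is a minimal sketch and not DRAGEN's implementation. It assumes a hypothetical input in which each coding interval already has a single coverage value, and it assumes that an interval is eligible when its coverage is greater than or equal to the threshold. The function name `eligible_region_mbp` and the example intervals are invented for illustration.

```python
from typing import Iterable, Tuple

# Hypothetical input: (chrom, start, end, coverage) per coding interval.
# Coordinates follow the BED convention (0-based, half-open).
Interval = Tuple[str, int, int, float]


def eligible_region_mbp(intervals: Iterable[Interval],
                        min_coverage: float = 50.0) -> float:
    """Sum the lengths of intervals meeting the coverage threshold, in Mbp."""
    total_bp = 0
    for _chrom, start, end, coverage in intervals:
        # Assumption: "sufficient coverage" means coverage >= threshold.
        if coverage >= min_coverage:
            total_bp += end - start
    return total_bp / 1_000_000


# Made-up example intervals, not real data.
example = [
    ("chr1", 1_000, 501_000, 80.0),    # 500,000 bp, above threshold
    ("chr1", 600_000, 900_000, 30.0),  # 300,000 bp, below threshold
    ("chr2", 0, 1_500_000, 50.0),      # 1,500,000 bp, exactly at threshold
]
print(eligible_region_mbp(example))  # 2.0
```

Raising the threshold shrinks the eligible region, which is why a high coverage setting such as 1000 only makes sense for deeply sequenced samples.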

3. Variant filters

The following variants are excluded from the TMB calculation:

  • Non-PASS variants

  • Mitochondrial variants

  • MNVs

  • Variants that do not meet the minimum depth (DP) threshold. Use the --vc-callability-tumor-thresh command line option to specify the threshold value.

  • Variants that do not meet the minimum variant allele frequency (VAF) threshold. Use the --tmb-vaf-threshold command line option to specify the threshold value.

  • Variants that fall outside the eligible regions.

  • Tumor driver mutations. Variants with a COSMIC count ≥ 50 are treated as tumor driver mutations. You can specify the COSMIC driver threshold using the --tmb-cosmic-count-threshold command line option. The tumor driver mutations filter relies on Nirvana annotations and additionally requires settings for --enable-variant-annotation=true, --variant-annotation-assembly, and --variant-annotation-data.
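The exclusion rules above can be read as a single predicate. The sketch below is illustrative only: the `Variant` record, its field names, and the mitochondrial contig names are assumptions, and the driver check uses the COSMIC observation count described for --tmb-cosmic-count-threshold. Germline filtering (step 4) is deliberately left out.

```python
from dataclasses import dataclass

MITO_CONTIGS = {"chrM", "MT"}  # assumed mitochondrial contig names


@dataclass
class Variant:
    """Hypothetical per-variant record; field names are illustrative."""
    chrom: str
    filter_status: str        # VCF FILTER column
    is_mnv: bool
    depth: int                # DP
    vaf: float
    in_eligible_region: bool
    cosmic_count: int = 0


def counts_toward_tmb(v: Variant, min_depth: int = 50, min_vaf: float = 0.05,
                      cosmic_threshold: int = 50) -> bool:
    """Return True if the variant survives every exclusion rule."""
    if v.filter_status != "PASS":
        return False                      # non-PASS variant
    if v.chrom in MITO_CONTIGS:
        return False                      # mitochondrial variant
    if v.is_mnv:
        return False                      # MNV
    if v.depth < min_depth:
        return False                      # below minimum depth
    if v.vaf < min_vaf:
        return False                      # below minimum VAF
    if not v.in_eligible_region:
        return False                      # outside eligible regions
    if v.cosmic_count >= cosmic_threshold:
        return False                      # treated as a driver mutation
    return True


# Made-up example variants.
kept = Variant("chr7", "PASS", False, 120, 0.12, True, cosmic_count=3)
driver = Variant("chr7", "PASS", False, 300, 0.40, True, cosmic_count=900)
print(counts_toward_tmb(kept), counts_toward_tmb(driver))  # True False
```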

4. Support for germline variants

By default, germline variants are not counted towards TMB. Variants are classified as germline based on a database filter or a proxi filter. The database germline filter can be disabled with --tmb-skip-db-filter. Disabling the database germline filter also effectively disables the germline proxi filter.

Database filter

  • Variants with a population allele count ≥ 50 that are observed in either the 1000 Genomes or gnomAD database are marked as germline. Use --germline-tagging-db-threshold to change the population allele count threshold. The database germline filter relies on Nirvana annotations and requires settings for --enable-variant-annotation=true, --variant-annotation-assembly, and --variant-annotation-data.
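A minimal sketch of the database decision follows, combining the allele count threshold with the two restrictions documented under the optional settings (--tmb-germline-max-cosmic-count and --tmb-germline-min-vaf). The function and parameter names are hypothetical, and treating "either database" as the larger of the two allele counts is an assumption rather than documented behavior.

```python
def is_db_germline(gnomad_allele_count: int,
                   onekg_allele_count: int,
                   cosmic_count: int = 0,
                   vaf: float = 1.0,
                   db_threshold: int = 50,
                   max_cosmic_count: int = 0,
                   min_vaf: float = 0.0) -> bool:
    """Decide whether a variant would be tagged germline by the database filter."""
    # Assumption: a hit in either database is enough, so use the larger count.
    if max(gnomad_allele_count, onekg_allele_count) < db_threshold:
        return False
    # Restriction 1: variants frequently seen in COSMIC are never germline.
    # A value of 0 disables the restriction.
    if max_cosmic_count > 0 and cosmic_count > max_cosmic_count:
        return False
    # Restriction 2: variants with a low VAF are never germline.
    # A value of 0 disables the restriction.
    if min_vaf > 0 and vaf < min_vaf:
        return False
    return True


print(is_db_germline(gnomad_allele_count=1200, onekg_allele_count=0))  # True
print(is_db_germline(gnomad_allele_count=12, onekg_allele_count=3))    # False
```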

Proxi filter

  • The proxi filter can be enabled with --tmb-enable-proxi-filter. It flags any variant with VAF > 0.9 as germline. It also scans the variants surrounding a given variant and identifies those with similar VAFs. The proxi window size, which determines the number of surrounding variants, can be specified with --tmb-proxi-window-size. If at least 95% (default value of --tmb-proxi-fraction-threshold) and no fewer than 5 (--tmb-proxi-count-threshold) of the surrounding variants of similar VAF are germline, the current variant is also marked as germline.

  • The proxi filter can also use a probabilistic approach, enabled with --tmb-enable-prob-proxi. It estimates the expected germline allele frequency from the surrounding germline variants and then tests whether the allele frequency of the target variant is similar to that expected value. The P value threshold can be set with --tmb-prob-proxi-p-value (the default value of 1e-15 is intended for ultra-deep sequenced samples, e.g. cfDNA).

  • Note that proxi filters can be too aggressive for 100% pure cell lines. The probabilistic proxi filter can be problematic for mixed or contaminated samples, because these samples do not have clear germline variant allele frequency distributions.
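The non-probabilistic proxi rule can be sketched as follows. This is one possible reading of the description, not DRAGEN's code: the definition of "similar VAF" is not given in this guide, so the `vaf_tolerance` parameter is purely hypothetical, and the input is assumed to be a position-sorted list of (VAF, database-germline flag) pairs.

```python
from typing import List, Tuple

# Hypothetical input: (vaf, is_db_germline) per variant, sorted by position.
Call = Tuple[float, bool]


def proxi_germline(calls: List[Call], index: int,
                   window_size: int = 500,
                   fraction_threshold: float = 0.95,
                   count_threshold: int = 5,
                   high_vaf: float = 0.9,
                   vaf_tolerance: float = 0.05) -> bool:
    """Return True if the variant at `index` would be treated as germline."""
    vaf, db_germline = calls[index]
    if db_germline or vaf > high_vaf:
        return True
    # Take up to `window_size` variants before and after the target.
    start = max(0, index - window_size)
    neighbours = calls[start:index] + calls[index + 1:index + 1 + window_size]
    # Assumption: "similar VAF" means within a fixed absolute tolerance.
    similar = [is_germ for (v, is_germ) in neighbours
               if abs(v - vaf) <= vaf_tolerance]
    germline_similar = sum(similar)
    if germline_similar < count_threshold:
        return False
    return germline_similar / len(similar) >= fraction_threshold


# Made-up example: ten heterozygous germline calls around one unflagged call
# with a matching VAF, plus one low-VAF call.
calls = [(0.5, True)] * 6 + [(0.51, False)] + [(0.5, True)] * 4 + [(0.2, False)]
print(proxi_germline(calls, 6))   # True: neighbours of similar VAF are germline
print(proxi_germline(calls, 11))  # False: no neighbours of similar VAF
```

The example shows why the filter can be too aggressive at high tumor purity: a clonal somatic variant whose VAF is close to 0.5 looks just like its heterozygous germline neighbours.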

CH filter

  • When processing ctDNA samples, it may be beneficial to also remove CH (clonal hematopoiesis) variants. Circulating tumor DNA generally has a shorter fragment size, so CH variants can be identified based on the insert size of the reads supporting the call. To capture the insert-size distribution for each variant call, --vc-log-insert-size must be specified during variant calling (step 1). Once specified, potential CH variants identified from the insert size distribution are labeled in the output. Additionally, CH variants can be labeled using a BED file supplied to --tmb-ch-bed. Non-germline variants overlapping the regions in that file are labeled as CH.

5. Support for nonsynonymous variants

Nonsynonymous consequences are detected based on the Nirvana annotations. Variants that Nirvana annotates with any of the following consequences are labeled as nonsynonymous:

  • feature_elongation, feature_truncation, frameshift_variant, incomplete_terminal_codon_variant, inframe_deletion, inframe_insertion, missense_variant, protein_altering_variant, splice_acceptor_variant, splice_donor_variant, start_lost, stop_gained, stop_lost, transcript_truncation
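The consequence list above maps naturally onto a set lookup. The sketch below assumes a variant is nonsynonymous when any of its annotated consequences appears in the list; the function name and that any-match rule are illustrative assumptions.

```python
NONSYNONYMOUS_CONSEQUENCES = frozenset({
    "feature_elongation",
    "feature_truncation",
    "frameshift_variant",
    "incomplete_terminal_codon_variant",
    "inframe_deletion",
    "inframe_insertion",
    "missense_variant",
    "protein_altering_variant",
    "splice_acceptor_variant",
    "splice_donor_variant",
    "start_lost",
    "stop_gained",
    "stop_lost",
    "transcript_truncation",
})


def is_nonsynonymous(consequences) -> bool:
    """True if any annotated consequence is in the nonsynonymous list."""
    # Assumption: a single matching consequence is sufficient.
    return any(c in NONSYNONYMOUS_CONSEQUENCES for c in consequences)


print(is_nonsynonymous(["missense_variant"]))    # True
print(is_nonsynonymous(["synonymous_variant"]))  # False
```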

TMB outputs a tmb.trace.csv file with detailed information on each variant used in the TMB score. The trace file contains a "Nonsynonymous" column that indicates the status of each variant.

The subset of filtered variants that are nonsynonymous is reported as the "Filtered Nonsyn Variant Count" metric and is used as the numerator of the nonsynonymous TMB.

6. TMB calculation

  • TMB = Filtered Variants / Eligible Region (Mbp)

  • Nonsynonymous TMB = Filtered Nonsynonymous Variants / Eligible Region (Mbp)
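A worked example of the two formulas, using made-up counts and a made-up eligible region size. The function name `tmb_score` is hypothetical.

```python
def tmb_score(variant_count: int, eligible_region_mbp: float) -> float:
    """Variants per megabase of eligible region."""
    if eligible_region_mbp <= 0:
        raise ValueError("eligible region must be greater than 0 Mbp")
    return variant_count / eligible_region_mbp


# Hypothetical sample: 350 filtered variants, 210 of them nonsynonymous,
# over a 35.0 Mbp eligible region.
tmb = tmb_score(350, 35.0)
nonsyn_tmb = tmb_score(210, 35.0)
print(tmb, nonsyn_tmb)  # 10.0 6.0
```

Because the nonsynonymous variants are a subset of the filtered variants, the nonsynonymous TMB can never exceed the TMB.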

7. Maximum somatic allele frequency (MSAF)

The maximum somatic allele frequency (MSAF) is an estimate of the highest somatic allele frequency in the sample. It is obtained by finding the confident somatic variants with the highest allele frequency. MSAF is a rough approximation of the tumor fraction of cfDNA in peripheral blood samples. MSAF mode can be enabled with --tmb-enable-msaf.
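The selection logic can be sketched as below, using the --tmb-msaf-p-value and --tmb-msaf-rank-num settings from the optional settings. Several details are assumptions, not documented behavior: that a variant is "confident" when its insert-size P value is at or below the threshold, that the fallback ranks variants by descending VAF with a 1-based rank, and that nothing is reported when fewer variants than the rank exist.

```python
from typing import List, Optional, Tuple

# Hypothetical input: (vaf, insert_size_p_value) per non-germline variant.
Candidate = Tuple[float, float]


def estimate_msaf(candidates: List[Candidate],
                  max_p_value: float = 1e-5,
                  rank_num: int = 4) -> Optional[float]:
    """Pick the highest VAF among confident somatic variants, with a fallback."""
    # Assumption: confident means P value <= threshold.
    confident = [vaf for vaf, p in candidates if p <= max_p_value]
    if confident:
        return max(confident)
    # Assumed fallback: the variant at the given rank when sorted by
    # descending VAF (rank 1 is the highest VAF).
    ranked = sorted((vaf for vaf, _ in candidates), reverse=True)
    if len(ranked) >= rank_num:
        return ranked[rank_num - 1]
    return None


# Made-up example values.
print(estimate_msaf([(0.1, 1e-6), (0.3, 1e-7), (0.5, 0.2)]))  # 0.3
```

Falling back to a lower ranked variant rather than the single highest VAF would make the estimate less sensitive to one outlier call, which may be the motivation for the default rank of 4.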

Command-Line Options

[Required]

  • --enable-tmb true Enables TMB. If set, the small variant caller, Illumina Annotation Engine, and the related callability report are enabled.

[Recommended]

| Setting | Description | Tumor-Normal Panel/WES/WGS | Tumor-Only Panel/WES/WGS | High coverage bTMB |
| --- | --- | --- | --- | --- |
| --vc-callability-tumor-thresh | The minimum coverage for usable coding regions | 50 (default) | 50 (default) | 1000 (not default) |
| --tmb-vaf-threshold | Variant minimum allele frequency for usable variants | 0.05 (default) | 0.05 (default) | 0.002 (not default) |
| --tmb-cosmic-count-threshold | Number of observations in COSMIC for a variant to be considered a driver mutation | 50 (default) | 50 (default) | 50 (default) |
| --tmb-skip-db-filter | Do not use Nirvana database to filter germline variants | TRUE (default:T/N) | FALSE (default:T/O) | FALSE (not default) |
| --tmb-enable-proxi-filter | Use allele frequency information to filter germline variants | OPTIONAL (default is FALSE) | FALSE (not default) | TRUE (not default) |

[QC coverage settings to override the default eligible region]

The protein coding region and the coverage settings can be specified explicitly using the qc-coverage options listed below. All four settings must be specified to override the defaults. If DRAGEN does not automatically detect the reference, these settings must be specified.

  • --qc-coverage-region-1 Specify the coding regions bed file to use.

  • --qc-coverage-tag-1=tmb Required to associate these coverage settings with TMB. If this setting is not specified then DRAGEN will revert to default coding regions.

  • --vc-callability-tumor-thresh Specify the somatic_callable bed minimum threshold. This limits the regions over which TMB is computed (default is 50).

  • --qc-coverage-reports-1=callability. The callability report is required whenever it is desired to override the default TMB coverage settings.
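Since all four settings must be supplied together, it can help to generate them in one place. The helper below is a hypothetical convenience sketch that only assembles the TMB-related arguments named in this section; it is not a complete DRAGEN command line (reference, input, output, and annotation options are omitted), and the BED path is a placeholder.

```python
from typing import List


def tmb_coverage_override_args(coding_bed: str,
                               min_coverage: int = 50) -> List[str]:
    """Build the TMB arguments that override the default eligible region."""
    if not coding_bed:
        raise ValueError("a coding region BED file is required")
    return [
        "--enable-tmb", "true",
        # The four settings that must be specified together:
        "--qc-coverage-region-1", coding_bed,
        "--qc-coverage-tag-1=tmb",
        "--vc-callability-tumor-thresh", str(min_coverage),
        "--qc-coverage-reports-1=callability",
    ]


# Placeholder path, not a file shipped with DRAGEN.
args = tmb_coverage_override_args("/path/to/coding_regions.bed")
print(" ".join(args))
```

Keeping the four options in a single function makes it harder to omit the tag, in which case DRAGEN reverts to the default coding regions.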

[Optional settings]

  • --tmb-vaf-threshold Specify the minimum VAF threshold for a variant. Variants that do not meet the threshold are filtered out (default=0.05)

  • --tmb-cosmic-count-threshold The minimum number of observations in cosmic for variant to be considered a driver mutation. Driver mutations are not counted in TMB. This setting has very little impact on WES/WGS, but can help avoid bias in small panels (default=50)

  • --tmb-skip-db-filter Skip database germline filtering. The database germline filter is required for tumor-only samples, but can be skipped for tumor-normal (default=false)

  • --germline-tagging-db-threshold Specify the minimum allele count (total number of observations) for an allele in gnomAD or 1000 Genomes to be considered a germline variant. Variant calls with the same position and allele are excluded from the TMB calculation (default=50)

  • --tmb-germline-max-cosmic-count Restrict the db-filter. Variants with cosmic allele count higher than this threshold will never be marked as germline. Set to 0 to disable. (default=0, range=[0;1000]).

  • --tmb-germline-min-vaf Restrict the db-filter. Variants with a variant allele frequency lower than this threshold will never be marked as germline. Set to 0 to disable. (default=0, range=[0;1])

  • --tmb-enable-proxi-filter Enable proxi filter functionality in germline filtering. This is an optional feature that may be appropriate for T/O runs. In T/O mode, the DB germline filter may not be able to detect all germline variants, especially for ethnicity groups that are not well represented in germline databases. The proxi filter uses allele frequency information to help remove germline variants missed by the DB, and can help to obtain more accurate (lower) TMB values on samples with low tumor purity. In samples with high tumor purity, this filter may be too aggressive and mark some somatic variants as germline, resulting in TMB scores that are too low. (default=false)

  • --tmb-proxi-count-threshold Proxi filter surrounding variant count threshold in germline filtering (default=5)

  • --tmb-proxi-fraction-threshold Proxi filter surrounding variant db filter fraction threshold in germline filtering (default=0.95)

  • --tmb-proxi-window-size Number of surrounding variants before and after the target variant for proxi filter (default=500)

  • --tmb-ch-bed Variants in the region will be labeled as clonal hematopoiesis (CH) variants.

  • --tmb-ch-insert-p-value Minimum P value to classify a variant as CH using insert size (default=0.1)

  • --tmb-ch-insert-min-len Minimum fragment size to test for CH using insert size (default=100)

  • --tmb-ch-insert-max-len Maximum fragment size to test for CH using insert size (default=200)

  • --tmb-ch-insert-min-num Minimum number of fragment size records required to test for CH using insert size (default=50)

  • --tmb-enable-msaf Enable MSAF output (default=false)

  • --tmb-msaf-p-value Maximum P value (from insert size) to call confident somatic variant (default=1e-5)

  • --tmb-msaf-rank-num If no confident somatic variant is found, the specified ranked variant is used (default=4)

Output

The TMB values are output to <output prefix>.tmb.metrics.csv. The file format uses the following CSV column convention, similar to other metric CSV files.

| Metric | Definition |
| --- | --- |
| Eligible Region (Mbp) | The specified custom regions (in Mbp) that meet the minimum coverage threshold. |
| Filtered Variant Count | Remaining variants after variant and germline filters. |
| Filtered Nonsyn Variant Count | Subset of filtered variants that are nonsynonymous. |
| TMB | Filtered variants normalized by the eligible regions (Mbp). |
| Nonsyn TMB | Filtered nonsynonymous variants normalized by the eligible regions (Mbp). |

The TMB module also outputs a tmb.trace.csv file that provides detailed information on each variant that was included in the TMB calculation.

When enabling MSAF, the information is output to <output prefix>.tmb.msaf.csv.


--enable-variant-annotation=true, --variant-annotation-assembly, and --variant-annotation-data enable Nirvana, the Illumina Annotation Engine. For more information on selecting the correct assembly and downloading reference files, see Illumina Annotation Engine.