Quality Control

The software calculates several quality control metrics for runs and samples.

These metrics and guidelines apply to DRAGEN TSO 500 v2.1 and above.

Run QC

The Run Metrics section of the metrics output report provides sequencing run quality metrics along with suggested values to determine whether they are within an acceptable range. The overall percentage of reads passing filter is compared to a minimum threshold. For Read 1 and Read 2, the average percentage of bases ≥ Q30 is also compared to a minimum threshold. The Q-score is a prediction of the probability of an incorrect base call. The following tables show run metric and quality threshold information for different systems.
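For context, Q-scores conventionally follow the Phred scale, Q = -10 × log10(P), where P is the estimated probability of an incorrect base call. The following is a minimal sketch of that relationship and of a percent-of-bases ≥ Q30 calculation. The function names are illustrative and are not part of the DRAGEN software.

```python
import math

def q_to_error_prob(q):
    """Phred scale: estimated probability that a base with score q is wrong."""
    return 10 ** (-q / 10)

def error_prob_to_q(p):
    """Inverse conversion: Q-score for an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)

def pct_bases_at_least_q30(scores):
    """Percentage of base calls whose Q-score is 30 or higher."""
    return 100.0 * sum(q >= 30 for q in scores) / len(scores)

# Q30 corresponds to roughly 1 expected error per 1,000 base calls.
print(q_to_error_prob(30))
print(pct_bases_at_least_q30([30, 35, 20, 40]))
```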

The values in the Run Metrics section are listed as NA in the following situations:

If the analysis was started from FASTQ files.
If the analysis was started from BCL files and the InterOp files are missing or corrupt.

NextSeq 500/550 or NextSeq 550Dx (RUO)

Metric | Description | Recommended Guideline Quality Threshold | Variant Class

NovaSeq 6000 or NovaSeq 6000Dx (RUO)

Metric | Description | Recommended Guideline Quality Threshold | Variant Class

NextSeq 1000/2000

Metric | Description | Recommended Guideline Quality Threshold | Variant Class

NovaSeq X

Metric | Description | Recommended Guideline Quality Threshold | Variant Class

DNA Sample QC

DRAGEN TruSight Oncology 500 uses QC metrics to assess the validity of analysis for DNA libraries that pass contamination quality control. If the library fails one or more quality metrics, then the corresponding variant type or biomarker is not reported, and the associated QC category in the report header displays FAIL. Additionally, a companion diagnostic result may not be available if it relies on QC passing for one or more of the following QC categories.

DNA library QC results are available in the MetricsOutput.tsv file.

Metric | Description | Recommended Guideline Quality Threshold | Variant Class

RNA Sample QC

The input for RNA Library QC is RNA alignment. Metrics and guideline thresholds can be found in the MetricsOutput.tsv file.

Metric | Description | Recommended Guideline Quality Threshold | Variant Classes

*TOTAL_ON_TARGET_READS is the only QC metric with guidelines specific to chemistry (v1 vs. v2 assay); all other guidelines apply to both.

** To avoid failing RNA samples unnecessarily, Illumina does not recommend a universal threshold for GENE_MEDIAN_COVERAGE to determine RNA sample quality. RNA expression varies significantly across tissue types, and the small panel size (55 genes) makes normalization challenging. Tissue-specific thresholds could be considered for normalization.

DNA Expanded Metrics

DNA expanded metrics are provided for information only. They can be informative for troubleshooting but are provided without explicit specification limits and are not directly used for sample quality control. For additional guidance, contact Illumina Technical Support.

Metric | Description | Troubleshooting

RNA Expanded Metrics

RNA expanded metrics are provided for information only. They can be informative for troubleshooting but are provided without explicit specification limits and are not directly used for sample quality control. For additional guidance, contact Illumina Technical Support.

Metric | Description | Units
PCT_CHIMERIC_READS | Percentage of reads that are aligned as two segments that map to nonconsecutive regions in the genome. |
PCT_ON_TARGET_READS | Percentage of reads that cross any part of the target region versus total reads. A read that partially maps to a target region is counted as on target. |

Contamination

The contamination score evaluates the presence of sample-to-sample contamination. The algorithm uses common germline SNPs in the homozygous state, which are expected to have variant allele frequencies (VAFs) of 0% or 100%. In contaminated samples, the VAFs shift away from the expected values, allowing the detection of sample-to-sample contamination.

The contamination score can detect sample-to-sample contamination greater than or equal to 2% (more than 2% of DNA input is coming from the non-source sample).

Contamination Score Calculation

The contamination score is calculated using the SNP error file and Pileup file that are generated during the small variant calling, as well as the TMB trace file. The algorithm includes the following steps:

All positions that overlap with a predefined set of common SNPs that have variant allele frequencies of < 25% or > 75% are collected (only SNPs are considered; indels are excluded).
Variants in CNV events are removed using a clustering method.
The likelihood that the positions are an error or a real mutation is calculated by:

CONTAMINATION_SCORE = sum(log10(P(v_i is False Positive)))
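The first step above can be sketched as a simple filter. This is an illustration only: the record layout (`ref`, `alt`, `vaf` keys, with VAF in percent) and the function names are assumptions, and the CNV clustering step and the error model behind P(v_i is False Positive) are not reproduced because the text does not describe them.

```python
def select_homozygous_like(positions, low=25.0, high=75.0):
    """Keep SNP positions whose VAF (percent) is below `low` or above `high`.

    Each position is a dict with hypothetical keys "ref", "alt", and "vaf".
    Indels (ref or alt longer than one base) are excluded.
    """
    selected = []
    for pos in positions:
        is_snp = len(pos["ref"]) == 1 and len(pos["alt"]) == 1
        if is_snp and (pos["vaf"] < low or pos["vaf"] > high):
            selected.append(pos)
    return selected

def shift_from_expected(vaf):
    """Distance in percentage points from the nearest expected value (0% or 100%)."""
    return min(vaf, 100.0 - vaf)

# Made-up example records, for illustration only.
example = [
    {"ref": "A", "alt": "G", "vaf": 3.0},    # kept: near 0%
    {"ref": "C", "alt": "T", "vaf": 48.0},   # dropped: heterozygous range
    {"ref": "G", "alt": "GA", "vaf": 99.0},  # dropped: indel
    {"ref": "T", "alt": "C", "vaf": 96.5},   # kept: near 100%
]
kept = select_homozygous_like(example)
print([shift_from_expected(p["vaf"]) for p in kept])
```

In an uncontaminated sample the shifts would be expected to sit close to zero; larger shifts across many positions are the signal the score is designed to capture.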

Contamination Score Interpretation

The contamination score is output in the metrics output file, MetricsOutput.tsv.
If a contamination score is equal to or below 1457 (the upper specification limit provided in the "USL Guideline" field in the metrics output file), the sample has less than 2% sample-to-sample contamination.
If a contamination score is above 1457, the sample has more than 2% sample-to-sample contamination. In this case, an estimate of the contamination can be obtained from the PCT_CONTAMINATION_EST metric. PCT_CONTAMINATION_EST is not valid unless the contamination score exceeds 1457.
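The interpretation rules above can be expressed as a small helper. This is a sketch under stated assumptions: the function name and return structure are hypothetical, and the values are assumed to have already been read from MetricsOutput.tsv, whose layout is not described here.

```python
CONTAMINATION_USL = 1457  # upper specification limit stated in the text

def interpret_contamination(score, pct_contamination_est=None, usl=CONTAMINATION_USL):
    """Apply the documented rule: score <= USL passes; above USL is flagged.

    PCT_CONTAMINATION_EST is only returned when the score exceeds the USL,
    because it is not valid otherwise.
    """
    if score <= usl:
        return {"contaminated": False, "estimate": None}
    return {"contaminated": True, "estimate": pct_contamination_est}

print(interpret_contamination(1457, pct_contamination_est=0.4))
print(interpret_contamination(2100, pct_contamination_est=3.2))
```

Discarding the estimate below the limit keeps an invalid PCT_CONTAMINATION_EST value from being reported by accident.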

Samples with highly rearranged genomes (HRD samples) can have variants with VAFs that shift away from the expected frequencies due to genomic rearrangement, which can lead to false-positive contamination scores.

Visual examination can help determine whether a shift of VAFs is due to true contamination.

How to build a VAF plot for visual examination

To build a VAF plot, use the {Sample_ID}.tmb.trace.csv file. Filter to only germline variants (for example, by using the tags "Germline_DB" and "Germline_Proxi" in the "Status" column) and use the values in the VAF column.
Select Scatter from the Charts menu.
Review the plot, checking whether variants are clustered around 50% and 100% VAF or scattered.
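As an alternative to a spreadsheet, the same filter and plot can be sketched in Python. The assumptions are that the trace file is comma-separated with a header row containing "Status" and "VAF" columns, that the germline tags appear as text inside the Status value, and that matplotlib is installed for the plotting step. The function names, the output path, and the extra columns in the sample data are made up for illustration.

```python
import csv
import io

GERMLINE_TAGS = ("Germline_DB", "Germline_Proxi")

def germline_vafs(csv_text):
    """Return VAF values for rows whose Status contains a germline tag."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        float(row["VAF"])
        for row in reader
        if any(tag in row["Status"] for tag in GERMLINE_TAGS)
    ]

def plot_vafs(vafs, out_path="vaf_plot.png"):
    """Scatter plot of germline VAFs in file order (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(range(len(vafs)), vafs, s=8)
    ax.set_xlabel("Germline variant index")
    ax.set_ylabel("VAF")
    fig.savefig(out_path)
    plt.close(fig)

# Made-up rows standing in for {Sample_ID}.tmb.trace.csv.
sample = (
    "Chromosome,Position,VAF,Status\n"
    "chr1,100,0.49,Germline_DB\n"
    "chr1,200,0.12,Somatic\n"
    "chr2,300,0.98,Germline_Proxi\n"
)
vafs = germline_vafs(sample)
print(vafs)
```

For a real file, the text can be read with `open(path, newline="").read()` and passed to `germline_vafs`, then to `plot_vafs`.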
