
Quality Control

Run QC

The Run Metrics section of the metrics output report provides sequencing run quality metrics along with suggested values for determining whether they are within an acceptable range. The overall percentage of reads passing filter is compared to a minimum threshold. For Read 1 and Read 2, the average percentage of bases ≥ Q30 is also compared to a minimum threshold. The Q-score is a prediction of the probability of an incorrect base call. The following tables show run metric and quality threshold information for different systems.

The values in the Run Metrics section are listed as NA in the following situations:

If the analysis was started from FASTQ files.
If the analysis was started from BCL files and the InterOp files are missing or corrupt.

NovaSeq 6000 or NovaSeq 6000Dx (RUO)

Metric

Description

Recommended Guideline Quality Threshold

Variant Class

NovaSeq X

Metric

Description

Recommended Guideline Quality Threshold

Variant Class

There is no PCT_PF_READS value in NovaSeq X Plus runs, so the PCT_PF_READS value is always NA.
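As an illustration of the Run QC checks described in this section, the following minimal Python sketch converts a Q-score to its error probability using the standard Phred definition (Q = -10 log10 P) and compares a reported metric with a minimum threshold while treating NA as not evaluable. The function names and the threshold value of 80.0 are hypothetical placeholders, not values from the Run Metrics tables.

```python
from typing import Optional


def phred_to_error_probability(q: float) -> float:
    """Convert a Phred Q-score to the probability of an incorrect base call."""
    return 10 ** (-q / 10)


def check_minimum(value: str, minimum: float) -> Optional[bool]:
    """Compare a reported metric with a minimum threshold.

    Returns None when the metric is reported as NA (for example, when the
    analysis started from FASTQ files), otherwise True if the value meets
    the threshold and False if it does not.
    """
    if value.strip().upper() == "NA":
        return None
    return float(value) >= minimum


# Q30 corresponds to a 1 in 1000 chance of an incorrect base call.
print(phred_to_error_probability(30))

# The 80.0 threshold below is a placeholder, not a documented guideline.
print(check_minimum("85.3", 80.0))
print(check_minimum("NA", 80.0))
```

Returning None for NA keeps "not evaluable" distinct from "failed", which matters for runs where PCT_PF_READS is never populated.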

Sample QC

DRAGEN TruSight Oncology 500 uses QC metrics to assess the validity of analysis for DNA libraries that pass contamination quality control. If the library fails one or more quality metrics, then the corresponding variant type or biomarker is not reported, and the associated QC category in the report header displays FAIL.

DNA library QC results are available in the MetricsOutput.tsv file. Refer to Metrics Output for details.

Metric

Description

Recommended Guideline Quality Threshold

Variant Class

CONTAMINATION_SCORE

Contamination

The contamination score evaluates the presence of sample-to-sample contamination. The algorithm uses common germline SNPs in the homozygous state, which are expected to have variant allele frequencies (VAFs) of 0% or 100%. In contaminated samples, the VAFs shift away from the expected values, which allows sample-to-sample contamination to be detected.

The contamination score can detect sample-to-sample contamination greater than or equal to 0.4% (that is, more than 0.4% of the DNA input comes from the contaminant).

Contamination Score Calculation

The contamination score is calculated using the SNP error file and Pileup file that are generated during small variant calling, as well as the TMB trace file. The algorithm includes the following steps:

All positions that overlap with a pre-defined set of common SNPs and have variant allele frequencies of < 25% or > 75% are collected (only SNPs are considered; indels are excluded).
Variants in CNV events are removed using a clustering method.
The likelihood that each position is an error or a real mutation is calculated, and the score is obtained as follows:

CONTAMINATION_SCORE = sum(log10(P(v_i is False Positive)))
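The position filter and the summation above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Position` record and its `p_false_positive` field are hypothetical, the CNV clustering step is omitted, and the way the pipeline derives each probability from the SNP error and Pileup files is not described here. The sketch evaluates the sum exactly as printed, which is zero or negative for probabilities up to 1, so any sign convention behind the reported score is not reproduced.

```python
import math
from typing import Iterable, List, NamedTuple


class Position(NamedTuple):
    """Hypothetical record for one position overlapping a common SNP."""
    is_snp: bool             # indels are excluded from the calculation
    vaf: float               # variant allele frequency as a fraction (0.0 to 1.0)
    p_false_positive: float  # assumed per-position probability, must be > 0


def select_positions(positions: Iterable[Position]) -> List[Position]:
    """Keep SNP positions with VAF < 25% or > 75%."""
    return [p for p in positions if p.is_snp and (p.vaf < 0.25 or p.vaf > 0.75)]


def sum_log10_false_positive(positions: Iterable[Position]) -> float:
    """Evaluate sum(log10(P(v_i is False Positive))) as printed in the text."""
    return sum(math.log10(p.p_false_positive) for p in positions)


example = [
    Position(is_snp=True, vaf=0.02, p_false_positive=0.01),
    Position(is_snp=True, vaf=0.50, p_false_positive=0.01),   # dropped: VAF between 25% and 75%
    Position(is_snp=False, vaf=0.90, p_false_positive=0.01),  # dropped: not a SNP
    Position(is_snp=True, vaf=0.98, p_false_positive=0.001),
]
kept = select_positions(example)
print(len(kept), sum_log10_false_positive(kept))
```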

Contamination Score Interpretation

The contamination score is output in the metrics output file, MetricsOutput.tsv.
If a contamination score is equal to or below 1227 (the upper specification limit provided in the "USL Guideline" field in the metrics output file), the sample has less than 0.4% sample-to-sample contamination.
If a contamination score is above 1227, the sample has more than 0.4% sample-to-sample contamination. In this case, an estimate of the contamination can be obtained from the PCT_CONTAMINATION_EST metric. PCT_CONTAMINATION_EST is not valid unless the contamination score exceeds 1227.
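The interpretation rules above can be expressed as a small Python helper. This is a minimal sketch: the function name and the returned dictionary keys are hypothetical, and the limit of 1227 is taken from the text rather than read from the "USL Guideline" field of a real metrics file.

```python
from typing import Optional

CONTAMINATION_USL = 1227  # upper specification limit quoted in the text


def interpret_contamination(score: float,
                            pct_contamination_est: Optional[float] = None) -> dict:
    """Apply the contamination score rules to one sample."""
    exceeds = score > CONTAMINATION_USL
    return {
        "exceeds_usl": exceeds,
        "interpretation": (
            "more than 0.4% sample-to-sample contamination"
            if exceeds
            else "less than 0.4% sample-to-sample contamination"
        ),
        # The estimate is only valid when the score exceeds the limit.
        "pct_contamination_est": pct_contamination_est if exceeds else None,
    }


print(interpret_contamination(900, pct_contamination_est=0.2))
print(interpret_contamination(1500, pct_contamination_est=1.1))
```

Discarding the estimate when the score is at or below the limit prevents an invalid PCT_CONTAMINATION_EST value from being carried into downstream reports.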

Samples with highly rearranged genomes (HRD samples) can have variants with VAFs that shift away from the expected frequencies due to genomic rearrangement, which can lead to false-positive contamination scores.

Visual examination can help determine whether a shift of VAFs is due to true contamination.

How to build a VAF plot for visual examination

To build a VAF plot, use the {Sample_ID}.tmb.trace.csv file. Filter to only germline variants (for example, by using the tags "Germline_DB" and "Germline_Proxi" in the "Status" column) and use the values in the VAF column.
Select Scatter from the Charts menu.
Review the plot and check whether the variants are scattered or are clustered around 50% and 100% VAF.
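As an alternative to a spreadsheet, the same plot can be sketched in Python. The following is an illustrative example, not part of the software: it assumes the trace file has a header row with "Status" and "VAF" columns as named above, that the germline tags appear as text within the Status value, and that the file is comma-delimited (pass a different `delimiter` if it is not). The function names are hypothetical, and plotting requires the third-party matplotlib package.

```python
import csv
from typing import List

GERMLINE_TAGS = ("Germline_DB", "Germline_Proxi")


def load_germline_vafs(path: str, delimiter: str = ",") -> List[float]:
    """Return the VAF values of rows whose Status contains a germline tag."""
    vafs = []
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter=delimiter):
            status = row.get("Status") or ""
            value = (row.get("VAF") or "").strip()
            if value and any(tag in status for tag in GERMLINE_TAGS):
                vafs.append(float(value))
    return vafs


def plot_vafs(vafs: List[float], out_png: str) -> None:
    """Save a scatter plot of germline VAFs for visual examination."""
    import matplotlib  # third-party; imported here so loading works without it
    matplotlib.use("Agg")  # render to a file without a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(range(len(vafs)), vafs, s=8)
    ax.set_xlabel("Germline variant (file order)")
    ax.set_ylabel("VAF")
    fig.savefig(out_png)
    plt.close(fig)
```

A typical call would be `plot_vafs(load_germline_vafs("Sample1.tmb.trace.csv"), "Sample1_vaf.png")`, where the file names are placeholders.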
