Refer to DNA Analysis Methods for more information.
File name: {SAMPLE_ID}_hard-filtered.gvcf.gz
The small variant genome variant call file contains information on all candidate small variants evaluated, including complex variants up to 15 bp from phased variant calling across the entire TSO 500 panel.
The variant status is determined by the FILTER column in the genome VCF as follows.
Filter | Note |
---|---|
File name: {SAMPLE_ID}_DNAVariants_Annotated.json.gz
The small variants annotated file provides variant annotation information for all nonreference positions from the genome VCF including pass and nonpass variants.
The TMB trace file provides comprehensive information on how the TMB value is calculated for a given sample. All passing small variants from the small variant filtering step are included in this file. To calculate the numerator of the TmbPerMb value in the TMB JSON, filter the TSV file on the IncludedInTMBNumerator column with a value of True and count the remaining rows.
The TMB trace file is not intended to be used for variant inspections. The filtering statuses are exclusively set for TMB calculation purposes. Setting a filter does not translate into the classification of a variant as somatic or germline.
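As a minimal sketch of the numerator count described above, the following Python snippet reads a tab-separated trace file and counts the rows flagged for inclusion. The column header `IncludedInTMBNumerator` follows the spelling used in the text, the case-insensitive match on `True` is an assumption about how the flag is written, and the helper name `count_tmb_numerator` and the sample rows are hypothetical.

```python
import csv
import io

# Column name as spelled in the text above; the exact header in a real file is an assumption.
TMB_FLAG_COLUMN = "IncludedInTMBNumerator"


def count_tmb_numerator(tsv_stream, flag_column=TMB_FLAG_COLUMN):
    """Count rows whose flag column reads True (compared case-insensitively)."""
    reader = csv.DictReader(tsv_stream, delimiter="\t")
    return sum(
        1
        for row in reader
        if (row.get(flag_column) or "").strip().lower() == "true"
    )


# Made-up rows for illustration only; they are not taken from a real sample.
example_tsv = (
    "Chromosome\tPosition\tVariantType\tIncludedInTMBNumerator\n"
    "chr1\t1000\tSNV\tTrue\n"
    "chr2\t2000\tdeletion\tFalse\n"
    "chr7\t3000\tSNV\tTRUE\n"
)

numerator = count_tmb_numerator(io.StringIO(example_tsv))
print(numerator)
```

This only reproduces the numerator. It does not compute TmbPerMb itself, and it says nothing about whether a variant is somatic or germline.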
The copy number VCF file contains CNV calls for DNA libraries of the amplification genes targeted by DRAGEN TruSight Oncology 500 Analysis Software. The CNV call indicates fold change results for each gene classified as reference, deletion, or amplification.
The value in the QUAL column of the VCF is a Phred transformation of the p-value, where Q = -10 × log10(p-value). The p-value is derived from a t-test comparing the fold change of the gene against the rest of the genome. Higher Q-scores indicate higher confidence in the CNV call.
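The Phred transformation above can be sketched in a few lines of Python. The function name `phred_quality` is hypothetical, and the input check is an added safeguard rather than documented software behavior.

```python
import math


def phred_quality(p_value):
    """Return Q = -10 * log10(p_value) for a p-value in the interval (0, 1]."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p-value must be greater than 0 and at most 1")
    return -10.0 * math.log10(p_value)


# A smaller p-value gives a larger Q-score.
q_weak = phred_quality(0.05)
q_strong = phred_quality(0.001)
print(round(q_weak, 2), round(q_strong, 2))
```

Under this transformation, a p-value of 0.001 corresponds to a Q-score of about 30, and each further tenfold drop in the p-value adds 10.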
In the VCF notation, <DUP> indicates that the detected fold change (FC) is greater than a predefined amplification cutoff, and <DEL> indicates that the detected FC is less than a predefined deletion cutoff for that gene. These cutoffs can vary from gene to gene.
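The cutoff logic can be illustrated with a short Python sketch. The actual per-gene cutoffs are not given here, so the caller supplies them. The function name `classify_fold_change` and the cutoff values in the example are hypothetical, not the values used by the software.

```python
def classify_fold_change(fold_change, amplification_cutoff, deletion_cutoff):
    """Map a fold change to <DUP>, <DEL>, or reference using caller-supplied cutoffs."""
    if deletion_cutoff >= amplification_cutoff:
        raise ValueError("deletion cutoff must be below amplification cutoff")
    if fold_change > amplification_cutoff:
        return "<DUP>"
    if fold_change < deletion_cutoff:
        return "<DEL>"
    return "reference"


# Hypothetical cutoffs for one gene, chosen only to exercise each branch.
example_amp, example_del = 1.5, 0.7
calls = [classify_fold_change(fc, example_amp, example_del) for fc in (2.2, 1.0, 0.5)]
print(calls)
```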
In analysis versions prior to v2.5, <DEL> calls in the VCF are marked as LowValidation. The LowValidation filter indicates that the calls have been validated only with in silico data sets and are provided as information only.
Each copy number variant is reported as the fold change of the normalized read depth in the test sample relative to the normalized read depth in diploid genomes. Given the tumor purity, you can infer the ploidy of a gene in the sample from the reported fold change.
Given a tumor purity of X% and a reported fold change Y, you can calculate the copy number n using the following equation:

$$n = \frac{200\,Y - 2\,(100 - X)}{X}$$

For example, at a tumor purity of 30%, a MET fold change of 2.2x indicates that 10 copies of MET DNA are observed: n = (200 × 2.2 − 2 × 70) / 30 = 10.
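The copy number calculation can be sketched in Python as follows. The sketch assumes the non-tumor fraction of the sample is diploid, so that the fold change is Y = (X/100 × n + (1 − X/100) × 2) / 2, which rearranges to n = (200 × Y − 2 × (100 − X)) / X. This assumed relationship reproduces the MET example above. The function name `copy_number_from_fold_change` is hypothetical.

```python
def copy_number_from_fold_change(fold_change, purity_percent):
    """Estimate tumor copy number n from fold change Y and tumor purity X (percent).

    Assumes the non-tumor fraction of the sample carries 2 copies of the gene.
    """
    if not 0.0 < purity_percent <= 100.0:
        raise ValueError("tumor purity must be greater than 0 and at most 100")
    return (200.0 * fold_change - 2.0 * (100.0 - purity_percent)) / purity_percent


# Worked example from the text: 30% purity, MET fold change of 2.2x.
met_copies = copy_number_from_fold_change(2.2, 30.0)
print(round(met_copies, 3))
```

At 100% purity the expression reduces to n = 2 × Y, and a fold change of 1.0 gives 2 copies at any purity, which is a quick sanity check on the rearrangement.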
Filter | Note |
---|---|
PASS | PASS variants. |
base_quality | Site filtered because median base quality of alt reads at this locus does not meet threshold. |
filtered_reads | Site filtered because the fraction of filtered reads is too large. |
fragment_length | Site filtered because absolute difference between the median fragment length of alt reads and median fragment length of ref reads at this locus exceeds threshold. |
low_depth | Site filtered because the read depth is too low. |
low_frac_info_reads | Site filtered because the fraction of informative reads is below threshold. |
long_indel | Site filtered because the indel length is too long. |
mapping_quality | Site filtered because median mapping quality of alt reads at this locus does not meet threshold. |
multiallelic | Site filtered because more than two alt alleles pass tumor LOD. |
no_reliable_supporting_read | Site filtered because no reliable supporting somatic read exists. |
read_position | Site filtered because median of distances between start/end of read and this locus is below threshold. |
str_contraction | Site filtered due to suspected PCR error where the alt allele is one repeat unit less than the reference. |
too_few_supporting_reads | Site filtered because there are too few supporting reads in the tumor sample. |
weak_evidence | Somatic variant score (SQ) does not meet threshold. |
systematic_noise | Site filtered based on evidence of systematic noise in normal sample. |
excluded_regions | Site overlaps with VC excluded regions bed. |
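As a minimal sketch of how the FILTER column determines variant status, the Python snippet below reads the seventh tab-separated column of a VCF data line and splits multiple filter names on semicolons, as the VCF format specifies. The function name `parse_filter_column` and the example record are hypothetical and not taken from real output.

```python
def parse_filter_column(vcf_line):
    """Return (is_pass, filter_names) for one tab-separated VCF data line."""
    fields = vcf_line.rstrip("\r\n").split("\t")
    if len(fields) < 8:
        raise ValueError("expected at least 8 tab-separated VCF columns")
    filter_value = fields[6]  # CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO
    if filter_value == "PASS":
        return True, []
    if filter_value == ".":
        # A missing value means no filters were applied; it is not counted as PASS here.
        return False, []
    return False, filter_value.split(";")


# Illustrative record only; coordinates and values are made up.
example_line = "chr1\t12345\t.\tA\tG\t.\tweak_evidence;low_depth\tDP=12\n"
is_pass, failed_filters = parse_filter_column(example_line)
print(is_pass, failed_filters)
```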

Column | Description |
---|---|
Chromosome | Chromosome |
Position | Position of variant |
RefCall | Reference base |
AltCall | Alternate base |
VAF | Variant allele frequency |
Depth | Coverage of position |
CytoBand | Cytoband of variant |
GeneName | Name of gene if applicable. A semicolon-delimited list is used for multiple genes. |
VariantType | Type of the variant: SNV, insertion, deletion, MNV |
CosmicIDs | COSMIC IDs. Multiple IDs are concatenated with “;”. |
MaxCosmicCount | Maximum COSMIC study count |
AlleleCountsGnomadExome | Variant allele count in gnomAD exome database |
AlleleCountsGnomadGenome | Variant allele count in gnomAD genome database |
AlleleCounts1000Genomes | Variant allele count in 1000 Genomes database |
MaxDatabaseAlleleCounts | Maximum variant allele count over the three databases |
GermlineFilterDatabase | TRUE if variant was filtered by the database filter |
GermlineFilterProxi | TRUE if variant was filtered by the proxi filter |
CodingVariant | TRUE if variant is in the coding region |
Nonsynonymous | TRUE if variant has any transcript annotations with nonsynonymous consequences |
IncludedInTMBNumerator | TRUE if variant is used in the TMB calculation |