Analysis Output

When the analysis run completes, the DRAGEN TruSight Oncology 500 Analysis Software generates an analysis output folder in a specified location.

To view analysis output, navigate to the analysis output folder and select the files that you want to view.

Single Node Analysis Output Folder Structure

The single-node output folder structure is as follows.

Logs_Intermediates
- AdditionalSarjMetrics— Contains per pair ID calculations to support the PCT_TARGET_250X metric.
- Annotation—Contains outputs for small variant annotation.
  - Subfolders per sample ID—Contains the aligned small variants JSON.
Results
- Metrics Output TSV (all pair IDs)
- Pair ID—The following outputs are produced for each sample:

Multiple Node Analysis Output Folder Structure

The multiple-node output folder structure is as follows.

Demultiplex Output
- A Logs_Intermediates folder containing FASTQ files per sample.
Node(X) Output—The following outputs are produced for each node used:

ICA Output Folder Structure

This section describes each output folder generated during analysis and where to find metric and analytic files when the pipeline is executed. The same output folder structure and content exist in ICA and BaseSpace Sequence Hub.

High-Level Folder Structure

Run ID
- TSO500_Nextflow_logs
  - _manifest.json

TSO500_Nextflow_logs Folder Structure

The TSO_500_Nextflow_Logs folder provides information about the execution of the pipeline on ICA, both as a whole and for specific nodes (when an analysis is split across multiple nodes). It contains the files used to execute parts of the workflow on different nodes, as well as records of the Nextflow execution on those nodes.

TSO_500_Nextflow_Logs
- _manifest.json

Results Folder Structure

Contains the aggregated MetricsOutput.tsv file at the root level. Additionally, the Results folder contains a subfolder for each pair ID.

Results
- MetricsOutput.tsv
- Sample_1
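
The layout described above can be walked programmatically. The following is a minimal Python sketch, assuming only what this section states: a MetricsOutput.tsv file at the root of the Results folder and one subfolder per pair ID. The function name `list_results` is hypothetical and is not part of the software.

```python
from pathlib import Path


def list_results(results_dir):
    """Return (path to MetricsOutput.tsv or None, sorted pair ID folder names)."""
    root = Path(results_dir)
    metrics = root / "MetricsOutput.tsv"
    # Every direct subfolder of Results is treated as a pair ID folder.
    pair_ids = sorted(p.name for p in root.iterdir() if p.is_dir())
    return (metrics if metrics.is_file() else None), pair_ids
```

Calling `list_results("Results")` on the example layout above would return the path to MetricsOutput.tsv together with a list containing `Sample_1`.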

The Results subfolder contains the following files:

Results
- MetricsOutput.tsv
- <Pair_id>

Logs_intermediates Folder Structure

Contains folders for each submodule in the DRAGEN TSO 500 on ICA pipeline. The folders contain a copy of all the relevant files required to create the metric output files and report files, as well as the combined log files at the root level and subfolders for each sample.

Logs_intermediates
- DnaDragenCaller
- AdditionalSarjMetrics

Errors Folder Structure

Contains Errors.tsv. This file contains the summary of all the errors encountered during pipeline execution.

Errors
- Errors.tsv

DNA Output

Refer to DNA Analysis Methods for more information.

Small Variant gVCF

File name: {SAMPLE_ID}_hard-filtered.gvcf.gz

The small variant genome variant call file contains information on all candidate small variants evaluated, including complex variants up to 15 bp from phased variant calling across the entire TSO 500 panel.

The variant status is determined by the FILTER column in the genome VCF as follows.

Filter

Note
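
To see how many records carry each FILTER value, the gVCF can be tallied with a short script. The following Python sketch relies only on the VCF convention that FILTER is the seventh tab-separated column and that header lines start with `#`. The function name `count_filters` is hypothetical.

```python
import gzip
from collections import Counter


def count_filters(gvcf_path):
    """Count records per FILTER value in a gzip-compressed (g)VCF file."""
    counts = Counter()
    with gzip.open(gvcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and column header lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) > 6:
                counts[fields[6]] += 1  # FILTER is the 7th VCF column
    return counts
```

For example, `count_filters("{SAMPLE_ID}_hard-filtered.gvcf.gz")["PASS"]` gives the number of records marked PASS, once the placeholder is replaced with an actual sample ID.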

Small Variant Annotated JSON

File name: {SAMPLE_ID}_DNAVariants_Annotated.json.gz

The small variants annotated file provides variant annotation information for all nonreference positions from the genome VCF including pass and nonpass variants.

TMB Trace

The TMB trace file provides comprehensive information on how the TMB value is calculated for a given sample. All passing small variants from the small variant filtering step are included in this file. To calculate the numerator of the TmbPerMb value in the TMB JSON, filter the TSV file on the IncludedInTMBNumerator column with a value of True.

The TMB trace file is not intended to be used for variant inspections. The filtering statuses are exclusively set for TMB calculation purposes. Setting a filter does not translate into the classification of a variant as somatic or germline.
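
The numerator count described above can also be reproduced outside a spreadsheet. The following Python sketch assumes that the trace file is a tab-separated file with a single header row that includes the IncludedInTMBNumerator column. The function name `count_tmb_numerator` is hypothetical, and the exact file name of the trace file is not assumed here.

```python
import csv


def count_tmb_numerator(trace_path):
    """Count rows whose IncludedInTMBNumerator value is True (case-insensitive)."""
    with open(trace_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return sum(
            1
            for row in reader
            if (row.get("IncludedInTMBNumerator") or "").strip().lower() == "true"
        )
```

The returned count corresponds to the numerator only. This sketch does not compute the TmbPerMb value itself.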

Column

Description

Copy Number VCF

The copy number VCF file contains CNV calls for DNA libraries of the amplification genes targeted by DRAGEN TruSight Oncology 500 Analysis Software. The CNV call indicates fold change results for each gene classified as reference, deletion, or amplification.

The value in the QUAL column of the VCF is a Phred transformation of the p-value, where Q = -10 × log10(p-value). The p-value is derived from the t-test between the fold change of the gene against the rest of the genome. Higher Q-scores indicate higher confidence in the CNV call.
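
As a worked illustration of the Phred transformation, the following Python sketch converts between a p-value and a Q-score. The function names are hypothetical; only the relationship between Q and the p-value stated above is taken from this section.

```python
import math


def phred_from_p(p_value):
    """Convert a p-value in (0, 1] to a Phred-scaled Q-score."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in the interval (0, 1]")
    return -10.0 * math.log10(p_value)


def p_from_phred(q_score):
    """Convert a Phred-scaled Q-score back to a p-value."""
    return 10.0 ** (-q_score / 10.0)
```

Under this transformation, a p-value of 0.001 corresponds to a Q-score of about 30, and a Q-score of 20 corresponds to a p-value of about 0.01.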

In the VCF notation, <DUP> indicates the detected fold change (FC) is greater than a predefined amplification cutoff. <DEL> indicates the detected FC is less than a predefined deletion cutoff for that gene. This cutoff can vary from gene to gene.

In analysis versions prior to v2.5, <DEL> calls in the VCF are marked as LowValidation. The LowValidation filter indicates that the calls have been validated only with in silico data sets and are provided as information only.

Each copy number variant is reported as a fold change on normalized read depth in a testing sample relative to the normalized read depth in diploid genomes. Given tumor purity, you can infer the ploidy of a gene in the sample from the reported fold change.

Given tumor purity X%, for a reported fold change Y, you can calculate the copy number n using the following equation:

n = [(200 × Y) - 2 × (100 - X)] / X

For example, a tumor purity at 30% and a MET with fold change of 2.2x indicates that 10 copies of MET DNA are observed.
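
The copy number calculation can be sketched in Python as follows. The sketch assumes that the observed fold change is a purity-weighted mixture of n copies in tumor cells and 2 copies in normal cells, relative to a diploid baseline, which gives n = (200 × Y - 2 × (100 - X)) / X. This assumption reproduces the MET example above. The function name `copy_number` is hypothetical.

```python
def copy_number(purity_pct, fold_change):
    """Estimate the tumor copy number n from tumor purity (X, in percent)
    and the reported fold change (Y).

    Assumes Y = ((X / 100) * n + (1 - X / 100) * 2) / 2, solved for n.
    """
    if not 0.0 < purity_pct <= 100.0:
        raise ValueError("purity_pct must be in the interval (0, 100]")
    return (200.0 * fold_change - 2.0 * (100.0 - purity_pct)) / purity_pct
```

With a tumor purity of 30% and a fold change of 2.2, `copy_number(30, 2.2)` evaluates to approximately 10, matching the example. At 100% purity, a fold change of 1.0 corresponds to the diploid value of 2.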

HRD and GIS Outputs

The Illumina DRAGEN TruSight Oncology 500 Analysis Software allows for analysis of sequencing data generated from the TruSight Oncology 500 HRD assay. When HRD samples are analyzed, new results and metrics are included in the CombinedVariantOutput and MetricsOutput files, respectively. The following tables detail how these scores and QC metrics are derived.

Metric

Description

Genomic Instability Score (GIS)

Proprietary Genomic Instability Score (GIS) indicating the level of genomic instability in the sample genome. The score combines Loss of Heterozygosity (LOH), Telomeric Allelic Imbalance, and Large-scale State Transitions (LST) scores. The GIS values provided by TruSight Oncology 500 HRD show good correlation (R² = 0.98) with Myriad Genetics GIS, but they are not identical (refer to the TruSight Oncology 500 HRD Product Data Sheet, Doc# M-GL-00748, for more details). GIS from alternative HRD assays should not be considered equivalent to Illumina/Myriad GIS.

The GIS algorithm within the TSO500 pipeline is intended only for FFPE samples. Because the TSO500 pipeline is not configurable, it does not have a cell line mode. GIS results for cell line samples are not accurate, because the tumor fraction (>90%) is too high to reliably distinguish tumor variants from germline variants.

HRD Metrics Included in Metrics Output File

Metric

Description

Section in Metrics Output

RNA Output

Refer to RNA Analysis Methods for more information.

Splice Variant VCF

The splice variant VCF contains all candidate splice variants targeted by the analysis panel identified by the RNA analysis pipeline. You can apply the following filters for each variant call:

Filter Name

Description

Refer to the headers in the output for more information about each column.

Splice Variant Annotated JSON

If available, each splice variant is annotated using the Illumina Annotation Engine. The following information is captured in the JSON:

HGNC Gene
Transcript
Exons
Introns

All Fusions CSV

The all fusions CSV file contains all candidate fusions identified by the DRAGEN RNA pipeline. Two output columns in the file describe the candidate fusions: Filter and KeepFusion.

The following table describes the semicolon-separated output found in the Filter column. Each output is either a confidence filter or information only, as indicated. If none of the confidence filters are triggered, the Filter column contains the output PASS. Otherwise, it contains the output FAIL.

Filter Column Output

Filter

Filter Type

Description

The KeepFusion column of the output has a value of TRUE when none of the confidence filters are triggered.

Refer to the headers in the output for more information about each column.
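
The Filter and KeepFusion columns can be read with a short script. The following Python sketch assumes that the all fusions CSV has one column header row containing the Filter and KeepFusion columns, and that any descriptive header lines before it start with `#`. Both points are assumptions that should be checked against an actual output file. The function names are hypothetical.

```python
import csv


def kept_fusions(csv_path):
    """Return the rows whose KeepFusion column is TRUE (case-insensitive)."""
    with open(csv_path, newline="") as handle:
        # Assumption: descriptive header lines, if present, start with '#'.
        lines = [line for line in handle if not line.startswith("#")]
    reader = csv.DictReader(lines)
    return [
        row
        for row in reader
        if (row.get("KeepFusion") or "").strip().upper() == "TRUE"
    ]


def split_filters(row):
    """Split the semicolon-separated Filter column into a list of outputs."""
    return [item for item in (row.get("Filter") or "").split(";") if item]
```

Reading the file with the `csv` module keeps every field as text, so gene names are not reinterpreted as dates.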

Fusion Columns

Fusion Object Field

Source

When using Microsoft Excel to view this report, gene names that can be interpreted as dates (such as MARCH1) are automatically converted to dd-mm format (1 Mar) by Excel. The following are fusion allow list genes:

ABL1
AKT3
ALK
AR

Combined Variant Output

File name: {Pair_ID}_CombinedVariantOutput.tsv

The combined variant output file contains the variants and biomarkers in a single file that is based on a single sample. If using pair ID, the file is based on paired DNA and RNA samples from the same individual. The output contains the following variant types and biomarkers:

Small variants
Copy number variants (CNV) (with absolute copy number when HRD Assay is run)
TMB
MSI
Fusions
Splice variants
GIS (when HRD Assay is run)
Gene-level Loss of Heterozygosity (when HRD Assay is run)
Exon-level CNVs

The combined variant output file also contains Analysis Details and Sequencing Run Details sections. The details of each are listed in the following table:

Analysis Details

Sequencing Run Details

Combined variant output produces small variants with blank fields in the following situations:

The variant has been matched to a canonical RefSeq transcript on an overlapping gene not targeted by TruSight Oncology 500.
The variant is located in a region designated iSNP, indel, or Flanking in the TST500_Manifest.bed file located in the Resources folder.

Variant Filtering Rules

Small Variants
- All variants with the FILTER field marked as PASS in the hard-filtered genome VCF are present in the combined variant output.
- Gene information is only present for variants belonging to canonical transcripts that are within the Gene Allow List–Small Variants.
- Transcript information is only present for variants belonging to canonical transcripts that are within the Gene Allow List–Small Variants.

Metrics Output

One metrics output file is generated for the entire run. An additional file is generated for each sample (or DNA-RNA pair).

The MetricsOutput.tsv file contains the following quality control metrics for all samples:


