When the analysis run completes, the DRAGEN TruSight Oncology 500 Analysis Software generates an analysis output folder in a specified location.
To view analysis output, navigate to the analysis output folder and select the files that you want to view.
The single output folder structure is as follows.
Logs_Intermediates
AdditionalSarjMetrics—Contains per pair ID calculations to support the PCT_TARGET_250X metric.
Annotation—Contains outputs for small variant annotation.
Subfolders per sample ID—Contains the aligned small variants JSON.
CombinedVariantOutput
Subfolders per pair ID—Contains the combined variant output TSV files.
A combined output log file.
Contamination
Subfolders per DNA sample ID—Contains the contamination metrics JSON file and output logs.
DnaDragenCaller
Subfolders per sample ID—Contains the aligned BAM and index files, small variant VCF and gVCF, copy number variant VCF, MSI JSON, exon coverage report bed, and QC outputs in CSV format.
DnaDragenExonCNVCaller
Subfolders per DNA sample ID—Contains the exon-level CNV JSON, the supporting calculation, and the QC files.
DnaFastqValidation—Contains the FASTQ validation output log for DNA samples.
FastqDownsample
Subfolders per RNA sample ID—Contains FASTQ files and output logs.
FastqDownsample output
FastqGeneration
Gis—Contains GIS-related files for HRD samples.
Subfolders per HRD sample ID—Contains the GIS JSON, the supporting calculation, and the QC files.
Also contains the annotated CNV VCF and the gene-level TSV file with absolute copy number and minor copy number information.
LrAnnotation
Subfolders per DNA sample ID—Contains the annotated exon-level CNV JSON.
LrCalculator
Subfolders per DNA sample ID—Contains the exon-level CNV VCF.
MetricsOutput
Subfolders per pair ID—Contains the metrics output TSV files.
A combined output log file.
ResourceVerification—Contains the resource file checksum verification logs.
RnaAnnotation
Subfolders per RNA sample ID—Contains the annotated splice variant JSON.
RnaDragenCaller
Subfolders per sample ID—Contains the aligned BAM, fusion candidates CSV, exon coverage report bed, and QC outputs in CSV format.
RnaFastqValidation—Contains the FASTQ validation output log for RNA samples.
RnaFusion
Subfolders per RNA sample ID—Contains the All Fusions CSV and Fusion Processor logs.
RnaQcMetrics
Subfolders per RNA sample ID—Contains the RNA QC metrics JSON.
RnaSpliceVariantCalling
Subfolders per RNA sample ID—Contains the splice variants VCF.
Run QC—Contains the Run QC metrics JSON, Intermediate Run QC metrics JSON, and log file.
SampleAnalysisResults
Subfolders per pair ID—Contains the Sample Analysis Results JSON and detailed log file.
SampleSheetValidation—Contains the Intermediate sample sheet and validation log.
Tmb
Subfolders per DNA sample ID—Contains the TMB metrics CSV, TMB trace TSV, and related files and logs.
passing_sample_steps.json—Contains the steps passed for each sample ID.
pipeline_trace.txt—A summary and troubleshooting file that lists each Nextflow task executed and its status (for example, COMPLETED or FAILED).
run.log—A complete trace-level log file describing the Nextflow pipeline execution.
run_report.html—Contains high-level run statistics (performance, usage, etc.).
run_timeline.html—Contains timeline-related information about the analysis run.
Results
Metrics Output TSV (all pair IDs)
Pair ID—The following outputs are produced for each sample:
Combined Variant Output TSV
Metrics Output TSV
TMB Trace TSV
Small Variant Genome VCF
Small Variant Genome Annotated JSON
Copy Number Variant VCF
GIS JSON
MSI JSON
Large Rearrangements CNV VCF
Large Rearrangements CNV Annotated JSON
All Fusion CSV
Splice Variant VCF
Splice Variant Annotated JSON
Exon Coverage Report TSV
Gene Coverage Report TSV
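The pipeline_trace.txt file in Logs_Intermediates lists each Nextflow task and its status, so it is a convenient starting point when troubleshooting a run. The following is a minimal sketch, not part of the software, that lists tasks with a given status. It assumes the file uses the standard tab-separated Nextflow trace layout with a header row that includes `name` and `status` columns. The function name `tasks_with_status` is hypothetical.

```python
import csv
from pathlib import Path


def tasks_with_status(trace_path, status="FAILED"):
    """Return the names of tasks in a trace file that have the given status.

    Assumption: the trace file is tab-separated with a header row that
    contains 'name' and 'status' columns (standard Nextflow trace layout).
    """
    with Path(trace_path).open(newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return [
            (row.get("name") or "").strip()
            for row in reader
            if (row.get("status") or "").strip() == status
        ]
```

For example, `tasks_with_status("Logs_Intermediates/pipeline_trace.txt")` would return the names of any tasks marked FAILED, provided the file follows the assumed layout.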
Multiple output folder structure is as follows.
Demultiplex Output
A Logs_Intermediates folder containing FASTQ files per sample.
Node(X) Output—The following outputs are produced for each node used:
A Logs_Intermediates folder containing step-specific and component-specific outputs and logs for every step/component run in the analysis pipeline for the sample run on the node.
A Results folder containing results only for the sample run on the node.
Gathered Output
A Logs_Intermediates folder containing step-specific and component-specific outputs and logs for every step/component run in each analysis pipeline on every node. This folder contains outputs for all samples and pairs run across all nodes in the analysis.
A Results folder containing results for all samples and pairs run across all nodes. Results are organized by Pair_ID, then Sample_ID. This folder also contains summary files with information on all samples.
This section describes each output folder generated during analysis and where to find metric and analytic files when the pipeline is executed. The same output folder structure and content exist in ICA and BaseSpace Sequence Hub.
Run ID
TSO500_Nextflow_logs
_manifest.json
Results
_tags.json
Logs_intermediates
Errors—This folder is only present when analysis fails
The TSO_500_Nextflow_Logs folder provides information about the execution of the pipeline on ICA, both as a whole and for specific nodes (when an analysis is split across multiple nodes). It contains the files used to execute parts of the workflow on different nodes, as well as records of the Nextflow execution on those nodes.
TSO_500_Nextflow_Logs
_manifest.json
The Results folder contains the aggregated MetricsOutput.tsv file at the root level and a subfolder for each pair ID.
Results
MetricsOutput.tsv
Sample_1
Sample_2
Sample_<#>
_tags.json
The Results subfolder contains the following files:
Results
MetricsOutput.tsv
<Pair_id>
CombinedVariantOutput.tsv
<SampleName>_MetricsOutput.tsv
<DNA_Sample_id>
CopyNumberVariants.vcf
DNAMergedSmallVariants_Annotated.json.gz
MergedSmallVariants.genome.vcf
MergedSmallVariants.vcf
microstat_output.json
TMB_Trace.tsv
<RNA_Sample_id>
AllFusions.csv
RNA_Annotated.json.gz
SpliceVariants.vcf
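The layout above (pair ID folders that contain DNA and RNA sample folders) can be inspected programmatically. The following is a minimal sketch that summarizes which files are present for each sample. It assumes every directory directly under Results is a pair ID folder and every directory inside it is a sample folder. The function name `list_results` is hypothetical.

```python
from pathlib import Path


def list_results(results_dir):
    """Map each pair ID folder to its sample folders and their file names.

    Assumption: directories directly under results_dir are pair IDs, and
    directories inside each pair ID folder are DNA or RNA sample IDs.
    Files stored directly in a pair ID folder are not included.
    """
    layout = {}
    pair_dirs = sorted(p for p in Path(results_dir).iterdir() if p.is_dir())
    for pair_dir in pair_dirs:
        samples = {}
        sample_dirs = sorted(p for p in pair_dir.iterdir() if p.is_dir())
        for sample_dir in sample_dirs:
            samples[sample_dir.name] = sorted(
                f.name for f in sample_dir.iterdir() if f.is_file()
            )
        layout[pair_dir.name] = samples
    return layout
```

The returned dictionary can be used to confirm that an expected file, such as TMB_Trace.tsv for a DNA sample, was produced before downstream processing begins.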
The Logs_intermediates folder contains a folder for each submodule in the DRAGEN TSO 500 on ICA pipeline. The folders contain a copy of all the relevant files required to create the metric output files and report files, as well as the combined log files at the root level and subfolders for each sample.
Logs_intermediates
DnaDragenCaller
AdditionalSarjMetrics
CombinedVariantOutput
FastqGeneration
MetricsOutput
DnaDragenExonCnvCaller
DnaFastqValidation
DNACoverageReport
Gis
Tmb
SampleAnalysisResults
SampleSheetValidation
passing_sample_steps.json
RnaFusion
Contamination
Annotation
RnaAnnotation
RnaDragenCaller
RnaSpliceVariantCalling
RunQc
FastqDownsample
PassingSampleSteps
ResourceVerification
LrCalculator
LrAnnotation
RnaQcMetrics
RnaFastqValidation
RNACoverageReport
The Errors folder contains Errors.tsv, which summarizes all the errors encountered during pipeline execution.
Errors
Errors.tsv
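Because the Errors folder is present only when analysis fails, a script can use it as a failure check. The following is a minimal sketch that returns the rows of Errors.tsv, or an empty list when the folder or file is absent. The column layout of Errors.tsv is not described here, so the rows are returned as raw lists of fields without interpretation. The function name `read_errors` is hypothetical.

```python
import csv
from pathlib import Path


def read_errors(run_dir):
    """Return the non-empty rows of Errors/Errors.tsv as lists of fields.

    Returns an empty list when the file does not exist. No assumption is
    made about the column names; any header row is returned like any
    other row.
    """
    errors_file = Path(run_dir) / "Errors" / "Errors.tsv"
    if not errors_file.is_file():
        return []
    with errors_file.open(newline="") as handle:
        return [row for row in csv.reader(handle, delimiter="\t") if row]
```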
The following files and folders are created during analysis by the NovaSeq 6000Dx Analysis Application:
analysisResults.json
CopyComplete.txt
edgeos.nextflow.config
inputs/
sampleMapping.json
SampleSheet.csv
SampleSheet.json
Manifest.tsv
params.json
Results/
workflowLogs/
nf-main-***.log
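To confirm that the entries listed above were written, the output folder can be searched by name. The following is a minimal sketch that reports where each named file or folder appears under an analysis output folder. It searches recursively, so it makes no assumption about how the entries are nested. The function name `locate_entries` is hypothetical.

```python
from pathlib import Path


def locate_entries(output_dir, names):
    """Map each name to the sorted relative paths where it is found.

    Searches output_dir recursively for files or folders whose name
    matches exactly. Names that are not found map to an empty list.
    """
    root = Path(output_dir)
    found = {name: [] for name in names}
    for path in root.rglob("*"):
        if path.name in found:
            found[path.name].append(path.relative_to(root).as_posix())
    for paths in found.values():
        paths.sort()
    return found
```

For example, calling `locate_entries(folder, ["analysisResults.json", "CopyComplete.txt", "Results"])` returns an empty list for any entry that has not been created.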
When the analysis run completes, the analysis application generates an analysis output folder in a specified location. To view the analysis output, follow these steps:
On the “Completed” runs tab, select the run.
Review the run details page, which provides the information needed to access the output folder:
External Location: the input for the run.
Analysis Output Folder: the location where the output is stored. To navigate to this folder, follow the “server location” and the gds analysis output folder.
Navigate to the directory that contains the analysis output folder.
Open the folder, and then select the files that you want to view.