DRAGEN TSO 500 ctDNA v2.6

Getting Started

MSI

The DRAGEN microsatellite instability (MSI) status module assesses microsatellite sites for evidence of instability relative to a set of baseline normal samples, using Jensen–Shannon (JS) distance (an information entropy metric). In total, the panel contains 2343 selected MSI sites with 6 or 7 single nucleotide repeats. For MSI sites with 500 or more spanning duplex collapsed reads, JS distance is calculated between the test sample and the baseline normal samples, and between any two baseline normal samples. If the JS distance is significantly higher for the test sample vs baseline normal samples (p-value ≤ 0.01) and the JS distance difference is ≥ 0.02, the site is considered unstable. The final MSI score aggregates the JS distances across all unstable sites. The input is the BAM file from the DNA alignment and read collapsing step, and the output is an MSI metric file.
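As an illustration of the metric only, the following minimal Python sketch computes a JS distance between two repeat-length histograms and applies the two per-site cutoffs quoted above. The histogram representation, the base-2 logarithm, the square-root (distance) form, and the function names are assumptions for this sketch, not the DRAGEN implementation. The p-value is taken as an input because the guide does not describe how it is computed.

```python
import math


def js_distance(p, q):
    """Jensen-Shannon distance (base 2) between two discrete distributions.

    p and q are sequences of non-negative counts or frequencies over the
    same bins (assumed here: observed repeat lengths at one MSI site).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("distributions must have positive total mass")
    p = [x / total_p for x in p]
    q = [x / total_q for x in q]
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; bins with a == 0 contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(divergence, 0.0))


def site_is_unstable(p_value, js_difference, p_cutoff=0.01, diff_cutoff=0.02):
    """Apply the two per-site cutoffs stated in the text."""
    return p_value <= p_cutoff and js_difference >= diff_cutoff
```

With base-2 logarithms the distance ranges from 0 (identical distributions) to 1 (no overlap), which makes the 0.02 difference cutoff easier to read.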

FastQ Generation

Sequencing data stored in BCL format are demultiplexed through a process that uses the index sequences unique to each sample to assign clusters to the library from which they originated. Each cluster contains two indexes (i7 and i5 sequences, one at each end of the library fragment). The combination of those index sequences is used to demultiplex the pooled libraries.

After demultiplexing, this process generates FASTQ files, which contain the sequencing reads for each individual sample library and the associated quality scores for each base call, excluding reads from any clusters that did not pass filter.
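The idea of assigning clusters to samples by index pair can be sketched in a few lines of Python. This is a conceptual illustration only: it assumes exact index matching and a simple tuple layout for clusters, both hypothetical, and it does not reflect how BCL Convert handles mismatches or file output.

```python
from collections import defaultdict


def demultiplex(clusters, sample_indexes):
    """Group reads by sample using the (i7, i5) index pair.

    clusters: iterable of (i7, i5, read, passed_filter) tuples (hypothetical layout).
    sample_indexes: dict mapping (i7, i5) -> sample ID, as listed in a sample sheet.
    Clusters that did not pass filter, or whose index pair is unknown, are skipped.
    """
    reads_by_sample = defaultdict(list)
    for i7, i5, read, passed_filter in clusters:
        if not passed_filter:
            continue  # excluded from the FASTQ output
        sample_id = sample_indexes.get((i7, i5))
        if sample_id is not None:
            reads_by_sample[sample_id].append(read)
    return dict(reads_by_sample)
```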

Run Planning

Quality Control

Analysis Launch on ICA

Methods for Launching Analysis

Illumina Connected Analytics (ICA) supports the following methods for launching DRAGEN TruSight Oncology 500 ctDNA Analysis Software.

  • Auto-launch: Stream run data directly from the instrument to ICA via a specially configured sample sheet and automatically begin DRAGEN TSO 500 ctDNA analysis.

  • Manual launch: Initiate DRAGEN TSO 500 ctDNA analysis on ICA using the run files and sample sheet files in the project.

For more information about using ICA or BaseSpace Sequence Hub, refer to the following support pages on the Illumina support site.

Run QC

The Run Metrics section of the metrics output report provides sequencing run quality metrics along with suggested values for determining whether they are within an acceptable range. The overall percentage of reads passing filter is compared to a minimum threshold. For Read 1 and Read 2, the average percentage of bases with a quality score ≥ Q30 is also compared to a minimum threshold; the quality score (Q‑score) predicts the probability of an incorrect base call. The following tables show run metric and quality threshold information for different systems.

The values in the Run Metrics section are listed as NA in the following situations:

  • If the analysis was started from FASTQ files.

  • If the analysis was started from BCL files and the InterOp files are missing or corrupt.
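For orientation, a Q-score follows the standard Phred relation Q = -10 · log10(P), so Q30 corresponds to an error probability of 1 in 1000. The sketch below converts a Q-score to that probability and compares run metrics against the minimum thresholds listed in this guide. The dictionary layout, function names, and PASS/FAIL/NA labels are assumptions for illustration, not the report's actual format.

```python
def phred_to_error_probability(q_score):
    """Standard Phred relation: P(error) = 10^(-Q/10)."""
    return 10 ** (-q_score / 10)


# Minimum thresholds as listed in this guide; the dictionary itself is illustrative.
RUN_QC_MINIMUMS = {
    "NovaSeq 6000": {"PCT_PF_READS": 55.0, "PCT_Q30_R1": 80.0, "PCT_Q30_R2": 80.0},
    "NovaSeq X": {"PCT_Q30_R1": 85.0, "PCT_Q30_R2": 85.0},
}


def check_run_metrics(metrics, instrument):
    """Return PASS, FAIL, or NA for each run metric that has a threshold.

    metrics: dict of metric name -> value; missing or None values are reported
    as NA (for example, when analysis started from FASTQ files).
    """
    results = {}
    for name, minimum in RUN_QC_MINIMUMS[instrument].items():
        value = metrics.get(name)
        if value is None:
            results[name] = "NA"
        else:
            results[name] = "PASS" if value >= minimum else "FAIL"
    return results
```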

NovaSeq 6000 or NovaSeq 6000Dx (RUO)

Metric
Description
Recommended Guideline Quality Threshold
Variant Class

NovaSeq X

Metric
Description
Recommended Guideline Quality Threshold
Variant Class
There is no PCT_PF_READS value in NovaSeq X Plus runs, so the PCT_PF_READS value is always NA.

NovaSeq 6000Dx Run Set Up

The following instructions describe steps to set up a run on NovaSeq 6000Dx Analysis Application.

Use the following steps to configure a TruSight™ Oncology 500 ctDNA run in Illumina Run Manager.

  1. Go to the "Runs" section of Illumina Run Manager by selecting "Runs" on the left-hand side.

  2. Enter sample data manually or by importing a sample sheet.

Combined Variant Output

File name: {SampleID}_CombinedVariantOutput.tsv

The combined variant file contains the variants and biomarkers in a single file. The output contains the following variant types and biomarkers:

  • Small variants (including EGFR complex variants)

  • Copy number variants

Maximum Somatic VAF

The maximum somatic variant allele frequency (MSAF) is the highest VAF of a confirmed somatic mutation. The MSAF is often used as a surrogate for tumor fraction, especially when the ctDNA tumor fraction is high.

Use caution when using MSAF as a surrogate for ctDNA tumor fraction. The MSAF implementation in the DRAGEN TSO 500 ctDNA analysis software incorporates fragment-size-based analysis and filtering of germline and clonal hematopoiesis variants. However, additional factors (e.g., aneuploidy) and broader testing need to be taken into account to provide a more accurate representation of tumor fraction.

Auto-Launch with FASTQs generated by Standalone BCL Convert Pipeline (Start from FASTQ)

When using BSSH Run Planning to generate a sample sheet to auto-launch analysis on ICA, you must set "Start from Fastq" to True or False (the default is False). If you choose "False", BSSH starts the TSO 500 pipeline normally using BCL input from the run folder.

If you choose "True" for this option, BSSH will run two ICA pipelines in sequence:

  • First, it will run the BCL Convert v3.10.9 pipeline to generate the FASTQ files

  • Second, it will kick off the TSO 500 ctDNA pipeline using the FASTQ files generated above as the input

Manual launch
Illumina Connected Analytics support site page
BaseSpace Sequence Hub support site page

Mapping and Alignment

DNA Alignment and Read Collapsing

The alignment step uses DRAGEN Aligner with UMI collapsing to align DNA sequences in FASTQ files to the hg19_decoy genome. This step combines sets of reads (ie, families) that are grouped together based on genomic locations and UMI tags into representative sequences. This process accurately removes duplicate reads and sequencing errors without losing the signal of very low frequency (< 1%) sequence variations.

This alignment step generates BAM files (*.bam) and BAM index files (*.bam.bai) that are saved to the alignment folder. A BAM file is the compressed binary version of a SAM file that is used to represent aligned sequences.

Read collapsing adds the following BAM tags:

  • RX/XU: UMI combination. RX is duplicated from XU to satisfy the BAM/SAM format.

  • XV: Number of reads in the family on one strand.

  • XW: Number of reads in the duplex-family or 0 if not a duplex family.
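To inspect these tags without additional tooling, one option is to parse the optional fields of a SAM text record (for example, the output of `samtools view`). The sketch below relies only on the SAM convention that optional fields follow the 11 mandatory columns as TAG:TYPE:VALUE. The tag types shown in the test record (Z for RX/XU, i for XV/XW) and the UMI string layout are assumptions, not confirmed by this guide.

```python
def parse_sam_tags(sam_line):
    """Return the optional TAG:TYPE:VALUE fields of one SAM record as a dict."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[11:]:  # the first 11 columns are mandatory SAM fields
        tag, type_code, value = field.split(":", 2)
        if type_code == "i":
            value = int(value)
        elif type_code == "f":
            value = float(value)
        tags[tag] = value
    return tags


def is_duplex_family(tags):
    """XW is described above as 0 when the read is not from a duplex family."""
    return tags.get("XW", 0) > 0
```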

Indel Realignment and Read Stitching

The Gemini software component performs local indel realignment, paired‑read stitching, and read filtering. A stitched read is a single read that has been combined from a pair of reads. Reads near detected indels are realigned to remove alignment artifacts. The input is a single BAM file and the reference genome FASTA used to align it; the output is a corresponding single BAM file with stitched, pair‑realigned reads. Read pairs with poor map quality or supplementary and secondary alignments from the input BAM are ignored.

The following BAM tags are added to the stitched reads:

  • XD—Directional support string indicating forward, reverse, and stitched positions.

  • XR—Read pair orientation, which can be either forward-reverse (FR) or reverse-forward (RF).

Launching Analysis

Coverage Reports

The gene and exon coverage report files are tab-separated value (TSV) files with coverage values for the genes and exons, respectively, specified in the manifest file.

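NovaSeq 6000 or NovaSeq 6000Dx (RUO):

Metric | Description | Recommended Guideline Quality Threshold | Variant Class
PCT_PF_READS (%) | Total percentage of reads passing filter. | ≥55.0 | All
PCT_Q30_R1 (%) | Percentage of Read 1 reads with quality score ≥ 30. | ≥80.0 | All
PCT_Q30_R2 (%) | Percentage of Read 2 reads with quality score ≥ 30. | ≥80.0 | All

NovaSeq X:

Metric | Description | Recommended Guideline Quality Threshold | Variant Class
PCT_Q30_R1 (%) | Percentage of Read 1 reads with quality score ≥ 30. | ≥85.0 | All
PCT_Q30_R2 (%) | Percentage of Read 2 reads with quality score ≥ 30. | ≥85.0 | All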

To enter sample data manually, select “Create Run”.
  • Choose "DRAGEN TruSight™ Oncology 500 ctDNA Analysis Application" from the "Create Run" screen to set-up and analyze runs for TruSight Oncology 500 ctDNA assay.

    Run Settings

    1. On the "Run Settings" screen, enter a run name with the following criteria:

      1. 1 - 40 characters.

      2. Alphanumeric characters, underscores, or dashes only.

      3. Unique across all runs on the instrument.

      The run name identifies the run from sequencing through analysis.

    2. [Optional] Enter a run description. The run description must have the following criteria:

      1. 1 - 50 characters.

      2. Alphanumeric characters or spaces only.

    3. Select kit used during library preparation:

      1. TruSight Oncology 500 ctDNA

      2. TruSight Oncology 500 ctDNA v2

    4. The index adapter kit is selected automatically based on the library prep kit selection.

    5. [Optional] Enter a library tube ID.

    Depending on the library prep kit selected, additional fields are populated for run settings and are not editable. Read and index lengths differ by library prep kit type.

    Sample Data

    Use the table on the "Sample Data" screen to enter sample information manually.

    Alternatively, select Import Samples to upload sample information. Refer to NovaSeq 6000Dx Analysis Application: Sample Sheet Requirements for sample sheet requirements.

    1. Select lane information. Options include one to four, or all lanes.

    2. Enter a unique sample ID in the sample ID field with the following criteria:

      1. Controls should be added first.

      2. 1 - 40 characters.

      3. Alphanumeric characters, underscores, or dashes only.

      4. Underscores and dashes must be preceded and followed by an alphanumeric character.

    3. Select an index set ID for the DNA library prepared from the sample.

    4. [Optional] Enter a library name.

    Depending on the options selected for index set ID, additional fields will be auto-populated for sample data and are not editable.
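The naming rules above can be checked before entering data. The following is a minimal sketch that encodes the run name and sample ID criteria as regular expressions. It assumes "alphanumeric" means ASCII letters and digits, the function names are hypothetical, and uniqueness (across runs or samples) is not checked.

```python
import re

# Run name: 1-40 characters; alphanumeric characters, underscores, or dashes only.
RUN_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,40}")

# Sample ID: same character set, but each underscore or dash must be
# preceded and followed by an alphanumeric character.
SAMPLE_ID_PATTERN = re.compile(r"[A-Za-z0-9]+(?:[_-][A-Za-z0-9]+)*")


def is_valid_run_name(name):
    return RUN_NAME_PATTERN.fullmatch(name) is not None


def is_valid_sample_id(sample_id):
    if not 1 <= len(sample_id) <= 40:
        return False
    return SAMPLE_ID_PATTERN.fullmatch(sample_id) is not None
```

Because every separator must sit between two alphanumeric characters, IDs such as `S--1`, `_S1`, or `S1-` are rejected by the second pattern.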

    Sample Settings

    Use the table on the "Sample Settings" screen to enter additional sample information.

    1. [Optional] Enter a sample name with the following criteria:

      1. 1 - 50 characters.

      2. Alphanumeric characters, dashes, underscores, or spaces.

      3. Spaces, underscores, and dashes must be preceded and followed by an alphanumeric character.

    2. [Optional] Enter a sample description with the following criteria:

      1. 1 - 50 characters.

      2. Alphanumeric characters, dashes, underscores, or spaces.

    Additional fields will be auto-populated based on selections made in the Sample Data screen, which are not editable.

    Before saving and starting your run, confirm on the “Run Review” page that the information entered is correct.

    Tumor Mutational Burden (TMB)

  • MSI

  • DNA Fusions

  • The combined variant output file also contains Analysis Details and Sequencing Run Details sections. The details of each are listed in the following table:

    Analysis Details: Sample ID, Output date, Output time, Pipeline version (Docker image version number)

    Sequencing Run Details: Run name, Run date, Sample index ID, Instrument ID, Instrument control software version, Instrument type, RTA version, SBS reagent cartridge lot number, Cluster reagent cartridge lot number

    Variant Filtering Rules

    Combined variant output produces small variants with blank fields in the following situations:

    • The variant has been matched to a canonical RefSeq transcript on an overlapping gene not targeted by TruSight Oncology 500 ctDNA.

    • The variant is located in a region designated iSNP, indel, or Flanking in the TST500_Manifest.bed file located in the Resources folder.

    • Small Variants - All variants with the FILTER field marked as PASS and which have a canonical RefSeq transcript are present in the combined variant output.

      • Gene and transcript information is only present for variants belonging to canonical transcripts that are within the Gene list–Small Variants.

    • Copy Number Variants - Copy number variants must meet the following conditions:

      • FILTER field marked as PASS.

      • ALT field is <DUP> or <DEL>.

    • Fusion Variants - Fusion variants must meet the following conditions:

      • Passing fusion filtering criteria with "PASS" from DNAFF module

      • Contains at least one gene on the fusion allow list.

    MSAF algorithm

    The MSAF is determined using the following steps:

    1. Somatic variants determined by the TMB algorithm are used as an input (variants having Status Somatic in the {SampleID}_tmb.trace.tsv file). Variants that are not in coding regions, MNVs and variants with depth below 500 are filtered out.

    2. The remaining variants are ranked by VAF in descending order.

    3. The VAF of the highest-ranked confident somatic variant is output as MSAF. "Confident somatic variants" are determined by analyzing the fragment size of the reads supporting the variants. Circulating tumor DNA (ctDNA) molecules are expected to be shorter than normal cell-free DNA (cfDNA) molecules. If the fragment sizes of the reads supporting a variant are significantly shorter than those of non-supporting reads (p-value < 1 × 10⁻⁵), the variant is considered a confident somatic variant.

    4. If no such variant exists, the VAF for the highest ranked COSMIC hotspot variant (with COSMIC count > 50) is output as MSAF.

    5. If no such variant exists, the VAF for the 4th highest ranked variant is output as MSAF.
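The selection order in the steps above can be sketched as follows. This is an illustration under stated assumptions, not the DRAGEN implementation: the `SomaticVariant` fields are hypothetical, the fragment-size p-value is treated as a precomputed input, and returning `None` when fewer than four variants remain is a choice made for this sketch because the guide does not describe that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SomaticVariant:
    # Hypothetical record; field names are not taken from the trace file.
    vaf: float
    depth: int
    is_coding: bool
    is_mnv: bool
    fragment_size_p_value: float  # assumed precomputed: supporting vs non-supporting reads
    cosmic_count: int


def max_somatic_af(variants) -> Optional[float]:
    # Step 1: drop non-coding variants, MNVs, and variants with depth below 500.
    kept = [v for v in variants if v.is_coding and not v.is_mnv and v.depth >= 500]
    # Step 2: rank by VAF in descending order.
    ranked = sorted(kept, key=lambda v: v.vaf, reverse=True)
    # Step 3: highest-ranked confident somatic variant (p-value < 1e-5).
    for variant in ranked:
        if variant.fragment_size_p_value < 1e-5:
            return variant.vaf
    # Step 4: highest-ranked COSMIC hotspot variant (COSMIC count > 50).
    for variant in ranked:
        if variant.cosmic_count > 50:
            return variant.vaf
    # Step 5: fourth highest-ranked variant, if one exists.
    if len(ranked) >= 4:
        return ranked[3].vaf
    return None  # assumption: behavior with fewer than four variants is not documented
```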

    MSAF algorithm in DRAGEN TSO 500 ctDNA Analysis Software

    MSAF output files

    The MSAF algorithm outputs results in several files:

    1. Metrics Output File, {SampleID}_MetricsOutput.tsv as Max_Somatic_AF

    2. TMB Max Somatic VAF file, {Sample_ID}.tmb.msaf.csv as MaxSomaticAF, using the same file format as the TMB Trace File.

    Note: Run QC metrics are generated inside the TSO 500 pipeline from InterOp files in the run folder. When starting TSO 500 ctDNA from FASTQ (via auto-launch or manual launch), Run QC metrics are not generated.

    When the auto-launched analysis completes, the analysis results include output from both the BCL Convert pipeline and the TSO 500 ctDNA pipeline.

    Introduction to DRAGEN TSO 500 ctDNA Analysis Software v2.6

    The Illumina DRAGEN TruSight Oncology 500 (TSO 500) ctDNA Analysis Software supports analysis for DNA libraries that are isolated from plasma and prepared using TruSight Oncology 500 ctDNA v2 and v1 assays. The software, which can be run on the DRAGEN Server, on Illumina Connected Analytics, and as an analysis application on NovaSeq 6000Dx, produces a variant call file (VCF) for small variants. Other outputs include tumor mutational burden (TMB), a Jensen-Shannon divergence (sum JSD) score that can be used for evaluating microsatellite instability (MSI) status, files with copy number variants (CNV), fusions as well as coverage reports.

    The secondary analysis software starts from a sequencing run folder containing base call files (BCL) or from FASTQ files staged in a FASTQ folder.

    This document provides information on installation, configuration, running, troubleshooting as well as analysis algorithms of DRAGEN TruSight Oncology 500 ctDNA analysis software on Illumina Connected Analytics, standalone DRAGEN server, and the NovaSeq 6000Dx analysis application.

    The software is optimized for analyzing sequencing outputs generated by the TruSight Oncology 500 ctDNA v2 and v1 assays. Modification might lead to inaccurate data and is a violation of the Illumina Software Subscription Agreement.

    Variant reporting by DRAGEN TruSight™ Oncology 500 ctDNA Analysis Software is limited by a manifest file and a block list file. The manifest file excludes regions where the probe set does not effectively capture targets, and the block list file excludes specific positions from variant calling. TSO 500 ctDNA assay probes target at least 97% of the CDS of 474 genes. Please contact your local Illumina representative for more information if needed.

    Scope

    This resource provides information on installation, configuration, running, troubleshooting and analysis algorithms for the following software:

    Software
    Versions

    The content is applicable to all 3 software versions unless otherwise specified. The content related to setting up and running the analysis on ICA is only relevant to software versions v2.6.0 and v2.6.1. The content related to the analysis application on NovaSeq 6000Dx is only relevant to v2.6.1.

    Local and Cloud Deployments

    Local analysis is available using a standalone DRAGEN server or an application with a user interface on NovaSeq 6000Dx. The software on the standalone DRAGEN server allows for analysis on a single DRAGEN server or splitting across multiple servers.

    Cloud analysis is available on Illumina Connected Analytics with auto-launch or manual launch. Both methods are available from BCLs and FASTQs.

    Instrument Compatibility

    DRAGEN TruSight Oncology 500 ctDNA analysis software is compatible with data generated on the Illumina instruments as summarized in the table below.

    Instrument
    Illumina Connected Analytics
    Standalone DRAGEN Server
    Paired DRAGEN server
    On-board DRAGEN

    Navigation of Guide

    This resource provides information on installation, configuration, running, troubleshooting as well as analysis algorithms of DRAGEN TruSight Oncology 500 ctDNA analysis software on Illumina Connected Analytics, standalone DRAGEN server, and the NovaSeq 6000Dx analysis application.

    Workflow Diagram

    Metrics Output

    File Name: MetricsOutput.tsv

    The metrics output file is a final combined metrics report that provides sample status, key analysis metrics, and metadata in a tab-separated values (TSV) file. Sample metrics within the report indicate guideline‑suggested lower specification limits (LSL) and upper specification limits (USL) for each sample in the run.

    One metrics output file is generated for the entire run. An additional file is generated for each sample.


    All metrics and guidelines are applicable to all versions of DRAGEN TSO 500 ctDNA analysis software (v2.1 and above).

    Run Metrics

    Run metrics from the analysis module indicate the quality of the sequencing run.

    Review the following metrics to assess run data quality:

    Metric
    Description
    Recommended Threshold

    The values in the Run Metrics section are listed as NA in the following situations:

    • The analysis was started from FASTQ files.

    • The analysis was started from BCL files and the InterOp files are missing or corrupt.

    • [NovaSeqX Plus only] There is no PCT_PF_READS value in NovaSeqX Plus runs, so the PCT_PF_READS value will always be NA.

    Sample QC Metrics

    Review the following metrics to assess sample data quality:

    Metric
    Description
    Recommended Threshold
    Variant Class

    *The recommended threshold of 0.059 for GENE_SCALED_MAD only applies to real cell‑free DNA.
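A quick way to see which variant classes a sample's metrics affect is to compare each value with its guideline limit. The sketch below uses the sample QC limits listed in this guide. The dictionary layout and function name are assumptions, and parsing MetricsOutput.tsv is left out because its exact layout is not described here.

```python
import operator

# Guideline limits as listed in this guide: (comparison, limit, affected variant class).
SAMPLE_QC_GUIDELINES = {
    "CONTAMINATION_SCORE": (operator.le, 1227, "All"),
    "MEDIAN_EXON_COVERAGE": (operator.ge, 1300, "Small variant, TMB, fusion, MSI"),
    "PCT_EXON_1000X": (operator.ge, 80.0, "Small variant, TMB"),
    "GENE_SCALED_MAD": (operator.le, 0.059, "CNV"),  # applies to real cell-free DNA only
    "MEDIAN_BIN_COUNT_CNV_TARGET": (operator.ge, 6.0, "CNV"),
}


def failed_sample_metrics(metrics):
    """Return {metric name: affected variant class} for metrics outside their guideline.

    metrics: dict of metric name -> numeric value; metrics not supplied are skipped.
    """
    failed = {}
    for name, (within_limit, limit, variant_class) in SAMPLE_QC_GUIDELINES.items():
        if name in metrics and not within_limit(metrics[name], limit):
            failed[name] = variant_class
    return failed
```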

    For troubleshooting information, refer to Troubleshooting.

    Input Checks

    Items to check before submitting your analysis

    Input Validation Steps

    In addition to sample sheet validation, before starting an analysis, the software performs the following checks:

    • Resource bundle integrity

    • DRAGEN license validity

    • NEW in v2.6.3! Instrument Type determination

    Instrument type is required by the software because different instrument platforms produce output with distinct systematic noise profiles. DRAGEN TSO 500 ctDNA analysis software requires instrument-specific baseline noise files. Before kicking off analysis, the software determines the instrument type in order to select which baseline files to use. This means that even when starting from FASTQ, all samples in the run must be sequenced on the same type of instrument.

    Instrument Type Determination for Baselines

    When starting analysis from BCL files, the software references the RunInfo.xml file in the Run folder to identify the sequencing instrument type. For analyses beginning with FASTQ files, the software reads the FASTQ file headers for this information.

    Instrument Serial Number Prefix
    Instrument Type

    For analysis starting from FASTQ files, if a mixed instrument type is detected, the software will exit with an error.
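The following sketch illustrates how an instrument type could be derived from FASTQ read headers and how a mixed run could be rejected. It uses the serial number prefixes listed in this guide and the standard Illumina header layout, in which the instrument ID is the first colon-separated field. Longest-prefix matching, the function names, and treating an unmatched ID as NA alongside other types are assumptions, not the software's documented logic.

```python
# Serial number prefixes as listed in this guide.
PREFIX_TO_INSTRUMENT = {
    "A": "NovaSeq6000",
    "ADX": "NovaSeq6000Dx",
    "VH": "NextSeq1k2k",
    "LH": "NovaSeqXPlus",
    "LL": "NovaSeqX",
}


def instrument_type_from_header(header):
    """Map one FASTQ header line (starting with '@') to an instrument type."""
    instrument_id = header.lstrip("@").split(":", 1)[0]
    # Assumption: try longer prefixes first so that "ADX..." is not matched by "A".
    for prefix in sorted(PREFIX_TO_INSTRUMENT, key=len, reverse=True):
        if instrument_id.startswith(prefix):
            return PREFIX_TO_INSTRUMENT[prefix]
    return "NA"


def instrument_type_for_run(headers):
    """Return the single instrument type for a run, or raise if types are mixed."""
    types = {instrument_type_from_header(h) for h in headers}
    if len(types) > 1:
        raise ValueError(f"mixed instrument types detected: {sorted(types)}")
    return types.pop() if types else "NA"
```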

    The Combined Variant Output File will include the instrument type.

    Systematic Noise files for certain variant types depend on library prep kit, instrument type, or both. The software will select the correct set of resource files (systematic noise, panel of normals) according to its determination of instrument type and library prep kit. VCF headers contain details on which resource files were used in the analysis.

    • Library Preparation Kit: TSO500 ctDNA v2 (index length: 10)

      • Instrument Types:

        • NovaSeq 6000, NovaSeq 6000Dx: Use one set of baselines.

    If the instrument type cannot be determined (ie, the serial number does not match any listed prefix), the software sets the instrument type to NA and applies the baselines for NovaSeq 6000 and NovaSeq 6000Dx.

    If the software cannot detect the Library Prep Kit due to a missing index in the sample sheet, it defaults to TSO500 ctDNA v2.

    Contamination

    The contamination score evaluates presence of sample-to-sample contamination. The algorithm uses common germline SNPs in the homozygous state expected to have variant allele frequencies (VAF) at 0% and 100%. In contaminated samples, the VAFs shift away from the expected values allowing the detection of sample-to-sample contamination.

    The contamination score can detect sample-to-sample contamination greater than or equal to 0.4% (more than 0.4% of DNA input is coming from the contaminant).

    Contamination Score Calculation

    The contamination score is calculated using the SNP error file and Pileup file that are generated during the small variant calling, as well as the TMB trace file. The algorithm includes the following steps:

    • All positions that overlap with a pre-defined set of common SNPs that have variant allele frequencies of < 25% or > 75% are collected (only SNPs are considered; indels are excluded).

    • Variants in CNV events are removed using a clustering method

    • The likelihood that the positions are an error or a real mutation is calculated by:

    CONTAMINATION_SCORE = sum(log10(P(vi is False Positive)))

    Contamination Score Interpretation

    • The contamination score is output in the metrics output file, MetricsOutput.tsv

    • If a contamination score is equal to or below 1227 (the upper specification limit provided in the "USL Guideline" field in the metrics output file, see the Metrics Output page), the sample has less than 0.4% sample-to-sample contamination.

    • If a contamination score is above 1227, the sample has more than 0.4% sample-to-sample contamination. In this case, an estimate of the contamination can be obtained from the PCT_CONTAMINATION_EST metric (see the DNA Expanded Metrics page for details). PCT_CONTAMINATION_EST is not valid unless the contamination score exceeds 1227.

    Samples with highly rearranged genomes (HRD samples) can have variants with VAFs that shift away from the expected frequencies due to genomic rearrangement, which can lead to false-positive contamination scores.

    • Visual examination can help determine if a shift of VAFs is due to true contamination

    How to build a VAF plot for visual examination

    1. To build a VAF plot, use the {Sample_ID}.tmb.trace.csv file. Filter to only germline variants (for example, by using tags "Germline_DB" and "Germline_Proxi" in the column "Status") and use values in the VAF column.

    2. Select Scatter from the Charts menu.

    3. Review the plot as described above, analyzing whether variants are scattered or clustered around 50% and 100% VAF.
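As an alternative to filtering in a spreadsheet, step 1 above can be scripted. The sketch below reads the trace file rows and collects the VAF values of germline variants. The "Status" and "VAF" column names and the two germline tags come from the text; the delimiter, the substring match on Status, and the function name are assumptions. The returned list can then be passed to any plotting tool.

```python
import csv

GERMLINE_TAGS = ("Germline_DB", "Germline_Proxi")


def germline_vafs(lines, delimiter=","):
    """Collect VAF values for rows whose Status contains a germline tag.

    lines: an open text file or any iterable of lines from the TMB trace file.
    delimiter: assumed to be a comma; pass "\t" for a tab-separated file.
    """
    vafs = []
    for row in csv.DictReader(lines, delimiter=delimiter):
        if any(tag in row["Status"] for tag in GERMLINE_TAGS):
            vafs.append(float(row["VAF"]))
    return vafs
```

Typical usage would be `with open(path, newline="") as handle: vafs = germline_vafs(handle)`, where `path` points to the trace file for one sample.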

    Getting Started on Illumina Connected Analytics

    Prerequisites

    Illumina Connected Analytics (ICA) subscription includes access to DRAGEN TruSight Oncology 500 ctDNA Analysis Software. To get started, you need:

    • An ICA account with a valid subscription

    • A positive balance of iCredits for data storage

    Refer to the Software Setup page for information on how to register an ICA subscription and iCredits.

    Analysis Launch on Standalone DRAGEN Server

    Start the DRAGEN TruSight Oncology 500 ctDNA Analysis Software with the DRAGEN_TSO500_CTDNA-2.6.0.sh Bash script. The script is installed in the /usr/local/bin directory. The Bash script is executed on the command line and runs the software with Docker (or Apptainer if specified).

    For arguments, refer to Command-Line Options. You can start from BCL files or from the FASTQ folder produced by BCL Convert. The following requirements apply for both methods:

    • Path to the sequencing run or FASTQ folder. Copy the run or FASTQ folder to the DRAGEN server into the staging folder with the following recommended organization: /staging/runs/{RunID}. You can copy the run folder onto the DRAGEN server using Linux commands such as rsync.

    Sample Sheet Introduction

    Overview

    A sample sheet is required for each analysis with DRAGEN TruSight Oncology 500 ctDNA Analysis Software. A sample sheet is a comma-separated values (*.csv) file used by Illumina instruments, platforms, and analysis pipelines to store settings and data for sequencing and analysis. The DRAGEN TruSight Oncology 500 ctDNA Analysis Software is compatible with sample sheet v2. For general information on sample sheet v2, refer to Illumina Connected Software - Sample Sheet.

    The sample sheet includes a list of samples and their index sequences, along with additional information required to run DRAGEN TruSight Oncology 500 ctDNA Analysis Software. For example, the library prep kit used for analysis will need to be listed in the sample sheet. Appropriate index adapter sequences are determined by the assay used to perform analysis.

    Sample QC

    DRAGEN TruSight Oncology 500 uses QC metrics to assess the validity of analysis for DNA libraries that pass contamination quality control. If the library fails one or more quality metrics, then the corresponding variant type or biomarker is not reported, and the associated QC category in the report header displays FAIL.

    DNA library QC results are available in the MetricsOutput.tsv file. Refer to Metrics Output for details.

    Metric
    Description
    Recommended Guideline Quality Threshold
    Variant Class
    Spaces must be preceded and followed by an alphanumeric character.
    Spaces, underscores, and dashes must be preceded and followed by an alphanumeric character.
    Gene is part of the copy number variant gene list
    Genes separated by a dash (-) indicate that the fusion directionality could be determined. Genes separated by a slash (/) indicate that the fusion directionality could not be determined.

    NextSeq 1000/ 2000: Use a different set of baselines.

  • NovaSeqX, NovaSeqXPlus: Use another set of baselines.

  • Library Preparation Kit: TSO500 ctDNA (index length: 8)

    • A single set of baselines applies to all instrument types.

    Instrument Serial Number Prefix | Instrument Type
    A | NovaSeq6000
    ADX | NovaSeq6000Dx
    VH | NextSeq1k2k
    LH | NovaSeqXPlus
    LL | NovaSeqX

    Combined Variant Output
    Software Setup page
    When running analysis on a standalone DRAGEN server or on ICA, a valid sample sheet can be created by:
    • BaseSpace Run Planner (preferred), see Sample Sheet Creation in BaseSpace Run Planner page for details

    • Downloading and modifying a sample sheet template following the requirements, see Sample Sheet Requirements page for details

    When running analysis using a NovaSeq 6000Dx Analysis Application, a valid sample sheet can be created by:

    • Using the user interface of the DRAGEN TruSight Oncology 500 ctDNA Analysis Application, see Run Planning on Illumina Run Manager for details

    • Downloading and modifying a sample sheet template following the requirements (see Sample Sheet Requirements page for details), then importing it to Illumina Run Manager.

    The run set up section of this guide includes specific instructions to plan a run and set up a valid sample sheet for each deployment of DRAGEN TruSight Oncology 500 ctDNA Analysis Software.

    Illumina Connected Software - Sample Sheet

    Estimating the error rate per sample

  • Counting mutation support

  • Counting total depth

  • The contamination score is calculated as the sum of all the log likelihood scores across the pre-defined SNP positions whose minor allele frequency is <25% in the sample and not likely due to CNV events:

  • Metrics Output page
    DNA Expanded Metrics page
    Visual investigation of VAFs across the genome can help determine if a shift of VAFs is due to true contamination

    Analysis Methods

    The software processes sequencing data to perform quality control, detect variants, determine tumor mutational burden (TMB), microsatellite instability (MSI) status, and report results. The following sections describe the analysis methods used in DRAGEN TruSight Oncology 500 ctDNA Analysis Software.

    DRAGEN TruSight Oncology 500 ctDNA Analysis Software uses the following workflows to analyze sequencing data.

    • FASTQ Generation

    • DNA Analysis

      • Contamination Detection

      • Copy Number Variant (CNV) Calling

      • DNA Alignment and Read Collapsing

      • Fusion Calling

      • Fusion Filtering

      • Indel realignment and read stitching

      • Max Somatic AF

      • Microsatellite Instability (MSI) Status

      • Phased Variant Calling

      • Small Variant Calling

      • Small Variant Filtering

      • Tumor Mutational Burden (TMB) Scoring

      • Variant Merging

    • Quality Control

      • Run QC

      • DNA Sample QC

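    Instrument | Illumina Connected Analytics | Standalone DRAGEN Server | Paired DRAGEN server | On-board DRAGEN
    NovaSeq X | Yes (v2.6.0.25, v2.6.1.8, v2.6.3) | Yes | N/A | No
    NextSeq2K | Yes (v2.6.3) | Yes | N/A | No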

    DRAGEN TruSight Oncology 500 ctDNA Analysis Software on Illumina Connected Analytics (ICA)

    • v2.6.0.25

    • v2.6.1.8

    • v2.6.3

    DRAGEN TruSight Oncology 500 ctDNA Analysis Software (for standalone DRAGEN server)

    • v2.6.0

    • v2.6.1

    • v2.6.2

    • v2.6.3

    DRAGEN TruSight Oncology 500 ctDNA Analysis Application on NovaSeq 6000Dx (uses a paired DRAGEN server)

    • v2.6.0

    • v2.6.1

    • v2.6.3

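    Instrument | Illumina Connected Analytics | Standalone DRAGEN Server | Paired DRAGEN server | On-board DRAGEN
    NovaSeq 6000 | Yes (v2.6.0.25, v2.6.1.8, v2.6.3) | Yes | N/A | N/A
    NovaSeq 6000Dx (RUO mode) | Yes (v2.6.0.25, v2.6.1.8, v2.6.3) | Yes | Yes (v2.6.0, v2.6.1, v2.6.3) | N/A

    block list file
    Workflow Overview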

    Run Metrics:

    Metric | Description | Recommended Threshold
    PCT_PF_READS | Percentage of reads on the sequencing flow cell that pass the filter. | ≥ 55.0 (No lower specification limit for NovaSeq X Plus)
    PCT_Q30_R1 | Percentage of bases with a quality score ≥ 30 from Read 1. | ≥ 80.0 (≥ 85.0 for NovaSeq X Plus)
    PCT_Q30_R2 | Percentage of bases with a quality score ≥ 30 from Read 2. | ≥ 80.0 (≥ 85.0 for NovaSeq X Plus)

    Sample QC Metrics:

    Metric | Description | Recommended Threshold | Variant Class
    CONTAMINATION_SCORE | The contamination score based on VAF distribution of SNPs. | ≤ 1227 | All
    MEDIAN_EXON_COVERAGE | Median exon fragment coverage across all exon bases. | ≥ 1300 | Small variant, TMB, fusion, MSI
    PCT_EXON_1000X | Percent exon bases with 1000X fragment coverage. | ≥ 80.0 | Small variant, TMB
    GENE_SCALED_MAD | The median of absolute deviations normalized by gene fold change. | ≤ 0.059* | CNV
    MEDIAN_BIN_COUNT_CNV_TARGET | The median raw bin count per CNV target. | ≥ 6.0 | CNV

    Troubleshooting

    ≤ 1227

    All

    MEDIAN_EXON_COVERAGE

    Median exon fragment coverage across all exon bases.

    ≥ 1300

    Small variant, TMB, Fusion, MSI

    PCT_EXON_1000X

    Percent exon bases with 1000x fragment coverage.

    ≥ 80.0

    Small variant, TMB

    GENE_SCALED_MAD

    The median of absolute deviations normalized by gene fold change.

    ≤ 0.059*

    CNV

    MEDIAN_BIN_COUNT_CNV_TARGET

    The median raw bin count per CNV target.

    ≥ 6.0

    CNV

    CONTAMINATION_SCORE

    Metrics Output

    The contamination score is based on VAF distribution of SNPs.

    rsync
    . The sample sheet within the run folder is used unless otherwise specified through the command line.
  • Run folder must be intact. Refer to Starting from BCL Files for input requirements.

  • If the analysis output folder path is different from the default, provide the analysis output folder path. Refer to Command-Line Options.

    Before running the analysis, confirm that the output directory for the software to write to is empty and does not include results of previous analyses.

    Storage Requirements

    For optimal performance, run analysis on data stored locally on the DRAGEN server. Analysis of data stored on NAS can take longer and performance can be less reliable.

    The DRAGEN server provides an NVMe SSD in the /staging directory to use as the software output directory. Network-attached storage is required for long-term storage.

    When running the DRAGEN TruSight Oncology 500 ctDNA Analysis Software, use the default settings or set the --analysisFolder command-line option to a directory in /staging to make sure the DRAGEN server processes read and write data on the NVMe SSD.

    Before beginning analysis, develop a strategy to copy data from the DRAGEN server to network-attached storage. Delete output data on the DRAGEN server as soon as possible.

    The following are the run and analysis output sizes for each sequencing system per 101 bp:

    Sequencing System
    Run Folder Output (Gb)
    Analysis Output (Gb)
    Minimum Disk Space (Gb)

    NovaSeq 6000/6000Dx (RUO) SP Flow Cell

    85-100

    250-374

    300

    NovaSeq 6000/6000Dx (RUO) S1 Flow Cell

    164-200

    360-665

    800

    NovaSeq 6000/6000Dx (RUO) S2 Flow Cell

    290-460

    When launching the analysis, the software checks that the minimum disk space required is available. If the minimum disk space is not available, the software shows an error message and prevents analysis from starting. If disk space is exhausted during a run, the run shows an error and stops analyzing.
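The pre-launch checks described above can be approximated before starting an analysis. The following is a minimal sketch, not the software's own implementation: the dictionary keys are illustrative labels, the minimum values are copied from the storage requirements table for the flow cells whose rows are unambiguous, and the unit printed as "Gb" is assumed to mean 10**9 bytes.

```python
import shutil
from pathlib import Path

# Minimum disk space per flow cell, copied from the storage requirements table.
# Keys are illustrative labels; the "Gb" unit is treated as 10**9 bytes (assumption).
MIN_DISK_SPACE = {
    "NovaSeq 6000 SP": 300,
    "NovaSeq 6000 S1": 800,
    "NovaSeq 6000 S4": 3000,
    "NovaSeq X 1.5B": 800,
    "NovaSeq X 10B": 3000,
    "NovaSeq X 25B": 6000,
    "NextSeq 2000 P4": 500,
}


def has_enough_space(free_bytes: int, flow_cell: str) -> bool:
    """Compare free bytes against the tabulated minimum for one flow cell."""
    return free_bytes >= MIN_DISK_SPACE[flow_cell] * 10**9


def output_dir_is_empty(path: Path) -> bool:
    """True when the output directory does not exist yet or has no entries."""
    return not path.exists() or not any(path.iterdir())


def preflight(staging: Path, output_dir: Path, flow_cell: str) -> list[str]:
    """Collect readable problems; an empty list means both checks passed."""
    problems = []
    if not has_enough_space(shutil.disk_usage(staging).free, flow_cell):
        problems.append(f"less than {MIN_DISK_SPACE[flow_cell]} Gb free on {staging}")
    if not output_dir_is_empty(output_dir):
        problems.append(f"{output_dir} already contains files")
    return problems
```

For example, `preflight(Path("/staging"), Path("/staging/my_analysis"), "NovaSeq 6000 S4")` returns an empty list when both conditions hold. The software performs its own disk space check at launch regardless.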

    Moving or modifying files during an analysis may cause the analysis to fail or provide incorrect results.

    Command-Line Options

    Multi-Assay Flow Cells

    How to plan a run and analyze data with multiple assays loaded individually per lane on one flow cell.

    Introduction

    The NovaSeq X platform supports loading samples from different assays into different lanes within a single sequencing run. To ensure compatibility, the DRAGEN TSO 500 ctDNA v2.6.3 analysis software supports sample sheets containing multiple data sections, with one section per assay (for example, TSO500S_Data and TSO500L_Data).

    This section describes how the software validates and processes multi-assay sample sheets and details the specific rules and logic applied during validation.

    Planning a Run with a Multi-Assay Flow Cell

    1. In the BaseSpace Sequence Hub home page, click "Runs" -> "New Run" -> "Run Planning".

    2. In the "Create a Run" page, select "NovaSeq X Series" as instrument platform.

    3. Provide information for the first assay until the "Run Review" page.

    Analyzing Data from a Multi-Assay Flow Cell

    Sample Sheet Validation

    The sample sheet validator determines which data section(s) to validate based on the active workflow type. The list of samples to process is generated only from the relevant section (for example, Solid or ctDNA). This ensures that only valid samples for the selected workflow are validated.

    The workflow’s data section may contain fewer samples than the BCLConvert_Data section. However, all samples listed in the workflow data section must exist in the BCLConvert_Data section. The downstream pipeline will only process samples found in the workflow-specific data section.
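The parity requirement amounts to a subset check. The sketch below is illustrative only: the function name is hypothetical, and the case-insensitive comparison is an assumption based on the statement elsewhere in this guide that sample IDs are not case sensitive.

```python
def missing_from_bclconvert(workflow_ids, bclconvert_ids):
    """Return workflow sample IDs that have no entry in BCLConvert_Data.

    An empty list means the parity requirement is met. The workflow section
    may list fewer samples than BCLConvert_Data, so extra BCLConvert_Data
    entries are not reported.
    """
    known = {sample_id.lower() for sample_id in bclconvert_ids}
    return [sample_id for sample_id in workflow_ids if sample_id.lower() not in known]
```

With `workflow_ids = ["S1", "S3"]` and `bclconvert_ids = ["S1", "S2"]`, the function reports `["S3"]`, which would correspond to a validation failure.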

    Library Prep Kits

    Multi-assay sample sheets may include multiple library prep kit values separated by semicolons (";").
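As a rough illustration of how a semicolon-separated kit field could be checked, the sketch below applies the rule stated in this guide for the Solid workflow (exactly one value from TSO500, TSO500HT, or TSO500_v2) and skips the check for ctDNA. The function name and the workflow labels passed as strings are assumptions for illustration, not part of the software.

```python
SOLID_KITS = {"TSO500", "TSO500HT", "TSO500_v2"}


def validate_library_prep_kit(field: str, workflow: str):
    """Return the single matching Solid kit, or None for the ctDNA workflow."""
    if workflow == "ctDNA":
        # The kit rule is skipped for ctDNA, so nothing is selected.
        return None
    values = [value.strip() for value in field.split(";") if value.strip()]
    matches = [value for value in values if value in SOLID_KITS]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one Solid kit, found {matches}")
    return matches[0]
```

For instance, `validate_library_prep_kit("TSO500HT;OtherKit", "Solid")` selects `TSO500HT`, where `OtherKit` is a placeholder for a second assay's kit name.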

    Validation Logic:

    Workflow Type
    Validation Logic
    Example of Accepted Values

    Adapter Reads

    When working with multiple assays, the "AdapterRead" field in the sample sheet may include multiple adapter sequences separated by "+". The adapter read of the selected workflow will be filled in the intermediate sample sheet.

    Override Cycles

    Override cycles are determined by the index and read length specified in the "Reads" section of the RunInfo.xml file and assay specifics. For multi-assay flow cells, the index and read length provided in RunInfo.xml should be the longer of the sequencing cycles required by the two assays. Override cycles for assays requiring shorter indexes and read lengths will be padded to match the cycle lengths used in the run.
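The padding idea can be sketched with the BCL Convert OverrideCycles notation, in which each segment is a run of letter and count pairs (Y for read cycles, I for index cycles, U for UMI cycles, N for skipped cycles). This is a simplified illustration under stated assumptions: the function names are hypothetical, and the padding is always appended at the end of a segment, whereas the actual placement used by the Sample Sheet Validator may depend on read orientation.

```python
import re


def pad_segment(segment: str, run_cycles: int) -> str:
    """Pad one OverrideCycles segment with skipped (N) cycles up to run_cycles."""
    used = sum(int(count) for count in re.findall(r"[A-Z](\d+)", segment))
    if used > run_cycles:
        raise ValueError(f"{segment} needs {used} cycles but the run has {run_cycles}")
    padding = run_cycles - used
    return segment if padding == 0 else f"{segment}N{padding}"


def pad_override_cycles(override_cycles: str, run_cycle_counts) -> str:
    """Pad every semicolon-separated segment to the cycle counts from RunInfo.xml."""
    segments = override_cycles.split(";")
    if len(segments) != len(run_cycle_counts):
        raise ValueError("segment count does not match the number of reads in the run")
    return ";".join(pad_segment(s, n) for s, n in zip(segments, run_cycle_counts))
```

An assay that needs 8 index cycles on a run sequenced with 10 index cycles would go from `I8` to `I8N2` under this scheme.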

    BCL Convert Settings and Data

    Starting from BCL:

    • Sample Sheet Validator recalculates and inserts the correct override cycles into the BCLConvert_Data section of the intermediate sample sheet.

      • If index lengths vary, the logic has been enhanced to automatically pad shorter indexes with "N" to standardize cycle lengths.

    • Sample Sheet Validator also writes in values for: Adapter Reads, MaskShortReads, AdapterBehavior, MinimumTrimmedReadLength. These values are based on the TSO 500 ctDNA library prep kit determined from the input sample sheet.

    Starting from FASTQ:

    • If users run BCL Convert separately to generate FASTQ files:

      • The input sample sheet must already contain valid override cycles, adapter reads, and other BCL Convert Settings.

      • The resulting TSO 500 ctDNA workflow run will start from FASTQ input files.

    BaseSpace Autolaunch

    Analyses of samples in multi-assay flow cells can be launched simultaneously from BaseSpace. As long as "Starts from FASTQ" is set to True in Analysis Settings, all samples from a multi-assay flow cell will be demultiplexed first to generate FASTQ files. Then, the ICA pipelines (e.g. DRAGEN TSO 500, DRAGEN TSO 500 ctDNA) will be launched simultaneously to process the samples for each of the assays.

    For more information about auto-launching BCL Convert before starting TSO 500 analysis from FASTQ, see

    Summary of Rule Changes

    Component
    Update Summary
    Effect

    Sample Sheet Requirements

    DRAGEN TSO 500 ctDNA Analysis Software has optional and required fields in addition to the general sample sheet requirements. Follow the steps below to create a valid sample sheet.

    The TSO500L data section header depends on the deployment:

    • Standalone DRAGEN Server and ICA with Manual Launch: TSO500L_Data

    • ICA with Auto-launch: Cloud_TSO500L_Data

    [TSO500L_Data] Section

    Parameter
    Required
    Details

    To ensure a successful analysis, follow these guidelines:

    1. Avoid any blank lines at the end of the sample sheet; these can cause the analysis to fail.

    2. When running local analysis using the command line, save the sample sheet in the sequencing run folder with the default name SampleSheet.csv, or choose a different name and specify the path in the command-line options.
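Trailing blank lines are easy to miss in a spreadsheet export. The helper below is a small sketch (the function names are hypothetical) that counts and removes whitespace-only lines at the end of sample sheet text. It assumes that a single final newline after the last row is acceptable and does not treat comma-only rows as blank.

```python
def count_trailing_blank_lines(text: str) -> int:
    """Count whitespace-only lines at the end of the sample sheet text."""
    count = 0
    for line in reversed(text.splitlines()):
        if line.strip():
            break
        count += 1
    return count


def strip_trailing_blank_lines(text: str) -> str:
    """Drop whitespace-only lines at the end, keeping one final newline."""
    lines = text.splitlines()
    while lines and not lines[-1].strip():
        lines.pop()
    return "\n".join(lines) + "\n" if lines else ""
```

Reading SampleSheet.csv, passing its contents through `strip_trailing_blank_lines`, and writing the result back is one way to apply guideline 1 before launching.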

    ICA with Auto-launch: Sample Sheet Requirements

    Refer to the following requirements to create sample sheets for running the analysis on ICA with Auto-launch. For sample sheet requirements common between deployments see . Sample sheets can be created using the BaseSpace Run Planning Tool or manually by downloading and editing a sample sheet template.

    To auto-launch analysis from the sequencer run folder, ensure the StartsFromFastq and SampleSheetRequested fields are set to FALSE. To auto-launch analysis from FASTQs after BCL Convert auto-launch, the StartsFromFastq and SampleSheetRequested fields must be set to TRUE.
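The two fields are expected to agree with each other. As a minimal sketch (the function name and the returned labels are hypothetical), a sample sheet preparation script could check the pair like this:

```python
def autolaunch_input(starts_from_fastq: str, sample_sheet_requested: str) -> str:
    """Map the two settings to the input type the auto-launch starts from."""
    pair = (starts_from_fastq.strip().upper(), sample_sheet_requested.strip().upper())
    if pair == ("FALSE", "FALSE"):
        return "BCL"    # auto-launch from the sequencer run folder
    if pair == ("TRUE", "TRUE"):
        return "FASTQ"  # auto-launch after BCL Convert auto-launch
    raise ValueError(f"StartsFromFastq and SampleSheetRequested disagree or are invalid: {pair}")
```

Mixed values such as TRUE and FALSE raise an error in this sketch, because the guidance above only describes the two matching combinations.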

    [Cloud_TSO500L_Data] Section

    Refer to for this section's requirements.

    [Cloud_TSO500L_Settings] Section

    Parameters
    Required
    Details

    [Cloud_Data] Section

    Parameter
    Required
    Details

    [Cloud_Settings] Section

    Parameter
    Required
    Details

    NovaSeq 6000Dx Analysis Application: Sample Sheet Requirements

    This section describes fields specific to sample sheets for the NovaSeq 6000Dx Analysis Application. For more information on DRAGEN TSO 500 ctDNA Analysis Software sample sheet requirements, refer to the sections above.

    Mismatches between the samples and index primers can cause incorrect results due to loss of positive sample identification. Enter sample IDs and assign indexes in the sample sheet before beginning library preparation. Record sample IDs, indexes, and plate well orientation for reference during library preparation.

    [BCLConvert_Settings] Section

    Parameter Name
    Required

    Sample Sheet Templates

    Sample Sheet templates for TSO 500 ctDNA standalone DRAGEN server and ICA manual launch analysis can be found in the table below. For auto-launch compatible sample sheets, use BaseSpace Run Planner.

    DRAGEN TSO 500 ctDNA analysis software is compatible with several instruments and assay workflows, each of which has implications for the sample sheet.

    Sample sheet templates contain all required fields, including index sequences in the proper orientation for all indexes from a given library prep kit. The templates are provided as a starting point for creating a sample sheet manually when launching analysis on a standalone DRAGEN server or on ICA using manual launch.

    For interactive run planning or to create a sample sheet for ICA Autolaunch, use BaseSpace Run Planner to create valid sample sheets for either local or cloud analysis. To set up a run in BaseSpace Run Planner, refer to Sample Sheet Creation in BaseSpace Run Planner.

    Users can visit the section to learn additional details on required fields and values as they fill in sample information. Use the lookup tables below to select and download the sample sheet template that matches your instrument, assay, and workflow configuration.

    Note: Templates are not instrument-specific. Users can fill in InstrumentPlatform and InstrumentType according to the table provided.

    v2.6.1-2.6.2

    Assay
    Instrument
    Assay Workflow
    File

    v2.6.3

    Assay
    Instrument
    Assay Workflow
    File

    Instrument Lookup

    Use the following InstrumentPlatform values, based on Instrument value, to replace the placeholder value in the above templates:

    Instrument
    InstrumentPlatform

    Installation of NovaSeq 6000Dx TSO 500 ctDNA Analysis Application

    Instructions to install DRAGEN TSO 500 ctDNA Analysis Application on NovaSeq 6000Dx (RUO mode)

    Prerequisites

    The following requirements must be met to install and run DRAGEN TruSight Oncology 500 ctDNA Analysis Application on NovaSeq 6000Dx:

    • A NovaSeq 6000Dx sequencing instrument with Illumina Run Manager v1.6.2 installed and paired with DRAGEN Server v4.

    • Administrator privileges on Illumina Run Manager.

    Installation Instructions

    Perform the following steps to download and install the DRAGEN TruSight Oncology 500 ctDNA Analysis Application on NovaSeq 6000Dx installation package:

    1. Contact Illumina Customer Care for the link to the installation package. The link expires after 7 days.

    2. Download the installation package with the link provided in the email from Illumina Customer Care. The installation package contains the following files:

      1. DRAGEN IRES file: drageninstaller_<DRAGEN_VERSION>.el8.x86_64_prod.ires

    External Storage Configuration

    Perform the following steps to specify a location to store analysis results:

    1. Log into Illumina Run Manager as an administrator.

    2. From the Settings menu, select External Storage for Analysis Results.

    3. Configure the following parameters:

    The v2.6.3 app uses DRAGEN v3.11.2, which is not designed to be co-installed with earlier DRAGEN versions (e.g. v3.10.18 and below, v4.3.5 and below). When launching analysis after installing another NovaSeq 6000Dx app and its dependent version of DRAGEN (that does not support multi-version installation), the software may sporadically be unable to run. For example, users may encounter an issue if the following steps occur in order:

    1. installation of the DRAGEN TSO 500 ctDNA v2.6.3 app for NovaSeq 6000Dx and its dependent DRAGEN, v3.11.2

    Command-Line Options

    Option
    Required
    Description

    Auto-Launch of DRAGEN TSO 500 ctDNA Analysis on ICA

    Auto-launch Prerequisites and Workflow

    *The BaseSpace Sequence Hub setting for run monitoring and storage must be selected on the instrument to use DRAGEN TSO 500 ctDNA analysis auto-launch. For information on preparing your instrument for DRAGEN TSO 500 ctDNA Auto-launch, refer to the documentation for your instrument.

    Illumina Run Manager: TSO500L_v<APP_VERSION>.iapp

  • Install the DRAGEN version using the IRES file.

    1. Log into Illumina Run Manager as an administrator.

    2. From the Settings menu, select DRAGEN.

    3. Select Add DRAGEN Installer.

    4. Select the DRAGEN IRES file to begin installation.

    5. Check the install versions list to verify that the appropriate DRAGEN version is listed.

  • Install the Illumina Run Manager-compatible application.

    1. Log into Illumina Run Manager as an administrator.

    2. From the left-hand menu, select Applications.

    3. Select the RUO tab, select Install Application, and then select the IAPP file to install the application.

    4. Check the RUO Applications list to verify that the appropriate application version is listed. You can configure application settings such as required users from this list.

  • Server Location: The path of the preferred output file location. A change to the server location may prompt you to enter the domain, user name, and password to the new server location.
    1. In addition to storing the analysis results from any sequencing runs, this location contains the input data created when setting up runs. Make sure that the server location has sufficient storage space for your run.

  • Encryption: Select Require encryption during file transfer.

  • Select Save to preserve the changes to the external storage output parameters.

  • installation of the DRAGEN TSO 500 ctDNA v2.6.1 app* for NovaSeq 6000Dx and its dependent DRAGEN, v3.10.18

  • analysis is initiated using DRAGEN TSO 500 ctDNA v2.6.3 app

  • *or any other app requiring an incompatible DRAGEN version. See Installation of TSO 500 ctDNA v2.6.2, v2.6.3 on Standalone DRAGEN Server for DRAGEN co-installation compatibility.

    See the NovaSeq 6000Dx App Troubleshooting section for a workaround.

    890-1600

    1500

    NovaSeq 6000/6000Dx (RUO) S4 Flow Cell

    800-1200

    2700-4100

    3000

    NovaSeq X 1.5B

    213

    352

    800

    NovaSeq X 10B

    1100

    1800

    3000

    NovaSeq X 25B

    3042

    5177

    6000

    NextSeq 2000

    P4 Flow Cell

    223

    334

    500

    IndexAdapterKitName

    Not Required

    The Index Adapter Kit used.

    Sample_ID

    Required

    The unique ID to identify a sample. The sample ID is included in the output file names. Sample IDs are not case sensitive. Sample IDs must have the following characteristics: - Unique for the run. - 1–40 characters. - No spaces. - Alphanumeric characters with underscores and dashes. If you use an underscore or dash, enter an alphanumeric character before and after the underscore or dash, for example Sample1-T5B1_022515. - Cannot be called all, default, none, unknown, undetermined, stats, or reports. - Must match a Sample_ID listed in the BCL Convert Data section. - Each sample must have a unique combination of Lane (if applicable), sample ID, and index ID or the analysis will fail.

    Sample_Type

    Required

    Enter DNA

    Sample_Description

    Not Required

    Sample description must meet the following requirements: - 1–50 characters. - Alphanumeric characters with underscores, dashes and spaces. If you enter an underscore, dash, or space, enter an alphanumeric character before and after, for example Liquid-Sample_213.
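The character rules for Sample_ID and Sample_Description lend themselves to a quick check before submitting a sample sheet. The following is a minimal sketch with hypothetical function names. It covers only the length, character, and reserved-name rules, and it assumes "alphanumeric" means ASCII letters and digits and that reserved names are matched case-insensitively. Uniqueness within the run and matching against the BCL Convert data section are not covered.

```python
import re

RESERVED_IDS = {"all", "default", "none", "unknown", "undetermined", "stats", "reports"}

# Alphanumeric runs joined by single separators, so every separator has an
# alphanumeric character before and after it.
_ID_PATTERN = re.compile(r"[A-Za-z0-9]+(?:[_-][A-Za-z0-9]+)*")
_DESCRIPTION_PATTERN = re.compile(r"[A-Za-z0-9]+(?:[ _-][A-Za-z0-9]+)*")


def is_valid_sample_id(sample_id: str) -> bool:
    """Check length (1 to 40), allowed characters, and reserved names."""
    return (
        1 <= len(sample_id) <= 40
        and _ID_PATTERN.fullmatch(sample_id) is not None
        and sample_id.lower() not in RESERVED_IDS
    )


def is_valid_sample_description(description: str) -> bool:
    """Check length (1 to 50) and allowed characters, including spaces."""
    return (
        1 <= len(description) <= 50
        and _DESCRIPTION_PATTERN.fullmatch(description) is not None
    )
```

Both examples from the table pass: `Sample1-T5B1_022515` is a valid ID and `Liquid-Sample_213` is a valid description, while an ID such as `Sample_` is rejected because the underscore has no character after it.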

    SoftwareVersion

    Not Required

    The TSO500S software version

    StartsFromFastq

    Required

    Set the value to TRUE or FALSE. To auto-launch from BCL files, set to FALSE. To auto-launch from FASTQ files after auto-launch of BCL Convert, set to TRUE.

    SampleSheetRequested

    Required

    Set the value to TRUE or FALSE.

    To auto-launch from BCL files, set to FALSE. To auto-launch from FASTQ files after auto-launch of BCL Convert, set to TRUE.

    Sample_ID

    Not Required

    The same sample ID used in the Cloud_TSO500L_Data section.

    ProjectName

    Not Required

    The BaseSpace project name.

    LibraryName

    Not Required

    Combination of sample ID and index values in the following format: sampleID_Index_Index2

    LibraryPrepKitName

    Required

    The Library Prep Kit used.

    GeneratedVersion

    Not Required

    The cloud GSS version used to create the sample sheet. Optional if manually updating a sample sheet.

    CloudWorkflow

    Not Required

    Ica_workflow_1

    Cloud_TSO500L_Pipeline

    Required

    This value is a uniform resource name (URN). The valid value is: urn:ilmn:ica:pipeline:850b4fe6-2a2f-4a85-ae57-cf082dfbefd4#DRAGEN_TruSight_Oncology_500_ctDNA_v2_6_1_4

    BCLConvert_Pipeline

    Required

    The value is a URN in the following format: urn:ilmn:ica:pipeline:<pipeline-ID>#<pipeline-name>

    SoftwareVersion

    Required

    Enter the IRM iapp software version 2.6.1-4v2

    Standard Sample Sheet Requirements
    [TSO500L_Data] Section

    TSO 500 ctDNA v2

    NovaSeq 6000Dx (in RUO mode)

    Standard

    SampleSheet-TSO500_ctDNA-v2_6_1-v2_assay-Novaseq6000Dx-Local.csv

    TSO 500 ctDNA v2

    NovaSeq X

    Standard

    SampleSheet-TSO500_ctDNA-v2_6_1-v2_assay-NovaseqX-local.csv

    TSO 500 ctDNA v2

    NovaSeq 6000Dx (in RUO mode)

    Standard

    SampleSheet-TSO500_ctDNA-v2_6_3-v2_assay-i5_fwd.csv

    TSO 500 ctDNA v2

    NovaSeq X

    Standard

    SampleSheet-TSO500_ctDNA-v2_6_3-v2_assay-i5_fwd.csv

    TSO 500 ctDNA v2

    NextSeq 2000

    Standard

    SampleSheet-TSO500_ctDNA-v2_6_3-v2_assay-i5_fwd.csv

    TSO 500 ctDNA v2 w/ v3 indexes

    NovaSeq 6000

    Standard

    SampleSheet-TSO500_ctDNA-v2_6_3-v2_assay-UDv3-i5_rev.csv

    TSO 500 ctDNA v2 w/ v3 indexes

    NovaSeq 6000Dx

    Standard

    SampleSheet-TSO500_ctDNA-v2_6_3-v2_assay-UDv3-i5_fwd.csv

    TSO 500 ctDNA v2 w/ v3 indexes

    NovaSeq X

    Standard

    SampleSheet-TSO500_ctDNA-v2_6_3-v2_assay-UDv3-i5_fwd.csv

    TSO 500 ctDNA v2 w/ v3 indexes

    NextSeq 2000

    Standard

    SampleSheet-TSO500_ctDNA-v2_6_3-v2_assay-UDv3-i5_fwd.csv

    TSO 500 ctDNA v1

    NovaSeq 6000

    Standard

    TSO 500 ctDNA v1

    NovaSeq 6000Dx (in RUO mode)

    Standard

    TSO 500 ctDNA v2

    NovaSeq 6000

    Standard

    TSO 500 ctDNA v1

    NovaSeq 6000

    Standard

    TSO 500 ctDNA v1

    NovaSeq 6000Dx (in RUO mode)

    Standard

    TSO 500 ctDNA v2

    NovaSeq 6000

    Standard

    NovaSeq 6000

    NovaSeq

    NovaSeq 6000Dx (in RUO Mode)

    NovaSeq

    NovaSeq X

    NovaSeqXSeries

    NextSeq 2000

    NextSeq1k2k

    Sample Sheet guidelines
    Instrument Lookup
    SampleSheet-TSO500_ctDNA-v2_6_1-v1_assay-Novaseq6000-Reverse_Complement-Local.csv
    SampleSheet-TSO500_ctDNA-v2_6_1-v1_assay-Novaseq6000Dx-Local.csv
    SampleSheet-TSO500_ctDNA-v2_6_3-v1_assay-i5_rev.csv
    SampleSheet-TSO500_ctDNA-v2_6_3-v1_assay-i5_fwd.csv
    SampleSheet-TSO500_ctDNA-v2_6_1-v2_assay-Novaseq6000-Local.csv
    SampleSheet-TSO500_ctDNA-v2_6_3-v2_assay-i5_rev.csv
    For TSO 500 or TSO 500 ctDNA assays, to auto-launch multiple pipelines at once, set "" to True
  • In the "Run Review" page, click "Add another configuration" to add the sample prep info for the second assay.

  • A sample sheet containing multiple data sections (one per assay) will be available for exporting at the end of run planning.

  • The intermediate sample sheet includes only samples relevant to the workflow type (Solid or ctDNA).

  • BCL Convert (inside the TSO 500 ctDNA workflow) then generates FASTQ files only for those samples.

  • BCLConvert_Settings and BCLConvert_Data sections will be excluded from the intermediate sample sheet created by the TSO 500 ctDNA workflow.

    Override Cycles

    Auto-calculated or user-provided depending on workflow

    Maintains correct cycle definitions per sample

    Solid

    One and only one value must match the predefined Solid kit set: TSO500, TSO500HT, or TSO500_v2. The validation fails if none or more than one valid Solid kit is provided. The validated value will be written into the intermediate sample sheet for baseline selection.

    TSO500HT

    ctDNA

    The library prep kit and the Sequencing_Settings section are not required. The library prep kit validation rule is skipped, and the Sequencing_Settings section is excluded from the intermediate sample sheet.

    N/A

    Sample Index Rule

    Filtered to process samples only from workflow-specific data section

    Prevents validation failures for multi-assay sample sheets

    Sample Parity Rule

    Enforces that workflow samples exist in BCLConvert_Data

    Ensures downstream consistency

    Library Prep Kit Rule

    Supports multiple values separated by ;

    Allows multi-assay sample sheets

    Adapter Read Rule

    Supports multiple adapters separated by +

    Expands flexibility across assays

    Yes

    Required when --fastqFolder is not specified. Provide the full path to the local run folder.

    --fastqFolder

    Yes

    Required when --runFolder is not specified. Provide the full path to the local FASTQ folder. Analysis starts at this location.

    --user

    No

    Optional for Docker. Specify the user ID to be used within the Docker container.

    --version

    No

    Displays the version of the software.

    --sampleSheet

    No

    Provide the full path, including the file name, if the sample sheet is not provided as SampleSheet.csv in the run folder.

    --sampleOrPairIDs

    No

    Provide the sample IDs to process on this node as a comma-delimited list with no spaces. For example, Sample_1,Sample_2.

    --demultiplexOnly

    No

    Demultiplex to generate FASTQ only without additional analysis.

    Note:

    • Use full paths when specifying the file paths in the command line.

    • Avoid special characters such as &, *, #, and spaces.

    • When starting from BCL files, only the run folder needs to be specified. The immediate parent directory containing the BCL files does not need to be specified.

    When running the analysis software using SSH, Illumina recommends using additional software, such as screen or tmux, to prevent unexpected termination of analysis.

    1. Wait for any running DRAGEN TruSight Oncology 500 ctDNA Analysis Software containers to complete before launching a new analysis. Run the following command to generate a list of running containers: docker ps

    2. Select from one of the following options:

    • Start from BCL files in the run folder with the sample sheet included in the run folder.

      DRAGEN_TSO500_CTDNA-2.6.0.sh \
        --runFolder /staging/{RunFolderName} \
        --analysisFolder /staging/{AnalysisFolderName}

    • Start from BCL files in the run folder with the sample sheet located in a folder other than the run folder.

      DRAGEN_TSO500_CTDNA-2.6.0.sh \
        --runFolder /staging/{RunFolderName} \
        --analysisFolder /staging/{AnalysisFolderName} \
        --sampleSheet /staging/{SampleSheetName}.csv

    • Start from BCL files in the run folder with a different sample sheet and demultiplexing only.

      DRAGEN_TSO500_CTDNA-2.6.0.sh \
        --runFolder /staging/{RunFolderName} \
        --analysisFolder /staging/{AnalysisFolderName} \
        --sampleSheet /staging/{SampleSheetName}.csv \
        --demultiplexOnly

    • Start from a FASTQ folder with the sample sheet included in the FASTQ folder and a subset of samples.

      DRAGEN_TSO500_CTDNA-2.6.0.sh \
        --fastqFolder /staging/{FastqFolderName} \
        --analysisFolder /staging/{AnalysisFolderName} \
        --sampleOrPairIDs "Sample_1,Sample_2"
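When the launch is scripted, the argument list can be assembled and checked against the notes above (full paths, none of the characters &, *, #, or space). The sketch below is one hedged way to do that. The helper names are hypothetical, and only options listed in the command-line options table are used.

```python
from pathlib import PurePosixPath

FORBIDDEN_CHARS = set("&*# ")


def _checked_path(path: str) -> str:
    """Reject relative paths and paths containing discouraged characters."""
    if not PurePosixPath(path).is_absolute():
        raise ValueError(f"not a full path: {path!r}")
    if FORBIDDEN_CHARS & set(path):
        raise ValueError(f"path contains a discouraged character: {path!r}")
    return path


def build_command(script, analysis_folder, run_folder=None, fastq_folder=None,
                  sample_sheet=None, sample_ids=(), demultiplex_only=False):
    """Return the launch command as an argument list (no shell quoting needed)."""
    if run_folder is None and fastq_folder is None:
        raise ValueError("provide run_folder or fastq_folder")
    command = [script]
    if run_folder is not None:
        command += ["--runFolder", _checked_path(run_folder)]
    if fastq_folder is not None:
        command += ["--fastqFolder", _checked_path(fastq_folder)]
    command += ["--analysisFolder", _checked_path(analysis_folder)]
    if sample_sheet is not None:
        command += ["--sampleSheet", _checked_path(sample_sheet)]
    if sample_ids:
        command += ["--sampleOrPairIDs", ",".join(sample_ids)]
    if demultiplex_only:
        command.append("--demultiplexOnly")
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)`. Passing arguments as a list avoids shell line-continuation issues.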

    Starting from BCL Files

    If starting from BCL (*.bcl) files, DRAGEN TruSight Oncology 500 ctDNA Analysis Software requires the run folder to contain certain files and folders. These inputs are required for Docker.

    The run folder contains data from the sequencing run. Make sure that the folder contains the following files and folders:

    Folder/File
    Description

    Config folder

    Configuration files

    Data folder

    *.bcl files

    Images folder

    [Optional] Raw sequencing image files.

    Interop folder

    Interop metric files.

    Logs folder

    [Optional] Sequencing system log files.

    RTALogs folder

    Real-Time Analysis (RTA) log files.
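A quick check that a run folder looks intact might resemble the sketch below. It covers only the folders named in the table above, treats Images and Logs as optional, and compares names case-insensitively as a precaution. The function names are hypothetical, and the software may require additional files that this sketch does not check.

```python
from pathlib import Path

# Folder names as written in the table; Images and Logs are marked optional there.
REQUIRED_FOLDERS = ("Config", "Data", "Interop", "RTALogs")


def missing_folders(present_names) -> list[str]:
    """Return required folder names that are absent (case-insensitive match)."""
    present = {name.lower() for name in present_names}
    return [name for name in REQUIRED_FOLDERS if name.lower() not in present]


def check_run_folder(run_folder: Path) -> list[str]:
    """List required folders missing from a run folder on disk."""
    return missing_folders(p.name for p in run_folder.iterdir() if p.is_dir())
```

An empty list from `check_run_folder(Path("/staging/{RunFolderName}"))` suggests the folder-level requirements are met.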

    Starting from FASTQ Files

    The following inputs are required for running the DRAGEN TruSight Oncology 500 ctDNA Analysis Software using FASTQ (*.fastq) files. The requirements apply to Docker.

    • Full path to an existing FASTQ folder.

    • The FASTQ folder structure conforms to the folder structure in FASTQ File Organization.

    • The sample sheet is in the FASTQ folder path, or you can set the path to the sample sheet with the --sampleSheet override command line option.

    Make sure there is sufficient disk space for the analysis to complete. Refer to the --help command line argument details for disk space requirements.

    Use BCL Convert to produce FASTQ files for DRAGEN TruSight Oncology 500 ctDNA Analysis Software. Using bcl2fastq does not produce the same results and is discouraged.

    Make sure that BCL Convert is set to write UMI sequences to the read headers in the FASTQ files.

    FASTQ File Organization

    Store FASTQ files in individual subfolders that correspond to a specific Sample_ID. Keep file pairs together in the same folder. Alternatively, store all FASTQ files together in one flat folder.

    The DRAGEN TruSight Oncology 500 ctDNA Analysis Software requires separate FASTQ files per sample. Do not merge FASTQ files.

    The instrument generates two FASTQ files (Read 1 and Read 2) per sample for each flow cell lane, so a four-lane flow cell produces eight FASTQ files per sample.

    Sample1_S1_L001_R1_001.fastq.gz

    • Sample1 represents the Sample ID.

    • The S in S1 means sample, and the 1 in S1 is based on the order of samples in the sample sheet, so S1 is the first sample.

    • L001 represents the flow cell lane number.

    • The R in R1 means Read, so R1 refers to Read 1.
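The naming convention above can be parsed with a regular expression, for example to group files by sample before analysis. This is a sketch rather than the software's own parser: the names `FastqName` and `parse_fastq_name` are hypothetical, and the trailing `_001` segment is treated as a fixed literal because the description above does not define it.

```python
import re
from typing import NamedTuple


class FastqName(NamedTuple):
    sample_id: str
    sample_number: int  # position of the sample in the sample sheet (S1 = first)
    lane: int           # flow cell lane number
    read: int           # 1 for Read 1, 2 for Read 2


_FASTQ_PATTERN = re.compile(
    r"(?P<sample_id>.+)_S(?P<sample_number>\d+)_L(?P<lane>\d{3})_R(?P<read>[12])_001\.fastq\.gz"
)


def parse_fastq_name(file_name: str) -> FastqName:
    """Split a FASTQ file name into its sample ID, sample number, lane, and read."""
    match = _FASTQ_PATTERN.fullmatch(file_name)
    if match is None:
        raise ValueError(f"unexpected FASTQ file name: {file_name!r}")
    return FastqName(
        match["sample_id"],
        int(match["sample_number"]),
        int(match["lane"]),
        int(match["read"]),
    )
```

Because the sample ID group is greedy, sample IDs that themselves contain underscores or dashes are still recovered whole.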

    --help

    No

    Displays a help screen with available command line options.

    --analysisFolder

    No

    Path to the local analysis folder. The default location is /staging/DRAGEN_TSO500_CTDNA_2.6.0_Analysis_{timestamp}. If not using the default location, provide the full path to the local analysis folder. The folder must have sufficient space and must be on an NVMe SSD drive, for example the /staging directory on the DRAGEN server. Refer to the table in Storage Requirements for minimum disk space requirements.

    --resourcesFolder

    No

    Path to the resource folder location. The default location is /staging/illumina/DRAGEN_TSO500_CTDNA_2.6.0/resources. If not using the default location, enter the full path to the resource folder.

    --runFolder

    Use BaseSpace Sequence Hub Run Planning tool or the sample sheet templates provided on the support page to create and export a sample sheet.
    1. If BaseSpace Run Planning tool is not available in your region, use the sample sheet template.

  • Import the sample sheet to the instrument and start the sequencing run. Refer to ICA Auto-launch Sample Sheet Requirements for sample sheet guidance.

    1. Data is uploaded to BaseSpace Sequence Hub and then pushed to ICA. You can monitor the run in BaseSpace Sequence Hub.

    2. Analysis auto-launches in ICA when sequencing and the upload are complete. You can monitor the status of the analysis in BaseSpace Sequence Hub or ICA.

    3. If necessary, you can requeue the analysis via BaseSpace Sequence Hub.

  • View the analysis output results in either BaseSpace Sequence Hub or ICA.

    To avoid invalid sample sheet configurations, Illumina recommends using BaseSpace Run Planning tool to generate sample sheets. Using an invalid sample sheet can result in failed runs and analyses.

    BaseSpace Sequence Hub Requirements for ICA Auto-Launch

    BaseSpace Run Planning tool is a multi-step workflow that generates a manual launch or auto-launch capable sample sheet for export and requires the following additional settings:

    • Access to BaseSpace Sequence Hub.

    • ICA Run Storage is enabled under BaseSpace Sequence Hub settings.

    Refer to the BaseSpace Sequence Hub support site page for information on setting up a BaseSpace Sequence Hub project.

    Requeue Analysis

    You can requeue analysis of a run via the run's Summary page in BaseSpace Sequence Hub.

    Refer to the BaseSpace Sequence Hub support site page for more information on requeuing an analysis.

    Minimum Storage Requirements on ICA

    Sequencing System
    Minimum Disk Space (Gb)

    NovaSeq 6000/6000Dx (RUO) SP Flow Cell

    500

    NovaSeq 6000/6000Dx (RUO) S1 Flow Cell

    1100

    NovaSeq 6000/6000Dx (RUO) S2 Flow Cell

    2500

    NovaSeq 6000/6000Dx (RUO) S4 Flow Cell

    4300

    NovaSeq X 1.5B

    2000

    NovaSeq X 10B

    4300

    Refer to the Software Registration page for information on how to manage accounts and subscriptions.

    Guided Examples

    Please review these guided examples of using DRAGEN TSO 500 Analysis Software with auto-launch on ICA:

    • NovaSeq 6000Dx: TSO 500 Auto-launch Analysis in Cloud

    Fusions

    Fusion Calling

    The DRAGEN fusion caller performs fusion calling and outputs VCF files. It first scans the genome to discover evidence of possible structural variants (SVs) and large indels based on split reads and spanning reads. The evidence is enumerated into a graph with edges that connect all regions of the genome that have a possible breakend association. The caller then analyzes individual graph edges to discover and score the SVs associated with each edge. This process includes the following substeps:

    • Inference of SV candidates associated with the edge.

    • Attempted assembly of the SV breakends.

    • Scoring, genotyping, and filtering of the SV.

    • Output to VCF.

    Fusion Filtering

    In assays, a single sample can have hundreds to thousands of fusion candidates, all of which are reported in the $SAMPLE_ID.sv.vcf file. Most of these candidates (~99%) are false positives. The fusion filtering tool, DNA Fusion Filter (DNAFF), distinguishes true fusion calls from the false positives. DNAFF performs the following functions:

    • Removes spurious fusions including fusions with only one supporting read pair and fusions that overlap with repeat regions, which are more likely to have sequencing errors.

    • Filters nonconfident supporting reads for all fusion candidates based on the following criteria:

      • Filter reads with low-sequence identity with the fusion contig.

    In software version 2.6.1 and later, fusions below the limits of detection may not be reported due to an algorithm improvement that rescues false negative calls in samples with high chimeric reads.
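    The two removal rules listed for spurious fusions can be sketched as a simple predicate. This is only an illustration of the documented criteria, not the DNAFF implementation: the `FusionCandidate` record, its field names, and the candidate labels are hypothetical, and the read-level identity filter is not modeled.

```python
from dataclasses import dataclass


@dataclass
class FusionCandidate:
    # Hypothetical record; these are not DNAFF field names.
    name: str
    supporting_read_pairs: int
    overlaps_repeat_region: bool


def is_spurious(candidate: FusionCandidate) -> bool:
    """Apply the two documented removal rules for spurious fusions."""
    # Rule 1: only one supporting read pair (zero is treated the same way here).
    # Rule 2: the candidate overlaps a repeat region.
    return candidate.supporting_read_pairs <= 1 or candidate.overlaps_repeat_region


def keep_candidates(candidates):
    """Return the candidates that survive both rules."""
    return [c for c in candidates if not is_spurious(c)]


# Made-up candidates for illustration only.
candidates = [
    FusionCandidate("candidate_A", 1, False),
    FusionCandidate("candidate_B", 12, True),
    FusionCandidate("candidate_C", 12, False),
]
kept = keep_candidates(candidates)
print([c.name for c in kept])
```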

    Fusion Output Files

    Fusion calling results are output in several files:

    1. Combined Variant Output File, {SampleID}_CombinedVariantOutput.tsv

    2. Fusions CSV, {Sample_ID}_Fusions.csv

    3. All Candidate Fusions (with many false positives), {Sample_ID}.sv.vcf

    1. Combined Variant Output File

    File name: {SampleID}_CombinedVariantOutput.tsv

    Fusion calling results are output in the [DNA Fusions] section. Fusion variants must meet the following conditions to be included:

    • Passes the fusion filtering criteria with "PASS" from the DNAFF module.

    • Contains at least one gene on the fusion allow list.

    Genes separated by a dash (-) indicate that the fusion directionality could be determined. Genes separated by a slash (/) indicate that the fusion directionality could not be determined.
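    The separator convention can be read programmatically. The following is a minimal sketch, assuming a two-gene label and gene symbols that do not themselves contain a dash or slash; `parse_fusion_name` and the placeholder gene names are hypothetical and not part of the software.

```python
def parse_fusion_name(name: str):
    """Split a two-gene fusion label into (gene_a, gene_b, directional).

    A slash means directionality could not be determined;
    a dash means it could.
    """
    if "/" in name:
        gene_a, gene_b = name.split("/", 1)
        return gene_a, gene_b, False
    if "-" in name:
        # Assumption: the gene symbols contain no dash of their own.
        gene_a, gene_b = name.split("-", 1)
        return gene_a, gene_b, True
    raise ValueError(f"No gene separator found in {name!r}")


print(parse_fusion_name("GENE1-GENE2"))
print(parse_fusion_name("GENE1/GENE2"))
```

    Gene symbols that contain a dash would need a lookup against a known gene list instead of a plain split.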

    2. Fusions CSV

    File name: {Sample_ID}_Fusions.csv

    The fusions file contains all passing fusions identified by the analysis pipeline.

    The fusion columns are described in the following table. If you use Microsoft Excel to view this file, gene names that can be interpreted as dates (for example, MARCH1) are automatically converted to dd-mm format (1-Mar).

    Fusion Object Field
    Description

    The following table lists the meaning of the values in the direction column. The values are in the format used by Samtools.

    Direction
    VCF Format
    Description

    3. Fusion Candidates

    All fusion candidates identified by the DRAGEN fusion caller are output in a VCF file.

    File name: Logs_Intermediates/DragenCaller/{Sample_ID}/{Sample_ID}.sv.vcf

    Installation of TSO 500 ctDNA v2.6.0, v2.6.1 on Standalone DRAGEN Server

    Overview

    The installation script for DRAGEN TruSight Oncology 500 ctDNA Analysis Software installs the following software and dependencies:

    1. DRAGEN TruSight Oncology 500 ctDNA Analysis Software itself

    2. DRAGEN Software if a compatible version is not present

    3. Docker software if a compatible version is not present

    4. A script required to generate DRAGEN genome hash table

    5. A script to check that DRAGEN TruSight Oncology 500 ctDNA Analysis Software is installed properly

    Installation Requirements

    Hardware

    • DRAGEN server v3 or v4

    Software

    • Linux CentOS 7.9 (or later) or Oracle Linux 8 (or later) operating system, which is provided by default. Oracle Linux 8 is recommended.

    • Docker Software, see table below.

    • DRAGEN Software, see table below.

    Software Dependency
    Compatible
    Installs

    DRAGEN TruSight Oncology 500 ctDNA v2.6.0 Analysis Software is not compatible with DRAGEN Software v4.0 or above on the same standalone DRAGEN server.

    Licenses

    • TSO500Combined license

    The TSO500Combined license has been preinstalled on DRAGEN servers during manufacturing since August 2022. To generate a list of installed DRAGEN server licenses, run the following command: /opt/edico/bin/dragen_lic. If the license is not installed, contact Illumina Customer Care to obtain it.

    Permissions

    Illumina recommends logging in as root user for installation, but as a non-root user for running TSO 500 ctDNA analysis.

    • A non-root user must be a member of the Docker group to run Docker. For more information on Docker permission requirements and alternatives to running as root, refer to the Docker documentation.

    • Installing and uninstalling DRAGEN TruSight Oncology 500 ctDNA Analysis Software and running the system check requires root privileges.

    • Run DRAGEN TruSight Oncology 500 ctDNA Analysis Software without being logged in as a root user. Running the DRAGEN TruSight Oncology 500 ctDNA Analysis Software as root is not required or recommended.

    Compatibility with other TruSight Oncology 500 ctDNA and TruSight Oncology 500 Analysis Software

    DRAGEN TruSight Oncology 500 ctDNA Analysis Software v2.6.0 can be installed on the same DRAGEN server as the following software:

    1. DRAGEN TruSight Oncology 500 Analysis Software v2.6.0 (v3.10.17*)

    2. One prior 2.x version of DRAGEN TruSight Oncology 500 ctDNA Analysis Software (v2.1.1 (v3.10.9*), v2.5.0 (v3.10.15*), v2.6.0 (v3.10.17*), v2.6.1 (v3.10.18*))

    3. One prior 2.x version of DRAGEN TruSight Oncology 500 Analysis Software (v2.1.1 (v3.10.9*), v2.5.3 (v3.10.16*))

    *DRAGEN Software version

    Unlike prior versions, the installation scripts for DRAGEN TruSight Oncology 500 ctDNA Analysis Software v2.6.0 and DRAGEN TruSight Oncology 500 v2.6.0 do not uninstall previous versions of DRAGEN TruSight Oncology 500 Analysis Software. To uninstall a previous version of DRAGEN TruSight Oncology 500 ctDNA Analysis Software, refer to the respective guide.

    When installing DRAGEN TruSight Oncology 500 and DRAGEN TruSight Oncology 500 ctDNA software on the same DRAGEN server, install the software with the highest corresponding DRAGEN Software version last, because installers for versions below v2.6.0 overwrite the installed DRAGEN Software with their own corresponding version.

    If a prior version of DRAGEN TruSight Oncology 500 ctDNA Analysis Software (for example, v2.5.0) is installed after v2.6.0, rerun the installation script for v2.6.0 to install the compatible version of DRAGEN Software without impacting other installations.
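    The ordering rule (install the package with the highest corresponding DRAGEN Software version last) can be sketched as a sort. The version pairs below are copied from the compatibility list above. The product labels and the `install_order` helper are hypothetical names used for illustration.

```python
# (product label, software version) -> corresponding DRAGEN Software version,
# as listed in the compatibility section above.
DRAGEN_VERSION = {
    ("TSO 500 ctDNA", "2.1.1"): "3.10.9",
    ("TSO 500 ctDNA", "2.5.0"): "3.10.15",
    ("TSO 500 ctDNA", "2.6.0"): "3.10.17",
    ("TSO 500 ctDNA", "2.6.1"): "3.10.18",
    ("TSO 500", "2.1.1"): "3.10.9",
    ("TSO 500", "2.5.3"): "3.10.16",
    ("TSO 500", "2.6.0"): "3.10.17",
}


def version_key(version: str):
    """Compare versions numerically, so that 3.10.9 sorts before 3.10.15."""
    return tuple(int(part) for part in version.split("."))


def install_order(packages):
    """Sort packages so the highest DRAGEN Software version is installed last."""
    return sorted(packages, key=lambda pkg: version_key(DRAGEN_VERSION[pkg]))


order = install_order([("TSO 500 ctDNA", "2.6.0"), ("TSO 500", "2.5.3")])
print(order)
```

    Comparing the versions as integer tuples rather than strings matters here, because a plain string comparison would place 3.10.9 after 3.10.15.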

    Installation Instructions

    As a root user, perform the following steps to install DRAGEN TruSight Oncology 500 ctDNA v2.6.0 Analysis Software:

    1. Contact Illumina Customer Care to obtain the DRAGEN TruSight Oncology 500 Analysis Software installer package.

    2. Download the installation package provided in the email from Illumina. The link expires after 7 days.


    It is recommended to use a command line tool like wget or curl to download the file rather than pasting the link into the web browser bar. For example:

    curl -o {filename} "{link}"

    wget -O {filename} "{link}"

    Where the file name is the installation script file name, and the link is provided by Illumina Customer Care.

    3. Make sure no other analysis is being performed. Installing the software while other analyses are running prevents the installer process from proceeding.

    4. Copy the installation script to the /staging directory.

    Installation scripts:

    install_DRAGEN_TSO500_ctDNA-2.6.0.run

    install_DRAGEN_TSO500_ctDNA-2.6.1.run

    SHA-256 checksum values:

    v2.6.0: sha256:e98ab87152b02e2c7958f2f750fa37880a496d68e77858a09f4ad5fb07b2145b

    v2.6.1: sha256:026efd4402e91e8472effb30749e231817958532aae128334c5f9943564e4b8d
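    The listed values are 64 hexadecimal characters with a sha256: prefix, so they can be compared against a SHA-256 digest of the downloaded script. The following is a minimal sketch using the Python standard library. The helper names are hypothetical, and the file path in the usage comment assumes the script was copied to /staging.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against an expected value such as 'sha256:<hex digest>'."""
    expected = expected.strip().lower()
    if expected.startswith("sha256:"):
        expected = expected[len("sha256:"):]
    return sha256_of_file(path) == expected


# Usage (path is an assumption; use the location of your downloaded script):
# matches_checksum(
#     "/staging/install_DRAGEN_TSO500_ctDNA-2.6.0.run",
#     "sha256:e98ab87152b02e2c7958f2f750fa37880a496d68e77858a09f4ad5fb07b2145b",
# )
```

    A result of False suggests an incomplete or corrupted download, in which case downloading the package again before installing is the safer choice.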

    5. Use the following command to make the installation script executable: chmod +x /staging/install_DRAGEN_TSO500_CTDNA-2.6.0.run

    6. Use the following command to run the installation script, which runs for approximately 20 minutes:

      1. For Docker, use the following command: sudo TMPDIR=/staging /staging/install_DRAGEN_TSO500_CTDNA-2.6.0.run

    Running the System Check

    After installation is complete, make sure the system functions properly by running the following command: /usr/local/bin/check_DRAGEN_TSO500_CTDNA-2.6.0.sh

    The script checks that:

    • All required services are running

    • Proper Docker image is installed

    • DRAGEN TruSight Oncology 500 ctDNA Analysis Software can successfully process a test data set

    The system check script runs for approximately 25 minutes. If the script prints a failure message, contact Illumina Technical Support and provide the /staging/check_DRAGEN_TSO500_CTDNA_<timestamp>.tgz output file.

    If using macOS to connect to a server, an error can occur if the locale settings are not in English. To resolve the error, disable the ability to set environment variables automatically in Terminal settings.

    Uninstall Software

    The DRAGEN TruSight Oncology 500 Analysis Software installation includes an uninstall script called uninstall_DRAGEN_TSO500_CTDNA-2.6.0.sh, which is located in /usr/local/bin.

    Executing the uninstall script removes the following assets:

    • All DRAGEN TruSight Oncology 500 ctDNA Analysis Software related scripts located in /usr/local/bin

    • Resources found in /staging/illumina/DRAGEN_TSO500_CTDNA

    • The dragen_tso500_ctdna Docker image

    To uninstall the DRAGEN TruSight Oncology 500 ctDNA Analysis Software, run the following command as a root user:

    uninstall_DRAGEN_TSO500_CTDNA-2.6.0.sh

    You are not required to uninstall Docker or DRAGEN software. To remove Docker, review the install instructions for your operating system in the Docker documentation.

    DNA Expanded Metrics

    DNA expanded metrics are provided for information only. They can be informative for troubleshooting but are provided without explicit specification limits and are not directly used for sample quality control. For additional guidance, contact Illumina Technical Support.

    Metric
    Description
    Troubleshooting

    TOTAL_PF_READS (count)

    Total number of non-supplementary, non-secondary, and passing QC reads after alignment to the whole genome sequence.

    Primarily driven by the data output of the sequencer, the quality of the library, and the balancing of the library in the library pool. If TOTAL_PF_READS is in line with other samples but coverage metrics are lower, this may suggest non-specific enrichment.

    Low values for all samples indicate a poor quality run with possible low cluster numbers or low numbers of Q30 and PF%.

    A low value for an individual sample indicates poor pooling of this library into the final pool.

    MEAN_FAMILY_SIZE (count)

    A UMI family is a group of reads that all have the same UMI barcode. The family size is the number of reads in the family. MEAN_FAMILY_SIZE is the mean of the entire population of reads assembled into UMI families.

    The mean UMI family size decreases with increased unique read numbers, and more input DNA leads to more unique reads. Conversely, oversequencing a fixed population of unique DNA molecules leads to increased family size.

    As a guide, for a good run with optimal cluster density, passing specs, even sample pooling, and good quality DNA we usually observe values <10.

    UMI family size = 1 is not ideal as it is harder to correct for errors.

    UMI family size of 2 to 5 enables efficient error correction without wasting sequencing capacity on high percentages of duplicate reads.
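    The family-size definition can be illustrated with a toy calculation. This sketch groups reads by UMI barcode only, which is a simplifying assumption: the software's actual family assembly is not described here and may use more information. The barcodes are made up.

```python
from collections import Counter
from statistics import mean


def mean_family_size(umi_per_read):
    """Group reads by UMI barcode and return the mean number of reads per family."""
    family_sizes = Counter(umi_per_read).values()
    return mean(family_sizes)


# Six reads in three families of sizes 3, 2, and 1.
reads = ["AAT", "AAT", "AAT", "CGG", "CGG", "TTA"]
print(mean_family_size(reads))
```

    Sequencing the same molecules more deeply adds reads to existing families and raises the mean, while adding unique molecules creates new small families and lowers it, which matches the behavior described above.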

    Troubleshooting

    General Troubleshooting on Standalone DRAGEN Server

    Failure Type
    Actions

    Software

    • Open the log file ./<AnalysisFolder>/Logs_Intermediates/pipeline_trace.txt. This log file displays each pipeline step run by the Nextflow workflow manager software. If a step fails, it is marked as FAILED. Each step generates log files that are stored in step-specific subfolders in the Logs_Intermediates folder. Review the log files in the relevant Logs_Intermediates folder for the step to identify potential sources of error.

    • Open the errors folder ./<AnalysisFolder>/errors. The workflow creates an error file, error_<NameOfFailedStep>.json, for each step that failed during analysis. For steps that fail per sample, there is a separately labeled file for each sample that failed each step, error_<NameOfFailedStep>_<SampleIDIfRelevant>.json. These files contain the command and stdout and stderr from the step.

    Samples
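    Finding the failed steps in pipeline_trace.txt can be scripted. The sketch below assumes the file follows the usual Nextflow trace layout, which is tab-separated with a header row that includes name and status columns. Check the header of your own file first, because the column set is an assumption here. `failed_steps` is a hypothetical helper.

```python
import csv


def failed_steps(trace_path):
    """Return the names of tasks whose status column reads FAILED."""
    with open(trace_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return [row["name"] for row in reader if row.get("status") == "FAILED"]


# Usage (replace <AnalysisFolder> with your analysis output folder):
# for step in failed_steps("<AnalysisFolder>/Logs_Intermediates/pipeline_trace.txt"):
#     print(step)
```

    Each reported name can then be matched to its step-specific subfolder in Logs_Intermediates and to the corresponding file in the errors folder.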

    Sample Sheet Validation Failures

    In DRAGEN TruSight Oncology 500 ctDNA Analysis Software, the analysis fails if a sample sheet is invalid. If an invalid sample sheet is suspected, log files can help troubleshoot a failed analysis. Use the following steps to find the log file for the sample sheet:

    1. Navigate to the following location: /<analysis_output>/Logs_Intermediates/SamplesheetValidation.

    2. Open the SamplesheetValidation-.log file.

    3. Find a line with the following: SampleSheetValidationTask:NA:1 exited with return code 1 which has not been declared as a valid return code.

    General troubleshooting for a failed sample sheet:

    Failure Type
    Action

    Valid indexes for assay and index combinations:

    Assay
    Index Set ID

    Troubleshooting BCL issues:

    Failure Type
    Action

    Troubleshooting FASTQs issues:

    Failure Type
    Action

    Other Troubleshooting Tips

    Failure Type
    Action

    Troubleshooting on ICA

    In addition to TSO 500 managed sample sheet validations, ICA managed TSO 500 errors include the following:

    Error
    Description

    Manual Launch of DRAGEN TSO 500 ctDNA Analysis on ICA

    How to Launch Analysis

    1. Create a Project: The Project can be specific to the DRAGEN TruSight Oncology 500 ctDNA pipeline or it can contain multiple Pipelines and/or Tools. For information on creating Projects, refer to the Projects section in Illumina Connected Analytics help.


    ICA standard storage is used by default as soon as the Project is saved. To connect a different storage source, set it up before creating your Project. For details and options, refer to the Storage section in .

    2. Edit Project and Add Bundle: Edit the Project and add the bundle titled "DRAGEN TSO 500 ctDNA v2.6.0 (XX)", where XX is a 2-letter code designating the region from which you are launching the analysis. Adding the Bundle automatically adds the pipeline and associated resource files and datasets to the Project. For information on Bundles, refer to the Bundles section in .


    After adding the Bundle to the Project, an example dataset becomes available in the Demo_Data folder for the Project. 

    3. Upload the sequencing data: For information on viewing and uploading data, refer to the Data section in .

    4. Start Analysis: In the Project, navigate to Pipelines, select the TSO 500 ctDNA v2.6.0 Pipeline, and then select "Start New Analysis". Set up the new analysis by configuring the parameters listed in the . When the required files are completed, start analysis.

    5. Download Results: After analysis is complete, navigate to results in the configured output location.

    Please see the Illumina Support Shorts for guidance on how to set up and run DRAGEN TSO 500 RUO analysis on ICA. This is the same process as DRAGEN TSO 500 ctDNA, but with different inputs specific to DRAGEN TSO 500 solid:

    Analysis Parameters on ICA

    To launch an analysis via the ICA user interface, configure a DRAGEN TSO 500 ctDNA pipeline analysis with the following parameters.

    Parameter Name
    Description

    Known Limitations

    • FASTQ Folder Naming Requirements

      • When specifying input FASTQ folder names, avoid using folder names that consist entirely of numeric characters with a leading zero, as this will cause the software to error out.

      • Unsupported naming pattern:

    For information about using pipelines, refer to .
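    The FASTQ folder naming limitation described above can be checked before launching an analysis. The following is a minimal sketch of that rule as stated (a name made up entirely of digits that starts with a zero). The function name and the sample folder names are hypothetical.

```python
def is_unsupported_fastq_folder_name(name: str) -> bool:
    """Return True for names that are all digits and start with a zero."""
    # isascii() keeps the check to the plain digits 0-9.
    return name.isascii() and name.isdigit() and name.startswith("0")


for folder in ["0123", "123", "0123_run", "run01"]:
    print(folder, is_unsupported_fastq_folder_name(folder))
```

    Adding any non-numeric character to the folder name, such as a prefix, is enough to avoid the pattern.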

    Analysis Output

    When the analysis run completes, the DRAGEN TruSight Oncology 500 ctDNA Analysis Software generates an analysis output folder in a specified location.

    To view analysis output, navigate to the analysis output folder and select the files that you want to view.

    Analysis Output Folder Structure

    Single output folder structure is as follows.

    Logs_Intermediates

    • AdditionalSarjMetrics

    • Annotation—Contains outputs for small variant annotation.

      • Subfolders per sample ID—Contains the aligned small variants JSON.

    • CombinedVariantOutput

      • Subfolders per sample ID—Contains the combined variant output TSV files.

      • A combined output log file.

    • Contamination

      • Subfolders per sample ID—Contains the contamination metrics JSON file and output logs.

    • CoverageReports

    • DnaFusionFiltering

    • DragenCaller

      • Subfolders per sample ID—Contains the aligned BAM and index files, small variant VCF and gVCF, copy number variant VCF, MSI JSON, exon coverage report bed, and QC outputs in CSV format.

    • FastqValidation—Contains the FASTQ validation output log for the samples.

    • FastqGeneration

    • MetricsOutput

      • Subfolders per sample ID—Contains the metrics output TSV files.

      • A combined output log file.

    • ResourceVerification—Contains the resource file checksum verification logs.

    • Run QC—Contains the Run QC metrics JSON, Intermediate Run QC metrics JSON, and log file.

    • SampleAnalysisResults

      • Subfolders per sample ID—Contains the Sample Analysis Results JSON and detailed log file. The sample analysis results file (SARJ) is an aggregated results file created for each sample. The SARJ file is used for the generation of downstream outputs. The file contains passing variants and passing variant annotations.

    • SampleSheetValidation—Contains the Intermediate sample sheet and validation log.

    • Passing Sample Steps - JSON file that contains the steps passed for each Sample ID

    • Tmb

      • Subfolders per sample ID—Contains the TMB metrics CSV, TMB trace TSV, and related files and logs.

    • pipeline_trace.txt—Contains a summary and troubleshooting file that lists each Nextflow task executed and the status (for example, COMPLETED or FAILED).

    • run.log—Contains a complete trace-level log file describing the Nextflow pipeline execution.

    • run_report.html—Contains high-level run statistics (performance, usage, etc.).

    • run_timeline.html—Contains timeline-related information about the analysis run.

  • Results

    • Metrics Output TSV (all Sample IDs)

    • Sample ID—The following outputs are produced for each sample:

      • Combined Variant Output TSV

      • Metrics Output TSV

      • TMB Trace TSV

      • Small Variant Genome VCF

      • Small Variant VCF

      • Small Variant Annotated JSON

      • Copy Number Variant VCF

      • MSI JSON

      • Fusions CSV

      • Exon Coverage Report TSV

      • Gene Coverage Report TSV

    ICA Output Folder Structure

    This section describes each output folder generated during analysis and where to find metric and analytic files when the pipeline is executed. The same output folder structure and content exist in ICA and BaseSpace Sequence Hub.

    High-Level Folder Structure

    • Run ID

      • TSO500_Nextflow_logs

      • _manifest.json

      • Results

        • _tags.json

      • Logs_intermediates

      • Errors—This folder is only present when analysis fails

    TSO500_Nextflow_logs Folder Structure

    The TSO_500_Nextflow_Logs folder provides information related to the execution of the pipeline on ICA as a whole and for specific nodes (when an analysis is split across multiple nodes). It contains files used to execute parts of the workflow on different nodes as well as records of the Nextflow execution on those nodes.

    • TSO_500_Nextflow_Logs

      • _manifest.json

    Results Folder Structure

    Contains the aggregated MetricsOutput.tsv file at the root level. Additionally, the Results folder contains a subfolder for each sample ID.

    • Results

      • MetricsOutput.tsv

      • Sample_1

      • Sample_2

      • Sample_<#>

    The Results subfolder contains the following files:

    • Results

      • MetricsOutput.tsv

      • <Sample_id>

        • CombinedVariantOutput.tsv

        • Fusions.csv

        • tmb.trace.tsv

        • hard-filtered.gvcf

        • hard-filtered.vcf

        • SmallVariants_Annotated.json.gz

        • cnv.vcf

        • exon_cov_report.tsv

        • gene_cov_report.tsv

        • MetricsOutput.tsv

        • microsat_output.json

    Logs_intermediates Folder Structure

    Contains folders for each submodule in the DRAGEN TSO 500 ctDNA on ICA pipeline. The folders contain a copy of all the relevant files required to create the metric output files and report files, as well as the combined log files at the root level and subfolders for each sample.

    • Logs_intermediates

      • AdditionalSarjMetrics

      • Annotation

      • CombinedVariantOutput

      • Contamination

      • CoverageReports

      • DnaFusionFiltering

      • DragenCaller

      • FastqValidation

      • FastqGeneration

      • MetricsOutput

      • PassingSampleSteps

      • ResourceVerification

      • Run QC

      • SampleAnalysisResults

      • SampleSheetValidation

      • Tmb

    All logs in Logs_Intermediates are generated from the running analysis software. Inputs to the running Docker container (for example, the run folder, sample sheet, and FASTQ folder) are mapped from native locations on the server to the following locations in the container:

    | Input | Running Docker Container Location |
    | --- | --- |
    | Run folder | /opt/illumina/run-folder |
    | Sample sheet | /opt/illumina/SampleSheet.csv |
    | FASTQ folder | /opt/illumina/fastq-folder |
    | Resources | /opt/illumina/resources |
    | Analysis output folder | /opt/illumina/analysis-folder |

    The paths in the log messages refer to paths within the running docker container, not paths on the server.
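    Because log messages use container paths, it can help to translate them back to server paths when troubleshooting. The sketch below uses the container locations from the mapping above. The dictionary keys, the `to_host_path` helper, and the example server path are hypothetical.

```python
from pathlib import PurePosixPath

# Container-side locations, as listed in the mapping above.
CONTAINER_MOUNTS = {
    "run_folder": "/opt/illumina/run-folder",
    "sample_sheet": "/opt/illumina/SampleSheet.csv",
    "fastq_folder": "/opt/illumina/fastq-folder",
    "resources": "/opt/illumina/resources",
    "analysis_folder": "/opt/illumina/analysis-folder",
}


def to_host_path(container_path, host_locations):
    """Map a path seen in a log message back to its location on the server.

    host_locations maps the keys of CONTAINER_MOUNTS to the native server
    paths that were supplied when the analysis was launched.
    Returns None when the path is not under a known, supplied location.
    """
    path = PurePosixPath(container_path)
    for key, mount in CONTAINER_MOUNTS.items():
        host_root = host_locations.get(key)
        if host_root is None:
            continue
        try:
            relative = path.relative_to(mount)
        except ValueError:
            continue  # path is not under this mount point
        return str(PurePosixPath(host_root) / relative)
    return None


hosts = {"analysis_folder": "/staging/my_analysis"}  # hypothetical server path
print(to_host_path("/opt/illumina/analysis-folder/Logs_Intermediates/run.log", hosts))
```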

    Errors Folder Structure

    Contains Errors.tsv. This file contains the summary of all the errors encountered during pipeline execution.

    • Errors

      • Errors.tsv

    NovaSeq 6000Dx Analysis Application Output Folder Structure

    The following files and folders are created during analysis by NovaSeq 6000Dx Analysis Application:

    • analysisResults.json

    • CopyComplete.txt

    • edgeos.nextflow.config

    • inputs/

      • sampleMapping.json

      • SampleSheet.csv

      • SampleSheet.json

    • Logs_Intermediates

    • Manifest.tsv

    • params.json

    • Results/

    • workflowLogs/

      • nf-main-***.log

    When the analysis run completes, the analysis application generates an analysis output in a specified location. To view analysis output, follow the steps below:

    1. On the “Completed” runs tab, select the run.

    2. Review the run details page, which provides the information needed to access the output folder:

      • External Location: the input for the run.

      • Analysis Output Folder: where the output is stored. To navigate to this folder, follow the “server location” and the gds analysis output folder.

    3. Navigate to the directory that contains the analysis output folder.

    4. Open the folder, and then select the files that you want to view.

    RunInfo.xml file

    Run information.

    RunParameters.xml file

    Run parameters.

    SampleSheet.csv file

    Sample information. If you want to use a sample sheet that is not in the run folder or a sample sheet named something other than SampleSheet.csv, provide the full path.

    NovaSeq X 25B

    6000

    NextSeq 2000 P4 Flow Cell

    500

    MEDIAN_TARGET_COVERAGE (count)

    Median depth across all the unique loci occurring in all regions of the manifest file.

    Lower median target coverage may be due to poor sample input/quality, library preparation issues or low sequencing output.

    PCT_CHIMERIC_READS (%)

    Chimeric reads occur when one sequencing read aligns to two distinct portions of the genome with little or no overlap. Metric is proportion of total number of non-supplementary, non-secondary, and passing QC reads after alignment to the whole genome sequence.

    While this can be indicative of large-scale structural rearrangement of the genome, values that are elevated above the usual baseline may indicate enrichment probe contamination during library preparation. A suggested upper specification limit (USL) for this metric is 8%. Samples with higher values might show decreased performance in small variant and TMB scores.

    PCT_EXON_500X (%)

    Percentage of exon bases with 500X fragment coverage. Calculated against all regions in manifest containing _exon in name.

    Can be used in combination with other PCT_EXON metrics to understand under or over coverage of exons.

    PCT_EXON_1500X (%)

    Percentage of exon bases with 1500X fragment coverage. Calculated against all regions in manifest containing _exon in name.

    Can be used in combination with other PCT_EXON metrics to understand under or over coverage of exons.

    PCT_READ_ENRICHMENT (%)

    Percentage of reads that have overlapping sequence with the target regions defined in the sample manifest.

    Indicative of general enrichment performance. Reduced proportions of enriched reads may indicate issues with the enrichment portion of the library preparation.

    PCT_USABLE_UMI_READS (%)

    Percentage of reads that have valid UMI sequences associated with them.

    As UMI reads are sequenced at the start of each read, loss of valid UMI sequence may be caused by sequencing issues impacting the quality of base calling in this portion of the sequencing read.

    MEAN_TARGET_COVERAGE (count)

    Mean depth across all the unique loci defined in the manifest file.

    Lower mean target coverage may be due to poor sample input/quality, library preparation issues, or low sequencing output. Large differences between the median and mean target coverage values may indicate a skewed distribution of target coverage.

    PCT_ALIGNED_READS (%)

    Proportion of aligned reads that are non-supplementary, non-secondary and pass QC versus aligned reads that are non-supplementary, non-secondary, mapped and pass QC.

    PCT_CONTAMINATION_EST (%)

    This metric should only be evaluated if the CONTAMINATION_SCORE metric exceeds the USL. This metric estimates the amount of contamination in a sample. The contamination level is computed by taking 2.0 * the average of the adjusted allele frequencies of all variants that were selected. The adjusted allele frequency is either the actual allele frequency of the variant if it is less than 0.5, or 1 - allele frequency if it is greater than or equal to 0.5.

    If the sample does not fail the CONTAMINATION_SCORE, this metric has no intended meaning because it is driven by statistical noise (for example, the few variants that naturally fall outside an expected interval around 0.5 due to random chance).

    High contamination estimates may be due to any of the following:

    • Inter-sample contamination caused by mixing of samples during extraction or library preparation.

    • Intra-sample contamination due to mixing of clonally different cell populations during extraction.

    • Large-scale genomic rearrangements that cause unexpected VAFs for large numbers of variants.
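    The contamination estimate formula can be written out directly. This sketch takes the already-selected variant allele frequencies as input, because how the software selects variants is not described here. The function names and example values are illustrative, and the result is a fraction rather than a percentage.

```python
def adjusted_allele_frequency(af: float) -> float:
    """Use the allele frequency itself below 0.5, otherwise 1 - allele frequency."""
    return af if af < 0.5 else 1.0 - af


def contamination_estimate(selected_afs):
    """Return 2.0 * the mean adjusted allele frequency of the selected variants."""
    if not selected_afs:
        raise ValueError("At least one selected variant is required")
    adjusted = [adjusted_allele_frequency(af) for af in selected_afs]
    return 2.0 * sum(adjusted) / len(adjusted)


# Made-up allele frequencies: adjusted values are about 0.02, 0.02, and 0.04.
estimate = contamination_estimate([0.02, 0.98, 0.04])
print(f"{100 * estimate:.2f}%")
```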

    PCT_TARGET_0.4X_MEAN (%)

    Percentage of target (all locations in manifest) reads that have a coverage depth greater than 0.4x the mean target coverage depth (see definition above).

    Provides an indication of uniformity of coverage of the target regions in the manifest file. When trended over time, reductions in this metric may indicate an issue with the enrichment process resulting in coverage bias.
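    As a worked illustration of this uniformity metric, the sketch below computes the share of target positions whose depth is greater than 0.4 times the mean depth. It assumes a simple list of per-position depths, and the function name and depth values are made up. The software's own calculation may differ in detail.

```python
def pct_above_fraction_of_mean(depths, fraction=0.4):
    """Percentage of positions with depth greater than fraction * mean depth."""
    if not depths:
        raise ValueError("At least one depth value is required")
    mean_depth = sum(depths) / len(depths)
    threshold = fraction * mean_depth
    passing = sum(1 for depth in depths if depth > threshold)
    return 100.0 * passing / len(depths)


# Mean depth is 1000, so the threshold is 400; four of five positions pass.
depths = [1000, 1000, 1000, 200, 1800]
print(pct_above_fraction_of_mean(depths))
```

    A perfectly uniform target gives 100%, and the value drops as more positions fall far below the mean.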

    PCT_TARGET_500X (%)

    Percentage of target bases with 500X fragment coverage. Calculated against all regions in manifest file.

    Can be used in combination with other PCT_TARGET metrics to understand under or over coverage of targets.

    PCT_TARGET_1000X (%)

    Percentage of target bases with 1000X fragment coverage. Calculated against all regions in manifest file.

    Can be used in combination with other PCT_TARGET metrics to understand under or over coverage of targets.

    PCT_TARGET_1500X (%)

    Percentage of target bases with 1500X fragment coverage. Calculated against all regions in manifest file.

    Can be used in combination with other PCT_TARGET metrics to understand under or over coverage of targets.

    PCT_DUPLEXFAMILIES (%)

    Percent of collapsed reads that are duplex (e.g. composed of original forward strand and original reverse strand reads). Number of families that are merged as duplex over total number of families.

    Higher is more desirable. Lower family depth leads to lower percent duplex families. If low, check for underclustering or chemistry concerns.

    MEDIAN_INSERT_SIZE (bp)

    Median fragment size for sample.

    A low median insert size could be a sign of low sample quality or degradation.

    MAX_SOMATIC_AF

    Max somatic allele frequency of a variant; a proxy for tumor fraction. The TMB step flags the variants by potential somatic status using database, VAF and clonal hematopoiesis information. The remaining variants are ranked by variant allele frequency in descending order. The variant allele frequency of first COSMIC hotspot (count >50) or confident somatic variant (having significantly shorter fragment size) is reported as the MaxSomaticVaf for each sample. If no such variant exists, the 4th variant is reported.

    This metric is driven by sample tumor fraction.

    PCT_SOFT_CLIPPED_BASES (%)

    Percentage of bases that were not used for alignment but retained as part of the alignment file.

    Soft clipped reads are used as a part of the downstream analysis for small variants calling. A higher-than-expected number could indicate a low-quality enrichment step.

    PCT_Q30_BASES (%)

    Average percentage of bases ≥ Q30. A prediction of the probability of an incorrect base call (Q‑score).

    An indicator of sequencing run quality. Low Q30 across all samples on a run could be the result of run overclustering.
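    For reference, a Q-score relates to the estimated probability of an incorrect base call through the standard Phred relationship, P = 10^(-Q/10), so Q30 corresponds to an estimated error rate of 1 in 1000. The sketch below shows that conversion and a simple percentage of bases at or above Q30. The function names and quality values are illustrative only.

```python
def error_probability(q_score: float) -> float:
    """Phred relationship: probability that a base call is incorrect."""
    return 10 ** (-q_score / 10)


def pct_q30(base_qualities):
    """Percentage of base calls with a Q-score of 30 or higher."""
    if not base_qualities:
        raise ValueError("At least one base quality is required")
    passing = sum(1 for q in base_qualities if q >= 30)
    return 100.0 * passing / len(base_qualities)


print(error_probability(30))
print(pct_q30([35, 30, 20, 12]))
```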

  • Search for errors in the sample sheet validation log and compare with the guidelines and warnings in Sample Sheet Requirements and the following tables.

  • Open the combined metrics output results file ./<AnalysisFolder>/Results/<PairId>/MetricsOutput.tsv. If a sample fails an analysis step, the Pair ID that contains the sample shows the failure under FAILED_STEPS in the Analysis Status section, and COMPLETED_ALL_STEPS shows as False. If available, review the individual log files for the failed steps under ./<AnalysisFolder>/Logs_Intermediates to identify potential sources of error.

    Sample Sheet not found

    Verify that SampleSheet.csv is present at the top level of the run folder with the name "SampleSheet.csv". If the sample sheet is in a different location, supply the sample sheet using the --sampleSheet option.

    Indexes are not valid for the sequencer and/or assay

    See Valid indexes for assay and instrument combinations for correct indexes for the sequencer and assay.

    Sample Sheet is not in v2 format

    Verify that the format of the sample sheet is v2. A v1 sample sheet is not compatible with DRAGEN TruSight Oncology 500 ctDNA Analysis Software.

    Analysis does not run

    Verify the analysis starts from the run folder, and BCLs or FASTQs are in the correct locations as outlined in Starting From BCL Files and Starting From FASTQ Files respectively.

    TSO 500 ctDNA

    • UP1-UP16

    TSO 500 ctDNA v2

    • UDP0001–UDP0192

    Lane Column without Values

    Ensure that the column is completed. If lane is not applicable to the run, delete the column.

    Format of v2 sample sheet is incorrect

    Verify that the sections and fields are present in the sample sheet and follow the individual rules in Sample Sheet Requirements.

    Unique sample IDs

    Verify that the Sample_IDs are unique in the sample sheet.

    Incorrect folder structure

    Verify that the FASTQ files are in the correct structure. Refer to Starting From FASTQ Files for more information.

    Invalid FASTQ input files

    If the FASTQs are invalid, start TSO 500 ctDNA analysis from BCL files.

    The output file directory contains information from previous analyses

    If this issue is seen, specify a new target output folder and repeat the analysis. To prevent this issue, specify an empty directory before starting the analysis.

    Single exon (single probe) genes are still reported in the CNV VCF file, but not the CNV TSV file

    No action needed; software is working as expected.

    Currently, single-probe genes are not emitted to the Copy Number Variants section of CombinedVariantOutput.tsv. However, you can still find these events in cnv.vcf.gz.

    Because these genes are covered by a single probe, accurate CNV calling has not been validated, and they are emitted as REF.

    Testing of cell lines, contrived samples and commercial controls does not return expected results

    Review the recommendations for using these sample types.

    Failure type: ValueError: Could not find pipeline ID for app BCLConvert in sample sheet SampleSheet.csv

    Action: Ensure StartsFromFastq field is in the [TSO500L_Settings] section, and it is not present in the [BCLConvert_Settings] Section. Refer to Sample Sheet Requirements for more information.

    Filter erroneous reads, which are reads that do not support the fusion. For example, reads that have suspicious supplementary alignment.

  • Deduplicate reads based on UMI information.

  • Apply the following rules for the final fusion output:

    • If the fusion gene pair has been reported in COSMIC, it must have ≥ 2 supporting reads.

    • If the fusion gene pair has not been reported in COSMIC, it must have ≥ 3 supporting reads.

    • At least one fusion breakpoint must fall within the 23 target genes.
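The three output rules above can be expressed as a small predicate. This is a sketch under stated assumptions: the function name and parameters are illustrative, and the gene names in the usage example are placeholders rather than the actual list of 23 target genes.

```python
from typing import Iterable, Set


def passes_fusion_rules(
    supporting_reads: int,
    is_cosmic_gene_pair: bool,
    breakend_genes: Iterable[str],
    target_genes: Set[str],
) -> bool:
    """Apply the read-support and target-gene rules to one fusion candidate."""
    # COSMIC-reported gene pairs need >= 2 supporting reads, others need >= 3.
    min_reads = 2 if is_cosmic_gene_pair else 3
    # At least one breakpoint must fall within a target gene.
    hits_target = any(gene in target_genes for gene in breakend_genes)
    return supporting_reads >= min_reads and hits_target


# Placeholder target set for illustration only.
example_targets = {"GENE_A", "GENE_B"}
print(passes_fusion_rules(2, True, ["GENE_A", "GENE_X"], example_targets))
print(passes_fusion_rules(2, False, ["GENE_A", "GENE_X"], example_targets))
```

The second call fails only because a gene pair not reported in COSMIC needs one more supporting read.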

  • Direction

    The direction of how the breakends are joined.

    Alt_Depth

    The number of read-pairs supporting the fusion call.

    Total_Depth

    Max number of read-pairs aligned to a fusion breakend.

    BP1_Depth

    Number of read-pairs aligned to the first breakend.

    BP2_Depth

    Number of read-pairs aligned to the second breakend.

    VAF

    Variant allele frequency.

    Gene1

    Genes that overlap the first breakend.

    Gene2

    Genes that overlap the second breakend.

    Contig

    The fusion contig.

    Filter

    Indicates whether the fusion has passed all of the fusion filters from DNAFF.

    Is_Cosmic_GenePair

    Indicates whether the gene pair has been reported by COSMIC (True/False).

    Fusion Directionality Known

    Indicates whether the fusion direction is known and is indicated by the order of the genes (True/False).

    Sample

    Input sample ID.

    Name

    Fusion name as reported by the DRAGEN fusion caller.

    Chr1

    The chromosome of the first breakend.

    Pos1

    The position of the first breakend.

    Chr2

    The chromosome of the second breakend.

    Pos2

    The position of the second breakend.

    L1R2

    t[p[

    The left of breakend1 is joined with the right of breakend2.

    L1rL2

    t]p]

    The left of breakend1 is joined with the reverse complement of the left of breakend2.

    L2R1

    ]p]t

    The left of breakend2 is joined with the right of breakend1.

    rR2R1

    [p[t

    The reverse complement of the right of breakend2 is joined with the right of breakend1.
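The Direction values above pair with the VCF breakend ALT templates, where `t` stands for the base at the breakend and `p` for the mate position. The following sketch shows how a template could be expanded into a concrete ALT string. The helper name `breakend_alt` and the example coordinates are illustrative assumptions, not part of the software.

```python
# Direction code -> VCF breakend template, as listed in the table above.
DIRECTION_TO_TEMPLATE = {
    "L1R2": "t[p[",
    "L1rL2": "t]p]",
    "L2R1": "]p]t",
    "rR2R1": "[p[t",
}


def breakend_alt(direction: str, base: str, mate_chrom: str, mate_pos: int) -> str:
    """Expand a breakend template into a concrete ALT string."""
    template = DIRECTION_TO_TEMPLATE[direction]
    # Substitute 't' first: the template still contains only t, p, and brackets,
    # so characters inside the chromosome name cannot be replaced by mistake.
    return template.replace("t", base).replace("p", f"{mate_chrom}:{mate_pos}")


print(breakend_alt("L1R2", "G", "chr2", 321682))   # G[chr2:321682[
print(breakend_alt("rR2R1", "G", "chr2", 321682))  # [chr2:321682[G
```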

    The script installs compatible DRAGEN software and removes any previously installed versions.
  • For Apptainer, use the following command, which does not install Apptainer but installs the analysis software in the Singularity Image File (SIF) format and modifies the software to launch analyses using Apptainer: sudo TMPDIR=/staging /staging/install_DRAGEN_TSO500_CTDNA-2.6.0.run -- --noDockerInstall

  • During the installation process, you might be instructed to reboot or power cycle the system to complete the installation of the DRAGEN software. A power cycle of the system requires the server be shut down and restarted.

  • Log out of the server and then log back in.

  • Install your DRAGEN server licenses if needed:

    1. To run DRAGEN TruSight Oncology 500 ctDNA v2.6.0 Analysis Software, you need the TSOCombined license. This license is pre-installed on DRAGEN servers purchased after August 2022. To check whether the license is already installed, run the /opt/edico/bin/dragen_lic command.

    2. For servers connected to the Internet, install your software licenses as follows:

      1. First, test and confirm that the server is connected to the Internet. Example: ping www.illumina.com

      2. To install the license, enter: /opt/edico/bin/dragen_lic -i auto

    3. For servers not connected to the Internet, contact Illumina Customer Care for license information.

  • After installing DRAGEN server licenses, generate a list of installed DRAGEN server licenses by running the following command: /opt/edico/bin/dragen_lic

    If license installation is successful, the list should include TSOCombined.

    If the expected licenses are not installed, contact Illumina Customer Care.

  • Docker

    20.10 or greater

    Docker 20.10.15

    DRAGEN Software

    v3.10.x where x is 17 or greater

    DRAGEN Software 3.10.17


    Input Folder

    The run folder or FASTQ folder that contains files to analyze.

    FASTQ List CSV

    Do not use. This field applies only to auto-launch of TSO 500 ctDNA analysis from FASTQs after BCL auto-launch.

    Starts from FASTQ

    True for analysis performed on files in the FASTQ folder. False for analysis performed on files in the run folder.

    Sample or Pair IDs

    Optional subset of Sample IDs or Pair IDs to analyze.

    Sample List

    Do not use. This field applies only to auto-launch of TSO 500 ctDNA analysis from FASTQs after BCL auto-launch.

    Storage Size

    The storage size to allocate for the analysis. The default and recommended value is Large.

    '01234' (numeric-only with leading zero)

  • Supported naming patterns:

    • '12340' (numeric without leading zero)

    • 'sample01' (alphanumeric)

    • 'A1234' (alphanumeric)

    • 'test_sample' (alphanumeric with underscore)

  • User Reference

    The analysis run name.

    User Tags

    Text labels to help index the analysis.

    Notify me when task is completed

    Option to receive an email notification when analysis is complete.

    Output Folder

    The path to the analysis output folder. The default path is the project output folder.

    Entitlement Bundle

    Automatically populated from the project details.

    Sample Sheet

    Select a sample sheet in CSV format for the analysis.

    Note: Sample Sheet selection is optional when starting from a run folder and required when submitting a FASTQ folder.


    Sample Sheet Creation in BaseSpace Run Planning tool

    How to Create TSO 500 ctDNA Sample Sheets in BaseSpace Run Planning tool

    The BaseSpace Sequence Hub Run Planning tool generates a valid sample sheet in v2 format for use on a TSO 500 ctDNA supported sequencer, for both ICA and Standalone DRAGEN Server analysis options. Filling out the form in the user interface produces an exportable sample sheet with the required fields filled in. Refer to ICA Auto-launch Sample Sheet Requirements for descriptions of fields that appear in ICA sample sheets.

    The sections below represent each step in the BaseSpace Run Planning tool.

    Note that NovaSeq X Series has a different run setup screen than other instrument platforms, because it allows the user to include multiple assay configurations in one run. DRAGEN TSO 500 ctDNA supports multi-assay flow cell analysis starting in version 2.6.3.

    For ctDNA v2.6.0 and above, in order to run TSO 500 ctDNA on NovaSeq X Series, enter the appropriate Read 1, Read 2, Index 1 and Index 2 described in the instructions below.


    BaseSpace Run Planning tool cannot generate a valid sample sheet for the NovaSeq 6000Dx TSO 500 ctDNA Analysis Application on Illumina Run Manager. Refer to the Sample Sheet Requirements page to create a valid sample sheet.

    Step 1: Run Settings

    Parameter Name
    Required
    Description

    Step 2: Configuration

    Note: On NovaSeq X Series, this page is called "Configuration 1". The right-hand corner of the UI displays the Read 1, Read 2, Index 1, and Index 2 values entered on the previous Run Settings screen.

    Parameter Name
    Required
    Description

    Step 3: Sample Settings

    Users can manually enter sample information, or download a template file to bulk upload sample information. Users can import the completed template or a compatible sample sheet.

    Parameter Name
    Required
    Description

    Step 4: Run Review

    Once all details are captured and pass validation, the user can review the details on the Run Review screen. From here they can choose to edit details in previous screens or export the sample sheet. Once completed, press the Cancel button to finish run planning.

    Note: After you leave this screen, the run and sample sheet are no longer accessible.

    For NovaSeqX Plus users, the run can be saved as a draft or as a planned run (via “Save as Draft” and “Save as Planned” buttons respectively). Either selection will save the run to the Planned Runs screen on BaseSpace. There is no option to export the sample sheet on this screen.

    Planned Runs Screen (NovaSeq X Series only)

    The Planned Runs screen lists all planned or drafted runs. Users can set drafted runs to planned, export the sample sheet, and edit or delete a run on this screen.

    Once the run is saved as Planned, it will appear on the NovaSeq X Series instrument where it can be selected for sequencing.

    For more information on run planning, refer to the BaseSpace Sequence Hub support site page.

    Guided Examples

    Please review these guided examples of analysis workflows that include a step of setting up a run in BaseSpace Run Planning tool:

    CNV

    The copy number variant caller in the TSO 500 ctDNA analysis software performs amplification, reference, and deletion calling for 59 genes. The list of genes can be accessed on the TSO 500 ctDNA product page.

    Target Counts

    The first step in the CNV calling algorithm identifies target counts. It includes extracting signals such as read count and improper pairs and putting them into target intervals.


    Read 1

    Required on Instrument Platform NovaSeq X Series

    • Fill with value 151 for TSO 500 ctDNA analysis

    Index 1

    Required on Instrument Platform NovaSeq X Series

    • Fill with value 10 for TSO 500 ctDNA analysis

    Index 2

    Required on Instrument Platform NovaSeq X Series

    • Fill with value 10 for TSO 500 ctDNA analysis

    Read 2

    Required on Instrument Platform NovaSeq X Series

    • Fill with value 151 for TSO 500 ctDNA analysis

    Sample Container ID

    Optional

    • Unique Identifier for the container that holds the sample

    Index ID

    Required

    Index set ID options are based on selected Index Adapter Kit

    Project

    Optional

    Optional field to describe the associated project

    Starts from Fastq

    Required

    True or False

    If auto-launching TSO 500 ctDNA from BCL files, set the value to False. If auto-launching TSO 500 ctDNA from FASTQ after auto-launching BCL Convert, set the value to True.

    DNA Barcode Mismatches Index 1

    DNA Barcode Mismatches Index 2

    Required on NovaSeq X

    Default value is set to 1.

    These fields are required by NovaSeq X and represent BCL Convert settings for index diversity checks when demultiplexing. These values are not used in TSO 500 ctDNA analysis.

    Run Name

    Required

    Run Name can contain 255 alphanumeric characters, dashes, underscores, periods, and spaces; and must start with an alphanumeric, a dash or an underscore.

    Run Description

    Optional

    Run Description can contain 255 characters except square brackets, asterisks, and commas.

    Instrument Platform

    Required

    Choose from TSO 500 ctDNA supported instruments:

    • NovaSeq 6000/6000Dx

    • NovaSeq X Series

    Secondary Analysis

    Required

    Application

    Required

    • DRAGEN TruSight Oncology 500 ctDNA Analysis Software - 2.6.1

    Description

    Optional

    Optional text field

    Library Prep Kit

    Required

    • TruSight Oncology 500 ctDNA (only for NovaSeq 6000/6000Dx instruments)

    • TruSight Oncology 500 ctDNA v2

    Index Adapter Kit

    Required

    Read Lengths: Read 1 and Read 2

    Required. Not applicable on NovaSeq X Series

    Auto filled with the standard values, but can be optionally overwritten.

    Override Cycles

    Required on NovaSeq X Series

    Entered based on Run Settings read lengths & index 1 / index 2

    Lane Usage

    Not applicable on NovaSeq X Series

    Checkbox allows users to apply the same lane across samples.

    Lane

    Required if Lane Usage is unchecked

    • BaseSpace/Illumina Connected Analytics (to generate sample sheet for cloud analysis. May also be used for analysis on a DRAGEN server.)

    • Local (only for analysis onboard the sequencer— not applicable to TSO 500 ctDNA)

    • None (to generate a sample sheet for analysis on a DRAGEN server. Note that cloud sample sheets may also be used for local analysis.)

    TSO 500 ctDNA:

    • TruSight Oncology 500 ctDNA (NovaSeq6000Dx)

    • TruSight Oncology 500 ctDNA (NovaSeq6000)

    TSO 500 ctDNA v2:

    Specify lanes for each sample. The unmarked checkbox at the top of the dropdown selects all lanes.
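The Run Name and Run Description constraints listed in the Step 1 parameters above can be checked before a run is planned. The sketch below is one possible reading of those rules, not the validation the Run Planning tool performs. It assumes "alphanumeric" means ASCII letters and digits, and that the 255-character limit is inclusive.

```python
import re

# First character: alphanumeric, dash, or underscore.
# Remaining characters (up to 254 more): alphanumerics, dashes, underscores,
# periods, and spaces.
RUN_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-][A-Za-z0-9_. -]{0,254}")

# Characters the Run Description may not contain.
FORBIDDEN_DESCRIPTION_CHARS = set("[]*,")


def is_valid_run_name(name: str) -> bool:
    return RUN_NAME_PATTERN.fullmatch(name) is not None


def is_valid_run_description(text: str) -> bool:
    return len(text) <= 255 and not (set(text) & FORBIDDEN_DESCRIPTION_CHARS)


print(is_valid_run_name("Run_01 ctDNA.v2"))   # True
print(is_valid_run_name(".hidden"))           # False: starts with a period
print(is_valid_run_description("pilot [A]"))  # False: contains square brackets
```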

    GC Bias Correction

    GC bias refers to systematic differences in read coverage that are related to the GC content of the DNA (the percentage of guanine (G) and cytosine (C) bases). Library preparation, capture kits, sequencing system differences, and mapping can contribute to GC bias (note: many of these variables are also captured in the Panel of Normals covered in Normalization below). Regions with very high GC content might be underrepresented because they are harder to amplify or sequence. Regions with moderate GC content might be overrepresented.

    GC bias can impact CNV calling. The DRAGEN GC bias correction module is designed to correct GC bias and generate GC-bias-corrected target counts.

    Normalization

    The goal of normalization is to correct for technical biases and artifacts introduced during sample preparation, sequencing, and data processing. These biases can otherwise lead to reductions in sensitivity and specificity in CNV calling.

    In the CNV caller in the TSO 500 ctDNA analysis software, the normalization step uses a panel of normals (PON) to determine the baseline level from which to call CNV events. Illumina carefully assembles the PON to include a cohort of healthy cell-free DNA samples and contrived samples and updates the PON as needed to improve performance. The PON samples are processed using either TSO 500 ctDNA v1 or v2 library prep kits to accompany different library prep kits within the portfolio.

    The normalized read counts produced at this step are then used for segmentation and fold-change calculation.

    Targeted Segmentation

    In the context of CNV calling, segments are regions of the genome with similar read-depth or copy number profiles. The TSO 500 ctDNA analysis software does not use a segmentation algorithm to assess CNV events. Instead, segments are derived from a predefined CNV resource BED file included with the software. The boundaries within this file determine the intervals used for copy number variant analysis.

    Fold-Change (FC) Calculation

    Fold change measures how much the read coverage in a given genomic region, for example, containing a target gene, differs from a baseline.

    The CNV caller in the TSO 500 ctDNA analysis software calculates fold change as the ratio of observed coverage to expected coverage, where:

    • Observed Coverage: The actual read depth in a region.

    • Expected Coverage: The average read depth in regions assumed to be diploid (2 copies) as represented in the panel of normals (PON).
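As a worked example of the ratio defined above (observed coverage divided by expected coverage), the following sketch computes a fold change from two depth values. The function name and the numbers are illustrative. In the pipeline, the inputs would be normalized counts rather than raw depths.

```python
def fold_change(observed_coverage: float, expected_coverage: float) -> float:
    """Fold change = observed coverage / expected coverage."""
    if expected_coverage <= 0:
        raise ValueError("expected coverage must be positive")
    return observed_coverage / expected_coverage


# A region observed at 1.5x the PON-derived baseline.
print(fold_change(300.0, 200.0))  # 1.5
# A region at half the baseline.
print(fold_change(100.0, 200.0))  # 0.5
```

A value near 1 indicates coverage consistent with the baseline, values above 1 point toward a gain, and values below 1 point toward a loss.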

    Calling / Genotyping

    The CNV caller in the TSO 500 ctDNA analysis software uses gene-specific thresholds established by Illumina based on the PON to call amplifications and deletions. The thresholds indicate what fold-change values each gene needs to reach to be called an amplification or a deletion.

    In the VCF notation, <DUP> indicates that the detected fold change is greater than a predefined amplification cutoff for that gene. <DEL> indicates that the fold change is less than a predefined deletion cutoff for that gene. The cutoffs vary from gene to gene and are stored in cnv/dragen_tso500_manifest_59genes_predefined_cutoff_cfDNA.bed.
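The threshold comparison described above can be sketched as follows. The cutoff values in the example are invented for illustration and are not taken from the cutoff BED file. The string "REF" is used here only as a label for calls that cross neither cutoff.

```python
def call_cnv(fold_change: float, amp_cutoff: float, del_cutoff: float) -> str:
    """Classify one gene from its fold change and its gene-specific cutoffs."""
    if fold_change > amp_cutoff:
        return "<DUP>"   # amplification
    if fold_change < del_cutoff:
        return "<DEL>"   # deletion
    return "REF"         # neither cutoff crossed


# Hypothetical cutoffs for a single gene.
AMP_CUTOFF = 1.4
DEL_CUTOFF = 0.8

for fc in (1.6, 1.0, 0.7):
    print(fc, call_cnv(fc, AMP_CUTOFF, DEL_CUTOFF))
```

Because the cutoffs are gene-specific, the same fold change can produce different calls for different genes.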

    Output

    CNV calls are output in several files:

    • Combined Variant Output File, {Sample_ID}_CombinedVariantOutput.tsv

    • Copy Number Variants VCF, {Sample_ID}.cnv.vcf

    • Copy Number Variants Metrics CSV, {Sample_ID}.cnv_metrics.csv

    Combined Variant Output File

    File name: {Sample_ID}_CombinedVariantOutput.tsv

    The Combined Variant Output File displays CNV calling results in the section [Copy Number Variants]. Each gene is accompanied by a fold change value.

    To be included in the Combined Variant Output File, copy number variants must meet the following conditions:

    • FILTER field in the Copy Number Variants VCF file marked as PASS

    • ALT field is <DUP> or <DEL>

    • Gene is part of the copy number variant gene list
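The three inclusion conditions above amount to a simple filter over CNV VCF records. The sketch below assumes each record has already been parsed into its FILTER value, ALT value, and gene name. The function and parameter names are illustrative, and the gene names are placeholders.

```python
from typing import Set

REPORTABLE_ALTS = {"<DUP>", "<DEL>"}


def include_in_combined_output(
    filter_field: str, alt: str, gene: str, cnv_gene_list: Set[str]
) -> bool:
    """Return True when a CNV record meets all three reporting conditions."""
    return (
        filter_field == "PASS"
        and alt in REPORTABLE_ALTS
        and gene in cnv_gene_list
    )


genes = {"GENE_A", "GENE_B"}  # placeholder for the CNV gene list
print(include_in_combined_output("PASS", "<DUP>", "GENE_A", genes))  # True
print(include_in_combined_output("PASS", ".", "GENE_A", genes))      # False
```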

    Copy Number Variants VCF

    File name: {Sample_ID}.cnv.vcf

    The file includes CNV calling results in the VCF format. <DUP> indicates amplifications, <DEL> indicates deletions.

    Each CNV call is accompanied by a Q-score in the QUAL column. The Q-score is a Phred transformation of the p-value, Q = -10 × log10(p-value), where the p-value is derived from a t-test of the fold change (FC) of the gene against the rest of the genome.

    Higher Q-scores indicate higher confidence in the CNV call.
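To make the relationship between p-value and Q-score concrete, the sketch below applies the standard Phred transformation, Q = -10 × log10(p). It assumes the software uses this standard form without additional capping or rounding, which the text does not specify.

```python
import math


def phred_q(p_value: float) -> float:
    """Standard Phred transformation of a p-value."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p-value must be in (0, 1]")
    return -10.0 * math.log10(p_value)


for p in (0.1, 0.01, 0.001):
    print(p, round(phred_q(p), 1))
```

Each tenfold decrease in the p-value adds 10 to the Q-score, so smaller p-values map to higher confidence.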

    The Copy Number Variants VCF includes additional details on CNV genomic position, breakpoints, bins and segments.

    Copy Number Variants Metrics CSV

    File name: {Sample_ID}.cnv_metrics.csv

    The file includes copy number variant metrics reported on a per sample level.

    Name
    Description

    Sex Genotyper

    The predicted sex of the sample

    Bases in reference genome

    The number of bases in the reference genome

    Average alignment coverage over genome

    The average alignment coverage across the reference genome

    Number of alignment records

    The number of alignment records in the sample

    Number of filtered records (total)

    The number of total filtered records

    Number of filtered records (duplicates)

    The number of duplicated filtered records

    Fold Change = Observed Coverage / Expected Coverage

    Installation of TSO 500 ctDNA v2.6.2, v2.6.3 on Standalone DRAGEN Server

    Overview

    The installation script for DRAGEN TruSight Oncology 500 ctDNA Analysis Software installs the following software and dependencies:

    1. DRAGEN TruSight Oncology 500 ctDNA Analysis Software itself

    2. DRAGEN Software if a compatible version is not present

    3. Docker software if a compatible version is not present

    4. A script required to generate DRAGEN genome hash table

    5. A script to check that DRAGEN TruSight Oncology 500 ctDNA Analysis Software is installed properly

    Installation Requirements

    Hardware

    • DRAGEN server v3 or v4

    Software

    • Linux CentOS 7.9 operating system (or later) or Oracle Linux 8 (or later), one of which is provided on the server. Oracle Linux 8 is recommended.

    • Docker Software, see table below for minimum version needed. If sufficient Docker software is not present on the server, the TSO 500 installer will install compatible Docker software.

    • DRAGEN Server Software*, see table below for minimum version needed as the host version on the server. If sufficient DRAGEN software is not present on the server, the TSO 500 installer will install compatible DRAGEN software.

    Software Dependency
    Compatible
    Installs

    *The DRAGEN Server Software version may be higher than the DRAGEN version used by the DRAGEN TSO 500 ctDNA v2.6.2 pipeline (DRAGEN v3.10.18), which is provided inside the DRAGEN TSO 500 ctDNA docker image.

    Licenses

    • TSO500Combined license

    TSO500Combined license has been pre-installed on DRAGEN servers in manufacturing since August 2022. To generate a list of installed DRAGEN server licenses, run the following command: /usr/bin/dragen_lic. If a license is not installed, contact Illumina Customer Care for the license.

    Permissions

    Illumina recommends logging in as root user for installation, but as a non-root user for running TSO 500 ctDNA analysis.

    • A non-root user must be a member of the Docker group to run Docker. For more information on Docker permission requirements and alternatives to running as root, refer to the Docker documentation available on the Docker website.

    • Installing and uninstalling DRAGEN TruSight Oncology 500 ctDNA Analysis Software and running the system check requires root privileges.

    • Run DRAGEN TruSight Oncology 500 ctDNA Analysis Software without being logged in as a root user. Running the DRAGEN TruSight Oncology 500 ctDNA Analysis Software as root is not required or recommended.

    Compatibility with other DRAGEN pipelines

    DRAGEN TSO 500 Analysis Software ctDNA v2.6.2 and v2.6.3 (v2.6.2+) are multi-version compatible. Multi-version compatibility refers to the ability to be installed on a single DRAGEN server alongside software that runs a different version of DRAGEN software. For example, multi-version compatible pipelines running DRAGEN v4.3.6+ can be co-installed on a server alongside DRAGEN TSO 500 ctDNA pipelines with DRAGEN server software v3.10.19 and above. For more details on DRAGEN multi-version compatibility, refer to page 7 of the DRAGEN v4.3.6 software release notes.

    Software versions without multi-version compatibility are referred to as single-version compatible. Installing DRAGEN TSO 500 ctDNA Analysis Software v2.6.2+ disrupts installations of single-version compatible software on the DRAGEN server. To uninstall a previous version of DRAGEN TSO 500 ctDNA Analysis Software, refer to the respective guide.

    Compatibility of software for co-installation with DRAGEN TSO 500 ctDNA on a DRAGEN server is summarized in the table below:

    Software
    Version
    Type
    Compatible

    *DRAGEN TSO 500 Analysis Software ctDNA v2.6.2+ can run on a single server with DRAGEN TSO 500 v2.5.4, and should be installed after v2.5.4. DRAGEN TSO 500 Analysis Software ctDNA v2.6.2+ can be co-installed with multi-version compatible DRAGEN TSO 500 Analysis Software or other DRAGEN pipelines with any order of installation.

    **For example, DRAGEN Enrichment, DRAGEN Germline, and others. Order of installation does not matter.

    Installation Instructions

    Pre- and Post- Installation Steps

    1. Uninstall all existing single-version compatible software on the server (see table above)

    2. Install DRAGEN TSO 500 ctDNA Analysis Software v2.6.2+

      • This step will disrupt previously installed single-version compatible software but will not impact multi-version compatible ones (see table above)

    Steps to Install DRAGEN TSO 500 ctDNA Analysis Software v2.6.2+

    As a root user, perform the following steps to install DRAGEN TSO 500 ctDNA Analysis Software v2.6.2 or v2.6.3 (v2.6.2 is used as the example):

    1. Contact Illumina Customer Care to obtain the DRAGEN TSO 500 ctDNA Analysis Software installer package.

    2. Download the installation package provided in the email from Illumina. The link expires after 7 days.


    It is recommended to use a command line tool like wget or curl to download the file rather than pasting the link into the web browser bar. For example:

    curl -o {filename} "{link}"

    wget -O {filename} "{link}"

    Where the file name is the installation script file name, and the link is provided by Illumina Customer Care.

    3. Make sure no other analysis is being performed. Installing the software while performing other analyses prevents the installer process from proceeding.

    4. Copy the installation script to the /staging directory.


    Installation Script: install_DRAGEN_TSO500_CTDNA-2.6.2.run

    SHA256 value: 1324c86183526e12afb267f553a569cf78b01fd7a4ee85f8e07cc2f6a33d8f41

    Installation Script: install_DRAGEN_TSO500_CTDNA-2.6.3.run

    SHA256 value: 77a4f3b44af22c8c83e0bfe43d3b186d2b74dc58b8bbdb6f324ba3ce6220ab9e

    5. Use the following command to update the run script permission: chmod +x /staging/install_DRAGEN_TSO500_CTDNA-2.6.2.run

    6. Use the following command to run the installation script (run time ~ 20 minutes):

      1. For Docker, use the following command: sudo TMPDIR=/staging /staging/install_DRAGEN_TSO500_CTDNA-2.6.2.run
         The script installs compatible DRAGEN software and removes any previously installed versions.
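Before running the installer, the downloaded script can be compared against the SHA256 values listed above. The following sketch uses Python's standard `hashlib` module. The helper names are illustrative, and the /staging path in the commented usage follows the copy step in the instructions.

```python
import hashlib

# SHA256 values as published in the installation instructions above.
EXPECTED_SHA256 = {
    "install_DRAGEN_TSO500_CTDNA-2.6.2.run":
        "1324c86183526e12afb267f553a569cf78b01fd7a4ee85f8e07cc2f6a33d8f41",
    "install_DRAGEN_TSO500_CTDNA-2.6.3.run":
        "77a4f3b44af22c8c83e0bfe43d3b186d2b74dc58b8bbdb6f324ba3ce6220ab9e",
}


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large installers are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path: str, expected_hex: str) -> bool:
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example usage (path assumes the script was copied to /staging):
# name = "install_DRAGEN_TSO500_CTDNA-2.6.2.run"
# print(checksum_matches(f"/staging/{name}", EXPECTED_SHA256[name]))
```

If the values do not match, the download is likely incomplete or corrupted and should be repeated before installation.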

    Running the System Check

    After installation is complete, make sure the system functions properly by running the following command: /usr/local/bin/check_DRAGEN_TSO500_CTDNA-2.6.2.sh

    The script checks that:

    • All required services are running

    • Proper Docker image is installed

    • DRAGEN TSO 500 ctDNA Analysis Software can successfully process a test data set

    The system check script runs for approximately 25 minutes. If the script prints a failure message, contact Illumina Technical Support and provide the /staging/check_DRAGEN_TSO500_CTDNA_<timestamp>.tgz output file.

    If using macOS to connect to a server, an error can occur if the locale settings are not in English. To resolve the error, disable the ability to set environment variables automatically in Terminal settings.

    Uninstall Software

    The DRAGEN TSO 500 ctDNA Analysis Software installation includes an uninstall script called uninstall_DRAGEN_TSO500_CTDNA-2.6.2.sh, which is located in /usr/local/bin.

    Executing the uninstall script removes the following assets:

    • All DRAGEN TSO 500 Analysis Software related scripts located in /usr/local/bin

    • Resources found in /staging/illumina/DRAGEN_TSO500_CTDNA-2.6.2

    • The dragen_tso500_ctdna Docker image

    To uninstall the DRAGEN TSO 500 ctDNA Analysis Software, run the following command as a root user:

    uninstall_DRAGEN_TSO500_CTDNA-2.6.2.sh

    You are not required to uninstall Docker or DRAGEN software. To remove Docker, review the install instructions for your operating system in the Docker documentation.

    Small Variants

    Small Variant Calling and Filtering

    The DRAGEN TSO 500 ctDNA Analysis Software supports calling SNVs, indels, MNVs, and delins from cfDNA samples by using mapped and aligned DNA reads from a plasma sample as input.

    Variants are detected via both column-wise pileup analysis and local de novo assembly of haplotypes. The de novo haplotypes allow the detection of much larger insertions and deletions than is possible through column-wise pileup analysis alone. Insertions and deletions called by the TSO 500 ctDNA analysis software do not have a size limitation, but the level of performance testing differs depending on the length.

    TruSight Oncology 500 ctDNA Index Set A and B (UDP 1-192) (NovaSeq6000Dx, NovaSeqX Series)
  • TruSight Oncology 500 ctDNA Index Set A and B (UDP 1-192) (NovaSeq6000)

  • Number of filtered records (MAPQ)

    The number of MAPQ filtered records

    Number of filtered records (unmapped)

    The number of unmapped filtered records

    Coverage MAD

    Gene Scaled MAD

    Median Bin Count

    Number of target intervals

    The number of target intervals in the sample

    Number of normal samples

    Number of segments

    The number of segments in the sample. Applicable only to CNV SLM mode.

    Number of amplifications

    The number of amplifications in the sample. Applicable only to CNV SLM mode.

    Number of deletions

    The number of deletions in the sample. Applicable only to CNV SLM mode.

    Number of passing amplifications

    The number of passing amplifications in the sample. Applicable only to CNV SLM mode.

    Number of passing deletions

    The number of passing deletions in the sample. Applicable only to CNV SLM mode.

    DRAGEN TSO 500

    2.6.0

    Single-version

    No

    DRAGEN TSO 500

    2.5.4

    Multi-version

    Yes*

    DRAGEN TSO 500

    2.5.3 or below

    Single-version

    No

    DRAGEN pipelines**

    4.3.6+

    Multi-version

    Yes

    DRAGEN pipelines**

    4.2 or below

    Single-version

    No

    Install other multi-version DRAGEN software/pipelines if needed
    • Multi-version software can be installed in any order

    1. If a new DRAGEN version is not installed by the above command, use sudo TMPDIR=/staging /staging/install_DRAGEN_TSO500_CTDNA-2.6.2.run -- --forceDragenInstall to force a reinstall of DRAGEN.

  • For Apptainer, use the following command, which does not install Apptainer but installs the analysis software in the Singularity Image File (SIF) format and modifies the software to launch analyses using Apptainer: sudo TMPDIR=/staging /staging/install_DRAGEN_TSO500_CTDNA-2.6.2.run -- --noDockerInstall

  • During the installation process, you may be instructed to reboot or power cycle the system to complete the installation of the DRAGEN software. A power cycle of the system requires the server be shut down and restarted.

  • Log out of the server and log back in.

  • Install your DRAGEN server licenses if needed (use /opt/dragen/3.11.2/bin/dragen_lic for v2.6.3):

    1. To run DRAGEN TSO 500 ctDNA Analysis Software v2.6.2, you need the TSOCombined license. This license is pre-installed on DRAGEN servers purchased after August 2022. To check whether the license is already installed, run the /opt/dragen/3.10.19/bin/dragen_lic command.

    2. For servers connected to the Internet, install your software licenses as follows:

      1. First, test and confirm that the server is connected to the Internet. Example: ping www.illumina.com

      2. To install the license, enter: /opt/dragen/3.10.19/bin/dragen_lic -i auto

    3. For servers not connected to the Internet, contact Illumina Customer Care for license information.

  • After installing DRAGEN server licenses, generate a list of installed licenses by running the following command: /opt/edico/bin/dragen_lic

    If license installation is successful, the list should include TSOCombined.

    If the expected licenses are not installed, contact Illumina Customer Care.

  • Docker

    20.10 or greater

    Docker 20.10.15

    DRAGEN Server Software*

    v3.10.x, where x >=19, v4.3+ for TSO500 ctDNA v2.6.2;

    v3.11.x, where x >=2, v4.3+ for TSO500 ctDNA v2.6.3

    DRAGEN Software 3.10.19 for TSO500 ctDNA v2.6.2;

    DRAGEN Software 3.11.2 for TSO500 ctDNA v2.6.3

    DRAGEN TSO 500 ctDNA

    2.6.2+

    Multi-version

    Yes

    DRAGEN TSO 500 ctDNA

    2.6.1 or below

    Single-version

    No

    DRAGEN TSO 500

    2.6.1+

    Multi-version

    Yes

    To call variants via local de novo assembly of haplotypes in active regions, haplotypes are first generated with a de Bruijn graph. The likelihood of a read supporting a haplotype is calculated using a pair hidden Markov model (pair-HMM). The Somatic Score (SQ) is calculated as the joint posterior probability that a variant is present in the sample. For each variant candidate, background noise at the same site is taken into account using a systematic noise file. A p-value is calculated from the observed variant depth, total depth, and systematic noise using a binomial distribution, and is then converted to a variant quality score (AQ).
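The AQ computation described above can be illustrated with a rough sketch. It computes an upper-tail binomial p-value for the observed variant depth given the site noise rate, then converts it to a Phred-style score. The Phred conversion (-10 · log10 p), the score cap, the function names, and the example numbers are assumptions made for illustration; the exact conversion used by DRAGEN is not stated here.

```python
import math


def binomial_tail_pvalue(alt_depth: int, total_depth: int, noise_rate: float) -> float:
    """Return P(X >= alt_depth) for X ~ Binomial(total_depth, noise_rate)."""
    if alt_depth <= 0:
        return 1.0
    if alt_depth > total_depth or noise_rate <= 0.0:
        return 0.0
    if noise_rate >= 1.0:
        return 1.0
    log_p = math.log(noise_rate)
    log_q = math.log1p(-noise_rate)
    tail = 0.0
    for k in range(alt_depth, total_depth + 1):
        # Log of C(total_depth, k) * p^k * (1 - p)^(total_depth - k),
        # evaluated in log space to stay stable at high depth.
        log_term = (
            math.lgamma(total_depth + 1)
            - math.lgamma(k + 1)
            - math.lgamma(total_depth - k + 1)
            + k * log_p
            + (total_depth - k) * log_q
        )
        tail += math.exp(log_term)
    return min(tail, 1.0)


def phred_scale(p_value: float, cap: float = 3000.0) -> float:
    """Convert a p-value to a Phred-style score (assumed conversion), capped to avoid log(0)."""
    if p_value <= 0.0:
        return cap
    return min(-10.0 * math.log10(p_value), cap)


# Hypothetical site: 12 supporting reads out of 2000, noise rate 0.001.
aq = phred_scale(binomial_tail_pvalue(12, 2000, 0.001))
print(f"AQ ~ {aq:.1f}")
```

Under this sketch, more supporting reads at the same depth and noise rate give a smaller p-value and therefore a higher score.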

    Variants are called if SQ >= 2 and AQ >= 20 for variants present in the Catalogue of Somatic Mutations in Cancer (COSMIC) with count > 50 (hotspots), or if SQ >= 2 and AQ >= 60 for the remaining sites (nonhotspots).
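The calling thresholds above can be written as a small decision function. This is a minimal sketch: the function and parameter names are hypothetical, and the hotspot boundary follows the "count > 50" wording of the preceding paragraph.

```python
def passes_call_thresholds(sq: float, aq: float, cosmic_count: int) -> bool:
    """Apply the SQ/AQ thresholds for hotspot and nonhotspot sites."""
    is_hotspot = cosmic_count > 50      # hotspot definition as worded above
    min_aq = 20 if is_hotspot else 60   # hotspots use the more lenient AQ cutoff
    return sq >= 2 and aq >= min_aq


# A hotspot variant with AQ 25 is called; the same scores at a nonhotspot are not.
print(passes_call_thresholds(sq=5.0, aq=25.0, cosmic_count=120))  # True
print(passes_call_thresholds(sq=5.0, aq=25.0, cosmic_count=0))    # False
```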

    In addition, DRAGEN uses the de novo assembly to detect SNVs, insertions, and deletions that are co-phased and part of the same haplotypes. Any such co-phased variants that are within a window of 15 bp are then reassembled into complex variants (MNVs and delins).

    The pipeline makes no ploidy assumptions, enabling detection of low-frequency alleles.

    DRAGEN small variant calling includes the following steps:

    1. Detects regions with sufficient read coverage (callable regions).

    2. Detects regions where the reads deviate from the reference and there is a possibility of a germline or somatic call (active regions).

    3. Assembles de novo graph haplotypes from reads (haplotype assembly).

    4. Extracts possible somatic or germline calls (events) from column-wise pileup analysis.

    5. Calibrates read base qualities to account for sample-specific noise.

    6. Computes read likelihoods for each read/haplotype pair.

    7. Performs variant calling by summing the genotype probabilities across all read/haplotype pairs.

    8. Performs additional filtering to improve variant calling accuracy (see Filter Status).

    Systematic Noise File

    The DRAGEN TSO 500 ctDNA Analysis Software uses a systematic noise file to improve variant calling accuracy. The file indicates the statistical probability of noise at specific positions in the genome. Illumina constructed sequencer-specific noise files using 40-60 normal cfDNA libraries. Regions where noise is common (for example, difficult-to-map regions) have higher noise values. The small variant caller penalizes those regions to reduce the probability of making false positive calls.

    The systematic noise file accounts for site-specific noise by estimating the average allele frequency over multiple normal samples.
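As a toy illustration of estimating site-specific noise from normal samples, the sketch below averages the per-sample alternate allele frequency at one site. The input layout, the function name, and the choice of a plain unweighted mean are assumptions; the actual procedure used to build the noise files is not described here.

```python
from statistics import mean
from typing import Sequence, Tuple


def site_noise_estimate(normal_counts: Sequence[Tuple[int, int]]) -> float:
    """Average alt allele frequency at one site across normal samples.

    normal_counts holds one (alt_depth, total_depth) pair per normal sample.
    Samples with zero depth at the site are skipped.
    """
    frequencies = [alt / depth for alt, depth in normal_counts if depth > 0]
    return mean(frequencies) if frequencies else 0.0


# Hypothetical counts from three normal libraries at a single position.
print(site_noise_estimate([(1, 1000), (3, 1000), (2, 2000)]))
```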

    Germline, Somatic and Clonal Hematopoiesis (CH) tagging

    The Tumor Mutational Burden (TMB) module of the DRAGEN TSO 500 ctDNA Analysis Software predicts whether a small variant is of germline or somatic origin, as well as whether the variant is associated with Clonal Hematopoiesis (CH). The results are output in the TMB Trace TSV and Small Variant VCF files.

    Please review the TMB algorithm page for more details.

    Warning: Variant statuses (somatic, germline, clonal hematopoiesis (CH) variant) are predictions intended for TMB calculation. Use caution if using them separately as their performance has not been tested outside of the TMB algorithm.

    Outputs

    The DRAGEN TSO 500 ctDNA Analysis Software produces several files with small variant calling results, including:

    • Combined Variant Output File, {SampleID}_CombinedVariantOutput.tsv

    • Small Variant VCF, {SampleID}_hard-filtered.vcf

    • Small Variant Genome VCF, {SAMPLE_ID}_hard-filtered.gvcf.gz

    • Small Variant Annotated JSON, {SAMPLE_ID}_SmallVariants_Annotated.json.gz

    Combined Variant Output File

    File name: {SampleID}_CombinedVariantOutput.tsv

    All variants with the FILTER field marked as PASS in the Small Variant Genome VCF are present in the Combined Variant Output.

    • Gene information is only present for variants belonging to canonical transcripts that are within the Gene Allow List–Small Variants.

    • Transcript information is only present for variants belonging to canonical transcripts that are within the Gene Allow List–Small Variants.

    Combined variant output produces small variants with blank fields in the following situations:

    • The variant has been matched to a canonical RefSeq transcript on an overlapping gene not targeted by TruSight Oncology 500 ctDNA.

    • The variant is located in a region designated iSNP, indel, or Flanking in the TST500_Manifest.bed file located in the Resources folder.

    Small Variant VCF

    File name: {SampleID}_hard-filtered.vcf

    The Small Variant VCF file outputs all small variant calling results.

    MNVs and Phased Variants

    The small variant file contains both phased variants and all other small variants. The header sections from both the phased variant (complex) VCF and the small variant VCF are included in this merged VCF. Variants that appear in both the phased variant VCF and the small variant VCF are displayed only as phased variants.

    Germline Status

    The Small Variant VCF file contains predicted germline, somatic, and clonal hematopoiesis (CH) variants that can be further filtered using GermlineStatus in the INFO field. See the TMB algorithm page for more details.
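Filtering on GermlineStatus can be done with a few lines of plain Python, as in the hedged sketch below. It relies only on the standard VCF layout (tab-separated columns, INFO as the eighth column, semicolon-separated key=value entries). The function names and the example records are hypothetical, and the status values shown are assumed to match the tags described on the TMB algorithm page.

```python
from typing import Dict, Iterable, List, Set, Union


def parse_info(info_field: str) -> Dict[str, Union[str, bool]]:
    """Split a VCF INFO column into a dict; flag entries map to True."""
    info: Dict[str, Union[str, bool]] = {}
    for entry in info_field.split(";"):
        if not entry or entry == ".":
            continue
        key, sep, value = entry.partition("=")
        info[key] = value if sep else True
    return info


def select_by_germline_status(vcf_lines: Iterable[str], wanted: Set[str]) -> List[str]:
    """Return data lines whose INFO GermlineStatus is in the wanted set."""
    selected = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        if parse_info(fields[7]).get("GermlineStatus") in wanted:
            selected.append(line)
    return selected


# Hypothetical records; real files carry more INFO keys and sample columns.
records = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tT\t.\tPASS\tDP=1500;GermlineStatus=Somatic",
    "chr2\t200\t.\tG\tC\t.\tPASS\tDP=1800;GermlineStatus=Germline_DB",
]
print(select_by_germline_status(records, {"Somatic"}))
```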

    Filter Status

    Variants can be filtered using the tags assigned in the FILTER field, as described in the following table.

    Filter
    Description

    base_quality

    Site filtered because median base quality of alt reads at this locus does not meet threshold

    filtered_reads

    Site filtered because too large a fraction of reads have been filtered out

    fragment_length

    Site filtered because absolute difference between the median fragment length of alt reads and median fragment length of ref reads at this locus exceeds threshold

    low_depth

    Site filtered because the read depth is too low (<1000)

    low_frac_info_reads

    Site filtered because the fraction of informative reads is below threshold (<0.5)

    low_normal_depth

    Site filtered because the normal sample read depth is too low

    2 This is a static list of regions compiled by Illumina. Email Illumina Technical Support for more information.
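To see which filters remove the most sites in a sample, the FILTER column can be tallied. The sketch below assumes only the standard VCF layout (FILTER is the seventh tab-separated column and may hold several semicolon-separated tags); the function name and the example records are hypothetical.

```python
from collections import Counter
from typing import Iterable


def tally_filter_tags(vcf_lines: Iterable[str]) -> Counter:
    """Count how often each FILTER tag (including PASS) occurs."""
    counts: Counter = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # not a complete VCF record
        for tag in fields[6].split(";"):
            counts[tag] += 1
    return counts


# Hypothetical records for illustration.
records = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tT\t.\tPASS\tDP=1500",
    "chr1\t200\t.\tG\tC\t.\tlow_depth;weak_evidence\tDP=300",
    "chr1\t300\t.\tC\tT\t.\tlow_depth\tDP=450",
]
for tag, count in tally_filter_tags(records).most_common():
    print(tag, count)
```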

    Small Variant Genome VCF

    File name: {SAMPLE_ID}_hard-filtered.gvcf.gz

    The small variant genome VCF file includes the variant call status for all positions in all targeted intervals.

    Small Variant Annotated JSON

    File name: {SAMPLE_ID}_SmallVariants_Annotated.json.gz

    The small variants annotated file provides variant annotation information for all non-reference positions in the VCF, including non-pass variants. The variant consequence definition is available on the Sequence Ontology website.

    All pass variant calls are annotated using the Illumina Annotation Engine (IAE), also known as Nirvana, with the following information (using the RefSeq transcript):

    • HGNC Gene

      • Transcript

      • Exon

      • Consequence

      • c.HGVS

      • p.HGVS

    • COSMIC


    TMB

    Tumor mutational burden (TMB) is the total number of somatic mutations present within the cancer genome.

    To calculate TMB, the algorithm performs the following steps.

    Small variant calling

    Refer to Small Variants for how small variants are called.

    long_indel

    Site filtered because the indel length is too long

    mapping_quality

    Site filtered because median mapping quality of alt reads at this locus does not meet threshold (<30)

    multiallelic

    Site filtered because more than two alt alleles pass tumor LOD

    non_homref_normal

    Site filtered because the normal sample genotype is not homozygous reference

    no_reliable_supporting_read

    Site filtered because no reliable supporting somatic read exists

    panel_of_normals

    Seen in at least one sample in the panel of normals vcf

    read_position

    Site filtered because median of distances between start/end of read and this locus is below threshold

    RMxNRepeatRegion

    Site filtered because all or part of the variant allele is a repeat of the reference

    str_contraction

    Site filtered due to suspected PCR error where the alt allele is one repeat unit less than the reference

    too_few_supporting_reads

    Site filtered because there are too few supporting reads in the tumor sample

    weak_evidence

    Somatic variant score does not meet threshold (SQ < 2)

    systematic_noise

    Site filtered based on evidence of systematic noise in normals. Candidate has a low AQ score: AQ < 20 for variants with COSMIC count ≥ 50, or AQ < 60 for all other sites.

    excluded_regions

    Site overlaps with vc excluded regions bed (see footnote 2)

    Eligible region detection

    TMB is computed over protein coding regions with sufficient coverage, excluding low-confidence regions (the block list regions). For the DRAGEN TSO 500 ctDNA Analysis Software, the total coding region with coverage ≥ 1000X is used.

    Germline variant identification

    To exclude germline variants from TMB calculation, the algorithm includes two methods for predicting germline variant origin.

    1. Database filter

    Variants with a population allele count ≥ 10 in either the 1000 Genomes or gnomAD database are marked as germline and assigned the tag Germline_DB in the “tmb.trace.tsv” and “hard-filtered.vcf” files.
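The database filter reduces to a threshold on the largest population allele count. The sketch below is illustrative only: the function name and the use of None for "variant absent from a database" are assumptions, and gnomAD is split into exome and genome counts because the TMB trace file reports them separately.

```python
from typing import Optional


def is_germline_by_database(
    count_1000_genomes: Optional[int],
    count_gnomad_exome: Optional[int],
    count_gnomad_genome: Optional[int],
    min_count: int = 10,
) -> bool:
    """True if any database reports an allele count at or above min_count."""
    observed = [
        count
        for count in (count_1000_genomes, count_gnomad_exome, count_gnomad_genome)
        if count is not None  # None means the variant is absent from that database
    ]
    return bool(observed) and max(observed) >= min_count


print(is_germline_by_database(None, 3, 12))     # True: one database reaches 10
print(is_germline_by_database(2, None, 9))      # False: all counts below 10
```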

    2. Proxi filter

    In the TSO 500 ctDNA pipeline, the proxi filter uses a probabilistic approach. For a target variant, it estimates the expected germline allele frequency using the surrounding germline variants. It then tests whether the allele frequency of the target variant is similar to the expected germline allele frequency. If the allele frequency is similar to expected, a tag Germline_Proxi is assigned to the target variant in the “tmb.trace.tsv” and “hard-filtered.vcf” files.

    Note: The proxi filter does not work well for 100% pure cell lines or for mixed or contaminated samples, because these samples do not have clear germline variant allele frequency distributions.

    Clonal hematopoiesis (CH) variant identification

    Clonal hematopoiesis (CH) is characterized by the overrepresentation of blood cells derived from a single clone. CH is common and increases in prevalence with age. For the accurate determination of TMB, the CH variants need to be excluded.

    The TSO 500 ctDNA pipeline uses two methods to tag variants as CH variants.

    1. CH genes whitelist

    Some of the most commonly mutated genes in clonal hematopoiesis, DNMT3A, TET2, PPM1D, and ASXL1, are included in the CH genes whitelist. If the variant is in one of these genes, the tag Somatic_Putative_CH is assigned to the variant in the “tmb.trace.tsv” and “hard-filtered.vcf” files.

    2. cfDNA fragment size analysis

    CH-derived cfDNA fragments are generally longer than tumor-derived cfDNA fragments. This difference is used to identify CH variants based on the fragment size of reads supporting variant calls. Non-germline variants supported by the longer fragments are tagged as Somatic_Putative_CH in the “tmb.trace.tsv” and “hard-filtered.vcf” files.

    Only variants with a sufficient level of supporting reads, or variant allele count (VAC) > 50, are tested for a fragment size difference between the reads supporting the reference allele and the reads supporting the variant allele. Non-germline variants with lower VAC, or without enough statistical power for the size difference test, remain tagged as Somatic in the “tmb.trace.tsv” and “hard-filtered.vcf” files.
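The two CH tagging methods can be combined into one decision function, sketched below for a variant already determined to be non-germline. The statistical fragment size test itself is not specified here, so its outcome is passed in as a boolean. The function and parameter names are hypothetical.

```python
CH_GENES = {"DNMT3A", "TET2", "PPM1D", "ASXL1"}  # CH genes whitelist from the text
MIN_VAC_FOR_SIZE_TEST = 50                       # size test applies only when VAC > 50


def tag_non_germline_variant(gene: str, vac: int, alt_fragments_longer: bool) -> str:
    """Return Somatic_Putative_CH or Somatic for a non-germline variant.

    alt_fragments_longer stands in for the outcome of the fragment size
    test (variant-supporting fragments significantly longer than
    reference-supporting fragments); the test itself is not modeled.
    """
    if gene in CH_GENES:
        return "Somatic_Putative_CH"
    if vac > MIN_VAC_FOR_SIZE_TEST and alt_fragments_longer:
        return "Somatic_Putative_CH"
    return "Somatic"


print(tag_non_germline_variant("TET2", vac=8, alt_fragments_longer=False))   # Somatic_Putative_CH
print(tag_non_germline_variant("KRAS", vac=30, alt_fragments_longer=True))   # Somatic
```

In the second call the variant stays Somatic because its VAC is too low for the size test to be applied.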

    Figure: Germline variant and clonal hematopoiesis (CH) variant identification in the TMB algorithm.

    Tumor driver variant identification

    Excluding tumor driver variants helps reduce bias for the bTMB calculations that could be due to targeted enrichment of the panel of genes. Variants with count ≥ 50 in the COSMIC database are treated as tumor driver variants and excluded from the calculation.

    Nonsynonymous variant identification

    Nonsynonymous variants are defined as described in the DRAGEN user guide. Only nonsynonymous variants are used to calculate Nonsynonymous TMB.

    TMB calculation

    The TMB is calculated using the following equations:

    $$TMB = \frac{\text{Eligible Variants}}{\text{Effective Panel Size}}$$

    $$\text{Nonsynonymous } TMB = \frac{\text{Filtered Nonsynonymous Variants}}{\text{Eligible Region Size (Mbp)}}$$

    The eligible variants and effective panel size of the TMB calculation are summarized in the following table:

    Calculation Value
    Description

    Eligible variants (numerator)

    • Variants in the coding region (RefSeq Cds)

    • Variant frequency ≥ 0.2%

    • Coverage ≥ 1000X

    Effective panel size (denominator)

    Total coding region with coverage ≥ 1000X.
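As a worked example of the TMB formulas, the sketch below divides a variant count by a region size in megabases. The function name and the example numbers are hypothetical; in practice both values come from the pipeline outputs.

```python
def tmb_per_megabase(variant_count: int, region_size_mb: float) -> float:
    """Variants per megabase of eligible coding region."""
    if region_size_mb <= 0:
        raise ValueError("region size must be positive")
    return variant_count / region_size_mb


# Hypothetical sample: 12 eligible variants, 9 of them nonsynonymous,
# over 1.2 Mb of coding region with coverage >= 1000X.
print(round(tmb_per_megabase(12, 1.2), 2))  # TMB
print(round(tmb_per_megabase(9, 1.2), 2))   # Nonsynonymous TMB
```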

    TMB Output Files

    The TMB algorithm outputs results in several files:

    1. Combined Variant Output File, {SampleID}_CombinedVariantOutput.tsv

    2. TMB Metrics CSV file, {Sample_ID}.tmb.metrics.csv

    3. TMB Trace TSV file, {Sample_ID}.tmb.trace.tsv

    4. TMB Max Somatic VAF file, {Sample_ID}.tmb.msaf.csv

    1. Combined Variant Output File

    File name: {SampleID}_CombinedVariantOutput.tsv

    The TMB results are output in the section [TMB] and include:

    • The TMB value

    • Coding Region Size in Megabases (a denominator for the TMB formula)

    • Number of Passing Eligible Variants (a numerator for the TMB formula)

    2. TMB Metrics CSV

    File name: {Sample_ID}.tmb.metrics.csv

    The TMB metrics file contains the TMB and Nonsynonymous TMB calculation results and the values used to calculate them for each DNA sample.

    Column
    Description

    Total Input Variant Count

    Total number of variants considered by the algorithm

    Total Input Variant Count in TMB region

    Total number of variants considered by the algorithm in the TMB eligible region

    Filtered Variant Count

    Variants remaining after filtering, see TMB calculation for details

    Filtered Nonsyn Variant Count

    Nonsynonymous variants remaining after filtering, see TMB calculation for details

    Eligible Region (MB)

    Size of the eligible region, in megabases, that meets the minimum coverage threshold.

    TMB

    TMB value for the sample

    3. TMB Trace File

    The TMB trace file provides comprehensive information on how the TMB value is calculated for a given sample. All passing small variants from the small variant filtering step are included in this file. To view eligible variants for TMB calculation, set the filter for the column IncludedInTMBNumerator to TRUE.
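Selecting the eligible variants can also be scripted, as in the sketch below. It assumes the trace file is a plain tab-separated table with a single header row. The column is matched case-insensitively because the exact capitalization of IncludedInTMBNumerator in the file is an assumption. The function name and the example rows are hypothetical.

```python
import csv
import io
from typing import Dict, List


def eligible_tmb_variants(tsv_text: str) -> List[Dict[str, str]]:
    """Return rows of a TMB trace table flagged as included in the numerator."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    target = "includedintmbnumerator"
    column = next(
        (name for name in (reader.fieldnames or []) if name.lower() == target),
        None,
    )
    if column is None:
        raise KeyError("no IncludedInTMBNumerator column found")
    return [row for row in reader if row[column].strip().upper() == "TRUE"]


# Hypothetical two-variant trace table.
example = (
    "Chromosome\tPosition\tRefCall\tAltCall\tVAF\tDepth\tIncludedInTMBNumerator\n"
    "chr1\t100\tA\tT\t0.004\t2500\tTRUE\n"
    "chr2\t200\tG\tC\t0.48\t3100\tFALSE\n"
)
for row in eligible_tmb_variants(example):
    print(row["Chromosome"], row["Position"], row["VAF"])
```

To process a real file, read its contents into a string first (for example with pathlib.Path.read_text) and pass that string to the function.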

    Warning: Variant statuses (somatic, germline, clonal hematopoiesis (CH) variant) are predictions intended for TMB calculation. Use caution if using them separately as their performance has not been tested outside of the TMB algorithm.

    Column
    Description

    Chromosome

    Chromosome

    Position

    Position of variant

    RefCall

    Reference base

    AltCall

    Alternate base

    VAF

    Variant allele frequency

    Depth

    Coverage of position

    4. TMB Max Somatic VAF file

    The file lists the variant with the maximum somatic VAF, using the same file format as the TMB Trace File.

  • SNVs and indels (MNVs excluded)

  • Nonsynonymous and synonymous variants. Only nonsynonymous variants are used for Nonsynonymous TMB.

  • Variants with count ≥ 50 in the COSMIC database are excluded

  • Mutations in ASXL1, DNMT3A, PPM1D, and TET2 are excluded

  • Fragment-size based potential clonal hematopoiesis (CH) variants are excluded

    Nonsyn TMB

    Nonsynonymous TMB value for the sample

    CytoBand

    Cytoband of variant

    GeneName

    Name of gene if applicable. A semicolon delimited list is used for multiple genes.

    VariantType

    Type of the variant: SNV, insertion, deletion, MNV

    CosmicIDs

    COSMIC IDs; multiple IDs are concatenated with “;”

    MaxCosmicCount

    Maximum COSMIC study count

    ClinVarIDs

    Reference ClinVar Variation IDs (RCV IDs)

    ClinVarSignificance

    Variant Classification in ClinVar database

    AlleleCountsGnomadExome

    Variant allele count in gnomAD exome database

    AlleleCountsGnomadGenome

    Variant allele count in gnomAD genome database

    AlleleCounts1000Genomes

    Variant allele count in 1000 Genomes database

    MaxDatabaseAlleleCounts

    Maximum variant allele count over the three databases

    GermlineFilterDatabase

    TRUE if variant was filtered by the database filter

    GermlineFilterProxi

    TRUE if variant was filtered by the proxi filter

    CodingVariant

    TRUE if variant is in the coding region

    Nonsynonymous

    TRUE if variant has any transcript annotations with nonsynonymous consequences

    IncludedinTMBNumerator

    TRUE if variant is used in the TMB calculation

    Status

    Germline_DB or Germline_Proxi if the variant was filtered by the Database or the Proxi filter, respectively. Somatic_Putative_CH if the variant was predicted to be associated with clonal hematopoiesis (CH). Somatic for variants not determined to be germline or CH.

    ProteinChange

    p.HGVS

    CDSChange

    c.HGVS

    Exons

    Exon where the variant is located

    Consequence

    Variant consequence


    Block List

    The following table lists the genes that have associated block listed sites. For the exact location of the block listed site, contact Illumina Technical Support.

    Gene
    Block List Sites
    Gene
    Block List Sites
    Gene
    Block List Sites

    ABL1

    5

    FGFR2

    144

    PAX7

    5

    AKT2

    5

    FGFR3

    1

    PAX8

    275

    AKT3

    20

    FGFR4

    36

    PBRM1

    3

    ALK

    90

    FLCN

    2

    PDCD1

    2

    ANKRD11

    6

    FLI1

    36

    PDGFRA

    5

    ANKRD26

    9

    FLT1

    91

    PDGFRB

    2

    AR

    81

    FLT4

    3

    PDK1

    1

    ARID1A

    40

    FOXA1

    48

    PDPK1

    6

    ARID1B

    87

    FOXL2

    4

    PGR

    5

    ARID2

    1

    FOXO1

    2

    PHF6

    2

    ASXL1

    3

    FOXP1

    3

    PHOX2B

    15

    ASXL2

    5

    FUBP1

    1

    PIK3C2G

    2

    ATM

    2

    GATA4

    6

    PIK3CA

    18

    ATR

    3

    GATA6

    12

    PIK3CB

    42

    ATRX

    17

    GEN1

    1

    PIK3R1

    6

    AURKA

    1

    GID4

    3

    PIK3R2

    2

    AXIN2

    4

    GNAQ

    4

    PLCG2

    3

    AXL

    74

    GNAS

    11

    PLK2

    2

    BBC3

    2

    GPR124

    3

    PMAIP1

    7

    BCL10

    2

    GRM3

    1

    PMS2

    1

    BCL2L11

    16

    H3F3A

    1

    POLE

    3

    BCOR

    2

    H3F3C

    2

    PPARG

    446

    BCORL1

    1

    HGF

    1

    PRDM1

    1

    BCR

    64

    HIST1H1C

    2

    PRKCI

    2

    BIRC3

    1

    HLA-A

    72

    PRKDC

    5

    BLM

    4

    HNF1A

    2

    PTCH1

    13

    BMPR1A

    4

    HNRNPK

    9

    PTEN

    41

    BRAF

    283

    HOXB13

    1

    PTPRS

    14

    BRCA1

    49

    HSP90AA1

    4

    PTPRT

    2

    BRCA2

    21

    ICOSLG

    6

    QKI

    2

    BRD4

    16

    IFNGR1

    2

    RAD21

    1

    CARD11

    4

    iIndel

    91

    RAD50

    5

    CASP8

    2

    INHBA

    4

    RAD51

    18

    CBL

    8

    INPP4A

    1

    RAD51B

    8

    CCND1

    25

    INPP4B

    1

    RAF1

    98

    CCND3

    49

    IRS1

    9

    RANBP2

    12

    CCNE1

    72

    IRS2

    19

    RARA

    2

    CD74

    50

    iSNP

    4

    RASA1

    1

    CDH1

    4

    JAK2

    4

    RB1

    5

    CDK12

    3

    JUN

    7

    RBM10

    13

    CDK4

    46

    KAT6A

    5

    RECQL4

    3

    CDK6

    13

    KDM5A

    7

    REL

    3

    CDK8

    4

    KDM5C

    2

    RET

    3

    CDKN2B

    2

    KDM6A

    2

    RFWD2

    22

    CEBPA

    12

    KDR

    1

    RICTOR

    1

    CHD2

    5

    KIF5B

    7

    ROS1

    287

    CHD4

    12

    KIT

    5

    RPS6KA4

    3

    CHEK1

    75

    KMT2B

    51

    RPS6KB1

    109

    CHEK2

    64

    KMT2C

    118

    RUNX1

    3

    chrY

    93

    KMT2D

    108

    SDHA

    18

    CIC

    2

    KRAS

    44

    SDHB

    3

    CREBBP

    4

    LAMP1

    64

    SDHD

    17

    CSNK1A1

    4

    LATS1

    1

    SETBP1

    7

    CTNNB1

    1

    LATS2

    4

    SETD2

    26

    CUL3

    1

    LoH

    85

    SF3B1

    1

    CUX1

    9

    LRP1B

    3

    SH2B3

    4

    DAXX

    5

    LZTR1

    1

    SH2D1A

    2

    DDR2

    1

    MAGI2

    2

    SLIT2

    1

    DDX41

    1

    MALT1

    4

    SLX4

    2

    DIS3

    2

    MAP2K2

    1

    SMARCA4

    4

    DNAJB1

    6

    MAP2K4

    5

    SMC1A

    1

    DNMT1

    1

    MAP3K1

    8

    SMC3

    8

    DNMT3A

    4

    MAP3K14

    2

    SMO

    2

    DOT1L

    2

    MAP3K4

    10

    SOX10

    7

    E2F3

    70

    MAPK1

    6

    SOX17

    1

    EGFR

    304

    MAPK3

    6

    SOX9

    14

    EIF4E

    12

    MCL1

    1

    SPEN

    4

    EML4

    9

    MDC1

    23

    STAG1

    5

    EP300

    1

    MDM2

    53

    STAG2

    2

    ERBB2

    14

    MDM4

    67

    STAT4

    1

    ERBB3

    62

    MED12

    28

    STAT5A

    1

    ERCC1

    53

    MGA

    6

    STAT5B

    4

    ERCC2

    57

    MLL

    9

    SUFU

    5

    ERCC3

    4

    MLLT3

    18

    SUZ12

    9

    ERCC5

    4

    MRE11A

    5

    TAF1

    9

    ERG

    2

    MSH3

    10

    TBX3

    1

    ESR1

    32

    MSH6

    2

    TCEB1

    1

    ETS1

    45

    MSI

    148

    TCF3

    2

    ETV1

    862

    MST1

    18

    TCF7L2

    6

    ETV4

    502

    MYB

    402

    TERT

    2

    ETV5

    11

    MYC

    78

    TET1

    1

    ETV6

    187

    MYCL1

    28

    TET2

    23

    EWSR1

    364

    MYCN

    69

    TFE3

    299

    EZH2

    2

    MYOD1

    3

    TFRC

    33

    FANCA

    1

    NAB2

    10

    TGFBR1

    6

    FANCD2

    11

    NCOA3

    28

    TGFBR2

    2

    FANCG

    10

    NCOR1

    9

    TMEM127

    5

    FANCI

    1

    NF1

    3

    TMPRSS2

    236

    FANCL

    1

    NKX2-1

    4

    TOP2A

    1

    FAT1

    2

    NOTCH1

    4

    TP53

    22

    FBXW7

    4

    NOTCH3

    7

    TRAF7

    4

    FGF1

    25

    NOTCH4

    9

    TSC1

    4

    FGF10

    17

    NPM1

    5

    TSC2

    1

    FGF14

    15

    NRAS

    29

    U2AF1

    1

    FGF19

    102

    NRG1

    47

    VEGFA

    7

    FGF2

    26

    NTRK1

    134

    WISP3

    2

    FGF23

    38

    NTRK2

    145

    WT1

    10

    FGF3

    60

    NTRK3

    13

    XIAP

    1

    FGF4

    25

    NUTM1

    134

    XPO1

    2

    FGF5

    14

    PAK1

    68

    XRCC2

    1

    FGF6

    9

    PAK3

    8

    YAP1

    1

    FGF7

    9

    PALB2

    1

    ZBTB7A

    11

    FGF8

    30

    PARK2

    23

    ZFHX3

    56

    FGF9

    21

    PARP1

    2

    ZNF703

    7

    FGFR1

    26

    PAX3

    156

    ZRSR2

    2