Analysis Launch on ICA

Methods for Launching Analysis

Illumina Connected Analytics (ICA) supports the following methods for launching DRAGEN TruSight Oncology 500 Analysis Software.

  • Auto-launch: Stream run data directly from the instrument to ICA via a specially configured sample sheet and automatically begin DRAGEN TSO 500 analysis.

  • Manual launch: Initiate DRAGEN TSO 500 analysis on ICA using the run files and sample sheet files in the project.

Note: For data generated by NextSeq 1000/2000 and NextSeq X, only the manual option for launching analysis on ICA is available, and the analysis can only start from FASTQ files.

For more information about using ICA or BaseSpace Sequence Hub, refer to the following support pages on the Illumina support site.

  • Illumina Connected Analytics support site page
  • BaseSpace Sequence Hub support site page

Launching Analysis

Command-Line Options

You can use the following command-line options with DRAGEN TruSight Oncology 500 Analysis Software.

To learn more about the input requirements, use the --help command-line option.

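| Option | Required | Description |
| --- | --- | --- |
| --help | No | Displays a help screen with available command-line options. |
| --analysisFolder | No | Path to the local analysis folder. The default location is /staging/DRAGEN_TSO500_Analysis_{timestamp}. |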

Note:

  • Use full paths when specifying the file paths in the command line.

  • Avoid special characters such as &, *, #, and spaces.

  • When starting from BCL files, only the run folder needs to be specified. The immediate parent directory containing the BCL files does not need to be specified.

When running the analysis software over SSH, Illumina recommends using a terminal multiplexer such as screen or tmux to prevent unexpected termination of the analysis.

  1. Wait for any running DRAGEN TruSight Oncology 500 Analysis Software containers to complete before launching a new analysis. Run the following command to generate a list of running containers:

     docker ps

  2. Select one of the following options:

  • Start from BCL files in the run folder with the sample sheet included in the run folder.

    DRAGEN_TSO500.sh \
      --runFolder /staging/{RunFolderName} \
      --analysisFolder /staging/{AnalysisFolderName}

  • Start from BCL files in the run folder with the sample sheet located in a folder other than the run folder.

    DRAGEN_TSO500.sh \
      --runFolder /staging/{RunFolderName} \
      --analysisFolder /staging/{AnalysisFolderName} \
      --sampleSheet /staging/{SampleSheetName}.csv

Starting from BCL Files

Note: For data generated by NextSeq 1000/2000 and NextSeq X, the analysis can only be started from FASTQ files, not from BCL files.

If starting from BCL (*.bcl) files, DRAGEN TruSight Oncology 500 Analysis Software requires the run folder to contain certain files and folders. These inputs are required for Docker.

The run folder contains data from the sequencing run. Make sure that the folder contains the following files and folders:

Folder/File
Description

Starting from FASTQ Files

The following inputs are required to run the DRAGEN TruSight Oncology 500 Analysis Software using FASTQ (*.fastq) files. The requirements apply to Docker.

  • Full path to an existing FASTQ folder.

  • The FASTQ folder structure conforms to the folder structure in FASTQ File Organization.

  • The sample sheet is in the FASTQ folder path, or you can set the path to the sample sheet with the --sampleSheet override command-line option.

Make sure there is sufficient disk space for the analysis to complete. Refer to the --help command-line option for disk space requirements.

Note: Use BCL Convert to produce FASTQ files for DRAGEN TruSight Oncology 500 Analysis Software. Using bcl2fastq does not produce the same results and is discouraged.

Note: Make sure that BCL Convert is set to write UMI sequences to the read headers in the FASTQ files.

FASTQ File Organization

Store FASTQ files in individual subfolders that correspond to a specific Sample_ID, and keep file pairs together in the same folder. Alternatively, store all FASTQ files together in a single flat folder.

The DRAGEN TruSight Oncology 500 Analysis Software requires separate FASTQ files per sample. Do not merge FASTQ files.

The instrument generates two FASTQ files per flow cell lane, so a four-lane flow cell produces eight FASTQ files per sample. For example:

Sample1_S1_L001_R1_001.fastq.gz

  • Sample1 represents the Sample ID.

  • The S in S1 means sample, and the 1 in S1 is based on the order of samples in the sample sheet, so S1 is the first sample.

  • L001 represents the flow cell lane number.
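The naming convention above can be checked mechanically before launching an analysis. The following is a minimal sketch, not part of the DRAGEN software: `parse_fastq_name` is a hypothetical helper, and it assumes that every file name ends in `_001.fastq.gz` as in the example above.

```shell
# Hypothetical helper: split a FASTQ file name of the form
# <SampleID>_S<n>_L<lane>_R<read>_001.fastq.gz into its components.
parse_fastq_name() {
  fq_name=$(basename "$1")
  fq_pattern='^.+_S[0-9]+_L[0-9]{3}_R[12]_001\.fastq\.gz$'
  # Reject anything that does not follow the expected layout.
  if ! printf '%s\n' "$fq_name" | grep -Eq "$fq_pattern"; then
    echo "unrecognized FASTQ file name: $fq_name" >&2
    return 1
  fi
  # Capture sample ID, sample number, lane (leading zeros dropped), and read.
  printf '%s\n' "$fq_name" | sed -E \
    's/^(.+)_S([0-9]+)_L0*([0-9]+)_R([12])_001\.fastq\.gz$/sample_id=\1 sample_number=\2 lane=\3 read=\4/'
}

parse_fastq_name Sample1_S1_L001_R1_001.fastq.gz
# prints: sample_id=Sample1 sample_number=1 lane=1 read=1
```

A check like this can be run over a FASTQ folder to spot renamed or merged files early.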

Run on Multiple DRAGEN Servers

DRAGEN TruSight Oncology 500 Analysis Software can run subsets of samples on different DRAGEN servers to decrease overall processing time. This is done with a three-stage process called scatter/gather, which consists of demultiplexing, analysis, and result gathering.

The first stage is demultiplexing. Demultiplexing runs once on the entire run folder, generates FASTQ files for each sample in the run, and then separates sample files into respective folders. Once complete, note the output directory containing the sample directories holding the FASTQ files.

The process for scattering the analysis on multiple DRAGEN servers is as follows:

  1. Determine how many DRAGEN servers are available to run.

  2. Run demultiplexing on a single DRAGEN server.

Note: Moving or modifying files during an analysis may cause the analysis to fail or provide incorrect results.

Note: To analyze runs from the NovaSeq 6000 XP workflow on multiple DRAGEN servers, modify the sample sheet to include a subset of the lanes. For example, for an S2 flow cell, create two modified sample sheets, one containing the samples from lane 1 and the other containing the samples from lane 2. With this approach, only the sample sheet is modified instead of copying files between servers. This strategy uses the start from run folder commands without the --demultiplexOnly option. The entire run folder must be copied to each analysis server because demultiplexing is performed once per server.

  3. Transfer the FASTQ folder output from the original DRAGEN server to the additional servers.

    1. The FASTQ folder is located in Logs_Intermediates/FastqGeneration within the demultiplexing analysis folder.

  4. Run the analysis software using the --fastqFolder option on both the original and additional DRAGEN servers.

Commands for Multinode Analysis

Step
Command

Analysis Launch on Standalone DRAGEN Server

Start the DRAGEN TruSight Oncology 500 Analysis Software with the DRAGEN_TSO500.sh Bash script. The script is installed in the /usr/local/bin directory. The Bash script is executed on the command line and runs the software with Docker (or Apptainer if specified).

For arguments, refer to Command-Line Options. You can start from BCL files or from the FASTQ folder produced by BCL Convert. The following requirements apply to both methods:

  • Path to the sequencing run or FASTQ folder. Copy the run or FASTQ folder to the staging folder on the DRAGEN server with the following recommended organization: /staging/runs/{RunID}. You can copy the run folder onto the DRAGEN server using Linux commands such as rsync.

  • Option 1: Copy the original SampleSheet.csv to each server. Then provide a subsetted list to the Bash script on each DRAGEN server with the intended samples/pairs to run.

  • Option 2: Copy the SampleSheet.csv to each DRAGEN server and modify it to contain only the samples/pairs to run. The software verifies that all samples in the sample sheet are contained within the FASTQ folders unless the --sampleOrPairIDs command-line option is present in the analysis launch. Failure to account for these checks results in an error.

  • Copy the results from demultiplexing and each analysis run onto a single server, and then generate the final /Results directory, which contains the aggregated results. Enter the --gather option followed by the output directories of the demultiplexing step and each individual analysis run.

  • Demultiplexing

    DRAGEN_TSO500.sh \
      --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources \
      --hashtableFolder /staging/illumina/DRAGEN_TSO500/ref_hashtable \
      --runFolder /staging/{RunFolderName} \
      --analysisFolder /staging/{DemultiplexAnalysisFolderName} \
      --demultiplexOnly \
      --sampleSheet /staging/illumina/{SampleSheetName}

  • Analysis (one server)

    DRAGEN_TSO500.sh \
      --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources \
      --hashtableFolder /staging/illumina/DRAGEN_TSO500/ref_hashtable \
      --fastqFolder /staging/{DemultiplexAnalysisFolderName}/Logs_Intermediates/FastqGeneration/ \
      --analysisFolder /staging/{Node1AnalysisFolderName} \
      --sampleSheet /staging/illumina/{SampleSheetName} \
      --sampleOrPairIDs Pair_1,Pair_2

  • Analysis (additional servers)

    DRAGEN_TSO500.sh \
      --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources \
      --hashtableFolder /staging/illumina/DRAGEN_TSO500/ref_hashtable \
      --fastqFolder /staging/{DemultiplexAnalysisFolderName}/Logs_Intermediates/FastqGeneration/ \
      --analysisFolder /staging/{Node1AnalysisFolderName} \
      --sampleSheet /staging/illumina/{SampleSheetName} \
      --sampleOrPairIDs Pair_3

  • Gather

    DRAGEN_TSO500.sh \
      --analysisFolder /Gathered_Results \
      --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources \
      --runFolder /staging/{RunFolderName}/ \
      --sampleSheet /staging/illumina/{SampleSheetName} \
      --gather /Demultiplex_Output /Node1_Output /Node2_Output

  • Start from BCL files in the run folder with a different sample sheet and demultiplexing only.

    DRAGEN_TSO500.sh \
      --runFolder /staging/{RunFolderName} \
      --analysisFolder /staging/{AnalysisFolderName} \
      --sampleSheet /staging/{SampleSheetName}.csv \
      --demultiplexOnly

  • Start from FASTQ with the sample sheet included in the FASTQ folder and with different resources and hash table folders.

    DRAGEN_TSO500.sh \
      --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources \
      --hashtableFolder /staging/illumina/DRAGEN_TSO500/ref_hashtable \
      --fastqFolder /staging/{FastqFolderName} \
      --analysisFolder /staging/{AnalysisFolderName}

  • Start from the FASTQ folder with the sample sheet included in the FASTQ folder and a subset of samples or pairs.

    DRAGEN_TSO500.sh \
      --fastqFolder /staging/{FastqFolderName} \
      --analysisFolder /staging/{AnalysisFolderName} \
      --sampleOrPairIDs "Pair_1,Pair_2"

    | Folder/File | Description |
    | --- | --- |
    | RunInfo.xml file | Run information. |
    | RunParameters.xml file | Run parameters. |
    | SampleSheet.csv file | Sample information. If you want to use a sample sheet that is not in the run folder or a sample sheet named something other than SampleSheet.csv, provide the full path. |

    In FASTQ file names, the R in R1 means Read, so R1 refers to Read 1.

    | Option | Required | Description |
    | --- | --- | --- |
    | --analysisFolder | No | Path to the local analysis folder. The default location is /staging/DRAGEN_TSO500_Analysis_{timestamp}. If not using the default location, provide the full path to the local analysis folder. The folder must have sufficient space and must be on an NVMe SSD drive, for example the /staging directory on the DRAGEN server. Refer to the table in Storage Requirements for minimum disk space requirements. |
    | --resourcesFolder | No | Path to the resource folder location. The default location is /staging/illumina/DRAGEN_TSO500/resources. If not using the default location, enter the full path to the resource folder. |
    | --runFolder | Yes | Required when --fastqFolder is not specified. Provide the full path to the local run folder. |
    | --fastqFolder | Yes | Required when --runFolder is not specified. Provide the full path to the local FASTQ folder. Analysis starts at this location. |
    | --user | No | Optional for Docker. Specify the user ID to be used within the Docker container. |
    | --version | No | Displays the version of the software. |
    | --sampleSheet | No | Provide the full path, including file name, if the sample sheet is not provided as SampleSheet.csv in the run folder. |
    | --sampleOrPairIDs | No | Provide the comma-delimited sample or pair IDs, with no spaces, that should be processed on this node. For example, Pair_1,Pair_2,Sample_1. |
    | --demultiplexOnly | No | Demultiplex to generate FASTQ files only, without additional analysis. |
    | --gather | No | Follow this option with the directories whose results should be gathered into a single Results folder. |
    | --hashtableFolder | No | Defaults to the DRAGEN hash table location created upon install. If not using the default location, enter the hash table location. |
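Because --sampleOrPairIDs expects a comma-delimited list with no spaces, a wrapper script could validate the value before launching. The following is a minimal sketch under that assumption: `valid_id_list` is a hypothetical helper, and the rule it enforces (no whitespace, no empty entries) is inferred from the description above rather than taken from the software itself.

```shell
# Hypothetical helper: succeed only if the argument is a non-empty,
# comma-delimited list with no whitespace and no empty entries.
valid_id_list() {
  printf '%s\n' "$1" | grep -Eq '^[^,[:space:]]+(,[^,[:space:]]+)*$'
}

if valid_id_list "Pair_1,Pair_2,Sample_1"; then
  echo "ID list is well formed"
fi
```

A list such as "Pair_1, Pair_2" (with a space after the comma) fails this check.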

    | Folder/File | Description |
    | --- | --- |
    | Config folder | Configuration files. |
    | Data folder | *.bcl files. |
    | Images folder | [Optional] Raw sequencing image files. |
    | Interop folder | Interop metric files. |
    | Logs folder | [Optional] Sequencing system log files. |
    | RTALogs folder | Real-Time Analysis (RTA) log files. |
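The run folder contents listed for starting from BCL files can be verified with a short pre-flight check. This is a sketch only: `missing_run_folder_items` is a hypothetical helper, the list of required entries is taken from the items above that are not marked optional, and names are matched case-insensitively as an assumption about how the folders are capitalized on disk. SampleSheet.csv is left out because it can be supplied through --sampleSheet.

```shell
# Hypothetical helper: print each expected entry that is missing from the
# top level of the run folder. Prints nothing when all entries are present.
missing_run_folder_items() {
  run_dir=$1
  for item in RunInfo.xml RunParameters.xml Config Data Interop RTALogs; do
    # -iname matches regardless of case (an assumption, see the lead-in).
    found=$(find "$run_dir" -mindepth 1 -maxdepth 1 -iname "$item" | head -n 1)
    if [ -z "$found" ]; then
      echo "$item"
    fi
  done
}

# Example (path is a placeholder):
#   missing_run_folder_items /staging/runs/{RunID}
```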

  • The sample sheet within the run folder is used unless otherwise specified through the command line.
  • Run folder must be intact. Refer to Starting from BCL Files for input requirements.

  • If the analysis output folder path is different from the default, provide the analysis output folder path. Refer to Command-Line Options.

    Note: Before running the analysis, confirm that the output directory the software writes to is empty and does not include results of previous analyses.

    Storage Requirements

    For optimal performance, run analysis on data stored locally on the DRAGEN server. Analysis of data stored on NAS can take longer and performance can be less reliable.

    The DRAGEN server provides an NVMe SSD in the /staging directory to use as the software output directory. Network-attached storage is required for long-term storage.

    When running the DRAGEN TruSight Oncology 500 Analysis Software, use the default settings or set the --analysisFolder command-line option to a directory in /staging to make sure the DRAGEN server processes read and write data on the NVMe SSD.

    Before beginning analysis, develop a strategy to copy data from the DRAGEN server to network-attached storage. Delete output data on the DRAGEN server as soon as possible.

    The following table lists the run folder output size, analysis output size, and minimum disk space for each sequencing system per 101 bp:

    | Sequencing System | Run Folder Output (Gb) | Analysis Output (Gb) | Minimum Disk Space (Gb) |
    | --- | --- | --- | --- |
    | NextSeq 500/550/550Dx (RUO) HO flow cell | 32-55 | 82-85 | 150 |
    | NovaSeq 6000/6000Dx (RUO) SP Flow Cell | 85-100 | 250-374 | 300 |
    | NovaSeq 6000/6000Dx (RUO) S1 Flow Cell | 164-200 | 360-665 | 800 |

    When launching the analysis, the software checks that the minimum disk space required is available. If the minimum disk space is not available, the software shows an error message and prevents analysis from starting. If disk space is exhausted during a run, the run shows an error and stops analyzing.
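The software performs its own disk space check at launch, but the same condition can be tested earlier, for example before copying a run folder to the server. The sketch below is an independent illustration and does not reproduce the software's internal check: `has_min_space` is a hypothetical helper, and it interprets the minimum disk space values as gigabytes of 1024^3 bytes, which is an assumption.

```shell
# Hypothetical helper: has_min_space DIR MIN_GB
# Succeeds if the filesystem holding DIR has at least MIN_GB available.
has_min_space() {
  space_dir=$1
  min_gb=$2
  # df -P gives one line per filesystem; -k reports 1024-byte blocks.
  avail_kb=$(df -Pk "$space_dir" | awk 'NR == 2 { print $4 }')
  avail_gb=$((avail_kb / 1024 / 1024))
  if [ "$avail_gb" -lt "$min_gb" ]; then
    echo "Only ${avail_gb} GB available in $space_dir; ${min_gb} GB required" >&2
    return 1
  fi
}

# Example: check /staging against the minimum listed for the flow cell in use.
#   has_min_space /staging 300
```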

    Note: Moving or modifying files during an analysis may cause the analysis to fail or provide incorrect results.

    Auto-Launch of DRAGEN TSO 500 Analysis on ICA

    Auto-launch Prerequisites and Workflow

    The BaseSpace Sequence Hub setting for run monitoring and storage must be selected on the instrument to use DRAGEN TSO 500 analysis auto-launch. For information on preparing your instrument for DRAGEN TSO 500 auto-launch, refer to the documentation for your instrument.

    1. Use the BaseSpace Sequence Hub Run Planning tool or the sample sheet templates provided on the support page to create and export a sample sheet.

      1. If the BaseSpace Run Planning tool is not available in your region, use the sample sheet template.

    2. Import the sample sheet to the instrument and start the sequencing run. Refer to ICA Auto-launch Sample Sheet Requirements for sample sheet guidance.

      1. Data is uploaded to BaseSpace Sequence Hub and then pushed to ICA. You can monitor the run in BaseSpace Sequence Hub.

      2. Analysis launches automatically in ICA when sequencing and the upload are complete. You can monitor the status of the analysis in BaseSpace Sequence Hub or ICA.

    3. View the analysis output results in either BaseSpace Sequence Hub or ICA.

    Note: To avoid invalid sample sheet configurations, Illumina recommends using the BaseSpace Run Planning tool to generate sample sheets. Using an invalid sample sheet can result in failed runs and analyses.

    Note: For data generated by NextSeq 1000/2000 and NextSeq X, only the manual option for launching analysis on ICA is available, and the analysis can only start from FASTQ files.

    BaseSpace Sequence Hub Requirements for ICA Auto-Launch

    The BaseSpace Run Planning tool is a multi-step workflow that generates a manual launch or auto-launch capable sample sheet for export. It requires the following additional settings:

    • Access to BaseSpace Sequence Hub.

    • ICA Run Storage is enabled under BaseSpace Sequence Hub settings.

    Refer to the BaseSpace Sequence Hub support site page for information on setting up a BaseSpace Sequence Hub project.

    Requeue Analysis

    You can requeue analysis of a run via the run's Summary page in BaseSpace Sequence Hub.

    Refer to the BaseSpace Sequence Hub support site page for more information on requeuing an analysis.

    Minimum Storage Requirements on ICA

    Sequencing System
    Minimum Disk Space (Gb)

    Refer to the Software Registration page for information on how to manage accounts and subscriptions.

    Guided Examples

    Please review these guided examples of using DRAGEN TSO 500 Analysis Software with auto-launch on ICA:

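    Run and analysis output sizes for a standalone DRAGEN server (continued from Storage Requirements):

    | Sequencing System | Run Folder Output (Gb) | Analysis Output (Gb) | Minimum Disk Space (Gb) |
    | --- | --- | --- | --- |
    | NovaSeq 6000/6000Dx (RUO) S1 Flow Cell | 164-200 | 360-665 | 800 |
    | NovaSeq 6000/6000Dx (RUO) S2 Flow Cell | 290-460 | 890-1600 | 1500 |
    | NovaSeq 6000/6000Dx (RUO) S4 Flow Cell | 800-1200 | 2700-4100 | 3000 |
    | NovaSeq X 1.5B | 213 | 352 | 800 |
    | NovaSeq X 10B | 1100 | 1800 | 3000 |
    | NovaSeq X 25B | 1800 | 3300 | 4000 |
    | NextSeq 1000/2000 | 41 | 107 | 150 |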

    If necessary, you can requeue the analysis via BaseSpace Sequence Hub.

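    Minimum storage requirements on ICA:

    | Sequencing System | Minimum Disk Space (Gb) |
    | --- | --- |
    | NextSeq 500/550/550Dx (RUO) HO flow cell | 350 |
    | NovaSeq 6000/6000Dx (RUO) SP Flow Cell | 500 |
    | NovaSeq 6000/6000Dx (RUO) S1 Flow Cell | 1100 |
    | NovaSeq 6000/6000Dx (RUO) S2 Flow Cell | 2500 |
    | NovaSeq 6000/6000Dx (RUO) S4 Flow Cell | 4300 |
    | NovaSeq X 1.5B | 2000 |
    | NovaSeq X 10B | 4300 |
    | NovaSeq X 25B | 8400 |
    | NextSeq 1000/2000 | 350 |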

    Related links:

    • ICA Auto-launch Sample Sheet Requirements
    • BaseSpace Sequence Hub support site page
    • Software Registration page
    • NovaSeq 6000Dx: TSO 500 Auto-launch Analysis in Cloud
    • NextSeq 500/550Dx: TSO 500 and Connected Insights Auto-launch Analysis in Cloud

    Manual Launch of DRAGEN TSO 500 Analysis on ICA

    How to Launch Analysis

    1. Create a Project: The Project can be specific to the DRAGEN TruSight Oncology 500 pipeline, or it can contain multiple Pipelines and/or Tools. For information on creating Projects, refer to the Projects section in Illumina Connected Analytics help.

    Note: ICA standard storage is used by default as soon as the Project is saved. To connect a different storage source, set it up before creating your Project. For details and options, refer to the Storage section in Illumina Connected Analytics help.

    2. Edit Project and Add Bundle: Edit the Project and add the bundle titled "DRAGEN TSO 500 v2.5.2 (XX)", where XX is a two-letter code designating the region from which you are launching the analysis. Adding the Bundle automatically adds the pipeline and associated resource files and datasets to the Project. For information on Bundles, refer to the Bundles section in Illumina Connected Analytics help.

    Note: After adding the Bundle to the Project, an example dataset becomes available in the Demo_Data folder for the Project.

    3. Upload the sequencing data: For information on viewing and uploading data, refer to the Data section in Illumina Connected Analytics help.

    4. Start Analysis: In the Project, navigate to Pipelines, select the desired TSO 500 Pipeline, and then select "Start New Analysis". Set up the new analysis by configuring the parameters listed in Analysis Parameters on ICA. When the required parameters are configured, start the analysis.

    5. Download Results: After analysis is complete, navigate to the results in the configured output location.

    Refer to the Illumina Support Shorts for guidance on how to set up and run DRAGEN TSO 500 RUO analysis on ICA.

    Analysis Parameters on ICA

    To launch an analysis via the ICA user interface, configure a DRAGEN TSO 500 pipeline analysis with the following parameters.

    Parameter Name
    Description

    Known Limitations

    • FASTQ Folder Naming Requirements

      • When specifying input FASTQ folder names, avoid folder names that consist entirely of numeric characters with a leading zero, because they cause the software to fail with an error.

      • Unsupported naming pattern: folder names such as '01234'.

    For information about using pipelines, refer to the Illumina Connected Analytics support site page.

    | Parameter Name | Description |
    | --- | --- |
    | Input Folder | The run folder or FASTQ folder that contains files to analyze. |
    | Starts from FASTQ | True for analysis performed on files in the FASTQ folder. False for analysis performed on files in the run folder. |
    | Sample or Pair IDs | Optional subset of Sample IDs or Pair IDs to analyze. |
    | Storage Size | The storage size to allocate for the analysis. The default and recommended value is Large. |

    FASTQ folder naming examples:

    • Unsupported naming pattern:

      • '01234' (numeric-only with leading zero)

    • Supported naming patterns:

      • '12340' (numeric without leading zero)

      • 'sample01' (alphanumeric)

      • 'A1234' (alphanumeric)

      • 'test_sample' (alphanumeric with underscore)
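The FASTQ folder naming limitation can be screened for before submitting an analysis. The following is a minimal sketch, not an official check: `is_supported_fastq_folder_name` is a hypothetical helper that rejects names made up only of digits and starting with zero, and it additionally treats an empty name as unsupported, which is an assumption.

```shell
# Hypothetical helper: succeed for supported FASTQ folder names, fail for
# names that are numeric-only with a leading zero (for example 01234).
is_supported_fastq_folder_name() {
  case $1 in
    '')        return 1 ;;  # empty name (assumed unsupported)
    *[!0-9]*)  return 0 ;;  # contains at least one non-digit character
    0*)        return 1 ;;  # digits only, with a leading zero
    *)         return 0 ;;  # digits only, no leading zero
  esac
}

for folder in 01234 12340 sample01 A1234 test_sample; do
  if is_supported_fastq_folder_name "$folder"; then
    echo "$folder: supported"
  else
    echo "$folder: unsupported"
  fi
done
```

With the five example names from the lists above, only 01234 is reported as unsupported.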

    | Parameter Name | Description |
    | --- | --- |
    | User Reference | The analysis run name. |
    | User Tags | Text labels to help index the analysis. |
    | Notify me when task is completed | Option to receive an email notification when analysis is complete. |
    | Output Folder | The path to the analysis output folder. The default path is the project output folder. |
    | Entitlement Bundle | Automatically populated from the project details. |
    | Sample Sheet | Select a sample sheet in CSV format for the analysis. Note: Sample sheet selection is optional when starting from a run folder and required when submitting a FASTQ folder. |

    Related links:

    • Illumina Connected Analytics help
    • Illumina Connected Analytics support site page