Run on Multiple DRAGEN Servers
DRAGEN TruSight Oncology 500 Analysis Software can run subsets of samples on different DRAGEN servers to decrease processing time. This is done with a three-stage process called scatter/gather, which consists of demultiplexing, analysis, and result gathering.
The first stage is demultiplexing. Demultiplexing runs once on the entire run folder, generates FASTQ files for each sample in the run, and then separates the sample files into their respective folders. When demultiplexing is complete, note the output directory that contains the sample directories holding the FASTQ files.
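As a quick check after demultiplexing, the sample directories that actually hold FASTQ files can be listed with a small helper. This is a minimal sketch, not part of the software: the function name `list_fastq_samples` is hypothetical, and it assumes that each sample has its own subdirectory and that FASTQ files end in `.fastq.gz`.

```shell
# Hypothetical helper: print the name of every immediate subdirectory of the
# given FASTQ output folder that contains at least one *.fastq.gz file.
# Assumes one subdirectory per sample and gzip-compressed FASTQ files.
list_fastq_samples() {
  fastq_root=$1
  for dir in "$fastq_root"/*/; do
    # If the glob matched nothing, it stays literal and is skipped here.
    [ -d "$dir" ] || continue
    for f in "$dir"*.fastq.gz; do
      if [ -e "$f" ]; then
        basename "$dir"
        break   # one FASTQ file is enough to report this sample
      fi
    done
  done
}
```

For example, `list_fastq_samples /staging/{DemultiplexAnalysisFolderName}/Logs_Intermediates/FastqGeneration` would print one sample directory name per line, which can then be compared with the sample sheet.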
The process for scattering the analysis on multiple DRAGEN servers is as follows.
Determine how many DRAGEN servers are available to run.
Run demultiplexing on a single DRAGEN server.
To analyze runs sequenced with the NovaSeq 6000 XP workflow on multiple DRAGEN servers, modify the sample sheet to include a subset of the lanes. For example, for an S2 flow cell, create two modified sample sheets, one containing the samples from lane 1 and the other containing the samples from lane 2. This way, only the sample sheet is modified and no FASTQ files are copied between servers. This strategy uses the start-from-run-folder commands without the `--demultiplexOnly` option. The entire run folder must be copied to each analysis server because demultiplexing is performed once per server.
Transfer the FASTQ folder output (`Logs_Intermediates/FastqGeneration`) from the original DRAGEN server to the additional servers.
Run the analysis software with the `--fastqFolder` option on both the original and the additional DRAGEN servers.
Option 1: Copy the original `SampleSheet.csv` to each server. Then pass the Bash script on each DRAGEN server the subset of sample or pair IDs that it should run.
Option 2: Copy `SampleSheet.csv` to each DRAGEN server and modify each copy so that it contains only the samples/pairs to run on that server.
The software verifies that all samples in the sample sheet are contained within the FASTQ folders unless the `--sampleOrPairIDs` command-line option is present in the analysis launch. Failure to account for these checks results in an error.
Copy the results from demultiplexing and from each analysis run onto a single server, and then generate the final `/Results` directory, which contains the aggregated results. To do so, enter the `--gather` command followed by the output directories of the demultiplexing step and of each individual analysis run.
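When Option 1 is used, the sample or pair IDs have to be divided among the available servers. The following is a minimal sketch of one way to do that, assuming a simple round-robin assignment. The helper name `split_ids` is hypothetical, and round-robin is only an illustrative choice; any grouping works as long as each ID is assigned to exactly one server.

```shell
# Hypothetical helper: distribute IDs over N servers in round-robin order.
# Usage: split_ids N ID [ID...]
# Prints N lines; line k is the comma-separated ID list for server k.
# If there are fewer IDs than servers, the surplus lines are empty.
split_ids() {
  servers=$1
  shift
  printf '%s\n' "$@" | awk -v n="$servers" '
    {
      i = (NR - 1) % n                       # server index for this ID
      if (i in list) list[i] = list[i] "," $0
      else           list[i] = $0
    }
    END { for (i = 0; i < n; i++) print list[i] }
  '
}
```

Each printed line can be supplied as the value of `--sampleOrPairIDs` on the corresponding server.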
Commands for Multinode Analysis
Step | Command |
---|---|
Demultiplexing | `DRAGEN_TSO500.sh --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources --hashtableFolder /staging/illumina/DRAGEN_TSO500/ref_hashtable --runFolder /staging/{RunFolderName} --analysisFolder /staging/{DemultiplexAnalysisFolderName} --demultiplexOnly --sampleSheet /staging/illumina/{SampleSheetName}` |
Analysis (one server) | `DRAGEN_TSO500.sh --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources --hashtableFolder /staging/illumina/DRAGEN_TSO500/ref_hashtable --fastqFolder /staging/{DemultiplexAnalysisFolderName}/Logs_Intermediates/FastqGeneration/ --analysisFolder /staging/{Node1AnalysisFolderName} --sampleSheet /staging/illumina/{SampleSheetName} --sampleOrPairIDs Pair_1,Pair_2` |
Analysis (additional servers) | `DRAGEN_TSO500.sh --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources --hashtableFolder /staging/illumina/DRAGEN_TSO500/ref_hashtable --fastqFolder /staging/{DemultiplexAnalysisFolderName}/Logs_Intermediates/FastqGeneration/ --analysisFolder /staging/{Node1AnalysisFolderName} --sampleSheet /staging/illumina/{SampleSheetName} --sampleOrPairIDs Pair_3` |
Gather | `DRAGEN_TSO500.sh --analysisFolder /Gathered_Results --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources --runFolder /staging/{RunFolderName}/ --sampleSheet /staging/illumina/{SampleSheetName} --gather /Demultiplex_Output /Node1_Output /Node2_Output` |
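Because the per-server analysis commands differ only in a few arguments, it can help to generate them and review them before launching anything. The sketch below only prints a command and does not execute it. The function name `print_analysis_cmd` is hypothetical, and the resources and hash table paths are simply the ones shown in the table above; adjust them if your installation differs.

```shell
# Hypothetical helper: print (do not run) the analysis command for one server.
# Arguments: 1 = FASTQ folder, 2 = analysis output folder,
#            3 = sample sheet, 4 = comma-separated sample or pair IDs.
# Paths are printed unquoted, so this sketch assumes they contain no spaces.
print_analysis_cmd() {
  printf '%s' "DRAGEN_TSO500.sh"
  # printf reuses the format for each option/value pair that follows.
  printf ' %s %s' \
    --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources \
    --hashtableFolder /staging/illumina/DRAGEN_TSO500/ref_hashtable \
    --fastqFolder "$1" \
    --analysisFolder "$2" \
    --sampleSheet "$3" \
    --sampleOrPairIDs "$4"
  printf '\n'
}
```

Printing first makes it easy to confirm that each server receives a distinct analysis folder and ID list before any long-running job starts.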