This recipe is for processing whole transcriptome sequencing (WTS) data with the DRAGEN RNA workflows.
Example Command Line
For most scenarios, simply creating the union of the command line options from the single caller scenarios will work.
Configure the INPUT options
Configure the OUTPUT options
Configure the RNA MAP/ALIGN options
Configure the QUANT options
Configure the SPLICE options
Configure the FUSION options
Configure the VARIANT options
The following are partial templates that can be used as starting points. Adjust them accordingly for your specific use case.
#!/bin/bash
set -euo pipefail

# Because of 'set -u', every variable referenced below (for example FASTQ_LIST
# or TARGET_BED) must be defined. Delete the input definitions you do not use.

# Path to DRAGEN hashtable
DRAGEN_HASH_TABLE=<REF_DIR>
# Path to output directory for the DRAGEN run
OUTPUT=<OUT_DIR>
# File prefix for DRAGEN output files
PREFIX=<OUT_PREFIX>

# Define the input sources: select fastq list, fastq, bam, or cram.
INPUT_FASTQ_LIST="--fastq-list $FASTQ_LIST \
  --fastq-list-sample-id $FASTQ_LIST_SAMPLE_ID"

INPUT_FASTQ="--fastq-file1 $FASTQ1 \
  --fastq-file2 $FASTQ2 \
  --RGSM $RGSM \
  --RGID $RGID"

# You could use the tumor fastq options to provide the FASTQ files.
INPUT_TUMOR_FASTQ="--tumor-fastq1 $FASTQ1 \
  --tumor-fastq2 $FASTQ2 \
  --RGSM $RGSM \
  --RGID $RGID"

INPUT_BAM="--bam-input $BAM"

INPUT_CRAM="--cram-input $CRAM"

# Select the input source. This example uses INPUT_FASTQ_LIST.
INPUT_OPTIONS="--ref-dir $DRAGEN_HASH_TABLE \
  $INPUT_FASTQ_LIST"

OUTPUT_OPTIONS="--output-directory $OUTPUT \
  --output-file-prefix $PREFIX"

# RNA aligner requires an annotation file in GTF or GFF3 format.
GTF=<GTF_PATH>

# RNA pipeline requires map-align to be true.
RNA_MAP_OPTIONS="--enable-rna true \
  --enable-map-align true \
  --annotation-file $GTF"

# You should set the library type according to the read orientations.
# The options are IU, ISR, ISF, U, SR, or SF. Or set it to A to automatically
# detect the correct read orientation.
QUANT_OPTIONS="--enable-rna-quantification true \
  --rna-library-type IU \
  --rna-quantification-gc-bias true"

SPLICE_OPTIONS="--enable-rna-splice-variant true"

FUSION_OPTIONS="--enable-rna-gene-fusion true"

# To call variants, you need to set a bed file with target regions to call.
# This bed could contain all exons.
VARIANT_OPTIONS="--enable-variant-caller true \
  --vc-target-bed $TARGET_BED"

# Construct final command line
CMD="dragen \
  $INPUT_OPTIONS \
  $OUTPUT_OPTIONS \
  $RNA_MAP_OPTIONS \
  $QUANT_OPTIONS \
  $SPLICE_OPTIONS \
  $FUSION_OPTIONS \
  $VARIANT_OPTIONS"

# Execute
echo "$CMD"
bash -c "$CMD"
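As an alternative to concatenating option strings, the same union of options can be assembled with bash arrays, which keeps paths containing spaces intact. The following is a minimal sketch, not part of the original recipe: all paths and the sample ID are made-up placeholder values, and the `DRY_RUN` variable is a hypothetical switch introduced here so the command is only printed by default. The DRAGEN options themselves are the ones used in the template.

```shell
#!/bin/bash
set -euo pipefail

# Placeholder values for illustration only; replace with real paths.
DRAGEN_HASH_TABLE="/ref/hashtable"
OUTPUT="/data/rna run 01"          # contains spaces on purpose
PREFIX="sample1"
FASTQ_LIST="/data/fastq_list.csv"
FASTQ_LIST_SAMPLE_ID="sample1"
GTF="/ref/annotation.gtf"
TARGET_BED="/ref/exons.bed"

# One array per component; each option and each value is its own element.
INPUT_OPTIONS=(--ref-dir "$DRAGEN_HASH_TABLE" --fastq-list "$FASTQ_LIST" --fastq-list-sample-id "$FASTQ_LIST_SAMPLE_ID")
OUTPUT_OPTIONS=(--output-directory "$OUTPUT" --output-file-prefix "$PREFIX")
RNA_MAP_OPTIONS=(--enable-rna true --enable-map-align true --annotation-file "$GTF")
QUANT_OPTIONS=(--enable-rna-quantification true --rna-library-type IU --rna-quantification-gc-bias true)
SPLICE_OPTIONS=(--enable-rna-splice-variant true)
FUSION_OPTIONS=(--enable-rna-gene-fusion true)
VARIANT_OPTIONS=(--enable-variant-caller true --vc-target-bed "$TARGET_BED")

# The final command is the union of all component arrays.
CMD=(dragen
  "${INPUT_OPTIONS[@]}"
  "${OUTPUT_OPTIONS[@]}"
  "${RNA_MAP_OPTIONS[@]}"
  "${QUANT_OPTIONS[@]}"
  "${SPLICE_OPTIONS[@]}"
  "${FUSION_OPTIONS[@]}"
  "${VARIANT_OPTIONS[@]}")

# Print the command in shell-quoted form for review.
printf '%q ' "${CMD[@]}"
echo

# DRY_RUN is a hypothetical switch for this sketch: run only when it is 0.
if [[ "${DRY_RUN:-1}" == "0" ]]; then
  "${CMD[@]}"
fi
```

To drop a component, remove its array from `CMD`. No other part of the script needs to change.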
Additional Notes and Options
For SPLICE options, you can provide a list of normal splice variants to reduce noisy calls. The file should be a tab-separated file with the following first four columns:
contig name
first base of the splice junction (1-based)
last base of the splice junction (1-based)
strand (0: undefined, 1: +, 2: -)
Use the optional --rna-splice-variant-normals <SPLICE_NORMAL_FILE_PATH> option to provide the normal splice variants.
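The column layout described above can be checked before a run. The sketch below writes a small splice normals file and validates it with awk. The coordinates are invented for illustration, and `validate_splice_normals` is a hypothetical helper, not a DRAGEN tool. The check that the first base does not exceed the last base is an assumption about the intended ordering.

```shell
# Write a small example file. The junction coordinates are made up.
SPLICE_NORMALS="$(mktemp)"
printf 'chr1\t14830\t14969\t2\n' >  "$SPLICE_NORMALS"
printf 'chr1\t15039\t15795\t2\n' >> "$SPLICE_NORMALS"

# Hypothetical helper: returns 0 when every line has at least four
# tab-separated columns, integer 1-based coordinates, and a strand code
# of 0, 1, or 2. Assumes first base <= last base.
validate_splice_normals() {
  awk -F '\t' '
    NF < 4 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ ||
    $2 + 0 < 1 || $2 + 0 > $3 + 0 || $4 !~ /^[012]$/ {
      printf "line %d is malformed: %s\n", NR, $0 > "/dev/stderr"
      bad = 1
    }
    END { exit bad }
  ' "$1"
}

validate_splice_normals "$SPLICE_NORMALS" && echo "splice normals file looks valid"
```

A file that passes the check can then be supplied with --rna-splice-variant-normals "$SPLICE_NORMALS" alongside --enable-rna-splice-variant true.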