Parameters

The following fields are available for use with the sample sheet v2 template. Different sequencing systems can support different parameters or expect certain values in the sample sheet. Refer to the settings for your specific system.

Standalone Sections

Header Section parameters

Parameter

Description

Requirements

Custom_*

Custom field used to capture run metadata.

String with ASCII characters except for * and the control characters CR and LF.

FileFormatVersion

Used to identify the sample sheet as a v2 sample sheet. This field must always exist in the header section with a value of 2.

Must always be 2.

InstrumentPlatform

Identifies the instrument platform to be used for the run.

For example, NextSeq1000 or NextSeq2000.

String with ASCII characters except for * and the control characters CR and LF.

InstrumentType

Identifies the instrument to be used for the run.

For example, if using NextSeq 2000, populate the field with NextSeq2000.

String with ASCII characters except for * and the control characters CR and LF.

RunDescription

The run description can contain up to 255 alphanumeric characters, spaces, dashes, and underscores.

String with ASCII characters except for * and the control characters CR and LF.

RunName

The run name can contain up to 255 alphanumeric characters, spaces, dashes, and underscores.

String with ASCII characters except for * and the control characters CR and LF.
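The Header section requirements above can be checked mechanically. The following is a minimal sketch, assuming the header has already been read into a dictionary of strings; the function names `is_valid_header_string` and `check_header` are hypothetical and not part of any Illumina tool.

```python
FORBIDDEN = {"*", "\r", "\n"}  # the asterisk plus the control characters CR and LF


def is_valid_header_string(value: str) -> bool:
    """Return True if value is ASCII and contains no '*', CR, or LF."""
    return value.isascii() and not any(ch in FORBIDDEN for ch in value)


def check_header(header: dict) -> list:
    """Return a list of problems found in a Header section (hypothetical helper)."""
    problems = []
    if header.get("FileFormatVersion") != "2":
        problems.append("FileFormatVersion must be 2")
    for key, value in header.items():
        if key == "FileFormatVersion":
            continue
        if not is_valid_header_string(value):
            problems.append(f"{key} contains a disallowed character")
    return problems


print(check_header({"FileFormatVersion": "2", "RunName": "Run_1"}))
```

The sketch checks only the character rules and the FileFormatVersion value. It does not check the length limit on RunName or RunDescription.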

Reads Section Parameters

Parameter

Description

Requirements

Index1Cycles

Number of cycles in Index Read 1. Required if more than one sample is present in sample sheet.

  • Must be an integer ≥ 0.

  • Depending on your sequencing system and reagent kit, there can be limitations on the number of cycles in the Index Read.

  • Warning if values in range [1–5] inclusive.

  • If there is more than 1 sample per lane, must be > 0.

  • If OverrideCycles is present in the BCL_Settings section, must be consistent with the sum of the Index1 section of OverrideCycles.

Index2Cycles

Number of cycles in Index Read 2. Required if using dual indexes for demultiplexing.

  • Must be an integer ≥ 0.

  • Depending on your sequencing system and reagent kit, there can be limitations on the number of cycles in the Index Read.

  • Warning if values in range [1–5] inclusive.

  • If OverrideCycles is present in the BCL_Settings section, must be consistent with the sum of the Index2 section of OverrideCycles.

Read1Cycles

Number of cycles for Read 1.

  • Must be an integer > 0.

  • Warning if less than 26.

  • If OverrideCycles is present in the BCL_Settings section, must be consistent with the sum of the Read1 section of OverrideCycles.

Read2Cycles

Number of cycles for Read 2. Required only when running a paired-end sequencing run.

  • Must be an integer ≥ 0.

  • Warning if values in range [1–25] inclusive.

  • If OverrideCycles is present in the BCL_Settings section, must be consistent with the sum of the Read2 section of OverrideCycles.
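As an illustration of the range rules for the Reads section, the sketch below separates hard errors from warnings. It assumes the values have already been parsed to integers and that a missing field is treated as 0; `check_reads` is a hypothetical helper, and the consistency check against OverrideCycles is not covered.

```python
def check_reads(reads: dict) -> tuple:
    """Return (errors, warnings) for a dict of Reads-section integer values."""
    errors, warnings = [], []

    # Read1Cycles must be > 0, with a warning below 26.
    read1 = reads.get("Read1Cycles", 0)
    if read1 <= 0:
        errors.append("Read1Cycles must be > 0")
    elif read1 < 26:
        warnings.append("Read1Cycles is less than 26")

    # (field name, inclusive upper bound of the warning range)
    optional = [("Read2Cycles", 25), ("Index1Cycles", 5), ("Index2Cycles", 5)]
    for name, warn_max in optional:
        value = reads.get(name, 0)  # assumption: an absent field counts as 0
        if value < 0:
            errors.append(f"{name} must be >= 0")
        elif 1 <= value <= warn_max:
            warnings.append(f"{name} is in the range 1-{warn_max}")
    return errors, warnings


print(check_reads({"Read1Cycles": 151, "Read2Cycles": 151,
                   "Index1Cycles": 8, "Index2Cycles": 8}))
```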

Sequencing Section Parameters

Parameter

Description

Requirements

CustomIndex1Primer

Indicates if a Custom Index 1 primer is used for the run.

Values true and false allowed. Value true only allowed if Index1Cycles is specified.

CustomIndex2Primer

Indicates if a Custom Index 2 primer is used for the run.

Values true and false allowed. Value true only allowed if Index2Cycles is specified.

CustomRead1Primer

Indicates if a Custom Read 1 primer is used for the run.

Values true and false allowed. Value true only allowed if Read1Cycles is specified.

CustomRead2Primer

Indicates if a Custom Read 2 primer is used for the run.

Values true and false allowed. Value true only allowed if Read2Cycles is specified.

LibraryPrepKits

Identifies the library prep kit used for the run.

  • String with ASCII characters except for * and the control characters CR and LF.

  • If more than one library prep kit is being used, use semicolons to separate the names of the different library prep kits.

Application Sections

Application Section Parameters

Parameter

Description

Requirements

AdapterRead1

The sequence of the Read 1 adapter to be masked or trimmed.

Possible values are a concatenation from the set [A,C,T,G] and additional values based on your instrument.

To trim multiple adapters, separate the sequences with a plus sign (+) indicating independent adapters that must be independently assessed for masking or trimming for each read.

AdapterRead2

The sequence of the Read 2 adapter to be masked or trimmed.

Possible values are a concatenation from the set [A,C,T,G], and additional values based on your instrument.

To trim multiple adapters, separate the sequences with a plus sign (+) indicating independent adapters that must be independently assessed for masking or trimming for each read.

AppVersion

The version of the workflow-specific application (for example, DRAGEN Enrichment).

Use all three integers included in the version name. For example, 1.0.0.

AuxCnvPanelOfNormalsFile

File name in text (*.txt) or gzip (*.gz) format.

Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.

AuxCnvPopBAlleleVcfFile

File name in text (*.txt) or gzip (*.gz) format.

Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.

Optional if VariantCallingMode is AllVariantCallers.

The value must be na if AuxCnvPopBAlleleVcfFile is placed in the Data section and no AuxCnvPopBAlleleVcfFile is provided for the sample.

CNV output is only generated if this file is provided.

AuxGermlineTaggingFile

File name in text (*.txt) or gzip (*.gz) format.

Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.

The value must be na if AuxGermlineTaggingFile is placed in the Data section and no AuxGermlineTaggingFile is provided for the sample.

Germline tagging output is only generated if this file is provided.

AuxNoiseBaselineFile

File name in text (*.txt) or gzip (*.gz) format.

Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.

For DRAGEN Enrichment and DRAGENSomatic, applicable only if GermlineOrSomatic is Somatic and VariantCallingMode is not None.

For more information on noise baseline files, refer to the instrument product documentation.

The value must be na if AuxNoiseBaselineFile is placed in the Data section and no AuxNoiseBaselineFile is provided for the sample.

AuxSvNoiseBaselineFile

File name in text (*.txt) or gzip (*.gz) format.

Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.

Optional if VariantCallingMode is AllVariantCallers.

The value must be na if AuxSvNoiseBaselineFile is placed in the Data section and no AuxSvNoiseBaselineFile is provided for the sample.

BarcodeMismatchesIndex1

Specifies barcode mismatch tolerance for Index 1.

Possible values are 0, 1, or 2. Additional values might be available based on your instrument.

The default value is 1. Only allowed if Index 1 is specified in RunInfo.xml file and in the Reads section of the sample sheet.

BarcodeMismatchesIndex2

Specifies barcode mismatch tolerance for Index 2.

Possible values are 0, 1, or 2. Additional values might be available based on your instrument.

The default value is 1. Only allowed if Index 2 is specified in RunInfo.xml file and in the Reads section of the sample sheet.

BarcodePosition

The location of the bases that corresponds to the barcode within the value entered for BarcodeRead. Base positions are indexed starting at the zero position.

For DRAGEN Single Cell Library Kits 1–5, enter the BarcodePosition value in the following format:

0_<barcode end position>

For example, if a barcode contains 16 bases, the value is 0_15.

For DRAGEN Single Cell Library Kit 6, enter the BarcodePosition value in the following format:

0_<first barcode end position>+<second barcode start position>_<second barcode end position>+<third barcode start position>_<third barcode end position>

For example, the following structure would result in the value 0_8+21_29+43_51:

  • 9 bases in the first barcode (0_8).

  • 12 bases between first and second barcodes.

  • 9 bases in the second barcode (21_29).

  • 13 bases between second and third barcodes.

  • 9 bases in the third barcode (43_51).
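The arithmetic behind both formats can be sketched as follows. This is an illustrative helper only (`barcode_position` is a hypothetical name), assuming the first barcode starts at position 0 and that positions are zero-based and inclusive, as described above.

```python
def barcode_position(barcode_lengths, gaps=()):
    """Build a BarcodePosition string from barcode lengths and the gaps between them."""
    if len(gaps) != len(barcode_lengths) - 1:
        raise ValueError("need exactly one gap between each pair of barcodes")
    parts = []
    start = 0
    for i, length in enumerate(barcode_lengths):
        end = start + length - 1  # zero-based, inclusive end position
        parts.append(f"{start}_{end}")
        if i < len(gaps):
            start = end + 1 + gaps[i]  # skip the bases between barcodes
    return "+".join(parts)


print(barcode_position([16]))                 # 0_15
print(barcode_position([9, 9, 9], [12, 13]))  # 0_8+21_29+43_51
```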

BarcodeRead

The read within the sequencing run that contains both the barcode and the UMI.

Values can contain Read1 or Read2. The default value is Read1.

BarcodeSequenceList

The name of the file containing the barcode sequences to include.

The file name can only contain alphanumeric characters, dashes, underscores, and periods.

Bedfile

BED file to be used for analysis in text (*.txt) or gzip (*.gz) format.

Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.

Comparison1

Comparison sample 1.

  • Accepted values are control, comparison, or na.

  • Use only if DifferentialExpressionEnable is true.

  • If RnaPipelineMode is MapAlign, this value must be na.

  • Must have at least two control and two comparison samples. Can have a maximum of 15 control or comparison samples.

  • All control and comparison samples must have RnaPipelineMode set to FullPipeline and must share the same ReferenceGenomeDir and RnaGeneAnnotationFile values.

Comparison2

Comparison sample 2.

  • Accepted values are control, comparison, or na.

  • Use only if DifferentialExpressionEnable is true.

  • If RnaPipelineMode is MapAlign, this value must be na.

  • Must have at least two control and two comparison samples. Can have a maximum of 15 control or comparison samples.

  • All control and comparison samples must have RnaPipelineMode set to FullPipeline and must share the same ReferenceGenomeDir and RnaGeneAnnotationFile values.

Comparison3

Comparison sample 3.

  • Accepted values are control, comparison, or na.

  • Use only if DifferentialExpressionEnable is true.

  • If RnaPipelineMode is MapAlign, this value must be na.

  • Must have at least two control and two comparison samples. Can have a maximum of 15 control or comparison samples.

  • All control and comparison samples must have RnaPipelineMode set to FullPipeline and must share the same ReferenceGenomeDir and RnaGeneAnnotationFile values.

Comparison4

Comparison sample 4.

  • Accepted values are control, comparison, or na.

  • Use only if DifferentialExpressionEnable is true.

  • If RnaPipelineMode is MapAlign, this value must be na.

  • Must have at least two control and two comparison samples. Can have a maximum of 15 control or comparison samples.

  • All control and comparison samples must have RnaPipelineMode set to FullPipeline and must share the same ReferenceGenomeDir and RnaGeneAnnotationFile values.

Comparison5

Comparison sample 5.

  • Accepted values are control, comparison, or na.

  • Use only if DifferentialExpressionEnable is true.

  • If RnaPipelineMode is MapAlign, this value must be na.

  • Must have at least two control and two comparison samples. Can have a maximum of 15 control or comparison samples.

  • All control and comparison samples must have RnaPipelineMode set to FullPipeline and must share the same ReferenceGenomeDir and RnaGeneAnnotationFile values.

CreateFastqForIndexReads

If set to true or 1, creates FASTQ files for index reads as specified by the sample sheet. The following settings do not affect the resulting FASTQ files:

  • MinimumTrimmedReadLength

  • MaskShortReads

  • Must be 0 or 1.

  • At least one index must be specified in the sample sheet to enable this setting.

DifferentialExpressionEnable

Enable or disable differential expression for DRAGEN RNA.

  • Accepted values are true or false.

  • If DifferentialExpressionEnable is true, then RnaPipelineMode must be set to FullPipeline.

DnaBedFile

The BED file containing the regions to target. The BED file can be input in text (*.txt) or gzip (*.gz) format.

Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.

DnaGermlineOrSomatic

To perform DNA Amplicon germline analysis, enter germline. To perform DNA Amplicon somatic analysis, enter somatic.

Accepted values are germline or somatic.

DnaOrRna

The type of Amplicon analysis to perform.

Only DNA analysis is supported for DRAGEN v3.8. Enter dna.

DownSampleNumReads

Specifies the number of fragments to downsample to. For paired-end sequencing, the number of reads at the down-sampling output is twice the number of fragments specified.

  • Accepted values are integers.

  • If DownSampleNumReads is placed in the Data section and downsampling is not desired for that sample, a value of na must be used.

FastqCompressionFormat

The compression format for the FASTQ output files. Example values are gzip or dragen.

The value dragen is only valid if a dragen_compression_ref file exists in the genomes directory.

GermlineOrSomatic

Specifies either enrichment germline analysis or enrichment somatic analysis.

Accepted values are germline or somatic.

Index

The Index 1 (i7) index adapter sequence.

  • Required if index cycles are specified for the sample in OverrideCycles.

  • Can only contain A, C, G, or T.

  • Length of string must match number of first index cycles in RunInfo.xml or number specified in OverrideCycles.

  • If RunInfo.xml has IsReverseComplement set to Y, reverse-complement of listed sequence is used.

Index2

The Index 2 (i5) index adapter sequence.

  • Required if Index2Cycles is > 0.

  • Possible values are a concatenation from the set [A,C,T,G] or na.

  • Length of string must match number of second index cycles in RunInfo.xml or number specified in OverrideCycles.

  • If RunInfo.xml has IsReverseComplement set to Y, reverse-complement of listed sequence is used.

  • A value of na must be used if no indexes are specified in OverrideCycles for a sample.

  • Although Index2 is processed on the instrument in reverse complement format, Index2 is entered in the sample sheet in forward, non-complemented format for user convenience.

  • When mixing dual- and single-index libraries, use na for single-index libraries.
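To see how a forward sequence entered in the sample sheet relates to its reverse complement, consider the minimal sketch below. It only illustrates the base-level operation; `reverse_complement` is a hypothetical helper and is not a description of how the instrument or conversion software is implemented.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a sequence made of A, C, G, and T."""
    if set(seq) - set("ACGT"):
        raise ValueError(f"sequence may only contain A, C, G, or T: {seq!r}")
    # Complement each base, then reverse the order of the bases.
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATCGCGGT"))  # ACCGCGAT
```

The sample sheet still takes the forward form (here, ATCGCGGT); the conversion to reverse complement is not something the user enters.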

KeepFastQ

Indicates whether FASTQ files are saved (true) or discarded (false).

Accepted values are true or false.

Lane

Specifies FASTQ files only for the samples with the specified lane number.

Must adhere to the following requirements:

  • Must be an integer.

  • Value must be in the range of lanes specified in RunInfo.xml.

  • Ranges are not supported with - or +.

  • If not supplied, it is assumed that all samples are present in all lanes specified in the RunInfo.xml.

  • If supplied, only lanes specified in the column are converted from BCL to FASTQ.

MapAlignOutFormat

Formatting of the alignment output files.

Accepted values are bam, cram, or none. Selecting none produces no map/align output. If no value is specified, the default is none.

MethylationProtocol

Select the library protocol for methylation analysis.

Accepted values are directional, non-directional, directional-complement, and pbat.

OverrideCycles

Specifies the sequencing and indexing cycles to be used when processing the sequencing data.

  • Y—Specifies a sequencing read

  • I—Specifies an indexing read

  • U—Specifies a UMI cycle

  • N—Specifies trimmed reads

For instruments whose RunInfo.xml has IsReverseComplement set to Y for Index2, Index2 is processed in reverse complement orientation. Apply any trimming to the end of the sequence as it transitions to the adapter, which means the N cycles specified in OverrideCycles must be at the beginning of the Index Read.

Example of an 8 bp i5 index with the last two (adapter) bases to be trimmed, where XX is part of the adapter sequence:

  • Forward i5: XXATCGCGGT

  • ReverseComp i5: ACCGCGATXX (this is the direction of sequencing)

  • OverrideCycles for Index2: N2I8

Although Index2 is processed on the instrument in reverse complement format, Index2 and OverrideCycles are entered in the sample sheet in forward, non-complemented format for user convenience.

Must adhere to the following requirements:

  • Must be same number of fields (delimited by semicolon) as sequencing and indexing reads specified in RunInfo.xml and in the Reads section of the sample sheet.

  • The number of cycles specified for each read must equal the number of cycles specified for that read in the RunInfo.xml file and in the Reads section of the sample sheet.

  • Only one Y or I sequence can be specified per read.

  • I cycles can only be specified for index reads

  • Y cycles can only be specified for genomic reads

  • The total number of cycles used for demultiplexing by both indexes together cannot exceed 27. This includes all cycles between the first I cycle used by any sample and the last I cycle used by any sample within each index.

The following are examples of OverrideCycles input:

U8Y143;I8;I8;U8Y143

N10Y66;I6;N10Y66

For a sample sheet containing two samples having the following OverrideCycles, the number of cycles used for demultiplexing sums to 18:

  • Y151;I8N2;N10;Y151

  • Y151;N2I8;I8N2;Y151
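The count of 18 in the example above can be reproduced with a short sketch. It assumes that every sample lists the same number of semicolon-delimited reads and that, for each read position, the demultiplexing span runs from the first I cycle used by any sample to the last I cycle used by any sample. The function names are hypothetical, and this is one reading of the rule rather than the conversion software's own logic.

```python
import re

_SEGMENT = re.compile(r"([YIUN])(\d+)")


def parse_override_cycles(value):
    """Split an OverrideCycles string into per-read lists of (letter, count)."""
    reads = []
    for field in value.split(";"):
        segments = _SEGMENT.findall(field)
        # Reject empty fields and anything the pattern did not fully consume.
        if not segments or "".join(l + n for l, n in segments) != field:
            raise ValueError(f"unrecognized OverrideCycles field: {field!r}")
        reads.append([(letter, int(count)) for letter, count in segments])
    return reads


def index_span(read):
    """Return (first, last) 1-based I cycle of one read, or None if it has no I cycles."""
    cycle = 0
    first = last = None
    for letter, count in read:
        if letter == "I":
            if first is None:
                first = cycle + 1
            last = cycle + count
        cycle += count
    return None if first is None else (first, last)


def demux_cycles(samples):
    """Sum, over read positions, the span from the first to the last I cycle of any sample."""
    parsed = [parse_override_cycles(s) for s in samples]
    total = 0
    for reads_at_position in zip(*parsed):
        spans = [s for s in map(index_span, reads_at_position) if s]
        if spans:
            total += max(e for _, e in spans) - min(b for b, _ in spans) + 1
    return total


print(demux_cycles(["Y151;I8N2;N10;Y151", "Y151;N2I8;I8N2;Y151"]))  # 18
```

In the first index read the two samples together span cycles 1 through 10, and in the second index read only the second sample uses I cycles (1 through 8), giving 10 + 8 = 18, which is within the limit of 27.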

QcCoverage1BedFile

File name in text (*.txt) or gzip (*.gz) format.

Must include the prefix DragenGermline/. Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.

If QcCoverage1BedFile exists in the Data section, but a file is not provided, you must specify a value of na.

QcCoverage2BedFile

File name in text (*.txt) or gzip (*.gz) format.

Must include the prefix DragenGermline/. Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.

If QcCoverage2BedFile exists in the Data section, but a file is not provided, you must specify a value of na.

QcCoverage3BedFile

File name in text (*.txt) or gzip (*.gz) format.

Must include the prefix DragenGermline/. Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.

If QcCoverage3BedFile exists in the Data section, but a file is not provided, you must specify a value of na.

QcCrossContaminationVcfFile

File name in text (*.txt) or gzip (*.gz) format.

Must include the prefix DragenGermline/. Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.

If QcCrossContaminationVcfFile exists in the Data section, but a file is not provided, you must specify a value of na.

ReferenceGenomeDir

The reference genome name.

Alphanumeric string with underscores (_) or dashes (-).

To use a custom reference genome, refer to Reference Builder for Illumina Instruments v1.0.0 App Online Help.

RnaGeneAnnotationFile

Genotype reference file name in text (*.txt) or gzip (*.gz) format.

Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.

If DifferentialExpressionEnable is true, the GTF file must be provided by the user or included with the reference genome.

RnaLibraryType

Identifies if the RNA library type is stranded forward, stranded reverse, or unstranded.

Enter one of the following values:

  • SF—Stranded forward. SF is the default value.

  • SR—Stranded reverse.

  • U—Unstranded.

RnaPipelineMode

Identifies the pipeline mode as MapAlign or FullPipeline.

Accepted values are MapAlign or FullPipeline. The full pipeline option includes quantification and fusion detection.

Sample_ID

ID for the sample.

Must adhere to the following requirements:

  • Must be alphanumeric string with underscores (_) or dashes (-) and no spaces.

  • Case sensitive. For example, including MySample and mysample is not allowed.

  • For workflows other than BCL Convert, sample IDs must match the sample IDs specified in BCL Convert Data section.

  • 20 characters or fewer.

It is recommended to separate each identifier with a dash. For example, Sample1-DQB1-022515.

For BCL Convert, the same sample ID can exist on more than one row of the sample sheet to indicate one sample spanning more than one lane.

For non-BCL Convert workflows, a unique sample ID can only appear in one non-BCL Convert workflow.

No workflows allow Undetermined as a sample ID.
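The Sample_ID rules above lend themselves to a simple pre-flight check. The sketch below is illustrative only. `check_sample_ids` is a hypothetical helper, it assumes the reserved name is compared exactly as written, and it allows exact repeats because BCL Convert permits the same ID on several rows.

```python
import re

# Alphanumeric characters, underscores, and dashes; 1 to 20 characters; no spaces.
_SAMPLE_ID = re.compile(r"[A-Za-z0-9_-]{1,20}")


def check_sample_ids(sample_ids):
    """Return a list of problems found in a sequence of Sample_ID values."""
    problems = []
    seen = {}  # lowercased ID -> first spelling encountered
    for sample_id in sample_ids:
        if not _SAMPLE_ID.fullmatch(sample_id):
            problems.append(f"{sample_id!r}: must be 1-20 alphanumeric, '_' or '-' characters")
        if sample_id == "Undetermined":
            problems.append(f"{sample_id!r}: reserved name is not allowed")
        first = seen.setdefault(sample_id.lower(), sample_id)
        if first != sample_id:
            problems.append(f"{sample_id!r}: differs from {first!r} only by case")
    return problems


print(check_sample_ids(["Sample1-DQB1-022515", "MySample", "mysample"]))
```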

Sample_Name

Use Sample_Name sample sheet column for *.fastq file names in Sample_Project subdirectories (requires bcl-sampleproject-subdirectories true as well).

Can only contain alphanumeric characters, dashes, and underscores. Duplicate data strings with different cases (for example, sampleProject and SampleProject) are not allowed.

Sample_Project

If present, and both --sample-name-column-enabled true and --bcl-sampleproject-subdirectories true command lines are used, then output FASTQ files to subdirectories based on Sample_Project and Sample_ID, and name FASTQ files by Sample_Name.

Can only contain alphanumeric characters, dashes, and underscores. Duplicate data strings with different cases (for example, sampleProject and SampleProject) are not allowed.

SoftwareVersion

For the BCL Convert application, the version of software used to perform BCL conversion on sample IDs that only exist in the BCL Convert Data section.

For other application sections, identifies the version of the DRAGEN software to be used to process the specific DRAGEN pipeline, including conversion to FASTQ.

Use all three integers included in the DRAGEN version name. For example, 3.5.7.

TrimUMI

If set to false or 0, UMI sequences are not trimmed from output FASTQ reads. The UMI is still placed in the sequence header.

  • Must be 0 or 1.

  • UMIs must be specified for at least one read in the sample sheet to enable this setting.

UmiPosition

The location of the bases that corresponds to the UMI within the value entered for BarcodeRead.

Enter the UmiPosition value in the following format:

<UMI start position>_<UMI end position>

For example, if the UMI contains 10 bases and the barcode contains 16, the value is 16_25.
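The example value follows from zero-based, inclusive positions. A minimal sketch, assuming the UMI begins immediately after a single barcode that starts at position 0 (`umi_position` is a hypothetical helper):

```python
def umi_position(barcode_length: int, umi_length: int) -> str:
    """Build a UmiPosition string for a UMI that directly follows the barcode."""
    start = barcode_length          # first base after a barcode occupying 0..length-1
    end = start + umi_length - 1    # inclusive end position
    return f"{start}_{end}"


print(umi_position(16, 10))  # 16_25
```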

UsesTaps

Select whether the TAPS assay, which directly converts methylated C to T, is used (true) or not used (false).

Accepted values are true or false.

VariantCallingMode

Variant calling mode for the run.

Accepted values are None, SmallVariantCaller, AllVariantCallers. For DRAGENGermline, the option for all variant callers includes Small, Structural, CNV, Repeat Expansions, ROH, CYP2D6.

DRAGEN Version 4.1.7 includes CYP2B6, CYP21A2, SMN, and GBA.

For DRAGEN Enrichment and DRAGENSomatic, the option for all variant callers includes Small, Structural, and CNV callers (if panel of normals is provided).
