Parameters
The following fields are available for use with the sample sheet v2 template. Different sequencing systems can support different parameters or expect certain values in the sample sheet. Refer to the settings for your specific system.
Parameter | Description | Requirements |
---|---|---|
Custom_*
Custom field used to capture run metadata.
String with ASCII characters except for * and the control characters CR and LF.
FileFormatVersion
Used to identify the sample sheet as a v2 sample sheet. This field must always exist in the header section with a value of 2.
Must always be 2.
InstrumentPlatform
Identifies the instrument platform to be used for the run.
For example, NextSeq1000 or NextSeq2000.
String with ASCII characters except for * and the control characters CR and LF.
InstrumentType
Identifies the instrument to be used for the run.
For example: if using NextSeq 2000, populate the field with NextSeq2000.
String with ASCII characters except for * and the control characters CR and LF.
RunDescription
The run description can contain up to 255 characters, limited to alphanumeric characters, spaces, dashes, and underscores.
String with ASCII characters except for * and the control characters CR and LF.
RunName
The run name can contain up to 255 characters, limited to alphanumeric characters, spaces, dashes, and underscores.
String with ASCII characters except for * and the control characters CR and LF.
Index1Cycles
Number of cycles in Index Read 1. Required if more than one sample is present in sample sheet.
Must be an integer ≥ 0.
Depending on your sequencing system and reagent kit, there can be limitations on the number of cycles in the Index Read.
Warning if values in range [1–5] inclusive.
If there is more than 1 sample per lane, must be > 0.
If OverrideCycles is present in the BCL_Settings section, must be consistent with the sum of the Index1 section of OverrideCycles.
Index2Cycles
Number of cycles in Index Read 2. Required if using dual indexes for demultiplexing.
Must be an integer ≥ 0.
Depending on your sequencing system and reagent kit, there can be limitations on the number of cycles in the Index Read.
Warning if values in range [1–5] inclusive.
If OverrideCycles is present in the BCL_Settings section, must be consistent with the sum of the Index2 section of OverrideCycles.
Read1Cycles
Number of cycles for Read 1.
Must be an integer > 0.
Warning if less than 26.
If OverrideCycles is present in the BCL_Settings section, must be consistent with the sum of the Read1 section of OverrideCycles.
Read2Cycles
Number of cycles for Read 2. Required only when running a paired-end sequencing run.
Must be an integer ≥ 0.
Warning if values in range [1–25] inclusive.
If OverrideCycles is present in the BCL_Settings section, must be consistent with the sum of the Read2 section of OverrideCycles.
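The cycle rules above can be sketched as a small check. This is a minimal illustration, not the validation performed by any Illumina software: the function name check_reads, the dictionary layout, and the samples_per_lane argument are hypothetical, and the values are assumed to be integers already. The consistency check against OverrideCycles is not covered here.

```python
def check_reads(reads, samples_per_lane=1):
    """Return (errors, warnings) for a dict of Reads-section cycle counts.

    Hypothetical helper: `reads` maps field names such as "Read1Cycles"
    to integers; missing optional fields are treated as 0.
    """
    errors, warnings = [], []

    read1 = reads.get("Read1Cycles")
    if read1 is None or read1 <= 0:
        errors.append("Read1Cycles must be an integer > 0")
    elif read1 < 26:
        warnings.append("Read1Cycles is less than 26")

    read2 = reads.get("Read2Cycles", 0)
    if read2 < 0:
        errors.append("Read2Cycles must be an integer >= 0")
    elif 1 <= read2 <= 25:
        warnings.append("Read2Cycles is in the range [1-25]")

    for key in ("Index1Cycles", "Index2Cycles"):
        value = reads.get(key, 0)
        if value < 0:
            errors.append(f"{key} must be an integer >= 0")
        elif 1 <= value <= 5:
            warnings.append(f"{key} is in the range [1-5]")

    # More than one sample per lane requires Index Read 1.
    if samples_per_lane > 1 and reads.get("Index1Cycles", 0) <= 0:
        errors.append("Index1Cycles must be > 0 with more than 1 sample per lane")

    return errors, warnings
```

For example, check_reads({"Read1Cycles": 151, "Read2Cycles": 151, "Index1Cycles": 8, "Index2Cycles": 8}) returns two empty lists.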
CustomIndex1Primer
Indicates if a Custom Index 1 primer is used for the run.
Values true and false are allowed. The value true is allowed only if Index1Cycles is specified.
CustomIndex2Primer
Indicates if a Custom Index 2 primer is used for the run.
Values true and false are allowed. The value true is allowed only if Index2Cycles is specified.
CustomRead1Primer
Indicates if a Custom Read 1 primer is used for the run.
Values true and false are allowed. The value true is allowed only if Read1Cycles is specified.
CustomRead2Primer
Indicates if a Custom Read 2 primer is used for the run.
Values true and false are allowed. The value true is allowed only if Read2Cycles is specified.
LibraryPrepKits
Identifies the library prep kit used for the run.
String with ASCII characters except for * and the control characters CR and LF.
If more than one library prep kit is being used, use semicolons to separate the names of the different library prep kits.
AdapterRead1
The sequence of the Read 1 adapter to be masked or trimmed.
Possible values are a concatenation from the set [A,C,T,G] and additional values based on your instrument.
To trim multiple adapters, separate the sequences with a plus sign (+) indicating independent adapters that must be independently assessed for masking or trimming for each read.
AdapterRead2
The sequence of the Read 2 adapter to be masked or trimmed.
Possible values are a concatenation from the set [A,C,T,G], and additional values based on your instrument.
To trim multiple adapters, separate the sequences with a plus sign (+) indicating independent adapters that must be independently assessed for masking or trimming for each read.
AppVersion
The version of the workflow-specific application (for example, DRAGEN Enrichment).
Use all three integers included in the version name. For example, 1.0.0.
AuxCnvPanelOfNormalsFile
File name in text (*.txt) or gzip (*.gz) format.
Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.
AuxCnvPopBAlleleVcfFile
File name in text (*.txt) or gzip (*.gz) format.
Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.
Optional if VariantCallingMode is AllVariantCallers.
The value must be na if AuxCnvPopBAlleleVcfFile is placed in the Data section and no AuxCnvPopBAlleleVcfFile is provided for the sample.
CNV output is generated only if this file is provided.
AuxGermlineTaggingFile
File name in text (*.txt) or gzip (*.gz) format.
Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.
The value must be na if AuxGermlineTaggingFile is placed in the Data section and no AuxGermlineTaggingFile is provided for the sample.
Germline tagging output is generated only if this file is provided.
AuxNoiseBaselineFile
File name in text (*.txt) or gzip (*.gz) format.
Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.
For DRAGEN Enrichment and DRAGENSomatic, applicable only if GermlineOrSomatic is Somatic and VariantCallingMode is not None.
For more information on noise baseline files, refer to the instrument product documentation.
The value must be na if AuxNoiseBaselineFile is placed in the Data section and no AuxNoiseBaselineFile is provided for the sample.
AuxSvNoiseBaselineFile
File name in text (*.txt) or gzip (*.gz) format.
Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.
Optional if VariantCallingMode is AllVariantCallers.
The value must be na if AuxSvNoiseBaselineFile is placed in the Data section and no AuxSvNoiseBaselineFile is provided for the sample.
BarcodeMismatchesIndex1
Specifies barcode mismatch tolerance for Index 1.
Possible values are 0, 1, or 2. Additional values might be available based on your instrument.
The default value is 1. Only allowed if Index 1 is specified in the RunInfo.xml file and in the Reads section of the sample sheet.
BarcodeMismatchesIndex2
Specifies barcode mismatch tolerance for Index 2.
Possible values are 0, 1, or 2. Additional values might be available based on your instrument.
The default value is 1. Only allowed if Index 2 is specified in the RunInfo.xml file and in the Reads section of the sample sheet.
BarcodePosition
The location of the bases that correspond to the barcode within the value entered for BarcodeRead. Base positions are indexed starting at the zero position.
For DRAGEN Single Cell Library Kits 1–5, enter the BarcodePosition value in the following format:
0_<barcode end position>
For example, if a barcode contains 16 bases, the value is 0_15.
For DRAGEN Single Cell Library Kit 6, enter the BarcodePosition value in the following format:
0_<first barcode end position>+<second barcode start position>_<second barcode end position>+<third barcode start position>_<third barcode end position>
For example, the following structure would result in the value 0_8+21_29+43_51:
9 bases in the first barcode (0_8).
12 bases between the first and second barcodes.
9 bases in the second barcode (21_29).
13 bases between the second and third barcodes.
9 bases in the third barcode (43_51).
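The zero-based arithmetic behind these BarcodePosition values can be sketched as follows. The function barcode_position and its arguments are hypothetical names used only for illustration. The sketch assumes the first barcode starts at position 0 and that the gap sizes between barcodes are known.

```python
def barcode_position(barcode_lengths, gaps=()):
    """Build a BarcodePosition string from barcode lengths and gap sizes.

    Hypothetical helper: `barcode_lengths` lists the number of bases in
    each barcode, and `gaps` lists the number of bases between
    consecutive barcodes. Positions are zero-based and inclusive.
    """
    if len(gaps) != len(barcode_lengths) - 1:
        raise ValueError("expected one gap between each pair of barcodes")

    segments = []
    start = 0
    for i, length in enumerate(barcode_lengths):
        end = start + length - 1
        segments.append(f"{start}_{end}")
        if i < len(gaps):
            # The next barcode starts after this one plus the gap.
            start = end + 1 + gaps[i]
    return "+".join(segments)


print(barcode_position([16]))                 # 0_15
print(barcode_position([9, 9, 9], [12, 13]))  # 0_8+21_29+43_51
```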
BarcodeRead
The location within the sequencing run of the barcode read that contains both the barcode and the UMI.
Values can contain Read1 or Read2. The default value is Read1.
BarcodeSequenceList
The name of the file containing the barcode sequences to include.
The file name can only contain alphanumeric characters, dashes, underscores, and periods.
Bedfile
BED file to be used for analysis in text (*.txt) or gzip (*.gz) format.
Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.
Comparison1
Comparison sample 1.
Accepted values are control, comparison, or na.
Use only if DifferentialExpressionEnable is true.
If RnaPipelineMode is MapAlign, this value must be na.
Must have at least two control and two comparison samples. Can have a maximum of 15 control or comparison samples.
All control and comparison samples must have FullPipeline as the RnaPipelineMode value and the same ReferenceGenomeDir and RnaGeneAnnotationFile values.
Comparison2
Comparison sample 2.
Accepted values are control, comparison, or na.
Use only if DifferentialExpressionEnable is true.
If RnaPipelineMode is MapAlign, this value must be na.
Must have at least two control and two comparison samples. Can have a maximum of 15 control or comparison samples.
All control and comparison samples must have FullPipeline as the RnaPipelineMode value and the same ReferenceGenomeDir and RnaGeneAnnotationFile values.
Comparison3
Comparison sample 3.
Accepted values are control, comparison, or na.
Use only if DifferentialExpressionEnable is true.
If RnaPipelineMode is MapAlign, this value must be na.
Must have at least two control and two comparison samples. Can have a maximum of 15 control or comparison samples.
All control and comparison samples must have FullPipeline as the RnaPipelineMode value and the same ReferenceGenomeDir and RnaGeneAnnotationFile values.
Comparison4
Comparison sample 4.
Accepted values are control, comparison, or na.
Use only if DifferentialExpressionEnable is true.
If RnaPipelineMode is MapAlign, this value must be na.
Must have at least two control and two comparison samples. Can have a maximum of 15 control or comparison samples.
All control and comparison samples must have FullPipeline as the RnaPipelineMode value and the same ReferenceGenomeDir and RnaGeneAnnotationFile values.
Comparison5
Comparison sample 5.
Accepted values are control, comparison, or na.
Use only if DifferentialExpressionEnable is true.
If RnaPipelineMode is MapAlign, this value must be na.
Must have at least two control and two comparison samples. Can have a maximum of 15 control or comparison samples.
All control and comparison samples must have FullPipeline as the RnaPipelineMode value and the same ReferenceGenomeDir and RnaGeneAnnotationFile values.
CreateFastqForIndexReads
If set to true or 1, creates FASTQ files for index reads as specified by the sample sheet. The following settings do not affect the resulting FASTQ files:
MinimumTrimmedReadLength
MaskShortReads
Must be 0 or 1.
At least one index must be specified in the sample sheet to enable this setting.
DifferentialExpressionEnable
Enable or disable differential expression for DRAGEN RNA.
Accepted values are true or false.
If DifferentialExpressionEnable is true, then RnaPipelineMode must be set to FullPipeline.
DnaBedFile
The BED file containing the regions to target. The BED file can be input in text (*.txt) or gzip (*.gz) format.
Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.
DnaGermlineOrSomatic
To perform DNA Amplicon germline analysis, enter germline. To perform DNA Amplicon somatic analysis, enter somatic.
Accepted values are germline or somatic.
DnaOrRna
The type of Amplicon analysis to perform.
Only DNA analysis is supported for DRAGEN v3.8. Enter dna.
DownSampleNumReads
Specifies the number of fragments to downsample to. For paired-end sequencing, the number of reads at the down-sampling output is twice the number of fragments specified.
Accepted values are integers.
If DownSampleNumReads is placed in the Data section and downsampling is not desired for that sample, a value of na must be used.
FastqCompressionFormat
The compression format for the FASTQ output files. Example values are gzip or dragen.
The value dragen is only valid if a dragen_compression_ref file exists in the genomes directory.
GermlineOrSomatic
Specifies either enrichment germline analysis or enrichment somatic analysis.
Accepted values are germline or somatic.
Index
The Index 1 (i7) index adapter sequence.
Required if index cycles are specified for the sample in OverrideCycles.
Can only contain A, C, G, or T.
Length of string must match the number of first index cycles in RunInfo.xml or the number specified in OverrideCycles.
If RunInfo.xml has IsReverseComplement set to Y, the reverse complement of the listed sequence is used.
Index2
The Index 2 (i5) Index adapter sequence.
Required if Index2Cycles is > 0.
Possible values are a concatenation from the set [A,C,T,G] or na.
Length of string must match the number of second index cycles in RunInfo.xml or the number specified in OverrideCycles.
If RunInfo.xml has IsReverseComplement set to Y, the reverse complement of the listed sequence is used.
A value of na must be used if no indexes are specified in OverrideCycles for a sample.
Although Index2 is processed on the instrument in reverse complement format, Index2 is entered in the sample sheet in forward, non-complemented format for user convenience.
When mixing dual- and single-index libraries, use na for single-index libraries.
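To illustrate the reverse complement relationship described for Index2, here is a minimal sketch. The helper reverse_complement is a hypothetical name and is not part of any Illumina tool. It only shows how the forward sequence entered in the sample sheet relates to the orientation processed when IsReverseComplement is set to Y.

```python
# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of an index sequence (A, C, G, T only)."""
    invalid = set(sequence) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    # Complement every base, then reverse the order.
    return sequence.translate(_COMPLEMENT)[::-1]


# Forward i5 bases as they would be entered in the sample sheet.
print(reverse_complement("ATCGCGGT"))  # ACCGCGAT
```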
KeepFastQ
Indicates whether FASTQ files are saved (true) or discarded (false).
Accepted values are true or false.
Lane
When specified, FASTQ files are generated only for the samples with the specified lane number.
Must adhere to the following requirements:
Must be an integer.
Value must be in the range of lanes specified in RunInfo.xml.
Ranges are not supported with - or +.
If not supplied, it is assumed that all samples are present in all lanes specified in RunInfo.xml.
If supplied, only lanes specified in the column are converted from BCL to FASTQ.
MapAlignOutFormat
Formatting of the alignment output files.
Accepted values are bam, cram, or none. Selecting none produces no map/align output. If no value is specified, the default is none.
MethylationProtocol
Select the library protocol for methylation analysis.
Accepted values are directional, non-directional, directional-complement, and pbat.
OverrideCycles
Specifies the sequencing and indexing cycles to be used when processing the sequencing data.
Y: Specifies a sequencing read
I: Specifies an indexing read
U: Specifies a UMI cycle
N: Specifies trimmed reads
For instruments whose RunInfo.xml has IsReverseComplement set to Y for Index2, Index2 is processed in reverse complement orientation. Apply any trimming to the end of the sequence as it transitions to the adapter, which means the N cycles specified in OverrideCycles must be at the beginning of the Index Read.
Example of an 8 bp i5 index with the last two (adapter) bases to be trimmed, where XX is part of the adapter sequence:
Forward i5: XXATCGCGGT
ReverseComp i5: ACCGCGATXX (this is the direction of sequencing)
OverrideCycles for Index2: N2I8
Although Index2 is processed on the instrument in reverse complement format, Index2 and OverrideCycles are entered in the sample sheet in forward, non-complemented format for user convenience.
Must adhere to the following requirements:
Must be the same number of fields (delimited by semicolons) as sequencing and indexing reads specified in RunInfo.xml and in the Reads section of the sample sheet.
The number of cycles specified for each read must equal the number of cycles specified for that read in the RunInfo.xml file and in the Reads section of the sample sheet.
Only one Y or I sequence can be specified per read.
I cycles can only be specified for index reads.
Y cycles can only be specified for genomic reads.
The total number of cycles used for demultiplexing by both indexes together cannot exceed 27. This includes all cycles between the first I cycle used by any sample and the last I cycle used by any sample within each index.
The following are examples of OverrideCycles input:
U8Y143;I8;I8;U8Y143
N10Y66;I6;N10Y66
For a sample sheet containing two samples having the following OverrideCycles, the number of cycles used for demultiplexing sums to 18:
Y151;I8N2;N10;Y151
Y151;N2I8;I8N2;Y151
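The demultiplexing cycle count in the two-sample example can be reproduced with a short sketch. The names parse_field, index_span, and demux_cycles are hypothetical. The sketch assumes a four-read layout in which fields 1 and 2 (zero-based) are the index reads; for other layouts, such as the three-field N10Y66;I6;N10Y66 example, pass different index_fields. It is an illustration of the counting rule, not the check performed by BCL Convert.

```python
import re

# One OverrideCycles element: a letter (Y, I, U, or N) followed by a cycle count.
TOKEN = re.compile(r"([YIUN])(\d+)")


def parse_field(field):
    """Split one read field such as 'N2I8' into [('N', 2), ('I', 8)]."""
    tokens = TOKEN.findall(field)
    if "".join(kind + count for kind, count in tokens) != field:
        raise ValueError(f"cannot parse OverrideCycles field: {field!r}")
    return [(kind, int(count)) for kind, count in tokens]


def index_span(field):
    """Return (first, last) 1-based I cycle of a field, or None if no I cycles."""
    position = 0
    span = None
    for kind, count in parse_field(field):
        if kind == "I":
            span = (position + 1, position + count)
        position += count
    return span


def demux_cycles(override_cycles_per_sample, index_fields=(1, 2)):
    """Sum, per index read, the cycles from the first to the last I cycle
    used by any sample."""
    total = 0
    for idx in index_fields:
        spans = [index_span(oc.split(";")[idx]) for oc in override_cycles_per_sample]
        spans = [s for s in spans if s is not None]
        if spans:
            total += max(s[1] for s in spans) - min(s[0] for s in spans) + 1
    return total


samples = ["Y151;I8N2;N10;Y151", "Y151;N2I8;I8N2;Y151"]
print(demux_cycles(samples))  # 18
```

In this example, Index 1 spans cycles 1 through 10 across the two samples (10 cycles) and Index 2 spans cycles 1 through 8 (8 cycles), which gives the total of 18.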
QcCoverage1BedFile
File name in text (*.txt) or gzip (*.gz) format.
Must include the prefix DragenGermline/. Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.
If QcCoverage1BedFile exists in the Data section, but a file is not provided, you must specify a value of na.
QcCoverage2BedFile
File name in text (*.txt) or gzip (*.gz) format.
Must include the prefix DragenGermline/. Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.
If QcCoverage2BedFile exists in the Data section, but a file is not provided, you must specify a value of na.
QcCoverage3BedFile
File name in text (*.txt) or gzip (*.gz) format.
Must include the prefix DragenGermline/. Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.
If QcCoverage3BedFile exists in the Data section, but a file is not provided, you must specify a value of na.
QcCrossContaminationVcfFile
File name in text (*.txt) or gzip (*.gz) format.
Must include the prefix DragenGermline/. Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.
If QcCrossContaminationVcfFile exists in the Data section, but a file is not provided, you must specify a value of na.
ReferenceGenomeDir
The reference genome name.
Alphanumeric string with underscores (_) or dashes (-).
To use a custom reference genome, refer to Reference Builder for Illumina Instruments v1.0.0 App Online Help.
RnaGeneAnnotationFile
Genotype reference file name in text (*.txt) or gzip (*.gz) format.
Alphanumeric string with underscores (_) or dashes (-) or periods (.) with no spaces allowed.
If DifferentialExpressionEnable is true, the GTF file must be provided by the user or included with the reference genome.
RnaLibraryType
Identifies if the RNA library type is stranded forward, stranded reverse, or unstranded.
Enter one of the following values:
SF: Stranded forward. SF is the default value.
SR: Stranded reverse.
U: Unstranded.
RnaPipelineMode
Identifies the pipeline mode as MapAlign or FullPipeline.
Accepted values are MapAlign or FullPipeline. The full pipeline option includes quantification and fusion detection.
Sample_ID
ID for the sample.
Must adhere to the following requirements:
Must be alphanumeric string with underscores (_) or dashes (-) and no spaces.
IDs that differ only by case are not allowed. For example, including both MySample and mysample is not allowed.
For workflows other than BCL Convert, sample IDs must match the sample IDs specified in BCL Convert Data section.
20 characters or fewer.
It is recommended to separate each identifier with a dash. For example, Sample1-DQB1-022515.
For BCL Convert, the same sample ID can exist on more than one row of the sample sheet to indicate one sample spanning more than one lane.
For non-BCL Convert workflows, a unique sample ID can only appear in one non-BCL Convert workflow.
No workflows allow Undetermined as a sample ID.
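The Sample_ID rules lend themselves to a simple pre-check before a sample sheet is submitted. The sketch below is a hypothetical helper, check_sample_ids, not an Illumina API. It covers the character set, the 20-character limit, the Undetermined restriction, and IDs that differ only by case. It deliberately allows exact repeats, since BCL Convert permits the same ID on several rows. Matching Undetermined exactly (case-sensitively) is an assumption.

```python
import re

# Alphanumeric characters, underscores, and dashes; 1 to 20 characters.
SAMPLE_ID = re.compile(r"[A-Za-z0-9_-]{1,20}")


def check_sample_ids(sample_ids):
    """Return a list of human-readable problems found in the given IDs."""
    problems = []
    seen = {}  # lower-cased ID -> first spelling encountered
    for sample_id in sample_ids:
        if not SAMPLE_ID.fullmatch(sample_id):
            problems.append(f"{sample_id!r}: invalid characters or length")
        if sample_id == "Undetermined":
            problems.append("'Undetermined' is not allowed as a sample ID")
        key = sample_id.lower()
        if key in seen and seen[key] != sample_id:
            problems.append(f"{sample_id!r} differs from {seen[key]!r} only by case")
        seen.setdefault(key, sample_id)
    return problems


print(check_sample_ids(["Sample1-DQB1-022515", "MySample", "mysample"]))
```

The call above reports one problem, for the MySample and mysample pair.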
Sample_Name
Use the Sample_Name sample sheet column for *.fastq file names in Sample_Project subdirectories (requires --bcl-sampleproject-subdirectories true as well).
Can only contain alphanumeric characters, dashes, and underscores. Duplicate data strings with different cases (for example, sampleProject and SampleProject) are not allowed.
Sample_Project
If present, and both the --sample-name-column-enabled true and --bcl-sampleproject-subdirectories true command line options are used, FASTQ files are output to subdirectories based on Sample_Project and Sample_ID, and are named by Sample_Name.
Can only contain alphanumeric characters, dashes, and underscores. Duplicate data strings with different cases (for example, sampleProject and SampleProject) are not allowed.
SoftwareVersion
For the BCL Convert application, the version of software used to perform BCL conversion on sample IDs that only exist in the BCL Convert Data section.
For other application sections, identifies the version of the DRAGEN software to be used to process the specific DRAGEN pipeline, including conversion to FASTQ.
Use all three integers included in the DRAGEN version name. For example, 3.5.7.
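A three-integer version string such as 3.5.7 can be checked with a short pattern. This is a minimal sketch, and the name is_three_part_version is hypothetical. It checks only the shape of the value, not whether that version exists or is supported on a given instrument.

```python
import re

# Exactly three dot-separated integers, for example 3.5.7 or 1.0.0.
VERSION = re.compile(r"\d+\.\d+\.\d+")


def is_three_part_version(value):
    """Return True if the value consists of exactly three integers."""
    return VERSION.fullmatch(value) is not None


print(is_three_part_version("3.5.7"))  # True
print(is_three_part_version("3.5"))    # False
```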
TrimUMI
If set to false or 0, UMI sequences are not trimmed from output FASTQ reads. The UMI is still placed in sequence header.
Must be 0 or 1.
UMIs must be specified for at least one read in the sample sheet to enable this setting.
UmiPosition
The location of the bases that correspond to the UMI within the value entered for BarcodeRead.
Enter the UmiPosition value in the following format:
<UMI start position>_<UMI end position>
For example, if the UMI contains 10 bases and the barcode contains 16, the value is 16_25.
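The UmiPosition example follows the same zero-based, inclusive arithmetic as BarcodePosition. The sketch below uses a hypothetical helper, umi_position, and assumes that the barcode starts at position 0 and that the UMI follows it immediately, as in the example above. Layouts with gaps or several barcodes would need the actual start position instead.

```python
def umi_position(barcode_length, umi_length):
    """Return a UmiPosition string for a UMI that directly follows the barcode.

    Assumes the barcode occupies positions 0 through barcode_length - 1.
    """
    start = barcode_length            # first base after the barcode
    end = start + umi_length - 1      # inclusive end position
    return f"{start}_{end}"


print(umi_position(16, 10))  # 16_25
```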
UsesTaps
Select whether the TAPS assay, which directly converts methylated C to T, is used (true) or not used (false).
Accepted values are true or false.
VariantCallingMode
Variant calling mode for the run.
Accepted values are None, SmallVariantCaller, or AllVariantCallers. For DRAGENGermline, the option for all variant callers includes Small, Structural, CNV, Repeat Expansions, ROH, and CYP2D6.
DRAGEN Version 4.1.7 includes CYP2B6, CYP21A2, SMN, and GBA.
For DRAGEN Enrichment and DRAGENSomatic, the option for all variant callers includes Small, Structural, and CNV callers (if panel of normals is provided).