MiSeqDx Sample Sheet Generation (v1.11.0 and later)
Package Version: BaseSpace Clarity LIMS MiSeqDx (v1.11.0 and later)
The Illumina MiSeqDx Integration Package allows for automatic generation of a sample sheet to be used with the MiSeqDx instrument software. The format of this sample sheet is designed for the instrument when it is running in Diagnostic mode.
Note
If you are running the MiSeqDx instrument in Research Use Only (RUO) mode, see the Sample Sheet Generation section in the Configuration guide for the MiSeq integration version you are using.
MiSeqDx does not support bcl2fastq v2 sample sheet generation.
How sample sheet generation works
Sample sheet generation is configured on the step prior to the sequencing run – Denature, Dilute and Load Sample, which is the step where samples are placed on the flow cells or reagent cartridges that will be placed in the instrument.
The sample sheet is generated by means of a script, which the lab user initiates by clicking a button on the Record Details screen of the step. This generates a sample sheet file for the container loaded during the step, where the name of the sample sheet will be
The user then downloads the sample sheet from the LIMS and uploads it to the instrument software.
The following assays are supported by the MiSeqDx sample sheet generation script:
CF 139-Variant Assay
CF Clinical Sequencing Assay
Sample sheet format is controlled via master step field / step user defined field (UDF) configuration, where key step fields are pre-populated with values specific to each protocol step. These values are not editable, and their configuration should not be modified.
For details, see the Sample Sheet Data and File Format and Contents sections below.
The fields listed in the following table are available on the Denature, Dilute and Load Sample step and will be placed into the sample sheet.
Usage
Below is the default command line that ships with the Denature, Dilute and Load Sample (CF 139-Variant Assay) step.
Single well container types, and all one-dimensional container types with both numeric rows and numeric columns, are supported.
The following table lists and describes the fields included in the MiSeqDx sample sheet.
Note the following:
If no upstream pooling is detected, LIMS will populate the sample sheet with the SampleID and SampleName of the submitted sample. Other fields are populated with data from the samples that were input to the step (i.e. derived samples).
If upstream pooling is detected, LIMS will populate the sample sheet with the first upstream pooled inputs found – not with the submitted sample or step input fields.
Control samples may be one of the following:
Built-in BaseSpace Clarity LIMS control samples
Submitted samples with field / UDF Control? set to true
This section outlines the format and contents of the generated sample sheet and associated log file.
When validating the installation of your integration, refer to this information to ensure that the sample sheet and log files are correctly generated.
MiSeqDx sample sheet
The file is a comma-separated file.
The file contains the following sections:
Header
Manifests
Reads
Settings
Data
The file is populated with data from the samples in the step. If pooled, each sample in the pool is represented as a separate, demultiplexed entry.
The entries are sorted by SampleWell and by SampleID.
The data section of the file contains 11 columns.
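When validating an installation, a quick structural check of the generated file can be scripted. The following is a minimal sketch, assuming that each section starts with a bracketed heading such as [Reads] in the first column. The function names and the sample text used to exercise it are illustrative only and are not part of the integration package.

```python
import csv
import io

# Section names listed in this document for the MiSeqDx sample sheet.
EXPECTED_SECTIONS = ["Header", "Manifests", "Reads", "Settings", "Data"]


def split_sections(sample_sheet_text):
    """Group CSV rows under the most recent [Section] heading."""
    sections = {}
    current = None
    for row in csv.reader(io.StringIO(sample_sheet_text)):
        # Skip rows that are empty or contain only blank cells.
        if not row or not any(cell.strip() for cell in row):
            continue
        first = row[0].strip()
        if first.startswith("[") and first.endswith("]"):
            current = first[1:-1]
            sections[current] = []
        elif current is not None:
            sections[current].append(row)
    return sections


def missing_sections(sample_sheet_text):
    """Return the expected section names that are absent from the file."""
    found = split_sections(sample_sheet_text)
    return [name for name in EXPECTED_SECTIONS if name not in found]
```

An empty list returned by missing_sections indicates that all five sections are present. The check says nothing about whether the values inside each section are correct.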
MiSeqDx sample sheet log file
The file is in HTML format.
The file contains logging information and, if the sample sheet was generated successfully, a success message.
Enabling unique FASTQ file names
To enable unique FASTQ file names per sequencing run, the EPP command on the process type must be configured to use the following parameter options:
-useSampleLimsID – ensures unique entries in the SampleName column by using the sample LIMS ID instead of its name
-appendLimsID – ensures unique names per run by appending the LIMS ID of the current step
For more information, see Script Parameters and Usage.
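The combined effect of the two options on the SampleName value can be illustrated with a short sketch. It assumes that the appended LIMS ID is joined with a hyphen, as in the Sample1-1234 example given in this document. The function name and the LIMS ID values are hypothetical, and the script's real implementation may differ.

```python
def sample_name_value(sample_name, sample_lims_id, step_lims_id,
                      use_sample_lims_id=False, append_lims_id=False):
    """Return the value written to the SampleName column (illustrative only)."""
    # -useSampleLimsID: use the sample LIMS ID in place of the sample name.
    base = sample_lims_id if use_sample_lims_id else sample_name
    # -appendLimsID: append the LIMS ID of the current step (hyphen assumed).
    if append_lims_id:
        return f"{base}-{step_lims_id}"
    return base
```

With both options enabled, the same sample sequenced in two different steps produces two different SampleName values, which is what keeps the FASTQ file names unique per run.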
The step on which this script runs must be the step in which samples are placed on the flow cell(s) or reagent cartridge(s).
The contents of the sample sheet are ordered by SampleWell and then ordered by SampleID.
Project and sample names in the sample sheet cannot contain illegal characters. Characters not allowed are the space character and the following: ? ( ) [ ] / \ = + < > : ; " ' , * ^ | &
Illegal characters are replaced with an underscore ("_").
The destination container type (flow cell or reagent cartridge) must be either single well or a one-dimensional container type with both numeric rows and numeric columns.
If a value is provided for only a single CAT manifest file, then all samples in the sample sheet will be given the designation (A) associated with that CAT type.
For single read, one entry is listed beneath the [Reads] heading, in the first column of the spreadsheet. For paired end, two entries are listed.
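The illegal-character rule in this list can be sketched as follows. This is an illustration only, assuming each illegal character is replaced one-for-one with an underscore; whether the script collapses consecutive replacements is not stated here. The function name is hypothetical.

```python
# The space character plus the characters listed as not allowed.
ILLEGAL_CHARACTERS = set(" ?()[]/\\=+<>:;\"',*^|&")


def sanitize_name(name):
    """Replace every illegal character in a project or sample name with '_'."""
    return "".join("_" if ch in ILLEGAL_CHARACTERS else ch for ch in name)
```

For example, sanitize_name("My Project (v2)") returns "My_Project__v2_", while a name such as "Sample-01" is returned unchanged because the hyphen is not in the illegal set.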
Step fields (Denature, Dilute and Load Sample)
Field Name | Field Type | Required? | Notes
Experiment Name | Text | No | Entered by the user
Workflow | Text | Yes | Value set to Amplicon for CF 139-Variant Assay and CF Clinical Sequencing Assay
Description | Text | No | Entered by the user
Assay | Text | No | Configured with the following preset values: CF 139-Variant; CF Clinical Sequencing
Amplicon Workflow Type | Text | No | Configured with the following preset values: CF139VARIANTASSAY; CFCLINICALSEQUENCINGASSAY
Application | Text | Yes | Configured with the following preset values: CF 139-Variant Assay; CF Clinical Sequencing Assay
PhiX Control added? | Check box | No | Default set to false
VariantCaller | Text | No | Value set to Starling, a legacy variant caller
Variant Min Quality Cutoff | Numeric | No | Value set to 100
GenomeFolder | Text | Yes | Required for secondary analysis
Control CAT Manifest | Text | No | Preset value CFTRManifest.txt
Custom CAT Manifest | Text | No | Manifest file for custom assay
Read 1 Cycles | Numeric | Yes | Configured with range 0-1000. Value set to 151
Read 2 Cycles | Numeric | Yes | Configured with range 0-1000. Value set to 151
Submitted sample fields
Field Name | Field Type | Required? | Notes
Reference Genome | Text | No | Optionally used to populate the GenomeFolder value for individual samples in the sample sheet.
Control? | Text | No | Used to indicate a control sample that is represented as a submitted sample in the LIMS.
Script parameters
Parameter | Description | Required? | Notes
u, username | LIMS username | Yes |
p, password | LIMS password | Yes |
i, processURI | LIMS process URI | Yes |
c, csvFileLimsIds | Sample sheet CSV file LIMS ID | Yes | May be provided multiple times
e, errorLogFileName | Log file name | Yes |
l, useProjectLimsID | Project LIMS ID will be used instead of project name in the Project column of the sample sheet. | No | Accepted values: true or false. Provide with quotes, e.g. -l 'true'.
s, useSampleLimsID | Should be set to true. Sample LIMS ID will be used instead of sample name in the SampleName column of the sample sheet. | No | Accepted values: true or false. Provide with quotes, e.g. -s 'true'. See Enabling unique FASTQ file names in Configuration Options.
a, appendLimsID | Should be set to false. LIMS ID of the protocol step will be appended to sample names in the SampleName column of the sample sheet. | No | Accepted values: true or false. Provide with quotes, e.g. -a 'true'.
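The parameters can be assembled programmatically when testing a configuration. The sketch below builds only the argument portion from the short option letters listed above. The script invocation itself is not shown in this section and is deliberately left out. The function name and all values passed to it are hypothetical, and repeating -c once per file LIMS ID is an assumption based on the note that the parameter may be provided multiple times.

```python
def build_script_arguments(username, password, process_uri, csv_file_lims_ids,
                           error_log_file_name, use_project_lims_id=None,
                           use_sample_lims_id=None, append_lims_id=None):
    """Return the option/value list for the sample sheet script (illustrative)."""
    # Required parameters.
    args = ["-u", username, "-p", password, "-i", process_uri]
    for lims_id in csv_file_lims_ids:
        # Assumption: -c is repeated once per sample sheet file LIMS ID.
        args += ["-c", lims_id]
    args += ["-e", error_log_file_name]
    # Optional parameters accept the strings true or false; None omits them.
    optional = (("-l", use_project_lims_id),
                ("-s", use_sample_lims_id),
                ("-a", append_lims_id))
    for flag, value in optional:
        if value is not None:
            args += [flag, "true" if value else "false"]
    return args
```

When the list is passed to a process launcher directly, no shell quoting is needed. The quotes shown in the Notes column apply when the command is written as a single shell string.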
Header section
Field Name | Description | Required? | Notes
WMFileVersion | Illumina Worklist Manager Version number. | No |
Date | The date the sample sheet was generated. | No |
Workflow | Master step field/Step UDF of the same name. | Yes |
Application | Populated with CF 139-Variant Assay or CF Clinical Sequencing Assay. | Yes |
Assay | Master step field/Step UDF of the same name. | No |
Description | Master step field/Step UDF of the same name. | No |
Chemistry | The recipe fragments used to build the run-specific recipe. | Yes | Populated with Amplicon
Manifests section
Field Name | Description | Required? | Notes
A | Master step field/Step UDF Control CAT Manifest. Typically a path to a file. | No | Must be a real path. Convention indicates this is the Control CAT. Control samples will be given the designation A in the Data section.
B | Master step field/Step UDF Control CAT Manifest. Typically a path to a file. | No | Must be a real path. Convention indicates this is the Control CAT. Control samples will be given the designation B in the Data section.
Settings section
Field Name | Description | Required? | Notes
AmpliconWorkflowType | Master step field/Step UDF Amplicon Workflow Type. | No | Populated with values CF139VARIANTASSAY or CFCLINICALSEQUENCINGASSAY
VariantCaller | Master step field/Step UDF of the same name. | No |
VariantMinimumQualCutoff | Master step field/Step UDF Variant Min Quality Cutoff. | Yes |
Data section
Field Name | Description | Required? | Notes
Sample_ID | Populated with the LIMS ID of the sample if pooled, or the LIMS ID of the submitted sample if not pooled. | Yes |
Sample_Name | Populated with the sample name if pooled, or the submitted sample name if not pooled. | Always present | If script parameter useSampleLimsID is provided on the command line, the LIMS ID of the sample will be used instead of the name. The additional -a command line option appends the LIMS ID to the end of this value, e.g. Sample1-1234. See Script Parameters and Usage.
Sample_Plate | Name of the container that the sample resides in, as recorded in the LIMS. | Always present |
Sample_Well | Container well location of the sample. If a sample is part of a pool, this will list the well location of the sample that was added to the pool. | Always present |
Sample_Project | The name of the LIMS project that the sample belongs to. | Always present |
Control | Blank for a normal sample. Populated with value positive when Positive Control for MiSeqDx has been added to the pool. Populated with value negative when Negative Control for MiSeqDx has been added to the pool. | No | Controls must have an index. Controls in these cases look just like the other samples.
index | Determined from the reagent label. Uses the Sequence attribute value from Index Reagents. Dual index reagents will contain a hyphen-separated DNA sequence; this field will use the first half of that value. | Yes, if more than one input |
I7_Index_ID | Determined from the name of the index reagent type. Dual index names will be hyphen-separated; this field will use the first half of that value. | Yes, if more than one input |
index2 | Determined from the reagent label. Uses the Sequence attribute value from Index Reagents. Dual index reagents will contain a hyphen-separated DNA sequence; this field will use the second half of that value. | No |
I5_Index_ID | Determined from the name of the index reagent type. Dual index names will be hyphen-separated; this field will use the second half of that value. | No |
Manifest | A | Yes | Value determined by the entries in the Manifests section.
GenomeFolder | If master step field/step UDF Use submitted sample details for Genome Folder location is true, this is populated with the value of submitted sample global field/UDF Reference Genome. Otherwise, populated with the value of master step field/step UDF GenomeFolder. | Yes, if Use submitted sample details for Genome Folder location is true | Folder path for ReferenceGenomes used for secondary analysis.
Description | | No |
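The way the index and index2 values (and likewise I7_Index_ID and I5_Index_ID) are derived from a hyphen-separated dual index value can be sketched as follows. This is an illustration, not the script's code. It assumes the value contains at most one hyphen that separates the two halves, and the sequences in the usage note are made-up examples.

```python
def split_index(value):
    """Split a hyphen-separated dual index value into (first, second) halves.

    A single-index value has no hyphen, so the second half is returned empty.
    """
    first, separator, second = value.partition("-")
    return first, (second if separator else "")
```

For example, split_index("ATTACTCG-TATAGCCT") returns ("ATTACTCG", "TATAGCCT"), and split_index("ATTACTCG") returns ("ATTACTCG", "").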
Reads section
Field Name | Description | Required? | Notes
(none) | Master step field/Step UDF Read 1 Cycles | Yes | Index reads are determined by the MOS, based on the indexes on the inputs.
(none) | Master step field/Step UDF Read 2 Cycles | No | Index reads are determined by the MOS, based on the indexes on the inputs.
Each read cycle entry is listed beneath the [Reads] heading, in the first column of the spreadsheet.
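The layout of the Reads section for single-read and paired-end runs can be sketched as below. The sketch assumes that a missing or zero Read 2 Cycles value means a single-read run; this document does not state how the script makes that decision. The function name is hypothetical.

```python
def reads_section_lines(read1_cycles, read2_cycles=None):
    """Return the lines of the Reads section, one cycle entry per line."""
    lines = ["[Reads]", str(read1_cycles)]
    # Assumption: None or 0 for Read 2 Cycles indicates a single-read run.
    if read2_cycles:
        lines.append(str(read2_cycles))
    return lines
```

With the preset values, reads_section_lines(151, 151) yields the heading followed by two entries, while reads_section_lines(151) yields the heading followed by one.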