Lab Instrument Toolkit provides a number of scripts that can be used for custom configuration.
The driver_file_generator script (Template File Generator) is a file-generation solution that produces custom template files without requiring scripting or development knowledge or resources.
The addBlankLines script allows for the creation of files that require a line entry for every well in a container, including those wells that are empty.
The convertToExcel script converts separated-value files (e.g., CSV) to Microsoft Excel XLS or XLSX spreadsheet format.
The parseCSV script allows for the data for each well to be parsed into fields on either derived samples or measurement records that map directly to the derived samples being measured.
The parseXmlBySampleName script matches data in a result file to samples in the LIMS, using the measurement record LIMSID.
The PlacementHelper script automates sample placement according to a transfer file produced by a robot. The script covers a one-to-one, many-to-one (pooling), or one-to-many (replicates) mapping of samples for placement.
Available from: BaseSpace Clarity LIMS v5.1.x
When the Template File Generator generates the file, it creates a log file and attaches it to the step in the LIMS.
If the file generation process encounters errors, these error conditions appear in the log file.
If the file generation process completes successfully, the log file contents resemble the following example:
When troubleshooting Template File Generator issues, you will find detailed information in the Automation Worker log files.
This article describes the metadata, tokens, and special characters that you can include in your custom template files for use with the Template File Generator.
Available from: BaseSpace Clarity LIMS v5.1.x
The following table lists and describes the metadata elements that you can include in your template files.
Unless otherwise specified, metadata elements are optional. In some cases, a metadata element must be used in conjunction with another element. For example, ILLEGAL.CHARACTERS must be used with ILLEGAL.CHARACTER.REPLACEMENTS.
Unless otherwise specified, metadata elements can appear multiple times in the template. However, if they are paired with values, only the first occurrence is used. The other lines are silently ignored.
Unless otherwise specified, if a metadata element requires a single value, any additional values are ignored when the file is generated. For example, suppose you include the OUTPUT.TARGET.DIR <path> metadata in your template file and provide more than one value for <path>. The script will process only the first (valid) path value and will ignore all other values.
Unless "metadata syntax must match exactly" is specified, metadata elements are detected and used even if there is text appended before or after them. For example the following expressions are equivalent:
For more information on metadata and how to use metadata elements in your template files, see the Metadata section of this article.
A token is a placeholder variable that is replaced with unique data at run time. You can include tokens in automation command lines, in scripts, and in template files.
For example, suppose you include the INPUT.CONTAINER.NAME token in a template file generated by a step. At run time, this token is replaced with the name of the container that was input to the step.
All tokens included in a template file must appear in the following form: ${TOKEN}, for example - ${INPUT.CONTAINER.NAME}.
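For example, a single data line in a template might combine several tokens (a minimal sketch using tokens documented below):

${INPUT.NAME},${INPUT.CONTAINER.NAME},${INPUT.CONTAINER.PLACEMENT}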
Special characters used in CSV and template file generation are represented by substitution symbols within templates (see the Special Characters table).
For steps with ResultFile inputs or outputs, refer to the INCLUDE.INPUT.RESULTFILES and INCLUDE.OUTPUT.RESULTFILES entries in the Metadata table.
Token | Description |
INPUT.LIMSID OUTPUT.LIMSID | The LIMS ID of a given input / output |
INPUT.NAME OUTPUT.NAME | The name of a given input / output |
INPUT.CONTAINER.COLUMN OUTPUT.CONTAINER.COLUMN | The column part of the placement of a given input / output in its container |
INPUT.CONTAINER.LIMSID OUTPUT.CONTAINER.LIMSID | The LIMS ID of the container of a given input / output. Also supported in the OUTPUT.FILE.NAME and OUTPUT.TARGET.DIR metadata elements. |
INPUT.CONTAINER.NAME OUTPUT.CONTAINER.NAME | The name of the container of a given input / output. Also supported in the OUTPUT.FILE.NAME and OUTPUT.TARGET.DIR metadata elements. |
INPUT.CONTAINER.PLACEMENT OUTPUT.CONTAINER.PLACEMENT | The placement of a given input / output in its container. Format defined in the <PLACEMENT> segment |
INPUT.CONTAINER.ROW OUTPUT.CONTAINER.ROW | The row part of the placement of a given input / output in its container |
INPUT.CONTAINER.TYPE OUTPUT.CONTAINER.TYPE | The type of container holding a given input / output. Also supported in the OUTPUT.FILE.NAME and OUTPUT.TARGET.DIR metadata elements. |
INPUT.CONTAINER.UDF.<udf name> OUTPUT.CONTAINER.UDF.<udf name> | Get the value of a UDF on the container of a given input / output |
INPUT.REAGENT.CATEGORY OUTPUT.REAGENT.CATEGORY | List of categories of reagent on a given input / output |
INPUT.REAGENT.NAME OUTPUT.REAGENT.NAME | List of reagents on an input / output |
INPUT.REAGENT.SEQUENCE OUTPUT.REAGENT.SEQUENCE | List the sequence of each category of reagent on a given input / output |
INPUT.UDF.<udf name> OUTPUT.UDF.<udf name> | Get the value of a UDF on a given input / output |
INPUT.POOL.NAME | If the current input is a pool, provides its name. Empty if the input is not a pool |
INPUT.POOL.PLACEMENT | If the current input is a pool, provides its placement (not affected by the <PLACEMENT> section) Empty if the input is not a pool. |
INPUT.POOL.UDF.<udf name> | If the current input is a pool, provides one of its UDFs. Empty if the input is not a pool. |
Token | Description |
DATE | Current date (i.e., when the script is run). The default format uses the host's locale setting. |
INDEX | Row number of the data row (in <DATA> segment), starting from 1 |
Substitution Symbol | Character Represented |
ASTERISK | * |
BACKSLASH | \ |
CARET | ^ |
CLOSING_BRACE | } |
CLOSING_BRACKET | ] |
CLOSING_PARENTHESIS | ) |
COMMA | , |
DOLLAR_SIGN | $ |
DOUBLE_QUOTE | " |
OPENING_BRACE | { |
OPENING_BRACKET | [ |
OPENING_PARENTHESIS | ( |
PERIOD | . |
PIPE | | |
PLUS_SIGN | + |
QUESTION_MARK | ? |
SINGLE_QUOTE | ' |
TAB | tab |
Metadata Element | Description |
CONTROL.SAMPLE.DEFAULT.PROJECT.NAME, <project name> | Defines a project name for control samples. The value specified is used to determine the SAMPLE.PROJECT.NAME and SAMPLE.PROJECT.NAME.ALL token values. |
EXCLUDE.CONTROL.TYPES, <control-type name>, <control-type name>, ... | Excludes control inputs that have a control-type URI matching an entry from the exclusion list. |
EXCLUDE.CONTROL.TYPES.ALL | Excludes all control types. Takes precedence over EXCLUDE.CONTROL.TYPES. |
EXCLUDE.INPUT.ANALYTES | Excludes inputs of type Analyte (derived sample) from the generated file. If used without the INCLUDE.INPUT.RESULTFILES element, the generated files will be empty. File generation finishes with a warning message and a warning is logged in the log file. |
EXCLUDE.OUTPUT.ANALYTES | Excludes outputs of type Analyte (derived sample) from the generated file. If used without the INCLUDE.OUTPUT.RESULTFILES element, the generated file(s) will be empty; file generation finishes with a warning message and a warning is logged in the log file. |
GROUP.FILES.BY.<grouping>, <zip file name> | Creates a file for each instance of the specified grouping, i.e., one file per input or per output container. Supported groupings are INPUT.CONTAINERS and OUTPUT.CONTAINERS. The script gathers all files together into one zip file, so only one file placeholder is needed. The metadata may be followed by the name of the zip file that will contain the grouped files; otherwise, the value set by the script parameter is used for the file name. Note that some error scenarios abort file generation. |
HIDE, <token>, <token>, ... IF <case> | Removes lines from the HEADER_BLOCK section and columns from the HEADER and DATA sections when a <token> matches the <case>. |
ILLEGAL.CHARACTERS, <character>, <character>, ... ILLEGAL.CHARACTER.REPLACEMENTS, <replacement>, <replacement>, ... | Specifies characters that must not appear in the generated file, and the characters used to replace them. <character> supports Special Character Mapping; refer to the Special Characters section to determine whether a character must be specified using a keyword. See the example after this table. |
INCLUDE.INPUT.RESULTFILES | Includes inputs of type ResultFile in the generated file. (By default, these are excluded.) In LIMS v5 and later, ResultFile inputs are only supported in the API. |
INCLUDE.OUTPUT.RESULTFILES | Includes outputs of type ResultFile in the generated file. (By default, these are excluded.) |
LIST.SEPARATOR, <separator> | Specifies the character(s) used to separate elements for tokens that return lists (e.g., SAMPLE.PROJECT.NAME.ALL). <separator> supports Special Character Mapping; refer to the Special Characters section to determine whether the character must be provided using a keyword. |
OUTPUT.FILE.NAME, <file name> | Specifies the name for the generated file(s). You can include 'grouping' tokens in the file name, which allows you to create unique file names when generating multiple files; for details, see the Using Token Values in File Names section. |
OUTPUT.FILE.NAME.ILLEGAL.CHARACTER.REPLACEMENT, <replacement> | Specifies the character(s) to use when replacing illegal characters in file names. |
OUTPUT.SEPARATOR, <separator> | Specifies the character(s) used to separate columns in the output. <separator> supports Special Character Mapping; refer to the Special Characters section to determine whether the separator must be provided using a keyword. |
OUTPUT.TARGET.DIR, <path> | Specifies the directory to which the generated file(s) are written. If this metadata element is not provided or is incomplete, the value of the script parameter is used instead. |
PROCESS.POOLED.ARTIFACTS | Includes pools in the generated file as if they were regular input artifacts. |
SCRIPT.VERSION, <major>.<minor>.<patch> | Provides the version of the compatible DriverFileGenerator.jar file. Ensure your NGS version is up to date. |
SORT.BY.${token}, ${token}, ... | Sorts the <DATA> rows based on the ${token} values specified. All tokens must be part of the supported token list; unsupported tokens abort file generation. |
SORT.VERTICAL | Sorts the <DATA> rows based on container column placement. |
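The following is a minimal sketch of a metadata block. The replacement character "_" is an arbitrary choice, and COMMA, TAB, and PIPE are keywords from the Special Characters table:

ILLEGAL.CHARACTERS, COMMA
ILLEGAL.CHARACTER.REPLACEMENTS, _
OUTPUT.SEPARATOR, TAB
LIST.SEPARATOR, PIPE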
Token | Description |
PROCESS.LIMSID | The LIMS ID of the current process. Also supported in the OUTPUT.FILE.NAME and OUTPUT.TARGET.DIR metadata elements. |
PROCESS.NAME | The name of the current process |
PROCESS.UDF.<udf name> | Get the value of a process UDF (on the current step). Also supported in the OUTPUT.FILE.NAME and OUTPUT.TARGET.DIR metadata elements. |
PROCESS.TECHNICIAN | The first and last name of the technician running the current process (see the Upgrade Note on process vs step URIs). Also supported in the OUTPUT.FILE.NAME and OUTPUT.TARGET.DIR metadata elements. |
Token | Description |
SAMPLE.LIMSID | List of all submitted sample LIMS IDs of a given artifact |
SAMPLE.NAME | List of all submitted sample names of a given artifact |
SAMPLE.UDF.<udf name> | Get the value of a UDF on the submitted samples of a given artifact |
SAMPLE.UDT.<udt name>.<udf name> | Get the value of a UDT UDF on the submitted samples of a given artifact |
SAMPLE.PROJECT.CONTACT | List of the project contacts for all submitted samples of a given artifact |
SAMPLE.PROJECT.CONTACT.ALL | List of the project contacts for the submitted samples of all artifacts. Prints all unique project contact names in a line (first name followed by last name), separated by LIST.SEPARATOR. Also supported in the HEADER_BLOCK section. |
SAMPLE.PROJECT.LIMSID | List of the project LIMS IDs for all submitted samples of a given artifact |
SAMPLE.PROJECT.NAME | List of projects for all submitted samples of a given artifact (uses CONTROL.SAMPLE.DEFAULT.PROJECT.NAME) |
SAMPLE.PROJECT.NAME.ALL | List of projects for the submitted samples of all artifacts (uses CONTROL.SAMPLE.DEFAULT.PROJECT.NAME). Prints all unique project names in a line, separated by LIST.SEPARATOR. Also supported in the HEADER_BLOCK section. |
SAMPLE.PROJECT.UDF.<udf name> | Get the value of a UDF for the project containing the submitted samples of a given artifact |
Available from: Clarity LIMS v2.0.5
The -haltOnMissingSample option and support for header section values (e.g., containerName) were introduced in NGS v5.4.0.
Data might sometimes need to be parsed from an instrument result file (CSV, TSV, or other character-separated format) into Clarity LIMS, for the purposes of QC.
For example, suppose that a 96 well plate is run on a Caliper GX. The instrument produces a result file, which the user imports into Clarity LIMS. The per-sample data are parsed and stored for a range of capabilities, such as QC threshold checking, searching, and visibility in the Clarity LIMS interface.
The parseCSV script allows for the data for each well to be parsed into fields on either derived samples or result files (measurement records) that map directly to the derived samples being measured.
If the instrument result file contains data that applies to the batch of derived samples being measured, these data are stored in fields on the step.
The parseCSV script automates parsing a separated-value file, configurable but typically comma- or tab-separated, into the LIMS.
Data lines in the file are matched to the corresponding sample in the LIMS using well placement information.
A line that references well A1 of container Plate123 will have its parsed data mapped to the sample placed in well position A:1 of container Plate123 in the LIMS.
Values from the file are mapped to fields (known as UDFs in the API) in Clarity LIMS based on the automation configuration for the script.
Configure the step to invoke the script manually via a button on the Record Details screen.
Before pressing the button that invokes the script, upload a shared result file to be parsed.
Configure the automation command line to match the destination fields configured in Clarity LIMS.
Create a field for each column that will be brought into the LIMS. Field names must not contain the separator used for the automation parameter string, "::".
When using NGS v5.0 or later, fields can be configured for the step, input samples, output samples, or output result files. Versions before this release support only output result files.
Input result files are not supported.
Example 1
This example uses matching Strategy 1 for a comma-separated file and maps two columns, "Region[100–1000] Conc. (ng/ul)" and "Region[100–1000] Size at Maximum [BP]", to output resultfile fields "Concentration" and "Size (bp)" in the LIMS, respectively:
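A command line for this example might resemble the following sketch. The jar path and script: invocation are assumptions based on a typical NGS package installation, and the comma between mapping pairs is an assumption; the parameter names are those documented in the parameter table for this script, "::" is the documented mapping separator, and {username}, {password}, {processURI:v2}, and {compoundOutputFileLuid...} are standard automation tokens. The 'Plate Name' and 'Well Label' column headers are hypothetical:

bash -c "java -jar /opt/gls/clarity/extensions/ngs-common/v5/EPP/ngs-extensions.jar script:parseCSV \
 -u {username} -p {password} -i {processURI:v2} \
 -inputFile {compoundOutputFileLuid0} -log {compoundOutputFileLuid1} \
 -containerName 'Plate Name' -wellPosition 'Well Label' \
 -measurementUDFMap 'Region[100–1000] Conc. (ng/ul)::Concentration,Region[100–1000] Size at Maximum [BP]::Size (bp)'"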
Example 2
This example uses matching Strategy 2 for a tab-separated file, running in relaxed mode. It maps a column to an input sample field using the input samples' placement information, and maps a header section row to a protocol step field:
Other scripts you may find useful are as follows.
Available from: BaseSpace Clarity LIMS v3.0.0
Some instruments, such as Bioanalyzer and the ABI 7900HT, require input files that include a line for every well in a container.
Currently, files created with the sample input sheet generator can only contain data lines for wells that are in use; they cannot include lines for empty wells. This extension script processes a file in a supported format to add customizable lines for the empty container wells.
The addBlankLines script allows for the creation of files that require a line entry for every well in a container, including those wells that are empty.
To accomplish this, the script takes in an existing file, created with sample input sheet generator, and processes it to add the new lines.
The script reads the file and separates it into header, data, and footer sections.
Using the API, the script obtains the container size, its unavailable wells, and the placement of each sample.
Using this information, the script runs through each possible placement and builds a full set of data. This includes a line for each empty well, in addition to all original lines of data. If desired, a line for each unavailable well may also be included using the script parameters described below.
The file is then rewritten with the header, full data, and footer.
The addBlankLines script overwrites the input file provided, replacing it with the new file that includes the additional container line entries.
Placement information is written using a numeric index that is converted from the LIMS well placement. The line information that is inserted for an empty well is provided to the script; this is controlled by the blankLineString parameter, described below.
The logic that determines whether the script uses input samples and containers or output samples and containers is as follows:
By default, the script uses outputs.
If there are no outputs, or if the optional parameter -forceInputs true is configured, the script uses inputs.
The script operates with the following constraints:
The input file and log file to update must exist locally.
Only supports TAB and COMMA as separators in the file.
Only supports processing one container at a time.
Can only use the first column of the data and requires either well placement or LIMS ID as values in this column.
Indexing is always done from left to right, top to bottom.
The well placement format in the original file must be alphanumeric (e.g., A1) or colon-separated (e.g., 1:1).
As designed, the script supports processing one container only. As such, any step on which it is run must use either a single input container or a single output container, depending on which type of container you expect the script to use (see forceInputs).
The script should be configured to run after the creation of a file occurs - either within the same automation call or as a separate automation triggered after the call that generated the file.
It can also be configured standalone for files that are locally accessible.
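For reference, an automation command line might resemble the following sketch. The jar path and script: invocation are assumptions based on a typical NGS package installation; the parameters are those documented in the parameter table for this script, and the 'EMPTY,,' blank-line string is an arbitrary illustration:

bash -c "java -jar /opt/gls/clarity/extensions/ngs-common/v5/EPP/ngs-extensions.jar script:addBlankLines \
 -i {stepURI:v2} -u {username} -p {password} \
 -f {compoundOutputFileLuid0} -l {compoundOutputFileLuid1} \
 -h 1 -sep COMMA -b 'EMPTY,,' -c PLACEMENT"

Here the file has one header line, is comma-separated, inserts the literal line 'EMPTY,,' for each blank well, and matches rows by well placement.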
Available from: BaseSpace Clarity LIMS v4.2.x
The Template File Generator is a file-generation solution that allows Clarity LIMS admins, such as lab managers, to produce custom template files without requiring scripting or development knowledge or resources.
At run time, the Template File Generator uses a script (driver_file_generator) and the supplied template file to generate a file. This may be a simple file that includes a subset of LIMS data, or a more complex sample sheet file for upload to the sequencing instrument to start a run.
A template file is typically a comma-delimited CSV file. However, the following file formats are also supported: .bak, .groovy, .md5, .tsv, .txt, .xml.
In Clarity LIMS:
An automation is configured and enabled on a step.
The driver_file_generator script is triggered from the automation command line.
The script uses the template file to generate a file, the contents of which are based on the specifications provided in the template.
The script extracts data from the LIMS via the API, based on tokens defined within the template file.
The script parses the template file and processes the sections, metadata, and tokens it contains. Sections may include header block, header, data, and footer, each of which is enclosed inside tags.
For example:
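A minimal template sketch is shown below. The Run Name process UDF is a hypothetical field; the section tags and tokens are described in the Template File Contents article:

<HEADER_BLOCK>
Run Name,${PROCESS.UDF.Run Name}
Date,${DATE}
</HEADER_BLOCK>
<HEADER>
Sample,Container,Well
</HEADER>
<DATA>
${INPUT.NAME},${INPUT.CONTAINER.NAME},${INPUT.CONTAINER.PLACEMENT}
</DATA>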
At least one mapping parameter must be provided to map data from the file to the LIMS. The details of how each of these parameters affects the behavior of the script are described in the parameter section below.
To view an out-of-the-box example included in the NGS package, review the configuration of the NanoDrop QC protocol steps included in the Initial DNA and RNA QC protocols.
Supported placement formats are alphanumeric (e.g., A1), or colon-separated (e.g., 1:1). Placements that are numeric only (e.g., 1 1) are not supported.
For details on template sections and the tokens you can use in your template files, see the Template File Contents article.
Several special options are available for inclusion in the template. These options do not directly pull data from the LIMS API. Instead, they modify what has been gathered in the template file. For details, refer to the TOKEN FORMAT section of that article.
Parameter | Description |
-i {URI} | LIMS step URI (Required) |
-u {username} | The LIMS username (Required) |
-p {password} | The LIMS password (Required) |
-f {inputFileName} | The input file name (Required) |
-l {logFileName} | The log file name (Required) |
-h {headerRows} | The number of lines in the header (Required) |
-sep {separator} | The separator to use for the file; COMMA and TAB are supported (Required). See Parameters Details. |
-b {blankLineString} | The string to be used for blank lines (Required). See Parameters Details. |
-c {columnType} | The data type specified in the first column of the data; LIMSID and PLACEMENT are supported (Required). See Parameters Details. |
-pre {indexPrefix} | The prefix to be included before the index of each well in the file output (Optional). See Parameters Details. |
-forceInputs {forceInputs} | Provide as "true" to force the script to use the input samples and container when processing, even if output samples and containers exist (Optional). See Parameters Details. |
-countUnavailable | Provide as "true" to include unavailable wells in the placement index count (Optional). See Parameters Details. |
-addUnavailable | Provide as "true" to include unavailable wells as entries in the updated file (Optional). See Parameters Details. |
Parameter | Description |
-u {user} | LIMS username (Required) |
-p {password} | LIMS password (Required) |
-i {URI} | LIMS process URI (Required) |
-inputFile {result file} | Instrument result file to be parsed (Required) |
-log {log file name} | Log file name (Required) |
-containerName {container name} | Name of column header for container name |
-wellPosition {well position} | Name of column header for well position |
-sampleLocation {sample location} | Name of column header for <container_name>_<well> |
-measurementUDFMap {measurement UDF map} |
-partialMatchUDFMap {partial match UDF map} |
-processUDFMap {process UDF map} |
-headerRow {header row} | Numeric index of CSV header row, starting from one (default 1) |
-separator {separator} | File separator; comma used by default if not otherwise specified (default comma) |
-matchOutput {boolean} |
-setOutput {boolean} |
-relaxed {boolean} |
-haltOnMissingSample {boolean} |
Compatibility: Clarity LIMS v2.5 or later, and v3.0 or later
The sample placement helper tool allows for automated sample placement and validation in Clarity LIMS. It consists of the following scripts, each of which handles different use cases.
Clarity LIMS supports renaming of destination containers in a step in two ways:
Renaming the containers on the Placement screen.
Renaming the containers on the Record Details screen.
Often, the container name will be the barcode of the physical container (plate, chip, and so on) in the lab, which is entered into the system using a barcode scanner. However, it's possible for the barcodes to be small or close together and for the scanner to have a wide scanning range.
The purpose of this script is to validate that the same barcode has not been scanned twice during container renaming.
The script validates the names of the destination containers in the step, and reports if any duplicates are found.
The script may be run on any step that includes sample placement, and may be run:
Manually from the Record Details screen OR
Automatically on transition at any point after placement has been done (for example, on exit from the Placement screen).
The script may be configured to do the following:
Report either a warning or failure if duplicate container names are detected.
Reset the destination container names in the step to the default value of the container LIMSID.
Trigger Mode
The trigger mode must be provided when configuring the script to be triggered manually.
This mode determines how the script reports the outcome of its validation to the user, which uses a different underlying mechanism than when the script is run on screen transition.
Error Mode
The errMode parameter accepts values of "warn" and "fail."
When in "warn" mode, you are still able to proceed in the step if duplicate destination container names are detected.
If set to "fail," you will not be able to continue as long as duplicates are present.
Reset Mode
Typically, a duplicate name is caused by an accidental repeated scan of a barcode.
Reset mode may be used to automatically reset the destination container names in the step if duplicates are detected.
This option resets all destination container names in the step to their container LIMSIDs, which is the default value used for container names in the LIMS.
Example 1: Validate container names on Record Details
In this example, the script is triggered manually on the Record Details screen. It will warn the user if duplicates are detected, and the destination container names will not be changed.
Example 2: Validate Container Names on Transition
In this example, the script is configured to run automatically on screen transition. It will fail if duplicates are detected, preventing the user from advancing in the step, and the destination container names will be reset to the container LIMSID.
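For reference, an automation string for such a configuration might resemble the following sketch. Only the errMode values are documented here; the jar path, script: invocation, and flag forms are assumptions and may differ in your package version:

bash -c "java -jar /opt/gls/clarity/extensions/ngs-common/v5/EPP/ngs-extensions.jar script:validateContainerNames \
 -i {stepURI:v2} -u {username} -p {password} -errMode fail"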
Some instruments or workflows require specific placement patterns based on the container type. Clarity LIMS may need to place samples in an unusual pattern, such as every second well in a plate. To handle this, the ability to specify new patterns based on specific types of containers is required.
The purpose of the Placement by Pattern File script is to automate sample placement according to a specific pattern based on container type. The script reads the pattern in from a file, allowing for easy customization.
There are two script modes:
Source sample index placement: This is the default mode. It assigns an index to each input sample and its replicates, and then transfers them to a particular well in the output container. This mode allows for cherry-picking of samples.
Source well placement: This mode uses the well position of an input sample to transfer it to a certain well position in the output container. In this mode, samples from one well will always be transferred to the same well in the output container.
Pattern files are stored in a directory that is passed as a parameter to the script.
Files are automatically selected based on the mode of the script and the container names:
Source sample index placement is toggled with the useIndexMode parameter. When using this mode, the file name must contain "IndexedTransfer" and the selected container type.
When using source well placement, the file name must contain the input container and selected container types of the protocol step.
For example, suppose that there are samples in a 96 well plate that are to be transferred into several 12x1 BeadChips.
If source well placement is used, the pattern file would be named 96 well plate_12x1 HD BeadChip.tsv.
If source sample index placement is used, the pattern file would be named IndexedTransfer_12x1 HD BeadChip.tsv (for information on the format of the pattern files, see the Pattern File Format section).
Both script modes support multiple input containers of the same type and multiple output containers of the same type. Only source sample index placement supports multiple types of input containers.
All input samples are sorted by container LIMS ID to ensure accurate placement. If useIndexMode is set to "true", samples are further sorted to assign indices: after being sorted by container LIMS ID, they are sorted based on input sample well position. The sortOrder parameter specifies whether indexed samples are sorted by row or by column with respect to well position, and is only used with source sample index placement. Input samples are indexed from 1, and all replicates of a particular sample are assigned the same index.
Output containers are created and given temporary names of "Plate 1," "Plate 2," "Plate 3"... These can then be updated as desired, for example, by scanning in the barcode of the container. This temporary naming is done for ease-of-use, to make sure that the visual order in the interface is correct, making it easier to confirm that the placement pattern was followed.
It is possible to configure the script to fail or produce a warning message if the number of samples does not match the contents of the pattern file. There are three parameters to enforce the number of samples and replicates in the step: replicatesMode, minSamplesMode, and maxSamplesMode.
For example, if replicatesMode is set to "warn" and maxSamplesMode is set to "fail", then the script will produce a warning after placement occurs if the number of replicates is inaccurate and it will fail if there are more samples in the step than specified in the pattern file.
The script may be configured on any process that includes sample placement:
Configure this script to run automatically on entry to the Sample Placement screen.
Before using the script, confirm that the desired pattern files are correctly configured and are located in the correct location with the appropriate parameter provided. (See Script Parameters and Usage.)
Example 1
This example uses source well placement and a pattern file placed in: /opt/gls/clarity/extensions/conf/infinium/placementpatterns
Example 2
This example uses source sample index placement. Replicates mode fails the script if there is an inaccurate number of replicates, minSamplesMode warns the user that there are fewer samples than specified in the pattern file, and maxSamplesMode fails the script if there are more samples than specified in the pattern file. The pattern file is located in: /opt/gls/clarity/extensions/quantstudio/conf/placementpatterns
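An automation string for this example might resemble the following sketch. The mode parameters (useIndexMode, replicatesMode, minSamplesMode, maxSamplesMode) are documented above; the jar path, script: invocation, and the pattern directory parameter name are assumptions:

bash -c "java -jar /opt/gls/clarity/extensions/ngs-common/v5/EPP/ngs-extensions.jar script:placementByPatternFile \
 -i {stepURI:v2} -u {username} -p {password} \
 -patternFileDirectory /opt/gls/clarity/extensions/quantstudio/conf/placementpatterns \
 -useIndexMode true -replicatesMode fail -minSamplesMode warn -maxSamplesMode fail"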
The location of the pattern file is customized using the appropriate parameter, as described in Script Parameters and Usage.
When source well placement is used, the exact file to use is selected based on the names of the input container type and destination container type:
The name must contain an exact match for each of the container types (e.g., "384 well plate_96 well plate.tsv").
Container type names must be separated from other text by underscores; any other text in the name is ignored. For example, "{type}_{type}.tsv" and "{type}_{type}_transfer file.tsv" are both valid pattern file names.
For a transfer from one container to another of the same type, "{type}.tsv" is a valid pattern file name. However, the script currently uses the first file it finds with a name containing the source and destination container types. If both "{type1}.tsv" and "{type1}_{type2}.tsv" are found in the same location, the script may not accurately detect which should be used for a protocol step that has input and output containers of type "type1." This can be avoided by using the long form for naming, i.e., "{type1}_{type1}.tsv".
When source sample index placement is used, the pattern file name is slightly different:
The name must contain "IndexedTransfer" along with an exact match for the destination container type (eg "IndexedTransfer_96 well plate.tsv")
As with source well placement, container type names must be separated from other text by underscores; any other text in the name is ignored.
The pattern file is a tab-separated file (.tsv) with three columns and a header row; the required headers depend on the script mode. For source well placement, SRC_WELL, DEST_CONTAINER_INDEX, and DEST_WELL are the required headers. When useIndexMode is enabled, the pattern file must instead contain SRC_SAMPLE_INDEX, DEST_CONTAINER_INDEX, and DEST_WELL.
The following table describes the four possible columns:
Part of an example pattern file for source well placement is shown below:
An example of a pattern file for placement via source sample index is shown below. It expects samples to have eight replicates each.
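The following sketches illustrate both formats using the documented headers. The well positions, indexes, and the row:column placement format are illustrative assumptions. A source well placement file maps wells to wells (tab-separated):

SRC_WELL	DEST_CONTAINER_INDEX	DEST_WELL
A:1	1	A:1
B:1	1	C:1

A source sample index placement file maps sample indexes to wells; with eight replicates per sample, index 1 would appear on eight lines:

SRC_SAMPLE_INDEX	DEST_CONTAINER_INDEX	DEST_WELL
1	1	A:1
1	1	B:1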
Some labs require support for a robot-driven sample placement scenario, in which sample placement is performed on the robot and is then recorded in the LIMS, without requiring manual entry of the sample placement.
The purpose of the Placement by Transfer File script is to automate sample placement according to a transfer file produced by a robot. The script covers a one-to-one, many-to-one (pooling), or one-to-many (replicates) mapping of samples for placement. Support for reagents is planned for a future release.
The script is available as of Clarity LIMS v3.0. Replicate support added in NGS Extensions Package v5.3.1 (Placement Helper v2.1.0). Pooling support added in NGS Extensions Package v5.4.0 (Placement Helper v2.2.0) and requires LIMS v3.1.
The robot produces a transfer file (worksheet, work list file, and so on) that contains information about which samples were used and their source and destination locations. Upload this file to the protocol Step Setup screen in the LIMS.
The Placement by Transfer File script automatically performs the same placement as the robot, recording the work that has been done in the lab without requiring manual entry of this information.
The script looks at the protocol step inputs and matches them to the source information in the transfer file, then uses the destination information to place the protocol step outputs. On pooling steps, the script performs the intermediary step of creating the pools. After the pools are created, you are given the opportunity to verify their contents before the script does placements for the pools.
(For information on the format of the transfer files, see the Transfer File Format section.)
The script first searches for containers in the LIMS whose names match the destination container names provided in the transfer file; if exactly one container matches a given destination container name, it is used. Otherwise, the containers are created using the destination container type, or the selected container type if type information is not provided in the transfer file. The only case where the script proceeds after finding multiple containers in the LIMS matching a destination container name is when one of those containers is the selected container for the protocol step.
The sample name column is used for validation. If the sample found in the LIMS that matches the input placement information does not have the same name, an exception will be thrown by the script.
The script may be configured on any protocol step that includes sample placement.
Configure this script to run automatically on entry to the Sample Placement screen. (See Script Parameters and Usage.)
If you are pooling samples, the script must also be configured to run on entry to the Pool Samples screen. The configuration strings would be the same for both automatic EPP triggers.
The Step Setup screen should also be enabled in the protocol step configuration, to allow for attachment of the transfer file.
Example 1
This example uses the minimum configuration for the script:
Example 2
This example provides some of the optional parameters. The script will parse a tab-separated file that has its header on the third line, with additional validation on sample name and destination container type:
Currently the script supports comma- and tab-separated formats with a single-line header followed by data rows for the transfer information.
The minimum information required in the transfer file is a column for each of the following:
source container
source well location
destination container
destination well location
For additional validation, it is possible to specify columns to use for the following:
sample name
destination container type
The contents of the transfer file must correspond to the number of sample outputs per input configured for the step. In the transfer file, replicates are represented as multiple lines with the same source container and well, but different destination containers or wells. A line must appear for each expected output, including replicates, or else an exception will be thrown by the script and placement will fail.
To pool multiple inputs together, make sure that the corresponding inputs have the same destination container and well. If the step is configured for pooling, the pools are created when the script is triggered on entry to the pooling screen. If the step is not configured for pooling, such a transfer file will cause the script to throw an error.
An example from a Hamilton robot (.tsv, tab-separated file) is shown below:
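A sketch of such a file is shown below (tab-separated). The column header names are illustrative only, since the actual headers are specified via the script's column parameters:

SourceContainer	SourceWell	DestinationContainer	DestinationWell	SampleName
Plate123	A1	DestPlate1	A1	Sample-01
Plate123	B1	DestPlate1	B1	Sample-02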
Placements separated by a colon (e.g., 1:1) or that are alphanumeric (e.g., A1) are valid. Placements that are numeric only (e.g., 11) are not supported.
If the destination container type information is not supplied, the selected container type for the protocol step will be used to place the samples.
Some other scripts you may find useful:
Available from: BaseSpace Clarity LIMS v2.1.0
Data can often be parsed from an instrument result file in XML format into Clarity LIMS, for the purposes of QC.
For example, suppose a TapeStation instrument run is performed. The run produces an XML result file, which the user imports into the LIMS. The file includes information of interest for each sample, which should be parsed and stored for a range of capabilities, such as QC threshold checking, searching, and visibility in the LIMS interface.
The XmlSampleNameParser tool allows for sample data to be parsed into UDFs on result files (measurement records) that map directly to the derived samples being measured.
The XmlSampleNameParser tool is installed as a standalone jar file as part of the NGS Extensions Package. Currently it contains one script, parseXmlBySampleName.
Provided the result file is in XML format, this script can be used to match data in the file to samples in the LIMS using the measurement record LIMSID.
Values are mapped to UDFs in the LIMS using a configuration file that contains XPath mappings for the result file. (External resources, such as w3schools, can be used to learn more about XPath, and many XML viewing tools will generate it automatically for elements of interest.)
The format for the data needed to make the association between the file contents and the sample in the LIMS is: LIMSID_NAME.
The name is optional and is supported for readability; it may come from the input sample on which the step is being run.
The LIMSID must come from the output result file, which is also where the parsed information will be stored in UDFs.
Typically, it is ideal to set up the instrument run with the sample and result file information so that it appears in the same format in the XML result file. To automate setup, you can use a tool such as the Template File Generator.
The LIMSID_NAME can be provided to the instrument as the sample name, or as a comment or other field on the sample. The only conditions are that:
The sample field that you want to use for the LIMSID_NAME must be passed into the result file (e.g., via a driver file).
The configuration file must be set up such that it can access this field from the correct location. (See Configuration File Format.)
The parseXmlBySampleName script uses the following parameters, all of which are required:
Example
This example shows the script run on a manually imported TapeStation XML file that has been attached to the TapeStation DNA QC process.
The process type for the steps on which information will be tracked must be configured with the following output generation:
1x fixed ResultFile output per input
2x fixed ResultFile outputs applied to all inputs
Shared output naming pattern example: {LIST:Instrument Result XML (required),XML Parsing Log}
This represents the minimum configuration. Additional shared output files may be added as required.
For each piece of information that will be parsed from the XML file and stored on the step outputs, configure desired UDFs on ResultFile and associate them with the per-input output result files for the process type.
The configuration file should be produced as a .groovy file and stored in the /opt/gls/clarity/customextensions directory. Its format allows for four types of entries:
baseSampleXPath
sampleNameXPath
process.run.UDF."UDF name"
process.output.UDF."UDF name"
The examples provided here use XPath for a TapeStation XML result file.
baseSampleXPath
Provide this one time.
This XPath indicates the list of samples and the associated sample information, relative to the root of the XML file. Specific sample information will be retrieved relative to this path.
sampleNameXPath
Provide this one time.
This XPath indicates where the LIMS sample association information (LIMSID_NAME) can be found, relative to the sample list indicated by baseSampleXPath. Often this will be stored as the sample name or in a comment field for the sample.
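For example, these two entries might resemble the following sketch, assuming Groovy ConfigSlurper-style assignments; the XPaths are illustrative, not the actual TapeStation schema:

baseSampleXPath = "/File/Samples/Sample"
sampleNameXPath = "Comment"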
process.run.UDF."UDF name"
May be provided multiple times.
Indicates information that is tracked for the entire run, and not on individual samples.
Typically, this will be XPath relative to the root of the XML file, as shown.
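A sketch of such entries is shown below. The UDF names match those listed after this example; the XPaths are illustrative assumptions:

process.run.UDF."Conc. Units" = "/File/Assay/Units/ConcentrationUnit"
process.run.UDF."Molarity Units" = "/File/Assay/Units/MolarityUnit"
process.run.UDF."MW Units" = "/File/Assay/Units/MolecularWeightUnit"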
The destination result file UDF name is specified as part of the entry name and must match the UDF name in the LIMS exactly.
In the example above, three values will be parsed into the LIMS from the XML file, represented on each individual measurement record (result file) output:
Conc. Units
Molarity Units
MW Units
process.output.UDF."UDF name"
May be provided multiple times.
Indicates information that is tracked on individual samples.
Typically, this will be XPath relative to the sample XPath (baseSampleXPath), as shown.
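A sketch of such entries is shown below. The UDF names match those listed after this example; the XPaths, relative to baseSampleXPath, are illustrative assumptions:

process.output.UDF."Concentration" = "Concentration"
process.output.UDF."Region 1 Average Size - bp" = "Regions/Region[1]/AverageSize"
process.output.UDF."Region 1 Conc." = "Regions/Region[1]/Concentration"
process.output.UDF."Peak 1 MW" = "Peaks/Peak[1]/MolecularWeight"
process.output.UDF."Peak 1 Conc." = "Peaks/Peak[1]/Concentration"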
The destination result file UDF name is specified as part of the entry name and must match the UDF name in the LIMS exactly.
In the example above, five values will be parsed into the LIMS from the XML file, represented on each individual measurement record (result file) output:
Concentration
Region 1 Average Size - bp
Region 1 Conc.
Peak 1 MW
Peak 1 Conc.
Some other scripts you may find useful:
Available from: BaseSpace Clarity LIMS v4.2.x
You can create template files that the Template File Generator script (driver_file_generator) will use to generate custom files for use in your lab.
This article provides details on the following:
The parameters used by the script.
The sections of the template file—these define what is output in the generated file.
Sorting logic—options for sorting the data in the generated file.
Rules and constraints to keep in mind when creating templates and generating files.
Examples of how you can use specific tokens and metadata in your template files.
For a complete list of the metadata elements and tokens that you can include in a template file, see Template File Contents.
Upgrade Note: process vs step URIs The driver_file_generator script now uses steps instead of processes for fetching information. When a process URI is supplied, the script detects it and automatically switches it to a step URI. (The PROCESS.TECHNICIAN token, which is only available on 'process' in the API, is still supported.) The behavior of the script has not changed, except that the long form -processURI parameter must be replaced by -stepURI in configuration. The -i version of this parameter remains supported and now accepts both process and step URI values. If your configuration is using -processURI or --processURI, replace each instance with -i (or -stepURI/--stepURI).
The following table defines the parameters used by the driver_file_generator script.
Command-line example:
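A typical invocation might resemble the following sketch. The jar path and the -t/-o/-l flag forms are assumptions; check the parameter table for your package version:

bash -c "java -jar /opt/gls/clarity/extensions/ngs-common/v5/EPP/DriverFileGenerator.jar script:driver_file_generator \
 -u {username} -p {password} -i {stepURI:v2} \
 -t /opt/gls/clarity/extensions/conf/driverfiletemplates/MyTemplate.csv \
 -o {compoundOutputFileLuid0}.csv -l {compoundOutputFileLuid1}"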
Command-line example using -quickAttach and -destLIMSID:
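The same sketch, with the generated file attached to a placeholder. The quickAttach and destLIMSID parameters are documented in this article; the flag forms shown are assumptions:

bash -c "java -jar /opt/gls/clarity/extensions/ngs-common/v5/EPP/DriverFileGenerator.jar script:driver_file_generator \
 -u {username} -p {password} -i {stepURI:v2} \
 -t /opt/gls/clarity/extensions/conf/driverfiletemplates/MyTemplate.csv \
 -o MyOutputFile.csv -l {compoundOutputFileLuid1} \
 -quickAttach true -destLIMSID {compoundOutputFileLuid0}"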
See also the Examples section.
The input-output-maps of the step (defined by the -stepURI parameter) are used as the data source for the content of the generated file.
If they are present, input-output-maps with the attribute output-generation-type=PerInput are used. Otherwise, all input-output-map items are used.
By default, the data source entries are sorted alphanumerically by LIMS ID. You can modify the sort order by using the SORT.BY and SORT.VERTICAL metadata elements (see Metadata section of the Template File Contents article).
The content of the generated file is determined by the sections defined in the template. Content for each section is contained within XML-like opening and closing tags that are structured as follows:
Most template files follow the same basic structure and include some or all of the following sections (by convention, section names are written in capital letters, but this is not required):
The order of the section blocks in the template does not affect the output. In the output file, blocks will always be in the order shown.
The area outside of the sections can contain metadata elements (see Metadata section of the Template File Contents article). Anything else outside of the section tags is ignored.
The <PLACEMENT> and <TOKEN FORMAT> sections are not part of the list and do not create distinct sections in the generated file. Instead, they alter the formatting of the generated output.
The header block section may include both plain text and data from the LIMS. It consists of information that does not appear multiple times in the generated file (i.e., the information is not included in the data rows; see DATA section).
Tokens in the header block always resolve in the context of the first input and first output available. For example, suppose the INPUT.CONTAINER.TYPE token is used in the header block:
If there is only one type of input container present in the data source, that container type will be present in the output file.
If multiple input container types are present in the data source, only the first one encountered while processing the data will be present in the output file.
For this reason, we recommend against using tokens that will resolve to different values for different samples, such as SAMPLE.NAME. If one of these tokens is encountered, a warning is logged and the first value retrieved from the API is used. (Note that you may use .ALL tokens, where available.)
To include a header block section in a template, enclose it within the <HEADER_BLOCK> and </HEADER_BLOCK> tags.
HIDE feature: If one of the tokens of a line is empty and is part of a HIDE statement, that line will be removed entirely. See Using HIDE to Exclude Empty Columns and Using HIDE to Exclude Empty HEADER rows examples.
The header section describes the header line of the data section (see DATA section). A simple example might be "Sample ID, Placement".
The content of this section can only include plain text and is output as is. Tokens are not supported.
To include a header section in a template, enclose it within the <HEADER> and </HEADER> tags.
HIDE feature: See 'Hide feature' in DATA section. Also note:
If multiple <HEADER> lines are present, at least one must have the same number of columns as the <DATA> template line.
<HEADER> lines that do not match the number of columns are unaffected by the HIDE feature.
Each data source entry creates a data row for each template line in the section. All entries are output for the first template line, then the next template line runs, and so on.
The data section allows tokens and text entries. All tokens are supported.
Note the following:
Duplicated rows are eliminated, if present. A row is considered duplicated if its content (after all variables and placeholders have been replaced with their corresponding values) is identical to a previous row. Tokens must therefore provide distinctive enough data (i.e., something more than just CONTAINER.NAME) if all of the input-output entry pairs are desired in the generated file.
By default, the script processes only sample entries. However, there are metadata options that allow inclusion of result files/measurements and exclusion of samples.
Metadata sorting options are applied to this section of the template file only.
By default, pooled artifacts are treated as a single input artifact. They can be demultiplexed using the PROCESS.POOLED.ARTIFACTS metadata element.
If there is at least one token relevant to the step inputs or outputs, this section will produce a row for each PerInput entry in the step input-output-map. If no PerInput entries are present in the step input-output-map, the script will attempt to add data rows for PerAllInputs entries.
Input and output artifacts are always loaded if a <DATA> section is present in the template file, due to the need to determine what type of artifacts the script is dealing with.
To include a data section in a template, enclose it within the <DATA> and </DATA> tags.
HIDE feature: If the token in a given column is empty for all lines and that token is part of a HIDE statement, that column (including the matching <HEADER> columns) will be removed entirely. There can only be one <DATA> template line present when using the HIDE feature. See Using HIDE to Exclude Empty Columns and Using HIDE to Exclude Empty HEADER rows examples.
The content of this section can only include plain text and is output as is. Tokens are not supported.
To include a footer section in a template, enclose it within the <FOOTER> and </FOOTER> tags.
This section contains groovy code that controls the formatting of PLACEMENT tokens (see the PLACEMENT tokens in Template File Contents article Tokens table).
Within the groovy code, the following variables are available:
Note the following:
The script must return a string, which replaces the corresponding <PLACEMENT> tag in the template.
Logic within the placement tags can be as complex as needed, provided it can be compiled by a groovy compiler.
If an error occurs while running formatting code, the original location value is used.
To include a placement section in a template, enclose it within the <PLACEMENT> and </PLACEMENT> tags.
Placement Example: Container Type
In the following example:
If the container type is a 96 well plate, sample placement A1 will return as "A_1"
If the container type is not a 96 well plate, sample placement A1 will return as "A:1"
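A sketch of such a section is shown below, assuming Groovy variables named containerType and location are available (the variable names are assumptions and may differ in your version):

<PLACEMENT>
// For 96 well plates, replace the row:column separator with an underscore
if (containerType == "96 well plate") {
    return location.replace(":", "_")
}
// All other container types keep the default row:column format
return location
</PLACEMENT>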
Placement Example: Zero Padding
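A sketch that pads the column number to two digits (so A:1 becomes A:01), under the same variable-name assumption:

<PLACEMENT>
// Split "A:1" into its row and column parts, then zero-pad the column
def (row, col) = location.split(":")
return row + ":" + col.padLeft(2, "0")
</PLACEMENT>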
This section defines logic to be applied to specific tokens to change the format in which they appear in the generated file.
Special formatting rules can be defined per token using the following groovy syntax:
Within the groovy code, the variable 'token' refers to the original value being transformed by the formatting code. The logic replaces all instances of that token with the result.
${token.identifier} marks the beginning of the token formatting code and the end of the previous token formatting code (if any).
You can define multiple formatting logic rules for a given token, by assigning a name to the formatting section (named formatters are called 'variations'). This is done by appending “##” after the token name (eg “${token.identifier##formatterName}”).
Using the named formatter syntax without giving a name (“${token.identifier##}”) will abort the file generation.
If an error occurs while running formatting code, the resulting value will be blank.
If a named formatter is used but not defined, the value is used as is.
To include a token format section in a template, enclose it within the <TOKEN_FORMAT> and </TOKEN_FORMAT> tags.
TOKEN FORMAT Example: Technician Name
In this example, a custom format is defined for displaying the name of the technician who ran a process (step).
The name of the token appears at the beginning of the groovy code that will then be applied. In this code, the variable 'token' refers to the token being affected. The return value is what will replace all instances of this token in the file.
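A sketch of such a section is shown below; the specific reformatting (surname in uppercase, placed first) is an arbitrary illustration:

<TOKEN_FORMAT>
${PROCESS.TECHNICIAN}
// 'token' holds the original value, e.g., "Jane Doe"
def names = token.split(" ")
return names[-1].toUpperCase() + ", " + names[0]
</TOKEN_FORMAT>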
TOKEN FORMAT Example: Appending a String to Container Name or Sample Name
In this second example, when special formatting is required for two tokens, the logic for both appears inside the same set of tags.
The example appends a string to the end of the input container name or a prefix to the beginning of the submitted sample name.
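A sketch combining both rules in one set of tags; the "_dest" suffix and "SUB_" prefix are arbitrary illustrations:

<TOKEN_FORMAT>
${INPUT.CONTAINER.NAME}
return token + "_dest"
${SAMPLE.NAME}
return "SUB_" + token
</TOKEN_FORMAT>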
Metadata provides information about the template file that is not retrieved from the API — such as the file output directory to use, and how the data contents should be grouped and sorted.
Metadata is not strictly confined to a section, and is not designated by opening and closing tags. However, each metadata entry must be on a separate line.
Metadata entries can be anywhere in the template, but the recommended best practice is to group them either at the top or the bottom of the file.
For a list of supported metadata elements, rules for using them, and examples, see Template File Contents, Metadata section.
Sorting in the generated file is done either alphanumerically or by vertical placement information, using the SORT.BY. and SORT.VERTICAL metadata elements.
Sorting must be done using a combination of sort keys, provided to SORT.BY. as one or more ${token} values, whose combined values produce a unique value for each row in the file. For example, sorting by just OUTPUT.CONTAINER.NAME would work for samples placed in tubes, but would not work for samples in 96 well plates. Sorting behavior on nonunique combinations is not guaranteed to be predictable.
To sort vertically:
Include the SORT.VERTICAL metadata element in the template file. In addition, the SORT.BY.${token}, ${token} metadata must also be included (see the example after this list).
Any SORT.BY. tokens will be sorted using the vertical sorter instead of the alphanumeric sort.
To apply sorting to samples in 96 well plates:
You could narrow the sort key to a unique combination such as:
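For example, the following combination sorts by container name and then vertically by placement within each output container (both tokens are documented in Template File Contents):

SORT.VERTICAL
SORT.BY.${OUTPUT.CONTAINER.NAME}, ${OUTPUT.CONTAINER.PLACEMENT}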
See also SORT.VERTICAL and SORT.BY. in the Template File Contents article.
The template must adhere to the following rules:
Metadata entries must each appear on a new line and be the only entry on that line.
Metadata entries must not appear inside tags.
Opening and closing section tags must appear on a new line and as the only entry on that line.
Each opened tag must be closed, otherwise it is skipped by the script.
Any sections (opening tag + closing tag combination) can be omitted from the template file.
Entries that are separated by commas in the template will be delimited by the metadata-specified separator (default: COMMA) in the generated file.
White space is allowed in the template. However, if there is a blank line inside a tag, it will also be present in the generated file.
If an entry in the template is enclosed in double quotes, it will be imported as a single entry and written to the generated file as such, even if it contains commas.
To include double quotes or single quotes in the generated file, use the escape character, for example: \" or \'
To include an escape character in the generated file, use two escape characters inside double quotes. For example, if you want to see \\Share\Folder\Filename.txt, use "\\\\Share\\Folder\\Filename.txt" as the token.
If any of the following conditions is not met, the tag, and everything inside it, is ignored by the script and a warning displays in the log file:
Except for the metadata, all template sections must be enclosed inside tags.
Each tag must have its own line, and must be the only tag present on that line.
No other entries, even empty ones, are allowed.
All opened tags must be closed.
Custom field names must not contain periods.
The LIMS provides configuration to support generation of sample sheets that are compatible with some Illumina instruments. For details, see the Illumina Instrument Sample Sheets documentation.
The LIMS provides configured automations that generate sample sheets compatible with a number of QC instruments. The default automation command lines are provided below.
In the template file, the following OUTPUT.FILE.NAME metadata element renames the generated template file 'NewTemplateFileName':
In the automation command line, the following will attach the generated file to the {compoundOutputFileLuid0} placeholder, with the name defined by the OUTPUT.FILE.NAME metadata element.
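For example, the template metadata line would be:

OUTPUT.FILE.NAME, NewTemplateFileName

and the automation command line would then include something like the following (the flag forms are assumptions):

-quickAttach true -destLIMSID {compoundOutputFileLuid0}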
If the quickAttach parameter is provided without the destLIMSID parameter, the script logs an error and stops execution.
If destLIMSID is provided without using quickAttach, it is ignored.
The OUTPUT.FILE.NAME and OUTPUT.TARGET.DIR metadata elements support token values. This allows you to name files based on input / output values of the step - the input or output container name, for example.
The following tokens are supported for this feature:
PROCESS.LIMSID
PROCESS.UDF.<UDF NAME>
PROCESS.TECHNICIAN
DATE
INPUT.CONTAINER.NAME
INPUT.CONTAINER.TYPE
INPUT.CONTAINER.LIMSID
OUTPUT.CONTAINER.NAME
OUTPUT.CONTAINER.TYPE
OUTPUT.CONTAINER.LIMSID
Rules and Constraints
When using token values in file names, the following rules and constraints apply:
Container-related functions will return the value from a single container, even if there are multiple containers.
Other tokens will function, but will only return the value for the first row of the file (first input or output).
If the specified OUTPUT.FILE.NAME does not begin with the LIMSID of the file placeholder, the output file will not be attached in the LIMS user interface. To ensure that the file is attached, include the quickAttach and destLIMSID parameters in the command-line string.
It is highly recommended that you do not use SAMPLE.PROJECT.NAME.ALL or SAMPLE.PROJECT.CONTACT.ALL, because the result is likely to exceed the maximum length of a file name. Similar issues can occur with other SAMPLE tokens when dealing with pools.
Only the following characters are supported in the file name. Any other character is replaced by an _ (underscore) by default. This replacement character can be configured with the OUTPUT.FILE.NAME.ILLEGAL.CHARACTER.REPLACEMENT metadata element (see the example after this list).
a-z
A-Z
0-9
_ (underscore)
- (dash)
. (period)
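For example, to replace unsupported characters with a dash instead of the default underscore (a sketch, assuming the comma-separated metadata syntax):

    OUTPUT.FILE.NAME.ILLEGAL.CHARACTER.REPLACEMENT, -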
You can use the CONTROL.SAMPLE.DEFAULT.PROJECT.NAME metadata element to define a project name for control samples. The value specified by this element is used when determining values for the SAMPLE.PROJECT.NAME and SAMPLE.PROJECT.NAME.ALL tokens.
Example:
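A sketch, assuming the comma-separated metadata syntax; the project name value is illustrative:

    CONTROL.SAMPLE.DEFAULT.PROJECT.NAME, ControlSamples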
Rules and Constraints
If the element is present in the template but has no value, no project name is given to control samples.
If the element is not present in the template, no project name is given to control samples.
If multiple values are provided, the first one will be used.
The SAMPLE.PROJECT.NAME.ALL list will include the control project name.
You can use the HIDE metadata element to optionally hide a column if it contains no data. The following line in the metadata will hide a data column when empty:
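(A sketch, assuming the comma-separated metadata syntax.)

    HIDE, ${OUTPUT.UDF.SAMPLEUDF}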
Assuming ${OUTPUT.UDF.SAMPLEUDF} is one of the data columns specified in the template, then that column will be hidden whenever there is no data to show in the output file. If a list of fields is provided, then any empty ones will be hidden:
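(A sketch; the CONCENTRATION UDF name is hypothetical, and the comma-separated list syntax is assumed.)

    HIDE, ${OUTPUT.UDF.SAMPLEUDF}, ${OUTPUT.UDF.CONCENTRATION}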
You may also hide only one representation of a specific column or field:
You can also use the HIDE metadata element with tokens in the header section. If one or more tokens are used for a header key value pair, and there are no values for any of the tokens, the entire row will be hidden.
Assuming ${OUTPUT.UDF.SAMPLEUDF} is one of the rows specified in the template header section, that header row will be hidden whenever there is no data to display in the output file.
If a list of tokens is provided for the value, the row will only be shown if one or more of the tokens resolves to a value:
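(A sketch showing both the HIDE line and a header row; the header key "Sample Info" and the CONCENTRATION UDF are hypothetical, and the header section tags are omitted.)

    HIDE, ${OUTPUT.UDF.SAMPLEUDF}, ${OUTPUT.UDF.CONCENTRATION}
    Sample Info, ${OUTPUT.UDF.SAMPLEUDF} ${OUTPUT.UDF.CONCENTRATION}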
If you would like to generate multiple files, you can use the following GROUP.FILES.BY metadata elements:
GROUP.FILES.BY.INPUT.CONTAINERS
GROUP.FILES.BY.OUTPUT.CONTAINERS
These elements allow a file to be created per instance of the specified element in the step, for example, one file per input or per output container. Step-level information appears in all files, but sample information is specific to the samples in the given container.
For example, suppose that a step has two samples, each in its own container, and a template file calling for information about process UDFs and sample names. Using this metadata will produce two files, each of which will contain:
One sample entry
The same process UDF information
As a best practice, we recommend storing a copy of generated files in the LIMS. To do this, you must use the quickAttach script parameter. This parameter must be used with the destLIMSID parameter, which tells the Template File Generator script which file placeholder to use. (For details, see Script Parameters.)
Naming the Files
When generating multiple files, the script gathers them all into one zip file so only one file placeholder is needed regardless of how many containers are in the step.
The zip file name may be provided in the metadata as follows:
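For example, the following line (a sketch, assuming the comma-separated metadata syntax, with the zip name given as the element's value) names the zip file MyZip.zip:

    GROUP.FILES.BY.OUTPUT.CONTAINERS, MyZip.zip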
Inside the zip file, any paths specified for where files should be written are preserved. An example final structure inside the zip, where the subfolders are specified using the container name token, could be as follows:
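(Container names and the file name are illustrative.)

    MyZip.zip
        Container1/SampleSheet.csv
        Container2/SampleSheet.csv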
The file naming, writing, and uploading process works as follows:
The outputPath parameter is required by the script. You can use this parameter to specify the path to which the generated files will be written and/or the name to use for the file. Use this in the following scenarios:
When the target path/name is constant, OR
When the target path/name includes something that can only be passed to the script via the command line (for example, the value of a {compoundOutputFileLuidN} in the path).
The OUTPUT.TARGET.DIR metadata element overrides any path provided by outputPath, but does not change the file name. Use this:
When the target path includes something that can only be accessed with token templates (for example, the name of the user who ran the step).
The OUTPUT.FILE.NAME metadata element overrides any value provided by outputPath entirely. This element determines the name of the files that are produced for each container (for example, SampleSheet.csv). It may contain tokens to access information, such as the container name, and it may also contain a path.
If you provide all three of outputPath, OUTPUT.TARGET.DIR, and OUTPUT.FILE.NAME, outputPath is ignored and the path specified by OUTPUT.TARGET.DIR is used as the parent directory under which OUTPUT.FILE.NAME is created, even if OUTPUT.FILE.NAME includes a path in addition to the file name.
If you only want to attach files to placeholders in the LIMS and do not want to write anything to disk, omit OUTPUT.TARGET.DIR and provide the outputPath parameter value as ".". Files are then written only to a temporary directory, which is cleaned up after the automation completes.
To produce the example of MyZip.zip, you could use the following:
Script parameters:
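(A sketch; the template path and placeholder indices are illustrative, and parameter names follow the Script Parameters table later in this article.)

    -i {stepURI:v2} -u {username} -p {password} \
    -t /opt/gls/clarity/customextensions/MyZipTemplate.csv \
    -o . -q true -destLIMSID {compoundOutputFileLuid0} \
    -l {compoundOutputFileLuid1}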
Template:
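(A sketch of the relevant metadata lines, assuming the comma-separated syntax; the remaining template sections are omitted.)

    GROUP.FILES.BY.OUTPUT.CONTAINERS, MyZip.zip
    OUTPUT.FILE.NAME, ${OUTPUT.CONTAINER.NAME}/SampleSheet.csv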
Rules and Constraints
You can only use one GROUP.FILES.BY metadata element in each template file.
To attach the files in the LIMS as a zip file, you must provide the quickAttach parameter along with the destLIMSID.
The zip file name may optionally be specified with the GROUP.FILES.BY metadata.
If quickAttach is used and no zip name is specified in the template, the zip will be named using the destLIMSID parameter value.
The zip file name, file paths, and file names should not contain characters that are illegal for directories and files on the target operating system. Illegal characters will be replaced with underscores.
If a file name is not unique in the target directory (e.g., if multiple SampleSheet.csv files are being written to /my/target/path), an error is thrown and no files are written.
When specifying the OUTPUT.TARGET.DIR metadata element, if a token is used that may resolve to multiple values for a single path (for example, using INPUT.NAME in the path when it will resolve to multiple sample names), one value will be chosen arbitrarily for the path. For example, you may end up with /Container1/Sample1/myfile.csv when there are two samples in the container.
Available from: BaseSpace Clarity LIMS v3.1.0
Included as part of the NGS Extensions package, the convertToExcel script is designed to convert separated-value files (eg. CSV) to Microsoft Excel spreadsheets of type XLS or XLSX.
The script can be run on comma- and tab-separated files with any file extension. The original file is not edited, unless its name matches the name given for the output file.
The script can update an existing Excel spreadsheet or produce an entirely new one.
When updating an existing Excel spreadsheet that does not have a file extension, XLSX is used by default.
A single worksheet is updated with the input file contents. When producing a new Excel spreadsheet, this worksheet name may optionally be specified; otherwise, the default name is used.
The worksheet name must be provided when updating an existing Excel spreadsheet. If the worksheet exists, its contents will be overwritten with the contents from the input file. Otherwise, a new worksheet will be added.
Each line in the input file becomes a row in the output file, and its values are placed into the cells of that row. The first value in the input file becomes the value of cell A:1, and so forth.
When updating an existing worksheet, cells that are not overwritten by values from the input file are left untouched. For example, there may be a footer section that is not updated.
The Excel file produced may be written to a location accessible from the LIMS server (a location on the server, a mounted drive, or a shared file store, for example) and also attached in the LIMS via quickAttach. If both options are specified, the script warns if the file cannot be written and reports an error if the file cannot be uploaded.
The cell types currently supported are Numeric, Boolean, Blank, and String.
Supported number formats include period (.) as the decimal point and numbers that include an exponential (eg, 1e-8 or 4E2).
Boolean values are case-insensitive.
Macros and equations are not supported when updating an existing Excel file. Other cells in the file that depend on the new values will not be updated when the worksheet is.
The convertToExcel script can be run on any step, provided there is a way to supply it with an input file to convert.
The recommended configuration is to use a minimum of two shared result files on the step: one result file is used to attach the final converted file; the other holds the log file. The input file placeholder may be the same as the final file destination if the input file is to be overwritten with the script results, and likewise for an existing Excel file to be updated.
Configure an automation trigger, usually on the Record Details view, to use the script. The input file may be attached manually or produced automatically by another script such as the sample sheet generator.
To configure the script to both attach its output file to a placeholder in the LIMS and write it to a location on the server (or in a directory with shared access), provide both outputFileName and destLIMSID with quickAttach. Include the destination path in outputFileName, for example:
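(A sketch; the directory path is illustrative.)

    -outputFileName /opt/gls/clarity/customextensions/example/{compoundOutputFileLuid1}.xlsx \
    -destLIMSID {compoundOutputFileLuid1} -q true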
The following examples include various options for file handling. These options exist to reduce the FTP/Automated Informatics (AI) overhead so that the script executes faster.
For example, if quickAttach is set to true, the script will attach the file directly to the LIMS through FTP. It will only write the file locally if upload/attachment via the API fails.
Example 1: Typical Use
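A sketch of the command line for this example, reconstructed from the description below; the jar path is illustrative, and parameter names follow the convertToExcel parameter list later in this article:

    bash -c "java -jar /opt/gls/clarity/extensions/ngs-common/v5/EPP/ngs-extensions.jar script:convertToExcel \
    -u {username} -p {password} -i {stepURI:v2} \
    -srcLIMSID {compoundOutputFileLuid0} \
    -outputFileName {compoundOutputFileLuid0}-converted.xlsx \
    -logFileLIMSID {compoundOutputFileLuid1}"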
In this example:
The file currently attached in the LIMS with LIMS ID {compoundOutputFileLuid0} is downloaded.
The file is converted to an XLSX file with the name {compoundOutputFileLuid0}-converted.xlsx.
This file is left in the current local directory for AI to attach to the LIMS automatically.
When attached, the file overwrites the file with LIMS ID {compoundOutputFileLuid0} that was originally downloaded.
Finally, the log file is uploaded to the LIMS with the name {compoundOutputFileLuid1}-LogFile.html.
Example 2: Updating an attached Excel file and both writing it to a specific location and uploading the result
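A sketch of the command line for this example, reconstructed from the description below; the jar path and directory are illustrative:

    bash -c "java -jar /opt/gls/clarity/extensions/ngs-common/v5/EPP/ngs-extensions.jar script:convertToExcel \
    -u {username} -p {password} -i {stepURI:v2} \
    -srcLIMSID {compoundOutputFileLuid0} \
    -updateFileLIMSID {compoundOutputFileLuid1} \
    -worksheet Samples \
    -outputFileName /opt/gls/clarity/customextensions/example/{compoundOutputFileLuid1}.xls \
    -destLIMSID {compoundOutputFileLuid1} -q true \
    -logFileLIMSID {compoundOutputFileLuid2}"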
In this example:
The input file currently attached in the LIMS with LIMS ID {compoundOutputFileLuid0} is downloaded.
The Excel file to update, currently attached in the LIMS with LIMS ID {compoundOutputFileLuid1}, is downloaded.
In the file to update, a worksheet named Samples is updated (or overwritten, if already present in the file) using the contents of the input file.
The resulting file is written to /opt/gls/clarity/customextensions/example as {compoundOutputFileLuid1}.xls.
Because quickAttach is passed as true:
The file is added to the LIMS directly via FTP, with the LIMS ID {compoundOutputFileLuid1}.
This overwrites the Excel file to update, which was previously attached here.
Finally, the log file is uploaded to the LIMS with the name {compoundOutputFileLuid2}-LogFile.html.
Example 3: Use with sample sheet generator
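A sketch of the conversion part of the command line for this example; the sample sheet generator invocation that precedes it is omitted, and the jar path is illustrative:

    java -jar /opt/gls/clarity/extensions/ngs-common/v5/EPP/ngs-extensions.jar script:convertToExcel \
    -u {username} -p {password} -i {stepURI:v2} \
    -srcLIMSID {compoundOutputFileLuid0} \
    -q true -destLIMSID {compoundOutputFileLuid8} \
    -logFileLIMSID {compoundOutputFileLuid1}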
In this example:
A driver file is generated with the name {compoundOutputFileLuid0}.csv, and the {compoundOutputFileLuid1}-LogFile.html log file is created by the sample sheet generator.
The conversion script is executed on {compoundOutputFileLuid0}.csv.
Because quickAttach is passed as true and no outputFileName was provided, after the file has been converted:
It is added to the LIMS directly via FTP, with the LIMS ID {compoundOutputFileLuid8}.
No file is created locally.
As the input file is converted, log messages are appended to the {compoundOutputFileLuid1}-LogFile.html file.
Example 4: Use with sample sheet generator and add blank lines
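A sketch of the final conversion command in the three-script chain; the sample sheet generator and addBlankLines invocations are omitted, and the jar path is illustrative:

    java -jar /opt/gls/clarity/extensions/ngs-common/v5/EPP/ngs-extensions.jar script:convertToExcel \
    -u {username} -p {password} -i {stepURI:v2} \
    -srcLIMSID {compoundOutputFileLuid0} \
    -destLIMSID {compoundOutputFileLuid8} \
    -xls true \
    -logFileLIMSID {compoundOutputFileLuid1}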
In this example:
Sample sheet generator creates the base driver file with name {compoundOutputFileLuid0}.csv.
The add blank lines script takes that file and adds extra lines for empty wells in the container, editing the file in place.
Finally, the convertToExcel script is run on that result.
In this case, the final output is an XLS file named {compoundOutputFileLuid8}.xls.
Because quickAttach is false, this file is written in the current local directory and it is assumed that AI will upload it to the LIMS.
The same log file is appended to by all three scripts that are run, and is attached in the LIMS.
The input file separators supported are comma and tab.
Spaces between entries in the input file are not supported (eg. "Sample Name, A:1" must instead be "Sample Name,A:1").
A short message is logged after each successful action by the script.
Any errors that occur will be logged in the log file before the script terminates.
Warnings will also be captured in the log file, and if any occur, a notification will be sent on script completion.
If a local log file exists that matches the log file name configured for the script, or if a file exists in the LIMS with the associated log file LIMSID, the log messages will be appended to these files. Otherwise, a new file will be created.
Mapping of CSV columns to fields in the LIMS. See .
Mapping of CSV columns to fields in the LIMS (partial match). See .
Mapping of CSV headers or columns to protocol step fields in the LIMS. See .
Match by input or output placement (default false). See .
Control which artifacts to parse information into (default true). See .
Control whether headers are optional or mandatory (default false). See .
Control whether the script halts execution or warns when the Container Name and Well Position cannot be determined on any line (default true) (NGS v5.4 and later). See .
The script is intended to be used with fresh destination plates, rather than plates that exist in the system. If a preexisting container is selected, it too will have the temporary naming applied and will become "Plate 1."
The output generation type specifies how the step outputs were generated in relation to the inputs. PerInput entries are available for the following step types: Standard, Standard QC, Add Labels, and Analysis.
Only a subset of the tokens is available for use in the header block section. For details, see the Template File Contents article Tokens table. If an unsupported token is included, file generation will complete with a warning message and a warning will appear in the log file.
When the LIMS attaches a file to a placeholder, it assumes that the file is named with the placeholder LIMSID, and uses this LIMSID to identify the placeholder to which the file should be attached. However, when using OUTPUT.FILE.NAME, you can give the file a name that does not begin with the LIMSID of the placeholder to which it will be attached. To do this, you must use the quickAttach and destLIMSID parameters in the automation command line.
Providing a full file path for OUTPUT.FILE.NAME is still supported, but deprecated. If the full path is provided, the file/directory separator will be automatically detected and will not be replaced in the static parts of the file name. Any of these separators derived from the result of a token value will be replaced.
Automated Informatics (AI) will only pick up and attach files in BaseSpace Clarity LIMS if the outputFileName begins with the LIMSID of a file placeholder that exists in the LIMS. For more information on AI, refer to the topics in the Automated Informatics forum.
The quickAttach option makes use of FTP to upload the file to the result file placeholder in the LIMS. This means that this script may not be run using this option on remote AI nodes, as it requires access to the LIMS database to retrieve the FTP credentials.
Because the destLIMSID for the convertToExcel script differs from the LIMS ID in the name of the input CSV file, the CSV file is also uploaded by AI to the LIMS and both files will be available separately.
Because the destLIMSID for the convertToExcel script is the same as the LIMS ID in the name of the input CSV file, the CSV file is not uploaded by AI to the LIMS and only the final XLS file will be available in the LIMS.
Parameters:
-i {URI}: Step URI (Required)
-u {username}: LIMS username (Required)
-p {password}: LIMS password (Required)
-t, -triggerMode: Provide as "true" if the script will be triggered manually (Optional)
-errMode: Provide as "warn" or "fail" to determine behavior when duplicate container names are detected (Required)
-reset: Provide as "true" to reset all container names to their default LIMSID when duplicates are found (Optional)
Parameters:
-i {step URI}: Protocol step URI (Required)
-u {username}: LIMS username (Required)
-p {password}: LIMS password (Required)
-d {pattern file directory}: Directory for placement pattern files (Required)
-useIndexMode {use index mode}: Determines whether source sample index placement is used instead of source well placement (Optional)
-replicatesMode {replicates mode}: Provide as "warn"/"fail" to determine script behavior if the number of replicates does not match the pattern file (Optional)
-minSamplesMode {min samples mode}: Provide as "warn"/"fail" to determine script behavior if the number of samples in the step is lower than specified by the pattern file (Optional)
-maxSamplesMode {max samples mode}: Provide as "warn"/"fail" to determine script behavior if the number of samples in the step is higher than specified by the pattern file (Optional)
-sortOrder {sort order}: Provide as "horizontal"/"vertical" to sort samples by row/column for source sample index placement; default is "horizontal" (Optional)
Pattern file column headers:
SRC_WELL: Well position of the input sample, in the format A:1. Required for source well placement; not used for source sample index placement.
SRC_SAMPLE_INDEX: Index of the input sample, indexed from 1. Required for source sample index placement; not used for source well placement.
DEST_CONTAINER_INDEX: Output container index, indexed from 1. Required for both placement modes.
DEST_WELL: Destination well position of the sample, in the format A:1. Required for both placement modes.
Parameters:
-i {URI}: LIMS process URI (Required)
-u {username}: LIMS username (Required)
-p {password}: LIMS password (Required)
-f {placement file LIMSID}: LIMS ID of the file used to map samples to new placements (Required)
-headerIndex {header index}: Numeric index of the file header row, starting from 1 (default is 1) (Optional)
-srcContainer {source container}: Name of the column containing the source container ID (Required)
-srcWell {source well}: Name of the column containing the source container well (Required)
-sampleName {sample name}: Name of the column containing the sample ID (name) (Optional)
-destContainer {destination container}: Name of the column containing the destination container ID (Required)
-destWell {destination well}: Name of the column containing the destination container well (Required)
-destType {destination type}: Name of the column containing the destination container type (Optional)
-separator {separator}: File separator; provide tab as "tab" and comma as "comma" (default is comma) (Optional)
Options:
-i {stepURI:v2}, -stepURI {stepURI:v2} (Step URI): (Required) LIMS step URI. Provides context to resolve all token values. See Upgrade Note above.
-u {username}, -username {username} (Username): (Required) LIMS login username.
-p {password}, -password {password} (Password): (Required) LIMS login password.
-t <templateFile>, -templatePath <templateFile> (Template file): (Required) Template file path.
-o <outputFile>, -outputPath <outputFile> (Output file): (Required) Output file path. If the folder structure specified in the path does not exist, it is created. Details on the following metadata elements are provided in the Metadata section of the Template File Contents article:
This output file parameter value is overwritten by OUTPUT.FILE.NAME.
To output multiple files, use GROUP.FILES.BY.INPUT.CONTAINERS and GROUP.FILES.BY.OUTPUT.CONTAINERS.
Files generated are in CSV format by default; other value-separated formats are available (see OUTPUT.SEPARATOR).
-l <logFile>, -logFileName <logFile> (Log file): (Required) Log file name.
-q [true|false], -quickAttach [true|false] (Quick attach): If true, the generated file is attached directly to the result file placeholder in the LIMS via FTP. Use with destLIMSID.
-destLIMSID <LIMSID> (Destination LIMS ID): LIMSID of the output to attach the template file to. Use with quickAttach. See the Renaming Generated Files and Generating Multiple Files examples.
Variables:
containerTypeNode: The container type holding the derived sample
row: The row part of the derived sample's location
column: The column part of the derived sample's location
Parameters:
-u: LIMS username (Required)
-p: LIMS password (Required)
-i: LIMS step URI (Required)
-logFileLIMSID: LIMSID of the log file that will be generated (Required)
-logFileName: Custom name to be given to the generated log file (Optional). The name must begin with the logFileLIMSID value, or the file will not be attached in the LIMS. If the name does not end in '.html', '-LogFile.html' is appended to the end of the file name. Default is ${logFileLIMSID}-LogFile.html.
-srcLIMSID: The LIMSID of the file to be converted (Required if inputFileName is not provided). If no local file is found in the current directory whose name starts with srcLIMSID, an attempt is made to download the file from the LIMS. Provide only one of inputFileName or srcLIMSID.
-inputFileName: Name of the input file to be converted (Required if srcLIMSID is not provided). Assumes that the file is accessible from the LIMS server (local to the server, or available on a mounted drive or shared file store). Provide only one of inputFileName or srcLIMSID.
-destLIMSID: The LIMSID of the result file to attach the output XLS/XLSX file to (Required if outputFileName is not provided, or if using quickAttach). At least one of outputFileName and destLIMSID is required.
-outputFileName: Name to be given to the output XLS/XLSX file (Required if destLIMSID is not provided). Default name is {destLIMSID}.xls(x), based on the file type being created. At least one of outputFileName and destLIMSID is required.
-q quickAttach: Typically used when the EPP invocation calls a subsequent script that must access the file via the API (Optional). If true, destLIMSID is required, and when the script has completed running, it attaches the file to the LIMS via FTP rather than writing it to disk to be attached by AI (see the note about file names and AI). Default is false.
-s separator: The separator used in the input file (Optional). Options are COMMA or TAB. Default is COMMA.
-xls generateXLS: Determines the output file type and extension; XLS and XLSX are supported (Optional). If true, the input file is converted to an XLS file; if false, to an XLSX file. Default is false. This parameter is ignored if updateFileName or updateFileLIMSID is provided; the file provided to update is used to determine the output file type, and if it is not available, XLSX is used by default.
-worksheet worksheetName: Name of the worksheet to create or update (Optional in general, but required when updating an existing file, that is, when either updateFileName or updateFileLIMSID is provided). The worksheet name must not exceed 32 characters and must not contain any of the following: ? [ ] / \ : *
-updateFileName: Name of an existing Excel file to update (Optional). Assumes that the file is accessible from the LIMS server (local to the server, or available on a mounted drive or shared file store). Provide only one of updateFileName or updateFileLIMSID.
-updateFileLIMSID: LIMSID of an existing Excel file to update (Optional). If no local file is found in the current directory whose name starts with updateFileLIMSID, an attempt is made to download the file from the LIMS. Provide only one of updateFileName or updateFileLIMSID.
Parameters:
-u {user}: LIMS username
-p {password}: LIMS password
-i {URI}: LIMS process URI
-inputFile {result file}: LIMSID of the XML file to be parsed
-log {log file name}: Log file name
-configFile {configuration file name}: Parsing configuration file