The large capacity of current Next Generation Sequencing (NGS) instruments means that labs are able to perform multiplexed experiments with multiple samples pooled into a single lane or region of the container. Before being pooled, samples are assigned a unique tag or index. After sequencing and initial analysis are complete, the sequencing results must be demultiplexed to separate data and relate the results back to each individual sample.
Clarity LIMS allows you to track a multiplexing workflow by adding reagents and reagent labels to artifacts, and then using the reagent labels to demultiplex the resulting files.
There are several ways to apply reagent labels. However, all methods involve creating placeholders that link the final sequences back to the original submitted samples. Either the lab scientist or an automated process must determine which file actually belongs with which placeholder. For more information on applying reagent labels, refer to Work with Multiplexing.
This example walks through assigning user-defined field (UDF)/custom field values to the demultiplexed output files based on upstream derived sample (analyte) UDF/custom field values. This includes upwards traversal of a sample history / genealogy, based on assigned reagent labels. This differs from upstream traversal based strictly upon process input-output mappings.
As of Clarity LIMS v5, the term user-defined field (UDF) has been replaced with custom field in the user interface. However, the API resource is still called UDF.
There are two types of custom fields:
Master step fields—Configured on master steps. Master step fields only apply to the following:
The master step on which the fields are configured.
The steps derived from those master steps.
Global fields—Configured on entities (eg, submitted sample, derived sample, measurement, etc.). Global fields apply to the entire Clarity LIMS system.
If you are using Clarity LIMS v5 or later, make sure you have completed the following actions:
Created a project and added multiple samples to it.
Run the samples through a sequence of steps that perform the following:
Reagent addition / reagent label assignment
Pooling
Demultiplexing (to produce a set of per-reagent-label result file outputs).
Set a Numeric custom field value on each derived sample input to the reagent addition process.
Confirmed that a Numeric custom field with no assigned value exists on each of the per-reagent-label result file outputs. The value of this field will be computed from the set of upstream derived sample custom field values corresponding to the reagent label of the result file.
You must also make sure that API v2 r21 or later is installed.
Due to the complexity of NGS workflows, beginning at the top level submitted sample resource and working down to the result file is not the most efficient way to traverse the sample history/genealogy. It is easier to start with the result file artifact, and then trace upward to find the process with the UDFs/custom fields that you are looking for.
Starting from the per-reagent-label result file, you can traverse upward in the sample history using the parent process URI in the XML returned for each artifact. At each level of the sample history, the number of artifacts returned may increase due to processes that pooled individual artifacts.
In this example:
The upstreamArtifactLUIDs list represents the current set of relevant artifacts.
The foundUpstreamArtifactNodes list stores the target upstream artifact nodes found.
The sample history traversal stops at the inputs to the process that performed the reagent addition/reagent label assignment.
The traversal is executed using a while loop over the contents of the upstreamArtifactLUIDs list.
The list serves as a stack of artifacts. With each iteration of the loop, an artifact is removed from the end of the list and the relevant input artifacts to its parent process are pushed back onto the end of the list.
After the loop has executed, the foundUpstreamArtifactNodes list will contain all of the artifacts that are assigned the reagent label of interest upon execution of the next process in the sample history.
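The stack-based traversal described above can be sketched as follows. This is a minimal Python illustration, not the Groovy script itself: the ARTIFACTS and PROCESSES dictionaries are hypothetical in-memory stand-ins for the XML that GET requests on artifacts and processes would return, and all LIMS IDs and label names are invented for the example.

```python
# Hypothetical stand-ins for artifact XML: reagent labels and parent process.
ARTIFACTS = {
    "92-RES1": {"labels": {"Index 1"}, "parent_process": "24-DEMUX"},
    "2-POOL": {"labels": {"Index 1", "Index 2"}, "parent_process": "24-POOL"},
    "2-LAB1": {"labels": {"Index 1"}, "parent_process": "24-LABEL"},
    "2-LAB2": {"labels": {"Index 2"}, "parent_process": "24-LABEL"},
    "2-RAW1": {"labels": set(), "parent_process": None},
    "2-RAW2": {"labels": set(), "parent_process": None},
}

# Hypothetical stand-ins for process XML: (input LIMS ID, output LIMS ID) pairs.
PROCESSES = {
    "24-DEMUX": [("2-POOL", "92-RES1")],
    "24-POOL": [("2-LAB1", "2-POOL"), ("2-LAB2", "2-POOL")],
    "24-LABEL": [("2-RAW1", "2-LAB1"), ("2-RAW2", "2-LAB2")],
}


def find_upstream_artifacts(result_file_luid, reagent_label):
    """Walk upward from a result file to the inputs of the labeling process."""
    upstream_artifact_luids = [result_file_luid]  # used as a stack
    found_upstream_artifacts = []
    while upstream_artifact_luids:
        luid = upstream_artifact_luids.pop()  # take from the end of the list
        parent = ARTIFACTS[luid]["parent_process"]
        if parent is None:
            continue  # nothing further upstream
        for input_luid, output_luid in PROCESSES[parent]:
            if output_luid != luid:
                continue  # this mapping belongs to a different output
            input_labels = ARTIFACTS[input_luid]["labels"]
            if reagent_label in input_labels:
                # Still carries the label: keep climbing through this input.
                upstream_artifact_luids.append(input_luid)
            elif not input_labels and input_luid not in found_upstream_artifacts:
                # Unlabeled input to a process whose output is labeled:
                # the parent is the reagent addition step, so stop here.
                found_upstream_artifacts.append(input_luid)
    return found_upstream_artifacts


print(find_upstream_artifacts("92-RES1", "Index 1"))
```

Inputs to the pooling process that carry a different reagent label are never pushed onto the stack, which is what keeps the traversal on the branch that belongs to the result file.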
The final step in the script assigns a value to a Numeric UDF / custom field on the per-reagent-label output result file, Mean DNA Prep 260:280 Ratio, by computing the mean value of a Numeric UDF / custom field on each of the foundUpstreamArtifactNodes, DNA prep 260:280 ratio.
First, compute the mean using the following example:
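As a rough Python sketch of the mean computation (the original listing is Groovy), the function below reads a Numeric field from each upstream artifact node and averages the values. The udf namespace URI, the abbreviated artifact XML, and the helper names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from statistics import mean

UDF_NS = "http://genologics.com/ri/userdefined"  # assumed udf namespace


def udf_value(artifact_node, udf_name):
    """Return the text of the named udf:field element, or None if absent."""
    for field in artifact_node.findall(f"{{{UDF_NS}}}field"):
        if field.get("name") == udf_name:
            return field.text
    return None


def mean_udf(artifact_nodes, udf_name):
    """Average a Numeric field over the nodes that have a value for it."""
    raw = (udf_value(node, udf_name) for node in artifact_nodes)
    values = [float(v) for v in raw if v is not None]
    return mean(values) if values else None


# Abbreviated, illustrative artifact XML for two upstream derived samples.
TEMPLATE = (
    '<art:artifact xmlns:art="http://genologics.com/ri/artifact" '
    f'xmlns:udf="{UDF_NS}">'
    '<udf:field type="Numeric" name="DNA prep 260:280 ratio">{}</udf:field>'
    "</art:artifact>"
)
found_upstream_artifact_nodes = [
    ET.fromstring(TEMPLATE.format(ratio)) for ratio in ("1.8", "2.0")
]
mean_ratio = mean_udf(found_upstream_artifact_nodes, "DNA prep 260:280 ratio")
```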
Then, set the UDF/custom field on the per-reagent-label output result file using the following example:
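Setting the field could look like the following Python sketch. It is a simplified stand-in for a helper such as setUdfValue, not that method's actual implementation; the udf namespace URI and the abbreviated result file XML are assumptions. The sketch appends a missing field as the last child, whereas a real PUT may require the element at the position the API schema expects.

```python
import xml.etree.ElementTree as ET

UDF_NS = "http://genologics.com/ri/userdefined"  # assumed udf namespace
ET.register_namespace("udf", UDF_NS)


def set_udf_value(artifact_node, udf_name, value, udf_type="Numeric"):
    """Update the named udf:field in place, creating it if it is missing."""
    for field in artifact_node.findall(f"{{{UDF_NS}}}field"):
        if field.get("name") == udf_name:
            field.text = str(value)
            return field
    field = ET.SubElement(
        artifact_node,
        f"{{{UDF_NS}}}field",
        {"type": udf_type, "name": udf_name},
    )
    field.text = str(value)
    return field


# Abbreviated, illustrative XML for a per-reagent-label result file.
result_file = ET.fromstring(
    '<art:artifact xmlns:art="http://genologics.com/ri/artifact" limsid="92-1">'
    "<name>Demultiplexed reads</name>"
    "</art:artifact>"
)
set_udf_value(result_file, "Mean DNA Prep 260:280 Ratio", 1.9)
```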
TraversingPooledDemuxGenealogy.groovy:
When working with process and step outputs, you can do the following:
As processing occurs in the lab, associated processes and steps are run in Clarity LIMS. Often, key data must be recorded for the derived samples (referred to as analytes in the API) generated by these steps.
The following example explains how to change the value of an analyte UDF/global custom field.
If you would like to update a batch of output derived samples (analytes), you can increase the script execution speed by using batch operations. For more information, see Working with Batch Resources.
In Clarity LIMS v5 or later, the key data fields are configured as global custom fields on derived samples. If you are using Clarity LIMS v5 or later, make sure you have the following items:
A defined global custom field named Library Size on the Derived Sample object.
A configured Library Prep step to apply Library Size to generated derived samples.
A Library Prep process that has been run and has generated derived samples.
In Clarity LIMS v5 and later, the Record Details screen displays the information about the derived samples generated by a step. You can view the global fields associated with the derived samples in the Sample Table.
The following screenshot shows the Library Size values for the derived samples.
Derived sample information is stored in the API in the analyte resource. Step information is stored in the process resource. Each global field value is stored as a UDF.
An analyte resource contains specific derived sample details that are recorded in lab steps. Those details are typically stored in global custom fields (configured in Clarity LIMS on the Derived Sample object) and then associated with the step.
When you update the information for a derived sample by updating the analyte API resource, only the global fields that are associated with the step can be updated.
To update the derived samples generated by a step, you must first request the process resource through a GET method.
The following GET method provides the full XML structure for the step:
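A GET on a process resource can be sketched in Python with the standard library as shown below. The host name, credentials, and LIMS ID are placeholders, and the use of HTTP basic authentication is an assumption about how the server is configured. The urlopen parameter exists only so the function can be exercised without a live server.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET


def get_xml(uri, username, password, urlopen=urllib.request.urlopen):
    """GET a REST resource and return its parsed XML root element."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(
        uri,
        headers={
            "Authorization": f"Basic {token}",
            "Accept": "application/xml",
        },
    )
    with urlopen(request) as response:
        return ET.fromstring(response.read())


# Example call (placeholder host, user, and process LIMS ID):
# process = get_xml(
#     "https://lims.example.com/api/v2/processes/24-1234", "apiuser", "secret"
# )
```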
The process variable now holds the complete XML structure returned from the GET request.
The XML returned from a GET on the process resource contains the URIs of the process output artifacts (the derived samples generated by the step). You can use these URIs to query for each individual artifact resource.
The process resource contains many input-output-map elements, each of which links one input artifact to one output artifact. The following snippet of the XML shows the process:
Because processes with multiple inputs and outputs tend to be large, many of the input-output-map nodes have been omitted from this example.
After you have the URI of each output analyte, you can request its artifact resource and update its UDFs/custom fields.
Request the analyte output resource and update the UDF/custom field as follows.
If the output-type is analyte, then run through each input-output-map and request the output artifact resource.
Use a GET to return the XML for each artifact and store it in a variable.
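The filtering step can be illustrated with the Python sketch below, which collects the URIs of outputs whose output-type is Analyte and drops the state query from each one. The process XML is an abbreviated, invented example of the assumed structure, and the host and LIMS IDs are placeholders.

```python
import xml.etree.ElementTree as ET

BASE = "https://lims.example.com/api/v2/artifacts"  # placeholder host

# Abbreviated, illustrative process XML with two input-output-map elements.
PROCESS_XML = f"""
<prc:process xmlns:prc="http://genologics.com/ri/process" limsid="24-1234">
  <input-output-map>
    <input limsid="2-100" uri="{BASE}/2-100?state=10"/>
    <output limsid="2-200" output-type="Analyte" uri="{BASE}/2-200?state=11"/>
  </input-output-map>
  <input-output-map>
    <input limsid="2-100" uri="{BASE}/2-100?state=10"/>
    <output limsid="92-300" output-type="ResultFile" uri="{BASE}/92-300?state=12"/>
  </input-output-map>
</prc:process>
"""


def analyte_output_uris(process_node):
    """Return stateless URIs of outputs whose output-type is Analyte."""
    uris = []
    for io_map in process_node.findall("input-output-map"):
        output = io_map.find("output")
        if output is None or output.get("output-type") != "Analyte":
            continue
        uri = output.get("uri").split("?")[0]  # drop ?state=...
        if uri not in uris:
            uris.append(uri)
    return uris


process = ET.fromstring(PROCESS_XML)
analyte_uris = analyte_output_uris(process)
```

Each URI in analyte_uris can then be requested with a GET and the returned XML kept for the update.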
When you have the analytes stored, change the analyte UDF/custom field in two steps:
Change the UDF/custom field value in the XML.
Use an HTTP PUT call to update the artifact resource.
The UDF/custom field change can be achieved with the Library Size UDF/custom field XML element defined in the following code. In this example, the Library Size value is updated to 25.
The PUT method updates the artifact resource at the specified URI using the complete XML representation, including the UDF/custom field. The setUdfValue method of the util library is used to perform this in a safe manner.
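The two parts of the update can be sketched in Python as follows: the Library Size field is changed in the XML, and a PUT request carrying the complete XML is prepared. This is an illustration with the standard library, not the util library's setUdfValue. The artifact XML, host, credentials, and the use of basic authentication are assumptions.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET

UDF_NS = "http://genologics.com/ri/userdefined"  # assumed udf namespace
ART_NS = "http://genologics.com/ri/artifact"  # assumed artifact namespace
ET.register_namespace("udf", UDF_NS)
ET.register_namespace("art", ART_NS)

# Abbreviated, illustrative analyte XML as returned by a GET.
analyte = ET.fromstring(
    f'<art:artifact xmlns:art="{ART_NS}" xmlns:udf="{UDF_NS}" limsid="2-200" '
    'uri="https://lims.example.com/api/v2/artifacts/2-200">'
    "<name>Library 1</name>"
    '<udf:field type="Numeric" name="Library Size">20</udf:field>'
    "</art:artifact>"
)

# Step 1: change the field value in the XML.
for field in analyte.findall(f"{{{UDF_NS}}}field"):
    if field.get("name") == "Library Size":
        field.text = "25"


def build_put_request(artifact_node, username, password):
    """Prepare a PUT of the complete artifact XML to its stateless URI."""
    uri = artifact_node.get("uri").split("?")[0]
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        uri,
        data=ET.tostring(artifact_node),
        method="PUT",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/xml",
        },
    )


# Step 2: prepare the PUT; sending it would be urllib.request.urlopen(request).
request = build_put_request(analyte, "apiuser", "secret")
```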
The output-type attribute is the user-defined name for each of the output types generated by a process/step. This is not equivalent to the type element of an artifact whose value is one of several hard-coded artifact types.
If you must filter inputs or outputs from the input-output-map based on the artifact type, you must GET each artifact in question to discover its type.
It is important that you remove the state from each of the analyteURIs before you GET them, to make sure that you are working with the most recent state.
Otherwise, when you PUT the analyteURI back with your UDF/custom field changes, you can inadvertently revert information, such as QC, volume, and concentration, to previous values.
The results can be reviewed in a web browser through the following URI:
In Clarity LIMS v5 or later, in the Record Details screen, the Sample table now shows the updated Library Size.
UpdateProcessUDFInfo.groovy:
UpdateUDFAnalyteOutput.groovy:
When samples are processed in the lab, they generally produce child samples that are altered in some way. Eventually, the samples are analyzed on an instrument, with the result being a data file. Often these data are analyzed further, which produces additional data files.
The sample processing that occurs in the lab is modeled as steps in the Clarity LIMS web interface. In the REST API (v2 r21 or later), this processing is modeled as processes, and the samples and files that are processed are represented as artifacts. Understanding the representation of inputs and outputs within the XML for an individual process is critical to being able to use the REST API effectively.
If you are using Clarity LIMS v5 or later, make sure that you have done the following actions:
Added samples to the LIMS.
Configured a step that generates derived samples in the Lab Work tab.
Configured a file placeholder for a sample measurement file to be generated and attached by an automation script at run time. This configuration is done in the Master Step Settings of the step on the Record Details milestone.
Configured an automation that generates the sample measurement file and have enabled it on the step. This configuration is done in the Automation tab.
Configured the automation triggers. This configuration is done in the Step Settings screen, under the Record Details milestone.
Run the step on some samples.
As of Clarity LIMS v5, the Operations Interface Java client has been deprecated. In LIMS v5 and later, there is no equivalent screen to the Input/Output Explorer where you can select step inputs/outputs and generated files and view their corresponding inputs/outputs and files.
However, the following API code example is still relevant and will produce the same results.
The first step in this example is to request the individual process resource through a GET method. The full XML representation returned includes the input-output-map.
To illustrate the relationships between the inputs and outputs, you can save them using a Groovy Map data structure. This maps the output LIMS IDs to a list of input LIMS IDs associated with each output, as shown in the following example:
The process variable now holds the complete XML structure returned from the processURI.
In the following example XML snippet, elements of the input-output-map are labeled with <input-output-map>:
All of the input and output URIs end with a ?state= query parameter followed by a number. State allows Clarity LIMS to track historical values for QC, volume, and concentration, so you can compare the state of an analyte before and after a process was run. However, when you make changes to an artifact, you should always work with the most current state.
To make sure that you are getting the current state when you do a GET request, simply remove the state from the artifact URI.
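One way to remove the state is to drop the state parameter from the URI's query string, as in this small Python sketch. The function name and the example URI are illustrative.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_state(uri):
    """Return the URI without its state query parameter."""
    parts = urlsplit(uri)
    kept = [(key, value) for key, value in parse_qsl(parts.query) if key != "state"]
    return urlunsplit(parts._replace(query=urlencode(kept)))


current_uri = strip_state("https://lims.example.com/api/v2/artifacts/2-100?state=10")
```

Parsing the query rather than cutting the string at the question mark keeps any other parameters intact if the URI happens to carry them.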
You can examine each input-output-map to find details about the relationship represented between inputs and outputs. The following code puts the output and input LIMS IDs into a map named outputToInputMap.
As the output type is also important for further processing, outputToInputMap is formatted as follows:
If the output is shared for all inputs (eg, the sample measurement file with LIMS ID 92-13007), the inputs to the process are listed. If the output relates to an individual input, only the LIMS ID for that particular input will be listed.
Outputs are listed in multiple input-output-map elements when more than one input generates them. The first time any particular output LIMS ID is seen, the output type and input LIMS ID in the input-output-map are added to a new list, which is stored in outputToInputMap.
If the output LIMS ID already has a list in outputToInputMap, then the code adds the input LIMS ID to that list.
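The logic for building the map can be sketched in Python as below, using a dictionary in place of the Groovy Map. Each entry holds the output type followed by the input LIMS IDs. The process XML is an abbreviated, invented example of the assumed structure; only the shared file ID 92-13007 comes from the text above.

```python
import xml.etree.ElementTree as ET

# Abbreviated, illustrative process XML: two inputs, two per-input analytes,
# and one result file (92-13007) shared by both inputs.
PROCESS_XML = """
<prc:process xmlns:prc="http://genologics.com/ri/process" limsid="24-1234">
  <input-output-map>
    <input limsid="2-1"/><output limsid="2-11" output-type="Analyte"/>
  </input-output-map>
  <input-output-map>
    <input limsid="2-1"/><output limsid="92-13007" output-type="ResultFile"/>
  </input-output-map>
  <input-output-map>
    <input limsid="2-2"/><output limsid="2-12" output-type="Analyte"/>
  </input-output-map>
  <input-output-map>
    <input limsid="2-2"/><output limsid="92-13007" output-type="ResultFile"/>
  </input-output-map>
</prc:process>
"""


def build_output_to_input_map(process_node):
    """Map each output LIMS ID to [output type, input LIMS ID, ...]."""
    output_to_input = {}
    for io_map in process_node.findall("input-output-map"):
        output = io_map.find("output")
        if output is None:
            continue  # an input that produced no output
        input_id = io_map.find("input").get("limsid")
        output_id = output.get("limsid")
        if output_id not in output_to_input:
            # First sighting: start the list with the output type.
            output_to_input[output_id] = [output.get("output-type"), input_id]
        else:
            output_to_input[output_id].append(input_id)
    return output_to_input


output_to_input_map = build_output_to_input_map(ET.fromstring(PROCESS_XML))
for output_id, details in output_to_input_map.items():
    print(output_id, details[0], details[1:])
```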
One way to access the information is to print it out. You can run through each key-value pair and print the information it contains, as shown in the following example:
After running the script on the command line, an output similar to the following will be generated, whereby the inputs used to generate each output are listed.
GetProcessInputOutput.groovy:
Lab scientists must understand the priority of the samples they are working with. To help them prioritize their work, you can rename the derived samples generated by a step so that they include the priority assigned to the original submitted sample.
If you would like to rename a batch of derived samples, you can increase the script execution speed by using batch operations. You can also use a script to rename a derived sample after a step completes.
If you are using Clarity LIMS v5 or later, make sure that you have done the following actions:
Added samples to the system.
Defined a global custom field named Priority on the Submitted Sample object. The field should have default values sp1, sp2, and sp3, and it should be enabled on a step.
Run samples through the step with the Priority of each sample set to sp1, sp2, or sp3.
In this example, six samples have been added to a project in Clarity LIMS. The submitted sample names are Heart-1 through Heart-6. The samples are run through a step that generates derived samples, and the priority of each sample is set.
By default, the name of the derived samples generated by the step would follow the name of the original submitted samples as shown in the Assign Next Steps screen of the step.
This example appends the priority of the submitted sample to the name of the derived sample output. The priority is defined by the Priority sample UDF (in Clarity LIMS v4.2 or earlier) or the Priority submitted sample custom field (in Clarity LIMS v5 or later).
Renaming the derived sample consists of the following steps:
Request the step information (process resource) for the step that generated the derived sample (analyte resource).
Request the individual analyte resource for the derived sample to be renamed.
Request the sample resource linked from the analyte resource to get the submitted sample UDF/custom field value to use for the update.
Update the individual analyte output resource with the new name.
When using the REST API, you will often start with the LIMS ID for the step that generated a derived sample. The key API concepts are as follows.
Information about a step is stored in the process resource.
In general, automation scripts access information about a step using the processURI, which links to the individual process resource. The input-output-map in the XML returned by the individual process resource gives the script access to the artifacts that were inputs and outputs to the process.
Information about a derived sample is stored in the analyte resource. This is used as the input and output of a step.
Analytes are also used to record specific details from lab processing.
The XML representation for an individual analyte contains a link to the URI of its submitted sample, and to the URI of the process that generated it (parent process).
The following GET method returns the full XML structure for the step.
The process variable now holds the complete XML structure returned from the process GET request, as shown in the following example. The URI for each analyte generated is given in the output node in each input-output-map element. For more information on the input-output-map, see View the Inputs and Outputs of a Process/Step.
Each output node has an output-type attribute that is the user-defined type name of the output. You can iterate through each input-output-map and request the output artifact resource for each output of a particular output-type.
In the code example shown below, we filter on output-type = Analyte.
The output-type attribute is the user-defined name for each of the output types generated by a process. This is not equivalent to the type element of an artifact whose value is one of several hard-coded artifact types.
If you must filter inputs or outputs from the input-output-map based on the artifact type, you need to GET each artifact in question to discover its type.
It is important that you remove the state from each of the analyteURIs before you GET them to make sure that you are working with the most recent state. Otherwise, when you PUT the analyteURI back with your UDF changes, you can inadvertently revert information (eg, QC, volume, and concentration) to its previous values.
From the analyte XML, you can use the submitted sample URI to return the sample that maps to that analyte.
Updating Sample Information shows how to set a sample UDF/global field. To get the value of a sample UDF/global field, use the same method to find the field, and then use the .text() method to get the field value.
The value of the UDF is stored in the variable samplePriority so that it is then available for the renaming step described below.
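Reading the priority can be sketched in Python as follows: the analyte XML supplies the submitted sample URI, and the sample XML supplies the Priority field. Both XML fragments are abbreviated, invented examples of the assumed structure, and the host and LIMS IDs are placeholders. In a real script, the sample XML would come from a GET on the sample URI.

```python
import xml.etree.ElementTree as ET

UDF_NS = "http://genologics.com/ri/userdefined"  # assumed udf namespace

# Abbreviated, illustrative analyte XML with its link to the submitted sample.
analyte = ET.fromstring(
    '<art:artifact xmlns:art="http://genologics.com/ri/artifact" limsid="2-200">'
    "<name>Heart-1</name>"
    '<sample limsid="SAM1A1" uri="https://lims.example.com/api/v2/samples/SAM1A1"/>'
    "</art:artifact>"
)

# Abbreviated, illustrative XML for that submitted sample.
sample = ET.fromstring(
    '<smp:sample xmlns:smp="http://genologics.com/ri/sample" '
    f'xmlns:udf="{UDF_NS}" limsid="SAM1A1">'
    "<name>Heart-1</name>"
    '<udf:field type="String" name="Priority">sp1</udf:field>'
    "</smp:sample>"
)


def submitted_sample_uri(analyte_node):
    """Return the URI of the submitted sample linked from an analyte."""
    return analyte_node.find("sample").get("uri")


def sample_udf(sample_node, udf_name):
    """Return the text of the named udf:field on a sample, or None."""
    for field in sample_node.findall(f"{{{UDF_NS}}}field"):
        if field.get("name") == udf_name:
            return field.text
    return None


sample_uri = submitted_sample_uri(analyte)
sample_priority = sample_udf(sample, "Priority")
```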
The variable analyte holds the complete XML structure returned from a GET on the URI in the output node. The variable nameNode references the XML element in that structure that contains the artifact's name. The example uses the XML for the analyte named Heart-1.
Renaming the derived sample consists of two steps:
The name change in the XML.
The PUT call to update the analyte resource.
Make the name change by setting the text of the nameNode XML element defined earlier.
The HTTP PUT command then updates the artifact resource using the complete XML representation, including the new name.
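The renaming itself can be illustrated with the Python sketch below, which appends the priority to the text of the name element and serializes the complete XML for use as the PUT body. The hyphen separator is an assumption, since the exact naming format is a local choice, and the analyte XML is an abbreviated example.

```python
import xml.etree.ElementTree as ET

ART_NS = "http://genologics.com/ri/artifact"  # assumed artifact namespace
ET.register_namespace("art", ART_NS)

# Abbreviated, illustrative analyte XML as returned by a GET.
analyte = ET.fromstring(
    f'<art:artifact xmlns:art="{ART_NS}" limsid="TST110A291AP45">'
    "<name>Heart-1</name>"
    "<type>Analyte</type>"
    "</art:artifact>"
)


def rename_with_priority(analyte_node, priority, separator="-"):
    """Append the priority to the analyte name and return the new name."""
    name_node = analyte_node.find("name")
    name_node.text = f"{name_node.text}{separator}{priority}"
    return name_node.text


new_name = rename_with_priority(analyte, "sp1")
put_body = ET.tostring(analyte)  # complete XML to send in the PUT request
```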
After a successful PUT, the results can be reviewed in a web browser at http://yourIPaddress/api/v2/artifacts/TST110A291AP45.
The following XML resource is returned from the PUT command and is stored in returnNode.
In Clarity LIMS, the Assign Next Steps screen shows the new names for the generated derived samples.
This example shows simple renaming of derived samples based on a submitted sample UDF/global field. However, you can use step names, step UDFs (known as master step fields in Clarity LIMS v5 or later), project information, and so on, to rename derived samples and provide critical information to scientists working in the lab.
UpdateAnalyteName.groovy:
As samples are processed in the lab, substances are moved from one container to another. Because container locations are sometimes used to reference the sample in data files, tracking the location of these substances within containers is one of the key values that Clarity LIMS provides to the lab.
Within the REST API (v2 r21 or later), analytes represent the substances on which processes/steps are run. These analytes are the substances that are chemically altered and transferred between containers as samples are processed in the lab.
Each individual sample resource has an analyte artifact that describes its container location and is used to run processes.
In Clarity LIMS, steps are not run on the original submitted samples, but are instead run on (and can also generate) derived samples. In the API, derived samples are known as analytes. Each sample resource, which is the original submitted sample in Clarity LIMS, has a corresponding analyte that is used for running processes/steps and describing placement in a container.
For more information on analyte artifacts and other REST resources, see .
For all Clarity LIMS users, make sure you have done the following actions:
Added a sample to Clarity LIMS.
Run a process/step on the sample, with the same process/step generating a derived sample output.
Added the generated derived sample to a multi-well container (eg, a 96-well plate).
The container location information for an individual derived sample/analyte is located within the XML for the individual artifact resource. Because artifacts are generated by running steps in the LIMS, this is a logical place to keep track of the location.
Within a script, you can use a GET method to request the artifact. The resulting XML structure contains all the information related to the artifact, including its container and well location.
In this example, a derived sample named Brain-600 is placed in well A:1 of a container with LIMS ID 27-1259. This information is found in the location element.
The location element has two child data elements:
One linking to the container URI, which specifies which container the analyte is in.
One for the well location, which has the name 'value' in the XML structure.
Valid values for a well location can be either numeric or alphabetic, and are determined by the configuration of the container in Clarity LIMS.
Well locations are always represented in the row:column format. For example, a 96-well plate can have locations A:1 and C:12, and a tube can have a single well called 1:1.
Use the following example to retrieve the artifact XML:
Because the container position is structured in the row:column format, you can store the row and column in separate variables by splitting the container position on the colon character. You can access the string value of the location value node using the text() method, as shown in the following code:
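A Python sketch of the same idea is shown below: it reads the container LIMS ID and the well value from the location element, then splits the value on the colon. The artifact XML is an abbreviated example of the assumed structure built from the Brain-600 details above, and the host and artifact LIMS ID are placeholders.

```python
import xml.etree.ElementTree as ET

# Abbreviated, illustrative artifact XML for the derived sample Brain-600.
ARTIFACT_XML = """
<art:artifact xmlns:art="http://genologics.com/ri/artifact" limsid="2-500">
  <name>Brain-600</name>
  <type>Analyte</type>
  <location>
    <container limsid="27-1259"
               uri="https://lims.example.com/api/v2/containers/27-1259"/>
    <value>A:1</value>
  </location>
</art:artifact>
"""


def container_position(artifact_node):
    """Return (container LIMS ID, row, column) for a placed artifact."""
    location = artifact_node.find("location")
    container_id = location.find("container").get("limsid")
    row, column = location.find("value").text.split(":")
    return container_id, row, column


artifact = ET.fromstring(ARTIFACT_XML)
container_id, row, column = container_position(artifact)
print(f"Container: {container_id}, row: {row}, column: {column}")
```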
Running the script in a console produces the following output:
GetContainerAnalyteLocation.groovy: