If a BaseSpace Clarity LIMS script is run in an automation context, it is easy to obfuscate usernames and passwords by choosing the appropriate tokens ({username} or {password}) to be passed in as run-time arguments.
However, this type of functionality is not easily available outside of automations, and it is often necessary to store various credentials on machines that need to interact with the LIMS API, database, or some other protected resource. This article explains how to use cryptography in Python to protect and obfuscate these important authentication tokens.
Many of the API Cookbook examples use a simple auth_tokens.py file that has usernames and passwords stored in plain text. This file can be byte-compiled simply by importing it at a Python console:
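>>> import auth_tokens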
Importing this file creates an auth_tokens.pyc file—a byte-compiled version of the source file. The source file can now be deleted, providing the first rudimentary level of security. However, the credentials can still quite easily be retrieved. Even if the permissions on this file are restricted, this solution does not present a suitable level of security for most IT administrators. It does, however, allow us to easily prototype our code, hence its use in Cookbook examples.
You have pycrypto installed (either through the OS package manager or pip).
You have generated a secret key of random ASCII characters (the easiest way to do this is to button-mash on a US-layout keyboard and include a lot of symbols).
You already have a plain-text auth_tokens.py file. An example is attached at the bottom of this article.
You have access to the Python or iPython command line console.
The pycrypto library for Python can easily be installed using the operating system's package manager, or the pip installation tool. It contains many different encryption algorithms and provides a straightforward interface for wrapping our own encryption objects and accessor functions.
The goal is to be able to create a flat text file containing obfuscated usernames, passwords, hostnames, and so on. To do this, use a utility class called ClarityCred that provides encryption and decryption functionality using the ARC4 cipher from pycrypto. The ClarityCred class is provided in cred.py, attached at the bottom of this article.
While the use of ARC4 is considered deprecated in favor of stronger encryption algorithms, such as AES, the ARC4 example lends itself to easier understanding. ARC4 simply requires a secret key and a salt size to be specified. The secret key can be generated at random using any preferred method and is hard-coded in cred.py, along with the salt size purely for ease of demonstration. Ideally, the secret key and salt size should be stored externally.
After applying the ARC4 encryption, the ClarityCred class wraps base64 encoding around it to obfuscate the data further.
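For orientation, the following is a rough sketch of the ClarityCred idea, not the exact cred.py attached to this article. It assumes Python 2 with pycrypto; the secret key and salt size are placeholders, and the output format is base64(salt)$base64(ciphertext), matching the examples shown further down:

import base64
import os

from Crypto.Cipher import ARC4

SECRET_KEY = 'replace-with-a-long-random-ascii-key'  # placeholder; ideally stored externally
SALT_SIZE = 8                                        # placeholder

class ClarityCred(object):

    @staticmethod
    def encrypt(plaintext):
        salt = os.urandom(SALT_SIZE)
        cipher = ARC4.new(SECRET_KEY + salt)         # salt the key per value
        token = cipher.encrypt(plaintext)
        # store the salt alongside the ciphertext, both base64-encoded
        return base64.b64encode(salt) + '$' + base64.b64encode(token)

    @staticmethod
    def decrypt(value):
        salt_b64, token_b64 = value.split('$', 1)
        cipher = ARC4.new(SECRET_KEY + base64.b64decode(salt_b64))
        return cipher.decrypt(base64.b64decode(token_b64))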
Assume that you need to store a username, password, and hostname inside auth_tokens.py, and that this information is currently stored in plain text in another file called auth_tokens_plain.py. The usage is as follows.
Open a Python console, and import ClarityCred from cred.py.
Call the ClarityCred.encrypt() static function on the plain text username, password, and hostname strings.
Copy-paste these values into auth_tokens.py.
The following console session illustrates steps 1 and 2, using an existing auth_tokens_plain.py file:
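>>> from cred import ClarityCred
>>> import auth_tokens_plain
>>> ClarityCred.encrypt( auth_tokens_plain.username )
'zq1AwnqIkfA=$YFY1...'    # copy the full value into auth_tokens.py
>>> ClarityCred.encrypt( auth_tokens_plain.password )
'9qW5BftGyXY=$6GL1...'
>>> ClarityCred.encrypt( auth_tokens_plain.hostname )
'Q+oyq2m9Nv8=$rhge...'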
The old auth_tokens_plain.py looked like this:
username = 'testuser'
password = 'testpass'
hostname = 'https://encryptiontest.claritylims.com'
The new auth_tokens.py looks like this:
username = 'zq1AwnqIkfA=$YFY1UuO1r6edu7qPnN9/l3kMI15ZG1JAsH7IhnxnNvYulMndhYh6lxjVBfFwjN9sZEqPM0Qlx6kjq3fbht/FlRrgklDL79H7NiUP6uYM2qVltPloRA4g8SiphF3KHx4gVTE93Ku58sFCgu1rnH5u6tkCz98v0R7PsuIOW1CDMi9zSToIu+IkcYDPPYcD1b4z8ojez/7lczunaDfrmPhwopyyUiETu9BR49Bwp5fz4XSWICZFGCd9AjoEg/FTE+/X18f+0pIz0viXQyN+JjE3vJkpNsRY2Z3d72sPgQmFFZhd48m+POUtD1UXLXhaijdxp78QTcEp7AHY+TiM8hsXT7BX1Q=='
password = '9qW5BftGyXY=$6GL1t/Zl1CbSmB7Qq54uf2TJ5fI8GUlW9NdBnumkTtF/X27WLEsr1+C0ilXQX6jnLm4kzR+5pCVgnz4xz6/80/dMLMlTll6tOvCJgPU4ZkRpkUYmcPVbrp+X3azR7I024O8UjV/JeJYV869h3kvdPyWJGXRH4oJgs5NTJKI2y6URBs0wlrlgBuZ2YkO855ZGPw9J07UMM606q9xERRzQ+LT1XLRzSCuFnuSoDVEhshhYqZ/jpYWDHvA6Z5+YTYI/i099iYZ+WQdJAiU9hcgkUnWCybjcwivvHG6vAIROroLqlOefo+hrJsVFBA3uDaPS8pkgMVsKMPUGeft6vx4NgN/jaw=='
hostname = 'Q+oyq2m9Nv8=$rhgeJOMdm/M+dDNlSbBA3RCsUoo0Ts65G7lePvuajRmsLSNC5Qo5bwagRuyat0ztpeZrUmD8xTxTvhUBvZYDlM6GBLsq5drBP6PFh/lplxb6O8YiSRXrboFov8tRnu6GbaTfGR8WV7s8vBZsXhrhlPn67p7yalJLnHWb9VOKhx8AgCTtytQkkEwmpm2vbDwDha9kMdK63IrOSp2jmRaI/9X3xsd4upqaxvX7zrEJ8ruGU/szN0ITxTK1rprnowpyXfBRiOEcrI7uh1bg73oqOETn3pB/uTrGkhGETKYB2aHaewwWMccbeZTgEPT0kDmuJdpoGYy+p+gxSoR9Arh3JtREIA=='
Examples of the plain-text auth_tokens_plain.py and encrypted auth_tokens.py are attached at the bottom of this article.
Now that the new auth_tokens.py is ready to use, you can import it and create the corresponding PYC file to provide that extra level of security, as previously discussed. You can remove the PY file and ship the PYC file everywhere it is required.
It may also be a good idea to restrict the read/write/execute permissions on the file to the system user that is calling the file (usually glsai in Clarity LIMS installations).
To use the values in this file in code, use the decrypt() function in ClarityCred. Consider the following simple example of initializing a glsapiutil api object, run from a working directory in which the .py source files have been removed wherever possible, leaving only the compiled .pyc files.
Using a Python console, the normal api invocation (using a plain-text auth_tokens file) would look as follows.
import glsapiutil
import auth_tokens_plain

api = glsapiutil.glsapiutil2()
api.setHostname( auth_tokens_plain.hostname )
api.setVersion( 'v2' )
api.setup( auth_tokens_plain.username, auth_tokens_plain.password )
Now, however, with the encrypted tokens, the values are decrypted on the fly:
import glsapiutil
import auth_tokens
from cred import ClarityCred

api = glsapiutil.glsapiutil2()
api.setHostname( ClarityCred.decrypt( auth_tokens.hostname ) )
api.setVersion( 'v2' )
api.setup( ClarityCred.decrypt( auth_tokens.username ), ClarityCred.decrypt( auth_tokens.password ) )
This method provides a relatively robust solution for encrypting and obfuscating sensitive data and can be used in any Python context, not just for Clarity LIMS API initialization. By further ensuring that only the auth_tokens.pyc file is shipped and copied with restricted read/write/execute permissions, this method should help satisfy IT security requirements.
However, the matter of storing the secret key externally remains. One idea is to store the secret key in a separate file and encrypt that file using openssl or an OpenPGP key. While the problem of storing each piece of information in encrypted format likely never fully goes away, the use of multiple methods of encryption can offer better protection and peace of mind.
auth_tokens.py:
auth_tokens_plain.py:
Automations (formerly referred to as EPP triggers or automation actions) allow lab scientists to invoke scripts as part of their workflow. These scripts must successfully complete for the lab scientist to proceed to the next step of the workflow.
EPP automation/support is compatible with API v2 r21 and later.
The API documentation includes the terms External Program Integration Plug-in (EPP) and EPP node.
As of Clarity LIMS v5.0, these terms are deprecated.
EPP has been replaced with automation.
EPP node is referred to as the Automation Worker or Automation Worker node. These components are used to trigger and run scripts, typically after lab activities are recorded in the LIMS.
Automations have various uses, including the following:
Workflow enforcement—Makes sure that samples only enter valid protocol steps.
Business logic enforcement—Validates that samples are approved by accounting before work is done on them. This automation can also make sure that selected samples are worked on together.
Automatic file generation—Automates the creation of driver files, sample sheets, or other files specific to your protocol and instrumentation.
Notification—Notifies external systems of lab progress. For example, you can notify Accounting of completed projects so that they can then bill for services rendered.
You can enable automations on master steps in two configuration areas of Clarity LIMS:
On the Automations tab, when adding/configuring an automation. See the Adding and Configuring Automations article in the Automations section of the Clarity LIMS documentation.
On the Lab Work tab, on the master step configuration form. See the Adding & Configuring Master Steps and Steps article in the Steps and Master Steps section of the Clarity LIMS documentation.
After it is enabled on a master step, the automation becomes available for use on all steps derived from that master step.
You can configure the automation trigger on the master step, or on the steps derived from that master step.
If more than one script is triggered by a single user action, the scripts are executed and reported in sequence. This continues until all scripts complete, or one of them fails.
An example scenario would be a step that is configured to execute the following:
One script upon exit of the Placement screen.
A second script upon entry of the Record Details screen.
In this scenario, when the lab scientist advances their protocol step from the Placement screen to the Record Details screen, the scripts are executed in sequence.
The parameter string/automation name configured on the master step is displayed in a progress message. You can use this feature by giving your parameter strings/automations meaningful names that provide you with context about what the script is doing. The following is an example of a progress message.
[Image: In_Progress.png]
You cannot proceed until the script completes successfully.
You can request to cancel a script that is not responsive. While canceling abandons the monitoring of script execution, it does not stop the execution of the script.
After canceling a script, follow up with the Clarity LIMS administrator to determine if the AI node/automation worker must be restarted.
The scientific programmers in your facility can provide you with a message upon successful execution of a script. There are two possible non-fatal messages: OK and WARNING. These messages can be set using the step program status REST API endpoint.
Message boxes display the script name, followed by a message that is set by the script using the step program status REST API endpoint. Line breaks are permitted in the custom message. The following is an example of a success message:
After you select OK, you are permitted to proceed in the workflow.
When a script fails, a message box displays. There are two ways to produce fatal messages:
By using the step program status REST API endpoint (setting FAILURE as the status)
By generating output to the console and returning a non-zero exit code.
For example, when beginning a step, if the script does not allow the samples to be worked on together in Ice Bucket, the samples are returned to the Ice Bucket after you acknowledge the error message. In this case, the step is prevented from being tracked. The following is an example of a failure message:
If you attempt to advance a step from the Pooling screen, but an error is detected, the error state prevents you from continuing. The following is an example of this type of message:
After you select OK, you are prevented from proceeding in the workflow. Instead, you must return to the Pooling screen and address the problem before proceeding.
REST and automation are the key interfaces for scripting. A language-agnostic application programming interface (API) is important to scientists as it allows for broad and diverse integration. Together, REST and automation provide powerful and easy-to-use scripting. However, you first need to understand the conceptual structure and design of these interfaces.
Within the Clarity LIMS Rapid Scripting API, REST technology is used to provide data specifically structured for life science research.
The API documentation includes the terms External Program Integration Plug-in (EPP) and EPP node.
As of BaseSpace Clarity LIMS v5.0, these terms are deprecated. The term EPP has been replaced with automation. EPP node is referred to as the Automation Worker or Automation Worker node. These components are used to trigger and run scripts, typically after lab activities are recorded in the LIMS.
NOTE: If you are new to the REST Web Service, we recommend that you read Development Prerequisites and REST Web Services.
The REST Web Service is the fundamental data access interface using XML over HTTP. It is agnostic to programming languages as most languages support HTTP and XML with libraries or built-in methods.
In life science research labs, tracking samples and the data associated with biology, research, and lab work is complex. The REST resources return information that is human-readable and interpretable. Use a web browser to explore the XML returned.
REST represents real laboratory items and activities in self-contained groups of data called resources. It provides access to recorded lab steps and to sample test results, and it provides this access using resources. For example:
The process and steps resources track the steps in the lab in terms of who did what and when.
The sample and artifact resources contain information on the submitted sample and test results on sample derivatives (also referred to as derived samples).
The REST resources and their relationships are explained in Structure of REST Resources.
The full details of each resource are described in the API Portal.
Requests are made to the API by sending XML messages:
POST is used to create an item.
GET is used to read an item.
PUT is used to update an item.
DELETE is used to delete an item.
Note: HEAD requests are not supported.
The full URL to which requests are sent varies depending on the specific installation, but generally follows this format:

https://<your_server_name>/api/v2/<resource>
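For example, a minimal Python sketch of a read (GET) request, assuming the requests library and placeholder server name and credentials:

import requests

BASE = 'https://<your_server_name>/api/v2'

# GET is used to read: list the submitted samples
response = requests.get(BASE + '/samples', auth=('apiuser', 'password'))
response.raise_for_status()
print(response.text)    # XML list of sample links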
Automation / EPP is used to trigger scripts from within the Clarity LIMS interface.
Script-triggering is often used because the data collected needs to be dispatched for further processing. Automating data processing and returning information, in the appropriate format, to the lab for immediate use increases efficiency and quality.
File handling and file management are fundamental elements in life science scripting. When triggered, scripts can issue a command, transfer files for processing, and collect and transfer files back to the server. To enable triggering of scripts in any programming language, the information and files are provided for batch processing at the operating system command line level.
As of Clarity LIMS v5, the Operations Interface Java client, which was used by administrators to configure processes, consumables, user-defined fields, and users, has been deprecated. All configuration and administration tasks are now executed in the Clarity LIMS web interface.
To use automation, administrators complete the following steps:
In Clarity LIMS, create and configure master steps.
Configure automations that trigger scripts. Enable those automations on the master steps.
Use the configured master steps as building blocks to create and configure steps to be run by lab scientists.
The REST API methods attempt to return appropriate HTTP status codes for every request. To use the REST API effectively, a good understanding of HTTP and status codes is required. A complete list of HTTP status codes and definitions can be found at the following website: HTTP/1.1 Status Code Definitions
The primary status codes used by the REST API are as follows:
200 OK: Success.
201 Created: A resource was successfully created.
400 Bad Request: Invalid data was supplied for the relevant resource type.
401 Unauthorized: The requested resource cannot be loaded until valid logon credentials have been entered. If this error is received after logon credentials have been entered, this indicates that the credentials are not valid.
403 Forbidden: Access to the requested resource has been denied. (Make sure that the authorized user has administrative privileges.)
404 Not Found: The URI requested is invalid or the resource requested does not exist.
413 Request Entity Too Large: The request is larger than the server is willing or able to process.
500 Internal Server Error: A generic error message, given when there is no suitable specific message.
Error messages are returned as exception elements with a message element containing a user-facing error message.
The exception may also include a suggested-actions element with more detail on how to resolve the error.
User-facing XML error messages are not returned for 401 and 403 errors. In these cases, the HTTP error must be resolved.
Together, REST and External Program Integration Plug-ins (EPP)/automation provide powerful and simple-to-use scripting. Before working with the REST API, understand the conceptual structure and design of these interfaces.
The links below provide overview information to help you get started, a self-training Cookbook guide with example scripts, and videos that supplement the API training materials.
Reagent labels are artifact resource elements and can be applied using a PUT. To apply a reagent label to an artifact using REST, the following steps are required:
GET the artifact representation.
Insert a reagent-label element with the intended label name.
PUT the modified artifact representation back.
You can apply the reagent label to the original analyte (sample) artifact or to a downstream sample or result file.
Before you follow the example, make sure that you have the following items:
Reagent types that are configured in Clarity LIMS and are named index 1 through index 6.
Reagents of type index 1 through index 6 that have been added to Clarity LIMS.
A compatible version of API (v2 r14 to v2 r24).
In this example, you start with the artifact XML returned by the GET request, insert the reagent-label element, and submit the modified XML back to the API with a PUT, as sketched below.
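A minimal Python sketch of the three steps, assuming the requests library and ElementTree, with placeholder server name, credentials, and artifact LIMS ID:

import requests
import xml.etree.ElementTree as ET

uri = 'https://<your_server_name>/api/v2/artifacts/<limsid>'
auth = ('apiuser', 'password')

# 1. GET the artifact representation
root = ET.fromstring(requests.get(uri, auth=auth).text)

# 2. Insert a reagent-label element with the intended label name
ET.SubElement(root, 'reagent-label', {'name': 'index 1'})

# 3. PUT the modified artifact representation back
requests.put(uri, data=ET.tostring(root), auth=auth,
             headers={'Content-Type': 'application/xml'})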
Although it is not mandatory, it is recommended that you name reagent labels after reagent types using the Index special type. This allows you to relate the reagent label back to its sequence.
The Clarity LIMS Rapid Scripting™ API provides scientific programmers with self-descriptive, yet flexible, data access. It uses a RESTful model for data access because this model is well suited to these requirements. This article provides a high-level introduction to REST concepts and technologies.
Representational State Transfer (REST) is a style of software architecture for distributed information retrieval systems, most commonly observed by using the web.
REST governs proper behavior. It is not a methodology or a design principle, but rather a set of rules to which a system should conform.
REST allows a uniform interface between clients and servers that is simple and decoupled, enabling each system to evolve independently.
REST is referred to as stateless because each new API request contains all the information required to complete it, without relying on previous requests. Conforming to these REST principles is referred to as being RESTful.
REST was developed in parallel with HTTP and makes use of this protocol. It is an elegant way to programmatically access resources over HTTP. It is very flexible because you can use it with any language or tool that supports HTTP.
The web is probably the largest known RESTful system. Its behavior is very simple:
When you click a link in a web browser, your system requests information by sending a GET request to the specified URL. This URL is a resource.
The server that hosts the URL responds, typically with one of two things:
If the page exists, the server sends the browser an HTTP 200 response code and the contents of the page.
If the page does not exist, the server sends an HTTP 404 response code and an error message indicating that the page cannot be found.
Many software development groups use RESTful APIs. Google, Yahoo, and many public web sites use the RESTful model for information access.
The REST API allows you to retrieve and update information using HTTP operations. This ability provides some flexibility in how to communicate with the system.
While REST requests and responses can be in a variety of formats, we chose XML. Each resource and XML element is detailed in the API Portal.
To use the REST API, sign in using HTTP BASIC authentication. The method used to authenticate will depend on how you use the API:
When using a browser to retrieve information from the API, sign in to the browser with a user name and password. When signing in using a browser, the session remains open until the browser is closed.
When using an HTTP request tool to retrieve, add, update, or remove data using the API, the tool asks for a user name and password each time you submit a request to the system.
When using a script to communicate with the API, the script must first authenticate with the API. The session remains open for as long as the script is actively communicating with the system.
The account you use to sign in to the API must have System Administrator or Facility Administrator privileges.
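For scripts, a session-based sketch with the requests library (placeholder server name and credentials) looks like this:

import requests

session = requests.Session()
session.auth = ('apiuser', 'password')   # account needs admin privileges

# each request re-sends the BASIC credentials automatically
r = session.get('https://<your_server_name>/api')
print(r.status_code)   # 200 with valid credentials, 401 otherwise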
The API allows self-discovery of an object. When you request information about an object, the system typically returns URIs to its children and, sometimes, its parent. Use one URI to find the next URI in a hierarchy.
When viewing XML in a browser, tools can automatically create links from the URIs returned by the system. Examples of such tools are the Firefox Text Link or Linkificator add-ons. This way, you can select URIs to browse through the API.
Requests are made to the API by sending XML in HTTP calls:
GET is used to read an item.
POST is used to create an item.
DELETE is used to delete an item.
PUT is used to update an item.
In its simplest form, the API can be explored by entering a URI in a browser and reading the returned content. When using this method, a GET request is issued to the API for a specified object (referred to as a resource). The request returns XML containing the metadata about that resource. See the following section for details.
If you want to add, update, and delete small amounts of data using the API, use an HTTP request tool, such as the Firefox RESTClient add-on.
When working with REST, there are references to resources and namespaces.
For references:
A RESTful API groups related information into resources, each of which is referenced with a global identifier (URI).
In the API, for example, every sample in the LIMS has a resource for its information. When scripting, the resource is created or updated with POST and PUT HTTP calls. There are two types of resources: single and list types.
A list resource is used to access a collection of single resources (such as a listing of all samples).
The single resource type is used to access details on just one resource (a sample, for example).
It's important to understand how the information in the LIMS has been grouped and structured into resources. To learn more, see Structure of REST Resources.
For namespaces:
An API that uses XML relies on namespaces. In XML, namespaces define the vocabulary of elements and attributes in an XML document. Each REST resource references the XML structure defined by a particular namespace.
When scripting, we use namespaces to look up specific details related to the XML data elements, attributes, and formats that represent a resource. Namespaces also order the subelements of the XML document.
In the current revisions of the API, the PUT and POST methods read the subelements of the XML independent of order, but the namespace still defines the order of the XML provided in GET calls.
For sophisticated write operations and automation of work, you must use a script to communicate with the API. The Cookbook contains examples that demonstrate how to use scripts to perform your work.
The internal Clarity LIMS API (eg, https://example.claritylims.com/clarity/api) is the API used to deliver the Clarity LIMS web interface. This interface is not typically meant for public consumption. However, some customers use it for troubleshooting and to mitigate system issues.
As of Clarity LIMS v5.1, access to the internal Clarity LIMS API changed to enhance security and prevent Cross Site Request Forgery (CSRF) attacks. Two new HTTP headers must now be present when issuing PUT, POST, DELETE, and PATCH requests:
Origin—This header must be set to the scheme and authority of the server being accessed (eg, https://example.claritylims.com).
X-Requested-With—This header must be set to XMLHttpRequest.
The attached cURL, Python, and Java examples demonstrate how to authenticate and issue internal API requests. These examples assume a Clarity LIMS server at https://example.claritylims.com.
csrf headers.sh:
csrf headers.py:
csrf headers.java:
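For orientation (this is an illustrative sketch, not the attached csrf headers.py), a Python request with the required headers might look as follows, assuming the requests library and placeholder resource, payload, and credentials:

import requests

headers = {
    'Origin': 'https://example.claritylims.com',
    'X-Requested-With': 'XMLHttpRequest',
}
payload = '<xml-payload-goes-here/>'   # placeholder

r = requests.post('https://example.claritylims.com/clarity/api/<resource>',
                  data=payload, headers=headers,
                  auth=('apiuser', 'password'))
print(r.status_code)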
The REST API returns at most 500 results per request. Because of this limit, certain resources provide the start-index parameter and the previous-page and next-page elements for working with large amounts of data.
For example, the following request is submitted to the API:
The response looks like this:
Note the presence of the previous-page and next-page URIs, which allow moving within the pages of results.
Use the start-index parameter to view results from a specified point in a list. The first record in a list is index 0, and you can use values that are positive, whole numbers. If the value specified is greater than the number of results available from a resource, the system returns an empty list.
By default, the REST API only returns 500 results per request. To change the default number, contact the Illumina Support Team.
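A sketch of walking a paginated list in Python by following the next-page URIs, assuming the requests library and placeholder server name and credentials (the element lookup ignores namespaces, since each list resource uses its own):

import requests
import xml.etree.ElementTree as ET

auth = ('apiuser', 'password')
uri = 'https://<your_server_name>/api/v2/artifacts'

while uri:
    root = ET.fromstring(requests.get(uri, auth=auth).text)
    # ...process the entries on this page...
    next_page = next((e for e in root if e.tag.endswith('next-page')), None)
    uri = next_page.get('uri') if next_page is not None else None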
To remove a UDF or UDT value, submit a PUT request with the desired UDF or UDT omitted from the XML.
When submitting a PUT, it is critical to update all information. The submitted XML must include all the current UDFs and UDTs for the resource. If the field and type elements for a UDF and UDT are not included, the system removes those fields and types.
To update the UDF information for an item, the PUT request can add new UDF values and update or remove current UDF values. When working with UDTs, replace the current UDT with another UDT, or add or remove fields within the current UDT.
Update all UDFs and UDTs in a PUT
Even if the current user-defined values are not changing, include the current UDF and UDT values in the XML representation for a PUT request.
Data formatting of the UDF and UDT values is important when filtering a resource list with a query parameter. When using UDF or UDT values as a query parameter, all nonalphanumeric characters must be URL encoded.
UDFs and UDTs are presented as fields and types in the XML. The following example shows a representation of fields and types returned by a GET request for a sample:
XML resource representations do not render UDF values in the same format as the views in the client user interface. The table later in this section compares the values shown in the client user interface with the XML from an HTTP GET.
The differences are intentional, to remove ambiguity and aid script writers when handling the data values.
The following table compares the values displayed by the user interface and the API for UDF data types.
Trailing zeros are removed to support both integer and real numerics. Determine the significant digits by looking up the display-precision element of the /configuration/udfs/{udfid} resource.
When Date UDFs are expanded on the EPP command line, their format differs from the one used in the Clarity LIMS GUI and the REST API.
Clarity LIMS software is a powerful laboratory information management system (LIMS) designed to optimize genomics sample and workflow management. It enables labs to track samples, streamline complex tasks, generate sample sheets, and identify poor-quality samples before they reach the sequencing system.
Saves time and minimizes errors in sample handling through an automated workflow.
Out-of-the-box integration with Illumina instruments. Accelerate adoption of Illumina NGS and array protocols with preconfigured workflows that require no coding experience.
Designed with compliance features including data entry validation, workflow enforcement, audit trails, electronic signatures and role-based permissions.
Easily collect and share data in real-time with external clients via LabLink. Collaborate on sample submission, status, and results delivery in a single, secure environment.
Scales with laboratory needs, accommodating third-party instruments and software through a robust RESTful Application Programming Interface (API).
Flexible deployment options with cloud and local implementations supported.
Configured Data Type | API XML Response Element Type Name | Client Display | API Element Type Format
---|---|---|---
Single-line Text | String | Leading and trailing spaces rendered | Leading and trailing spaces rendered
Multi-line Text | Text | Leading and trailing spaces rendered | Leading and trailing spaces rendered
Numeric | Numeric | Set by display-precision (eg, 4.5300 is displayed when display-precision equals 4) | Simplest numeric form, with trailing zeros removed (eg, 4.53)
Date | Date | mmmm dd, yyyy (eg, Feb 15, 2019) | yyyy-mm-dd (eg, 2019-02-15)
The following table shows how API terminology maps to terminology used in the Clarity LIMS v5.x interface.
API Terminology | Clarity LIMS Terminology | Notes
---|---|---
Analyte | Derived sample | Not applicable
Artifact | Not applicable | An item that is input to or generated by a step. Derived samples and measurements are both artifacts.
Lab | Account | Accounts are not fully supported in the Clarity LIMS v5.x web interface. However, lab is supported in the API.
Process | Step | step and process both exist in the API. While related, they are not synonyms and have different uses.
Process type | Master step | Not applicable
Researcher | Client or user | Not applicable
Resultfile | Measurement or file / file placeholder | A file could be a log file that is shared across all samples in the step or a file that belongs to a single sample, such as an Electropherogram.
Sample | Submitted sample | The original sample submitted to the system.
UDF | Custom field | User-defined types (UDTs) are not supported in the Clarity LIMS v5.x web interface. However, udt is supported in the API.
See also the Terms and Definitions section.
When generating XML, explicitly set the document to UTF-8 character encoding.
If other encoding methods are used (eg, MacRoman on OS X), special characters such as μg/ml are stored incorrectly, which can cause data integrity issues.
In Groovy, set the encoding attribute on the StreamingMarkupBuilder object as shown in the following example:
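// Minimal sketch: declare UTF-8 on the builder before generating XML
def builder = new groovy.xml.StreamingMarkupBuilder()
builder.encoding = 'UTF-8'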
When submitting a request to the REST API, specify the version of the API being used. The version number is a path parameter in each resource URI. The desired API version is substituted into the request URI as follows:

https://<your_server_name>/api/v2/<resource>
Changes to the API are tracked with a version number (version major) and a revision number (version minor).
The version number indicates forwards and backwards compatibility.
The revision number within the version describes features added to the API that will not negatively affect current functionality.
Only the version number is referenced as part of the request. The revision number simply tracks incremental enhancements to the API.
When a new version of the API is released, update the scripts and code as soon as possible.
To find out what version of the API is available from a given server, submit a GET request to the base API URI. For example, in a web browser, browse to:

https://<your_server_name>/api
The system will return the version:
The API was originally intended for internal use or for just a few customers. In those early days, API versioning was different. If working with legacy scripts, this older functionality can be maintained. For example, if scripts were written before v2 and have nondefault system configuration properties for api.prefix and api.rewrite on the server, the …/api/ URI lists the resources and does not provide version information.
The artifacts resource includes a parent-process element that provides a URI to the process that created an artifact.
To facilitate walking up the genealogy, the processes resource exposes the parent process for an input artifact in the input-output-map of a process:
The parent-process element does not display if the parent process is not supported by the API.
The processes resource supports an inputartifactlimsid query parameter. This parameter limits the list of processes to those processes with one of the specified artifacts as an input.
Start with the initial sample.
The processes that used the sample's artifact as an input can be queried as follows:

/api/v2/processes?inputartifactlimsid=<artifact limsid>
The process contains an input-output-map for the input artifact.
The steps can then be repeated using the LIMS ID of each output artifact that is associated with the input artifact.
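A sketch of one hop of this walk in Python, assuming the requests library and placeholder server name and credentials:

import requests
import xml.etree.ElementTree as ET

BASE = 'https://<your_server_name>/api/v2'
auth = ('apiuser', 'password')

def processes_with_input(artifact_limsid):
    """Return the URIs of processes that used the artifact as an input."""
    xml = requests.get(BASE + '/processes', auth=auth,
                       params={'inputartifactlimsid': artifact_limsid}).text
    root = ET.fromstring(xml)
    return [e.get('uri') for e in root if e.tag.endswith('process')]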
Understanding lab information management in a scientific context is one of the more powerful skills in genomics research today. The Clarity LIMS Rapid Scripting™ API is designed to use these skills, allowing a knowledgeable scientific programmer to adapt lab informatics with scripts and automation.
NOTE: Based on experience working with bioinformaticians and scientific programmers, assumptions about your background, setup, and skills have been made.
Before using the API Cookbook, set up a nonproduction scripting sandbox server.
If any of the topics covered on this page are a concern, contact the Illumina Support team for additional training or custom scripting services.
Within the Cookbook, the term scripting refers to programs running independently of the client and server that direct the input and output of information. Use scripts and the API for file handling and text processing in the context of biological samples, containers, and instruments.
This API Cookbook assumes that you can program in modern computer languages, and are comfortable with scripting and bioinformatics.
The topics are best understood by those users who can program small applications and are experienced with experimental processes in molecular biology.
The topics assume that you have received administrator-level training or know how to configure the system. They also assume that a nonproduction server is set up on which to experiment with Cookbook examples, develop real scripts, and test them before deploying to production.
Be comfortable with the following skills:
XML
System file handling
General-purpose scripting languages
Working on the command line
Illumina provides multiple server licenses for API users: a production server license and one or more non-production server licenses for developing and testing.
To allow developers to design, build, test, and upgrade efficiently, it is recommended to install at least two servers. Installing three is even better.
The non-production server licenses serve the following purposes:
To provide a sandbox in which to experiment with the API and the system configuration.
To provide a verification platform for upgrading scripts, software components, and overall system integration before deploying to production.
All the examples in the Cookbook are intended to be used with the nonproduction scripting sandbox server. See Useful Tools.
If you do not have the time or resources to use the API, but are interested in expanding your implementation, contact the Illumina Support team. There are various consulting, training, and scripting services available.
Clarity LIMS version 4.0 introduced architectural changes that enforce SSL-based security. As a result, the structure of the URIs that reference the Clarity LIMS API was modified, and scripts written before Clarity LIMS v4.0 may require updating.
Scripts that use the API do so by using RESTful methods on specific URIs. The base portion of the URI references the server on which the Clarity LIMS application is running.
Before Clarity LIMS v4.0 the base portion of the URI took the following form:
http[s]://<your_server_name>:<your_port_number>/api
Where:
The protocol could either be HTTP or HTTPS, depending on whether the application was SSL-enabled or not.
<your_server_name> represented the fully qualified domain name (or IP address) relating to the server on which the Clarity LIMS application was running.
<your_port_number> represented the port number (typically 8080) on which the Clarity LIMS application was listening.
In Clarity LIMS v4.0 and later, the base portion of the URI is in the following form:
https://<your_server_name>/api
Where:
The protocol must be HTTPS, because the Clarity LIMS application is now installed with SSL enabled.
The server name must match the certificate that was purchased and installed into Clarity LIMS.
The port number (and the colon) is no longer required. Do not provide it.
The following information should help determine if updates to the scripts are needed.
Scripts generally determine the API URI in one of the following ways:
The URI is passed to the script by the automation or External Program Plugin (EPP) component, as a parameter or command-line argument.
The URI is passed to the script by another script or a command line embedded in a crontab file.
The script contains the URI as a hard-coded string literal.
The script determines the fully qualified domain name of the server and adds the prefix (http://) and suffix (:8080) accordingly.
The script imports, or includes a file that contains, the URI.
Most scripts use methods one or three. However, other methods may be used in the facility.
If method one is used, it is not necessary to update the scripts because Clarity LIMS passes in the new form of the URI.
If other methods are used, you likely need to update the scripts to convert the URI to the new format. Often, a search and replace tool is able to make these changes.
To make sure that the correct locations are searched, keep in mind that scripts are often stored in the following locations:
In the /opt/gls/clarity/customextensions folder (and subfolders) on the server where Clarity LIMS is running. This location is the domain of the default Automation Worker (AW)/Automated Informatics (AI) node, which listens on the channel name of limsserver.
If there are additional AW/AI nodes on the server, in the folders used by these nodes.
If there are additional AW/AI nodes external to the Clarity LIMS server, within the folders used by these nodes.
If scripts are launched by cron, or other mechanisms, they could be stored anywhere and may not even be on the Clarity LIMS server itself.
For points 1–3, query Clarity LIMS (via either the API or the database) to produce a listing of all the scripts it is configured to use. From this listing, you can determine the node on which each script runs and its location.
For point 4, there is no easy answer. Hopefully, if the script is important, the location has been documented.
As of Clarity LIMS v5.0, the terms External Program Integration Plug-in (EPP), EPP node, and AI node are deprecated.
The term EPP has been replaced with automation, while the Automated Informatics (AI) node is referred to as the Automation Worker (AW) node.
When submitting a GET request to certain REST API resources (also known as list resources), the system returns a list of records. For example, submitting a GET request to the samples resource returns a list of all submitted samples stored in the system. Depending on the resource being used, use various query parameters to filter the records based on certain criteria. For more information about the parameters that are available, refer to the reference documentation for the desired resource.
To filter a list, the resource and parameter must be separated with a question mark (?). The parameter and the value you want to base the query on must be separated with an equal sign (=).
When filtering a list of artifacts, combine parameters within the same query statement. You can also repeat certain parameters, specifying a new value with each occurrence of the parameter.
The first parameter must be preceded with a question mark (?). Add additional parameters by separating each parameter with an ampersand (&).
Repeating a parameter with new values:
Combining parameters:
When combining or repeating parameters, each record returned matches one of the parameter values, or all the parameter values, depending on the usage:
If the query statement contains multiple values for the same parameter, the ampersands are treated as an OR.
If the query statement contains values for multiple parameters, the ampersands are treated as an AND.
For example, if a project LIMS ID and a process type are provided as parameters, the system returns only the files that match both the project LIMS ID and the process type. To see the files that match the project LIMS ID or the process type, issue two separate GET requests and combine the results.
/api/v2/processes—This URI returns all processes run in the system.
/api/v2/processes?type=MALDI—This URI returns all MALDI processes run in the system.
/api/v2/processes?type=Sample Prep&type=MALDI—This URI returns all MALDI or Sample Prep processes run in the system.
/api/v2/containers—This URI returns all containers in the system.
/api/v2/containers?type=Tube—This URI returns all tubes in the system.
/api/v2/containers?type=Tube&name=27-111&name=27-112—This URI returns the tubes in the system that are named 27-111 OR 27-112.
Certain resources include the last-modified query parameter.
When used, the system displays only the results that have been modified since the specified last-modified date. The last-modified date is represented in ISO 8601 complete date format, including hours, minutes, and seconds: YYYY-MM-DDThh:mm:ssTZD.
Lists of records are often large, spanning multiple pages. Many list resources include parameters that are used to work with paginated results.
Certain resources include parameters that can be used to filter the results displayed based on UDF information that is associated with the results:
udf.UDFNAME[.OPERATOR]=UDFVALUE—This parameter filters the results based on a specified value for a specified UDF. Any item that contains the value for the UDF is returned, unless parameters include an optional operator filter ( [.OPERATOR] in the expression provided). The lowercase filter operators of min or max are described later.
udt.name=UDTNAME—This parameter filters the results based on a specified UDT. Any item that has the UDT selected is returned.
udt.UDTNAME.UDFNAME[.OPERATOR]=UDFVALUE—This parameter filters the results based on a specified value for a specified UDF that resides within a specified UDT. Any item that contains the value for the UDF is returned, unless parameters include an optional operator filter ( [.OPERATOR] in the expression provided). The lowercase filter operators of min or max are described later.
To filter results using UDF information, use the following query structure:

?udf.UDFNAME[.OPERATOR]=UDFVALUE

To filter results using the name of a UDT, use the following query structure:

?udt.name=UDTNAME

To filter results using UDF information that is part of a specific UDT, use the following query structure:

?udt.UDTNAME.UDFNAME[.OPERATOR]=UDFVALUE
When filtering lists and using date or numeric UDF or UDT values, use operators to restrict a query. The following operators are supported:
.min—This operator displays results that are greater than or equal to the specified value.
.max—This operator displays results that are less than or equal to the specified value.
Examples:
/api/v2/processes?type=Sample Prep&udt.name=Plasma—This URI returns all Sample Prep processes run in the system that have the Plasma UDT selected.
/api/v2/processes?type=Sample Prep&udt.Plasma.Platelet Count.min=50—This URI returns all Sample Prep processes that have a UDF named Platelet Count with a value of 50 or greater, within a UDT named Plasma.
/api/v2/processes?type=Sample Prep&udf.Sample=Serum&udf.Sample=Tissue—This URI returns all Sample Prep processes that have a UDF named Sample with a value of Serum OR a UDF named Sample with a value of Tissue.
For more examples of filtering, see the Cookbook.
When filtering with UDT or UDF parameters, all special characters in the parameter string must be URL encoded. The pipe ( | ) or the URL-encoded pipe ( %7C ) cannot be used.
When filtering on a UDF that is configured as a Multiline Text UDF, if a value contains a hard return, the value must include the URL-encoded line feed ( %0A ) at the appropriate location. Depending on how API requests are issued (via a browser or a script), spaces in names or values may require URL encoding, and trailing spaces in a name or value always require encoding. For example, for results to be returned for 'name ', it must be submitted as 'name%20'.
As of BaseSpace Clarity LIMS v5.0, several terms have been deprecated:
External Program Integration Plug-in (EPP) has been replaced with automation
EPP/AI node has been replaced with automation worker / AW node
Parameter has been replaced with token
User defined field (UDF) has been replaced with custom field
When a job is dispatched to the AI node/automation worker, the following steps occur:
A temporary working directory is created on the AI node / automation worker:
In AIInstallDirectory/temp/
With a unique name including the client process LIMS ID.
The command configured and selected as part of the step run in the LIMS is then sent to the AI node / automation worker, with any specified parameters / tokens replaced with actual values.
The command is executed on the AI node / automation worker, spawning step execution using the temporary working directory as the working directory context.
Script processing can use stdout, stderr, and return codes, following standard shell programming conventions.
When the script exits, the AI node/automation worker automatically retrieves any files with matching LIMS IDs from the temporary working directory. The files are attached to the appropriate output file placeholders.
The automation API infrastructure can be used alone or with the REST API infrastructure.
For example:
Simple scripts can use automation parameters/tokens and data files directly from the current working directory. They can write results back to the current working directory, associating them back to the relevant placeholders in Clarity LIMS.
More advanced scripts can also use the REST API infrastructure to retrieve additional required information and place relevant data back into UDFs/custom fields. Advanced scripts can also attach and associate data files to placeholders, which may be in different locations, while the script is still running.
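For example, a minimal sketch of a "simple" script, assuming the output-file LIMS ID arrives as a command-line token such as {outputFile0}:

import sys

output_limsid = sys.argv[1]               # value of the {outputFile0} token
# write the file into the temporary working directory; the automation
# worker matches the LIMS ID and attaches the file to its placeholder
with open('{0}.csv'.format(output_limsid), 'w') as handle:
    handle.write('sample,well\n')         # illustrative content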
When multiple users are working on multiple plates in high-throughput labs, programmers may find that the large number of HTTP method calls to the REST API can slow down their scripts.
To improve performance, Illumina has created the following batch resources:
artifacts.batch.retrieve
artifacts.batch.update
containers.batch.create
containers.batch.retrieve
containers.batch.update
files.batch.retrieve
files.batch.update
samples.batch.create
samples.batch.retrieve
samples.batch.update
Use the batch resources to access a group of artifacts or a group of containers using a single batch method call. Using these resources to iterate a list of items significantly improves script execution times.
Batch resources are best thought of as unordered collections or lists of items accessed. A POST to batch/create, batch/update or batch/retrieve, therefore, is a request to create, update, or retrieve those items. There is no guaranteed order to batch responses.
Batch resources are nonbreaking additions to the existing REST API. Updated scripts can still use their existing nonbatch methods.
For example, the resources may have URIs (Uniform Resource Identifiers) such as:

https://<your_server_name>/api/v2/artifacts/batch/retrieve
https://<your_server_name>/api/v2/containers/batch/create
Batch operations do not require sophisticated HTTP client or server methods. The only HTTP method for batch resources is POST.
To update a group of artifacts, use a POST operation to the /artifacts/batch/update resource. The XML input payload consists of a series of elements, as follows.
As large data transfers can affect performance, it is important to return concise XML in response to a batch resource request. Therefore, except for retrieve resources, the XML output payload consists of a list of created or updated URI links, such as the following:
The batch resources use common HTTP return codes:
An HTTP 200 (OK) code is returned when batch resources have been successfully created or updated.
An HTTP 400 error code is returned if the input payload details included incorrect, mixed, or duplicate URI links. For example, if the details of an artifacts.batch.update (list) request included a container resource.
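As an illustration, a batch retrieve is a single POST whose payload lists URI links. The following Python sketch assumes the requests library, an ri:links payload format, and placeholder server name, LIMS IDs, and credentials:

import requests

payload = (
    '<ri:links xmlns:ri="http://genologics.com/ri">'
    '<link uri="https://<your_server_name>/api/v2/artifacts/AB1" rel="artifacts"/>'
    '<link uri="https://<your_server_name>/api/v2/artifacts/AB2" rel="artifacts"/>'
    '</ri:links>'
)
r = requests.post('https://<your_server_name>/api/v2/artifacts/batch/retrieve',
                  data=payload, auth=('apiuser', 'password'),
                  headers={'Content-Type': 'application/xml'})
print(r.text)    # a details document containing the requested artifacts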
As of BaseSpace Clarity LIMS v5, the Operations Interface Java client, which is used by administrators to configure processes, consumables, user-defined fields, and users, has been deprecated. All configuration and administration tasks are now executed in the Clarity LIMS web interface.
In addition, several terms have been deprecated:
External Program Integration Plug-in (EPP) has been replaced with automation
EPP/Automated Informatics (AI) node has been replaced with automation worker / AW node
Parameter has been replaced with token
Use step automations to trigger a command-line call on a process/step or a file attachment event. The steps required differ depending on the LIMS version.
This article provides an overview of the steps required to configure automations and automation triggers. For detailed version-specific instructions, see the following documentation:
Clarity LIMS v6 reference guide > Configuration > Automations
On the main menu bar, click Configuration, and then click the Automation tab.
On the Automation configuration screen, on the Step Automation tab, add a new automation:
Name the automation.
Set the channel name.
Define the command line.
Enable the automation on the desired steps.
On the Master Step Settings or Step Settings screen of the related step, set the following:
Trigger Location—The stage at which the script is to be initiated (beginning of step, end of step, on entry to/exit from a screen, and so on).
Trigger Style—How the script is to be initiated (automatically or manually when the user selects a button in the interface).
To trigger scripts and third-party programs from the BaseSpace Clarity LIMS user interface, use command-line calls configured in step automations.
There are many ways to use automation in Clarity LIMS. Consider the following examples:
Automate sample tracking and enhance the information recorded.
Generate and attach specially-formatted text files.
Simplify data entry.
Automate the population or updating of data fields and data files.
Before using automation, become familiar with the topics discussed in this article and understand how automation functionality interacts with users, Clarity LIMS, and REST.
As of BaseSpace Clarity LIMS v5.0, several terms have been deprecated:
External Program Integration Plug-in (EPP) has been replaced with automation
Automated Informatics (AI) node has been replaced with Automation Worker (AW) node
Parameter has been replaced with token
Automation scripts are often used to automatically create and attach specially-formatted text files, such as the following:
Files containing sample lists (sometimes called instrument driver files). Users can import these driver files into control software, saving time and ensuring accurate sample processing.
Barcode or label files. These are specially-formatted files that can be supplied to barcode software systems to allow users to print out container labels.
Summary analysis results - for example, from alignment or molecular identification and quantification algorithms.
The two common applications of automation are:
To create files and attach them to process output placeholders.
To update data fields with information created during data analysis.
Scripts triggered by automation also use REST to update information directly within the REST resources.
Record lab work in Clarity LIMS by running steps on samples. These steps may be configured in Clarity LIMS by an administrator.
Most steps can be configured with an automation trigger that invokes an external script. The script may include fixed and variable information parameters/tokens on the command line.
NOTE: As of Clarity LIMS v5.0, the term command-line parameter has been replaced with token.
After an AI/AW node is installed, processes (in Clarity LIMS v4.2 and earlier) or steps (Clarity LIMS v5 and later) must be configured to call out to it.
This configuration is executed by the Clarity LIMS administrator:
In Clarity LIMS v4.2 and earlier, execute configuration in the Operations Interface process configuration dialog on the External Programs tab.
In Clarity LIMS v5 and later, execute configuration on the Automation configuration screen.
When configuring an automation, the following must be defined:
Name
Channel
Command line call
NOTE: Only a brief summary of automation configuration is provided here. This material should be familiar from the Clarity LIMS administration training.
Scripts or third-party programs are called using the operating system command line. They must meet the following requirements:
Be callable on the command line and, preferably, be able to read and respond to command-line parameters.
Be accessed by the user account running the automation, with appropriate permissions and disk locations.
Exit with appropriate exit codes, otherwise the automation may record the script completed with an error.
The following diagram illustrates what happens when a user runs a step in the LIMS.
The lab scientist tracks activities by running a step in Clarity LIMS. The step is configured to display a button that invokes the configured automation script.
NOTE: As of Clarity LIMS v5.0, the term parameter has been replaced with token.
The application server creates a new step, which is much like POSTing to the processes resource of the REST API. The server resolves any parameters/tokens found in the string and sends the resolved command-line string to the automation.
In this example, two of the most common parameters/tokens used when working with automation are discussed:
{processURI:version:scheme}—This parameter/token passes the REST API URI of the step that issued the command-line string.
{outputFileN}—This parameter/token passes the LIMS ID of the specified expected output file of the step that issued the command-line string. You can use 0 (zero) for the first file, 1 (one) for the second file, and so on.
For more information about the parameters/tokens available for use, refer to the articles in the following API documentation sections:
NOTE: As of API v1 r12, the version and scheme values for {processURI:version:scheme} are automatically populated, based on the REST version and protocol of the deployed server.
The automation program receives the command-line string from the application server. It may also receive other process(step)-related information, such as temporary files. The command-line string is executed by the operating system of the host computer.
The automation can work with any third-party program that supports command-line parameters/tokens. The program may simply create files or it may manipulate information directly via the REST API (steps 5 and 6).
Simple automation operation does not require the REST API at all. If a third-party program creates files that users would like brought back into Clarity LIMS, scripts should use the {outputFileN} parameter/token so that the program creates file names that are expected by the client. The files are placed in a temporary local working directory and automatically imported into the client. With this method, the automation automatically handles many of the tasks that would otherwise need to be scripted manually using the REST API processes, artifacts, files, and glsstorage resources.
For more complicated scenarios, you may want to use automation with the REST API. This situation is where the processURI parameter/token is used. A GET request on the URI of the step that issues the command-line string provides all the information recorded by the user. This information includes links to the analytes (samples) used as inputs. The script can then use other REST API resources to create or update information.
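For example, a script that receives the resolved value of {processURI:v2:http} can fetch the step information with a GET request. The following is a minimal sketch, assuming the Python requests library and a placeholder apiuser/apipassword account:

import sys

import requests
from xml.etree import ElementTree

# The resolved process URI arrives as the first command-line argument.
process_uri = sys.argv[1]
response = requests.get(process_uri, auth=("apiuser", "apipassword"))
response.raise_for_status()

# Each input-output-map entry in the process XML links an input artifact
# to an output placeholder.
process = ElementTree.fromstring(response.content)
for iomap in process.findall("input-output-map"):
    print("input artifact:", iomap.find("input").attrib["uri"])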
On completion, the third-party program exits (step 7). Standard shell exit codes apply: zero (0) equals successful completion.
On exit of the third-party program, the automation software updates the application server. If the system finds files with names that match the file placeholders produced by a process/step, the files are uploaded to the file server and attached to the appropriate placeholders.
A nonzero exit code sets a flag on the step, indicating that there is an error.
With updates complete, the application server sends refresh events to the Clarity LIMS client. The user sees that files have been uploaded.
Verifying and testing scripts is an important part of working with automation. Remember that there are three software components:
The server
The automation instance (calling scripts)
The script
The best way to debug scripts is to unit test each component separately. For example, a logical order in which to work on a script is as follows:
Define and test the REST calls required in a web browser.
Define the command-line parameters / tokens sent to the automation at step completion.
Test the script running just from the command line.
Test automation calls to the script at step completion by running the step from the LIMS interface.
Before the system can use the automation, a system administrator installs one or more automation workers / AI nodes within the lab network. The installation typically occurs on the server that contains the script program, or third-party application, to be integrated.
The installer program is contained in the Automated Informatics / Automation Worker software package.
The information recorded in BaseSpace Clarity LIMS is organized into resources within the REST API. Each resource refers to an XML schema associated with a namespace. Before working with the REST Web Service, understand how the information recorded in Clarity LIMS translates to the REST resources.
The following diagram highlights the major REST resources. Each resource is discussed further in the following sections.
Samples are the objects that are entered into the LIMS before processing begins. Every sample belongs to a single project and has a related analyte (sample) artifact. Every project must have an associated researcher.
When you add a sample to the system, it is classified as a submitted sample. This allows the original samples, and any related data, to remain separate and distinct, even as processing and aliquoting occurs. Every sample or file created by running a step from the LIMS user interface can be traced back to a submitted sample.
In Clarity LIMS, processes (known as steps in the user interface) are run on analyte (derived sample) artifacts. Samples must always be in containers.
Clarity LIMS v4.x and earlier: In the Clarity LIMS Operations Interface processes are run on analyte (sample) or result file artifacts. Samples must always be in containers.
As of BaseSpace Clarity LIMS v5, the Operations Interface Java client, used by administrators to configure processes, consumables, user-defined fields, and users, has been deprecated. All configuration and administration tasks are now executed in the Clarity LIMS web interface.
To understand how API terminology maps to terminology used in the Clarity LIMS v5 interface, see .
Within the REST Web Service, the samples resource is key.
The samples resource represents submitted samples and contains information about those samples, including:
The dates samples are entered and received.
Any user-defined data related to the samples.
While the artifact associated with a submitted sample is only seen at the database or REST level, and is never exposed in the LIMS interface, the system uses this artifact when running protocol steps.
When running a step on a submitted sample, the artifact is used as an input to the step, and not the submitted sample itself. All artifacts reside within the artifacts resource.
When a submitted sample is processed, the system generates output artifacts. Depending on the configuration of the process, many types of artifacts - including result files - can be generated. Any downstream sample created by running a process is considered an analyte artifact (referred to as a derived sample in the user interface).
Projects are used to group samples based on the originating lab (account) or study. A project collects all records related to its samples in the LIMS.
A project stores information about:
The client (researcher) who owns it
Significant dates
The status of the project
Any user-defined information that the lab needs to collect
After creating a project, you can add samples to it. Samples can then be added to workflows, and steps (processes) are run on those samples to reflect the analysis performed in the lab.
In the REST Web Service, a submitted sample can only belong to one project. You can use the projects resource to return projects.
Note the following details regarding projects:
Every submitted sample must belong to a project.
Every project must be assigned to a researcher (an owner) that corresponds to a client in the system.
NOTE: In the LIMS user interface, the term Contact has been replaced with Client. However, the API permission is still called contact.
The researchers resource represents clients in the system.
When working with projects, each project must list a client as the owner of the project. This role generally represents the person who submitted the original samples.
The client does not need to have a user account.
In Clarity LIMS v5, the API still uses the term process. However, in the user interface, this term has been replaced with master step. Also, the Operations Interface has been deprecated.
Clarity LIMS v5 and later—Created in the Clarity LIMS web interface, master steps model and track the work performed on the samples in the lab. These master steps are then used as building blocks to create and configure steps. These steps are known as processes in the API.
Different interfaces may allow you to run steps/processes on different artifacts.
In the API view, a process takes in one or many analytes and/or result files and creates one or many analytes and/or result files.
When running a step in Clarity LIMS, lab scientists record information about the step, the instruments used, and the properties and characteristics of the samples.
Depending on the configuration of the process/master step on which it is based, the step can generate another sample analyte and/or placeholders to which result files can be attached for storage in the system.
With the REST Web Service, the processes resource is used to track these activities.
Note the following details regarding processes:
Processes are used to represent work that occurs in the lab or in silico.
Processes take inputs and create outputs. With the REST processes resource, this is modeled using the input-output-map element.
In addition to tracking historical work via the processes resource in the REST Web Service, use the service to POST new processes to the system.
POSTing a process to the REST Web Service creates the process itself, along with the outputs of the process. However, all the input and output containers must exist in the system already.
For a simple example of the XML required to POST a process, see the processes (list) section of the REST resources space.
For basic details about POSTing processes, see Working with Processes/Steps in the Cookbook section.
For examples of process POSTing, see Pooling Samples with Reagent Labels and Demultiplexing in the Cookbook section.
To find out how to integrate automation with process POSTing to set quality control flags, see the Setting Quality Control Flags application example.
Query the processes resource using input artifact LIMS IDs. This query allows you to find the processes that were run at each step in the workflow or on each artifact generated during processing.
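For example, a GET of the following form returns the processes that used the given artifact as an input (the LIMS ID shown is a placeholder):

https://your_server/api/v2/processes?inputartifactlimsid=2-1001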
All inputs and outputs of a process are artifacts, and can be returned via the artifacts resource.
Note the following details about artifacts:
An artifact is a derivative of a sample and is used as an input to a process.
An artifact may be a sample analyte or a result file.
The artifacts resource includes artifacts for the submitted sample and all process outputs, both file- and sample-based.
Artifacts are categorized by type, to distinguish between pure information results (file-based artifacts, such as result files) and the biological material created by processing the sample (analyte artifacts).
In Clarity LIMS, the term artifact is used to describe items needing to be processed. Think about artifacts as the intellectual property added by the lab.
For example, applying reagents to change the nature of a sample creates an artifact, as does generating and analyzing data files by running a sample on a NextGen or microarray instrument.
Anything created by a process in the system is an artifact. In the REST Web Service, there are several types of artifacts, but this article focuses on two:
Computer-generated files called result files
Physical sample derivatives called analytes.
The high-level relationship between artifacts, analytes, and result files is shown in the following diagram.
An artifact references data elements, which vary depending on the type of artifact you are working with. For example, a result file has an attached-to URI that links to a files resource, whereas an analyte has a location URI that links to the containers resource.
Artifacts are key to tracking lab process activities and also link to a submitted sample.
All artifacts include one or more sample URI data elements, which make it easy to trace any lab-generated product or result directly back to its original sample.
When working with artifacts in the REST API, their URIs often include a numeric state. The state is used to track historical QC, volume, and concentration values.
Unless you are interested in a historical state, it is best practice not to include state when using an artifact URI. When state is omitted, the API defaults to the most recent state.
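For example, the state query can be stripped before issuing a GET (a minimal sketch):

from urllib.parse import urlparse, urlunparse

def strip_state(artifact_uri):
    # Drop the ?state=... query so the API returns the most recent state.
    return urlunparse(urlparse(artifact_uri)._replace(query=""))

print(strip_state("https://your_server/api/v2/artifacts/2-1001?state=1234"))
# -> https://your_server/api/v2/artifacts/2-1001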
When samples are processed in the lab, they are always placed into containers of some sort (tubes, 96-well plates, flow cells, etc.) and moved into new containers as processing occurs.
For many kinds of processing, the container placement is a critical piece of information. Further processing of the sample, and data files created by analyzing the sample, are often linked based on the placement of the sample in the container.
Containers are central to processing in the lab. In Clarity LIMS, therefore, the samples (analyte artifacts) must also always be placed into a container resource.
When working with the REST Web Service, analyte artifacts include a URI that links to the container housing the artifact. Use the containers resource to view all the containers registered in the system.
Note the following details about containers:
Containers represent the tubes, plates, flow cells, and other vessels that can be populated with a sample.
All samples/analytes must reside in a container or they will not be visible in the LIMS client.
All containers include a name and a LIMS ID.
The name is a text element over which the scientific programmer has full control.
The LIMS ID is a unique identifier generated by the system in a fixed format.
For assistance, the Illumina Consulting team can recommend various settings, such as uniqueness constraints, based on your requirements.
A lab produces various files: large scientific result data files, summary result files, image files, label files, equipment and robotic setup files, and software logs.
These files are stored in different locations and it can be challenging to manage the relationship between a file on a computer or hard disk and the sample, step, or project with which it is associated.
Clarity LIMS lets you store files related to a project or sample and files generated during a step in a workflow. These files can be imported in various locations within the client and are stored on the file server.
To model this feature within the REST Web Service, there are two resources:
files resource
glsstorage resource
Within the REST Web Service, files are represented by the files resource. This resource manages files and the resources or artifacts to which they are related, and stores information about:
The sample, project, or process output with which the file is associated, referenced by the attached-to URI.
Where the file was imported from, and its original name, referenced by the original-location URI.
The location of the file, referenced by the content-location URI. It also specifies the transfer protocol that can be used to retrieve the file. The following transfer protocols are supported:
FTP
SFTP
HTTP
If you are using REST to view a file that was added through the LIMS client, the content-location URI will reference a location on the file server. This location is where the system stores all files that are imported through the Clarity LIMS client.
If you are using REST to import a file into the system, do one of the following:
Store the file on the file server:
Use the glsstorage resource to create a unique storage location and file name on the file server.
After this step is complete, the system returns a location and file name using the content-location URI element.
Then do the following:
Provide the URI to the files resource.
Put the file in the specified location.
Store the file somewhere other than on the file server:
Use the files resource and reference the name and location of your file with the content-location URI element.
This feature must be configured by Illumina. For more information, contact the Illumina Support team.
Note the following key concepts:
Files: The files resource defines the location of a file and its relationship with other REST resources, such as artifacts and projects.
Glsstorage: The glsstorage resource allocates space on the file server.
XML elements: Within the XML used by the files and glsstorage resources, the attached-to and content-location URIs are used to link disk files to file-based artifacts produced by a process, or to link disk files to projects or samples.
The following diagram outlines how the XML elements link files to system resources and artifacts:
In the lab, one of the most important associations that must be made is between:
A file that is the result of an instrument run
- and -
The sample that was analyzed to produce that file.
In Clarity LIMS, this association is represented by creating a process that takes a sample analyte and produces a result file.
When you run a process configured to create a result file, the process generates a placeholder for a file. To populate the placeholder, simply import the result file generated by the instrument into Clarity LIMS.
While working in the lab, lab scientists can upload result files that are used or produced while samples are processed. However, it may sometimes be more appropriate to automate this work. In these cases, you can use the REST files and glsstorage resources.
Depending on the file storage needs and how the files are generated, there are two ways to do this process.
Import a file and store it on the file server.
Import a file and store it on a different server.
Import a result file and store it on the file server:
POST conforming XML to the glsstorage resource.
This action returns XML that includes a name and storage location for the file.
Place the file into the specified location using the file name provided in the XML.
POST the returned XML to the files resource, which links the file on disk to the result file placeholder.
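The following is a minimal sketch of the file-server variant, assuming the Python requests library, placeholder credentials, and a placeholder result file placeholder LIMS ID:

import requests

AUTH = ("apiuser", "apipassword")
BASE = "https://your_server/api/v2"

# Step 1: POST conforming XML to glsstorage to reserve a name and location.
storage_request = (
    '<file:file xmlns:file="http://genologics.com/ri/file">'
    "<attached-to>" + BASE + "/artifacts/92-2869</attached-to>"
    "<original-location>results.csv</original-location>"
    "</file:file>"
)
reply = requests.post(BASE + "/glsstorage", data=storage_request, auth=AUTH,
                      headers={"Content-Type": "application/xml"})
reply.raise_for_status()

# Step 2: the returned XML includes a content-location element; place the
# file at that location (for example, over SFTP) before continuing.

# Step 3: POST the returned XML to the files resource to link the disk file
# to the result file placeholder.
requests.post(BASE + "/files", data=reply.content, auth=AUTH,
              headers={"Content-Type": "application/xml"}).raise_for_status()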
Import a result file and store it on a different server:
Make sure that the file exists in the desired location.
POST conforming XML to the files resource, referencing the name and location of your file with the content-location element. The file path must contain the transfer protocol supported by the server. For example: sftp://192.168.13.247/home/glsftp/Process/2010/10/SCH-RAA-101013-87-1/ADM53A1PS3-40-1.dat
NOTE: It is not necessary to POST to the glsstorage resource.
If you have files that were not generated during the analysis of a sample, you can also attach reference information to projects and samples.
For example, suppose you receive an email when a sample is submitted to the lab; you may want to store that message in the LIMS. In this case, when you POST XML to the files resource, the XML links the file to the desired submitted sample, instead of to a result file placeholder.
Clarity LIMS v5.x and later:
In Clarity LIMS, the file is attached to the Sample Details section of the Sample Management screen.
On the Projects and Samples screen, select the project containing the sample for which you have posted a result file.
Scroll down to the Samples and Workflow Assignment section of the screen and select the appropriate sample.
Select Modify 1 Sample.
On the Sample Management screen, scroll to the bottom of the Sample Details section to find the attached file.
Before Clarity LIMS v5:
In the Clarity LIMS web interface, the file is attached to the Sample Details section of the Sample Management screen.
For details on accessing the file, see the previous content on Clarity LIMS v5.x and later.
In the Clarity LIMS Operations Interface, the file is attached to the Files tabbed page of the applicable submitted sample.
In the Clarity LIMS Explorer, click Opened Projects.
In the Opened Projects list, double-click the project containing the sample for which you have posted a result file.
On the project details page, click the Samples tab.
At the bottom of the tab, in the Containers pane, double-click the appropriate sample.
On the sample details page, click the Files tab to find the attached file.
The REST Web Service separates the resources needed for files and file storage.
This separation allows for greater control and the flexibility to apply various tracking and storage strategies. The content-location element can be used to define the file location without having to move the file. This ability is key in next-generation sequencing, which requires the management of large files, such as assemblies.
The content-location element needs to reference the file location in a storage system using a specific file transfer protocol. Currently only FTP, SFTP, and HTTP protocols are supported.
This mechanism makes file management flexible, but it maintains access to the file from REST with a single link. However, this feature must be configured by Illumina. For more information, contact the Support team.
Note the following key concepts about UDFs and UDTs:
UDFs and UDTs are configured to collect information that is important to the lab.
With the REST Web Service, you can include UDF and UDT values in the XML representation of any individual resource that has a UDF or UDT defined.
Not all artifacts have both UDFs and UDTs.
In Clarity LIMS v5 and later:
The API still uses the term udf. However, in the user interface, this term has been replaced with custom field.
UDTs are not supported.
You can configure the system to collect user-defined information. Consider the following examples:
You can create UDFs to add options and fields to the user interface when working with samples, containers, artifacts, processes/protocol steps, and projects.
You can also create User-Defined Types (UDTs), which are organized subsets of related UDFs. As you add and process samples, you can add information to these options and fields.
In the following example, UDFs are added to submitted samples, processes, and sample analytes (derived samples).
For the submitted sample named Goo, there are UDFs named Type, Color, and Source.
For the Prepare Goo process/step, there are UDFs named Reagent Lot ID, Temperature, and Cycle Time.
The output of the Prepare Goo process/step is an analyte named Prepared Goo, which contains UDFs named Quality and Category.
To record information for these UDFs:
Add Goo to the system and populate the sample-level UDFs.
In Clarity LIMS, run the Prepare Goo step and complete the following actions:
Populate the Step Details fields (the process-level UDFs).
Populate the Sample Details table (the analyte-level UDFs).
You can also use the REST Web Service to collect user-defined information for samples, containers, artifacts, and processes.
After you have configured UDFs, the XML of the appropriate resource expands with data elements for the field values. For example, the Prepared Goo artifact would have UDF elements along the following lines (a reconstructed illustration; the URI and field values are placeholders):
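<!-- Abridged; other artifact elements are omitted. -->
<art:artifact xmlns:art="http://genologics.com/ri/artifact"
              xmlns:udf="http://genologics.com/ri/userdefined"
              uri="https://your_server/api/v2/artifacts/2-1001">
    <name>Prepared Goo</name>
    <type>Analyte</type>
    <udf:field type="String" name="Quality">Good</udf:field>
    <udf:field type="String" name="Category">Research</udf:field>
</art:artifact>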
UDFs/custom fields are useful for collecting data at various stages of your workflow. In next-generation sequencing, it is important to record information such as who submitted a sample, the tested concentration of a library, and the reagents that were used during library prep.
As illustrated in the previous Goo example, collect this information by adding UDFs/custom fields for the samples, artifacts, and processes resources:
A submitted sample UDF / custom field named 'Type'
An artifact-level UDF / derived sample custom field named 'Validated Concentration'
A process-level UDF / master step field named 'Reagent Name'
Artifact UDFs/custom fields are flexible.
You can configure different sets of UDFs / custom fields for the analyte artifact type and the result file artifact type.
You can configure different sets of UDFs / custom fields based on the process type.
This flexibility means that:
Process type / master step A can display fields 'm' and 'n' on a result file, and fields 'q' and 'r' on an analyte
Process type / master step B can display fields 'm' and 'o' on a result file, and fields 'q' and 's' on its output analyte.
Control how users access artifact-level UDFs/custom fields by configuring the type of artifact or process type/master step to which they apply.
Not every detail tracked and recorded needs a UDF. To optimize lab efficiency, it is recommended that you define an essential UDF set.
Increasing the complexity of information collected and managed does not necessarily improve operations or scientific quality. It may be more effective to store files, because the complete details are then available and secure within the attached file.
As an API programmer, it is important to understand the difference between steps and stages. This distinction is especially important because the concept of stages is hidden from the end user. As such, when receiving requirements from end users, 'steps' sometimes means steps; at other times, it means stages. This article highlights the differences between these two entities.
We tend to think of a protocol as being a linear collection of steps, as shown below.
Figure 1
However, this illustration is not complete, as the life cycle of a sample modeled within Clarity LIMS reflects what happens in reality. The workflow is broken into periods of activity and inactivity. If a workflow comprises three steps (A, B, and C, as shown in Figure 1), Step B does not begin at the exact time that Step A is complete.
To reflect these inactive periods, Clarity LIMS uses the concept of stages in addition to steps. A more complete representation of a workflow is shown below, with the stages occurring between the steps.
Figure 2
The following phrase simplifies this concept:
If the sample isn't active in a step, it's waiting in a stage.
NOTE:
The Clarity LIMS concept of the virtual ice bucket is another state, occurring when a sample has left a stage but work on the step has not started. This scenario is represented in Figure 3, with the virtual ice bucket appearing between stages and steps as the sample moves from left to right. However, virtual ice buckets are largely irrelevant to this discussion; while recognizing their existence, this article excludes them from further explanation.
Figure 3
Having simplified our model of a sample passing through a workflow to resemble Figure 2, we can now add the next layer of complexity.
Protocols are components of workflows. As such, it is easy to imagine two or more workflows sharing a protocol. This detail leads to the following summary:
Steps belong to protocols, whereas stages belong to workflows.
This summary means that the stages that exist between steps are part of the workflow (represented in Figure 4 below). For example, samples passing through Workflow O proceed through Step A, Stage X, Step B, Stage Y, and Step C.
Samples passing through Workflow P (which shares the protocol with Workflow O) pass through the same steps. However, samples pass through a different set of stages (Stage X' and Stage Y').
Figure 4
Looking at the counts of samples associated with steps in a protocol (for example, in the Lab View dashboard in Clarity LIMS), the number of samples awaiting a particular step is actually the total number of samples across all relevant stages that feed into the step.
For bioinformaticians and programmers who are using the Clarity LIMS API, stages have an additional function. You can route samples in ways that vary from the expected, linear route by manipulating which stages the artifacts are in. For example, using the API via a script, you can do the following (see the sketch after this list):
Implement a forking workflow by assigning artifacts to one (or more) additional stages.
Create iterative (or looping) workflows by routing artifacts to an earlier stage for additional work.
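For example, assigning artifacts to an additional stage can be sketched as a POST of routing XML to the route/artifacts endpoint (the stage and artifact URIs below are placeholders):

<rt:routing xmlns:rt="http://genologics.com/ri/routing">
    <assign stage-uri="https://your_server/api/v2/configuration/workflows/101/stages/204">
        <artifact uri="https://your_server/api/v2/artifacts/2-1001"/>
    </assign>
</rt:routing>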
The BaseSpace Clarity LIMS Rapid Scripting™ API adapts lab informatics using the Clarity LIMS platform.
It is important to integrate scripting into the overall processes. Begin by identifying any areas that may require adaptation to fit the lab workflow. It also helps if users are involved in the early stages of the software system analysis process.
Most scripts in an implementation are finalized towards the end of the process, as the full impact and benefits of the new system become clear.
Take some time to become familiar with the user interface, learn how to configure the product, and work with the tools that the lab uses. Also, establish the workflows and the configuration of the system before investing in API scripts and automation.
New customers receive administrator-level training before working with the API.
If you are not comfortable configuring steps, custom fields, containers, etc., in Clarity LIMS, you may find the API material difficult to understand. Contact Illumina for more information on administrator training and training materials.
Before committing time and resources to using the API, it is important to define what you would like to accomplish. Understanding the key outcomes, use cases, users, and constraints of the lab helps with learning the API more quickly and improves efficiency.
If you require assistance, Illumina can provide expert resources to audit and analyze the laboratory users, processes, workflows, instrumentation, data production, and environment. This careful and focused analysis results in a requirements specification that provides extensive value to the facility.
Clarity LIMS automations typically call scripts or third-party programs written for a shell or command-line interpreter, of either a Linux or Windows operating system (OS). Although the use of any system shell is acceptable, Bash is recommended.
Depending on the systems that integrate with the given automations, various restrictions apply to the string parameters/tokens and formatting used in the automation command line.
As of Clarity LIMS v5, several terms have been deprecated:
External Program Integration Plug-in (EPP) has been replaced with automation
EPP/AI node has been replaced with automation worker / AW node
Parameter has been replaced with token
Environment variables can be used to aid in configuration. However, automation commands are generated with a limited shell. For full access to environment variables, the recommended practice is to start the command with a 'full' (login) user shell. For example, for Bash, use a command of the following form (the script path shown is an illustrative placeholder):
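bash -l -c "/opt/gls/clarity/customextensions/example_script.sh {stepURI:v2}"

Here, -l starts a login shell, so the user's profile is sourced and the full environment is available before the script runs.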
This procedure provides the following advantages:
Ensures updates of the environment variables, removing the need for repeated AI node/automation worker restarts.
Ensures access to all environment variables, including the full path to Groovy.
Allows certainty of the shell being used.
The various operating system (OS) shells each have their own rules and regulations. When creating command-line strings, be aware of the considerations described in the following sections.
Windows shell command-line interpreters require different syntax and formatting than Linux shell variants. For example, the following scripts are identical but are formatted for AI nodes/automation workers running on different operating systems (the commands shown are illustrative placeholders).
On Linux:
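bash -c "/opt/gls/clarity/customextensions/example_script.sh {stepURI:v2}"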
On Windows:
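cmd /c "C:\ai\example_script.bat {stepURI:v2}"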
Most of the examples in this specification use Windows formatting, because Windows is the most common platform found in the lab.
Spaces
Spaces in paths, file names, or parameter/token data can cause commands to be misinterpreted as information passes between systems.
Many OS shells automatically parse command-line contents by space, which may not be what is intended. Enclose commands in double quotes " " to avoid misinterpretation of spaces by the OS shell command-line interpreter.
Special characters
OS shell command-line interpreters can attempt to interpret and act upon certain special characters, rather than passing them along as textual information. A character can have a rule applied to it within one OS shell environment, and a different rule under another environment. To use a character in its literal form, escape the characters. The escape character used varies depending on your OS shell. The most common escape character is the backslash character.
The most common OS shell characters that require escaping are:
To make sure that a configured command in the client is properly interpreted, test it on the AI node/automation worker machine command line.
For details, see the version-specific documentation in the following sections:
Trigger style and location (see also )
When a sample is added to the LIMS, the system also creates an artifact (see ).
Details on finding contents of a container can be found in the .
The name, LIMS ID, and any container-level UDFs/custom fields provide various options for container labeling.
Any downstream sample created by running a process is considered an analyte artifact. In the Clarity LIMS interface, analyte artifacts are referred to as derived samples. For more information, see , and .
Process type / Master step | Result file field exposed | Analyte field Exposed |
A | m, n | q, r |
B | m, o | q, s |
This section provides information to help you work with Clarity LIMS automation tokens in Clarity LIMS v5 and later.
Scripts produce a numeric code on exit. By convention and by default, a successful exit has a code of 0 (zero). Within error handling, select different nonzero exit codes to indicate various error conditions.
NOTE: For Clarity LIMS v5.0, the term External Program Integration Plug-in (EPP) is deprecated and replaced with automation.
Though logging information in the Clarity LIMS user interface is useful, scripts can also write debugging/troubleshooting information into the automatedinformatics.log file using stderr (standard error). When a line is printed to stderr, a [WARN] line is written to this log, which is useful for troubleshooting interactions between the script and the automation programs. For more information on the automatedinformatics.log file, see Troubleshooting Automation. For more information on using stderr on the command line, refer to Unix and Windows documentation on standard I/O streams, especially standard error.
In Clarity LIMS, the last line written to stdout (standard out) is automatically captured and shown in the interface.
If the exit code is zero, the message displays in green. If the exit code is nonzero, the message displays in red.
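The following minimal sketch shows these conventions in a Python script (do_work is a placeholder for the script's real logic):

import sys

def do_work():
    pass  # Placeholder for the script's real logic.

try:
    do_work()
except Exception as err:
    # Lines written to stderr become [WARN] lines in automatedinformatics.log.
    sys.stderr.write("debug detail: {}\n".format(err))
    # The last line printed to stdout is shown to the user, in red because
    # the exit code is nonzero.
    print("Script failed - see the log for details")
    sys.exit(1)

# Shown to the user in green, because the exit code is zero.
print("Script completed successfully")
sys.exit(0)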
The Operations Interface Java client is deprecated in Clarity LIMS v5. All configuration and administration tasks are currently executed in the LIMS web interface.
If the script exits with a nonzero code, a sample genealogy flag is automatically added to the process outputs, with a standard error message indicating there is an External Program Error.
If the script is complex, or includes several error conditions, configure an additional result file process output (e.g., result.log) designed to capture status and error information. Make the external script write additional information to this file. If an error occurs, the file is still captured in the client, and is available to view for troubleshooting purposes.
The automation and integration of the day-to-day work in the lab requires different Automated Informatics (AI) nodes/automation workers to perform different tasks.
For BaseSpace Clarity LIMS v5.0, several terms are deprecated:
Automation replaced External Program Integration Plug-in (EPP).
Automation Worker (AW) node replaced EPP/AI node.
Channels are manually named and should clearly represent the task performed (e.g., Type_2_Analysis). To make sure that dispatched automation work is routed to the correct destination, specify a channel in the following places:
On the AI node / automation worker
Clarity LIMS v5 and later: When configuring step and derived sample automations on the Automation tab
When the automation trigger conditions are met in the LIMS, the automation job first enters a channel-specific 'first in, first out' (FIFO) queue of work for completion.
Jobs queue in this channel until one of the AI nodes/automation workers operating on the channel completes its previous work, and indicates it is free to accept more.
The next job is then dispatched from the channel queue to the node. This strategy allows a single channel queue to be serviced by one or many AI nodes/automation workers on the specified channel.
It is possible to have multiple AI nodes/automation workers, all performing the same type of work and configured on the same channel. This setup is a simple but effective way to increase throughput at a particular analysis bottleneck, or to ensure redundancy in the event of a single node failure.
Before POSTing to the files resource, make sure that the file exists in the location referenced by the content-location element. If the file does not exist in this location, the POST fails.
When configuring automations in BaseSpace Clarity LIMS, copy tokens from the Tokens list and paste them into the Command Line field. These tokens are available for use in step automations. If using multiple variables, add a space between each entry. All tokens and parameters are case-sensitive.
Token | Purpose | Example |
---|---|---|
When configuring automations in BaseSpace Clarity LIMS, copy tokens from the Tokens list and paste them into the Command-Line field.
These tokens are available for use in derived sample automations. If using multiple variables, add a space between each entry. All tokens and parameters are case-sensitive.
Token | Purpose | Example |
---|---|---|
When configuring automations in BaseSpace Clarity LIMS, copy tokens from the Tokens list and paste them into the Command Line field.
These tokens are available for use in project automations. If using multiple variables, add a space between each entry. All tokens and parameters are case-sensitive.
Token | Purpose | Example |
---|---|---|
This section outlines several strategies for accessing the value of a UDF that was recorded on an earlier step.
In all cases, assume that a UDF called Batch ID was recorded on Step A, and you want to access its value on Step D:
NOTE: If the samples in Step D do not have a homogeneous lineage, expect multiple values for the Batch ID.
This method involves crawling backwards from Step D to Step A.
The general form is as follows.
Examine the inputs to Step D.
Each input (I) has a parent-process element with a URI to the step that created the artifact. In this case, it is the URI to Step C.
Get the input-output maps for Step C (from the /details resource) and find the input (I') that produced output I. Each input (I') has a parent-process element with a URI to the step that created the artifact. In this case, it is the URI to Step B.
Get the input-output maps for Step B (from the /details resource) and find the input (I'') that produced the output I'. Each input (I'') has a parent-process element with a URI to the step that created the artifact. In this case, it is the URI to Step A.
Get the value of the UDF (Batch ID) from Step A: 1234.
This method is computationally slow, but it is safe. As the number of steps that need to be crawled back through increases, so does the duration of the script to retrieve the value.
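A sketch of the crawl, shown against the processes resource and assuming the Python requests library, placeholder credentials, and a placeholder starting artifact (error handling omitted):

import requests
from xml.etree import ElementTree

AUTH = ("apiuser", "apipassword")

def get_xml(uri):
    response = requests.get(uri, auth=AUTH)
    response.raise_for_status()
    return ElementTree.fromstring(response.content)

def parent_process_uri(artifact_uri):
    # Each artifact carries a parent-process element pointing at the step
    # that created it.
    return get_xml(artifact_uri).find("parent-process").attrib["uri"]

def input_that_produced(process_uri, output_uri):
    # Scan the step's input-output maps for the input that produced the
    # given output (state queries stripped before comparing).
    target = output_uri.split("?")[0]
    for iomap in get_xml(process_uri).findall("input-output-map"):
        output = iomap.find("output")
        if output is not None and output.get("uri", "").split("?")[0] == target:
            return iomap.find("input").attrib["uri"]
    return None

artifact_uri = "https://your_server/api/v2/artifacts/2-4001"  # an input to Step D
process_uri = parent_process_uri(artifact_uri)                # Step C
for _ in range(2):                                            # C -> B, then B -> A
    artifact_uri = input_that_produced(process_uri, artifact_uri)
    process_uri = parent_process_uri(artifact_uri)
# process_uri now points at Step A; read the Batch ID UDF from its XML.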
This method tries to jump straight to Step A, without passing through Steps B and C.
The general form is as follows.
Examine the inputs to Step D. Each input (I) has a sample element that contains the limsid (S) of the related submitted sample.
https://<your_hostname>/api/v2/artifacts?samplelimsid=S&process-type=Step%20A
This query should give an XML response containing the URI to Step A. From there, get the value of the UDF (Batch ID): 1234.
This method makes two assumptions:
That Step A produces analytes (derived samples). Thus, if Step A is a QC process, or does not produce analyte outputs, this method fails.
That the analytes (derived samples) resulting from S passed through Step A only one time. If this assumption is not true, you receive multiple URIs to the individual instances of Step A, and you cannot be certain which Batch ID to rely upon.
This method is computationally fast, and its duration is not reduced if there are many steps between Step A and Step D.
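A sketch of the shortcut (the hostname, sample LIMS ID, and step name are placeholders):

import requests
from xml.etree import ElementTree

AUTH = ("apiuser", "apipassword")
query = ("https://your_server/api/v2/artifacts"
         "?samplelimsid=GOO123A1&process-type=Step%20A")

artifacts = ElementTree.fromstring(requests.get(query, auth=AUTH).content)
for hit in artifacts.findall("artifact"):
    # The parent-process of each hit is an instance of Step A, which holds
    # the Batch ID UDF.
    print(hit.attrib["uri"])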
This method works well, but it involves making configuration changes to the steps. As such, this method is useless for legacy data resulting from samples that passed through the steps before the configuration was applied.
Its general form involves:
In Step A: Add a script that copies the value of the Batch ID UDF (1234) to every input and output of type analyte in the step.
In Step B: Add a script that copies the value of the Batch ID UDF (1234) to every output of type analyte in the step.
In Step C: Add a script that copies the value of the Batch ID UDF (1234) to every output of type analyte in the step.
In Step D: The inputs contain the value of the Batch ID.
This method relies on propagating the step UDF through Steps A, B, and C to Step D. It is safe and fast. However, if the protocol is edited and a new step is inserted between B and C, the propagation script must be added to the new step so the chain does not break. This method is safe even if any of the steps are QC steps or do not produce analyte outputs.
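A sketch of the propagation script for one step, assuming the Python requests library and placeholder credentials. For brevity, it only writes to analyte outputs, assumes the Batch ID field is not already set on them, and appends the field at the end of the artifact XML (a production script should respect the schema's element order):

import requests
from xml.etree import ElementTree

AUTH = ("apiuser", "apipassword")
UDF_FIELD = "{http://genologics.com/ri/userdefined}field"

def propagate_batch_id(process_uri):
    process = ElementTree.fromstring(requests.get(process_uri, auth=AUTH).content)
    # Read the step-level Batch ID UDF.
    batch_id = next(f.text for f in process.iter(UDF_FIELD)
                    if f.attrib["name"] == "Batch ID")
    for iomap in process.findall("input-output-map"):
        output = iomap.find("output")
        if output is None or output.get("output-type") != "Analyte":
            continue
        uri = output.attrib["uri"].split("?")[0]  # drop the state query
        artifact = ElementTree.fromstring(requests.get(uri, auth=AUTH).content)
        field = ElementTree.SubElement(artifact, UDF_FIELD,
                                       {"type": "String", "name": "Batch ID"})
        field.text = batch_id
        requests.put(uri, data=ElementTree.tostring(artifact), auth=AUTH,
                     headers={"Content-Type": "application/xml"})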
This method is a niche solution, but it works well. It assumes that the samples from Step A proceed to Step D as an intact group, and they are joined by a control sample.
This method involves making configuration changes to the steps. As such, this method is useless for legacy data resulting from samples that passed through the steps before the configuration was applied.
In Step A: Identify the control sample for the group, then copy the value of the Batch ID to the control sample.
In Step D: Identify the control sample for the group, then retrieve the value of the Batch ID from it.
This method is the least work, but it does make several assumptions that might make it impractical.
Use the setExitStatus.py Python script, attached to this page, to test and simulate the use of the automation triggers within Clarity LIMS.
The setExitStatus.py script is designed to illustrate concepts for API training purposes. Do not use it in a production environment.
The setExitStatus.py script relies on the presence of the glsapiutilv2.py script. Typically, both scripts are located in the same directory.
The setExitStatus.py script uses the following command-line parameters:
An example of a parameter string that invokes this script from Clarity LIMS is provided in the following. Note the use of the stepURI token in the -l parameter.
python /opt/gls/clarity/customextensions/setExitStatus.py -l {stepURI:v2:http} \
-u {username} -p {password} -s "OK" -m "successful"
glsapiutilv2.py:
setExitStatus.py.txt:
This section discusses methods for integrating BaseSpace Clarity LIMS with upstream sample accessioning systems.
The following illustration shows a typical architectural overview:
Required:
A sample must have a Name / ID
A sample must be associated with a Case / Patient / Study / Project
A sample must be associated with a Container (Tube / Plate etc)
Optional (but expected):
User-defined fields (UDFs)/custom fields (defined by your LIMS configuration)
Typical flowchart of actions within the broker:
The following animation illustrates the elements of an XML sample-creation message to Clarity LIMS.
Build your own:
Pro: Not too difficult
Con: Stability as number of messages increases
?: Maintainable over the long-term
Use a commercial / open-source offering (e.g. Mirth Connect)
Pro: Quicker than build
Pro: Robust, multi-threaded support for millions of messages per day
?: May prove to be an excessive or over-complicated means to accomplish something relatively simple
Does the broker need to carry out other business logic?
For example, one customer added logic to their broker that dealt with medical billing and was able to distinguish between physicians ordering duplicate tests for a subject (not reimbursable, therefore the duplicate sample wasn’t submitted to Clarity LIMS), versus a temporal study that was reimbursable.
The best practice is to take advantage of as many legacy systems as possible, rather than creating samples in Clarity LIMS and then reinventing business logic to remove unwanted ones.
This section provides tips and tricks to help you work efficiently with the API. For example, learn how to copy and update field values, create and rename samples, work with files and QC flags, and automate BCL conversion.
Clarity LIMS v4 and later
Automation is powerful and simple in design. However, its applications can quickly become complex. We recommend you keep your scripts simple. When troubleshooting, the best practice is to isolate the issue to determine the source. The Automated Informatics (AI)/automation worker log file, automatedinformatics.log, is useful for isolating system components and diagnosing problems.
Isolate the behavior of each component in the system. In particular, determine if the following components can be ruled out as the cause of the problem:
The script or program—Running the custom logic, REST calls, and file handling.
The AI nodes/automation workers—Calling the command line and invoking the script or program.
The network—Providing reliable and timely TCP/IP packet transfers.
The client—Completing the process/step and notifying the server.
The server—Responding to client notifications and dispatching to AI nodes/automation workers.
The script provides many options for troubleshooting. For example, increase logging to rule out unexpected behavior.
Printing to stderr in the script writes a line to the automatedinformatics.log file. This file is a great source of information.
The records in the log file allow you to emulate the command-line call, unit testing the script by calling it manually at the command-line prompt (see ).
A simple test command also helps verify the automation program on the AI node/automation worker. If the script and the AI node/automation worker are functioning, review the log file entries for any warning (WARN) or error (ERR) lines near the time-stamp of the process completion event sent from the client.
If the issue is related to the client or server software, contact the Illumina Support team, providing:
The automatedinformatics.log file
The server log
The results of the isolation tests (in the previous section).
Use the following steps to test and verify the setup.
Create a process/step that generates a result file.
Configure an automation on the process/step. Associate it with the channel on which the AI node/automation worker is configured to communicate.
Add the following command line string.
cmd /c "C:\ai\ai.bat {outputFile0}"
On the AI node/automation worker machine, create an ai folder in C:\ so that the system has a C:\ai path. Then create a new file named ai.bat.
Edit the ai.bat file and add the following line:
echo Data for Output File LIMS ID %1 > %1.txt
Run the process/step created on an existing attached result file.
The step passes the LIMS ID of its output file placeholder to the script.
The script creates a file in the working directory.
When the script exits, this file is transferred to the LIMS and associated with the step. The file contains a single line of text that includes the LIMS ID of the output file, for easy verification.
Create a process/step that generates a result file.
Configure an automation on the process/step. Associate it with the channel on which the AI node/automation worker is configured to communicate.
Add the following command line string:
bash -c "echo Automation Test > {outputFile0}.txt"
Run the process/step created on an existing attached result file.
The step passes the LIMS ID of its output file placeholder to the script.
The script creates a file in the working directory with a file name that contains this LIMS ID.
When the script exits, this file is transferred to the LIMS and associated with the step. The file contains the text "Automation Test". When opened, the file opens in the default program associated with *.txt files.
AI nodes/automation workers are installed using the Automated Informatics (Automation Worker for LIMS v5 and later) software package.
In the installation directory:
Find the /log directory, which contains an automatedinformatics.log file.
Use this file to locate log lines near the time of the process/step completion event.
Locate the log line containing the parameter/token command string, and manually run and test the script.
Locate the working directory to review temporary files created.
This step can indicate if the cause of the error lies with a network or other issue external to the computer running the AI node / automation worker.
Running the script manually forms a unit test. The script is run on the command line, without being invoked by automation.
To locate the line containing the command-line string, search for the Command string, or externalprogram.runExternalProgram.
An example is shown in the following abridged log file section. Copy the line to a text file and modify it for script testing.
2014-02-28 21:24:48,688 INFO ... definitions.behaviour.automatedinformatics.plugins.externalprogram.runExternalProgram ...
2014-02-28 21:24:49,323 INFO ... (ExternalProgramBehaviour.java:133)... Command string: bash -c "~/scripts/HelloWorld.sh http://###.###.###.###/api/v2/processes/A30-MXX-110228-24-2320 > 92-2869.txt"
Unless there was an error, the automation software removes the temporary working directory when the script completes. In case of an error, the directory is left in place and provides clues to the root cause of the error.
To locate the working directory, search for "Working directory:" in the log file.
An example is shown in the following abridged log file section. Use the recorded directory to list and review temporary files.
2014-02-28 21:24:49,324 INFO ... Working directory: /home/gls/GenoLogicsAutomatedInformatics/temp/runExternalProgram-28022014-432350866632555775.A30-MXX-110228-24-2320
2014-02-28 21:24:49,324 INFO ... Retrieved files. Executing command.
As of BaseSpace Clarity LIMS v5.0, several terms have been deprecated:
Automation replaces External Program Integration Plug-in (EPP). In LIMS v4.x and earlier, the Operations Interface still uses the term EPP.
Automation worker/AW node replaces AI node.
Token replaces Parameter.
Step replaces Process in the web interface.
The incoming message contains the following:
Project ID or Name
Sample ID or Name
Container ID or Name
Container type (plate / tube type)
Container well position (if the sample is on a plate), e.g., G:2
Sample user-defined fields (UDFs) / custom fields
POST to https://your_server/api/v2/samples:
We receive something like the following:
POST to https://your_server/api/v2/projects
We receive something like the following:
POST to https://your_server/api/v2/containers:
We receive something like the following:
POST to https://your_server/api/v2/containers:
We receive something like the following:
POST to https://your_server/api/v2/samples:
We receive something like the following:
GET: https://your_server/api/v2/projects?name=Week%2039
If the project exists, we receive something like the following:
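<!-- Reconstructed illustration; URIs and LIMS IDs are placeholders. -->
<prj:projects xmlns:prj="http://genologics.com/ri/project">
    <project uri="https://your_server/api/v2/projects/DOE123" limsid="DOE123">
        <name>Week 39</name>
    </project>
</prj:projects>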
If the project does not exist, we receive something like the following:
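<!-- Reconstructed illustration; an empty list means no match was found. -->
<prj:projects xmlns:prj="http://genologics.com/ri/project"/>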
GET: https://your_server/api/v2/containers?name=Example%20Container%2020140910
If the container exists, we receive something like the following:
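<!-- Reconstructed illustration; URIs and LIMS IDs are placeholders. -->
<con:containers xmlns:con="http://genologics.com/ri/container">
    <container uri="https://your_server/api/v2/containers/27-1234" limsid="27-1234">
        <name>Example Container 20140910</name>
    </container>
</con:containers>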
If the container does not exist, we receive something like the following:
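<!-- Reconstructed illustration; an empty list means no match was found. -->
<con:containers xmlns:con="http://genologics.com/ri/container"/>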
Important!
The latest glsapiutil (and glsapiutil3) Python libraries can be found on the page.
With automation, the command-line information is important. The actual values sent on the command line are recorded in the AI log file as the AI node/automation worker receives them. Copy the parameters/tokens from the log and use them on the command line to troubleshoot. If scripting in Groovy, the cli class handles command-line tokens well. See the example *.groovy files used with automation and the utility class section in .
If no records are found, use the procedure to confirm that logging is functional.
{username}
Supplies the username of the current user running the step to the triggered automation script
cmd /c "C:\ai\ai.bat {username}"
resolves to:
cmd /c C:\ai\ai.bat adminuser
{password}
Supplies the password of the current user running the step to the triggered automation script.
cmd /c "C:\ai\ai.bat {password}"
resolves to:
cmd /c C:\ai\ai.bat 3BlindMice
In log files, the password supplied on the command line is replaced with a series of *** characters.
{baseURI}
Supplies the base API URI to the triggered automation script.
cmd /c "C:\ai\ai.bat {baseURI}"
resolves to:
cmd /c C:\ai\ai.bat https://lims.lan.29/api
{stepURI}
Supplies the URI of the step to the triggered automation script. Include the version parameter (ie, {stepURI:version}) to specify the version of the REST API to be accessed.
cmd /c "C:\ai\ai.bat {stepURI:v2}"
resolves to:
cmd /c C:\ai\ai.bat https://yourServerNameOrIP/api/v2/steps/CAM-CSB-100212-24-197
{artifactsURI}
Supplies the URI of the artifacts root to the triggered automation script.
Include the version parameter (ie, {artifactsURI:version}) to specify the version of the REST API to be accessed.
cmd /c "C:\ai\ai.bat {artifactsURI:v2}"
resolves to:
cmd /c C:\ai\ai.bat https://yourServerNameOrIP/api/v2/artifacts
{processURI}
The {stepURI} token is preferred.
{processURI} is deprecated and less accurate, and may be removed in future versions.
Supplies the URI of the step to the triggered automation script.
If using the deprecated {processURI} token, the addition of the version and scheme parameters is recommended ({processURI:version:scheme}).
Adding the version and scheme reduces the chance of a server and REST version upgrade unknowingly affecting your scripts.
cmd /c "C:\ai\ai.bat {processURI:v2:http}"
resolves to:
cmd /c C:\ai\ai.bat https://yourServerNameOrIP/api/v2/processes/CAM-CSB-100212-24-197
{processLuid}
Supplies the LIMS ID of the step that triggered the automation script.
cmd /c "C:\ai\ai.bat {processLuid}"
resolves to:
cmd /c C:\ai\ai.bat CAM-CSB-100212-24-169
{udf:nameOfUDF}
Supplies the current value stored within a UDF configured as nameOfUDF.
cmd /c "C:\ai\ai.bat {udf:injection_volume}"
resolves to:
cmd /c C:\ai\ai.bat 12.4
{parentProcessUdf:nameOfUDF}
Supplies the current value stored within a UDF configured as nameOfUDF of the immediate parent step to the step that triggered the automation script.
The parent step must provide the inputs (derived samples) to the step.
In cases where there are multiple parents (ie, the inputs are derived from various steps) only the first of these parents is returned.
cmd /c "C:\ai\ai.bat {parentProcessUdf:RunID}"
resolves to:
cmd /c C:\ai\ai.bat RUN_BW1765
{parentProcessUdfN:nameOfUDF}
Supplies the current value stored within a UDF configured as nameOfUDF of the immediate parent step to the step that triggered the automation script. The parent step must provide the inputs (derived samples) to the step.
In cases where there are multiple parents (ie, the inputs are derived from various steps) all parent step IDs are treated as an array-based list.
N specifies the array list index position 0..n of the desired step.
{parentProcessUdf0:nameOfUDF} is equivalent to {parentProcessUdf:nameOfUDF}.
cmd /c "C:\ai\ai.bat {parentProcessUdf1:RunID}"
resolves to:
cmd /c C:\ai\ai.bat RUN_HJ1865
{outputFileLuids}
Supplies the LIMS IDs of all step output file placeholders.
cmd /c "C:\ai\ai.bat {outputFileLuids}"
resolves to:
cmd /c C:\ai\ai.bat "BAR103A1CO248" "BAR103A1CO249" "BAR103A1CO250" "BAR103A1CO251" "BAR103A3CO158" "BAR103A3CO159" "BAR103A3CO160" "BAR103A3CO161"
{outputFileLuidN}
Supplies the LIMS ID for the specified step output file placeholder. All output file placeholders applying to inputs of the step are treated as an array-based list, where N specifies the array list index position [0..n] of the desired file.
Assuming the same eight output files as in the previous example:
cmd /c "C:\ai\ai.bat {outputFileLuid0}"
resolves to:
cmd /c C:\ai\ai.bat BAR103A1CO248
cmd /c "C:\ai\ai.bat {outputFileLuid1}"
resolves to:
cmd /c C:\ai\ai.bat BAR103A1CO249
cmd /c "C:\ai\ai.bat {outputFileLuid7}"
resolves to:
cmd /c C:\ai\ai.bat BAR103A3CO161
{compoundOutputFileLuids}
Supplies the LIMS IDs for all shared step output file placeholders.
cmd /c "C:\ai\ai.bat {compoundOutputFileLuids}"
resolves to:
cmd /c C:\ai\ai.bat "92-527" "92-528" "100-541" "100-544"
{compoundOutputFileLuidN}
Supplies the LIMS ID for the specified shared step output file placeholder. All shared output file placeholders for the step are treated as an array-based list, where N specifies the array list index position [0..n] of the desired file.
Assuming the same four output files as in the previous example:
cmd /c "C:\ai\ai.bat {compoundOutputFileLuid0}"
resolves to:
cmd /c C:\ai\ai.bat 92-527
cmd /c "C:\ai\ai.bat {compoundOutputFileLuid1}"
resolves to:
cmd /c C:\ai\ai.bat 92-528
cmd /c "C:\ai\ai.bat {compoundOutputFileLuid2}"
resolves to:
cmd /c C:\ai\ai.bat 100-541
cmd /c "C:\ai\ai.bat {compoundOutputFileLuid3}"
resolves to:
cmd /c C:\ai\ai.bat 100-544
Deprecated {parentProcessLuid*} tokens
The following tokens have been deprecated:
• {parentProcessLuid}
• {parentProcessLuids}
• {parentProcessLuidN}
These tokens were only applicable to steps that take file inputs. File inputs are no longer supported in Clarity LIMS.
{username}
Supplies the username of the current user running the step to the triggered automation script.
cmd /c "C:\ai\ai.bat {username}"
resolves to:
cmd /c C:\ai\ai.bat adminuser
{password}
Supplies the password of the current user running the step to the triggered automation script.
cmd /c "C:\ai\ai.bat {password}"
resolves to:
cmd /c C:\ai\ai.bat 3BlindMice
In log files, the password supplied on the command line is replaced with a series of *** characters.
{baseURI}
Supplies the base API URI to the triggered automation script.
cmd /c "C:\ai\ai.bat {baseURI}"
resolves to:
cmd /c C:\ai\ai.bat https://lims.lan.29/api/
cmd /c "C:\ai\ai.bat {baseURI}v2"
resolves to:
cmd /c C:\ai\ai.bat https://lims.lan.29/api/v2
NOTE: To access the endpoints, make sure that the {baseURI} is appended with v2. You can include this in the token in the command line, as shown above, or in the script itself.
{derivedSampleLuids}
Supplies the derived sample LIMS IDs to the triggered automation script.
cmd /c "C:\ai\ai.bat {derivedSampleLuids}"
resolves to:
cmd /c C:\ai\ai.bat 2-1641 2-1642 2-1643
{userinput:customParameterName}
Allows user-entered data to be supplied to the triggered automation script. Custom parameters are identified with the prefix 'userinput:'.
The following command line requires the user to input a value for 'more_yield':
yieldscript.sh -y {userinput:more_yield} -u {username}
{projectLuid}
Supplies the LIMS ID of the project associated with the step to the triggered automation script.
cmd /c "C:\ai\ai.bat {projectLuid}"
resolves to:
cmd /c C:\ai\ai.bat ADM123
| Parameter | Description |
| --- | --- |
| -u {user} | LIMS username |
| -p {password} | LIMS password |
| -l {stepURI} | LIMS stepURI—the URI of the transient step API resource that invokes the script. |
| -s {status} | The status the script is reporting (OK, WARNING, or ERROR). |
| -m {message} | The descriptive message displayed to the user. |
In a highly automated workflow, a lab gains little value from manually selecting samples into the ice bucket and then transitioning them through a step. Ideally, upon completion of one step, a following step could be automated such that the output analytes were transitioned through to the Record Details screen.
The Clarity LIMS External Program Plugin (EPP)/automation system cannot aid in this transition. The last point at which an automation can be triggered is before the step completion.
This scenario requires a standalone API application, which can be run by an automation at the end of a step.
Using this approach, a standalone app would poll the API until each of the output analytes from the previous step were queued for the next step. After they are queued, they can be walked through to the Record Details stage.
The steps are as follows:
EPP / automation triggers at step completion and launches an API app as a new Linux process and then finishes. The parameter for the API app is the URL for the current process.
API app polls to see if each output analyte is queued.
Use the artifacts batch endpoint (api/v2/artifacts/batch/retrieve) to poll.
Check the last workflow-stage node within workflow-stages and look for status="QUEUED".
API app moves the output analytes through the step to Record Details.
Use the /api/v2/steps endpoints to start the step and then move the analytes forward.
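The polling portion of this logic can be sketched in Python. This is a minimal sketch, not a production implementation: the requests library, hostname, and credentials are assumptions, and the ri:links payload shape is the one used by the batch-retrieve endpoints.

```python
# Sketch: poll until every output analyte of the completed step is QUEUED.
import time
import xml.etree.ElementTree as ET

import requests
from requests.auth import HTTPBasicAuth

BASE = "https://yourServerNameOrIP/api/v2"   # placeholder hostname
AUTH = HTTPBasicAuth("apiuser", "apipass")   # placeholder credentials
RI = "{http://genologics.com/ri}"

def batch_retrieve(artifact_uris):
    """POST a ri:links payload to the artifacts batch-retrieve endpoint."""
    links = ET.Element(RI + "links")
    for uri in artifact_uris:
        ET.SubElement(links, "link", {"uri": uri, "rel": "artifacts"})
    r = requests.post(BASE + "/artifacts/batch/retrieve",
                      data=ET.tostring(links), auth=AUTH,
                      headers={"Content-Type": "application/xml"})
    r.raise_for_status()
    return ET.fromstring(r.content)

def all_queued(artifact_uris):
    """True when the LAST workflow-stage of every artifact is QUEUED."""
    for art in batch_retrieve(artifact_uris):
        stages = art.findall(".//workflow-stages/workflow-stage")
        if not stages or stages[-1].get("status") != "QUEUED":
            return False
    return True

output_uris = ["..."]   # analyte URIs from the completed step's input-output maps
while not all_queued(output_uris):
    time.sleep(30)       # poll politely
# Next: use the /api/v2/steps endpoints to start the step and advance the
# analytes to the Record Details screen.
```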
This article explains how to make files that were produced by, or attached to, the LIMS in an earlier step visible in a subsequent step.
Consider a simplified workflow / protocol containing just two steps: Produce Files and Display Files.
The first step, Produce Files, will take in analytes (derived samples), and generate individual result files (one per input).
The subsequent Display Files step will allow us to view the files associated with the analytes from the previous step.
After the files have been generated by and attached to the Produce Files step, the Record Details screen of the step displays the files.
The key to displaying these files in any subsequent step involves producing a hyperlink to the file and displaying it as a user-defined field (UDF)/custom field in subsequent steps.
You may be familiar with creating and using text, numeric, and checkbox UDFs/custom fields. However, you may be less familiar with the hyperlink option. Fields of this type are used less frequently, but they are perfect for this solution.
NOTE: As of Clarity LIMS v5.0, the term user-defined field (UDF) has been replaced with custom field in the user interface. However, the API resource is still called UDF.
This solution involves a script that runs on the Record Details screen on the subsequent Display Files step and populates the fields. See the following figure.
As you can see, the structure of the hyperlink is straightforward and includes:
The IP address / hostname of the server.
The port.
A link to the LIMS ID of the file to be linked to.
To populate these fields, there are numerous methods available within an API-based script. The method discussed here works for the two-step protocol described earlier (namely that we want the files displayed in the next step of the protocol). It also works when the steps in which the files are uploaded and displayed are separated by several intermediate steps.
Assuming that the script will run just as the Record Details screen of the Display Files step is being displayed, use pseudocode to produce the hyperlinks.
For each output:
Determine the LIMS Unique ID (LUID) of the output artifact.
Determine the LUID of the submitted sample associated with the output artifact.
Determine the LUID of the resultfile artifact produced by the earlier process, derived from the common submitted sample.
Determine the LUID of the file associated with the resultfile artifact.
Update the hyperlink UDF / custom field on the output artifact (from step 1) with the specific hyperlink value.
To illustrate these pseudocode steps, XML from a demo system is provided.
From the XML representation of the Display Files process/step, we see that there are three output artifact LUIDs: 2-81806, 2-81805, and 2-81804.
By examining the XML representation of the first output artifact (2-81806), we see the LUID of the associated submitted sample is ADM1301A2:
After the common ancestor is found, ask Clarity LIMS for the output artifacts produced by our step of interest (Produce Files) directly.
For example:
Yields the following XML:
The resultfile with LUID 92-81803 is associated with the current output artifact (2-81806), even though these entities may be separated by several steps.
If the process/step produces multiple resultfiles, you may need to further constrain the search using the name= parameter. For example:
By gathering the XML representation of artifact 92-81803, the associated file has LUID 40-3652:
Now that you know the LUID of the file associated with output artifact 2-81806, set the value of its hyperlink field in the following form:
When constructing the value for the hyperlink, the 40- prefix should be removed from the LUID of the file.
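These five pseudocode steps can be sketched in Python (requests library). Several details here are assumptions to adapt to your system: the samplelimsid and process-type filters on the artifacts list, the hyperlink field name "File Link", and the /clarity/file/ link path.

```python
import xml.etree.ElementTree as ET
import requests
from requests.auth import HTTPBasicAuth

BASE = "https://yourServerNameOrIP/api/v2"
AUTH = HTTPBasicAuth("apiuser", "apipass")   # placeholder credentials
UDF = "{http://genologics.com/ri/userdefined}"
FILE = "{http://genologics.com/ri/file}"

def get_xml(uri):
    r = requests.get(uri, auth=AUTH)
    r.raise_for_status()
    return ET.fromstring(r.content)

def link_file(output_uri):
    output = get_xml(output_uri)                         # 1. the output artifact
    sample_limsid = output.find("sample").get("limsid")  # 2. its submitted sample
    # 3. the resultfile artifact made by Produce Files for the same sample
    r = requests.get(BASE + "/artifacts",
                     params={"samplelimsid": sample_limsid,
                             "process-type": "Produce Files"},
                     auth=AUTH)
    r.raise_for_status()
    hits = ET.fromstring(r.content)
    resultfile = get_xml(hits.find("artifact").get("uri"))
    # 4. the attached file; drop the "40-" prefix when building the link
    file_limsid = resultfile.find(FILE + "file").get("limsid")
    link = ("https://yourServerNameOrIP/clarity/file/"   # link path: an assumption
            + file_limsid.replace("40-", ""))
    # 5. write the hyperlink field (assumed present and named "File Link"), PUT back
    for field in output.findall(UDF + "field"):
        if field.get("name") == "File Link":
            field.text = link
    requests.put(output_uri, data=ET.tostring(output), auth=AUTH,
                 headers={"Content-Type": "application/xml"}).raise_for_status()
```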
A lab may receive samples submitted from various sources. This can pose a problem with regard to sample names. There may be duplicate sample names, and/or various name formats, all of which make it hard for lab scientists to recognize a sample.
Clarity LIMS programmers often rename all incoming samples to a certain naming convention.
This section provides an example to address this problem.
When accepting a project and its samples, the receiving lab scientist runs a Clarity LIMS step named Receive Samples.
The underlying Receive Samples process type / master step is configured with analyte (sample) inputs, and no analyte outputs.
A shared result file output is configured to capture logging from the script.
The sample name could be a derivative of the Sample LIMSID, with a prefix:
Because the LIMSID is guaranteed to be unique, this approach mitigates any need to maintain an external sequence of numbers.
The Sample LIMSID is derived from the Project LIMSID, which is configurable.
The Receive Samples process is configured to trigger a script that renames the samples that are input to the process.
This trigger also passes the OriginatingProcessURI to the script. This example assumes that the original submitted sample name must be preserved, and so it is saved in a sample UDF.
The following pseudo code shows how one might implement the sample-renaming script:
Connect to the API, using the OriginatingProcessURI.
Retrieve the OriginatingProcessXML and store it in a variable.
Iterate through the input-output map of the OriginatingProcessXML, and for each InputArtifact:
GET the InputArtifactURI and store the input ArtifactXML in a variable.
From this ArtifactXML, GET the SourceSampleXML and store it in another variable.
Modify the SourceSampleXML. To do this:
Store the original sample name in a sample UDF, so it is preserved.
Rename the SampleName to a desired name (see Recommendations section, above).
Finally, PUT the modified Sample XML back.
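A minimal Python sketch of this pseudocode (requests library). The hostname, credentials, the "RECV-" prefix, and the "Submitted Name" UDF are illustrative assumptions:

```python
import xml.etree.ElementTree as ET
import requests
from requests.auth import HTTPBasicAuth

AUTH = HTTPBasicAuth("apiuser", "apipass")   # placeholder credentials
UDF = "{http://genologics.com/ri/userdefined}"

def get_xml(uri):
    r = requests.get(uri, auth=AUTH)
    r.raise_for_status()
    return ET.fromstring(r.content)

# The OriginatingProcessURI, passed in by the automation trigger.
process = get_xml("https://yourServerNameOrIP/api/v2/processes/24-1234")

for iomap in process.findall("input-output-map"):
    artifact = get_xml(iomap.find("input").get("uri"))
    sample_uri = artifact.find("sample").get("uri")
    sample = get_xml(sample_uri)
    name_node = sample.find("name")
    # Preserve the original name in a sample UDF (assumed to be present
    # in the XML and named "Submitted Name"), then rename.
    for field in sample.findall(UDF + "field"):
        if field.get("name") == "Submitted Name":
            field.text = name_node.text
    name_node.text = "RECV-" + sample.get("limsid")   # illustrative convention
    requests.put(sample_uri, data=ET.tostring(sample), auth=AUTH,
                 headers={"Content-Type": "application/xml"}).raise_for_status()
```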
Use the API to update the preset value of a user-defined field (UDF)/custom field configured on a step.
From your test server:
GET a chosen UDF/custom field.
Do a PUT and include a new line.
For example, to add 'My new preset', insert the preset (My new preset), after your last value in your XML:
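A minimal Python sketch of the GET-modify-PUT round trip (requests library). The field's configuration URI is a placeholder, and presets are assumed to appear as <preset> child elements of the configuration XML:

```python
import xml.etree.ElementTree as ET
import requests
from requests.auth import HTTPBasicAuth

AUTH = HTTPBasicAuth("apiuser", "apipass")   # placeholder credentials
field_uri = "https://yourServerNameOrIP/api/v2/configuration/udfs/123"  # placeholder

r = requests.get(field_uri, auth=AUTH)
r.raise_for_status()
field = ET.fromstring(r.content)

# Insert the new preset directly after the existing presets so the
# element order stays schema-valid.
new = ET.Element("preset")
new.text = "My new preset"
presets = field.findall("preset")
children = list(field)
idx = children.index(presets[-1]) + 1 if presets else len(children)
field.insert(idx, new)

requests.put(field_uri, data=ET.tostring(field), auth=AUTH,
             headers={"Content-Type": "application/xml"}).raise_for_status()
```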
This tool is powerful when integrating with external systems and combined with the Begin Work trigger. For example, it can be used to reach out to an external source with a script, initiated with the Begin Work trigger. The script makes sure that the presets for the Step Details UDFs/custom fields are always up to date and in sync with the server—before entering the Record Details screen.
Within a script, you may sometimes need to know to which workflow the current sample is assigned.
However, in Clarity LIMS, the XML payload that relates to the sample does not provide information about the workflow associations of the sample.
For example, consider a sample (artifact), picked at random, from a demo system:
It is evident that this XML payload does not provide the workflow information.
The following solution shows how to use the Clarity LIMS API to determine the association of a sample to one or more workflows.
The XML payload that corresponds to each sample artifact contains a link to the related submitted sample (or samples, if it is a pooled artifact).
Follow that link to see what it yields:
The XML corresponding to the submitted sample has a link to an artifact. This artifact is special for several reasons:
It is known as the 'root artifact'.
It has an unusual LIMS ID for an artifact. Artifact LIMS IDs typically start with '2-' for analytes and '92-' for result files. This one appears to be derived from the LIMS ID of the sample: KUZ407A145PA1
A root artifact is created 'behind the scenes' whenever a submitted sample is created in the system.
The sample history in Clarity LIMS makes it appear as if the first step in the workflow is run on the submitted sample. However, it is actually the root artifact that is the input to the first process.
When a submitted sample is assigned to the workflow, it is the root artifact that is assigned to that workflow.
Therefore, if gathering the XML payload corresponding to the root artifact, you should see the workflow assignment:
The key element is as follows.
The name of the artifact-group (Sanger Sequencing) should match the name of the workflow in which the root artifact (and by inference, artifacts derived from the root artifact) is assigned.
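A minimal Python sketch of this lookup (requests library; the artifact URI and credentials are placeholders):

```python
import xml.etree.ElementTree as ET
import requests
from requests.auth import HTTPBasicAuth

AUTH = HTTPBasicAuth("apiuser", "apipass")   # placeholder credentials

def get_xml(uri):
    r = requests.get(uri, auth=AUTH)
    r.raise_for_status()
    return ET.fromstring(r.content)

artifact = get_xml("https://yourServerNameOrIP/api/v2/artifacts/2-1234")  # placeholder
for sample_link in artifact.findall("sample"):          # more than one if pooled
    sample = get_xml(sample_link.get("uri"))
    root = get_xml(sample.find("artifact").get("uri"))  # the root artifact
    groups = root.findall("artifact-group")
    if groups:
        for g in groups:
            print(sample.get("limsid"), "->", g.get("name"))
    else:
        print(sample.get("limsid"), "-> no workflow assignment")
```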
If you find that the artifact-group node is missing from some of the root artifacts, there are several potential reasons:
The workflow has been completed, causing the root artifact to be unassigned from the workflow.
The derived samples / artifacts have been removed from the workflow intentionally, because of a sample processing issue.
An API script has intentionally removed the derived samples / artifacts from the workflow.
The assigned workflow has been marked as 'Archived'.
How to copy the value of a UDF/custom field from source to destination (typically from the inputs of a process/step to the outputs) is a frequently asked question.
For example, suppose a process/step takes in libraries and tracks their normalization. In such a case, the input samples have a UDF/custom field that is used to track the library ID. Since the library ID changes, it is desirable for the output samples to also have this ID.
Use the API to gather the XML for the inputs, then copy the XML node relating to the UDF/custom field to the outputs.
Alternatively, use the out-of-the-box copyUDFs script, which Illumina provides as part of the NextGen Sequencing configuration.
The copyUDFs script is available in the ngs-extensions.jar archive*, and can be called from the EPP / automation parameter string.
The archive file may be named differently, depending upon the version you are running.
Usage:
The UDF / custom field values to be copied are defined in the -f portion of the syntax. These values must be present on both the inputs and outputs of a process.
For example, suppose you wanted to use this script to copy the value of a UDF called Library ID:
The Library ID field must be defined on both inputs and outputs.
The -f flag is defined as follows:
To copy multiple UDF values from source to destination, list them in comma-separated form as part of the -f flag.
To copy Library ID and Organism from source to destination, use the following example:
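The exact command line for the packaged script depends on your archive version (see the attachment). As an alternative, the API approach mentioned above can be sketched in Python (requests library), copying Library ID and Organism from each input to its output; the process URI and credentials are placeholders, and the field elements are assumed to be present in the destination XML:

```python
import xml.etree.ElementTree as ET
import requests
from requests.auth import HTTPBasicAuth

AUTH = HTTPBasicAuth("apiuser", "apipass")   # placeholder credentials
UDF = "{http://genologics.com/ri/userdefined}"
FIELDS = ["Library ID", "Organism"]          # defined on inputs AND outputs

def get_xml(uri):
    r = requests.get(uri, auth=AUTH)
    r.raise_for_status()
    return ET.fromstring(r.content)

def udf_value(node, name):
    for f in node.findall(UDF + "field"):
        if f.get("name") == name:
            return f.text
    return None

process = get_xml("https://yourServerNameOrIP/api/v2/processes/24-1234")  # placeholder
for iomap in process.findall("input-output-map"):
    source = get_xml(iomap.find("input").get("uri"))
    dest_uri = iomap.find("output").get("uri")
    dest = get_xml(dest_uri)
    for name in FIELDS:
        value = udf_value(source, name)
        if value is None:
            continue
        for f in dest.findall(UDF + "field"):   # assumes the field element exists
            if f.get("name") == name:
                f.text = value
    requests.put(dest_uri, data=ET.tostring(dest), auth=AUTH,
                 headers={"Content-Type": "application/xml"}).raise_for_status()
```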
When running the Aggregate QC step in Clarity LIMS, the QC pass and fail flags for the samples display in the Record Details screen.
This section explains how to use the API instead to find the samples that passed or failed QC aggregation.
Query the API and filter the results list based on the qc-flag parameter value. For more on filtering, see section.
To filter the list by QC flag with a value of PASSED, use the following example:
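For example (the qc-flag filter on the artifacts list resource is assumed here):
https://yourServerNameOrIP/api/v2/artifacts?qc-flag=PASSED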
To find an individual QC flag result for an individual sample, use the LIMS ID of the sample:
Then search for the value of the <qc-flag> element in the endpoint payload for the artifact.
The <qc-flag> element of the input analyte (sample) artifact is sent into the Aggregate QC step.
To demonstrate this detail, review the following steps:
In the API, find a single analyte artifact (derived sample) that has passed QC. The XML QC flag value is PASSED.
In Clarity LIMS, find the same sample and change the value of the QC flag from PASSED to FAILED. Save the change.
In the API, find the sample again. See that the XML QC flag value is set to FAILED.
The QC flag parameter qc-flag can be set on an input or output analyte (derived sample) or on an individual result file (measurement) with a few lines of Groovy code.
In the following example, the qc-flag value of the analyte artifact is set based on the value of the bp_size variable when compared to the threshold1 and threshold2 variables.
The following code determines whether a qc-flag value is previously set, such that a flag is only set if one does not exist.
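The attached examples are Groovy; the same logic can be sketched in Python (requests library). The artifact URI, bp_size, and thresholds are illustrative:

```python
import xml.etree.ElementTree as ET
import requests
from requests.auth import HTTPBasicAuth

AUTH = HTTPBasicAuth("apiuser", "apipass")   # placeholder credentials
artifact_uri = "https://yourServerNameOrIP/api/v2/artifacts/92-123"  # placeholder

r = requests.get(artifact_uri, auth=AUTH)
r.raise_for_status()
artifact = ET.fromstring(r.content)
flag = artifact.find("qc-flag")

bp_size, threshold1, threshold2 = 325.0, 300.0, 400.0   # illustrative values

# Only set a flag if one has not been set already (UNKNOWN is the unset state).
if flag is not None and flag.text in (None, "UNKNOWN"):
    flag.text = "PASSED" if threshold1 <= bp_size <= threshold2 else "FAILED"
    requests.put(artifact_uri, data=ET.tostring(artifact), auth=AUTH,
                 headers={"Content-Type": "application/xml"}).raise_for_status()
```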
When a sequencing run is complete, it is often desirable to pass data to CASAVA for BCL conversion automatically rather than manually. This section proposes a method to configure this automation.
NOTE: This solution is not tested end-to-end on an instrument.
The proposed approach involves adding an automation trigger to the Sequencing step, such that it invokes a script that launches the BCL Conversion step.
However, because the BCL Conversion step does not run immediately, it is launched in a dormant state until the Sequencing step is complete.
The key event here is the Run Report that is created and attached to the Sequencing step. As the last event to occur in the step, the creation of this report is used to prompt the BCL Conversion step to 'wake up' from its dormant state and begin processing.
The following pseudocode describes the work that must occur within the script:
This example assumes the existence of a script that launches the BCL Conversion step via the API. The creation of such a script is covered in . This example only covers the functionality of the script rather than the code.
In addition to the expected processURI, username, and password parameters/tokens, the script should accept another parameter (the LIMSID of the Run Report from the Sequencing step).
For example, the script can be invoked as follows:
Use this syntax when configuring the command line on the Sequencing process/step.
Configure the automation so the script is automatically triggered when exiting the Record Details screen.
The BCL Conversion process is configured:
To take in a ResultFile input and generate a non-shared ResultFile output
With a process parameter of 'Standard,' which initiates the actual BCL conversion.
The script is passed the value '92-3771' as the -r parameter.
This is then converted to a full URI and becomes the input element of the following XML, which is POSTed to the /processes API resource:
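A sketch of the POST with the XML inline (Python requests). The payload shape follows the prx:process elements described later in this document; the researcher LIMS ID, hostname, and credentials are placeholders to adapt:

```python
import requests
from requests.auth import HTTPBasicAuth

AUTH = HTTPBasicAuth("apiuser", "apipass")   # placeholder credentials
BASE = "https://yourServerNameOrIP/api/v2"

# Input 92-3771 (the Run Report), a ResultFile output, and the 'Standard'
# process parameter, per the configuration described above.
xml = """<prx:process xmlns:prx="http://genologics.com/ri/processexecution">
  <type>BCL Conversion</type>
  <technician uri="{base}/researchers/1"/>
  <input-output-map>
    <input uri="{base}/artifacts/92-3771"/>
    <output type="ResultFile"/>
  </input-output-map>
  <process-parameter name="Standard"/>
</prx:process>""".format(base=BASE)

r = requests.post(BASE + "/processes", data=xml, auth=AUTH,
                  headers={"Content-Type": "application/xml"})
r.raise_for_status()   # on success, the created process XML is returned
```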
Update all URIs in the XML to point to the hostname and API version for the system.
Provide a valid URI for the lab scientist. There might be a user in the system with LIMS ID of '1'.
If the POST is successful, the API returns the valid XML for the created process.
Note: This scenario is one of the few occasions where the POST succeeds, yet returns XML that differs from the input XML. The results can be confusing, because a standard approach for validating whether POSTs are successful is to compare the output XML with the input. If they differ, assume that the POST failed. However, in this scenario it did not fail.
This topic explains how to:
Detect when files have been uploaded.
Extract the key information that might comprise a notification.
The Files API Resource
The key resource to investigate is the files resource, which provides a listing of files within the system.
On a test system accessing the files resource as follows:
produces the following output:
Although not particularly useful in itself, the files URI becomes more interesting when we filter it to only include files uploaded after a specified date-time, and also only those files that have a published status of 'true'.
For example, the following URI:
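(The exact filter parameters are an assumption; adapt them to the filters your API version documents for the files resource.)
https://yourServerNameOrIP/api/v2/files?published=true&last-modified=2014-06-10T00:00:00Z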
produces this output on a test system:
This outcome is much more manageable. Because these files were uploaded via the Collaborations Interface, they inherently have a published status of 'true'. This status is used to exclude regular files uploaded to the LIMS via other methods and interfaces.
By following the URIs to retrieve the full XML representations of these files, the output is similar to the following:
and:
Retrieve the associated project/sample, and extract the names and/or IDs to embed into the notification, by following the URI in the 'attached-to' elements.
In this case, the following result is produced:
and:
A script must be run periodically (hourly/daily) that queries the files resource for files that have a published status of true, and are last modified in the period of interest.
After this list of files is retrieved, the following pseudocode can be applied:
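A Python sketch of that pseudocode (requests library). The filter parameters and the print-based notification sink are assumptions:

```python
import xml.etree.ElementTree as ET
import requests
from requests.auth import HTTPBasicAuth

AUTH = HTTPBasicAuth("apiuser", "apipass")   # placeholder credentials
BASE = "https://yourServerNameOrIP/api/v2"

def get_xml(uri):
    r = requests.get(uri, auth=AUTH)
    r.raise_for_status()
    return ET.fromstring(r.content)

# Published files modified since the last run of this job.
listing = get_xml(BASE + "/files?published=true&last-modified=2014-06-10T00:00:00Z")
for link in listing.findall("file"):
    f = get_xml(link.get("uri"))
    entity = get_xml(f.find("attached-to").text)   # the project or sample
    print("File %s was uploaded to %s" %
          (f.find("original-location").text, entity.find("name").text))
```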
An example derived from the above XML could lead to the following notifications:
Server-side configuration allows multiple filestores to be associated with entities (samples, projects, processes/steps, and so on) in BaseSpace Clarity LIMS.
This feature allows linking to large data files on a different server, eliminating the need to move large files onto the Clarity LIMS filestore. Large files can include results, images, searches, and so on.
For example, sequencing instruments typically produce large result files. Attaching these files to the Sequencing step in Clarity LIMS has the following drawbacks:
It involves transferring the files to the Clarity LIMS filestore. The larger the file, the slower the transfer.
It requires a large amount of space as runs build up.
An alternative solution is to set up a remote filestore to be used as the results directory from which Clarity LIMS accesses the files directly.
To do this setup, three steps are required:
Set up HTTP, HTTPS, FTP, or SFTP access to the files and folders you wish to share.
Configure the Clarity LIMS server to recognize the URI of a file on the remote filestore.
POST information to Clarity LIMS, via the REST API, to reference the file from a Clarity LIMS entity (project, sample, process/step, result file, and so on).
BaseSpace Clarity LIMS can operate with many different forms of file servers – HTTP, HTTPS, FTP, and SFTP access are all supported.
It is your responsibility to set up this access. For HTTP, a standard web server (eg, Apache or NGINX) can be used for HTTP file serving.
To track a new remote filestore, Clarity LIMS requires four database properties: directory, host, port, and scheme.
The four properties share a base name, but have different suffixes attached (dir, host, port, scheme). These suffixes are summarized in the following table.
The base name can be anything. Clarity LIMS finds any base names that end in .scheme and uses that base name to find the other information.
If necessary, add the last three properties listed in the table (with the .domain, .user, and .password suffixes) to specify a domain, username, and password to be used when accessing files.
Clarity LIMS v5 and later—For the property changes to take effect, Tomcat must be restarted.
Use the omxprops-ConfigTool.jar to create, update, and retrieve values of the database properties. This tool is found at the following location: /opt/gls/clarity/tools/propertytool
To create a property, use the following examples:
NOTE: These properties may not be global properties. Do not use the -g flag here.
To get the value of an existing property:
To update the value of an existing property:
To encrypt a password:
NOTE: To set a property to the encrypted result, set the value as ENC(<encrypted result>).
In this case, the base name for the properties is http-lims-files.
Steps
As the glsjboss user, access the omxprops property tool in /opt/gls/clarity/tools/propertytool.
Add the following dir, host, port, and scheme properties to the server from the command line:
In the example above, the http-lims-files.dir property value is /limsdata. Any file in http://<YourHTTPHost>/limsdata/ is available to be referenced by BaseSpace Clarity LIMS.
For all files on the web server to be available, use the / parameter value, for example:
After the filestore properties are added to Clarity LIMS (and JBoss/Tomcat has been restarted, as applicable), you can attach the files to Clarity LIMS.
To attach the files to Clarity LIMS:
POST to http://hostname/api/v2/files, with the content-location tag pointing to the remote filestore.
An example XML POST is provided, using the filestore created in the previous example:
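A sketch of this POST in Python (requests library), using the http-lims-files filestore from the example above. The attached-to project and the file path under /limsdata are placeholders:

```python
import requests
from requests.auth import HTTPBasicAuth

AUTH = HTTPBasicAuth("apiuser", "apipass")   # placeholder credentials
BASE = "https://yourServerNameOrIP/api/v2"

xml = """<file:file xmlns:file="http://genologics.com/ri/file">
  <attached-to>{base}/projects/ADM123</attached-to>
  <content-location>http://YourHTTPHost/limsdata/run42/report.pdf</content-location>
  <original-location>report.pdf</original-location>
</file:file>""".format(base=BASE)

r = requests.post(BASE + "/files", data=xml, auth=AUTH,
                  headers={"Content-Type": "application/xml"})
r.raise_for_status()
```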
Results
The file is now downloadable directly from Clarity LIMS.
Any entity that can have a file attached to it may be referenced in the attached-to element.
| Property Name | Usage | Example Value | Required | Description |
| --- | --- | --- | --- | --- |
| Directory | ${baseName}.dir | /limsdata | True | The highest level directory in which it is valid to access files. Files outside this directory are not attached. |
| Hostname/IP | ${baseName}.host | YourHTTPHost | True | The hostname or IP address to use when accessing the files. |
| Port | ${baseName}.port | 80 | True | The port to use when accessing the files. |
| Scheme | ${baseName}.scheme | http | True | The scheme of the URI used to access the files. Examples are HTTP, HTTPS, FTP, and SFTP. |
| Domain | ${baseName}.domain | YourAuthDomain | False | The domain to use when authenticating access to the files. |
| Username | ${baseName}.user | fileUser | False | The username to use when authenticating access to the files. |
| Password | ${baseName}.password | filePassword | False | The password to use when authenticating access to the files. It is highly recommended that you encrypt your password. |

The following example maps a remote HTTP URI:
For more information on working with files, see .
The Clarity LIMS Cookbook uses example scripts to help you learn how to work with REST and EPP automation scripts. Cookbook recipes are small, specific how-to articles designed to help you understand REST and automation script concepts. Each recipe includes the following:
Explanations about a concept and how a particular programming interface is used in a script.
A snippet of script code to demonstrate the concept.
The API documentation includes the terms External Program Integration Plug-in (EPP) and EPP node.
As of Clarity LIMS v5.0, these terms are deprecated.
EPP has been replaced with automation.
EPP node is referred to as the Automation Worker or Automation Worker node. These components are used to trigger and run scripts, typically after lab activities are recorded in the LIMS.
The best way to get started is to download the example script and try it out. After you have seen how the script works, you can dissect it and use the pieces to create your own script.
At the completion of a process (using API v2 r21 or later), EPP can invoke any external program that runs from a command line. In this example, a process with a reference to a declared EPP program is configured and executed entirely via the API.
EPP automation/support is compatible with API v2 r21 and later.
The API documentation includes the terms External Program Integration Plug-in (EPP) and EPP node.
As of Clarity LIMS v5.0, these terms are deprecated.
EPP has been replaced with automation.
EPP node is referred to as the Automation Worker or Automation Worker node. These components are used to trigger and run scripts, typically after lab activities are recorded in the LIMS.
You have defined a process that has:
An input of type analyte.
A single output per input.
A single shared result file.
The process type is associated with an external program that has the following requirements:
At least one process-parameter defined - named TestProcessParam.
A parameter string of:
bash -c "echo HelloWorld > {compoundOutputFileLuid0}.txt"
Samples have been added to the LIMS.
To run a process on a sample, you must first identify the sample to be used as the input to the process.
For this example, run the process on the first highlighted sample.
After you have identified the sample, you can use its LIMS ID as a parameter for the script. The artifact URI is then used as the input in constructing the XML to POST and executing a process.
The following code block outlines this action and obtains the URI of the container for the process execution POST.
NOTE: As shown in other examples, you can use StreamingMarkupBuilder to construct the XML needed for the POST.
You now have all the pieces of data to construct the XML for the process execution. The following is an example of what this XML looks like.
Executing a process uses the processexecution (prx) namespace. The following elements are required for a successful POST:
type - the name of the process being run
technician uri - the URI for the technician that will be listed as running the process
input-output-map - one input-output-map element for each pair of inputs and outputs
input uri - the URI for the input artifact
output type - the type of artifact of the output
If the outputs of the process are analytes, then the following elements are also required:
container uri - the URI for the container the output will be placed in
value - the well placement for the output
To use the configured EPP process, the process-parameter element is required. This element is the name of the configured EPP that is executed when this process is posted.
The following elements that match the processParamName variable must exist in the system before the process can be executed:
Process type
Technician
Input artifact
Container
EPP parameter
With analyte outputs, if there are no containers with empty wells in the system, you must create one before running the process.
The XML constructed must match the configuration of the process type. For example, if the process is configured to have both analytes and a shared result file as outputs, you must have the following:
An input-output-map for each pair of analyte inputs and outputs.
An additional input-output-map for the shared result file.
The name on the process execution XML must match one of the declared EPP parameter names. This requirement is true for any EPP parameters.
If the POST is Successful, then the process XML is returned.
In the following example, there are two <input-output-map> elements. The second instance has the output-generation-type of PerAllInputs. This element indicates that the result file is shared and only one is produced, regardless of the number of inputs.
If the POST is Not Successful, then the XML that is returned contains the error that occurred when the POST completed. The following example shows this error:
Attachments
ExecuteProcessWithEPP.groovy:
autocomplete-process.py:
Before downloading your first script, do the following actions:
Familiarize yourself with the API Cookbook prerequisites and key concepts found in and .
Use a non-production server for script development.
Familiarize yourself with the coding language.
Use the GLSRestApiUtils file to assist with recipe development.
Review .
The example script recipes really come to life when you change them and see what happens. Running the scripts often requires new custom fields and master steps to be added to the system. You need unrestricted access to development and test servers (licensed as non-production servers) with Groovy (a coding language). You also need an AI node/automation worker installed so that you can experiment freely.
For more information and recommendations for deploying and copying scripts in development, test, and production environments, refer to .
The Cookbook Recipe Examples are written in Groovy. Many of our examples use the following Groovy concepts:
Closures: Groovy closures are essentially blocks of code that can be stored for later use.
The each method: The each method takes a closure as an argument. It then iterates through each element in a collection, performing the closure on the element, which is (by default) stored in the 'it' variable. For example, `["a", "b", "c"].each { println it }` prints each element in turn.
Python
The Cookbook also provides a few examples written in Python, which uses the minidom module. The following script shows how the minidom module is used:
This same functionality can be obtained using any programming language capable of interacting with the Web API. For more information on the minidom module, refer to Python Minidom.
In addition to the Groovy file example attached to each Cookbook recipe page, most recipes require the glsapiutil.py file, which is available on our GitHub repository. The mature glsapiutil.py library is strictly for Python 2. A newer version, glsapiutil3.py, works with Python 3.
This article provides hints and tips to help you get the most out of the Cookbook recipes included in this section.
When reading a recipe, look for file attachments. Almost all examples have an attached Groovy script to download.
To use the scripts with a non-production server, edit the script to include your server network address and credentials.
For illustration purposes, most scripts use populated information. You must add your own sample, process (eg, a master step in Clarity LIMS v5 and later), and other data. The non-production server has a directory set up for this purpose at
Using Full Production Scripts
When using full production scripts, the following considerations must be taken:
Cookbook scripts are written to explain concepts. They are not deeply engineered code written in a defensive programming style. Always think through the expected and unexpected input of your scripts when incorporating concepts or code from Cookbook recipe examples.
Full production servers can require different configurations for scripting languages other than Groovy, and for the EPP/automation worker node. For example, your script directory must be accessible to the user account running the EPP/automation worker node for User Interface (UI) triggers.
Discuss the software deployment plans with your system administrator to coordinate between non-production and production servers. For more information on using production scripts, see and .
Each recipe was written with a specific API version. For information on how to check the version of the API on your system, see .
Apache Groovy is required for most Cookbook examples. It is open source and is available under an Apache license from . It is installed on non-production servers, but you can also install it to your desktop. The Cookbook examples were developed with Groovy v1.7.
Python is required for some Cookbook examples. It is available from . The Cookbook examples were developed with Python v2.7.
The automation worker node executing the command uses the first instance of Groovy it finds in the executable search path for the limited shell. This is the $PATH variable.
If you have multiple versions of Groovy (or multiple users using different versions) and experience problems with your command-line calls, declare the full path to Groovy/Java in your command.
To see your executable search path, and other environment variables available to you, run the following command:
Compare this command to the full logon shell, which is
For details on the programming interface methods and data elements available, refer to the following documentation:
Browsing for, and adjusting resources, in Firefox, Chrome, or other browsers is great for getting started or for troubleshooting.
The following plug-ins are available with Firefox:
Text Link—Makes any URI in the XML a hyperlink.
Linkificator—Converts text links into selectable links.
RESTClient—Provides a simple interface to call HTTP methods on REST resources. It is useful for troubleshooting, checking error codes, and for getting comfortable with GET, PUT, and POST requests.
The following plug-ins are available with Chrome:
Advanced REST Client—Provides similar functionality to Poster by Firefox.
XML Tree—Displays XML data in a user-friendly way.
You can configure the automation trigger and use automation to invoke any external program that runs from a command line. Refer to the following for details:
EPP automation/support is compatible with API v2 r21 and later.
The API documentation includes the terms External Program Integration Plug-in (EPP) and EPP node.
As of Clarity LIMS v5.0, these terms are deprecated.
EPP has been replaced with automation.
EPP node is referred to as the Automation Worker or Automation Worker node. These components are used to trigger and run scripts, typically after lab activities are recorded in the LIMS.
This page is maintained for posterity, but customers are encouraged to visit the repository for all subsequent updates to the library (including changelogs). Unless otherwise specified, changes are only made in the Python version of the library.
Dec. 19, 2017:
glsapiutil v3 ALPHA (bleeding-edge library) released on GitHub. GitHub has the most current library.
Links to library removed from this page.
Dec. 15, 2016:
reportScriptStatus() function had a bug that caused it to not work when a <message> node was unavailable. This has been fixed.
deleteObject() functions now available for both v1 and v2 of the library.
getBaseURI() should now return a trailing slash at the end of the URI string.
getFiles() function added to batch retrieve files.
NOTE: The Python glsapiutil.py and glsapiutil3.py classes are now available on GitHub. GitHub has the most current libraries. glsapiutil3.py works with both Python v2 and v3.
The GLSRestApiUtils utility class provides a consistent way to perform common REST operations, such as REST HTTP methods or common XML string manipulation. It is a utility class written in Python and Groovy for the API Cookbook examples. This utility class is specific to the Cookbook examples. The class is not required for the API with Groovy or Python, as there are many other ways to manipulate HTTP and XML in these languages. However, it is required if you want to run the Cookbook examples as written. It is also not part of REST or EPP/automation.
Almost all Cookbook example files use the HTTP methods from the GLSRestApiUtils class.
The HTTP method calls in Groovy resemble the following example:
In this example, the returnNode and inputNode are Groovy nodes containing XML. The XML in the returnNode contains the XML available from the server after a successful method call. If the method call was unsuccessful, the XML contains error information. The following is an example of the XML manipulation functions in the utility:
As you can see from these examples, the utility class is easy to include in your scripting. The code is contained in the GLSRestApiUtils files attached to this page.
To deploy a Groovy script that uses the utility class, you must include the directory containing GLSRestApiUtils.groovy in the Groovy class path.
Groovy provides several ways to package and distribute source files, including the following methods:
Call Groovy with the -classpath (or -cp) parameter.
Add to the CLASSPATH environment variable.
Create a ~/.groovy/lib directory for jar files for common libraries.
If you would like to experiment with the Cookbook examples, you can also copy the file into the same directory as the example script.
Library functions
The HTTP method calls for the Python version of the library resemble the following:
Unlike the Groovy library, the REST functions in the Python library require XML (text) as input, not DOM nodes. The return values of the GET, PUT, and POST functions are also XML text.
If a script must work with a running process or step, it is normal to use either the {processURI:v2} or the {stepURI:v2} tokens. The following example has the {stepURI:v2} token:
In Clarity LIMS v4 and above, these tokens sometimes resolve to https://localhost:9080/api/v2/... instead of the expected HOSTNAME. Setting up the API object with a hostname other than https://localhost:9080 can cause Access Denied errors. To avoid this issue, alter the API authentication code slightly as follows.
The changes are highlighted in red. This code takes the resolved {stepURI:v2} token (assumed to be stored in the args object) and resets the HOSTNAME variable to the new value (eg, https://localhost:9080) before authenticating.
These changes are fully backward-compatible with Clarity LIMS v4 or earlier. The EPP/automation URI tokens resolve to the expected hostname, and the setupGlobalsFromURI() function still parses it correctly.
Attachments
GLSRestApiUtils.groovy:
The example is similar to this example. The differences are that this example has minimal input/output and posts a reference to a pre-defined EPP process-parameter.
In addition, this example requires a container in which to store the results of the process execution. An example of how to do this is included in the Groovy script under .
For more information on these files, see .
For more information on command-line actions, see .
NOTE: On GitHub, in addition to the libraries, a basic_complete_recipe.py script that contains the skeleton code is available to get you started with the Python API. This script also includes the modifications required to work with Clarity LIMS v4 and later. The legacy Groovy library can still be obtained using the attachment.
When working with submitted samples, you can do the following:
You can rename samples in the system using API (v2 r21 and later). The amount of information provided in the sample name is sometimes minimal. After the sample is in the system, you can add additional information to the name. For example, you can help lab scientists understand what they must do with a sample, or where it is in processing.
As of Clarity LIMS v5, the term user-defined field (UDF) has been replaced with custom field in the user interface. However, the API resource is still called UDF.
There are two types of custom fields:
Master step fields—Configured on master steps. Master step fields only apply to the following:
The master step on which the fields are configured.
The steps derived from those master steps.
Global fields—Configured on entities (eg, submitted sample, derived sample, measurement, etc.). Global fields apply to the entire Clarity LIMS system.
Clarity LIMS displays detailed information for each sample, including its name, container, well, and date submitted.
In this example, the sample name is Colon-1. To help keep context as samples are processed, by default the submitted sample name is used for the downstream samples (or derived samples) generated by a step in Clarity LIMS.
Before you rename a sample, you must first request the resource via a GET. The XML representations of individual REST resources are self-contained entities. Always request the full XML representation before editing any portion of the XML. If you do not use the complete XML when you update the resource, you can inadvertently change data.
The following GET method returns the full XML structure for the sample:
The variable sample now holds the complete XML structure returned from the sampleURI.
The following example shows the XML for the sample, with the name element on the second line. In this particular case, the Clarity LIMS configuration has expanded the sample with 18 custom fields that provide sample information.
Renaming the sample consists of the following:
The name change in the XML
The PUT call to update the sample resource
The name change is executed with the nameNode XML element node, which references the XML element containing the name of the sample.
The PUT method updates the individual sample resource using the complete XML representation, which includes the new name. Such complete updates provide a simple interaction between client and server.
The updated sample view displays the new name. You can also view the results in a web browser via the URI at
http://<YourIPaddress>/api/v2/samples/<SampleLIMSID>
RenamingSample.groovy:
In the Clarity LIMS API (v2 r21 or later), the initial submitted sample is referred to as a sample (or root artifact). Any derived sample output from a process/step is referred to as an analyte, or artifact of type analyte. This example demonstrates the relationship between samples and analyte artifacts. You must have a sample in the system, and one or more processes/steps must have been run that output analyte (derived sample) artifacts.
As of Clarity LIMS v5, the term user-defined field (UDF) has been replaced with custom field in the user interface. However, the API resource is still called UDF.
There are two types of custom fields:
Master step fields—Configured on master steps. Master step fields only apply to the following:
The master step on which the fields are configured.
The steps derived from those master steps.
Global fields—Configured on entities (eg, submitted sample, derived sample, measurement, etc.). Global fields apply to the entire Clarity LIMS system.
The code example does the following:
Retrieves the URI of an arbitrary analyte artifact.
Retrieves the corresponding sample of the artifact.
Retrieves the original root analyte artifact from the sample, as shown in the following example:
You can generate XML for an arbitrary analyte artifact. The analyte artifact is downstream and has a parent-process element (as shown in line 5). The sample artifact is an original artifact. Downstream artifacts relate to at least one sample, but can also relate to more than one sample, as with pooling or shared result files. The following is an example of XML generated for an analyte artifact:
You can also generate XML for a submitted sample. Every submitted sample has exactly one corresponding original root artifact. A sample representation does not link to downstream artifacts, but you can find them using query parameters in the artifacts list resource. The following is an example of XML generated for a submitted sample:
Lastly, you can generate XML for an original sample artifact called a root artifact. The following is an example of XML generated from an original sample artifact. In this case, both the downstream artifact and the original root artifact point to the same original sample (eg, LIMS ID EXA2241A1).
SampleAndAnalyteRelations.groovy:
The most important information about a sample is often recorded in custom fields (API v2 r21 and later). These fields often contain information critical to the processing of the sample, such as species or sample type.
When samples come into the lab, you can provide lab scientists with information about priority or quality. You can provide this information by changing the value of specific sample custom fields.
This example shows how to change the value of a sample custom field called Priority after you have entered a submitted sample into the system.
As of Clarity LIMS v5, the term user-defined field (UDF) has been replaced with custom field in the user interface. However, the API resource is still called UDF.
There are two types of custom fields:
Master step fields—Configured on master steps. Master step fields only apply to the following:
The master step on which the fields are configured.
The steps derived from those master steps.
Global fields—Configured on entities (eg, submitted sample, derived sample, measurement, etc.). Global fields apply to the entire Clarity LIMS system.
In Clarity LIMS, you can display detailed information for a sample, including the following:
Name
Clarity LIMS ID
Custom fields
In the following figure, you can see that the sample name is DNA Sample-1 and the field named Priority has the value High.
In this example, change the value of the Priority custom field to Critical.
Before you can change the value of the field, you must first request the resource via a GET method.
To change a submitted sample in Clarity LIMS, use the individual sample resource. The XML returned from a GET on the individual sample resource contains the information about the sample.
The following GET method returns the full XML structure for the sample:
The sample variable now holds the complete XML structure returned from the sample GET request.
The XML representations of individual REST resources are self-contained entities. Always request the complete XML representation before editing any portion of the XML. If you do not use the complete XML when you update the resource, you can inadvertently change data.
The following shows XML returned for the sample, with the Priority field shown in red in the second to last line. In this example:
The Clarity LIMS configuration has added three fields to the expanded sample information.
The UDFs are named Sample Type, Phenotypic Information, and Priority.
When updating the Priority field, you need to do the following:
Change the value in the XML.
Use a PUT method to update the sample resource.
You can change the value for Priority to Critical by using the utility file's setUdfValue method.
The subsequent PUT method updates the sample resource at the specified URI using the complete XML representation, which includes the new custom field value.
A successful PUT returns the new XML in the returnNode. The results can also be reviewed in a web browser at the <YourIPaddress>/api/v2/samples/<SampleLIMSID> URI.
An unsuccessful PUT returns the HTTP response code and message in the returnNode XML.
NOTE: The values for the other two fields, Sample Type and Phenotypic Information, did not change because they were included in the XML used in the PUT (ie, they were held in the sample variable as part of the complete XML structure).
If those custom fields had not been included in the XML, they would have been updated to have no value.
The following XML from our example shows the expected output:
In Clarity LIMS, the updated sample details now show the new Priority value.
UpdateSampleUDF.groovy:
You can use the API (v2 r21 and later) to automate the process of assigning samples to a workflow. This example shows how to create the required XML. The example also provides a brief introduction on how to use the route/artifacts endpoint, which is the endpoint used to perform the sample assignment.
The example takes two samples that exist in Clarity LIMS and assigns each of them to a different workflow.
Define the assignment endpoint URI using the following example. The assignment endpoint allows you to assign the artifacts to the desired workflow.
You can also retrieve the base artifact URIs of the samples using the following example:
Use the following example to gather the workflow URIs:
Next, you can construct the XML that is posted to perform the workflow assignment. You can do this construction by using the StreamingMarkupBuilder and the following example.
Assign the analyte (derived sample) artifact of the sample to a workflow as follows.
Create an assign tag with the URI of the destination workflow as an attribute.
Create an artifact tag inside the assign tag with the URI of the analyte as an attribute.
After the assignment XML is defined, you can POST it to the API. This POST performs the sample assignment.
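A minimal sketch of the POST in Python (requests library), using the rt:routing payload accepted by the route/artifacts endpoint. The workflow and artifact URIs are placeholders:

```python
import requests
from requests.auth import HTTPBasicAuth

AUTH = HTTPBasicAuth("apiuser", "apipass")   # placeholder credentials
BASE = "https://yourServerNameOrIP/api/v2"

xml = """<rt:routing xmlns:rt="http://genologics.com/ri/routing">
  <assign workflow-uri="{base}/configuration/workflows/101">
    <artifact uri="{base}/artifacts/2-1641"/>
  </assign>
</rt:routing>""".format(base=BASE)
# (Unassignment, described later in this document, uses an <unassign>
# element of the same shape.)

r = requests.post(BASE + "/route/artifacts", data=xml, auth=AUTH,
                  headers={"Content-Type": "application/xml"})
r.raise_for_status()
```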
After the script has run, the samples display in the first step of the first protocol in the specified workflows.
AssigningArtifactsToWorkflows.groovy:
You can add samples to the system using API (v2 r21 and later). This example assumes that you have sample information in a file that is difficult to convert into a format suitable for importing into Clarity LIMS. The aim is to add the samples, and all associated data, into Clarity LIMS without having to translate the file manually. You can use the REST API to add the samples.
Follow the instructions provided in the following examples:
To add a sample in Clarity LIMS, you must assign it to a project and place it into a container. This example assumes that you are adding a new project and container for the samples being created.
As shown in Add a New Project to the System with UDF/Custom Field Value, you define a project by using StreamingMarkupBuilder. StreamingMarkupBuilder is a built-in Groovy data structure designed to build XML structures. This structure creates the XML that is used in a POST to the projects resource:
If the POST to projects is successful, the following XML is returned:
As shown in the Add an Empty Container to the System example, you can add a container by using StreamingMarkupBuilder to create the XML for a new container. This creates the XML that is used in a POST to the containers resource:
If the POST to containers is successful, the following XML is returned:
Now that you have the project and container, you can use StreamingMarkupBuilder to create the sample. The XML created to add the sample uses the URIs for the project and container that were created in the previous steps.
This POST to the samples resource creates a sample in Clarity LIMS, adding it to the project and container specified in the POST.
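A minimal Python sketch of this final POST (requests library), reusing placeholder project and container URIs in the smp:samplecreation payload assumed here:

```python
import requests
from requests.auth import HTTPBasicAuth

AUTH = HTTPBasicAuth("apiuser", "apipass")   # placeholder credentials
BASE = "https://yourServerNameOrIP/api/v2"

xml = """<smp:samplecreation xmlns:smp="http://genologics.com/ri/sample">
  <name>Colon-1</name>
  <project uri="{base}/projects/ADM123"/>
  <location>
    <container uri="{base}/containers/27-505"/>
    <value>A:1</value>
  </location>
</smp:samplecreation>""".format(base=BASE)

r = requests.post(BASE + "/samples", data=xml, auth=AUTH,
                  headers={"Content-Type": "application/xml"})
r.raise_for_status()   # on success, the created sample XML is returned
```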
In Clarity LIMS Projects and Samples dashboard, open the project to find the new sample in its container.
PostSample.groovy:
When working with containers, you can do the following:
Samples in the lab are always in a container (eg, a tube, plate, or flow cell). When a container holds more than one sample, it is often easier to track the container rather than the individual samples. These containers can be found in API (v2 r21 or later).
In Clarity LIMS, containers are identified by LIMS ID or by name. The best way to find a container in the API is by LIMS ID. However, the API also supports searching for containers by name, using a filter.
LIMS ID—This is a unique ID. For example, the container resource with LIMS ID 27-42 can be found at http://<YourIPaddress>/api/v2/containers/27-42
Name—Container names can be unique, depending on how the server software was set up. In some labs, container names are reused to show when a container is recycled or when samples are submitted in containers.
The following example shows a container list filtered by name. Your system contains a series of containers, named with a specific naming convention.
In this example, the queried containers are named Smith553 and 001TGZ.
The request for a container with a specific name is structured in the same way as the request for all containers, but also includes a parameter to filter by name:
The name parameter is repeatable, and the results returned match any of the names queried:
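For example:
https://yourServerNameOrIP/api/v2/containers?name=Smith553&name=001TGZ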
The GET method returns the full XML structure for the list of containers matching the query. In this case, the method returns the XML structure for containers with the names Smith553 and 001TGZ.
The XML contains a list of container elements. The .each method goes through each container node in the list and prints the container LIMS ID.
The XML returned is placed in the variable containers:
If the system has no containers named Smith553 or 001TGZ, then containers.container is an empty list. The .each method does nothing, as expected.
When execution completes, the code returns the list of LIMS IDs associated with the container names Smith553 and 001TGZ. The names and LIMS IDs differ in this case (eg, 27-505 and 27-511).
GetContainerNameFilter.groovy:
As samples are processed in the lab, they are kept in a container. Some of these containers hold multiple samples, and lab scientists often must switch between container tracking and sample tracking.
If you process several containers each day and track them in a list, you would need to find which samples are in those containers. This way, you can record specifics from these container-based activities in relation to the samples from Clarity LIMS.
The example finds which sample is in a given well of a multi-well container using Clarity LIMS and API (v2 r21 or later).
Before you follow the example, make sure that you have the following items:
Several samples exist in the Clarity LIMS.
A step has been run on the samples.
The outputs of the step have been placed in a 96-well plate.
Clarity LIMS captures detailed information for a container (eg, its name, LIMS ID, and the names of the samples in each of its wells). Information about the container and what it currently contains is available in the individual XML resource for the container.
The individual container resource contains a placement element for each sample placed on the container. Each placement element has a child element named value that describes one position on the container (eg, the placement elements for a 96-well plate include A:1, B:5, E:2).
In the script, the GET request retrieves the container specified by the container LIMS ID provided as input to the {containerLIMSID} parameter. The XML representation returned from the API is stored as the value of the container variable:
The following example shows the XML format returned for a container. The XML includes a placement element for each artifact that is placed in a well location in the container.
When you look for the artifact at the target location, the script searches through the placement elements for one with a value element that matches the target. If a match is found, it is stored as the value of the contents variable.
The uri attribute of the matching placement element is the URI of the artifact in the target well location. This URI is stored as the value of the artifactURI variable and printed as the output of the script:
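A minimal Python sketch of this lookup (requests library; the container LIMS ID and target well are placeholders):

```python
import xml.etree.ElementTree as ET
import requests
from requests.auth import HTTPBasicAuth

AUTH = HTTPBasicAuth("apiuser", "apipass")   # placeholder credentials
BASE = "https://yourServerNameOrIP/api/v2"
target = "B:5"                                # the well location to look up

r = requests.get(BASE + "/containers/27-42", auth=AUTH)
r.raise_for_status()
container = ET.fromstring(r.content)

# Find the placement whose value element matches the target well.
artifact_uri = None
for placement in container.findall("placement"):
    if placement.find("value").text == target:
        artifact_uri = placement.get("uri")
print(artifact_uri)
```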
Running the script in a console produces the artifact at
GetContentsOfWellLocation.groovy:
In Clarity LIMS, derived sample automations are automations that users can run on derived samples directly from the Projects Dashboard.
The following example uses an automation to initiate a script that removes multiple derived samples from workflows. The example also describes the main functions included in the script, and shows how to configure the automation in Clarity LIMS and run it from the Projects Dashboard.
Before removing samples from the workflows, make sure you have the following items:
A project containing at least one sample assigned to a workflow.
A step has been run on a sample, resulting in a derived sample.
The derived sample is associated with one or more workflows.
The attached UnassignSamplesFromWorkflows.groovy script uses the derived sample automations feature to remove selected derived samples from their associated workflows. The script performs the following actions when removing samples from these workflows.
The getSampleNodes function is passed a list of derived sample LIMS IDs (as a command-line argument) and builds a list containing the XML representations of the samples. A for-each loop over the derived sample list makes a GET call for each sample and creates the sample node list. The following command-line example shows how the getSampleNodes function works:
This list is used to retrieve the sample URIs and the workflow-stage URIs. These URIs are required to build the unassignment XML.
A sample can be associated with one or many workflows, and each derived sample has a list of the workflow-stages to which it is assigned. Making a GET call on each workflow-stage URI retrieves its XML representation, from which the workflow URI can be acquired and added to a list. The getWorkflowURIs function is called for each sample node in the list built by getSampleNodes, along with the username and password.
The for loop does the following actions (a sketch follows the list):
Makes a GET call for each workflow-stage to which the passed sample is assigned.
Retrieves the associated workflow URIs.
Returns a list containing all URIs for the workflows with which the sample is associated.
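A minimal sketch of getWorkflowURIs follows. The workflow-stage and workflow element names reflect the artifact and stage XML described above; the exact GLSRestApiUtils signatures are assumptions, so treat the attached script as the working version.

```groovy
// Sketch: collect the workflow URIs for one sample (artifact) node
def getWorkflowURIs(sampleNode, username, password) {
    def workflowURIs = []
    sampleNode.'workflow-stages'.'workflow-stage'.each { stage ->
        // GET each workflow-stage the sample is assigned to ...
        def stageNode = GLSRestApiUtils.httpGET(stage.@uri, username, password)
        // ... and record the URI of the workflow that owns the stage
        workflowURIs << stageNode.workflow[0].@uri
    }
    workflowURIs.unique()
}

def workflowURIs = getWorkflowURIs(sampleNode, username, password)
```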
Now that the functions used to retrieve both the derived sample URIs and the workflow URIs have been built, you can use StreamingMarkupBuilder to create the XML and then POST to the unassignment URI. This process can be done with the unassignSamplesFromWorkflows and unassignSamplesXML functions.
To unassign the derived samples, you can POST to the routing resource at {hostname}/api/v2/route/artifacts. Nested loops create the declaration for each sample and its associated workflows. The declaration is built as the workflow URI with the unassign flag, followed by the URI of the sample being unassigned.
Now that the XML is built, convert the XML to a node and post it as follows.
Use GLSRestApiUtils to convert the XML to a node.
POST the node using the following command:
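A minimal sketch of the build-and-POST sequence follows. The rt (routing) namespace URI, the sampleToWorkflowURIs map, and the helper signatures are assumptions; the attached UnassignSamplesFromWorkflows.groovy is the working version.

```groovy
import groovy.xml.StreamingMarkupBuilder

// sampleToWorkflowURIs is a hypothetical map of sample URI -> list of workflow URIs
def unassignXML = new StreamingMarkupBuilder().bind {
    mkp.declareNamespace(rt: 'http://genologics.com/ri/routing')
    'rt:routing' {
        sampleToWorkflowURIs.each { sampleURI, workflowURIs ->
            workflowURIs.each { workflowURI ->
                // One unassign declaration per workflow, naming the sample to remove
                unassign('workflow-uri': workflowURI) {
                    artifact(uri: sampleURI)
                }
            }
        }
    }
}

// Convert the markup to a node and POST it to the routing resource
def unassignNode = GLSRestApiUtils.xmlStringToNode(unassignXML.toString())
def returnNode = GLSRestApiUtils.httpPOST(unassignNode, "${hostname}/api/v2/route/artifacts", username, password)
```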
Configure and run the automation in Clarity LIMS as follows.
In Clarity LIMS, under Configuration, select the Automation tab.
Select the Derived Sample Automation tab.
Select New Automation and enter the following information:
Automation Name—This is the name that displays to the user running the automation from the Projects Dashboard. Choose a descriptive name that reflects the functionality/purpose (eg, Remove from Workflows).
Channel Name—Enter the channel name.
Command Line—Enter the command line required to invoke the script.
Select Save.
Run the automation as follows.
Open the Projects Dashboard.
Select a project containing in-progress samples. Select In-progress samples.
In the sample list, you see the submitted and derived samples that are currently in progress for this project.
Select one or more derived samples.
Selecting samples activates the Action button and drop-down list.
In the Action drop-down list, select the Remove From Workflows automation created in the previous step.
In the API, the selected samples now show an additional workflow stage with a status of REMOVED.
UnassignSamplesFromWorkflows.groovy:
When a lab processes samples, the samples are always in a container of some sort (eg, a tube, a 96-well plate, or a flow cell). In Clarity LIMS, this processing is modeled by placing all samples into containers. Because the Clarity LIMS interface relies on container placement for the display of many of its screens, adding containers is a critical step when running a process or adding samples through the API (v2 r21 or later).
The following example demonstrates how to add an empty container, of a predefined container type, to Clarity LIMS through the API.
If you would like to add a batch of containers to the system, you can increase the script execution speed by using batch operations. For more information, refer to the API Portal and the articles in the Working with Batch Resources section.
Before you can add a container to the system, you must first define the container to be created. You can construct the XML that defines the container using StreamingMarkupBuilder, a built-in Groovy data structure designed to build XML structures.
To construct the XML, you must declare the container namespace because you are building a container. The minimum information required to create a container is the container name and container type.
If you also want to add custom field values to the container you are creating, you must declare the userdefined namespace.
NOTE: As of Clarity LIMS v5, the term user-defined field (UDF) has been replaced with custom field in the user interface. However, the API resource is still called udf.
The POST command posts the XML constructed by StreamingMarkupBuilder to the containers resource of the API. The POST command also adds a link from the containers URI (the list of containers) to the new container.
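The following sketch shows both the construction and the POST. The container type URI and the Freezer Location field are hypothetical, and the GLSRestApiUtils signatures are assumed from the other examples.

```groovy
import groovy.xml.StreamingMarkupBuilder

def containerXML = new StreamingMarkupBuilder().bind {
    mkp.declareNamespace(con: 'http://genologics.com/ri/container')
    mkp.declareNamespace(udf: 'http://genologics.com/ri/userdefined')
    'con:container' {
        // Minimum required information: name and container type
        name('Example Container')
        type(uri: "${hostname}/api/v2/containertypes/1", name: '96 well plate')
        // Optional: a custom field value (requires the udf namespace above)
        'udf:field'(name: 'Freezer Location', type: 'String', 'Shelf 2')
    }
}

def containerNode = GLSRestApiUtils.xmlStringToNode(containerXML.toString())
def returnNode = GLSRestApiUtils.httpPOST(containerNode, "${hostname}/api/v2/containers", username, password)
```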
The XML for the new container is as follows.
The XML for the list of containers, with the newly added container shown at the end of the list, is as follows.
For Clarity LIMS v5 and above, the Operations Interface Java client has been deprecated, and there is no equivalent Containers view screen in which to view empty containers added via the API. However, if you intend to add samples to Clarity LIMS through the API, this example is still relevant, as you must first add containers in which to place those samples.
PostContainer.groovy:
As processing occurs in the lab, associated processes and steps are run in Clarity LIMS. Often, key data must be recorded for the derived samples (referred to as analytes in the API) generated by these steps.
The following example explains how to change the value of an analyte UDF/global custom field.
If you would like to update a batch of output derived samples (analytes), you can increase the script execution speed by using batch operations. For more information, see Working with Batch Resources.
In Clarity LIMS v5 or later, the key data fields are configured as global custom fields on derived samples. If you are using Clarity LIMS v5 or later, make sure you have the following items:
A defined global custom field named Library Size on the Derived Sample object.
A configured Library Prep step to apply Library Size to generated derived samples.
A Library Prep process that has been run and has generated derived samples.
As of Clarity LIMS v5, the term user-defined field (UDF) has been replaced with custom field in the user interface. However, the API resource is still called UDF.
There are two types of custom fields:
Master step fields—Configured on master steps. Master step fields only apply to the following:
The master step on which the fields are configured.
The steps derived from those master steps.
Global fields—Configured on entities (eg, submitted sample, derived sample, measurement, etc.). Global fields apply to the entire Clarity LIMS system.
In Clarity LIMS v5 and later, the Record Details screen displays the information about the derived samples generated by a step. You can view the global fields associated with the derived samples in the Sample Table.
The following screenshot shows the Library Size values for the derived samples.
Derived sample information is stored in the API in the analyte resource. Step information is stored in the process resource. Each global field value is stored as a udf element.
An analyte resource contains specific derived sample details that are recorded in lab steps. Those details are typically stored in global custom fields (configured in Clarity LIMS on the Derived Sample object) and then associated with the step.
When you update the information for a derived sample by updating the analyte API resource, only the global fields that are associated with the step can be updated.
To update the derived samples generated by a step, you must first request the process resource through a GET method.
The following GET method provides the full XML structure for the step:
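For example (a sketch, assuming the processURI, username, and password variables used throughout these examples):

```groovy
// GET the process resource and store the returned XML node
def process = GLSRestApiUtils.httpGET(processURI, username, password)
```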
The process variable now holds the complete XML structure returned from the GET request.
The XML returned from a GET on the process resource contains the URIs of the process output artifacts (the derived samples generated by the step). You can use these URIs to query for each individual artifact resource.
The process resource contains many input-output-map elements, where each element represents one input-output pairing. The following XML snippet shows the process:
Because processes with multiple inputs and outputs tend to be large, many of the input-output-map nodes have been omitted from this example.
After you have retrieved each individual artifact resource, you can use this information to update the UDFs/custom fields for each output analyte after you request its resource.
Request the analyte output resource and update the UDF/custom field as follows.
If the output-type is analyte, then run through each input-output-map and request the output artifact resource.
Use a GET to return the XML for each artifact and store it in a variable.
When you have the analytes stored, change the analyte UDF/custom field in two steps:
The UDF/custom field change in the XML.
The http PUT call to update the artifact resource.
The UDF/custom field change can be achieved with the Library Size UDF/custom field XML element defined in the following code. In this example, the Library Size value is updated to 25.
The PUT method updates the artifact resource at the specified URI using the complete XML representation, including the UDF/custom field. The setUdfValue method of the util library is used to perform this in a safe manner.
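The following sketch combines both steps for a list of analyte nodes. The setUdfValue and httpPUT signatures are assumed from the cookbook's util library; treat the attached scripts as the working versions.

```groovy
analytes.each { analyte ->
    // Update (or create) the Library Size UDF/custom field in the XML
    analyte = GLSRestApiUtils.setUdfValue(analyte, 'Library Size', '25')

    // PUT the complete XML representation back to the stateless artifact URI
    def artifactURI = analyte.@uri.toString().tokenize('?')[0]
    GLSRestApiUtils.httpPUT(analyte, artifactURI, username, password)
}
```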
The output-type attribute is the user-defined name for each of the output types generated by a process/step. This is not equivalent to the type element of an artifact whose value is one of several hard-coded artifact types.
If you must filter inputs or outputs from the input-output-map based on the artifact type, you must GET each artifact in question to discover its type.
It is important that you remove the state from each of the analyteURIs before you GET them, to make sure that you are working with the most recent state.
Otherwise, when you PUT the analyteURI back with your UDF/custom field changes, you can inadvertently revert information, such as QC, volume, and concentration, to previous values.
The results can be reviewed in a web browser through the following URI:
In Clarity LIMS v5 or later, in the Record Details screen, the Sample table now shows the updated Library Size.
UpdateProcessUDFInfo.groovy:
UpdateUDFAnalyteOutput.groovy:
In high throughput labs, samples are worked on in batches and some work is executed by a robot. Sometimes, a set of plates must be rearrayed to one larger plate before the robot can begin the lab step.
This example accomplishes the rearray using two scripts. One script is configured on a derived sample automation, and the second script is included in a command line configured on a step automation.
Before you follow the example, make sure that you have the following items:
A project containing samples assigned to a workflow in Clarity LIMS.
The workflow name.
The samples are assigned to the same workflow stage.
This example demonstrates the following scripts:
AssignToRearrayWf.groovy—Executed as a derived sample automation, this script assigns selected samples to the rearray step.
AssignToLastRemoved.groovy—Executed after the rearray step, this script assigns the samples to the stage to which they were originally assigned. The script is included in a command line configured on a step automation.
In Clarity LIMS, under Configuration, select the Automation tab.
Select the Derived Sample Automation tab.
Select New Automation and create an automation that prompts the user for the workflow stage name to be used.
In the example, note the following:
The {groovy_bin_location} and {script_location} parameters must be customized to reflect the locations on your computer.
The -w option allows for user input to be passed to the script as a command-line variable.
The AssignToRearrayWf script receives a list of artifact (sample) LIMS IDs on the command line. To begin, the script builds a list of artifact nodes.
The following code example builds a list of artifact URIs using the artifact LIMS ID list and the getArtifactNodes function. The resulting artifact URI list can then be used for a batchGET call to return the artifact nodes.
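A minimal sketch of getArtifactNodes, assuming the batchGET helper referenced above accepts a list of URIs:

```groovy
def getArtifactNodes(artifactLimsIds, hostname, username, password) {
    // Build the artifact URI for each LIMS ID ...
    def artifactURIs = artifactLimsIds.collect { limsId ->
        "${hostname}/api/v2/artifacts/${limsId}"
    }
    // ... then issue one batch call instead of one GET per artifact
    GLSRestApiUtils.batchGET(artifactURIs, username, password)
}

def artifactNodes = getArtifactNodes(artifactLimsIds, hostname, username, password)
```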
In this example, you can assume that the workflow name is known by the user and is passed to the script by user input when the automation is initiated.
The workflow can then be queried for using the passed workflow name. The workflow name is first encoded, and from this, you can retrieve the workflow URI.
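A sketch of this query follows; the workflows list endpoint and its name filter are assumptions based on the v2 API conventions used elsewhere in these examples.

```groovy
// URL-encode the user-supplied workflow name, then query the workflows resource
def encodedName = URLEncoder.encode(workflowName, 'UTF-8')
def workflowsNode = GLSRestApiUtils.httpGET(
    "${hostname}/api/v2/configuration/workflows?name=${encodedName}", username, password)

// The first matching element carries the workflow URI
def workflowURI = workflowsNode.workflow[0].@uri
```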
For the samples to be placed in the same container, they must all belong to the same workflow and be currently queued to the same stage in that workflow.
Using the workflow name passed in by the user, do the following:
Search the workflow stage list of the first artifact and store the URI of the most recent stage that is part of the workflow, if it is queued. Otherwise, the script exits with an error message.
After storing the workflow stage URI of the first artifact, use the checkMatch function to check the remaining artifacts in the list and verify that they are all currently queued to the same stage.
If all artifacts are queued for the stage, they are removed from the queue of the stage referenced by lastWfStageURI.
In this example, all the artifacts are unassigned from the previous workflow stage returned and assigned to the rearray stage using the queuePlacementStep function. The previous methods have verified that the artifacts in the list can be rearrayed together.
The returned XML node is then posted using httpPOST.
In Clarity LIMS, under Configuration, select the Lab Work tab.
Create a master step of Standard step type.
From Configuration, select the Automation tab.
Select the Step Automation tab.
Create an automation for the AssignToLastRemoved.groovy script.
The {groovy_bin_location} and {script_location} parameters must be customized to reflect the locations on your computer.
Enable the automation on the master step you created in step 2.
Configure a new protocol and step as follows.
On the Lab Work tab, create a non-QC protocol.
In the Protocols list, select the new protocol and then add a new step to it. Base the new step on the master step you created in step 2.
On the Step Settings form, in the Automation section, you see the step automation you configured. Configure the automation triggers as follows.
Trigger Location—Step
Trigger Style—Automatic upon exit
On the Placement milestone, add 96 well plate and 384 well plate as the permitted destination container types for the step.
Remove the default Tube container type.
Save the step.
Configure a new workflow as follows:
On the Lab Work tab, create a workflow.
Add the protocol you created to the workflow.
The first step of the AssignToLastRemoved script is the same as for the AssignToRearrayWf script: return the artifact node list.
However, in this script, you are not directly given the artifact LIMS IDs. Instead, because you receive the step URI from the process parameter command line, you can collect the artifact URIs from the inputs of the step details input-output map using the getArtifactNodes function.
An example step details URI might be {hostname}/api/v2/steps/{stepLIMSID}/details.
Each artifact in the list was removed from this stage before going through the rearray step.
With this in mind, and because the Clarity LIMS API stores artifact history by time (including stage history), the stage to which you now want to assign the samples is the second-to-last stage in the workflow-stage list.
The following method finds the stage from which the artifacts were removed using the getLastRemoved function:
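A minimal sketch of getLastRemoved, relying on the time-ordered workflow-stage list described above:

```groovy
// The last entry is the rearray stage; the second-to-last is the stage
// the artifact was removed from before the rearray step.
def getLastRemoved(artifactNode) {
    def stages = artifactNode.'workflow-stages'.'workflow-stage'
    stages[stages.size() - 2].@uri
}
```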
You can then check to make sure all artifacts originated in this stage. This check avoids the scenario where the AssignToRearrayWf.groovy script was run on two groups of artifacts that were queued in different workflow stages.
Function: assignStage
This returned stage URI is then used to build the assignment XML to assign all the samples back to this stage with the assignStage function.
After posting this XML node, the samples are assigned back to the stage in which they began.
In the Projects Dashboard, select the samples to be rearrayed and run the 'Assign to Rearray' automation.
On automation trigger, the {userinput} phrase will invoke a dialog that prompts for the full name of the workflow.
The samples assigned by the Assign to Rearray automation are available to be assigned to a new container.
Add the samples to the Ice Bucket and begin work.
The placement screen opens, allowing you to place the samples into the new container, in your desired placement pattern.
Proceed to the Record Details screen, then on to Next Steps. Do not perform any actions on these screens.
In the next step drop-down list, select Mark Protocol as Complete and select Apply.
Select Next. This initiates the 'Assign to last removed' trigger, which assigns the samples back to the step from which they were removed.
AssignToRearrayWf.groovy:
AssignToLastRemoved.groovy:
Lab scientists must understand the priority of the samples they are working with. To help them prioritize their work, you can rename the derived samples generated by a step so that they include the priority assigned to the original submitted sample.
If you would like to rename a batch of derived samples, you can increase the script execution speed by using batch operations. You can also use a script to rename a derived sample after a step completes.
If you are using Clarity LIMS v5 and later, make sure that you have done the following actions:
Added samples to the system.
Defined a global custom field named Priority on the Submitted Sample object. The field should have default values sp1, sp2, and sp3, and it should be enabled on a step.
Run samples through the step with the Priority of each sample set to sp1, sp2, or sp3.
In this example, six samples have been added to a project in Clarity LIMS. The submitted sample names are Heart-1 through Heart-6. The samples are run through a step that generates derived samples, and the priority of each sample is set.
By default, the name of the derived samples generated by the step would follow the name of the original submitted samples as shown in the Assign Next Steps screen of the step.
This example appends the priority of the submitted sample to the name of the derived sample output. The priority is defined by the Priority sample UDF (in Clarity LIMS v4.2 or earlier) or the Priority submitted sample custom field (in Clarity LIMS v5 or later).
Renaming the derived sample consists of the following steps:
Request the step information (process resource) for the step that generated the derived sample (analyte resource).
Request the individual analyte resource for the derived sample to be renamed.
Request the sample resource linked from the analyte resource to get the submitted sample UDF/custom field value to use for the update.
Update the individual analyte output resource with the new name.
When using the REST API, you will often start with the LIMS ID for the step that generated a derived sample. The key API concepts are as follows.
Information about a step is stored in the process resource.
In general, automation scripts access information about a step using the processURI, which links to the individual process resource. The input-output-map in the XML returned by the individual process resource gives the script access to the artifacts that were inputs and outputs to the process.
Information about a derived sample is stored in the analyte resource. This is used as the input and output of a step.
Analytes are also used to record specific details from lab processing.
The XML representation for an individual analyte contains a link to the URI of its submitted sample, and to the URI of the process that generated it (parent process).
The following GET method returns the full XML structure for the step.
Each output node has an output-type attribute that is the user-defined type name of the output. You can iterate through each input-output-map and request the output artifact resource for each output of a particular output-type.
In the code example shown below, we filter on output-type = Analyte
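A sketch of the filter (the state-stripping step is explained in the note below):

```groovy
// Collect the stateless URI of each per-input output of type Analyte
def analyteURIs = []
process.'input-output-map'.each { iom ->
    def output = iom.output[0]
    if (output.@'output-type' == 'Analyte') {
        analyteURIs << output.@uri.toString().tokenize('?')[0]
    }
}
```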
The output-type attribute is the user-defined name for each of the output types generated by a process. This is not equivalent to the type element of an artifact whose value is one of several hard-coded artifact types.
If you must filter inputs or outputs from the input-output-map based on the artifact type, you need to GET each artifact in question to discover its type.
It is important that you remove the state from each of the analyteURIs before you GET them to make sure that you are working with the most recent state. Otherwise, when you PUT the analyteURI back with your UDF changes, you can inadvertently revert information (eg, QC, volume, and concentration) to their previous values.
From the analyte XML, you can use the submitted sample URI to return the sample that maps to that analyte.
The value of the UDF is stored in the variable samplePriority so that it is then available for the renaming step described below.
The variable analyte holds the complete XML structure returned from a GET on the URI in the output node. The variable nameNode references the XML element in that structure that contains the artifact's name. The following shows the XML for the analyte named Heart-1.
Renaming the derived sample consists of two steps:
The name change in the XML.
The PUT call to update the analyte resource.
The name change is performed by assigning a new value to the nameNode XML element, as shown in the following example.
The http PUT command updates the artifact resource using the complete XML representation, including the new name.
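A minimal sketch of both steps, assuming the analyte node and the samplePriority variable from the previous steps, plus the helper signatures used elsewhere:

```groovy
// Append the submitted sample's priority to the derived sample name ...
def nameNode = analyte.name[0]
nameNode.setValue(nameNode.text() + '-' + samplePriority)

// ... then PUT the complete XML representation back to the artifact resource
def returnNode = GLSRestApiUtils.httpPUT(analyte, analyte.@uri, username, password)
```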
After a successful PUT, the results can be reviewed in a web browser at http://yourIPaddress/api/v2/artifacts/TST110A291AP45.
The following XML resource is returned from the PUT command and is stored in returnNode.
In Clarity LIMS, the Assign Next Steps screen shows the new names for the generated derived samples.
This example shows simple renaming of derived samples based on a submitted sample UDF/global field. However, you can use step names, step UDFs (known as master step fields in Clarity LIMS v5 or later), project information, and so on, to rename derived samples and provide critical information to scientists working in the lab.
UpdateAnalyteName.groovy:
In Clarity LIMS, under Lab View, select the protocol you created earlier.
The process variable now holds the complete XML structure returned from the process GET request, as shown in the following example. The URI for each analyte generated is given in the output node in each input-output-map element. For more information, see the example on viewing the inputs and outputs of a process.
An earlier example shows how to set a sample UDF/global field. To get the value of a sample UDF/global field, use the same method to find the field, and then use the .text() method to get the field value.
Derived sample automations are automations that users can run on derived samples directly from the Projects Dashboard in Clarity LIMS.
The following example uses an automation to initiate a script that requeues samples to an earlier step in the workflow. The example also describes the main functions included in the script and demonstrates the configuration options that prompt the user for input. These options allow for greater flexibility during script runs. Before you follow the example, make sure that you have the following items:
A project containing samples assigned to a multi-stage workflow.
Samples that must be requeued. These samples must have completed at least one step in the workflow and must be available for requeue.
The purpose of the attached RequeueSamples.groovy script is to requeue selected derived samples to a previous step in the workflow using the derived sample automations feature.
The getSampleNodes function is passed a list of derived sample LIMS IDs (as a command-line argument) to build a list containing the XML representations of the samples. The resulting sample URI list can then be used with a batchGET to return the sample nodes:
To retrieve the workflow name, you can URL encode the workflow name and use the result to query and retrieve the workflow URI:
Stage names are guaranteed to be unique within each workflow. However, they may not be unique across the Clarity LIMS system. As a result, the stage URI cannot be queried for in the same way as the workflow URI.
Instead, you can navigate through the workflow node to find the stage that matches the stage name specified using the getStageURI function. If a match is found, return the stage URI.
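A minimal sketch of getStageURI, assuming the stages/stage element layout of the workflow XML:

```groovy
// Walk the workflow node's stage list and return the URI of the named stage
def getStageURI(workflowNode, stageName) {
    workflowNode.stages.stage.find { it.@name == stageName }?.@uri
}
```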
Next, you must make sure that each sample meets the criteria to be requeued using the canRequeue function. The following method checks all workflow stages for the samples:
If a match is found between a workflow stage URI and the stage URI specified, the sample node is added to a list of samples that can be requeued using the requeueList function.
If all the samples have this match and a status that allows for requeue, the list is returned. Otherwise, the script exits with an error message that states the first sample to cause failure.
In this example, both unassignment from and assignment to a workflow stage must occur to complete the requeue. Because the samples are being requeued to a previous stage in the workflow and may currently be queued for another stage, you must remove them from these queues.
The getCurrentStageURI and lastStageRun functions check the sample node for its most recent workflow stage. If that stage has a queued status, its stage URI is returned to be unassigned.
Using the previous methods and their results, the following code uses StreamingMarkupBuilder and the assignmentXML function to build the XML to be posted:
The returned XML node is then posted using httpPOST.
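The following sketch covers both the build and the POST. It relies on the script's requeueList, getCurrentStageURI, and stageURI values from the previous steps; the routing namespace URI and helper signatures are assumptions.

```groovy
import groovy.xml.StreamingMarkupBuilder

def assignmentXML = new StreamingMarkupBuilder().bind {
    mkp.declareNamespace(rt: 'http://genologics.com/ri/routing')
    'rt:routing' {
        requeueList.each { sampleNode ->
            // Unassign the sample from the stage it is currently queued in, if any
            def currentStageURI = getCurrentStageURI(sampleNode)
            if (currentStageURI) {
                unassign('stage-uri': currentStageURI) {
                    artifact(uri: sampleNode.@uri)
                }
            }
            // Assign the sample to the requested (earlier) stage
            assign('stage-uri': stageURI) {
                artifact(uri: sampleNode.@uri)
            }
        }
    }
}

def routeNode = GLSRestApiUtils.xmlStringToNode(assignmentXML.toString())
def returnNode = GLSRestApiUtils.httpPOST(routeNode, "${hostname}/api/v2/route/artifacts", username, password)
```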
Add and configure the automation
In Clarity LIMS, under Configuration, select the Automation tab.
Select the Derived Sample Automation tab.
Select New Automation and enter the following information:
Automation Name—This is the name that displays to the user running the automation from the Projects Dashboard. Choose a descriptive name that reflects the functionality/purpose (eg, Requeue Samples).
Channel Name—Enter the channel name.
Command Line—Enter the command line required to invoke the script.
Select Save.
Run the automation as follows.
Open the Projects Dashboard.
Select a project containing in-progress samples. Select In-progress samples.
In the sample list, you will see all of the submitted and derived samples that are currently in progress for this project.
Select one or more derived samples. Selecting samples activates the Action button and drop-down list.
In the Action drop-down list, select the Requeue Samples automation.
In this example, the -w and -t {userinput} options invoke a dialog box on automation trigger. The user is required to enter two parameters: the full name of the stage and the workflow for which selected samples are to be requeued. The names must be enclosed in quotation marks.
If the requeue is successful, each requeued sample is marked with a complete tag. Hovering over a sample shows a more detailed message.
RequeueSamples.groovy:
When working with projects and accounts, you can do the following:
As samples are processed in the lab, substances are moved from one container to another. Because container locations are sometimes used to reference the sample in data files, tracking the location of these substances within containers is one of the key values that Clarity LIMS provides to the lab.
Within the REST API (v2 r21 or later), analytes represent the substances on which processes/steps are run. These analytes are the substances that are chemically altered and transferred between containers as samples are processed in the lab.
Each individual sample resource has an analyte artifact that describes its container location and is used to run processes.
In Clarity LIMS, steps are not run on the original submitted samples, but are instead run on (and can also generate) derived samples. In the API, derived samples are known as analytes. Each sample resource, which is the original submitted sample in Clarity LIMS, has a corresponding analyte that is used for running processes/steps and describing placement in a container.
For more information on analyte artifacts and other REST resources, see Structure of REST Resources.
For all Clarity LIMS users, make sure you have done the following actions:
Added a sample to Clarity LIMS.
Run a process/step on the sample, with the same process/step generating a derived sample output.
Added the generated derived sample to a multi-well container (eg, a 96-well plate).
The container location information for an individual derived sample/analyte is located within the XML for the individual artifact resource. Because artifacts are generated by running steps in the LIMS, this is a logical place to keep track of the location.
Within a script, you can use a GET method to request the artifact. The resulting XML structure contains all the information related to the artifact, including its container and well location.
In this example, a derived sample named Brain-600 is placed in well A:1 of a container with LIMS ID 27-1259. This information is found in the location element.
The location element has two child elements:
One linking to the container URI, which specifies which container the analyte is in.
One for the well location, which has the name 'value' in the XML structure.
Valid values for a well location can be either numeric or alphabetic, and are determined by the configuration of the container in Clarity LIMS.
Well locations are always represented in the row:column format. For example, a 96-well plate can have locations A:1 and C:12, and a tube can have a single well called 1:1.
The following example shows the XML returned when you retrieve the artifact:
Because the container position is structured in the row:column format, you can store the row and column in separate variables by splitting the container position on the colon character. You can access the string value of the location value node using the text() method, as shown in the following code:
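For example (a sketch, assuming the artifact node retrieved above):

```groovy
// Read the well position (eg, "A:1") from the location's value node
def position = artifact.location.value[0].text()

// Split the row:column string on the colon character
def (row, column) = position.tokenize(':')
println "Row ${row}, column ${column}"
```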
Running the script in a console produces the following output:
GetContainerAnalyteLocation.groovy:
The large capacity of current Next Generation Sequencing (NGS) instruments means that labs are able to perform multiplexed experiments with multiple samples pooled into a single lane or region of the container. Before being pooled, samples are assigned a unique tag or index. After sequencing and initial analysis are complete, the sequencing results must be demultiplexed to separate data and relate the results back to each individual sample.
Clarity LIMS allows you to track a multiplexing workflow by adding reagents and reagent labels to artifacts, and then using the reagent labels to demultiplex the resulting files.
There are several ways to apply reagent labels. However, all methods involve creating placeholders that link the final sequences back to the original submitted samples. Either the lab scientist or an automated process must determine which file actually belongs with which placeholder. For more information on applying reagent labels, refer to Work with Multiplexing.
This example walks through assigning user-defined field (UDF)/custom field values to the demultiplexed output files based on upstream derived sample (analyte) UDF/custom field values. This includes upwards traversal of a sample history / genealogy, based on assigned reagent labels. This differs from upstream traversal based strictly upon process input-output mappings.
As of Clarity LIMS v5, the term user-defined field (UDF) has been replaced with custom field in the user interface. However, the API resource is still called UDF.
There are two types of custom fields:
Master step fields—Configured on master steps. Master step fields only apply to the following:
The master step on which the fields are configured.
The steps derived from those master steps.
Global fields—Configured on entities (eg, submitted sample, derived sample, measurement, etc.). Global fields apply to the entire Clarity LIMS system.
If you are using Clarity LIMS v5 or later, make sure you have completed the following actions:
Created a project and have added multiple samples to it.
Run the samples through a sequence of steps that perform the following:
Reagent addition / reagent label assignment
Pooling
Demultiplexing (to produce a set of per-reagent-label result file outputs).
Set a Numeric custom field value on each derived sample input to the reagent addition process.
A Numeric custom field with no assigned value exists on each of the per-reagent-label result file outputs. The value of this field will be computed from the set of upstream derived sample custom field values corresponding to the reagent label of the result file.
You must also make sure that you are using API v2 r21 or later.
Due to the complexity of NGS workflows, beginning at the top level submitted sample resource and working down to the result file is not the most efficient way to traverse the sample history/genealogy. It is easier to start with the result file artifact, and then trace upward to find the process with the UDFs/custom fields that you are looking for.
Starting from the per-reagent-label result file, you can traverse upward in the sample history using the parent process URI in the XML returned for each artifact. At each level of the sample history, the number of artifacts returned may increase due to processes that pooled individual artifacts.
In this example:
The upstreamArtifactLUIDs list represents the current set of relevant artifacts.
The foundUpstreamArtifactNodes list stores the target upstream artifact nodes found.
The sample history traversal stops at the inputs to the process that performed the reagent addition/reagent label assignment.
The traversal is executed using a while loop over the contents of the upstreamArtifactLUIDs list.
The list serves as a stack of artifacts. With each iteration of the loop, an artifact is removed from the end of the list and the relevant input artifacts to its parent process are pushed back onto the end of the list.
After the loop has executed, the foundUpstreamArtifactNodes list will contain all of the artifacts that are assigned the reagent label of interest upon execution of the next process in the sample history.
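The following sketch shows the shape of that loop. The resultFileLUID and reagentLabelOfInterest variables are hypothetical placeholders, and the label-based filtering is a simplified assumption; see the attached TraversingPooledDemuxGenealogy.groovy for the working version.

```groovy
def upstreamArtifactLUIDs = [resultFileLUID]   // stack of artifacts still to visit
def foundUpstreamArtifactNodes = []            // target upstream artifact nodes

while (upstreamArtifactLUIDs) {
    // Remove the next artifact from the end of the list (the stack)
    def luid = upstreamArtifactLUIDs.remove(upstreamArtifactLUIDs.size() - 1)
    def artifact = GLSRestApiUtils.httpGET("${hostname}/api/v2/artifacts/${luid}", username, password)
    def parentProcess = GLSRestApiUtils.httpGET(artifact.'parent-process'[0].@uri, username, password)

    // Visit each distinct input of the parent process
    def inputLUIDs = parentProcess.'input-output-map'.collect { it.input[0].@limsid }.unique()
    inputLUIDs.each { inputLUID ->
        def inputNode = GLSRestApiUtils.httpGET("${hostname}/api/v2/artifacts/${inputLUID}", username, password)
        def labels = inputNode.'reagent-label'.collect { it.@name }
        if (!labels) {
            // Unlabeled: an input to the reagent-addition step, so the traversal stops here
            foundUpstreamArtifactNodes << inputNode
        } else if (labels.contains(reagentLabelOfInterest)) {
            // Still labeled with our label: keep climbing from this artifact
            upstreamArtifactLUIDs << inputLUID
        }
    }
}
```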
The final step in the script assigns a value to a Numeric UDF / custom field on the per-reagent-label output result file, Mean DNA Prep 260:280 Ratio, by computing the mean value of a Numeric UDF / custom field on each of the foundUpstreamArtifactNodes, DNA prep 260:280 ratio.
First, compute the mean using the following example:
Then, set the UDF/custom field on the per-reagent-label output result file using the following example:
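A sketch combining both steps. The field names follow the description above; setUdfValue and httpPUT signatures are assumed from the cookbook's util library, and resultFile is the node for the per-reagent-label output.

```groovy
// Compute the mean of the upstream DNA prep 260:280 ratio values
def ratios = foundUpstreamArtifactNodes.collect { node ->
    node.'udf:field'.find { it.@name == 'DNA prep 260:280 ratio' }.text().toDouble()
}
def mean = ratios.sum() / ratios.size()

// Set the computed mean on the per-reagent-label output result file and PUT it back
resultFile = GLSRestApiUtils.setUdfValue(resultFile, 'Mean DNA Prep 260:280 Ratio', mean.toString())
GLSRestApiUtils.httpPUT(resultFile, resultFile.@uri, username, password)
```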
TraversingPooledDemuxGenealogy.groovy:
When working with process and step outputs, you can do the following:
Projects contain a collection of samples submitted to the lab for a specific goal or purpose. Often, a script needs information recorded at the project level to do its task. This simple example shows an HTTP GET against a project, which returns information on the project as XML.
Before you follow the example, make sure you have the following items:
A project exists with name "HTTP Get Project Name with GLS Utils".
The LIMS ID of the project above, referred to as <project limsid>.
A compatible version of API (v2 r21 or later).
The easiest way to find a project in the system is with its LIMS ID.
If the project was created in the script (with an HTTP POST) then the LIMS ID is returned as part of the 201 response in the XML.
If the LIMS ID is not available, but other information uniquely identifies it, you can use the project (list) resource to GET the projects and select the right LIMS ID from the collection.
Working with list resources generally requires the same script logic, so if you need the list of projects to find a specific project, review the related example that demonstrates listing and finding resources for labs; the same logic applies.
The first step is to determine the URI of the project:
Next, use the project LIMS ID to perform an HTTP GET on the resource, and store the response XML in the variable named projectNode:
The projectNode variable can now be used to access XML elements and/or attributes.
To obtain the project's name, ask the projectNode for the text representation of the name element:
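Putting the three steps together (a sketch, assuming the GLSRestApiUtils helper and a projectLIMSID variable holding the <project limsid> noted above):

```groovy
// Determine the project URI from the LIMS ID, GET the resource, and read the name
def projectURI = "${hostname}/api/v2/projects/${projectLIMSID}"
def projectNode = GLSRestApiUtils.httpGET(projectURI, username, password)
println projectNode.name[0].text()
```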
GetProjectName.groovy:
Imagine that you use projects in Clarity LIMS to track a collection of sample work that represents a subset of work from a larger translational research study. The translational research study consists of several projects within the LIMS and the information about each of the projects that make up the research study is predefined in another system.
Before the work starts in the lab, you can use the information in the other system to automatically create projects. This reduces errors and means that lab scientists do not have to spend time manually entering data a second time.
This example shows how to automate the creation of a project using a script and the projects resource POST method.
As of Clarity LIMS v5, the term user-defined field (UDF) has been replaced with custom field in the user interface. However, the API resource is still called UDF.
There are two types of custom fields:
Master step fields—Configured on master steps. Master step fields only apply to the following:
The master step on which the fields are configured.
The steps derived from those master steps.
Global fields—Configured on entities (eg, submitted sample, derived sample, measurement, etc.). Global fields apply to the entire Clarity LIMS system.
Before you follow the example, make sure you have the following items:
A user-defined field (UDF) / custom field named Objective is defined for projects.
A project name that is unique and does not exist in the system.
A compatible version of API (v2 r21 or later).
Before you can add a project to the system via the API, you must construct the XML representation for the project you want to create. You can then POST the new project resource.
You can define the project XML using StreamingMarkupBuilder, a built-in Groovy data structure designed to build XML structures.
Declare the project namespace because you are building a project.
If you wish to include values for project UDFs as part of the project XML you are constructing, then you must also declare the userdefined namespace.
In the following example, the project name, open date, researcher, and a UDF / custom field named Objective are included in the XML constructed for the project.
UDFs / custom fields must be configured in the Clarity LIMS before they can be set or updated using the API. You can find a list of the fields defined for a project in your system by using the resource: http://youripaddress/api/v2/configuration/udfs and looking for those with an attach-to-name of 'project'.
For Clarity LIMS v5 or later, UDTs are only supported in the API.
For the POST to the projects resource to be successful, only project name and researcher URI are required. Adding more details is a good practice for keeping your system organized and understanding what must be accomplished for each project.
The following POST command adds a new project resource using the XML constructed by StreamingMarkupBuilder:
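A sketch of the construction and POST follows. The project name, Objective value, and researcher URI are hypothetical; element order and helper signatures are assumptions, so treat the attached PostProject.groovy as the working version.

```groovy
import groovy.xml.StreamingMarkupBuilder

def projectXML = new StreamingMarkupBuilder().bind {
    mkp.declareNamespace(prj: 'http://genologics.com/ri/project')
    mkp.declareNamespace(udf: 'http://genologics.com/ri/userdefined')
    'prj:project' {
        name('Translational Study Project 1')           // must be unique
        'open-date'(new Date().format('yyyy-MM-dd'))
        researcher(uri: "${hostname}/api/v2/researchers/1")
        'udf:field'(name: 'Objective', type: 'String', 'Confirm variant calls')
    }
}

def projectNode = GLSRestApiUtils.xmlStringToNode(projectXML.toString())
def returnNode = GLSRestApiUtils.httpPOST(projectNode, "${hostname}/api/v2/projects", username, password)
```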
The XML returned after a successful POST of the XML built by StreamingMarkupBuilder is the same as the XML representation of the project:
PostProject.groovy:
The following example shows you how to remove information from a project using Clarity LIMS and API (compatible with v2 r21 and later).
As of Clarity LIMS v5, the term user-defined field (UDF) has been replaced with custom field in the user interface. However, the API resource is still called UDF.
There are two types of custom fields:
Master step fields—Configured on master steps. Master step fields only apply to the following:
The master step on which the fields are configured.
The steps derived from those master steps.
Global fields—Configured on entities (eg, submitted sample, derived sample, measurement, etc.). Global fields apply to the entire Clarity LIMS system.
Before you follow the example, make sure that you have the following items:
A user-defined field (UDF) / custom field named Objective is defined for projects.
A project name that is unique and does not exist in the system.
This example does the following actions:
POST a new project to the LIMS, with a UDF / custom field value for Objective.
Remove a child XML node from the parent XML representing the project resource.
Update the project resource.
First, set up the information required to perform a successful project POST. The project name must be unique.
The projectNode should contain the response XML from the POST and resemble the following output:
The following code removes the child XML node <udf:field> from the parent XML node <prj:project>:
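A sketch, assuming projectNode holds the XML from the POST above and that the modified project is then PUT back:

```groovy
// Remove the first udf:field child from the project node ...
projectNode?.children()?.remove(projectNode.'udf:field'[0])

// ... then PUT the modified XML back to update the project resource
def returnNode = GLSRestApiUtils.httpPUT(projectNode, projectNode.@uri, username, password)
```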
If multiple nodes of the same type exist, [0] is the first item in this list of same-typed nodes (index 0 contains the 1st item, 1 contains the 2nd item, 2 contains the 3rd item, and so on).
To remove the 14th udf:field, you would use projectNode?.children()?.remove(projectNode.'udf:field'[13])
RemoveChildNode.groovy:
When samples are processed in the lab, they generally produce child samples that are altered in some way. Eventually, the samples are analyzed on an instrument, with the result being a data file. Often these data are analyzed further, which produces additional data files.
The sample processing that occurs in the lab is modeled as steps in the Clarity LIMS web interface. In the REST API (v2 r21 or later), this processing is modeled as processes, and the samples and files that are processed are represented as artifacts. Understanding the representation of inputs and outputs within the XML for an individual process is critical to being able to use the REST API effectively.
If you are using Clarity LIMS v5 or later, make sure that you have done the following actions:
Added samples to the LIMS.
Configured a step that generates derived samples in the Lab Work tab.
Configured a file placeholder for a sample measurement file to be generated and attached by an automation script at run time. This configuration is done in the Master Step Settings of the step on the Record Details milestone.
Configured an automation that generates the sample measurement file and have enabled it on the step. This configuration is done in the Automation tab.
Configured the automation triggers. This configuration is done in the Step Settings screen, under the Record Details milestone.
Run the step on some samples.
As of Clarity LIMS v5, the Operations Interface Java client has been deprecated. In LIMS v5 and later, there is no equivalent screen to the Input/Output Explorer where you can select step inputs/outputs and generated files and view their corresponding inputs/outputs and files.
However, the following API code example is still relevant and will produce the same results.
The first step in this example is to request the individual process resource through a GET method. The full XML representation returned includes the input-output-map.
To illustrate the relationships between the inputs and outputs, you can save them using a Groovy Map data structure. This maps the output LIMS IDs to a list of input LIMS IDs associated with each output, as shown in the following example:
The process variable now holds the complete XML structure returned from the processURI.
In the following example XML snippet, elements of the input-output-map are labeled with <input-output-map>:
All of the input and output URIs include a ?state=<number> query parameter. State allows Clarity LIMS to track historical values for QC, volume, and concentration, so you can compare the state of an analyte before and after a process was run. However, when you make changes to an artifact, you should always work with the most current state.
To make sure that you are getting the current state when you do a GET request, simply remove the state from the artifact URI.
You can examine each input-output-map to find details about the relationship represented between inputs and outputs. The following code puts the output and input LIMS IDs into an array named outputToInputMap.
As the output type is also important for further processing, outputToInputMap is formatted as follows:
If the output is shared for all inputs (eg, the sample measurement file with LIMS ID 92-13007), the inputs to the process are listed. If the output relates to an individual input, only the LIMS ID for that particular input will be listed.
Outputs are listed in multiple input-output-map elements when they are generated from multiple inputs. The first time any particular output LIMS ID is seen, the output type and input LIMS ID in the input-output-map are added to the list stored in outputToInputMap.
If the output LIMS ID already has a list in outputToInputMap, then the code adds input LIMS ID to the list.
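A sketch of that logic; the exact value format stored per output is an assumption (here, the output type followed by the input LIMS IDs):

```groovy
def outputToInputMap = [:]
process.'input-output-map'.each { iom ->
    def output = iom.output[0]
    def outputLimsId = output.@limsid
    if (!outputToInputMap.containsKey(outputLimsId)) {
        // First time this output is seen: record its type and first input
        outputToInputMap[outputLimsId] = [output.@'output-type', iom.input[0].@limsid]
    } else {
        // Shared output seen again: append this input's LIMS ID
        outputToInputMap[outputLimsId] << iom.input[0].@limsid
    }
}
```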
One way to access the information is to print it out. You can run through each key-value pair and print the information it contains, as shown in the following example:
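For example (a sketch matching the map format assumed above):

```groovy
outputToInputMap.each { outputLimsId, typeAndInputs ->
    def outputType = typeAndInputs[0]
    def inputLimsIds = typeAndInputs[1..-1]
    println "${outputType} output ${outputLimsId} was generated from inputs: ${inputLimsIds.join(', ')}"
}
```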
After running the script on the command line, an output similar to the following will be generated, whereby the inputs used to generate each output are listed.
GetProcessInputOutput.groovy:
When working with multiplexing, you can do the following:
A common requirement in applications involving indexed sequencing is to determine the sequence corresponding to a reagent label. This example shows how to configure index reagent types, which you can then use to find the sequence for a reagent label. Before you follow the example, make sure that you have a compatible version of API (v2 r14 to v2 r24).
Reagents and reagent labels are independent concepts in the API. However, the recommended practice is to name reagent labels after reagent types. This allows you to use the label name to look up the sequence information on the reagent type resource. This practice is consistent with the Operations Interface process wizards. When a reagent is applied to a sample in the user interface, a reagent label with the same name of the reagent type is added to the analyte resource.
The following actions are also recommended:
Configure an index reagent type with the correct sequence for each type of index or tag you plan to use.
Use the names of the index reagent types as reagent labels.
Following these practices allows you to find the sequence for a reagent label by looking up the sequence in the corresponding reagent type.
For each index or tag you plan to use in indexed sequencing, configure a corresponding index reagent type as follows.
As administrator, click Configuration > Consumables > Labels.
Add a new label group.
Then, to add labels to the group:
Download a template label list (Microsoft® Excel® file) from the Labels configuration screen.
Add reagent type details to the downloaded template.
Upload the completed label list.
After you have configured reagent types for each indexing sequence you intend to use, and have used those reagent type names as reagent label names, you can easily retrieve the corresponding sequence using the REST API.
The following code snippet shows how to retrieve the index sequences (when available):
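A minimal sketch follows. The reagenttypes query and the special-type/attribute layout of the reagent type XML are assumptions based on the description above; the attached RetrievingReagentLabelIndex.groovy is the working version.

```groovy
// Look up the reagent type named after the artifact's reagent label
def labelName = artifact.'reagent-label'[0].@name
def encoded = URLEncoder.encode(labelName, 'UTF-8')
def reagentTypes = GLSRestApiUtils.httpGET("${hostname}/api/v2/reagenttypes?name=${encoded}", username, password)
def reagentType = GLSRestApiUtils.httpGET(reagentTypes.'reagent-type'[0].@uri, username, password)

// Read the index sequence from the reagent type's special-type attributes
def specialType = reagentType.'special-type'[0]
if (specialType?.@name == 'Index') {
    def sequence = specialType.children().find { it.name() == 'attribute' && it.@name == 'Sequence' }?.@value
    println "Artifact labeled ${labelName} has index sequence ${sequence}"
}
```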
For an artifact labeled with Index 1, this would produce the following information:
RetrievingReagentLabelIndex.groovy:
Imagine that each month the new external accounts with which your facility works are contacted with a Welcome package. In this scenario, it would be helpful to obtain a list of accounts that have been modified in the past month.
NOTE: In Clarity LIMS v2.1 and later, the term Labs was replaced with Accounts. However, the API resource is still called labs.
Before you follow the example, make sure you have the following items:
Several accounts exist in the system.
At least one of the accounts was modified after a specific date.
A compatible version of API (v2 r21 or later).
In LIMS v6.2 and later, in the Configuration > User Management page, the Accounts view lists the account resources available.
To obtain a list of all accounts modified after a specific date, you can use a GET request on the accounts list resource and include the ?last-modified filter.
To specify the last month, a Calendar object is instantiated. This Calendar object is initially set to the date and time of the call, rolled back one month, and then passed as a query parameter to the GET call.
The first GET call returns a list of the first 500 labs that meet the date modified criterion specified. The script iterates through each lab element to look at individual lab details. For each lab, a second GET method populates a lab resource XML node with address information.
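A sketch of the query; the timestamp format accepted by the last-modified filter is an assumption, so adjust it to what your server accepts.

```groovy
// Roll a Calendar back one month and format it for the query parameter
def cal = Calendar.instance
cal.add(Calendar.MONTH, -1)
def lastModified = cal.time.format("yyyy-MM-dd'T'HH:mm:ssXXX")

// GET the labs list filtered by modification date
def labsList = GLSRestApiUtils.httpGET(
    "${hostname}/api/v2/labs?last-modified=${URLEncoder.encode(lastModified, 'UTF-8')}",
    username, password)

labsList.lab.each { lab ->
    // A second GET returns the full lab resource, including address information
    def labNode = GLSRestApiUtils.httpGET(lab.@uri, username, password)
    println labNode.name[0].text()
}
```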
The REST list resources are paged. Only the first 500 items are returned when you query for a list of items (eg, http://youripaddress/api/v2/artifacts).
If you cannot filter the list, it is likely that you must iterate through the pages of a list resource to find the items that you are looking for. The URI for the next page of resources is always the last element on the page of a list resource.
In the following example, the XML returned lists three out of the four labs, excluding one due to the date filter:
One of the labs has 'WA' recorded as the state, adding a second printed line to the output:
GetLab.groovy:
The researcher resource holds the personal details for users and clients in Clarity LIMS.
Suppose that you have a separate system that maintains the contact details for your customers and collaborators. You could use this system to synchronize the details for researchers with the details in Clarity LIMS. This example shows how to update the phone number of a researcher using a PUT to the individual researcher resource.
In the Clarity LIMS user interface, the term Labs has been replaced with Accounts. However, the API resource is still called labs and the Collaborations Interface still refers to Labs rather than Accounts. The term Contact has been replaced with Client. The API resource is still called contact.
The LabLink Collaborations Interface is not supported in Clarity LIMS v5 and later. However, because support for this interface is planned for a future release, the Collaborator user role has not been removed.
Before you follow the example, make sure you have the following items:
A defined client in Clarity LIMS.
A compatible version of API (v2 r21 or later).
For Clarity LIMS v5 and later, in the web interface, the User and Clients screen lists all users and clients in the system.
In the API, information for a particular researcher can be retrieved within a script using a GET call:
In this case, the URI represents the individual researcher resource for the researcher named Sue Erikson. The GET returns an XML representation of the researcher, which populates the groovy node researcher.
The XML representations of individual REST resources are self-contained entities. Always request the complete XML representation before editing any portion of the XML. If you do not use the complete XML when you update the resource, you may inadvertently change data.
The following example shows the XML returned for the Sue Erikson researcher:
Updating the telephone number requires the following steps:
Changing the telephone value in the XML.
Using a PUT call to update the researcher resource.
The new telephone number for Sue Erikson can be set with the phone value within the Groovy researcher node:
The PUT command updates the researcher resource at the specified URI using the complete XML representation, including the new phone number. A successful PUT returns the new XML in the returnNode. An unsuccessful PUT returns the HTTP response code and error message as XML in the returnNode.
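A sketch of the update; the phone number is hypothetical and the helper signatures are assumed from the other examples.

```groovy
// Set the phone element's text within the researcher node ...
def phoneNode = researcher.phone[0]
phoneNode.setValue('604-555-0199')   // hypothetical new number

// ... then PUT the complete researcher XML back to its URI
def returnNode = GLSRestApiUtils.httpPUT(researcher, researcher.@uri, username, password)
```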
For a successful update, the resulting XML can also be reviewed in a web browser via the URI:
http://yourIPaddress/api/v2/researchers/103
In the LIMS, the updated user list should show the new phone number.
UpdateContactInfo.groovy:
When importing sample data into Clarity LIMS using a spreadsheet, you can specify the reagent labels to be applied during the import process. To do this, you must include the reagent label names in the spreadsheet, in a column named Sample/Reagent Label.
Before you follow the example, make sure that you have the following items:
Reagent types that are configured in Clarity LIMS and are named index 1 through index 6.
Reagents of type index 1 through index 6 that have been added to Clarity LIMS.
A compatible version of API (v2 r14 to v2 r24).
The following example spreadsheet would import six samples into the system. These samples are Sample-1 through Sample-6 with reagent labels Index 1 through Index 6:
Although not mandatory, it is recommended that you name reagent labels after reagent types using the Index special type. This allows you to relate the reagent label back to its sequence.
If you examine the REST API representation of the samples imported, you are able to verify the following:
The sample representation shows no indication that reagent labels were applied.
The sample artifact (the analyte artifact linked from the sample representation) will indicate the label applied via the <reagent-label> element.
The following example shows how an imported sample artifact (Sample-1), with reagent label name applied (Index 1), appears when verified via the REST API:
Before pooling samples in a multiplexed workflow, apply reagent labels using one of the methods described earlier. After the analyte (derived sample) artifacts are labeled, they can be pooled together without loss of traceability.
Pooling samples is accomplished either by running a pooling step in the user interface, or by using the process resource in the REST API.
For an overview of how REST resources are structured, and to learn how the process resource is used to track workflow in Clarity LIMS, see the related documentation.
Before you follow the example, make sure that you have the following items:
Reagent types that are configured in Clarity LIMS and are named index 1 through index 6.
Reagents of type index 1 through index 6 that have been added to Clarity LIMS.
A compatible version of API (v2 r21 or later).
The following screenshot shows a pooling step run from Clarity LIMS.
In general, automation scripts access information about a step using the processURI, which links to the individual process resource. The input-output-map in the XML returned by the individual process resource gives the script access to the artifacts that were inputs and outputs to the process.
Information about a derived sample is stored in the analyte resource. This is used as the input and output of a step, and also used to record specific details from lab processing. The XML representation for an individual analyte contains a link to the URI of its submitted sample, and to the URI of the process that generated it (parent process).
The following example pools all samples found in a given container into a tube it creates.
NOTE: No special code is required to handle reagent labels. As processes execute, reagent labels automatically flow from inputs to outputs.
Irrespective of whether you use the user interface or the REST API to pool samples, the pooled sample is available via process GET requests.
The following example shows one pooled output (LIMS ID 2-424) created from three inputs - LIMS IDs RCY1A103PA1, RCY1A104PA1, and RCY1A105PA1:
Besides deriving from the ancestral sample artifacts, the resulting pooled sample artifact inherits the reagent labels from all inputs. The pooled output produced by the pooling step appears as follows. The pooled artifact shows multiple reagent labels, and multiple ancestor samples.
As processes are executed, reagent labels flow from inputs to outputs.
PoolingSamplesWithReagents.groovy:
Pooling samples in the API is accomplished with a process resource. Information about a step is also stored in the process resource. Such a process has many input samples that map to a shared output sample, such that the shared output is a pool of those inputs. This is achieved with a shared input-output-map, where a single input-output-map element in the XML defines the shared output and all its related inputs.
| Sample/Name | Container/Type | Container/Name | Sample/Well Location | Sample/Reagent Label |
| ----------- | -------------- | --------------- | -------------------- | -------------------- |
| Sample-1 | 96 well plate | labeled-samples | A:1 | Index 1 |
| Sample-2 | 96 well plate | labeled-samples | A:2 | Index 2 |
| Sample-3 | 96 well plate | labeled-samples | A:3 | Index 3 |
| Sample-4 | 96 well plate | labeled-samples | A:4 | Index 4 |
| Sample-5 | 96 well plate | labeled-samples | A:5 | Index 5 |
| Sample-6 | 96 well plate | labeled-samples | A:6 | Index 6 |
Demultiplexing is the last step in an indexed sequencing workflow. While the specifics depend on the sequencing instrument and analysis software used, taking pooled samples through sequencing and analysis produces result files/metrics per lane/identifier tag.
These results will likely be in the form of multiple files that you can import back into Clarity LIMS. To do this, you need to set up a configured process that generates process outputs that apply to inputs per reagent label, usually in the form of ResultFile artifacts.
Before you follow the example, make sure you have the following items:
Configured reagent types named Index 1 through Index 6 in Clarity LIMS.
Reagents of type Index 1 through Index 6 in Clarity LIMS.
A compatible API version (v2 r14 to v2 r24).
Configure a process that generates ResultFile outputs that apply to inputs per reagent label. It is recommended to name the outputs in a way that clearly identifies the samples to which they correspond (eg, Results for {SubmittedSampleName}-{AppliedReagentLabels}).
Running the demultiplexing process on a labeled pooled input produces a process run in the Operations Interface, similar to the one illustrated below.
Note the following:
There were three reagent labels in the input analyte (sample) artifact. As a result, three outputs were generated (the process was configured to produce one output result file per label per input).
The names of the outputs of the demultiplexing process expose the original sample name and label.
The Operations Interface shows details of the genealogy from the downstream result file all the way back to the original sample.
While reagent labels are not explicitly exposed in the Clarity LIMS client user interface, genealogy views in the Operations Interface are aware of reagent labels and will show the true sample inheritance. As noted above, you can use the {AppliedReagentLabels} output naming variable to show the reagent labels applied to each artifact in the user interface.
Executing a demultiplexing process by issuing a process POST via the REST API is similar to the typical process execution found in Run a Process/Step.
The key difference is that when executing a demultiplexing process through the REST API, outputs per reagent label are automatically generated from the inputs provided. You do not need to explicitly specify them.
For example, when running the demultiplexing process configured against a single (pooled) sample, you could post a process execution representation like this:
The input-output-map only refers to inputs, not outputs, because the demultiplexing process is configured to exclusively produce outputs per reagent label.
If your process produces other outputs, such as shared or per-input outputs, you must explicitly specify input-output-maps for them.
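The following Groovy sketch builds such a payload with StreamingMarkupBuilder. The process name, technician URI, and artifact URI are hypothetical placeholders (the pooled artifact LIMS ID is taken from the earlier example), and the element layout should be checked against the processexecution schema for your API version:

import groovy.xml.StreamingMarkupBuilder

def payload = new StreamingMarkupBuilder().bind {
    mkp.declareNamespace(prx: 'http://genologics.com/ri/processexecution')
    'prx:process' {
        type('Demultiplexing Example')                    // hypothetical process name
        technician(uri: 'http://yourIPaddress/api/v2/researchers/1')
        'input-output-map' {
            // Only the pooled input is listed; per-label outputs are implicit
            input(uri: 'http://yourIPaddress/api/v2/artifacts/2-424')
        }
    }
}
println payload.toString()   // POST to http://yourIPaddress/api/v2/processes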
Irrespective of whether you use the user interface or the REST API to run the demultiplexing process, the REST API representation for the process looks something like this:
For each input with reagent labels, one output was created per reagent label.
In the example, the process ran on one pooled input, and produced three outputs (the pooled input included three reagent labels). The following example shows one of the demultiplexed result file outputs:
The output contains only one reagent label, and relates only to the sample that was tagged with the same reagent label. Compare this to the case of a pooled artifact, which has several labels and relates to several samples. This level of traceability (from a demultiplexed output back to its specific original sample) is only possible because the artifacts were labeled before they were pooled.
The artifact name shown ("Results for SAM-3 - Index 3") was generated by the demultiplexing process output name pattern. You can use the {SubmittedSampleName} naming variable to show true ancestors, and {AppliedReagentLabels} to show any reagent labels applied to an output.
In the BaseSpace Clarity LIMS web interface, in the Custom Fields configuration screen, administrators can add user-defined information by adding custom fields (global fields or master step fields). At this time, user-defined types (UDTs) are only supported in the API.
Use these custom fields to configure storage locations for data that annotates project, submitted sample, step, derived sample, measurement, and file information recorded in a workflow.
All XML element attributes and values are text. Before using the value in a script, you may want to convert to a strongly-typed variable such as a number or date type.
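For example (the values shown are hypothetical):

// Converting XML text values to typed variables before use
def conc    = '1.25' as BigDecimal                     // e.g. a Concentration UDF
def sizeBp  = '350' as Integer                         // e.g. a Size (bp) UDF
def runDate = Date.parse('yyyy-MM-dd', '2014-09-27')   // e.g. a date-run value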
For details on the formats used in XML, see .
When updating multiple process outputs or containers, you can increase the script execution speed by using batch operations.
If your samples are already in Clarity LIMS, you can assign reagent labels by running the Add Multiple Reagents process/protocol step from the Clarity LIMS user interface. Adding a reagent implicitly assigns a reagent label to every sample artifact. The reagent label applied is derived from the reagent type used.
Before you follow the example, make sure that you have the following items:
Reagent types that are configured in Clarity LIMS and are named index 1 through index 6.
Reagents of type index 1 through index 6 that have been added to Clarity LIMS.
A compatible API version (v2 r14 to v2 r24).
For more information on indexes with reagent labels, see .
The following illustrations show the Add Multiple Reagents process, as run from the Operations Interface.
In the Add Multiple Reagents wizard panel, reagents (Indexes 1 to 3) are selected and then assigned to the samples (SAM-1 to 3) in the Sample Workspace, using a click and drag process.
The cells of the Sample Workspace represent the wells of the container used for this process.
When the wizard completes, the Add Multiple Reagents process replaces the input sample artifacts with output analyte artifacts.
In the following illustration, the Name column shows the reagent labels applied to the outputs. These are generated by the default output naming pattern for the Add Multiple Reagents process: {InputItemName}-{AppliedReagentLabels}.
When running the Add Multiple Reagents process, the output analyte artifact names show the reagent label applied, as the output naming pattern in the process configuration uses the {AppliedReagentLabels} variable.
By examining the REST API representation of the Add Multiple Reagents process, you can verify the following information:
The output analyte artifacts show a reagent-label element matching the name of the reagent type used.
The input analyte artifacts are not modified and do not have reagent labels added.
The input analyte artifacts do not have a location element, as they were displaced by the outputs.
You can only determine that reagent labels were applied. You cannot determine which reagent was applied.
The following shows an example of an output from an Add Multiple Reagents process when viewed with the REST API:
Although adding a reagent to a sample automatically assigns a reagent label, reagents and reagent labels are independent concepts in Clarity LIMS. There are ways to add reagent labels that do not involve reagents, and even when reagents are used, it is not possible to accurately determine which reagent was used based on the reagent label attached to an artifact.
When working with multiplexing, you can do the following:
Compatibility: API version 2 revision 21 and later
Important measurements and values are often calculated from other values. Instead of performing these calculations by hand, and then manually entering them into the LIMS (thereby increasing the probability of error), you can develop scripts to perform these calculations and update the data accordingly.
This example demonstrates the use of scripts and user-defined fields (UDFs) / custom fields for information retrieval and recording of calculation results in the LIMS.
NOTE:
Information about a step is stored in the process resource in the API.
Information about a derived sample is stored in the analyte resource in the API. This resource is used as the input and output of a step, and also used to record specific details from lab processing.
As of BaseSpace Clarity LIMS v5.0, the term user-defined field (UDF) has been replaced with custom field in the user interface. However, the API resource is still called udf.
Clarity LIMS v5 and later:
You have defined the following custom global fields on the Derived Sample object:
Concentration
Size (bp)
Conc. nM
You have set the three fields configured in step 1 to display in the Sample table of the Record Details screen.
You have configured a Calc. Prep step to apply Concentration and Size (bp) to generated derived samples.
You have run the Calc. Prep step and it has generated derived samples.
You have input values for the Concentration and Size (bp) fields.
You have configured a Calculation step to apply Conc. nM to generated derived samples.
You have run the Calculation step - with the derived samples generated by the Calc. Prep step as inputs, and it has generated derived samples.
First, the values to be used in the calculation - the Concentration and Size (bp) UDFs / custom fields - are applied to the samples by running the Calc. Prep preparation step. You can then enter the values for these fields into the LIMS as follows:
Clarity LIMS v5 and later:
In the Record Details screen, in the Sample table.
After the script has successfully completed, the Conc. nM results display as follows:
(LIMS v5 and later) In the Record Details screen, in the Sample table.
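The arithmetic the calculation script performs can be sketched as follows. This assumes the common dsDNA approximation of 660 g/mol per base pair; the input values are hypothetical:

// Conc. nM = Concentration (ng/uL) / (Size (bp) x 660 g/mol per bp) x 10^6
def concentration = 1.25                   // from the Concentration field
def sizeBp        = 350                    // from the Size (bp) field
def concNM = concentration / (sizeBp * 660) * 1e6
println concNM.round(2)                    // 5.41, the value for the Conc. nM field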
UsingAnalyteUDFForCalculations.groovy:
Workflows, chemistry, hardware, and software are continually changing in the lab. As a result, you may need to determine which samples were processed after a specific change happened.
Using the processes (list) resource you can construct a query that filters the list using both process type and date modified.
Before you follow the example, make sure you have the following items:
Samples that have been added to the system.
Multiple processes of the Cookbook Example type that have been run on different dates.
A compatible API version (v2 r21 or later).
In Clarity LIMS, when you search for a specific step type, the search results list shows all steps of that type that have been run, along with detailed information about each one. This information includes the protocol that includes the step, the number of samples in the step, the step LIMS ID, and the date the step was run.
The following screenshot shows the search results for the step type Denature and Anneal RNA (TruSight Tumor 170 v1.0).
The list shows the date run for each step, but not the last modified date. This is because a step can be modified after it was run, without changing the date on which it was run.
To find the steps that meet the two criteria (step type and date modified), complete the following steps:
Request a list of all steps (processes), filtered on process type and date modified.
Once you have the list of processes, you can use a script to print the LIMS ID for each process.
To request a list of all processes of a specific type that were modified after a specified date, use a GET method with both the ?type and ?last-modified filters on the processes resource:
The GET call returns a list of the first 500 processes that match the specified filter. If more than 500 processes match, the remaining matches must be retrieved from subsequent pages.
In the XML returned, each process is an element in the list. Each element contains the URI for the individual process resource, which includes the LIMS ID for the process.
The URI for the list of all processes is http://yourIPaddress/api/v2/processes. In the example code, the list was filtered by appending the following:
This filters the list to show only processes that are of the Cookbook Example type and were modified after the specified date.
The date must be specified in ISO 8601, including the time. In the example, this is accomplished using an instance of a Calendar object and a SimpleDateFormat object, and encoding the date using UTF-8. The date specified is one week prior to the time the code is executed.
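A sketch of this construction follows, using the placeholder base URI from above (the XXX date pattern requires Java 7 or later):

import java.text.SimpleDateFormat

// Build an ISO 8601 timestamp for one week before now
def cal = Calendar.instance
cal.add(Calendar.WEEK_OF_YEAR, -1)
def isoDate = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ssXXX").format(cal.time)

// Append the URL-encoded type and last-modified filters to the processes URI
def query = 'http://yourIPaddress/api/v2/processes?type=' +
        URLEncoder.encode('Cookbook Example', 'UTF-8') +
        '&last-modified=' + URLEncoder.encode(isoDate, 'UTF-8')
println query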
All of the REST list resources are paged. Only the first 500 items are returned when you query for a list of items, such as http://yourIPaddress/api/v2/artifacts.
If you cannot filter the list, you must iterate through the pages of a list resource to find the items that you are looking for. The URI for the next page of resources is always the last element on the page of a list resource.
After requesting an individual process XML resource, you have access to a large collection of data that lets you modify or view each process. Within the process XML, you can also access the artifacts that were inputs or outputs of the process.
After running the script on the command line, output is generated showing the LIMS ID for each process in the list.
Information about a step is stored in the process resource. In general, automation scripts access information about a step using the processURI, which links to the individual process resource. The input-output-map in the XML returned by the individual process resource gives the script access to the artifacts that were inputs and outputs to the process.
Processing a sample in the lab can be complex and is not always linear. This may be because more than one step (referred to as process in the API and in the Operations Interface in Clarity LIMS v4.x and earlier) is run on the same sample, or because a sample has to be modified or restarted because of quality problems.
The following illustration provides a conceptual representation of a Clarity LIMS workflow and its sample/process hierarchy. In this illustration, the terminal processes are circled.
This example finds all terminal artifact (sample)-process pairs. The main steps are as follows:
All the processes run on a sample are listed with a process (list) GET method using the ?inputartifactlimsid filter.
All the process outputs for an input sample are found with a process (single) GET.
Iteration through the input-output maps finds all outputs for the input of interest.
Before you follow the example, make sure you have the following items:
A sample added to the system.
Several steps that have been run, with at least one artifact used as the input to more than one step.
A compatible API version (v2 r21 or later).
To walk down the hierarchy from a particular sample, you must do the following steps:
List all the processes that used the sample as an input.
For each process on that list, find all the output artifacts that used that particular input. These output artifacts represent the next level down the hierarchy.
To find the artifacts for the next level down, repeat steps 1 and 2, starting with each output artifact from the previous round.
To find all artifacts in the hierarchy, repeat this process until there are no more output artifacts. The last processes found are the terminal processes.
This example starts from the original submitted sample.
The first step is to retrieve the sample resource via a GET call and find its analyte artifact (derived sample) URI. The analyte artifact of the sample is the input to the first process in the sample hierarchy.
The following GET method provides the full XML structure for the sample including the analyte artifact URI:
The sample.artifact.@limsid contains the original analyte LIMS ID of the sample. For each level of the hierarchy, the artifacts are stored in a Groovy map called artifactMap, which uses the artifact LIMS ID as the key and the process that generated the artifact as the value. At the top (sample) level, the map contains only the analyte of the original sample, with its process value set to null.
To find all the processes run on the artifacts, use a GET method on the process (list) resource with the ?inputartifactlimsid filter.
In the last line of the example code, the processURI string sets up the first part of the URI. The artifact LIMSID is added (concatenated) for each GET call in the following while loop:
The while loop evaluates one level of the hierarchy for every iteration. Each artifact at that level is evaluated. If that artifact was not used as an input to a process, an artifact/process key value pair is stored in the lastProcMap. All the Groovy maps in the previous code use this artifact/process pair structure.
The loop continues until there are no artifacts that had outputs generated. For each artifact evaluated, the processes that used the artifact as an input are found and collected in the processes variable. Because a process can be run without producing outputs, a GET call is done for each of the processes to determine if the artifact generated any outputs.
Any outputs found form the next level of the hierarchy. The outputs are temporarily collected in the outputArtifactMap. If no processes were found for that artifact, it is an end leaf node of a hierarchy branch. Those artifact/process pairs are collected in the lastProcMap.
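A simplified, self-contained sketch of this walk follows. The credentials, base URI, starting LIMS ID, and the getXml helper are hypothetical stand-ins for the utilities used by the full script:

def user = 'apiuser'                       // hypothetical credentials
def pass = 'apipassword'
def baseURI = 'http://yourIPaddress/api/v2'

// Hypothetical helper: authenticated GET returning a parsed GPathResult
def getXml = { String uri ->
    def conn = new URL(uri).openConnection()
    conn.setRequestProperty('Authorization',
            'Basic ' + "${user}:${pass}".bytes.encodeBase64().toString())
    new XmlSlurper().parse(conn.inputStream)
}

def processURI = "${baseURI}/processes?inputartifactlimsid="
def artifactMap = ['SAM1A1PA1': null]      // artifact LIMS ID -> generating process LIMS ID
def lastProcMap = [:]

while (artifactMap) {
    def outputArtifactMap = [:]
    artifactMap.each { limsid, procLimsid ->
        def processList = getXml(processURI + limsid)
        if (processList.process.size() == 0) {
            lastProcMap[limsid] = procLimsid          // leaf: nothing further was run
        }
        processList.process.each { p ->
            def processXml = getXml(p.@uri.toString())
            processXml.'input-output-map'.each { iom ->
                if (iom.input.any { it.@limsid == limsid } && iom.output.size() > 0) {
                    outputArtifactMap[iom.output.@limsid.toString()] = processXml.@limsid.toString()
                }
            }
        }
    }
    artifactMap = outputArtifactMap                   // descend one level
}

lastProcMap.each { artifactId, procLimsid ->
    println "Terminal artifact ${artifactId} (from process ${procLimsid})"
}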
You can iterate through each pair of artifact and process LIMS IDs in outputArtifactMap and print the results to standard output.
Running the script in a console produces the following output:
Pooling steps require that each input analyte artifact (derived sample) in the step be inserted into a pool. You can automate this task by using the API steps pooling endpoint. Automation of pooling allows you to reduce error and user interaction with Clarity LIMS.
In this example, a script pools samples based on the value of the pool id user-defined field (UDF)/custom field of the artifact.
As of Clarity LIMS v5, the term user-defined field (UDF) has been replaced with custom field in the user interface. However, the API resource is still called UDF.
Master step fields—Configured on master steps. Master step fields only apply to the following:
The master step on which the fields are configured.
The steps derived from those master steps.
Global fields—Configured on entities (eg, submitted sample, derived sample, measurement, etc.). Global fields apply to the entire Clarity LIMS system.
To keep this example simple, the script does not handle samples with reagent labels.
In the API, an artifact is an item generated by an earlier step. There are two types of artifacts: analyte (derived sample) and resultfile (measurement). In the Clarity LIMS web interface, the terms artifact, analyte, and resultfile have been replaced with derived sample or measurement.
Before you follow the example, make sure that you have the following items:
A configured analyte UDF/derived-sample custom field named pool id in Clarity LIMS.
Groovy installed on the server and accessible at /opt/groovy/bin/groovy.
The GLSRestApiUtils.groovy file stored in /opt/groovy/lib/.
The WorkingWithStepsPoolingEndpoint.groovy script stored in /opt/gls/clarity/customextensions/.
A compatible API version (v2 r21 or later).
In Clarity LIMS, under Configuration, select the Lab Work tab.
Select an existing Pooling master step or add a new one.
On the master step configuration form, select the Pooling milestone.
On the Pooling Settings form, set the Label Uniqueness toggle switch to Off.
Select Save.
Add a new protocol.
With the protocol selected, add a new Library Pooling step based on the master step you configured.
In Clarity LIMS, under Configuration, select the Automation tab.
Add a new step automation. Associate the automation with the WorkingWithStepsPoolingEndpoint.groovy script. The command line used in this example is as follows.
bash -c "/opt/groovy/bin/groovy -cp /opt/groovy/lib /opt/gls/clarity/customextensions/WorkingWithStepsPoolingEndpoint.groovy -u {username} -p {password} -s {stepURI:v2:http}"
Enable the automation on the configured pooling master step. Select Save.
You can now configure the automation trigger on the step or the master step. If you configure the trigger on the master step, the settings will be locked on all steps derived from the master step.
On the Lab Work tab, select the library pooling step or master step.
On the Step Settings or Master Step Settings form, in the Automation section, configure the automation trigger so that the script is automatically initiated at the beginning of the step:
Trigger Location—Step
Trigger Style—Automatic upon entry
In Clarity LIMS, under Configuration, select the Lab Work tab.
Select the pooling protocol containing the Library Pooling step.
Add the Add Pool ID step that sets the pool id custom field of the samples. Move this step to the top of the Steps list.
Select the Add Pool ID step.
On the Record Details milestone, add the pool id custom field to the Sample Details table.
In Clarity LIMS, under Configuration, select the Lab Work tab.
Create a workflow containing the configured pooling protocol. Activate the workflow.
On the Projects and Samples screen, create a project and add samples to it. Assign the samples to your pooling workflow.
Begin working on the samples. In the first step, enter values into the pool id custom field.
Continue to the Library Pooling step and add samples to the Ice Bucket. Select Begin Work to execute the script.
The script is passed the URI of the pooling step. Then, using the URI, the pool node of the step is retrieved. This node contains an available-inputs node that lists the URIs of the available input artifacts.
The script retrieves all available input artifacts, and then iterates through the list of retrieved artifacts. For each artifact, the script looks for the pool id custom field. If the field is not found, the script moves on to the next artifact. If the field is found, its value is stored in the poolID variable.
When the script encounters a new pool ID, it creates a new pool with a name equal to that ID. Input artifacts are sorted into pools based on the value of their pool id field, and as they are inserted into pools, they are removed from the list of available inputs.
After all of the available inputs are iterated through, the updated pool node is sent back to Clarity LIMS:
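The attached WorkingWithStepsPoolingEndpoint.groovy script implements this logic. A simplified sketch follows; the credentials, step URI, and getNode helper are hypothetical stand-ins, and the XML layout assumes the available-inputs and pooled-inputs nodes described above:

import groovy.xml.XmlUtil

def user = 'apiuser'                              // hypothetical credentials
def pass = 'apipassword'
def auth = 'Basic ' + "${user}:${pass}".bytes.encodeBase64().toString()

// Hypothetical helper: authenticated GET returning a parsed Node tree
def getNode = { String uri ->
    def conn = new URL(uri).openConnection()
    conn.setRequestProperty('Authorization', auth)
    new XmlParser(false, false).parse(conn.inputStream)
}

def stepURI = 'http://yourIPaddress/api/v2/steps/122-12345'   // from {stepURI:v2:http}
def pools = getNode("${stepURI}/pools")

// Group the available inputs by each artifact's 'pool id' UDF value
def available = pools.'available-inputs'[0]
def byPoolId = [:].withDefault { [] }
available.'available-input'.each { input ->
    def artifact = getNode(input.@uri)
    def poolId = artifact.'udf:field'.find { it.@name == 'pool id' }?.text()
    if (poolId) { byPoolId[poolId] << input }
}

// Create one pool per ID and move its inputs out of available-inputs
def pooledInputs = pools.'pooled-inputs'[0]
byPoolId.each { id, inputs ->
    def pool = pooledInputs.appendNode('pool', [name: id])
    inputs.each { input ->
        pool.appendNode('input', [uri: input.@uri])
        available.remove(input)
    }
}

println XmlUtil.serialize(pools)   // PUT this XML back to {stepURI}/pools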
Artifacts with the same Pool ID UDF / custom field will be automatically added to the same pool.
WorkingWithStepsPoolingEndpoint.groovy:
GLSRestApiUtils.groovy:
Steps can have user-defined fields (UDFs)/custom fields that can be used to describe properties of the steps.
For example, while a sample UDF/custom field might describe the gender or species of the sample, a process UDF/custom field might describe the room temperature recorded during the step or the reagent lot identifier. Sometimes, information about a step is not known until the work has completed on the instrument, which may be after the step has already been run in Clarity LIMS.
In this example, we will record the Actual Equipment Start Time as a process UDF/custom field after the step has been run in Clarity LIMS. The ISO 8601 convention is used for recording the date and time. For more information, see .
NOTE: In the API, information about a step is stored in the process resource.
As of Clarity LIMS v5, the term user-defined field (UDF) has been replaced with custom field in the user interface. However, the API resource is still called UDF.
Master step fields—Configured on master steps. Master step fields only apply to the following:
The master step on which the fields are configured.
The steps derived from those master steps.
Global fields—Configured on entities (eg, submitted sample, derived sample, measurement, etc.). Global fields apply to the entire Clarity LIMS system.
Before you follow the example, make sure you have the following items:
Samples added to the system.
A custom field named Actual Equipment Start Time that has been configured on a master step (master step field).
On the master step, you have configured the field to display on the Record Details milestone, in the Master Step Fields section.
You have run samples through a step based on the master step on which the Actual Equipment Start Time field is configured.
Detailed information for each step run in Clarity LIMS, including its name, LIMS ID, and custom fields can be viewed on the Record Details screen.
In the image below, an Actual Equipment Start Time master step field has been configured to display in the Step Details section of the Record Details screen. However, a value for this field has not yet been specified.
Before you can change the value of a process UDF/custom field, you must first request the individual process resource via a GET HTTP call. The XML returned from a GET on the individual process resource contains the information about that process. The following GET method provides the complete XML structure for the process:
The variable processNode now holds the complete XML structure retrieved from the resource at processURI.
The variable startTimeUDF references the XML element node contained in the structure that relates to the Actual Equipment Start Time UDF/custom field (if one exists).
The variable newStartTime is a string initialized with a value from the method parseInstrumentStartTimes. The details of this method are omitted from this example, but its function is to parse the date and time the instrument started from a log file.
The XML representations of individual REST resources are self-contained entities. Always request the complete XML representation before editing any portion of the XML. If you do not use the complete XML when you update the resource, you can inadvertently change data.
The following code shows the XML structure for the process, as stored in the variable processNode. There are no child UDF/custom field nodes.
After modifying the process stored in the variable processNode, you can use a PUT method to update the process resource.
You can check whether the UDF/custom field exists by verifying the value of the startTimeUDF variable. If the value is not null, the field is defined and you can set a new value in the XML. If the field does not exist, you must append a new node to the process XML resource using the UDF/custom field name and new value.
Before you can append a node to the XML, you must first specify the namespace for the new node. You can use the Groovy built-in QName class to do this. A QName object defines the qualified name of an XML element and specifies its namespace. The node you are specifying is a UDF element, so the namespace is http://genologics.com/ri/userdefined. The local part is field and the prefix is udf for the QName, which specifies the element as a UDF/custom field.
Append a new node to the process using the appendNode method of the variable processNode, which appends a node with the specified QName, attributes, and value. Specify the following attributes for the UDF/custom field element:
the type
the name
Both of these attributes must match a UDF/custom field that has been configured for the process type in the Configuration window.
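Putting these pieces together, the check-then-append logic might look like the following self-contained sketch (the process fragment and timestamp value are hypothetical):

import groovy.xml.QName
import groovy.xml.XmlUtil

// A hypothetical, abbreviated process fragment with no UDF nodes yet
def processNode = new XmlParser(false, false).parseText('''
<prc:process xmlns:prc="http://genologics.com/ri/process"
             xmlns:udf="http://genologics.com/ri/userdefined" limsid="24-2188">
    <type>Cookbook Example</type>
</prc:process>''')

def startTimeUDF = processNode.'udf:field'.find { it.@name == 'Actual Equipment Start Time' }
def newStartTime = '2014-09-27T13:00:00-06:00'   // e.g. parsed from an instrument log

if (startTimeUDF != null) {
    startTimeUDF.setValue(newStartTime)          // field exists: update its value
} else {
    // Field missing: append a udf:field node whose type and name match the configuration
    def udfQName = new QName('http://genologics.com/ri/userdefined', 'field', 'udf')
    processNode.appendNode(udfQName, [type: 'String', name: 'Actual Equipment Start Time'], newStartTime)
}
println XmlUtil.serialize(processNode)           // ready to PUT back to the process URI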
The variable processNode now holds the complete XML structure for the process with the updated, or added, UDF named Actual Equipment Start Time.
You can save the changes you have made to the process using a PUT method on the process resource:
The PUT updates the process resource at the specified URI using the complete XML representation, including the new value for Actual Equipment Start Time.
If the PUT was successful, it returns the XML resource, as shown below. The updated information is also available at the http://yourIPaddress/api/v2/processes/A22-BMJ-100927-24-2188 URI.
If the PUT was unsuccessful, an XML resource is returned with contents that detail why the call was unsuccessful. In the following example error, an incorrect UDF/custom field name was specified. A UDF/custom field named Equipment Start Time was created in the process resource, but no UDF/custom field with that name was configured for the process type/master step.
The Step Details section of the updated Record Details screen now shows the Actual Equipment Start Time value.
The ability to modify process properties allows you to automatically update and store lab activity information as it becomes available. Information from equipment log files or other data sources can be collected in this way.
Updating per-run or per-process/step information is powerful because the information can be used to optimize lab work (eg, by tracking trends over time). The data can be compared by instrument, length of run, lab conditions, and even against quality of molecular results.
UpdateProcessUDFInfo.groovy:
When samples are processed in the lab, they are sometimes re-arrayed in complex ways that are pre-defined.
You can use the REST API and automation functionality to allow a user to initiate a step that:
Uses a file to define a re-array pattern
Executes the step using that re-array pattern. Because the pattern is pre-defined, this decreases the likelihood of an error in recording the re-array.
To accomplish this automation, you must be able to execute a step using the REST API. This example shows a simple step execution that you can apply to any automated step execution needed in your lab.
For a high-level overview of REST resource structure in Clarity LIMS, including how processes are the key to tracking work, see .
Before you follow the example, make sure that you have the following items:
Samples that have been added to the system.
A configured step/process that generates analytes (derived samples) and a shared result file.
Samples that have been run through the configured process/step.
A compatible API version (v2 r21 or later).
Information about a step is stored in the process resource in the API.
Information about a derived sample is stored in the analyte resource in the API. This resource is used as the input and output of a step, and also used to record specific details from lab processing.
To run a step/process on a set of samples, you must first identify the set of samples to be used as inputs.
The samples that are inputs to a step/process can often be identified because they are all in the same container, or because they are all outputs of a previous step / process.
In this example, you run the step/process on the samples listed in the following table.
After you have identified the samples, use their LIMS IDs to construct the URIs for the respective analyte (derived sample) artifacts. The artifact URIs are used as the inputs in constructing the XML to POST and execute a process.
You can use StreamingMarkupBuilder to construct the XML needed for the POST, as shown in the following example code:
Executing a process uses the processexecution (prx) namespace (shown in bold in the code example above).
The required elements for a successful POST are:
type – the name of the process being run
technician uri – the URI for the technician that will be listed as running the process
input-output-map – one input output map element for each pair of inputs and outputs
input uri – the URI for the input artifact
output type – the type of artifact of the output
In addition, if the outputs of the process are analytes, then the following are also needed:
container uri – the URI for the container the output will be placed in
value – the well placement for the output
The process type, technician, input artifact, and container must all exist in the system before the process can be executed. So, for example, if there is no container with an empty well, you must create a container before running the process.
The XML constructed must match the configuration of the process type. For example, if the process is configured to have both samples and a shared result file as outputs, you must have both of the following:
An input-output-map for each pair of sample inputs and outputs
An additional input-output-map for the shared result file
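As an illustration, the following sketch builds such a payload with StreamingMarkupBuilder, using the artifact and container LIMS IDs from the table above. The process name, technician URI, and target well are hypothetical, and the element layout should be checked against the processexecution schema for your API version:

import groovy.xml.StreamingMarkupBuilder

def payload = new StreamingMarkupBuilder().bind {
    mkp.declareNamespace(prx: 'http://genologics.com/ri/processexecution')
    'prx:process' {
        type('Cookbook Re-array Example')                 // hypothetical process name
        technician(uri: 'http://yourIPaddress/api/v2/researchers/1')
        // One input-output-map per pair of sample inputs and outputs
        'input-output-map' {
            input(uri: 'http://yourIPaddress/api/v2/artifacts/AFF853A53AP11')
            output(type: 'Analyte') {
                location {
                    container(uri: 'http://yourIPaddress/api/v2/containers/27-4056')
                    value('B:1')                          // hypothetical re-array target well
                }
            }
        }
        // An additional input-output-map for the shared result file
        'input-output-map' {
            input(uri: 'http://yourIPaddress/api/v2/artifacts/AFF853A53AP11')
            output(type: 'ResultFile')
        }
    }
}
println payload.toString()   // POST to http://yourIPaddress/api/v2/processes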
If the POST is successful, the process XML is returned:
If the POST is not successful, the XML returned will contain the error that occurred when the POST completed:
After the step / process has successfully executed, you can open the Record Details screen and see the step outputs.
RunningAProcess.groovy:
For more information, refer to and .
Submitted Sample Name | Derived Sample Name | Derived Sample LIMS ID | Container LIMS ID | Container Type | Well |
Soleus-1 | Soleus-1 | AFF853A53AP11 | 27-4056 | 96 well plate | A:1 |
Soleus-2 | Soleus-2 | AFF853A54AP11 | 27-4056 | 96 well plate | A:2 |
Soleus-3 | Soleus-3 | AFF853A55AP11 | 27-4056 | 96 well plate | A:3 |