This section provides tips and tricks to help you work efficiently with the API. For example, learn how to copy and update field values, create and rename samples, work with files and QC flags, and automate BCL conversion.
The QC flag parameter qc-flag can be set on an input or output analyte (derived sample) or on an individual result file (measurement) with a few lines of Groovy code.
In the following example, the qc-flag value of the analyte artifact is set based on the value of the bp_size variable when compared to the threshold1 and threshold2 variables.
The following code determines whether a qc-flag value is previously set, such that a flag is only set if one does not exist.
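The original examples use Groovy; the following is a minimal Python sketch of the same logic, using the requests library. The hostname, credentials, and threshold values are placeholders, and the artifact is assumed to already carry a qc-flag element (new artifacts default to UNKNOWN).

```python
import requests
from xml.etree import ElementTree as ET

BASE = 'https://yourserver/api/v2'   # placeholder hostname
AUTH = ('apiuser', 'apipassword')    # placeholder credentials

def set_qc_flag(artifact_limsid, bp_size, threshold1=150, threshold2=700):
    """Set the qc-flag on an artifact from bp_size, but only if no flag is set yet."""
    uri = '%s/artifacts/%s' % (BASE, artifact_limsid)
    art = ET.fromstring(requests.get(uri, auth=AUTH).text)

    flag = art.find('qc-flag')       # defaults to UNKNOWN on new artifacts
    if flag is None or flag.text in ('PASSED', 'FAILED'):
        return                       # a flag already exists; leave it untouched

    flag.text = 'PASSED' if threshold1 <= bp_size <= threshold2 else 'FAILED'
    requests.put(uri, data=ET.tostring(art), auth=AUTH,
                 headers={'Content-Type': 'application/xml'})
```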
This section outlines several strategies for accessing, on a later step, a UDF value that was set on an earlier step.
In all cases, assume that a UDF called Batch ID was set on Step A, and that you want to access its value on Step D:
NOTE: If the samples in Step D do not have a homogeneous lineage, expect multiple values for the Batch ID.
This method involves crawling backwards from Step D to Step A.
The general form is as follows.
Examine the inputs to Step D.
Each input (I) has a parent-process element with a URI to the step that created the artifact. In this case, it is the URI to Step C.
Get the input-output maps for Step C (from the /details resource) and find the input (I') that produced output I. Each input (I') has a parent-process element with a URI to the step that created the artifact. In this case, it is the URI to Step B.
Get the input-output maps for Step B (from the /details resource) and find the input (I'') that produced the output I'. Each input (I'') has a parent-process element with a URI to the step that created the artifact. In this case, it is the URI to Step A.
Get the value of the UDF (Batch ID) from Step A: 1234.
This method is computationally slow, but it is safe. As the number of steps to crawl back through increases, so does the time the script takes to retrieve the value.
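A sketch of this crawl in Python, assuming the requests library, placeholder credentials, and that the step name uniquely identifies Step A. Each artifact's parent-process link is followed, and the step's input-output maps locate the input that produced the artifact:

```python
import requests
from xml.etree import ElementTree as ET

BASE = 'https://yourserver/api/v2'   # placeholders
AUTH = ('apiuser', 'apipassword')
UDF_NS = {'udf': 'http://genologics.com/ri/userdefined'}

def get_xml(uri):
    return ET.fromstring(requests.get(uri, auth=AUTH).text)

def crawl_back(artifact_uri, target_step='Step A'):
    """Follow parent-process links backwards until the target step is found."""
    while True:
        art = get_xml(artifact_uri.split('?')[0])    # strip any ?state= suffix
        parent = art.find('parent-process')
        if parent is None:
            return None                              # reached the root artifact
        proc = get_xml(parent.get('uri'))
        if proc.findtext('type') == target_step:
            return proc
        # find the input that produced this artifact, and continue from it
        for iomap in proc.findall('input-output-map'):
            output = iomap.find('output')
            if output is not None and output.get('limsid') == art.get('limsid'):
                artifact_uri = iomap.find('input').get('uri')
                break
        else:
            return None

# e.g. read the Batch ID UDF from Step A, starting from an input of Step D
# ('2-1234' is a placeholder artifact LIMSID):
step_a = crawl_back('https://yourserver/api/v2/artifacts/2-1234')
batch_id = step_a.findtext('udf:field[@name="Batch ID"]', namespaces=UDF_NS)
```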
This method tries to jump straight to Step A, without passing through Steps B and C.
The general form is as follows.
Examine the inputs to Step D. Each input (I) has a sample element that contains the limsid (S) of the related submitted sample.
https://<your_hostname>/api/v2/artifacts?samplelimsid=S&process-type=Step%20A
This query should give an XML response containing the URI to Step A. From there, get the value of the UDF (Batch ID): 1234.
This method makes two assumptions:
That Step A produces analytes (derived samples). Thus, if Step A is a QC process, or does not produce analyte outputs, this method fails.
That the analytes (derived samples) resulting from S passed through Step A only once. If this assumption does not hold, the query returns multiple URIs (one for each instance of Step A), and you cannot be certain which Batch ID to rely upon.
This method is computationally fast, and its duration does not increase when there are many steps between Step A and Step D.
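A sketch of the direct query in Python (placeholders as before); note that the step name in the process-type filter must be URL-encoded:

```python
import requests
from xml.etree import ElementTree as ET
try:
    from urllib.parse import quote   # Python 3
except ImportError:
    from urllib import quote         # Python 2

BASE = 'https://yourserver/api/v2'   # placeholders
AUTH = ('apiuser', 'apipassword')
UDF_NS = {'udf': 'http://genologics.com/ri/userdefined'}

def get_xml(uri):
    return ET.fromstring(requests.get(uri, auth=AUTH).text)

def batch_id_for(sample_limsid, step_name='Step A'):
    """Jump straight to the Step A instance that processed sample S."""
    hits = get_xml('%s/artifacts?samplelimsid=%s&process-type=%s'
                   % (BASE, sample_limsid, quote(step_name))).findall('artifact')
    if len(hits) != 1:
        # mirrors the second assumption above: multiple hits are ambiguous
        raise ValueError('%d matching artifacts; Batch ID is ambiguous' % len(hits))
    art = get_xml(hits[0].get('uri').split('?')[0])
    proc = get_xml(art.find('parent-process').get('uri'))
    return proc.findtext('udf:field[@name="Batch ID"]', namespaces=UDF_NS)
```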
This method works well, but it involves making configuration changes to the steps. As such, this method is useless for legacy data resulting from samples that passed through the steps before the configuration was applied.
Its general form involves:
In Step A: Add a script that copies the value of the Batch ID UDF (1234) to every input and output of type analyte in the step.
In Step B: Add a script that copies the value of the Batch ID UDF (1234) to every output of type analyte in the step.
In Step C: Add a script that copies the value of the Batch ID UDF (1234) to every output of type analyte in the step.
In Step D: The inputs contain the value of the Batch ID.
This method relies on propagating the step UDF through Steps A, B, and C to Step D. It is safe and fast. However, if the protocol is edited and a new step is inserted between B and C, the new step must also be given the script that propagates the value, so that the chain does not break. Unlike the previous method, this one remains safe even if some of the steps are QC steps or do not produce analyte outputs.
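A sketch of the propagation script that Steps A through C would trigger, assuming the requests library, placeholder credentials, and a Batch ID UDF configured on both the step and the analytes. (Step A's version would also write the value onto its inputs.)

```python
import requests
from xml.etree import ElementTree as ET

BASE = 'https://yourserver/api/v2'   # placeholders
AUTH = ('apiuser', 'apipassword')
UDF = '{http://genologics.com/ri/userdefined}field'
UDF_NS = {'udf': 'http://genologics.com/ri/userdefined'}

def propagate_batch_id(process_uri):
    """Copy the step's Batch ID UDF onto every analyte output of the step."""
    proc = ET.fromstring(requests.get(process_uri, auth=AUTH).text)
    batch_id = proc.findtext('udf:field[@name="Batch ID"]', namespaces=UDF_NS)
    done = set()
    for iomap in proc.findall('input-output-map'):
        output = iomap.find('output')
        if output is None or output.get('output-type') != 'Analyte':
            continue
        uri = output.get('uri').split('?')[0]
        if uri in done:                      # outputs can repeat across maps
            continue
        done.add(uri)
        art = ET.fromstring(requests.get(uri, auth=AUTH).text)
        field = art.find('udf:field[@name="Batch ID"]', UDF_NS)
        if field is None:
            field = ET.SubElement(art, UDF, {'name': 'Batch ID'})
        field.text = batch_id
        requests.put(uri, data=ET.tostring(art), auth=AUTH,
                     headers={'Content-Type': 'application/xml'})
```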
This method is a niche solution, but it works well. It assumes that the samples from Step A proceed to Step D as an intact group, and they are joined by a control sample.
This method involves making configuration changes to the steps. As such, this method is useless for legacy data resulting from samples that passed through the steps before the configuration was applied.
In Step A: Identify the control sample for the group, then copy the value of the Batch ID to the control sample.
In Step D: Identify the control sample for the group, then retrieve the value of the Batch ID from it.
This method is the least work, but it does make several assumptions that might make it impracticable.
In a highly automated workflow, a lab gains little value from manually selecting samples into the ice bucket and then transitioning them through a step. Ideally, upon completion of one step, a following step could be automated such that the output analytes were transitioned through to the Record Details screen.
The Clarity LIMS External Program Plugin (EPP)/automation system cannot aid in this transition. The last point at which an automation can be triggered is before the step completion.
This scenario requires a stand-alone API application, which can be run by an automation at the end of a step.
Using this approach, a standalone app polls the API until each of the output analytes from the previous step is queued for the next step. After they are queued, they can be walked through to the Record Details stage.
The steps are as follows:
The EPP / automation triggers at step completion, launches the API app as a new Linux process, and then finishes. The parameter passed to the API app is the URL of the current process.
API app polls to see if each output analyte is queued.
Use the artifacts batch endpoint (api/v2/artifacts/batch/retrieve) to poll.
Check the last workflow-stage node within workflow-stages and look for status="QUEUED".
API app moves the output analytes through the step to Record Details.
Use the /api/v2/steps endpoints to start the step and then move the analytes forward.
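A sketch of the polling portion, assuming the requests library and placeholder credentials. Starting the step via the /steps endpoints is omitted, as the exact payload depends on your configuration:

```python
import time
import requests
from xml.etree import ElementTree as ET

BASE = 'https://yourserver/api/v2'   # placeholders
AUTH = ('apiuser', 'apipassword')

def wait_until_queued(artifact_uris, poll_seconds=30):
    """Poll the artifacts batch endpoint until every artifact is QUEUED."""
    links = ''.join('<link uri="%s" rel="artifacts"/>' % u for u in artifact_uris)
    payload = '<ri:links xmlns:ri="http://genologics.com/ri">%s</ri:links>' % links
    while True:
        r = requests.post(BASE + '/artifacts/batch/retrieve', data=payload,
                          auth=AUTH, headers={'Content-Type': 'application/xml'})
        queued = 0
        for art in ET.fromstring(r.text):
            stages = art.find('workflow-stages')
            # check the last workflow-stage node for status="QUEUED"
            if stages is not None and len(stages) and \
                    stages[-1].get('status') == 'QUEUED':
                queued += 1
        if queued == len(artifact_uris):
            return          # all queued; now start the step via /api/v2/steps
        time.sleep(poll_seconds)
```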
Within a script, you may sometimes need to know to which workflow the current sample is assigned.
However, in Clarity LIMS, the XML payload that relates to the sample does not provide information about the workflow associations of the sample.
For example, consider a sample (artifact), picked at random, from a demo system:
It is evident that this XML payload does not provide the workflow information.
The following solution shows how to use the Clarity LIMS API to determine the association of a sample with one or more workflows.
The XML payload that corresponds to each sample artifact contains a link to the related submitted sample (or samples, if it is a pooled artifact).
Follow that link to see what it yields:
The XML corresponding to the submitted sample has a link to an artifact. This artifact is special for several reasons:
It is known as the 'root artifact'.
It has an unusual LIMS ID for an artifact. Artifact LIMS IDs usually start with '2-' for Analytes and '92-' for ResultFiles. This one appears to be derived from the LIMS ID of the sample: KUZ407A145PA1
A root artifact is created 'behind the scenes' whenever a submitted sample is created in the system.
The sample history in Clarity LIMS makes it appear as if the first step in the workflow is run on the submitted sample. However, it is actually the root artifact that is the input to the first process.
When a submitted sample is assigned to the workflow, it is the root artifact that is assigned to that workflow.
Therefore, if you gather the XML payload corresponding to the root artifact, you should see the workflow assignment:
The key element is as follows.
The name of the artifact-group (Sanger Sequencing) should match the name of the workflow in which the root artifact (and by inference, artifacts derived from the root artifact) is assigned.
If you find that the artifact-group node is missing from some of the root artifacts, there are several potential reasons:
The workflow has been completed, causing the root artifact to be unassigned from the workflow.
The derived samples / artifacts have been removed from the workflow intentionally, because of a sample processing issue.
An API script has intentionally removed the derived samples / artifacts from the workflow.
The assigned workflow has been marked as 'Archived'.
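A sketch that follows this chain (artifact, submitted sample, root artifact) and reads the workflow names, assuming the requests library and placeholder credentials:

```python
import requests
from xml.etree import ElementTree as ET

BASE = 'https://yourserver/api/v2'   # placeholders
AUTH = ('apiuser', 'apipassword')

def get_xml(uri):
    return ET.fromstring(requests.get(uri, auth=AUTH).text)

def workflow_names(artifact_uri):
    """Return the artifact-group (workflow) names of the artifact's root artifact."""
    art = get_xml(artifact_uri.split('?')[0])
    sample_uri = art.find('sample').get('uri')            # first submitted sample
    root_uri = get_xml(sample_uri).find('artifact').get('uri')
    root = get_xml(root_uri.split('?')[0])
    return [g.get('name') for g in root.findall('artifact-group')]

# e.g. workflow_names('https://yourserver/api/v2/artifacts/2-1234')
# might return ['Sanger Sequencing'], or [] for the reasons listed above
```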
This article explains how to make files that were produced by, or attached to, the LIMS in an earlier step visible in a subsequent step.
Consider a simplified workflow / protocol containing just two steps: Produce Files and Display Files.
The first step, Produce Files, will take in analytes (derived samples), and generate individual result files (one per input).
The subsequent Display Files step will allow us to view the files associated with the analytes from the previous step.
After the files have been generated by and attached to the Produce Files step, the Record Details screen of the step displays the files.
The key to displaying these files in any subsequent step involves producing a hyperlink to the file and displaying it as a user-defined field (UDF)/custom field in subsequent steps.
You may be familiar with creating and using text, numeric, and checkbox UDFs/custom fields. However, you may be less familiar with the hyperlink option. Fields of this type are used less frequently, but they are perfect for this solution.
NOTE: As of Clarity LIMS v5.0, the term user-defined field (UDF) has been replaced with custom field in the user interface. However, the API resource is still called UDF.
This solution involves a script that runs on the Record Details screen on the subsequent Display Files step and populates the fields. See the following figure.
As you can see, the structure of the hyperlink is straightforward and includes:
The IP address / hostname of the server.
The port.
A link to the LIMS ID of the file to be linked to.
To populate these fields, there are numerous methods available within an API-based script. The method discussed here works for the two-step protocol described earlier (namely that we want the files displayed in the next step of the protocol). It also works when the steps in which the files are uploaded and displayed are separated by several intermediate steps.
Assuming that the script runs just as the Record Details screen of the Display Files step is being displayed, the following pseudocode produces the hyperlinks.
For each output:
Determine the LIMS Unique ID (LUID) of the output artifact.
Determine the LUID of the submitted sample associated with the output artifact.
Determine the LUID of the resultfile artifact produced by the earlier process, derived from the common submitted sample.
Determine the LUID of the file associated with the resultfile artifact.
Update the hyperlink UDF / custom field on the output artifact (from step 1) with the specific hyperlink value.
To illustrate these pseudocode steps, XML from a demo system is provided.
From the XML representation of the Display Files process/step, we see that there are three output artifact LUIDs: 2-81806, 2-81805, and 2-81804.
By examining the XML representation of the first output artifact (2-81806), we see the LUID of the associated submitted sample is ADM1301A2:
After the common ancestor is found, ask Clarity LIMS directly for the output artifacts produced by our step of interest (Produce Files).
For example:
Yields the following XML:
The resultfile with LUID 92-81803 is associated with the current output artifact (2-81806), even though these entities may be separated by several steps.
If the process/step produces multiple resultfiles, you may need to further constrain the search using the name= parameter. For example:
By gathering the XML representation of artifact 92-81803, the associated file has LUID 40-3652:
Now that you know the LUID of the file associated with output artifact 2-81806, set the value of its hyperlink field in the following form:
When constructing the value for the hyperlink, the 40- prefix should be removed from the LUID of the file.
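The pseudocode above might be implemented as follows in Python, assuming the requests library, placeholder credentials, and a hyperlink UDF named File Link on the output analytes. The /clarity/file/ path in the link is illustrative; match it to the hyperlink form shown above (hostname, port, and file LUID with the 40- prefix removed).

```python
import requests
from xml.etree import ElementTree as ET
try:
    from urllib.parse import quote   # Python 3
except ImportError:
    from urllib import quote         # Python 2

BASE = 'https://yourserver/api/v2'   # placeholders
AUTH = ('apiuser', 'apipassword')
NS = {'udf': 'http://genologics.com/ri/userdefined',
      'file': 'http://genologics.com/ri/file'}

def get_xml(uri):
    return ET.fromstring(requests.get(uri, auth=AUTH).text)

def link_files(process_uri, earlier_step='Produce Files', link_udf='File Link'):
    proc = get_xml(process_uri)
    out_uris = {m.find('output').get('uri').split('?')[0]
                for m in proc.findall('input-output-map')}
    for uri in out_uris:
        art = get_xml(uri)                                  # 1. output artifact
        sample = art.find('sample').get('limsid')           # 2. submitted sample
        hits = get_xml('%s/artifacts?samplelimsid=%s&process-type=%s&type=ResultFile'
                       % (BASE, sample, quote(earlier_step)))
        rf = get_xml(hits.find('artifact').get('uri'))      # 3. earlier resultfile
        file_luid = rf.find('file:file', NS).get('limsid')  # 4. e.g. 40-3652
        link = 'https://yourserver:443/clarity/file/%s' % file_luid.split('-', 1)[1]
        field = art.find('udf:field[@name="%s"]' % link_udf, NS)
        if field is None:
            field = ET.SubElement(art, '{http://genologics.com/ri/userdefined}field',
                                  {'name': link_udf})
        field.text = link                                   # 5. set the hyperlink UDF
        requests.put(uri, data=ET.tostring(art), auth=AUTH,
                     headers={'Content-Type': 'application/xml'})
```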
If a BaseSpace Clarity LIMS script is run in an automation context, it is easy to obfuscate usernames and passwords by choosing the appropriate tokens ({username} or {password}) to be passed in as run-time arguments.
However, this type of functionality is not easily available outside of automations, and it is often necessary to store various credentials on machines that need to interact with the LIMS API, database, or some other protected resource. This article explains how to use cryptography in Python to protect and obfuscate these important authentication tokens.
Many of the API Cookbook examples use a simple auth_tokens.py file that has usernames and passwords stored in plain text. This file can be byte-compiled in Python simply by importing it at a Python console:
Importing this file creates an auth_tokens.pyc file—a byte-compiled version of the source file. The source file can now be deleted, providing the first rudimentary level of security. However, the credentials can still quite easily be retrieved. Even if the permissions on this file are restricted, this solution does not present a suitable level of security for most IT administrators. It does, however, allow us to easily prototype our code, hence its use in Cookbook examples.
This solution assumes the following:
You have pycrypto installed (either through the OS package manager or pip).
You have generated a secret key of random ASCII characters (the easiest way to do this is to button-mash on a US-layout keyboard and include a lot of symbols).
You already have a plain-text auth_tokens.py file. An example is attached at the bottom of this article.
You have access to the Python or iPython command line console.
Python provides the pycrypto library that can easily be installed using the operating system's package manager, or the pip installation tool. It contains myriad different encryption algorithms and gives us a straightforward interface to wrap our own encryption objects and accessor functions.
The goal is to be able to create a flat text file containing obfuscated usernames, passwords, hostnames, and so on. To do this, use a utility class called ClarityCred that provides encryption and decryption functionality using the ARC4 cipher from pycrypto. The ClarityCred class is provided in cred.py, attached at the bottom of this article.
While the use of ARC4 is considered deprecated in favor of stronger encryption algorithms, such as AES, the ARC4 example lends itself to easier understanding. ARC4 simply requires a secret key and a salt size to be specified. The secret key can be generated at random using any preferred method and is hard-coded in cred.py, along with the salt size purely for ease of demonstration. Ideally, the secret key and salt size should be stored externally.
After applying the ARC4 encryption, the ClarityCred class wraps base64 encoding around it to obfuscate the data further.
Assume that you need to store a username, password, and hostname inside auth_tokens.py, and that this information is currently stored in plain text in another file called auth_tokens_plain.py. The usage is as follows.
Open a Python console, and import ClarityCred from cred.py.
Call the ClarityCred.encrypt() static function on the plain text username, password, and hostname strings.
Copy-paste these values into auth_tokens.py.
The following image illustrates steps 1 and 2, using an existing auth_tokens_plain.py file:
The old auth_tokens_plain.py looked like this:
username = 'testuser'
password = 'testpass'
hostname = 'https://encryptiontest.claritylims.com'
The new auth_tokens.py looks like this:
username = 'zq1AwnqIkfA=$YFY1UuO1r6edu7qPnN9/l3kMI15ZG1JAsH7IhnxnNvYulMndhYh6lxjVBfFwjN9sZEqPM0Qlx6kjq3fbht/FlRrgklDL79H7NiUP6uYM2qVltPloRA4g8SiphF3KHx4gVTE93Ku58sFCgu1rnH5u6tkCz98v0R7PsuIOW1CDMi9zSToIu+IkcYDPPYcD1b4z8ojez/7lczunaDfrmPhwopyyUiETu9BR49Bwp5fz4XSWICZFGCd9AjoEg/FTE+/X18f+0pIz0viXQyN+JjE3vJkpNsRY2Z3d72sPgQmFFZhd48m+POUtD1UXLXhaijdxp78QTcEp7AHY+TiM8hsXT7BX1Q=='
password = '9qW5BftGyXY=$6GL1t/Zl1CbSmB7Qq54uf2TJ5fI8GUlW9NdBnumkTtF/X27WLEsr1+C0ilXQX6jnLm4kzR+5pCVgnz4xz6/80/dMLMlTll6tOvCJgPU4ZkRpkUYmcPVbrp+X3azR7I024O8UjV/JeJYV869h3kvdPyWJGXRH4oJgs5NTJKI2y6URBs0wlrlgBuZ2YkO855ZGPw9J07UMM606q9xERRzQ+LT1XLRzSCuFnuSoDVEhshhYqZ/jpYWDHvA6Z5+YTYI/i099iYZ+WQdJAiU9hcgkUnWCybjcwivvHG6vAIROroLqlOefo+hrJsVFBA3uDaPS8pkgMVsKMPUGeft6vx4NgN/jaw=='
hostname = 'Q+oyq2m9Nv8=$rhgeJOMdm/M+dDNlSbBA3RCsUoo0Ts65G7lePvuajRmsLSNC5Qo5bwagRuyat0ztpeZrUmD8xTxTvhUBvZYDlM6GBLsq5drBP6PFh/lplxb6O8YiSRXrboFov8tRnu6GbaTfGR8WV7s8vBZsXhrhlPn67p7yalJLnHWb9VOKhx8AgCTtytQkkEwmpm2vbDwDha9kMdK63IrOSp2jmRaI/9X3xsd4upqaxvX7zrEJ8ruGU/szN0ITxTK1rprnowpyXfBRiOEcrI7uh1bg73oqOETn3pB/uTrGkhGETKYB2aHaewwWMccbeZTgEPT0kDmuJdpoGYy+p+gxSoR9Arh3JtREIA=='
Examples of the plain-text auth_tokens_plain.py and encrypted auth_tokens.py are attached at the bottom of this article.
Now that the new auth_tokens.py is ready to use, you can import it and create the corresponding PYC file to provide that extra level of security, as previously discussed. You can remove the PY file and ship the PYC file everywhere it is required.
It may also be a good idea to restrict the read/write/execute permissions on the file to the system user that is calling the file (usually glsai in Clarity LIMS installations).
To use the values in this file in code, use the decrypt() function in ClarityCred. Consider the simple example of initializing a glsapiutil api object. For reference, the example current directory listing looks like this:
Notice the .py source files are removed wherever possible.
Using a Python console, the normal api invocation (using a plain-text auth_tokens file) would look as follows.
import glsapiutil
import auth_tokens_plain
api = glsapiutil.glsapiutil2()
api.setHostname( auth_tokens_plain.hostname )
api.setVersion( 'v2' )
api.setup( auth_tokens_plain.username, auth_tokens_plain.password )
Now, however, with the encrypted tokens, the values are decrypted on the fly:
import glsapiutil
import auth_tokens
from cred import ClarityCred
api = glsapiutil.glsapiutil2()
api.setHostname( ClarityCred.decrypt( auth_tokens.hostname ) )
api.setVersion( 'v2' )
api.setup( ClarityCred.decrypt( auth_tokens.username ), ClarityCred.decrypt( auth_tokens.password ) )
This method provides a relatively robust solution for encrypting and obfuscating sensitive data and can be used in any Python context, not just for Clarity LIMS API initialization. By further ensuring that only the auth_tokens.pyc file is shipped and copied with restricted read/write/execute permissions, this method should help satisfy IT security requirements.
However, the matter of storing the secret key externally remains. One idea is to store the secret key in a separate file and encrypt that file using openssl or an OpenPGP key. While the problem of storing each piece of information in encrypted format likely never fully goes away, the use of multiple methods of encryption can offer better protection and peace of mind.
This section discusses methods for integrating BaseSpace Clarity LIMS with upstream sample accessioning systems.
The following illustration shows a typical architectural overview:
Required:
A sample must have a Name / ID
A sample must be associated with a Case / Patient / Study / Project
A sample must be associated with a Container (Tube / Plate etc)
Optional (but expected):
User-defined fields (UDFs)/custom fields (defined by your LIMS configuration)
Typical flowchart of actions within the broker:
The following animation illustrates the elements of an XML sample-creation message to Clarity LIMS.
Build your own:
Pro: Not too difficult
Con: Stability may become a concern as the number of messages increases
?: Whether it is maintainable over the long term
Use a commercial / open-source offering (e.g. Mirth Connect)
Pro: Quicker than build
Pro: Robust, multi-threaded support for millions of messages per day
?: May prove to be an excessive or over-complicated means to accomplish something relatively simple
Does the broker need to carry out other business logic?
For example, one customer added logic to their broker to handle medical billing. The broker could distinguish between physicians ordering duplicate tests for a subject (not reimbursable, so the duplicate sample was not submitted to Clarity LIMS) and a temporal study that was reimbursable.
The best practice is to take advantage of as many legacy systems as possible, rather than creating samples in Clarity LIMS, then reinventing business logic to remove unwanted ones.
A lab may receive samples submitted from various sources. This can pose a problem with regards to sample names. There may be duplicate sample names and/or various name formats, all of which make it hard for lab scientists to recognize a sample.
Clarity LIMS programmers often rename all incoming samples to a certain naming convention.
This section provides an example to address this problem.
When accepting a project and its samples, the receiving lab scientist runs a Clarity LIMS step named Receive Samples.
The underlying Receive Samples process type / master step is configured with analyte (sample) inputs, and no analyte outputs.
A shared result file output is configured to capture logging from the script.
The sample name could be a derivative of the Sample LIMSID, with a prefix:
Because the LIMSID is guaranteed to be unique, this approach mitigates any need to maintain an external sequence of numbers.
The Sample LIMSID is derived from the Project LIMSID, which is configurable.
The Receive Samples process is configured to trigger a script that renames the samples that are input to the process.
This trigger also passes the OriginatingProcessURI to the script. This example assumes that the original submitted sample name must be preserved, and so it is saved in a sample UDF.
The following pseudo code shows how one might implement the sample-renaming script:
Connect to the API, using the OriginatingProcessURI.
Retrieve the OriginatingProcessXML and store it in a variable.
Iterate through the input-output map of the OriginatingProcessXML, and for each InputArtifact:
GET the InputArtifactURI and store the input ArtifactXML in a variable.
From this ArtifactXML, GET the SourceSampleXML and store it in another variable.
Modify the SourceSampleXML. To do this:
Rename the SampleName to a desired name (see Recommendations section, above).
Finally, PUT the Sample XML back.
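A sketch of this pseudocode in Python, assuming the requests library, placeholder credentials, a REC prefix, and a sample UDF named Original Name for preserving the submitted name (both the prefix and the UDF name are assumptions):

```python
import requests
from xml.etree import ElementTree as ET

AUTH = ('apiuser', 'apipassword')    # placeholder credentials
UDF = '{http://genologics.com/ri/userdefined}field'
NS = {'udf': 'http://genologics.com/ri/userdefined'}

def get_xml(uri):
    return ET.fromstring(requests.get(uri, auth=AUTH).text)

def rename_samples(originating_process_uri, prefix='REC'):
    proc = get_xml(originating_process_uri)
    input_uris = {m.find('input').get('uri').split('?')[0]
                  for m in proc.findall('input-output-map')}
    for input_uri in input_uris:
        art = get_xml(input_uri)                     # input ArtifactXML
        sample_uri = art.find('sample').get('uri')
        sample = get_xml(sample_uri)                 # SourceSampleXML
        name = sample.find('name')
        field = sample.find('udf:field[@name="Original Name"]', NS)
        if field is None:
            field = ET.SubElement(sample, UDF, {'name': 'Original Name'})
        field.text = name.text                       # preserve the submitted name
        name.text = prefix + sample.get('limsid')    # e.g. RECADM1301A2
        requests.put(sample_uri, data=ET.tostring(sample), auth=AUTH,
                     headers={'Content-Type': 'application/xml'})
```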
The incoming message contains the following:
Project ID or Name
Sample ID or Name
Container ID or Name
Container type (plate / tube type)
Container well position (if the sample is on a plate), e.g., G:2
Sample user-defined fields (UDFs) / custom fields
POST to https://your_server/api/v2/samples:
We receive something like the following:
POST to https://your_server/api/v2/projects
We receive something like the following:
POST to https://your_server/api/v2/containers:
We receive something like the following:
POST to https://your_server/api/v2/containers:
We receive something like the following:
POST to https://your_server/api/v2/samples:
We receive something like the following:
GET: https://your_server/api/v2/projects?name=Week%2039
If the project exists, we receive something like the following:
If the project does not exist, we receive something like the following:
GET: https://your_server/api/v2/containers?name=Example%20Container%2020140910
If the container exists, we receive something like the following:
If the container does not exist, we receive something like the following:
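These existence checks translate into a simple get-or-create pattern. A sketch for projects, assuming the requests library; the hostname, credentials, and researcher LIMS ID are placeholders:

```python
import requests
from xml.etree import ElementTree as ET
try:
    from urllib.parse import quote   # Python 3
except ImportError:
    from urllib import quote         # Python 2

BASE = 'https://your_server/api/v2'  # placeholders
AUTH = ('apiuser', 'apipassword')

def get_or_create_project(name):
    """GET the project by name; if the list is empty, POST a minimal project."""
    listing = ET.fromstring(requests.get(
        '%s/projects?name=%s' % (BASE, quote(name)), auth=AUTH).text)
    hit = listing.find('project')
    if hit is not None:
        return hit.get('uri')        # the project exists
    xml = ('<prj:project xmlns:prj="http://genologics.com/ri/project">'
           '<name>%s</name>'
           '<researcher uri="%s/researchers/1"/>'   # researcher LIMS ID assumed
           '</prj:project>') % (name, BASE)
    r = requests.post(BASE + '/projects', data=xml, auth=AUTH,
                      headers={'Content-Type': 'application/xml'})
    return ET.fromstring(r.text).get('uri')
```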
Use the API to update the preset value of a user-defined field (UDF)/custom field configured on a step.
From your test server:
GET a chosen UDF/custom field.
Do a PUT that includes the new preset.
For example, to add 'My new preset', insert a preset element (My new preset) after the last preset value in your XML:
This technique is powerful when integrating with external systems, combined with the Begin Work trigger. For example, a script initiated by the Begin Work trigger can reach out to an external source and make sure that the presets for the Step Details UDFs/custom fields are always up to date and in sync with the server, before the Record Details screen is entered.
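A sketch of the GET-modify-PUT round trip, assuming the requests library and that the field configuration lives under /configuration/udfs with preset child elements. Note that this sketch simply appends the preset at the end of the document; the schema may expect presets at a specific position.

```python
import requests
from xml.etree import ElementTree as ET

AUTH = ('apiuser', 'apipassword')    # placeholder credentials

def add_preset(field_uri, new_preset):
    """GET a UDF/custom field configuration, append a preset, and PUT it back."""
    cfg = ET.fromstring(requests.get(field_uri, auth=AUTH).text)
    if any(p.text == new_preset for p in cfg.findall('preset')):
        return                       # already present
    ET.SubElement(cfg, 'preset').text = new_preset
    requests.put(field_uri, data=ET.tostring(cfg), auth=AUTH,
                 headers={'Content-Type': 'application/xml'})

# e.g. add_preset('https://yourserver/api/v2/configuration/udfs/123', 'My new preset')
```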
How to copy the value of a UDF/custom field from source to destination (typically from the inputs of a process/step to the outputs) is a frequently asked question.
For example, suppose a process/step takes in libraries and tracks their normalization. In such a case, the input samples have a UDF/custom field that is used to track the library ID. Since the library ID changes, it is desirable for the output samples to also have this ID.
Use the API to gather the XML for the inputs, then copy the XML node relating to the UDF/custom field to the outputs.
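A sketch of this API approach in Python, assuming the requests library and that the field exists on both the inputs and the outputs (as the copyUDFs script below also requires):

```python
import requests
from xml.etree import ElementTree as ET

AUTH = ('apiuser', 'apipassword')    # placeholder credentials
NS = {'udf': 'http://genologics.com/ri/userdefined'}
UDF = '{http://genologics.com/ri/userdefined}field'

def get_xml(uri):
    return ET.fromstring(requests.get(uri, auth=AUTH).text)

def copy_udf(process_uri, field_name='Library ID'):
    """Copy a UDF value from each input artifact to its mapped analyte output."""
    proc = get_xml(process_uri)
    for m in proc.findall('input-output-map'):
        output = m.find('output')
        if output is None or output.get('output-type') != 'Analyte':
            continue
        src = get_xml(m.find('input').get('uri').split('?')[0])
        value = src.findtext('udf:field[@name="%s"]' % field_name, namespaces=NS)
        if value is None:
            continue                 # nothing to copy for this input
        out_uri = output.get('uri').split('?')[0]
        dst = get_xml(out_uri)
        field = dst.find('udf:field[@name="%s"]' % field_name, NS)
        if field is None:
            field = ET.SubElement(dst, UDF, {'name': field_name})
        field.text = value
        requests.put(out_uri, data=ET.tostring(dst), auth=AUTH,
                     headers={'Content-Type': 'application/xml'})
```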
Alternatively, use the out-of-the-box copyUDFs script, which Illumina provides as part of the NextGen Sequencing configuration.
The copyUDFs script is available in the ngs-extensions.jar archive*, and can be called from the EPP / automation parameter string.
*The archive file may be named differently, depending upon the version you are running.
Usage:
The UDF / custom field values to be copied are defined in the -f portion of the syntax. These values must be present on both the inputs and outputs of a process.
For example, suppose you wanted to use this script to copy the value of a UDF called Library ID:
The Library ID field must be defined on both inputs and outputs.
The -f flag is defined as follows:
To copy multiple UDF values from source to destination, list them in comma-separated form as part of the -f flag.
To copy Library ID and Organism from source to destination, use the following example:
When running the Aggregate QC step in Clarity LIMS, the QC pass and fail flags for the samples display in the Record Details screen.
This section explains how to use the API instead to find the samples that passed or failed QC aggregation.
Query the API and filter the results list based on the qc-flag parameter value. For more on filtering, see section.
To filter the list by QC flag with a value of PASSED, use the following example:
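The following query uses a placeholder hostname, with the qc-flag filter applied to the artifacts list resource:

https://yourserver/api/v2/artifacts?qc-flag=PASSED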
To find an individual QC flag result for an individual sample, use the LIMS ID of the sample:
Then search for the value of the <qc-flag> element in the payload returned for the artifact.
The <qc-flag> element of the input analyte (sample) artifact is sent into the Aggregate QC step.
To demonstrate this detail, review the following steps:
In the API, find a single analyte artifact (derived sample) that has passed QC. The XML QC flag value is PASSED.
In Clarity LIMS, find the same sample and change the value of the QC flag from PASSED to FAILED. Save the change.
In the API, find the sample again. See that the XML QC flag value is set to FAILED.
When a sequencing run is complete, it is often desirable to pass data to CASAVA for BCL conversion automatically rather than manually. This section proposes a method to configure this automation.
NOTE: This solution is not tested end-to-end on an instrument.
The proposed approach involves adding an automation trigger to the Sequencing step, such that it invokes a script that launches the BCL Conversion step.
However, because the BCL Conversion step does not run immediately, it is launched in a dormant state until the Sequencing step is complete.
The key event here is the Run Report that is created and attached to the Sequencing step. As the last event to occur in the step, the creation of this report is used to prompt the BCL Conversion step to 'wake up' from its dormant state and begin processing.
The following pseudocode describes the work that must occur within the script:
This solution requires a script that launches the BCL Conversion step via the API. The creation of such a script is covered in . This example covers only the functionality of the script, rather than the code.
In addition to the expected processURI, username, and password parameters/tokens, the script should accept another parameter (the LIMSID of the Run Report from the Sequencing step).
For example, the script can be invoked as follows:
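The script name and flags below are hypothetical placeholders (only the -r parameter is described later in this section); substitute those of your own script:

python launchBclConversion.py -l {processURI} -u {username} -p {password} -r {Run Report LIMSID}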
Use this syntax when configuring the command line on the Sequencing process/step.
Configure the automation so the script is automatically triggered when exiting the Record Details screen.
The BCL Conversion process is configured:
To take in a ResultFile input and generate a non-shared ResultFile output
With a process parameter of 'Standard,' which initiates the actual BCL conversion.
The script is passed the value '92-3771' as the -r parameter.
This is then converted to a full URI and becomes the input element of the following XML, which is POSTed to the /processes API resource:
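The payload itself is not reproduced here; a minimal sketch of such a process-execution POST might look like the following in Python. The element names follow the processexecution schema, and the technician LIMS ID of 1 is an assumption:

```python
import requests

BASE = 'https://yourserver/api/v2'   # placeholders
AUTH = ('apiuser', 'apipassword')

xml = '''<prx:process xmlns:prx="http://genologics.com/ri/processexecution">
  <type>BCL Conversion</type>
  <technician uri="%s/researchers/1"/>
  <input-output-map>
    <input uri="%s/artifacts/92-3771"/>
    <output type="ResultFile"/>
  </input-output-map>
</prx:process>''' % (BASE, BASE)

r = requests.post(BASE + '/processes', data=xml, auth=AUTH,
                  headers={'Content-Type': 'application/xml'})
r.raise_for_status()   # on success, the API returns the created process XML
```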
Update all URIs in the XML to point to the hostname and API version for the system.
Provide a valid URI for the lab scientist. There might be a user in the system with LIMS ID of '1'.
If the POST is successful, the API returns the valid XML for the created process.
Note: This scenario is one of the few occasions where the POST succeeds, yet returns XML that differs from the input XML. The results can be confusing, because a standard approach for validating whether a POST succeeded is to compare the returned XML with the input XML: if they differ, the POST is assumed to have failed. In this scenario, however, it did not fail.
Server-side configuration allows multiple filestores to be associated with entities (samples, projects, processes/steps, and so on) in BaseSpace Clarity LIMS.
This feature allows linking to large data files (such as results, images, and searches) on a different server, eliminating the need to move large files onto the Clarity LIMS filestore.
For example, sequencing instruments typically produce large result files. Attaching these files to the Sequencing step in Clarity LIMS has the following drawbacks:
It involves transferring the files to the Clarity LIMS filestore. The larger the file, the longer the transfer takes.
It requires a large amount of storage space as runs build up.
An alternative solution is to set up a remote filestore to be used as the results directory from which Clarity LIMS accesses the files directly.
To do this setup, three steps are required:
Set up HTTP, HTTPS, FTP, or SFTP access to the files and folders you wish to share.
Configure the Clarity LIMS server to recognize the URI of a file on the remote filestore.
POST information to Clarity LIMS, via the REST API, to reference the file from a Clarity LIMS entity (project, sample, process/step, result file, and so on).
BaseSpace Clarity LIMS can operate with many different forms of file servers – HTTP, HTTPS, FTP, and SFTP access are all supported.
It is your responsibility to set up this access. For HTTP, you may be interested in httpd or HFS for HTTP file serving.
To track a new remote filestore, Clarity LIMS requires four database properties: directory, host, port, and scheme.
The four properties share a base name, but have different suffixes attached (dir, host, port, scheme). These suffixes are summarized in the following table.
The base name can be anything. Clarity LIMS finds any property ending in .scheme and uses its base name to find the other properties.
If necessary, add the last three properties listed in the table (with the .domain, .user, and .password suffixes) to specify a domain, username, and password to be used when accessing files.
Clarity LIMS v5 and later—For the property changes to take effect, Tomcat must be restarted.
Use the omxprops-ConfigTool.jar to create, update, and retrieve values of the database properties. This tool is found at the following location: /opt/gls/clarity/tools/propertytool
To create a property, use the following examples:
NOTE: These properties may not be global properties. Do not use the -g flag here.
To get the value of an existing property:
To update the value of an existing property:
To encrypt a password:
NOTE: To set a property to the encrypted result, wrap the result in ENC(), that is, set the value to ENC(<encrypted result>).
The following example maps a remote HTTP URI: http://YourHTTPHost:80/limsdata/LegacyFile.RAW
In this case, the base name for the properties is http-lims-files.
Steps
As the glsjboss user, access the omxprops property tool in /opt/gls/clarity/tools/propertytool.
Add the following dir, host, port, and scheme properties to the server from the command line:
In the example above, the http-lims-files.dir property value is /limsdata. Any file in http://YourHTTPHost/limsdata/ is available to be referenced by BaseSpace Clarity LIMS.
To make all files on the web server available, use / as the property value, for example:
After the filestore properties are added to Clarity LIMS (and JBoss/Tomcat has been restarted, as applicable), you can attach the files to Clarity LIMS.
To attach the files to Clarity LIMS:
POST to http://hostname/api/v2/files, with the content-location tag pointing to the remote filestore.
An example XML POST is provided, using the filestore created in the previous example:
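A sketch of such a POST in Python; the artifact to attach to (92-1234) and the original-location value are placeholders:

```python
import requests

BASE = 'https://yourserver/api/v2'   # placeholders
AUTH = ('apiuser', 'apipassword')

file_xml = '''<file:file xmlns:file="http://genologics.com/ri/file">
  <attached-to>%s/artifacts/92-1234</attached-to>
  <content-location>http://YourHTTPHost:80/limsdata/LegacyFile.RAW</content-location>
  <original-location>http://YourHTTPHost:80/limsdata/LegacyFile.RAW</original-location>
</file:file>''' % BASE

r = requests.post(BASE + '/files', data=file_xml, auth=AUTH,
                  headers={'Content-Type': 'application/xml'})
r.raise_for_status()
```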
Results
The file is now downloadable directly from Clarity LIMS.
Any entity that can have a file attached to it may be referenced in the attached-to element.
For more information on working with files, see Work with Files.
| Property Name | Usage | Example Value | Required | Description |
|---|---|---|---|---|
| Directory | ${baseName}.dir | /limsdata | True | The highest level directory in which it is valid to access files. Files outside this directory are not attached. |
| Hostname/IP | ${baseName}.host | YourHTTPHost | True | The hostname or IP address to use when accessing the files. |
| Port | ${baseName}.port | 80 | True | The port to use when accessing the files. |
| Scheme | ${baseName}.scheme | http | True | The scheme of the URI used to access the files. Examples are HTTP, HTTPS, FTP, and SFTP. |
| Domain | ${baseName}.domain | YourAuthDomain | False | The domain to use when authenticating access to the files. |
| Username | ${baseName}.user | fileUser | False | The username to use when authenticating access to the files. |
| Password | ${baseName}.password | filePassword | False | The password to use when authenticating access to the files. |
It is highly recommended that you encrypt your password. See the following section for details.
This topic explains how to:
Detect when files have been uploaded.
Extract the key information that might comprise a notification.
The Files API Resource
The key resource to investigate is the files resource, which provides a listing of files within the system.
On a test system, accessing the files resource as follows:
produces the following output:
Although not particularly useful in itself, the files resource becomes more interesting when filtered to include only files uploaded after a specified date-time, and only those with a published status of 'true'.
For example, the following URI:
produces this output on a test system:
This outcome is much more manageable. Files uploaded via the Collaborations Interface inherently have a published status of 'true', so this status can be used to exclude regular files uploaded to the LIMS via other methods and interfaces.
By following the URIs to retrieve the full XML representations of these files, the output is similar to the following:
and:
Retrieve the associated project/sample, and extract the names and/or IDs to embed into the notification, by following the URI in the 'attached-to' elements.
In this case, the following result is produced:
and:
A script must be run periodically (hourly or daily) that queries the files resource for files that have a published status of 'true' and were last modified in the period of interest.
After this list of files is retrieved, the following pseudocode can be applied:
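A sketch of such a script, assuming the requests library; the filter parameter names are assumptions and should be checked against your files resource:

```python
import requests
from xml.etree import ElementTree as ET

BASE = 'https://yourserver/api/v2'   # placeholders
AUTH = ('apiuser', 'apipassword')

def get_xml(uri):
    return ET.fromstring(requests.get(uri, auth=AUTH).text)

def notify_new_files(since_iso):
    """List recently modified published files and report what they are attached to."""
    listing = get_xml('%s/files?published=true&last-modified=%s' % (BASE, since_iso))
    for link in listing.findall('file'):
        f = get_xml(link.get('uri'))
        attached_uri = f.findtext('attached-to')
        entity = get_xml(attached_uri)   # project, sample artifact, and so on
        print('File %s is attached to %s (%s)' % (
            f.findtext('original-location'), entity.findtext('name'), attached_uri))

# e.g. run hourly from cron:
# notify_new_files('2014-09-10T00:00:00Z')
```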
An example derived from the above XML could lead to the following notifications: