1 of 47

Application Examples

About the Application Examples

The Clarity LIMS Application Examples use example scripts to help you learn how to work with REST and EPP (automation) scripts. Application examples are larger scripts that teach you how to do something useful. The purpose of these examples is to get you scripting quickly.

Be aware that the application examples differ from purchased Clarity LIMS integrations:

Purchased LIMS integrations cannot be modified. Because Illumina guarantees specific functionality when using integrations, any code or script modifications must only be performed by the Illumina Integrations or Support teams.
Application examples are learning tools that are intended to be modified. The application examples follow best practices in terms of over-all script construction; however, their exact functions are less critical than the integrations.

The API documentation includes the terms External Program Integration Plug-in (EPP) and EPP node. As of Clarity LIMS v5.0, these terms are deprecated. The term EPP has been replaced with automation. EPP node is referred to as the Automation Worker or Automation Worker node. These components are used to trigger and run scripts, typically after lab activities are recorded in the LIMS.

The best way to get started is to download the example script and try it out. Once you have seen how the script works, you can dissect it and use the pieces to create your own script.

Python API Library (glsapiutil.py) Location

The latest glsapiutil and glsapiutil3 Python libraries can be found on the GitHub page. Download links to these files will be removed from example documentation over time.

Scripts That Help Automate Steps

Route Artifacts Based Off a Template File

Workflows do not always have a linear configuration. There are situations when samples progressing through a workflow must branch off and be directed to different stages, or even different workflows. You can add samples to a queue using the /api/{version}/route/artifacts endpoint.

In the lab, decisions are made dynamically. At the initiation of this example workflow, it is not known whether the sample is destined to be sequenced on a HiSeq or MiSeq instrument. As a result, the derived samples must be routed to a different workflow stage midworkflow.

This programmatic approach to queuing samples can be used many times throughout a workflow. This approach eliminates the need to write multiple complex scripts - each of which must be maintained over time.

This example describes a Python script that takes instruction from a .csv template file. Removing any hard-coded reference to specific UDFs / custom fields or workflow stages from the script allows for easy configuration and support. All business logic can be implemented solely through generation of a template and EPP / automation command.

The template contains UDF / custom field names and values. If samples in the active step have fields with matching values, they are queued for the specified workflow stage.

Step Configuration

The step is configured to display the two checkbox analyte UDFs / derived sample custom fields. These fields are used to select the destination workflow stages for each derived sample / analyte. You can queue the sample for HiSeq, MiSeq, or both.

In the preceding example of the Sample Details screen, the user selected:

Two samples to be queued for HiSeq
Two samples for MiSeq
Two samples where routing is not selected
Two samples for both HiSeq and MiSeq

In the preceding example of the Step Details screen, the user selected:

A step level UDF / custom field to assign all samples to a given step. All output samples are routed to the associated step. Samples can be duplicated in the first step of the following protocol if:
Samples are routed as the last step of a protocol and;
The action of the next step is Mark Protocol as Complete.

The duplication is due to the Next Steps action queuing the artifact for its default destination in addition to the script routing the same artifact.

The solution here is to set the default Next Steps action to Remove from Workflow instead. This action can be automated using the Lab Logic Toolkit or the API.

LIMS v6.x

On the protocol configuration screen, make sure that Start Next Step is set to Automatic.

Template Configuration

Each UDF name and value pair correspond to a workflow and stage combination as the destination to route the artifact.

The UDF name in the template could be configured as a step or analyte level UDF. If a step level UDF has a value that is specified in the template, all analytes in the step are routed.
The UDFs can be of any type (Numeric, Text, Checkbox ). If a checkbox UDF is used, the available values are true or false.
UDF values in the template are not case-sensitive.

The template requires four columns:

UDF_NAME
UDF_VALUE
WORKFLOW_NAME
STAGE_NAME

For this example, the template values would be:

UDF_NAME, UDF_VALUE, WF_NAME, STAGE_NAME
Route all samples to: Step A, Workflow A, Stage A
Route all samples to: Step B, Workflow B, Stage B
Route A, True, Workflow A, Stage A
Go to HiSeq, True, TruSeq Nano DNA for HiSeq 5.0, Library Normalization (Illumina SBS) 5.0
Go to MiSeq, True, TruSeq DNA PCR-Free for MiSeq 5.0, Sort MiSeq Samples (MiSeq) 5.0

This script might be used numerous times in different EPP / automation scripts, with each referencing a different template.

Using a Template String Instead of a Template File

Due to restrictions on file server access, this script accepts routing template instructions using a string EPP parameter with lines separated by a newline character ('\n'). The following example shows how this parameter string would be added to represent the previous template:

"Route all samples to:, Step A, Workflow A, Stage A\nRoute all samples to:, Step B, Workflow B, Stage B\nRoute A, True, Workflow A, Stage A\nGo to HiSeq, True, TruSeq Nano DNA for HiSeq 5.0, Library Normalization (Illumina SBS) 5.0\nGo to MiSeq, True, TruSeq DNA PCR-Free for MiSeq 5.0, Sort MiSeq Samples (MiSeq) 5.0"

Required Parameters

The EPP/automation that calls the script must contain the following parameters:

*Either --template or --template_string is required. If both are provided, --template_string is used.

Optional Parameter

When the --input parameter is used, the script routes input artifacts instead of the default output artifacts. UDF values of the input artifacts (instead of outputs) are against the template file.

An example of the full syntax to invoke the script is as follows:

bash -c "/usr/bin/python /opt/gls/clarity/customextensions/Route_by_template.py -u {username} -p {password} -s {stepURI:v2} -t /opt/gls/clarity/customextensions/routing_template.csv -l {compoundOutputFileLuid0}"

Or, if you wish to route the inputs instead of outputs:

bash -c "/usr/bin/python /opt/gls/clarity/customextensions/Route_by_template.py --input -u {username} -p {password} -s {stepURI:v2} -t /opt/gls/clarity/customextensions/routing_template.csv -l {compoundOutputFileLuid0}"

User Interaction

When the Record Details screen is entered, the UDF / custom field checkboxes or drop-down options specify to which workflow/stage combination each derived sample is sent.

About the Code

The first important piece of information required is the URI of the destination stage. There is a unique stage URI assigned to each workflow and stage combination.

A stage URI can change across LIMS instances (such as switching from Dev to Prod). Therefore, the script gathers the stage URI from the workflow and stage name. This process occurs even when the workflows are identical.

The main method in the script is routeAnalytes() and it carries out several operations:

Gathers the information for the process that triggered the script, including output (or input) analytes.
For each analyte, evaluates which UDFs have been set, and adds the analyte to a list of analytes to route.
Creates the XML message for each stage.
Does a POST to the REST API to add the analytes to the queue in Clarity LIMS

This example, while in itself useful, also serves to demonstrate a more general concept. Routing artifacts is valuable in any situation where a sample must be queued for a stage outside of the usual order of a workflow. This routing is applicable even when routing newly submitted samples to the first stage in a workflow.

For more information, see the artifact/route REST API documentation.

Assumptions and Notes

You are running a version of Python supported by Clarity LIMS, as documented in the Clarity LIMS Technical Requirements.
The attached files are placed on the LIMS server, in the /opt/gls/clarity/customextensions folder.
The Python API Library (glsapiutil.py) is placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder. You can download the latest glsapiutil library from the GitHub page.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

routing_template.csv:

route_by_template.py:

Invoking bcl2fastq from BCL Conversion and Demultiplexing Step

Illumina sequencing protocols include a BCL Conversion and Demultiplexing step. This stage allows you to select the command options for running bcl2fastq2. The bcl2fastq must be initiated through a command-line call on the BCL server.

This example allows you to initiate the bcl2fastq2 conversion software by the selection of a button in BaseSpace Clarity LIMS.

Step Configuration

The "out of the box" step is configured to include the following UDFs / custom fields. You can select these options on the Record Details screen. You can also configure additional custom options.

About the Code

The main method in the script is convertData(). This method performs several operations:

The script determines the run folder. The link to the run folder is attached as a result file to the sequencing step.
- The script searches for the appropriate sequencing step and downloads the result file containing the link.
- The script changes directories into the run folder.
The script gathers all the step level UDFs / custom fields from the BCL Conversion and Demultiplexing step.
Using the information gathered, the script builds the command that is executed on the BCL server. The command consists of two parts: cd (changing directory) into the run folder.
- Executing the bcl2fastq command with the selected options.

Script Configuration

This script must be copied to the BCL server because the script is executed on the BCL server remote Automated Informatics (AI) / Automation Worker (AW) node.

By default, the remote AI / AW node does not come with a custom extensions folder. Therefore, if this script is the first script on the server you can create a customextensions folder in /opt/gls/.

It is not recommended to have the customextensions folder in the remoteai folder as the remoteaifolder can get overwritten.

When uploading the script, ensure the following:

The path to the bcl2fastq application is correct (line 17)
The sequencing process type matches exactly the name of the process type / master step the artifact went through (the -d parameter)
The customextensions folder contains both glsapiutil.py and glsfileutil.py modules. See #assumptions-and-notes.

Parameters

The script accepts the following parameters:

An example of the full syntax to invoke the script is as follows:

/usr/bin/python /opt/gls/customextensions/ kickoff_bcl2fastq2.py -u {username} -p {password} -s {stepURI:v2} -d 'Illumina Sequencing (Illumina SBS) 5.0'

Assumptions and Notes

You are running a version of Python supported by Clarity LIMS, as documented in the Clarity LIMS Technical Requirements.
The attached files are placed on the LIMS server, in the /opt/gls/clarity/customextensions folder.
The Python API Library (glsapiutil.py) is placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder. You can download the latest glsapiutil library from the GitHub page.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

kickoff_bcl2fastq2.py:

glsfileutil.py:

Email Notifications

Stakeholders are interested in the progress of samples as they move through a workflow. E-mail alerts of events can provide them with real-time notifications.

Some possible uses of notifications include the following:

Completion of a workflow for billing department
Manager review requests
Notice of new files added via the Collaborations Lablink Interface
Updates on samples that are not following a standard path through a workflow

Clarity LIMS provides a simple way of accomplishing this using a combination of the Clarity LIMS API, EPP / automation triggers, and Simple Mail Transfer Protocol (SMTP).

Solution

The send_email() method uses the Python smptlib module to create a Simple Mail Transfer Protocol (SMPT) object to build and send the email. The attached script does the following:

Gathers relevant data from the Clarity LIMS API endpoints.
Generates an email body according to a template.
Calls the send_email() function.

Mail Server Configuration

Connect to Clarity SMTP with:

host='localhost', port=25

Because of server restrictions, the script can send emails from only:

noreply.clarity@illumina.com

Automation Parameters

The automation / EPP command is configured to pass the following parameters:

Example command line:

python /opt/gls/clarity/customextensions/emails_from_Clarity.py -u {username} -p {password} -s {stepURI}

User Interaction

The script can be executed using a Clarity LIMS automation / EPP command, and trigged by one of the following methods:
- Manually, via a button on the Record Details screen.
- Automatically, at a step milestone (on entry to or exit from a screen).
The script can also be triggered outside of a Clarity LIMS workflow, using a time-based job scheduler such as cron.

Assumptions and Notes

You are running a version of Python that is supported by Clarity LIMS, as documented in the Clarity LIMS Technical Requirements.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

emails_from_clarity.py:

emails_attachmentPOC.py:

Finishing the Current Step and Starting the Next

Many facilities have linear, high-throughput workflows. Finishing the current step and finding the samples that have been queued for the next step. Then putting the samples into the virtual ice bucket. Then actually starting the next step. All of these steps can be seen as unnecessary overhead.

This solution provides a methodology that allows a script to finish the current step, and start the next.

We have already illustrated how a step can be run in its entirety via the API. Finishing one step and starting the next would seem a straightforward exercise, but, as most groups have discovered, there is a catch.

Consider the step to be autocompleted as Step A, with Step B being the next step to be autostarted.

The script to complete Step A and start Step B will be triggered from Step A (in the solution defined below, it will be triggered manually by clicking a button on the Record Details screen).
As Step A invokes a script:
- The step itself cannot be completed - because the script has not completed successfully. - and -
- The script cannot successfully complete unless the step has been completed.

To break this circle, the script initiated from Step A must invoke a second script. The first script will then complete successfully, and the second script will be responsible for finishing Step A and started Step B.

Solution

Invoking script: runLater.py

The runLater.py script launches the subsequent script (finishStep.py) by handing it off to the Linux 'at' command (see #h_3d84355e-7aa7-45b6-b8e8-54029669870d). This process effectively 'disconnects' the execution of thefinishStep.py script from runLater.py, allowing runLater.py to complete successfully, and then Step A to be completed.

Parameters

The script accepts the following parameter:

-c   The command to be passed to the at command (Required)

Although this script accepts just a single parameter, the value of this parameter is quite complex, since it is the full command line and parameters that need to be passed to the finishStep.py script.

Following is an example of a complete EPP / automation command-line string that will invoke runLater.py, and in turn finishStep.py:

/usr/bin/python /opt/gls/clarity/customextensions/runLater.py -c "/usr/bin/python /opt/gls/clarity/customextensions/finishStep.py -u {username} -p {password} -s {stepURI:v2:http} -a START"

Invoking script: finishStep.py

The finishStep.py script completes and starts Steps A and B respectively.

Parameters

The script accepts the following parameters:

About the code

The first thing that this script does is go to sleep for five seconds before the main method is called. This delay allows the application to detect that the invoking script has completed successfully, and allow this script to work. The duration of this delay can be adjusted to allow for your server load, etc.

The central method (finishStep()) assumes that the current Step (Step A) is on the Record Details screen. Next, the method callsadvanceStep() to move the step onto the screen to allow the next step to be selected. By default, the routeAnalytes() method selects the first available next step in the protocol / workflow for each analyte. Then theadvanceStep() method is called, which completes Step A.

If the script has been passed a value of 'START' for the -a parameter, the next step (Step B) will now be started. ThestartNextStep() method handles this process, carrying out the following high-level actions:

It determines the default output container type for Step B.
It invokes an instance of Step B on the analytes routed to it via the routeAnalytes() method.
If the Step B instance was invoked successfully, it gathers the LUID of the step.

At this point, it is possible that the apiuser user has started the step. This situation is not ideal. We would like to update it so that it shows up in the Work in Progress section of the GUI for the user that initiated Step A.

This script makes a note of the current user and then updates the new instance of Step B with the current user. This update can fail if Step B has mandatory fields that must be populated on the Record Details screen (see #h_3d84355e-7aa7-45b6-b8e8-54029669870d). It can also fail if the step has mandatory reagent kits that must be populated on the Record Details screen.

If the update does fail, the step has been started, but the apiuser user still owns it. It can be continued as normal.

Assumptions

The attached files are placed on the Clarity LIMS server, in the following location: opt/gls/clarity/customextensions
The attached files are readable by the glsai user.
Updating the HOSTNAME global variable such that it points to your Clarity LIMS server is required.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.
The scripts use rudimentary logging. After the scripts are installed and validated, these logs are of limited value, and their creation can likely be removed from the scripts.

Notes

The Linux 'at' command might not be installed on your Missing variable reference server. For installation instructions, refer to the Linux documentation.
For a solution to this issue, see Validating Process/Step Level UDFs.

Attachments

runLater.py:

clarityHelpers.py:

finishStep.py:

Adding Downstream Samples to Additional Workflows

When processing samples, there can be circumstances in which you must add downstream samples to additional workflows. This sample addition is not easy to achieve using the Server Name interfaces, but is easy to do via the API.

Solution

This example provides a Python script that can be used to add samples to an additional workflow step. The example also includes information on the key API interactions involved.

It is the outputs, not the inputs, of the process that are added to the workflow step. (If you would like to add the inputs, changing this step is simple.)
This example is an add function, not a move. If you would like to remove the samples from the current workflow, you can do arrange to do so by building an <assign> element.
The process is configured to produce analyte outputs, and not result files.

Parameters

The script accepts the following parameters:

About the Code

The step URI (-s) parameter is used to report a meaningful message and status back to the user. These reports depend upon the outcome of the script.

Initially, the parameters are gathered and the helper object defined in the glsapiutil.py has been instantiated and initialized. Its methods can be called to take care of the RESTful GET/PUT/POST functionality. The script calls the following functions:

Once the parameters have been gathered, and the helper object defined in the glsapiutil.py has been instantiated and initialized, its methods can be called to take care of the RESTful GET/PUT/POST functionality, leaving the script to call the following functions:

getStageURI()
routeAnalytes().

getStageURI() function

The getStageURI() function converts the workflowname and stagename parameters into a URI that is used as the assign element, for example:

routeAnalytes() function

The routeAnalytes() function gathers the outputs of the process, and harvests their URIs to populate in the <artifact> elements. This function also uses the reporting mechanism based upon the step URI parameter.

API resource

The crucial resource in this example is the route/artifacts resource. This API endpoint can only be POSTed to, and accepts XML of the following form:

For more information, refer to the route/artifacts REST API documentation. Also useful are the configuration/workflow resources, both single and list.

Assumptions and Notes

The attached file is placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder.
You must update the HOSTNAME global variable such that it points to your Clarity LIMS server.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

addToStep.py:

Advancing/Completing a Protocol Step via the API

This topic forms a natural partner to the application example. When protocol steps are being initiated programmatically, we must know how to advance the step through the various states to completion.

Solution

Advancing a step is actually quite a simple task. It requires the use of the steps/advance API endpoint - in fact, little else is needed.

Let us consider a partially completed step with ID 24-1234. To advance the step to the next state, the following is required:

Perform a GET to the resource .../api/v2/steps/24-1234, saving the XML response.
POST the XML from step 1 to .../api/v2/steps/24-1234/advance, and monitor the returned XML for success.
If successful, the protocol step advances to its next state, just as if the lab scientist had advanced it via the Clarity LIMS interface.
Advancing a protocol step that is in its final state completes the step.

The Python advanceStep (STEP_URI) method shown below advances a step through its various states. To achieve this, the URI of the step is passed to the method to be advanced/completed.

Assumptions and Notes

The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Setting a Default Next Action

Often, a protocol step will have just a single 'next action' that is required to continue the workflow. In such cases, it can be desirable to automatically set the default action of the next step.

This example script shows how this task can be achieved programmatically.

Solution

Step configuration

The step is configured to invoke a script that sets the default next action when you exit the Record Details screen of the step.

Parameters

The automation command is configured to pass the following parameters to the script:

An example of the full syntax to invoke the script is as follows:

User Interaction

When the lab scientist exits the Record Details screen, the script is invoked and the following message displays:

If the script completes successfully, the LIMS displays the default next action. If the current step is the final step of the protocol, it is instead marked as complete:

Manual Intervention

At this point the lab scientist is able to manually intervene and reroute failed samples accordingly.

NOTE: it is not possible to prevent the user from selecting a particular next-step action. However, it is possible to add a validation script that checks the next actions that have been selected by the user against a list of valid choices.

If the selected next step is not a valid choice, you can configure the script such that it takes one of the following actions:

Replaces the next steps with steps that suit your business rules.
Issues a warning, and prevents the step from completing until the user has changed the next step according to your business rules.

About the Code

The main method in the script is routeAnalytes(). The method in turn carries out several operations:

The actions resource of the current protocol step is investigated.
The possible next step(s) are identified.
If there is no next step, the current step is set to 'Mark protocol as complete.'
If there are multiple next steps, the first is used to set the next action.

Assumptions and Notes

The attached file is placed on the LIMS server, in the /opt/gls/clarity/customextensions folder.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

setDefaultNextAction.py:

Automatic Placement of Samples Based on Input Plate Map (Multiple Plates)

The BaseSpace Clarity LIMS interface offers tremendous flexibility when placing the outputs of a step into new containers. Consider if your protocol step always involves the placement of samples using the plate map of the input plate. If yes, it makes sense to automate sample placement.

This example provides a script that allows sample 'autoplacement' to occur, and describes how the script can be triggered.

This script is similar in concept to the example. The difference being that this updated example is able to work with multiple input plates, which can be of different types.

Solution

In this example, samples are placed according to the following logic:

The process type / master step is configured to produce just one output analyte (derived sample) for every input analyte.
The output analytes are placed on the same type of container as the corresponding source plate.
Each occupied well on the source plate populates the corresponding well on the destination plate.
The destination plate is named such that it has the text 'DEST-' prepended to the name of its corresponding source plate.

Step Configuration

In this example, the step is configured to invoke the script on entry to the Sample Placement screen.

Parameters

The EPP / automation command is configured to pass the following parameters:

An example of the full syntax to invoke the script is as follows:

About the Code

The main method in the script is autoPlace(). This method executes several operations:

The creation of the destination plates is effected by calls to createContainer().
It harvests just enough information so that the objects required by the subsequent code can retrieve the required objects using the batch API operations. This involves using some additional code to build and manage the cache of objects retrieved in the batch operations, namely:
- cacheArtifact()
- prepareCache()
- getArtifact()
The cached analytes are then accessed. After the source well to which the analyte maps has been determined, the output placement can be set. This information is presented in XML that can be POSTed back to the server in the format required for the placements resource.
After all the analytes have been processed, the placements XML is further supplemented with required information, and POSTed to the ../steps/<stepID>/placements API resource.
Finally, a meaningful message is reported back to the user via the ../steps/<stepID>/programstatus API resource.

Assumptions and Notes

The attached file is placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder.
The HOSTNAME global variable must be updated so that it points to your Clarity LIMS server.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

autoplaceSamplesDefaultMulti.py:

Automatic Placement of Samples Based on Input Plate Map

The Clarity LIMS interface offers tremendous flexibility when placing the outputs of a protocol step into new containers. Consider if your protocol step always involves the placement of samples using the plate map of the input plate. If yes, it makes sense to automate sample placement.

This example provides a script that allows sample 'autoplacement' to occur. It also describes how the script can be triggered, and what the lab scientist sees when the script is running.

For an example script that automates sample placement using multiple plates, see the Automatic Placement of Samples Based on Input Plate Map (Multiple Plates) example.

Solution

In this example, samples are placed according to the following logic:

The step produces one output sample for every input sample.
The output samples are placed on a 96 well plate.
Each occupied well on the source 96 well place populates the corresponding well on the destination 96 well plate.

Step Configuration

In this example, the step is configured to invoke the script on entry to the Sample Placement screen.

Parameters

The EPP / automation command is configured to pass the following parameters:

An example of the full syntax to invoke the script is as follows:

/usr/bin/python /opt/gls/clarity/customextensions/autoplaceSamplesDefault.py -l 122-7953 -u admin -p securepassword -s http://192.168.9.123:8080/api/v2/steps/122-5601

User Interaction

When the script has completed, the rightmost Placed Samples area of the Placement screen will display the container of auto-placed samples:

At this point the lab scientist can review the constituents of the container, and complete the step as normal.

About the Code

The main method in the script is autoPlace(). This method in turn carries out several operations:

A call to createContainer() prompts the creation of the destination 96 well plate.
The method harvests enough information so that the objects required by the subsequent code can retrieve the required objects using the 'batch' API operations. You add additional code to build and manage the cache of objects retrieved in the batch operations:
- cacheArtifact()
- prepareCache()
- getArtifact()
The cached analytes are accessed and the source well to which each analyte maps is determined.
Output placement can then be set. This information is presented in XML that can be POSTed back to the server in the correct format required for the placements resource.
After the analytes have been processed, the placements XML is further supplemented with required information, and POSTed to the ../steps/<stepID>/placements API resource.
Finally, a meaningful message is reported back to the user via the ../steps/<stepID>/programstatus API resource.

Assumptions and Notes

The attached file is placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder.
The Python API Library (glsapiutil.py) is placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder. You can download the latest glsapiutil library from our GitHub page.
The HOSTNAME global variable must be updated so that it points to your Clarity LIMS server.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

autoplaceSamplesDefault.py:

Publishing Files to LabLink

Some steps produce data that you would like your collaborators to have access to.

This example provides an alternative method and uses a script to publish the files programmatically via the API.

Solution

In this example, suppose we have a protocol step, based upon a Sanger/capillary sequencing workflow, that produces up to two files per sample (a .seq and a .ab1 file).

Our example script runs at the end of the protocol step. The script publishes the output files so that they are available to collaborators in the LabLink Collaborations Interface.

Parameters

The EPP / automation command is configured to pass the following parameters:

An example of the full syntax used to invoke the script is as follows:

User Interaction

After the script has completed its execution, collaborators are able to view and download the files from the LabLink Collaborations Interface.

About the code

The main method used in the script is publishFiles(). The method in turn carries out several operations:

The limsids of the steps' artifacts are gathered, and the artifacts are retrieved, in a single transaction using the 'batch' method.
Each artifact is investigated. If there is an associated file resource, its limsid is stored.
The files resources are retrieved in a single transaction using the 'batch' method.
For each file resource, the value of the <is-published> node is set to 'true'.
The files resources are saved back to Clarity LIMS in a single transaction using the 'batch' method.

Assumptions & notes

The attached file is placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

publishFilesToLabLink.py:

publishFilesToLabLink_v2.py:

Automatic Pooling Based on a Sample UDF/Custom Field

In some facilities, when samples are initially submitted into BaseSpace Clarity LIMS, it has already been determined which samples are combined to give pooled libraries. In such cases, it is desirable to automate the pooling of samples within the protocol step. Doing so means that the lab scientist does not have to manually pool the samples in Clarity LIMS interface. This automatic pooling saves time and effort and reduces errors.

Solution

This example provides a script that allows 'autopooling' to occur. It also describes how the script can be triggered, and what the lab scientist sees when the script is running.

The attached script relies upon a user-defined field (UDF) / custom field, named Pooling Group, at the analyte (sample) level.

This UDF is used to determine the constitution of each pool. This determination makes sure that samples combined to create a pool having the name of the Pooling Group value all have a common Pooling Group value.

For example, consider the Operations Interface (LIMS v4.x & earlier) Samples list shown below. The highlighted samples have a common Pooling Group value. Therefore, we can expect that they will be combined to create a pool named 210131122-pg1.

Step Configuration

In this example, the Pooling protocol step is configured to invoke the script as soon as the user enters the step's Pooling screen.

Parameters

The EPP / automation command is configured to pass the following parameters:

An example of the full syntax to invoke the script is as follows:

/usr/bin/python /opt/gls/clarity/customextensions/autopoolSamples.py -l 122-7953 -u admin -p securepassword -s http://192.168.9.123:8080/api/v2/steps/122-5601

User Interaction

When the lab scientist enters the Pooling screen, a message similar to the following displays:

When the script has completed, the rightmost Placed Samples area of the Placement screen displays the auto-created pools:

At this point, the lab scientist can review the constituents of each pool, and then complete the protocol step as normal.

About the Code

The main methods of interest are autoPool() and getPoolingGroup().

The autoPool() method harvests just enough information so that the objects required by the subsequent code in the method can retrieve the required objects using the 'batch' API operations. This involves using additional code to build and manage the cache of objects retrieved in the batch operations, namely:
- cacheArtifact()
- prepareCache()
- getArtifact()
After the cache of objects has been built, each artifact is linked to its submitted sample. The getPoolingGroup function harvests the Pooling Group UDF value of the corresponding submitted sample.
The script now understands which artifacts are to be grouped to produce the requested pools. An appropriate XML payload is constructed and then POSTed to the ../steps/<stepID>/placements API resource.

Assumptions and Notes

The attached file is placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder.
The Python API Library (glsapiutil.py) is placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder. You can download the latest glsapiutil library from our GitHub page.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

autopoolSamples.py:

Completing a Step Programmatically

This method supersedes the use of the processes API endpoint.

The capacity for completing a step programmatically, without having to open the BaseSpace Clarity LIMS web interface, allows for rapid validation of protocols. This method results in streamlined workflows for highly structured lab environments dealing with high throughput.

This example uses the /api/v2/steps endpoint, which allows for more controlled execution of steps. In contrast, a process can be executed using the api/v2/processes endpoint with only one POST. This ability is demonstrated in the Process Execution with EPP/Automation Support example.

Solution

The Clarity LIMS API allows for each aspect of a step to be completed programatically. Combining the capabilities of the API into one script allows for the completion of a step with one click.

Step Configuration

This example was created for non-pooling, non-indexing process types.

Parameters

The script accepts the following parameters:

An example of the full syntax to invoke the script is as follows:

/usr/bin/python /opt/gls/clarity/customextensions/autocomplete-wholestep.py -p apipassword -u apiuser -s https://demo-4-1.claritylims.com/api/v2/

User Interaction

The script contains several hard coded variables, as shown in the following example.

step_config_uri = "https://demo-4-1.claritylims.com/api/v2/configuration/protocols/551/steps/1003"
step_config = "simple step"
queue = '1003'

containerType = "96 well plate"
reagentCat = ""
replicates = 40

nextAction = 'nextstep'
nextStepURI = 'https://demo-4-1.claritylims.com/api/v2/configuration/protocols/551/steps/1004'

reagent_lot_limsid = ""

step_config_uri Is the stage that is automatically completed. Because this script is starting the step, there is no step limsid needed as input parameter to the script. After the script begins the step, it gathers the step limsid from the APIs response to the step-creation post.

queue = '1003'

About the Code

The main() method in the script carries out the following operations:

startStep()
addLot()
addAction()
addPlacement()
advanceStep()

Each of these functions creates an XML payload and interacts with the Clarity LIMS API to complete an activity that a lab user would be doing in the Clarity LIMS interface.

startStep()

This function creates a 'stp:step-creation' payload.

As written, the script includes all the analytes in the queue for the specified stage.

addLot()

This function creates a 'stp:lots' payload. This may be skipped if the process does not require reagent lot selection.

addAction()

This function creates a 'stp:actions' payload. As written, all output analytes are assigned to the same 'next-action'. To see the options available as next actions, see the REST API documentation: Type action-type:

NOTE: This example only supports the following next-actions: 'nextstep', 'remove', 'repeat'.

addPlacement()

This function creates a 'stp:placements' payload.

In this example, it is not important where the artifacts are placed, so the analytes are assigned randomly to a well location.

This function relies on the createContainer function, since a step producing replicate analytes may not create enough on-the-fly containers to place all out the output artifacts.

advanceStep()

This function advances the current-state for a step. The current-state is an attribute that is found at the /api/v2/steps/{limsid} endpoint. It is a representation of the page that you see in the user interface. For more information, see API Portal and search for the REST API documentation relating to the /{version}/steps/{limsid}/advance endpoint.

This function creates a 'stp:placements' payload. POSTing this payload to steps/{limsid}/advance is the API equivalent of moving to the next page of the GUI, with the final advance post completing the step.

Known Issues

There is a known bug with advance endpoint that prevents a complete end-to-end programatic progression through a pooling step.

Assumptions and Notes

You are running a version of Python that is supported by Clarity LIMS, as documented in the Clarity LIMS Technical Requirements.
The attached file is placed on the LIMS server, in the /opt/gls/clarity/customextensions folder.
The Python API Library (glsapiutil.py) is placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder. You can download the latest glsapiutil library from our GitHub page.
The HOSTNAME global variable must be updated so that it points to your Clarity LIMS server.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.Attachments

Attachments

autocomplete-wholestep.py:

Automatic Sample Placement into Existing Containers

There are often cases where empty containers are received and added into Clarity LIMS before being used in a protocol. This application example describes how to use the API to place samples into existing containers automatically. The application uses a CSV file that describes the mapping between the sample and its destination container.

Furthermore, the API allows accessioning into multiple container categories, something that is not possible through the web interface.

Prerequisites

If you use Python version 2.6x, you must install the argparse package. Python 2.7 and later include this package by default.
Also make sure that you have the latest glsapiutil.py Python API library on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder. You can download the latest glsapiutil library from our GitHub page.
Check the list of allowed containers for the step and make sure that all expected container categories are present. The API cannot place samples into containers that are not allowed for the step!

Structure of the Input File

The suggested input format is a four-column CSV with the following columns:

Sample Name, Container Category, Container Name, Well Position

The sample name should match the name as shown in the Ice Bucket/Queue screen.

Step Setup Screen Configuration

First, make sure that the Step Setup screen has been activated and is able to accept a file for upload:

EPP / Automation command line

Assuming the file is `compoundOutputFileLuid0`, the EPP / automation command line would be structured as follows:

bash -l -c "/usr/bin/env python /opt/gls/clarity/customextensions/placeSamplesIntoExistingContainers.py -u {username} -p {password} -s {stepURI:https} -f {compoundOutputFileLuid0}"

The automation should be configured to trigger automatically when the Placement screen is entered.

The Script

NOTE: The attached Python script uses the prerelease API endpoint (instead of v2), which allows placement of samples into existing containers.

The script performs the following operations:

Parses the file and create an internal map (Python dict) between sample name and container details:
- Key: sample name
- Value: (container name, well position) tuple
Retrieves the URI of each container.
Accesses the step's 'placements' XML using a GET request.
Performs the following modifications to the XML:
- Populates the <selected-containers> node with child nodes for each retrieved container.
- Populates each <output> artifact with a <location> node with the container details and well position.
PUTs the placement XML back to Clarity LIMS.

After the script runs, the Placement screen should show the placements, assuming there were no problems executing the script.

The attached script also contains some minimal bulletproofing for the following cases:

Container was not found.
Container is not empty.
Well position is invalid.
Sample in the ice bucket does not have a corresponding entry in the uploaded file.
Sample in the uploaded file is not in the ice bucket.

In all cases, the script reports an error and does not allow the user to proceed.

Attachments

placeSamplesIntoExistingContainers.py:

Routing Output Artifacts to Specific Workflows/Stages

Samples progressing through workflows can branch off and must be directed to different workflows or stages within a workflow.

Example: If it is not known at the initiation of a workflow if a sample is to be sequenced on a HiSeq or MiSeq. Rerouting the derived samples could be necessary.

Solution

This example provides the user with the opportunity to route samples individually to the HiSeq, MiSeq, or both stages from the Record Details screen.

Step Configuration

The step is configured to display two checkbox analyte UDFs / derived sample custom fields. The fields are used to select the destination workflow/stages for each derived sample. You can choose to queue the sample for HiSeq, MiSeq, or both.

In this example, you select the following:

Two samples to be queued for HiSeq
Two samples for MiSeq
Two that are not routed
Two samples for both HiSeq and MiSeq

Parameters

The script accepts the following parameters:

An example of the full syntax to invoke the script is as follows:

bash -c "/usr/bin/python /opt/gls/clarity/customextensions/Route_to_HiSeq_MiSeq.py -u {username} -p {password} -s {stepURI:v2}"

User Interaction

On the Record Details screen, you use the analyte UDF / derived sample custom field checkboxes to decide which workflow/stage combination to send each derived sample.

About the Code

The first important piece of information required is the URI of the destination stage.

The main method in the script is routeAnalytes. This method in turn carries out several operations:

Gathers the information for the process / master step that triggered the script, including output analytes.
For each analyte, evaluates which UDFs / custom fields have been set, and adds the analyte to a list of analytes to route.
Creates the XML message for each stage.
Does a POST to the REST API in order to add the analytes to the queue in Clarity LIMS.

Modifications

This script can be modified to look for a process level UDF (master step custom field), in which case all outputs from the step would be routed to the same step.

This example also serves to demonstrate a more general concept. Routing artifacts is valuable in any situation where a sample needs to be queued for a stage outside of the usual order of a workflow - or even routing newly submitted samples to the first stage in a workflow.

For more information about how to use the artifact/route endpoint, see REST.

Assumptions and Notes

You are running a version of Python that is supported by Clarity LIMS, as documented in the Clarity LIMS Technical Requirements.
The attached file is placed on the LIMS server, in the /opt/gls/clarity/customextensions folder.
The Python API Library (glsapiutil.py) is placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder. You can download the latest glsapiutil library from our GitHub page.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.
Samples can be inadvertently duplicated in the next step. This duplication occurs if:
- The sample is being routed at the last step of a protocol and;
- If the action of next steps is Mark Protocol as Complete.
This duplication is due to:
- The next step is routing the artifact to its default destination and;
- The script is routing the same artifact.
The solution here is to set the default next steps action to Remove from Workflow instead. This solution can be automated using Lab Logic Toolkit or the API.

Attachments

Route_to_HiSeq_MiSeq.py:

Creating Multiple Containers / Types for Placement

The Clarity LIMS interface offers tremendous flexibility in placing the outputs of a protocol step into new containers.

Sometimes it is necessary that a step produces multiple containers of differing types (for example a 96-well plate and a 384-well plate). Such an interaction is not possible without using the Clarity LIMS API.

This example provides a script that creates a 96-well plate and a 384-well plate in preparation for subsequent manual placement of samples.

Solution

In this example, containers are created according to the following logic:

A 96 well plate is produced along with a 384 well plate.
Neither plate has a user-specified name. The LIMS names them using the LIMS ID of the plates.

Step Configuration

The step is configured:

To allow samples to be placed in 96 or 384 well plates.
To invoke the script as soon as the user enters the step's Sample Placement screen.

Parameters

The EPP / automation command is configured to pass the following parameters:

An example of the full syntax to invoke the script is as follows:

/usr/bin/python /opt/gls/clarity/customextensions/createMultipleContainerTypes.py -l 122-7953 -u admin -p securepassword -s http://192.168.9.123:8080/api/v2/steps/122-5601

User Interaction

When the lab scientist enters the Sample Placement screen, the rightmost Placed Samples area displays the first container created (1 of 2). By selecting the second container, the display shows the second container (2 of 2).

The lab scientist manuallys place the samples into the containers and completes the protocol step as normal.

About the Code

Two calls to createContainer() create the destination 96-well and 384-well plates.
- To create custom containers, supplement the createContainer() method with the configuration details that apply to your instance of Clarity LIMS.
- To name the container with the value of the second argument, pass a non-empty string as the second argument to createContainer() method.
An XML payload is created containing only the details of the created containers, ready for the user to record the actual placements in the Clarity LIMS user interface.
This information is POSTed back to the server, in the format required for the placements resource.
A meaningful message is reported back to the user via the ../steps/<stepID>/programstatus API resource.

Assumptions and Notes

The attached file is placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder.
The Python API Library (glsapiutil.py) is placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder. You can download the latest glsapiutil library from our GitHub page.
Update the HOSTNAME global variable so that it points to your Clarity LIMS server.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

createMultipleContainerTypes.py:

Starting a Protocol Step via the API

In some circumstances, it can be desirable to automate the initiation of a step in Clarity LIMS. In this scenario, the step is thus executed without any user interaction, ie, a liquid-handling robot drives the step to completion. This example provides a solution that allows for automatic invocation of steps via the API.

Solution

Querying the Queues Endpoint

Before we can invoke a step, we must first employ the queues endpoint.

Every step displayed in the Clarity LIMS web interface has as associated queue, the contents of which can be queried. This image of the Nextera XT Library Prep shows the samples queued for each step in the Nextera XT Library Prep protocol.

For this example, we investigate the queue for the Step 1 - Tagment DNA (Nextera XT DNA) step.

Step 1: Find Step ID

First, we must find the LIMS ID of the step. We query the configuration/workflows resource and hone in on the Nextera XT for MiSeq protocol:

<wkfcnf:workflow status="ACTIVE" uri="http://192.168.8.10:8080/api/v2/configuration/workflows/309" name="Nextera XT for MiSeq">
<protocols>
    <protocol uri="http://192.168.8.10:8080/api/v2/configuration/protocols/3" name="DNA Initial QC"/>
    <protocol uri="http://192.168.8.10:8080/api/v2/configuration/protocols/302" name="Nextera XT Library Prep"/>
    <protocol uri="http://192.168.8.10:8080/api/v2/configuration/protocols/10" name="Illumina SBS (MiSeq)"/>
</protocols>
<stages>
    <stage uri="http://192.168.8.10:8080/api/v2/configuration/workflows/309/stages/690" name="DNA Initial QC"/>
    <stage uri="http://192.168.8.10:8080/api/v2/configuration/workflows/309/stages/691" name="Tagment DNA (Nextera XT DNA)"/>
    <stage uri="http://192.168.8.10:8080/api/v2/configuration/workflows/309/stages/692" name="PCR Amplification (Nextera DNA) 4.0"/>
    <stage uri="http://192.168.8.10:8080/api/v2/configuration/workflows/309/stages/693" name="PCR Clean-up (Nextera DNA) 4.0"/>
    <stage uri="http://192.168.8.10:8080/api/v2/configuration/workflows/309/stages/694" name="Bead Based Library Normalization"/>
    <stage uri="http://192.168.8.10:8080/api/v2/configuration/workflows/309/stages/695" name="Library Pooling (Nextera XT)"/>
    <stage uri="http://192.168.8.10:8080/api/v2/configuration/workflows/309/stages/696" name="Sort MiSeq Samples (MiSeq) 4.0"/>
    <stage uri="http://192.168.8.10:8080/api/v2/configuration/workflows/309/stages/697" name="Library Normalization (MiSeq) 4.0"/>
    <stage uri="http://192.168.8.10:8080/api/v2/configuration/workflows/309/stages/698" name="Library Pooling (MiSeq) 4.0"/>
    <stage uri="http://192.168.8.10:8080/api/v2/configuration/workflows/309/stages/699" name="Denature, Dilute and Load Sample (MiSeq) 4.0"/>
    <stage uri="http://192.168.8.10:8080/api/v2/configuration/workflows/309/stages/700" name="MiSeq Run (MiSeq) 4.0"/>
</stages>
</wkfcnf:workflow>

From the XML returned, we can see that the Tagment DNA (Nextera XT DNA) step has an associated stage, with an ID of 691:

<stage uri="http://192.168.8.10:8080/api/v2/configuration/workflows/309/stages/691" name="Tagment DNA (Nextera XT DNA)"/>

Step 2: Find Stage ID

If we now query this stage ID, we see something similar to the following:

<stg:stage index="0" name="Tagment DNA (Nextera XT DNA)" uri="http://192.168.8.10:8080/api/v2/configuration/workflows/309/stages/691">
    <workflow uri="http://192.168.8.10:8080/api/v2/configuration/workflows/309"/>
    <protocol uri="http://192.168.8.10:8080/api/v2/configuration/protocols/302"/>
    <step uri="http://192.168.8.10:8080/api/v2/configuration/protocols/302/steps/567"/>
</stg:stage>

We now have the piece of information we need: namely the ID 567 that is associated with this step.

Step 3: Query the Queues Resource

We can use this ID to query the queues resource, which provide us with something similar to the following:

<que:queue name="Tagment DNA (Nextera XT DNA)" protocol-step-uri="http://192.168.8.10:8080/api/v2/configuration/protocols/302/steps/567" uri="http://192.168.8.10:8080/api/v2/queues/567">
    <artifacts>
        <artifact limsid="HES208A1PA1" uri="http://192.168.8.10:8080/api/v2/artifacts/HES208A1PA1">
            <queue-time>2013-04-07T17:01:00.636-07:00</queue-time>
            <location>
                <container uri="http://192.168.8.10:8080/api/v2/containers/27-2654" limsid="27-2654"/>
                <value>1:1</value>
            </location>
        </artifact>
        <artifact limsid="HES208A2PA1" uri="http://192.168.8.10:8080/api/v2/artifacts/HES208A2PA1"></artifact>
        <artifact limsid="HES208A3PA1" uri="http://192.168.8.10:8080/api/v2/artifacts/HES208A3PA1"></artifact>
        <artifact limsid="HES208A4PA1" uri="http://192.168.8.10:8080/api/v2/artifacts/HES208A4PA1"></artifact>
        <artifact limsid="HES208A5PA1" uri="http://192.168.8.10:8080/api/v2/artifacts/HES208A5PA1"></artifact>
        <artifact limsid="HES208A6PA1" uri="http://192.168.8.10:8080/api/v2/artifacts/HES208A6PA1"></artifact>
        <artifact limsid="HES208A7PA1" uri="http://192.168.8.10:8080/api/v2/artifacts/HES208A7PA1"></artifact>
        <artifact limsid="HES208A8PA1" uri="http://192.168.8.10:8080/api/v2/artifacts/HES208A8PA1"></artifact>
        <artifact limsid="HES208A9PA1" uri="http://192.168.8.10:8080/api/v2/artifacts/HES208A9PA1"></artifact>
        <artifact limsid="HES208A10PA1" uri="http://192.168.8.10:8080/api/v2/artifacts/HES208A10PA1"></artifact>
    </artifacts>
</que:queue>

This result matches the information displayed in the Clarity LIMS web interface. In the next image, we can see the derived samples awaiting the step.

Initiating the Step

Now that we have the contents of the queue, starting the step programmatically is quite simple.

All that is required is a POST to the steps API endpoint. The XML input payload to the POST request will take the following form:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<tmp:step-creation xmlns:tmp="http://genologics.com/ri/step">
    <configuration uri="http://192.168.8.10:8080/api/v2/configuration/protocols/302/steps/567"/>
    <container-type>96 well plate</container-type>
    <inputs>
        <input uri="http://192.168.8.10:8080/api/v2/artifacts/HES208A1PA1" replicates="1"/>
    </inputs>
</tmp:step-creation>

If the POST operation was successful, the API will return XML of the following form (for details, see #h_3047a4d3-38b9-4378-b09a-3295e92a7476 section):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<stp:step xmlns:stp="http://genologics.com/ri/step" current-state="Placement" limsid="24-19301" uri="http://192.168.8.10:8080/api/v2/steps/24-19301">
    <configuration uri="http://192.168.8.10:8080/api/v2/configuration/protocols/302/steps/567">Tagment DNA (Nextera XT DNA)</configuration>
    <actions uri="http://192.168.8.10:8080/api/v2/steps/24-19301/actions"/>
    <placements uri="http://192.168.8.10:8080/api/v2/steps/24-19301/placements"/>
    <details uri="http://192.168.8.10:8080/api/v2/steps/24-19301/details"/>
    <available-programs/>
</stp:step>

User Interaction

In the Clarity LIMS web interface, two pieces of evidence indicate that the step has been initiated:

The partially completed step is displayed in the Work in Progress area.
The Recent Activities area shows that the protocol step was started.

About the Code

The XML payload POSTed to the steps resource is quite simple in nature. In fact there are only three variables within the payload:

1. The step to be initiated:

<configuration uri="http://192.168.8.10:8080/api/v2/configuration/protocols/302/steps/567"/>

2. The type of output container to be used (if appropriate):

<container-type>96 well plate</container-type>

3. The URI(s) of the artifact(s) on which the step should be run, along with the number of replicates the step needs to create (if appropriate):

<input uri="http://192.168.8.10:8080/api/v2/artifacts/HES208A1PA1" replicates="

Setting Quality Control Flags

A key reason to track samples is to monitor their quality. In Clarity LIMS, samples are flagged with a check mark to indicate good quality (QC Pass) or an X to indicate poor quality (QC Fail).

There are many ways to determine quality, including concentration and DNA 260/280 light absorbance measurements. This example uses a tab-separated value (TSV) results file from a Thermo NanoDrop Spectrophotometer to:

Record concentration and 260/280 measurements, and;
Set quality flags in the LIMS.

In this example, once the script is installed, the user simply runs and records a step and imports the results file. The EPP / automation script does the rest of the work, by reading the file, capturing the measurements in Clarity LIMS, and setting the QC flags.

QC file formats

As spectrophotometers are used for many measurements, within many lab protocols, file formats can vary depending on the instrument software settings. Check your instruments for specific file formats and use the example below to get familiar with the QC example scripts.

User Interaction

The user selects samples and runs the QC Example step.
In the Operations Interface (LIMS v4 & earlier), the user sets the required minimum concentration and/or 260/280 lower and upper bounds.
The QC Example process creates an output artifact (shared ResultsFile) called QC Data File. This file is shown in the Sample Genealogy and Outputs panes. The "!" icon indicates that this entry is a placeholder for a file.
The users loads samples onto the spectrophotometer and follows the instrument's protocol for QC measurement.
After the measurements are complete, the user exports the TSV results file created by the spectrophotometer, using the NanoDrop software.
The user imports the TSV file into the LIMS: As Clarity LIMS parses the file, the measurements are captured and stored as output user-defined fields (UDFs). The QC Pass/Fail flags are then set on the process inputs. The flags are set according to whether they meet the concentration and/or 260/280 bounds specified in Step 2.

Example

A TSV file is created by the NanoDrop spectrophotometer. The specific file used in this example is shown below.

Default
1/6/2011  5:27 PM
Sample ID	ng/uL	A260	260/280	260/230	Constant
QCExamplePlate_A01_DNAIsolate-1	62.49	1.450	1.83	1.86	50
QCExamplePlate_A02_DNAIsolate-2	49.50	1.413	1.99	1.88	50
QCExamplePlate_A03_DNAIsolate-3	70.00	1.198	1.84	2.03	50
QCExamplePlate_A04_DNAIsolate-4	62.49	1.450	1.83	1.86	50
QCExamplePlate_A05_DNAIsolate-5	49.50	1.413	1.99	1.88	50
QCExamplePlate_A06_DNAIsolate-6	70.00	1.198	1.84	2.03	50
QCExamplePlate_A07_DNAIsolate-7	62.49	1.450	1.83	1.86	50
QCExamplePlate_A08_DNAIsolate-8	49.50	1.413	1.99	1.88	50
QCExamplePlate_A09_DNAIsolate-9	70.00	1.198	1.84	2.03	50

After the TSV file is imported and attached to the QC Example process in Clarity LIMS, a second script is automatically called with EPP. This second EPP script is part of a second process, called QC Example(file handling). The file attachment event triggers the second EPP script. You can see the process in the Sample Genealogy pane:
The example uses the TSV file Sample ID value to locate the plate and well location to which the QC flag is to be applied. In the example file shown in Step 1: The container is QCExamplePlate.
- The location A01 maps to the sample on the first well of the container named QCExamplePlate.
- The data contained in the TSV file is captured in Clarity LIMS and can be viewed on the Details tab.
Notice that the user ran only one process (QC Example), but two processes were recorded. The first EPP script created the second process, QC Example (file handling), using the REST API. Using REST to create a process is described in the Running a Process Cookbook example.
The NanoDrop QC algorithm in the script compares the concentration and the 260/280 ratio for each sample in the imported TSV file. The values are entered into the process UDFs.
- A QC Fail flag is applied:
  - If the concentration of the sample is less than specified, or;
  - If its 260/280 ratio is outside the bounds given when running the process.
- A QC Pass flag is applied when the sample has values inside the parameters provided.
- Samples with no associated values in the TSV file are unaffected. In this example, the minimum concentration is set to 60. A QC Fail flag is applied to sample-2 because its concentration level does not meet the minimum value specified.

Automation/EPP can be used to process files when they are attached to the LIMS. Using file attachment triggers is sometimes called data analysis pipelining. Basically, a series of analysis steps across a chain of processes is triggered from one file attachment.

Installation

Download the zip file to the server; on a non-production server use the gls user account.
Unzip the file to the following directory: /opt/gls/clarity/Applications. The contents of the zip file will be installed within that directory, to CookBook/NanoDropQC/.
```
unzip -d /opt/gls/clarity/Applications file-qc-2.0-bundle.zip
```
Next, unzip the config-slicer-<version>-deployment-bundle.zip in /opt/gls/clarity/Applications/CookBook/NanoDropQC/. Replace <version> with the version number of the included config-slicer.
```
cd /opt/gls/clarity/Applications/CookBook/NanoDropQC/
 
unzip config-slicer--deployment-bundle.zip
```
With Clarity LIMS running, run the following server command-line call to import the required configuration into the server (i.e., the process, sample, and fields used by the scripts):
```
java -jar config-slicer-.jar \
    -u  -p  -host  \
    -o import -k file-qc-package.xml
```

Installation validation

To confirm that the example is correctly installed, follow the steps below to simulate the recording of QC information by a user in the lab:

Start the Clarity LIMS Operations Interface client.
Create a project, and submit a 96 well plate named QCExamplePlate full of samples
Select the samples and run the QC Example process.
Click the Next and Done buttons to complete the wizard.
When the process completes, in the process summary tab's Input/Output Explorer, you'll see the shared output file placeholder (QC Data File) in the Outputs pane and in the Sample Genealogy. Right-click this placeholder and click Import.
Import the example TSV nanodrop-qc-example.tsv file provided in the zip file.
Wait for the QC flags to become visible on a subset of the samples used as inputs to the process (those located from A:1 to A:9).

Example modifications

You can modify the example script to suit your lab's QC experimental methods and calculations. For example, you may want to consider phenotypic information or extra sample data recorded in the LIMS. Two modifications to the example are described below.

Recording QC measurements on sample inputs instead of file outputs

The example script writes the measurements into user-defined fields (UDFs) associated with outputs of the process. This allows multiple measurements to be recorded for one sample, by running the process multiple times. Each time the process is run on an input sample, a new process with new output results is recorded in the LIMS.

You may instead want to write the measurements into UDFs associated with the input samples. For example, you may want to keep the data records simple: the greater the number of outputs recorded in the LIMS, the more confusing it becomes for the user to upload files and navigate results. Setting the fields on the inputs provides a single 'golden' value.

To change the configuration and script to set QC flags and field values on inputs:

Change the code in NanoDrop.groovy so that UDFs are set on inputs instead of outputs. That is, replace this line:
```
setUDF(output, <udf-name>, <value>)
```
with the following:
```
setUDF(input, <udf-name>, <value>)
```
Since you are no longer changing the outputs, you can comment or delete the line where the output is saved:
```
// Save changes
client.httpPUT(input.@uri, input)
// client.httpPUT(output.@uri, output)
```
Run the QC Example (preparation) process that was included in the configuration package you imported into your system.
The process wizard provides the option for you to either generate a new plate (container) or select a preexisting plate to hold the process outputs.
- Note: The results of this process will be placed into a plate that is different from the one in which you originally placed the samples.
Run the QC Process on the plate created in the previous step.
Edit the nanodrop-qc-example.tsv file to reflect the name of this plate (remember that the Sample ID column in the NanoDrop QC data file depends on the plate name):
- To do this, for each row, replace QCExamplePlate with the name of the plate that now holds the outputs of the process-generated plate names, for example, would be in a format similar to "27-124")
Import the modified nanodrop-qc-example.tsv into the result file placeholder generated by the QC Example process.
Wait for the QC flags to update on the inputs for which the nanodrop-qc-example.tsv file has measurements.
This time, instead of the measurements appearing in the outputs of the QC Example process, they will instead be collected as UDF values on the inputs of that process. To see these measurements, select the outputs of the parent process, QC Example (preparation), and view the output Details tab.

Using additional information stored in sample fields to set the QC flags

Most labs use multiple factors to determine sample QC flags. These factors might be associated with the submitted sample, multiple instrument measurements, or even the type of project or sample.

To demonstrate how easy it is to aggregate multiple factors into the QC flag logic, a boolean field called Human is added to the sample configuration. The script logic is modified to only set flags for human samples.

To change the configuration and script to check for human samples:

Change the code in NanoDrop.groovy (where we loop through the input/output pairs adjusting QC flags/updating UDFs), so that we first ensure we are dealing with a Human sample.

To do this, change the loop at the end of the script from this:

// all inputs we have NanoDrop results for exist, we can now assign QC status 
nanodropResults.each {
    def (input, output) = identifiedIOPairs[it.key]
 
    // Collect measurements as UDF values
    ...
    // Determine QC flags
    ...
    // Save changes
    ...
}

to this:

// All inputs we have NanoDrop results for exist, we can now assign QC status 
nanodropResults.each {
    def (input, output) = identifiedIOPairs[it.key]
 
    // Skip non-human samples
    def sample = client.httpGET(input.'sample'[0].@uri)
    def isHuman = getUDF(sample, 'Human', 'false').toBoolean()
    if (!isHuman) {
        return
    }
 
    // Collect measurements as UDF values
    ...
    // Determine QC flags
    ...
    // Save changes
    ...
}

Configure a checkbox UDF on Sample (this was done for you when you imported the configuration package provided in this application example).
Submit a new batch of samples on a 96 well plate.
Edit samples on wells from A:1 to A:3 so that the Human check box field is selected.
Run the QC Example process on all 96 samples in the plate.
In the nanodrop-qc-example.tsv file, update the Sample ID column with the correct plate name.
Import the modified nanodrop-qc-example.tsv into the result file placeholder generated by the QC Example process.
Wait for the QC flags to update on the inputs for which the NanoDrop QC data file has measurements.
Note that only the inputs A:1, A:2 and A:3 will have QC flags assigned. The other inputs, including those for which the NanoDrop QC data file has data, will be left untouched because of the selection criteria implemented.

Assumptions & notes

Clarity LIMS v1 or later (API v2 r14)
Groovy 1.7.4 or later (expected location: /opt/gls/groovy/current/)

All prerequisites are preloaded if you install on a non-production server.

Attachments

file-qc-2.0-bundle.zip:

Applying Indexing Patterns to Containers Automatically

The indexing of samples is often performed in patterns, based upon the location of the samples in the container.

This example shows how to automate the default placing of reagents on samples, based on their container position. This greatly reduces the amount of time spent on the Add Labels screen (LIMS v6.x) and also reduces user error.

In this example, reagent labels are assigned to samples in a predetermined pattern as the user enters the Add Reagents screen. This pattern is applied to all containers entering this stage.

Solution

The example AssignIndexPattern.groovy script is configured to run on the Adenylate Ends & Ligate Adapters (TruSeq DNA) 4.0 step.

Parameters

The script accepts the following parameters:

An example command line is shown below:

bash -c "/opt/gls/groovy/current/bin/groovy -cp /opt/groovy/lib /opt/gls/clarity/customextensions/AssignIndexPattern.groovy -i {stepURI:v2:http} -u {username} -p {password}"

NOTE: The location of Groovy on your server may be different from the one shown in this example. If this is the case, modify the script accordingly.

Step Configuration

In the Clarity LIMS web interface, for the Adenylate Ends & Ligate Adapters (TruSeq DNA) 4.0 step (in the TruSeq DNA Sample Prep protocol), configure Automation as follows:

Clarity LIMS v6.x

Trigger Location: Add Labels
Trigger Style: Automatic upon entry

User Interaction

Assuming the user has added 96 samples and has reached the Adenylate Ends & Ligate Adapters (TruSeq DNA) 4.0 step:

The user transfers all 96 samples to a new 96-well plate and proceeds with step.
When the user enters the Add Labels screen, the script is initiated. A message box alerts the user that a custom script is in progress.
Upon completion, the previously defined success message displays.
When the success message is closed, the Add Labels screen loads, and the pattern shown below is applied to samples.

About the code

Once the script has processed the input and ensured that all the required information is available, we can start applying the reagents to our samples.

To begin, we need to define the reagents and pattern to apply.
- The storing of reagents can be accomplished by placing the reagents in a Map, comprised of the reagent names indexed by their respective number, i.e. 'AD030' indexed at 30.
- The pattern can be stored as a List of Lists. This can be arranged as a visual representation of the pattern to be applied.\
  // Reagent Labels public static final def REAGENT_MAP = [ (27):'AD027 (ATTCCT)', (23):'AD023 (GAGTGG)', (20):'AD020 (GTGGCC)', (15):'AD015 (ATGTCA)' ] // Reagent Pattern public static final def REAGENT_PATTERN = [ [27,27,27,27,27,15,27,27,27,27,27,27], [27,27,27,27,15,15,15,27,27,27,27,27], [27,27,27,15,15,15,15,15,20,20,20,20], [27,27,23,23,15,15,15,20,20,20,20,27], [27,23,23,23,23,15,20,20,20,20,27,27], [23,23,23,23,23,23,23,23,23,27,27,27], [27,27,27,27,27,23,23,23,27,27,27,27], [27,27,27,27,27,27,23,27,27,27,27,27] ]
- Once we have our reagents and pattern defined, we can start processing the samples:
- We start by retrieving the node from the reagent setup endpoint. We use this node as a base for subsequent commands.
- We then gather the unique output artifact URIs and retrieve the output artifacts using batchGET:\
  // Retrieve the reagent setup Node reagentSetup = GLSRestApiUtils.httpGET(stepURI + '/reagents', username, password) // Collect the artifact URIs and retrieve the artifacts def artifactURIs = reagentSetup.'output-reagents'.'output'.collect { it.@uri }.unique() def artifacts = GLSRestApiUtils.batchGET(artifactURIs, username, password)
- Next, we iterate through our list of output artifacts.
- For each artifact, we determine its position and use its components to index our pattern. This allows us to determine which reagent should be placed on which sample.
- Once we determine the reagent's name, we create a reagent-label node with a name attribute equal to the desired reagent name.
- In the list of output-reagents in the reagent setup node, we find the output that corresponds to the output artifact that we are processing and add our reagent-label node to it. NOTE: We must strip off the state from our artifact's URI. The URIs stored in the step setup node are stateless and will not match the URI returned from our output artifact.

   ```
   // For each artifact, determine its position and set its reagent label accordingly
   artifacts.each { artifact ->
       // Split the position into its two components
       def positionIndices = parsePlacement(artifact.'location'[0].'value'[0].text())
    
       // Using our relationship maps, determine which reagent should be placed at that position
       String reagentName = REAGENT_MAP[((REAGENT_PATTERN[positionIndices[0]])[positionIndices[1]])]
    <
       // Create and attach the reagent-label node to our setup
       Node reagentNode = NodeBuilder.newInstance().'reagent-label'(name:reagentName)
       reagentSetup.'output-reagents'.'output'.find { it.@uri == GLSRestApiUtils.stripQuery(artifact.@uri) }.append(reagentNode)
   }
   ```

Once we have processed all of our output artifacts, we POST our modified setup node to the reagentSetup endpoint. This updates the default placement in the API.

We then define our success message to display to the user upon the script's completion.\

// Set the reagent setup in the API
GLSRestApiUtils.httpPOST(reagentSetup, reagentSetup.@uri, username, password)
 
// Define the success message to the user
outputMessage = "Script has completed successfully.${LINE_TERMINATOR}" +
        "Clarity LIMS reagent pattern has been applied to all containers."

Assumptions and Notes

Your configuration conforms with the script requirements documented in #h_42788a7c-1199-4d16-8b43-8ab6ab9ca7a6.
You are running a version of Groovy that is supported by Clarity LIMS, as documented in the Clarity LIMS Technical Requirements.
The attached Groovy file is placed on the LIMS server, in the following location: /opt/gls/clarity/customextensions
GLSRestApiUtils.groovy is placed in your Groovy lib folder.
You have imported the attached Reagent XML file into your system using the Config Slicer tool.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

Single Indexing ReagentTypes.xml:

AssignIndexPattern.groovy:

Assignment of Sample Next Steps Based On a UDF

In the default configuration of Clarity LIMS, at the end of every step, the user is required to choose where the samples will go next - i.e. the 'next step'.

If samples in the lab follow a logical flow based on business logic, this is an unnecessary manual task. This example shows how to automate this next step selection, to reduce error and user interaction.

This example uses the Automatically Assign Next Protocol Step (Example) step, in the Automation Examples (API Cookbook) protocol. The examples shoes how to:

Automate the selection of a sample's Next Steps, as displayed on the Assign Next Steps screen of this step.
Use the Pooling sample UDF / custom field to determine which next step each sample is assigned.

Solution

Step Configuration

The Automatically Assign Next Protocol Step (Example) step has two permitted Next Steps:

Confirmation of Low-plexity Pooling (Example)
Automated Workflow Assignment (Example)

Depending on the value of a sample's Pooling UDF / custom field, the sample's Next Step will default to one of the permitted next steps:

If the value of the Pooling UDF / custom field is any case combination of No or None, the sample's next step will default to Automated Workflow Assignment (Example).
Otherwise, the sample's next step will default to Confirmation of Low-plexity Pooling (Example). Next step configuration (LIMS v4.x shown)

Automation is configured as follows:

Behavior: Automatically initiated
Stage of Step: On Record Details screen
Timing: When screen is exited

Parameters

The script takes three basic parameters:

An example command line is shown below.

(Note: The location of groovy on your server may be different from the one shown in this example. If this is the case, modify the script accordingly.)

bash -c "/opt/gls/groovy/current/bin/groovy -cp /opt/groovy/lib /opt/gls/clarity/customextensions/NextStepAutomation.groovy -u {username} -p {password} -i {stepURI:v2:http}"

User Interaction

Assuming samples have been placed in the protocol and are ready to be processed, the user proceeds as normal:

Upon reaching the transition from the Record Details screen to the Assign Next Steps screen, the script is run. A message box alerts the user that a custom script is in progress.
Upon completion of the script, a custom success message is displayed.
Once the success message is closed and the screen has transitioned, the default next steps display for the samples.

About the Code

Once the script has processed the input and ensured that all the required information is available, we can start to process the samples to determine their next steps.

First, we retrieve the next actions list:\

// Retrieve the current protocol step
String nextActionsURI = stepURI + '/actions'
Node nextActionsList = GLSRestApiUtils.httpGET(nextActionsURI, username, password)
String currentProtocolStepURI = nextActionsList.'configuration'[0].@uri
Node currentProtocolStep = GLSRestApiUtils.httpGET(currentProtocolStepURI, username, password)

This endpoint contains a list of the step's output analytes, and a link to its parent step configuration. In this case, we want to retrieve the step configuration so that we can collect the URIs of the expected next steps.
```
// Determine the uris of the possible next steps
currentProtocolStep.'transitions'.'transition'.each {
    if(NEXT_STEPS.containsKey(it.@name)) {
        NEXT_STEPS[it.@name] = it.@'next-step-uri'
    }
}
```
Once we have retrieved the step configuration, we iterate over its possible next steps, gathering their URIs and storing them by name in a Map.

Once we have collected the URIs of our destination steps, we can start analyzing each sample to determine what its default should be.

For each possible 'next-action', we retrieve the target artifact, which then enables us to retrieve that artifact's parent sample.

We then retrieve the value of the sample's Pooling UDF / custom field , if it exists. If it doesn't exist, a default value is given.\

// For each output analyte, set its corresponding next step according to the value of the UDF 'Pooling'
nextActionsList.'next-actions'.'next-action'.each {
    Node artifact = GLSRestApiUtils.httpGET(it.@'artifact-uri', username, password)
    Node sample = GLSRestApiUtils.httpGET(artifact.'sample'[0].@uri, username, password)
    String poolingValue = sample.'udf:field'.find { UDF_NAME == it.@name } ? sample.'udf:field'.find { UDF_NAME == it.@name }.value()[0] : 'Default'
 
    // If Pooling is a variation of No or None, set to workflowAssignment, otherwise set to pooling step
    if(!NO_VALUES.contains(poolingValue.toLowerCase())) {
            it.@action = NEXT_STEP_ACTION
            it.@'step-uri' = NEXT_STEPS[POOLING_STEP]
            poolingSamples++
    } else {
            it.@action = NEXT_STEP_ACTION
            it.@'step-uri' = NEXT_STEPS[WORKFLOW_STEP]
            workflowAssignmentSamples++
    }
}

To set the next step, we set the step-uri attribute of the node to the URI of the expected destination step.
- We also increment counters, so that we can report to the user what actions were taken on the given samples.
- Once this is done, we perform an httpPUT on the action list, adding the changes to the API and allowing our defaults to be set.

Finally, we define the successful output message to the user. This allows the user to check the results.

// Update the next steps in the API
GLSRestApiUtils.httpPUT(nextActionsList, nextActionsURI, username, password)
 
// Define the success message to the user
outputMessage = "Script has completed successfully.${LINE_TERMINATOR}" +
            "Next steps for ${poolingSamples + workflowAssignmentSamples} samples have been set:${LINE_TERMINATOR}" +
            "${poolingSamples} samples set to '${POOLING_STEP}'.${LINE_TERMINATOR}" +
            "${workflowAssignmentSamples} samples set to '${WORKFLOW_STEP}'."

Assumptions and Notes

You are running a version of Groovy that is supported by Clarity LIMS, as documented in the Clarity LIMS Technical Requirements.
The attached Groovy file is placed on the LIMS server, in the folder /opt/gls/clarity/customextensions
GLSRestApiUtils.groovy is placed in your Groovy lib folder.
A single-line text sample UDF named Pooling has been created.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

NextStepAutomation.groovy:

Parsing Metadata into UDFs (BCL Conversion and Demultiplexing)

This example provides a script that can be used to parse lanebarcode.html files from demultiplexing. This script is written to be easily used with the out of the box Bcl Conversion & Demultiplexing (HiSeq 3000/4000) protocol.

Result values are associated with a barcode sequence as well as lane.
Values are attached to the result file output in Clarity LIMS, with matching barcode sequence (index on derived sample input) and lane (container placement of derived sample input).
Script modifications may be needed to match the format of index in Clarity LIMS to the index in the HTML result file.

Parameters

The script accepts the following parameters:

An example of the full syntax to invoke the script is as follows:

Configuration

Defining the UDFs / Custom Fields

All user defined fields (UDFs) / custom fields must first be defined in the script. Within the UDF / custom field dictionary, the name of the field as it appears in Clarity LIMS (the key) must be associated with the field from the result file (the value).

The fields should be preconfigured in Clarity LIMS for result file outputs.

Modifying individual UDFs / Custom Fields

The UDF / custom field values can be modified before being brought into Clarity LIMS. In the following example, the value in megabases is modified to gigabases.

Checking for matching flow cell ID

The script currently checks the flow cell ID for the projects in Clarity LIMS against the flow cell IS in the result file.

NOTE: The script will still complete and attach UDF / custom field values. You may wish to modify the script to not attach the field values if the flow cell ID does not match.

Assumptions and Notes

Your configuration conforms with the script's requirements, as documented in the Configuration section of this document.
You are running a version of Python that is supported by Clarity LIMS, as documented in the Clarity LIMS Technical Requirements.
The attached Python file is placed on the LIMS server, in the /opt/gls/clarity/customextensions folder.
The glsapiutil file is placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

demux_stats_parser.py:

demux_stats_parser_4.py:

Scripts That Validate Step Contents

Validating Process/Step Level UDFs
Checking That Containers Are Named Appropriately
Checking for Index Clashes Based on Index Sequence
Validating Illumina TruSeq Index Adapter Combinations

Validating Process/Step Level UDFs

In this example, the protocol step is configured to invoke a script that is triggered when the user exits the step's Record Details screen.

Parameters

The EPP command is configured to pass the following parameters to the script:

An example of the full syntax to invoke the script is as follows:

User Interaction

When the lab scientist attempts to exit the Record Details screen, the script is invoked and the UDF names specified in the -f parameter will be checked to see that they have been populated. If they ALL have been, the script issues no message, and the process will continue as normal. If however, SOME of the UDFs have not been populated, a message will be displayed to the user indicating which mandatory fields need to be populated, and the user will be unable to move forward until all the specified fields have been populated.

About the Code

The method of central interest in the script is checkFields(). The method in turn carries out several operations:

The list of fields passed to the script via the -f parameter is broken into its component UDF names
For each UDF name in the list, the corresponding value is checked via the API, and if the value is empty, the UDF name is noted as absent
If any UDF names are noted as being absent, an error dialog will be displayed to the user.

Assumptions and Notes

Both of the attached files are placed on the LIMS server, in the /opt/gls/clarity/customextensions folder.
The HOSTNAME global variable needs to be updated so that it points to your Clarity LIMS server.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.
If required, this script could be enhanced so that it not only checked to see that the UDFs had been populated, but that their values also matches a regexp pattern for additional validation.

Attachments

checkprocessFields.py:

Checking That Containers Are Named Appropriately

When samples are placed onto containers in BaseSpace Clarity LIMS, the default operation is that containers produced by the system are named with their LIMS ID (for example 27-1234 etc). The expectation is that after samples have been placed, the user will rename the containers with the barcode that is on the plate, chip, or other container.

There are places in the workflow in which issues will occur with integrations if the user renames the container incorrectly.

For example, when loading libraries onto a flow cell for sequencing, the flow cell container must be renamed with the actual ID/barcode of the flow cell in order for the Illumina integration to complete its work successfully.

This example provides a short script that checks to see if the user has correctly renamed containers after the placement of samples. It also shows what occurs in the Clarity LIMS Web Interface when the script encounters a naming issue.

Solution

Protocol Step Configuration

In this example, the protocol step is configured to invoke the script when the user exits the step's Sample Placement screen.

Parameters

The EPP command is configured to pass the following parameters:

An example of the full syntax to invoke the script is as follows:

User Interaction

When the user enters the Sample Placement screen, the rightmost Placed Samples area will show the containers created with their default, system-assigned names:

When the user tries to leave the Sample Placement screen, the script is invoked:

If the script finds any containers that still have their default, system-assigned names, an error message is generated:

To complete the protocol step, the user must first rename the containers:

About the Code

The main method of interest is validateContainerNames().

This method queries the placements resource for the current step, and gathers the LIMS IDs of the selected containers.
For each selected container associated with the protocol step, the containers resource is called:
- The container name is compared to the LIMS ID.
- If the values are identical, an error message is generated.
Additional validation: Ideally, this script is set up to also validate the renamed containers against a specific goal. For example, the attached example script checks to see that the new name is at least 10 characters in length. You may choose to replace or supplement this optional, additional validation to provide specific logic to suit your business needs.\

Assumptions and Notes

Both of the attached files are placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder.
You will need to update the HOSTNAME global variable such that it points to your Clarity LIMS server.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

validateContainerNames.py:

Checking for Index Clashes Based on Index Sequence

While not a common scenario, it is possible to have two molecular barcodes/indices in BaseSpace Clarity LIMS that share the same sequence.

Within Clarity LIMS, there is logic that can be enabled to prevent pooling of samples with the same index name. However, this check is based upon the index name, not the sequence. In the case of identical indexes with differing names, additional logic is required to prevent index clashes based upon the index sequence.

This example provides a solution that looks for index clashes based upon sequence.

Solution

In this example, the protocol step that is responsible for pooling is configured to invoke the script as soon as the user exits the step's Pooling screen.

Parameters

The EPP command is configured to pass the following parameters:

An example of the full syntax to invoke the script is as follows:

User Interaction

When the user exits the Pooling screen, the script is invoked.

If no index clashes are found, the step proceeds and the script does not report back to the user.
If an index clash is found, a message is returned to the user, informing them of the problematic components within the affected pool. The user is not able to continue the step without first re-constituting the pools.

About the Code

The main method in the script is checkIndexes(). The method in turn carries out several operations:

The step's pools resource is queried.
The constituent artifacts of each pool are identified, along the with name of the reagent-label applied.
For each reagent-label:
- The associated sequence is identified.
- A dictionary is constructed using the sequence as key, the value being the name of the derived sample(s) associated with the sequence
If any sequence in the dictionary is associated with more than one derived sample, an error message is constructed and is reported to the user once all pools have been considered.

Assumptions and Notes

Both of the attached files are placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder.
You will need to update the HOSTNAME global variable such that it points to your Clarity LIMS server.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

NOTE: The current script uses caching to avoid unnecessary GETs, but does not use batch transactions. In the case of large levels of multiplexing, the script could be reworked to gather all of the constituent artifacts in a single batch transaction.

Attachments

checkIndexes.py:

Validating Illumina TruSeq Index Adapter Combinations

When pooling samples, there are often numerous complex rules and restrictions regarding which combinations of adapters are acceptable.

As a method of applying custom business logic, it is possible to automate the verification of your pools using Clarity LIMS.

This example shows how to confirm the composition of pools before they are created, allowing the lab scientist to alter the composition of pools that have caused an error.

Solution

In this example, we will enforce the following Illumina TruSeq DNA LT adapter tube pooling guidelines:

Process

The example script is configured to run on the Library Pooling (Illumina SBS) 4.0 process.

Parameters

The EPP command is configured to pass the following parameters to the script:

An example of the full syntax to invoke the script is as follows:

NOTE: The location of Groovy on your server may be different from the one shown in this example. If this is the case, modify the script accordingly.

User Interaction

Assuming samples have been worked through the protocol and have reached the Library Pooling (Illumina SBS) 4.0 protocol step, the user pools the samples following the specified guidelines.
When the pools are created, the user attempts to proceed to the next page.
A message box displays alerting the user that a custom program is executing.
On successful completion, a success message displays.

About the Code

Once the script has processed the input and ensured that all the required information is available, we process the pools to determine if they meet the required specifications.

The first challenge is to represent the adapter combinations in the script.
- This is accomplished by a map comprised of the adapter names, indexed by their respective number, ie. AD001 indexed at 1.
Next, we define the three combination groups: 2 plex, 3 plex, and 4 plex.
- This is achieved by creating a List of Lists, with the inner lists representing our combinations.
- To facilitate the required fallback through lower plexity combinations, we store the combinations groups in a list, in ascending plexity.\

Once the combinations are defined, we need to create a method which will compare the actual combination of adapters in a pool with our ideal combinations. There are two cases we need to handle:
- When we are comparing two equal plexity combinations.
- When we are comparing a higher plexity pool to a lower plexity combination.
To handle the first case, we create a function that takes in our actual combination, and the ideal combination.
If the actual combination contains the entire combination, we remove those adapters. We then ensure that the leftover adapters are not in our Illumina TruSeq DNA LT adapter.\
The second case is similar to the first.
- We create a function that takes in our actual combination, the ideal combination, and the amount of wildcards. A wildcard represents an 'any adapter' condition in Illumina's TruSeq DNA LT adapter tube pooling guidelines.
- Like the first case, we ensure that the actual list contains the entire ideal combination.
- After removing the ideal adapters, we ensure that the amount of leftover Illumina TruSeq DNA LT adapters is equal to the amount of wildcards.
To represent the adapter combination fallbacks, we require a method which will attempt to match the highest possible plexity for a given list of adapters. If it cannot do this, it will attempt to match it with a lower plexity combination with a wildcard.
- To achieve this, we define a recursive function that handles both the exact and wildcard cases. The ideal combination plexitys will be chosen by the patternIndex input.
- If no wildcards are present, we check each combination in the designated plexity.
- If a match is not found, we call the match function again. This time, we increase the amount of wildcards by 1 and reduce the plexity of the combinations by 1. The function will now compare the adapter list using the wildCardMatch function. If a match is found, the function will exit and return true.
Now, with our supporting functions defined, we can start processing our pools.
- First we retrieve the definitions of the pools from the API. This node contains a list of the output pools, in addition to what input each pool contains.
- Using this list, we create a map that stores the URIs of the output pools and the amount of inputs to each pool.
- We then retrieve the output pools using a batchGET.
Once we have the pools, we iterate through the list.
- If a pool is valid, we increment a counter which will be used in our success message.
- If invalid, we set the script outcome to failure, and append to the failure message.
- The script continues searching for other issues and adding their information to the failure message.
After each pool has been checked, we determine how to alert the user of the script's completion.
- If a pool is invalid, an error is thrown containing the list of failures and a recommendation to review the Illumina pooling guidelines.
- If all pools are valid, we alert the user of a success.

Assumptions and Notes

You are running a version of Groovy that is supported by Clarity LIMS, as documented in the Clarity LIMS Technical Requirements.
The attached Groovy file is placed on the LIMS server, in the /opt/gls/clarity/customextensions folder.
GLSRestApiUtils.groovy is placed in your Groovy lib folder.
You have imported the attached Reagent XML file into your system using the Config Slicer tool.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

ConfirmationOfPoolComposition.groovy:

Single Indexing ReagentTypes.xml:

Scripts Triggered Outside of Workflows/Steps

Repurposing a Process to Upload Indexes

Compatibility: API version 2

Uploading indexes (reagent types) into BaseSpace Clarity LIMScan be done 3 different ways:

Manually adding the indexes one at a time using the Operations interface.
Uploading an XML data file using the config slicer.
Using the API directly.

There is no "out of the box" method to quickly and easily upload a CSV file of many indexes to the LIMS. Adding the indexes one at a time is not quick, and using the config slicer requires admin privileges, knowledge of the correct XML file format, and the command line.

Solution

This example enables Clarity LIMS lab users to upload a CSV file containing new indexes and, through an EPP trigger, instantly create the indexes in Clarity LIMS. This example provides the provides the functionality included with the config slice, including preventing the indexes from being created if there is already an index with the same name in the system as well as ensuring the sequence only contains a valid nucleotide sequence (composed of A,G,T and C).

Protocol Step Configuration

In this example, a protocol step is repurposed as a mechanism to upload indexes. The protocol is configured to not produce any analyte outputs, and a single shared output file is used as the placeholder to upload the CSV file containing the indexes. The step is configured to permit use of a control sample, which allows the user to begin the step without any sample inputs.

Parameters

The script accepts the following parameters:

An example of the full syntax to invoke the script is as follows:

User Interaction

The lab scientist begins the step with only a control in the ice bucket. The user uploads the CSV file to the placeholder on the record details screen, and pushes the button to trigger the script.

About the Code

The main method in the script is importIndexes(). This method first calls the function downloadfile() which finds the location of the CSV file on the Clarity LIMS server and reads the content into memory.

The script then generates an XML payload for each index, checks the nucleotide sequence against the acceptable characters, and POSTs to the /api/v2/reagenttypes endpoint. If an index with the same name is already present in the LIMS, the log will be appended with the exception.

This example script produces a log which tracks which indexes were not added to the system as well as the reason. The log displays how many indexes were successfully added to the system and the total number of indexes included in the CSV file.

Assumptions and Notes

You are running a version of Python that is supported by Clarity LIMS, as documented in the Clarity LIMS Technical Requirements.
The attached files are placed on the LIMS server, in the /opt/gls/clarity/customextensions folder.
You will need to update the HOSTNAME global variable such that it points to your Clarity LIMS server.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

example_Indexes.csv:

addindexescsv.py:

Adding Users in Bulk

When Clarity LIMS is replacing an existing LIMS, or if the LDAP integration is not being utilized, a list of current LIMS users and collaborators often already exists in a file.

In this scenario, it is useful to use this file to create users in Clarity LIMS. This topic provides an example script and outlines a strategy for the bulk import of users from a simple CSV file.

Solution

The attached script parses a CSV file containing user information. Based on the columns of data present in the file, the script then creates users and their associated labs in Clarity LIMS.

Parameters

Since this operation is independent of workflow and sample processing, the parameters used to invoke the script differ from those typically used:

An example of the full syntax to invoke the script is as follows:

File format

The format of the file is very flexible. The order of the columns is not relevant.

If the names of your columns do not match the column names in the example file, modify the script so that the column names match yours.

Attached to this topic, you'll find a CSV file containing test data that illustrates the format further. The structure of this file is shown below.

Notice that the first two rows do not contain data of interest, and so the script ignores them. This is controlled by line 38 of the script, which specifies the location of the header row:

About the Code

The main method of interest is importData(). After parsing the file into data structures (COLS and DATA), the data is processed one line at a time using the following pseudocode:

To reduce the number of queries the script makes back to the server, each time a new lab is created it is added to the cache of existing labs that is stored in the LABS dictionary.

Finally, if you set the DEBUG variable to true, the script will stop processing the file after the first successful creation of a user. This is useful as it allows you to test your script using just one line of the CSV file at a time.

Assumptions and Notes

Both of the attached *.py script files are placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder.
The users will be created with the same hard-coded password (abcd1234). It is possible to have the script create a unique password for each user. If this is required, consider having the script e-mail the user with a 'welcome' e-mail outlining their username and password.
The users will be created with permissions to access the Collaborations Interface only. This can be modified as required.
The contents of the 'Institution' column will be used to associate the user with a 'Lab', if a value for this column is provided.
The example code provided is for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

user_List_Test.csv:

userImport.py:

Moving Reagent Kits & Lots to New Clarity LIMS Server

Whereas config slicer is a useful tool for moving configuration data from one server to another and, strictly speaking, reagent lot data is not considered configuration, there is a need for an easy way to transfer reagent lot data to a new Clarity LIMSserver.

Solution

The preventative solution is to not create production reagent kits on the development server. However, if you're reading this it might be too late for that.

This example demonstrates how to use the Clarity LIMS API to extract reagent kit and reagent lot data from one server, transfer the data as an XML file and create the equivalent reagent kits and/or lots on the new server.

This example contains two scripts which can be run on the CLI, one which exports data from the dev server and generates a txt file. The second which uses the exported data to create the kits/ lots on the prod server.

Export

The first script accepts the following parameters:

An example of the full syntax to invoke the first script is as follows:

Import

The second script accepts the following parameters:

An example of the full syntax to invoke the first script is as follows:

About the Code

reagents_export.py

The main method in the script searches the reagent kits and reagent lots endpoints for all the kits and lots or, if the -k parameter is used, for the kits and lots belonging to kits specified in the -k parameter.

The script writes the XML data for each kit and lot to the XML data file.

reagents_import.py

The main method of this script creates the reagent kits and reagent lots in CClarity LIMS. In the case of duplicate reagent kits, the reagent lots are associated with the newly created kits. If the --checkforKits parameter is included, the script does not create kits with duplicate names, and associates the reagent lots with the preexisting kits with the matching reagent lot name.

Assumptions and Notes

You are running a version of Python that is supported by Clarity LIMS, as documented in the Clarity LIMS Technical Requirements.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.Attachments
Script Updates
Jan 2018: reagents_export_jan2018.py and reagents_import_jan2018.py
- Script updated to enable import/export of reagent kits with special characters.
Aug 2018: reagents_import_Aug2018.py
- Script checks for preexisting reagent lots.
- --checkforkits prevents duplicate reagent lots. Lots with non-matching status are considered to already exist (if duplicate kit, name & number).

Attachments

reagents_import.py:

reagents_export.py:

reagents_import_jan2018.py:

reagents_export_jan2018.py:

reagents_import_Aug2018.py:

Programatically Importing the Sample Submission Excel File

Compatibility: API version 2

The Clarity LIMSweb interface provides an example Sample Import Excel file which can be manually uploaded to submit new samples to a selected project within the LIMS.

This application example shows how the Sample Import Sheet can be uploaded programatically with just one change to its format.

Solution

This example uses the same Sample Sheet with an additional column 'Project/Name'. The script processes the Sample Sheet, creates the samples and adds the samples to their respective projects.

This script leverages a python module xlrd, which is not included in the standard python library. It is used to extract data from .xls and .xlsx excel files.

Parameters

The script accepts the following parameters:

An example of the full syntax to invoke the script is as follows:

About the code

parseFile

This method carries out several operations:

Opens the excel file and reads the text
Stores the column headers in a dictionary variable called COLS
Stores the row data accessibly in an array variable called ROWS

createProject

This method in turn carries out several operations:

For each project name the script encounters it searches the LIMS to identify if a project with this name has already been created.
If the project does not exist, the script will create the project. This example script is easily modifiable, however as written:
- Project researcher is System Administrator
- Project open date is today
- No project level UDFs are created

processRows

This method prepares the data needed to create a sample in LIMS:

Assembles the UDF values, the project ID and container ID.
For each non-tube container the script encounters it searches the LIMS to identify if a container with this name already exists.
If the container does not exist the script will create the container.
- If container type is not specified, TUBE will be assumed.
- For TUBE, well location will always be 1:1

The script contains additional supporting methods to generate XML which is POSTED to the API.

Assumptions & notes

The UDFs in the Sample Sheet header have been configured in LIMS prior to submission.
The following column headers are required: Sample/Name, Project/Name and any sample-level UDFs that are mandatory within your system.
The script need not be run on the Clarity server, however it must have a connection to the Clarity LIMS API.
You are using Python version 2.6 or 2.7.
The Python installation contains the non-standard xlrd library.
The _auth_tokens.py file has been updated to include the information for your Clarity installation.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

_auth_tokens.py:

ClaritySampleSheetprojects.xlsx:

SampleSheetImporter.py:

Generating an MS Excel Sample Submission Spreadsheet

An easy way to get sample data into BaseSpace Clarity LIMS is to import a Microsoft® Excel® sample spreadsheet. However, manually configuring the spreadsheet can be time-consuming and error-prone.

Use this Python application example to generate a custom Excel sample submission spreadsheet that meets the specific needs of your lab.

Solution

Parameters

The script accepts the following parameters:

The SampleSubmissionXLSGenerator.py application script is run as follows:

An XLS workbook containing the following worksheets is generated:

Sample Import Sheet: Contains columns for all Sample UDFs/UDTs.
Container Type Names: Contains a list of all container type names.

User interaction

In the Clarity LIMS Operations Interface, the lab manager configures Container Types and Sample UDFs/UDTs.
In the Clarity LIMS Web Interface, the lab manager or lab scientist runs the SampleSubmissionXLSGenerator.py application script, providing the required parameters.
An Excel workbook containing the Sample Import Sheet and Container Type Names worksheets is generated.
The Sample Import Sheet is provided to lab scientists for use when creating lists of samples to import; it may also be used by collaborators to import samples via the LabLink Collaborations Interface.

Using the generated spreadsheet

The spreadsheet will contain red and green column headers. Populate the spreadsheet with sample data:
- Red headers: These columns must contain data.
- Green headers: These columns may be left empty.
Import the spreadsheet into Clarity LIMS.

If there are no sample user-defined fields (UDFs) or user-defined types (UDTs) in the system, the generated spreadsheet will only contain four columns. After configuring the UDFs/UDTs, you can re-run the script to add columns to the spreadsheet that reflect the updated configuration.

Example modifications

You can edit the Python application to include supplementary functions. For example, you may want to use other attributes from the resulting XML to generate additional data entry columns.

Inserting additional non-required & non-configured columns

The SampleSubmissionXLSGenerator.py Python application script adds Sample/Volume and Sample/Concentration columns to the spreadsheet.
The script includes a 'commented' section. You can remove the ### NON-REQUIRED COLUMN MODIFICATION ### comments and use this section to add your own columns.

Assumptions & notes

You have downloaded the attached zip file to the server. On a non-production server use the glsai user account.
Unzip the file to the following directory: /opt/glsclarity/customextensions. The contents of the zip file will be installed within that directory, to /CookBook/SpreadSheetGenerator/
Python 2.7 is installed and configured on the system path.
You can run the SampleSubmissionXLSGenerator.py Python application script on any system running Python, provided it has web access to the Clarity LIMS REST API.
This script is not compatible with the 3.x branch of Python.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

All dependencies are preloaded if you install on a non-production server.

Attachments

python-xls-generator-2.0-bundle.zip:

Assigning Samples to New Workflows

It is sometimes necessary to assign a sample to a new workflow from within another workflow.

You can do this in the BaseSpace Clarity LIMS Operations Interface, by manually adding the sample to the desired workflow. However, it is also possible to perform this action through the API.

Solution

This example shows how to use the API to automate the addition of samples to a specified workflow, based on a UDF value.

The script can be run off any protocol step whose underlying process is configured with Analyte inputs and a single per-input ResultFile output.
The process may have any number of per-all-input ResultFile outputs, as they will be ignored by the script.
A Result File UDF, named Validate, will control which samples will be added to the specified workflow.

Parameters

The script accepts the following parameters:

-i

The limsid of the process invoking the script (Required)

The {processLuid} token

Step 1: Create Result File UDF

Before the example script can be used, first create the ResultFile's Validate UDF in the Operations Interface. This is a single-line text UDF with preset values of 'Yes' and 'No'.

Step 2: Create and Configure Process Type

Also in the Operations Interface, create a new process type named Cookbook Workflow Addition.

This process type must:

Have Analyte inputs.
Have a single per-input ResultFile output.
Apply the Validate UDF on its ResultFile outputs.

Step 3: Configure an EPP call on this process type as follows:

Step 4: Modify the file paths to suit your server's Groovy installation.

Step 3: Create Protocol

Once the process type is created, in the Clarity LIMS Web Interface, create a protocol named Cookbook Workflow Addition Protocol.

This protocol should have one the protocol step - Cookbook Workflow Addition.

Step 6: Configure EPP

Configure the EPP script to automatically initiate at the end of the Cookbook Workflow Addition step:

Step 7: Create workflows

To finish configuration, create two workflows:

Destination Workflow: THis workflow should contain the DNA Initial QC protocol only.
Sending Workflow: This workflow should contain the new Cookbook Workflow Addition Protocol.

About the code

Once the script has processed the input parameters and ensured that all the required information is available, we can start processing the samples to determine if they should be assigned to the new workflow.

To begin, we retrieve the process from the API. This gives us access to the input-output maps of the process. These will be used to determine which ResultFiles we will examine.
Next, we retrieve the protocol step action list. This contains a list of the input analytes' URIs and their next steps.
We then search this list for and collect all analyte URIs whose next action has been set to Mark as protocol complete.
Next, we gather the per-input ResultFile input-output maps. We can collect the ResultFile URIs of those related to the analytes who have been marked as complete. NOTE: It is important that we strip any extra state information from the URIs. The URIs found in the next action list do not contain any state information and, when compared against a non-stripped URI, will return 'false'.
Once we have the ResultFile URIs, we can retrieve them with batchGET. It is important that the list contains unique URIs, as the batchGET will fail otherwise.
After we have retrieved the ResultFiles, we can iterate through the list, adding the parent sample's URI to our list of sample URIs if the ResultFile's Validate UDF is set to Yes. We also increment a counter which will allow us to report to the user how many samples were assigned to the new workflow.
Since we don't assign samples themselves to workflows, we first need to retrieve the samples' derived artifacts. We can do this by iterating through each sample URI, retrieving it, and adding its artifact's URI to a list.
Before we can add the artifacts to the workflow, we need to determine the destination workflow's URI. By retrieving a list of all the workflows in the system, we can find the one that matches our input workflow name.
Assigning artifacts to workflows requires the posting of a routing command to the routing endpoint.
- We first generate the required XML by using a Streaming Markup Builder.
- We then dynamically build our XML by looping inside of the markup declaration. // Create a new routing assignment using the Markup Builder
To create our routing command, we pass the workflow URI and the artifact URIs that we wish to assign to the workflow to a method containing the above code. This will generate the required node.
We then perform an httpPOST to the routing endpoint to perform the action.
Finally, we define our success message to the user. This will allow us to inform the user of the results of the script.

User Interaction

Assuming samples have been placed in the Switching Workflow, the user proceeds as normal through the protocol step.
In the Record Details screen, the user enters Validate values in the ResultFile UDFs.
A message displays, alerting the user of the execution of a custom script.

Assumptions and Notes

The attached file is placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder.
GLSRestApiUtils.groovy is placed in the Groovy lib folder.
The required configuration has been set up, as described in Configuration.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

SwitchingWorkflows.groovy:

Miscellaneous Scripts

Generating a Hierarchical Sample History

Often, workflows do not have a linear configuration. Even in those cases, samples progressing through workflows may be re-queued, replicated, or submitted into a number of parallel workflows.

Attempts to align downstream results with submitted samples may get hindered when trying to account for sample replicates or the dynamic decisions made in the lab.

A visual representation of a samples complete history presented in a clear hierarchical format allows for a digestible report of the work done on the sample. This format provides at-a-glance understanding of the any of the branching or re-queuing of a sample.

Solution

This example describes a python script which, given an artifact, recursively finds all the processes for which that artifact was an input, then finds all the associated output artifacts of that process. This continues for all processes all the way down to the most downstream artifact.

A clear visual representation of the entire history of work on a sample, similar to what was available in the Ops interface, can allow a user to see all the processes and derivations of a sample. This is especially of use for troublesome samples that have branched into numerous downstream replicates which may end up in the same or different sequencing protocol.

Parameters

The script accepts the following parameters:

An example of the full syntax to invoke the script is as follows:

Script output example:

Sibling artifacts will appear aligned vertically with the same indentation. In the above example, 122-1650 Library Pooling (MiSeq) 5.0 created two replicate Analytes, 2-4110 and 2-4111. Analyte 2-4111 was the input to the subsequent step ( 24-1952 ) and no additional work was performed on 2-4110.

Processes performed on an artifact will appear underneath with a tab indentation. In the above example, the first 4 processes ( 3 QC processes and Fragment DNA ) are all using the Root Analyte (CRA201A1PA1) as an input.

Adding Colours with termcolor

Install package termcolor for colour printing support. Entity colours can be configured within the script. Globally turn off colours by changing the variable use_colours to False ( line 16 ).

Assumptions and Notes

You are running a version of Python that is supported by Clarity LIMS, as documented in the Clarity LIMS Technical Requirements.
The glsapiutil.py file is placed in the working directory.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

sample_history_colours.py:

sample_history.py:

Protocol-based Permissions

Laboratories may want to limit what steps Researchers can start. At the time of writing, BaseSpace Clarity LIMS does not natively support protocol-based permissions. However, with an EPP at the beginning of the step we can check to see if the technician/researcher starting the step has been given approval to start the step, and halt the step from starting if they do not have permission. There are several ways this can be done, but special considerations to how these permissions are administered need to be made.

Solution

In order to allow an administrator to easily maintain permissions we will assign users to groups in a config file and our EPP will consume this information. One parameter of the EPP is the groups that are permitted to run the step. When the script is triggered at the start of the step, it will look for the name of the technician starting the step in the config file and determine if the technician is:

Included in the config file and,
Has been assigned to a group that is permitted to run the step.

It is important to remember that by exiting a script with a negative number an EPP will fail and the user will not be able to move forward in the step. We will take advantage of this EPP feature and if the technician/researcher is part of a permitted group the step would start as expected. But, if they are not part of a permitted group, entry into the step will be halted and an error box will appear with whatever the last print message was in the script.

Parameters

The EPP command is configured to pass the following parameters:

An example of the full syntax to invoke the script is as follows:

User Interaction

The config file can reside in any directory that the EPP script will have access to.
The config file that is used in this example has tab delimited columns of Last Name, First Name, and Groups. The permitted groups need to be separated by commas (see the attached example config file). The script can be easily modified if a different format is desire for the config file
The EPP should be "automatically initiated" at "the beginning of the step"

If the user is not allowed to move forward a message box will appear and the step is aborted.

Assumptions and Notes

You are running a version of Python that is supported by Clarity LIMS, as documented in the Clarity LIMS Technical Requirements.
Both of the attached files are placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

Group_Permissions.py:

config.txt:

Self-Incremental Counters

Sequential numbers are sometimes needed for naming conventions and require self-incrementing counters be created and maintained. We do not recommend using the BaseSpace Clarity LIMSdatabase for this. However the Unix “(n)dbm” library provides an easy way to create and manage counters by creating Dbm objects that behave like mappings (dictionaries).

Solution

They way this would work is the attached script ( and the counters file it creates / manages ) would live on the Clarity server and other scripts would depend upon it, and use code similar to below whenever a sequential number was needed. While the script is written in python and uses the dbm module there is nothing inherently Pythonic about this code that couldn’t be reimplemented in another language. However, more information on the Python dbm module can be found at: https://docs.python.org/2/library/dbm.html

User Interaction

The counters live in a file, the path to which is defined in the cm.setPath() command. The file will be created if it doesn’t exist.
The file can contain as many counters as you wish (it’s better to have many counters in one file than many files each with only one counter)
The name of the counter is passed to the function cm.getNextValue(). If this is the first time the counter has been used, it will be created and added to the file.
Each time you want the next value just call cm.getNextValue() for that counter and you will be given the next value.
The counters and the file will look after themselves, you don’t need to explicitly update / save them – this is all handled behind the scenes.

Assumptions and Notes

You are running a version of Python that is supported by Clarity LIMS, as documented in the Clarity LIMS Technical Requirements.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

clarityCounters.py:

Renaming Samples to Add an Internal ID

When a lab takes in samples, they are named by the lab scientist who supplies them. As such, samples may be named in any way imaginable. Conversely, the research facilities processing the samples often have strict naming conventions.

When these two situations occur, there is a need for the sample to renamed with the strict nomenclature of the processing lab. However, in order for the final data to be meaningful for the scientist who supplied the samples, the original name must also be retained.

Solution

In this example, the attached script may be used to rename samples, while retaining their original names in a separate field.

The original name of the sample is saved in a user-defined field (UDF) called Customer's Sample Name, .
The Sample Name for the submitted sample is overwritten, using the specific naming convention of the customer.

It is recommended that the script be launched by a process/protocol step as early as possible in the sample lifecycle.

Parameters

The script is invoked with just three parameters:

An example of the full syntax to invoke the script is as follows:

About the Code

Once the command-line parameters have been harvested, and the API object set up ready to handle API requests, the renameSamples() method is called.

This method implements the following pseudo code:

Assumptions and Notes

Both of the attached files are placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder.
A user defined field, named Customer's Sample Name has been created at the analyte (sample) level.
The Customer's Sample Name field is visible within the LabLink Collaborations Interface.
You will need to update the HOSTNAME global variable such that it points to your Clarity LIMS server.
You will need to implement your own logic to apply the new sample name in Step 4 of the script.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

renameSamples.py:

Creating Custom Sample Sheets

Clarity LIMS can create Illumina-based MiSeq and HiSeq 'flavoured' sample sheets. However, if you are using algorithms or indexes outside of those suggested by Illumina, you may be required to produce your own 'custom' sample sheet.

This example script provides an algorithm that harvests the contents of a flow cell (or any container) that may contain pooled samples, and uses the resulting information to output a custom sample sheet.

Solution

The attached script uses aggressive caching in order to execute as quickly as possible. When extreme levels of multiplexing are involved, the cache size could consume considerable quantities of memory, which may be counter-productive.

The algorithm has been tested on the following unpooled analytes; pooled analytes; and 'pools of pools' — in which multiple homogeneous or heterogeneous pools are themselves combined to produce a new pool.

In these tests, the algorithm behaved as expected. If you find this is not the case please contact Illumina Support team.

The algorithm uses recursion to determine the individual analytes (samples) and their indexes that are located on the flow cell lane(s).
To determine whether an analyte constitutes a pool or not, the script looks at the number of submitted samples with which the analyte is associated.
- If the answer is 1, the analyte is not a pool.
- If the answer is greater than 1, the analyte is considered to be a pool.
If a pooled analyte is discovered, the inputs of the process that produced the pooled analyte are gathered and the same test is used to see if they themselves are pools.
This gathering of ancestor analytes continues until the contents of each pool have been resolved, at which point the script produces some example output.
- Note that while it is expected that you will augment this section of the script with the fields you need for your custom sample sheet, the logic to recursively identify analytes that are not themselves pools should be applicable to all.

Parameters

The script is invoked with just three parameters:

An example of the full syntax to invoke the script is as follows:

Assumptions and Notes

Both of the attached files are placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder.
You will need to implement your own logic to gather the fields required for your specific sample sheet.
You will need to update the HOSTNAME global variable such that it points to your Clarity LIMS server.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

flowcellContents.py:

Downloading a File and PDF Image Extraction

Compatibility: API version 2 revision 21

A lab user can attach a PDF file containing multiple images to a result file placeholder, the script extracts the images which are automatically attached to corresponding samples as individual result files.

Images within a PDF may be in a number of formats, and will usually be .ppm or .jpeg. The example script includes additional code to convert .ppm images to .jpeg.

Prerequisites

You have the package 'poppler' (linux) installed.
You have defined a process with analytes (samples) as inputs, and outputs that generate the following:
- A single shared result file output.
- A result file output per input.
You have added samples to Clarity LIMS.
You have uploaded the Results PDF to Clarity LIMS during 'Step Setup'.
Optionally, if you wish to convert other file types to .jpeg, installation of ImageMagick (linux package).

Code Example

How it works:

The lab scientist runs a process/protocol step and attaches the PDF in Clarity LIMS
When run, the scrip uses the API and the 'requests' package available in python to locate and retrieve the PDF.
The script generates a file for each image.
Files are named with LUIDs and well location.
The images are attached to the ResultFile placeholder. **The file names must begin with the {outputFileLuids} for automatic attachment.**

Additionally, this script converts the images to JPEG format for compatibility with other LIMS features.

Step 1. Create the script

Part 1 - Downloading a file using the API

The script will find and get the content of the PDF through 2 separate GET requests:

Following the artifact URI using the {compoundOutputFile0} to identify the LUID of the PDF file.
Using the ~/api/v2/files/{luid}/download endpoint to save the file to the temporary working directory.

def dl_pdf( artifactluid_ofpdf ): ### finds the file LUID from artifact LUID of the PDF
 artif_URI = BASE_URI + "artifacts/" + artifactluid_ofpdf
 artGET = requests.get(artif_URI, auth=(args[ "username" ],args[ "password" ]))
 root = ET.fromstring(artXML)
 for id in root.findall("{http://genologics.com/ri/file}file"):
     fileLUID = id.get("limsid")
 file_URI = BASE_URI + "files/" + fileLUID + "/download"
 fileGET = requests.get(file_URI, auth=(args[ "username" ],args[ "password" ])) 
 with open("frag.pdf", 'wb') as fd: 
     for chunk in fileGET.iter_content():
         fd.write(chunk)

The PDF is written to the temporary directory.

The script performs a batch retrieval of the artifact XML for all samples. Subsequently a python dictionary is created defining which LIMS id corresponds to a given well location.

Part 2 - Extracting images as individual results files

The script uses the pdfimages function to extract the images from the PDF. This function is from a linux package and can be called using the the os.system() function.

This example script extracts an image from each page, beginning with page 10. Files are named with LUIDs and well location. The file names must begin with the {outputFileLuids} for automatic attachment.

page = 10
for each in range(len(wells)):
    well_loci = wells[each]        
    if well_loci in well_map.keys():
        limsid = well_map[well_loci]
	filename = limsid + "_" + well_loci
        command = 'pdfimages ' + 'frag.pdf' +' -j -f ' + str(page) + ' -l ' + str(page) + ' ' + filename
	os.system(command)

Additionally, the cookbook example script converts the image files to JPEG for compatibility with other features in Clarity LIMS. The script uses 'convert', a function from the linux package called 'ImageMagick'. Like the 'pdfimages' function, 'convert' can be called in a python script through the os.system() function.

Step 2. Configure the Process

The steps required to configure a process to run EPP are described in the Process Execution with EPP/Automation Support example, namely:

Configure the inputs and outputs.
On the External Programs tab, select the check box to associate the process with an external program.

Parameters

The process parameter string for the external program is as follows:

bash -c "/usr/bin/python /opt/gls/clarity/customextensions/pdfimages.py -a {compoundOutputFileLuid0} -u{username} -p {password} -f '{outputFileLuids}'"

The EPP command is configured to pass the following parameters:

Step 3. Run the Process

Record Details page in Clarity LIMS

The placeholder where the lab scientist can upload the PDF.

External Process ready to be run. 'Script generated' message marking individual result file placeholder.

Expected Output and Results

External program was run successfully. Individual result files named with artifact LUID and well location.

Attachments

pdfimages.py:

Submit to a Compute Cluster via PBS

There are some algorithms which work well on massively parallel compute clusters. BCL Conversion is such an example and is the basis for this application example.

The concepts illustrated here are not limited to BCL Conversion; as such, they may also be applied in other scenarios. For instance the example PBS script below uses syntax for illumina's CASAVA tool, but could easily be re-purposed for the bcl2fastq tool.

Also, in this example, Portable Batch System (PBS) is used as the job submission mechanism to the compute cluster, which has read/write access to the storage system holding the data to be converted.

Example PBS file

For illustrative purposes, an example PBS file is shown here. (As there are many ways to configure PBS, it is likely that the content of your PBS file(s) will differ from the example provided.)

#!/bin/bash

#PBS -N run_casava

#PBS -q himem

#PBS -l nodes=1:ppn=20

export RUN_DIR=/data/instrument_data/120210_SN1026_0092_BXXXXXXXXX

export OUTPUT_DIR=/data/processed_data/processed_data.1.8.2/120210_SN1026_0092_BXXXXXXXXX

export SAMPLE_SHEET=/data/SampleSheets/samplesheet.csv

cd $PBS_O_WORKDIR

source /etc/profile.d/modules.sh

module load casava-1.8.2

export TMPDIR=/scratch/

export NUM_PROCESSORS=$((PBS_NUM_NODES*PBS_NUM_PPN))

configureBclToFastq.pl --input-dir $RUN_DIR/Data/Intensities/BaseCalls --output-dir $OUTPUT_DIR 
 --sample-sheet $SAMPLE_SHEET --force  --ignore-missing-bcl --ignore-missing-stats
 --use-bases-mask y*,I6,y* --mismatches 1

cd $OUTPUT_DIR

make -j $NUM_PROCESSORS

Solution

Process configuration

In this example, the BCL Conversion process is configured to:

Accept a ResultFile input.
Produce at least two ResultFile outputs.

The process is configured with the following process level UDFs:

The syntax for the external program parameter is as follows:

python /opt/gls/clarity/customextensions/ClusterBCL.py -l {processLuid} -u {username} -p {password}
 -c {udf:Number of Cores} -m {udf:Number of mismatches} 
 -b "{udf:Bases mask}" -a {compoundOutputFileLuid0}.txt -e {compoundOutputFileLuid1}.txt -r "{udf:Run Name}"

Parameters

User Interaction and Results

The user runs the BCL Conversion process on the output of the Illumina Sequencing process. The sequencing process is aware of the Run ID, as this information is stored as a process level user-defined field (UDF).
The user supplies the following information, which is stored as process level UDFs on the BCL Conversion process:
- The name of the folder in which the converted data should be stored.
- The bases mask to be used.
- The number of mismatches.
- The number of CPUs to dedicate to the job.
The BCL Conversion process launches a script (via the EPP node on the Clarity LIMS server) which does the following:
- Builds the PBS file based on the user's input.
- Submits the job by invoking the 'qsub' command along with the PBS file.

Assumptions and Notes

Portable Batch System (PBS) is used as the job submission mechanism to the compute cluster.
The compute cluster has read/write access to the storage system holding the data to be converted.
There is an EPP node running on the Clarity LIMS server.
The PBS client tools have been installed and configured on the Clarity LIMS server, such that the 'qsub' command can be launched directly from the server.
When the 'qsub' command is invoked, a PBS file is referenced; this file contains the job description and parameters.
The script was written in Python (version 2.7) and relies upon the GLSRestApiUtil.py module. Both files are attached below. The required Python utility is available for download at Obtain and Use the REST API Utility Classes.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

ClusterBCL.py:

Copying Output UDFs to Submitted Samples

It may at times be desirable to take key data derived during a workflow and copy it to the submitted sample. There are several reasons why this could be useful:

All key data is combined with all of the submitted sample's data, and becomes available on a single object.
Key data determined during a workflow can be made immediately available to external collaborators via the LabLink Collaborations Interface, since these users have access to their submitted samples.
Searching for data becomes easier as the data is not spread over several entities.

This example provides a script to allow the copying to occur, and describes how the script can be triggered.

Solution

To illustrate the script, we will copy a user-defined field (UDF) that is collected on the outputs of a QC type protocol step.

This UDF is named Concentration, and it is stored on the individual ResultFile entities associated with the analytes that went through the QC protocol step.
Once the QC protocol step has completed, the Concentration UDF values are copied to a UDF on the submitted Samples, which is called Sample Conc.
The QC protocol step is configured to invoke the script from a button on the step's Record Details screen.

Parameters

The EPP command is configured to pass the following parameters:

An example of the full syntax to invoke the script is as follows:

/usr/bin/python /opt/gls/clarity/customextensions/setUDFonSample.py -l MCL-SA1-131211-24-6813 -u admin -p securepassword -f "Sample Conc., Units" -t ResultFile -v "Concentration, Conc.Units"

User Interaction

Once the script has copied the UDF values from the output to the submitted samples, the values are visible in the Submitted Samples view of the Operations Interface:

Similarly, assuming that the Sample Conc. UDF is set to be visible within LabLink Collaborations Interface, collaborators are able to see these values in their interface:

About the Code

The main method of interest is setUDFs(). This method carries out several operations:

It harvests just enough information so that the objects required by the subsequent code can retrieve the required artifacts using the 'batch' API operations. This involves using some additional code to build and manage the cache of artifacts retrieved in the batch operations, namely:
- cacheArtifact()
- prepareCache()
- getArtifact()
The cached artifacts are then accessed, and for each one:
- The corresponding sample is retrieved via the API.
- The sample XML is updated such that the UDF value is obtained from the artifact by calling api.getUDF(), and stored on the sample by calling api.setUDF().
- The sample XML is saved by calling api.updateObject().
Finally, a meaningful message is reported back to the user via the contents of the successMsg and/or failMsg variables.

Assumptions and Notes

Both of the attached files are placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder.
You will need to update the HOSTNAME global variable such that it points to your Clarity LIMS server.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

setUDFonSample.py:

Illumina LIMS Integration

This combination of configuration and python script can be used to set up an integration between Clarity LIMS and Illumina LIMS. There are 2 main parts to this integration:

Generating a sample manifest from Clarity LIMS to import the samples into Illumina LIMS.
Once the analysis is completed, automatically parsing in the results from Illumina LIMS into Clarity LIMS.

Disclaimer: This application example is provided as is, with the assumption that anyone deploying this to their LIMS server will own the testing and customization of the configuration and scripts provided.

Installation

Protocol and Workflow Import

Using the config-slicer tool import the configuration file attached ([IlluminaLIMSIntegration.xml]) as the glsjboss user with the following command:

java -jar /opt/gls/clarity/tools/config-slicer/config-slicer-3.<x>.jar -o import -k IlluminaLIMSIntegration.xml -u <user> -p <password> -a https://<hostname>/api

Script and Configuration Setup

As the glsjboss user on the Basespace Clarity LIMS server, copy the Illumina LIMS manifest template file (IlluminaLIMS_Manifest_Template.csv) attached below to the following folder: /opt/gls/clarity/customextensions/IlluminaLIMS
On the Illumina LIMS Windows/Linux workstation create a folder called Clarity_gtc_Parser and do the following :
- Copy the clarity_gtc_parser_v2.py file into this folder and update the following configuration parameters:\
  Parameter Description
  NOTE This script supports the current LIMS gtc version, 3, and will be compatible with version 5 when available.
  - Download and copy the IlluminaBeadArrayFiles.py to the same folder (Also Available here). Edit the file with variables: API, gtc file path and username/password for clarity API for the relevant server.
  - Create an empty file called processed_gtc.txt in the gtc files directory.
  - Setup a scheduled task(windows) or cronjob(linux) to run this python script every 10 minutes. (Assuming Python (version 2.7.1) is installed and available on the workstation).

Workflow

The configuration attached to this page contains an example protocol with two Steps.

Prerequisites:

Samples have been accessioned into Clarity LIMS with the following sample metadata as Submitted Sample UDFs:

Is Control
Institute Sample Label
Species, Sex
Comments
Volume (ul)
Conc (ng/ul)
Extraction Method
Parent 1
Parent 2
Replicate(s)
WGA Method (if Applicable)
Mass of DNA used in
WGA
Tissue Source

Protocol Step: IlluminaLIMS Sample Prep

This manual step is meant to be merged into the last step of a Sample Prep Protocol. It has the configuration to generate a Derived Sample with the LIMSID in the name so that the name can be unique, and used to match data back using the data parser to the next step.

Protocol Step: IlluminaLIMS Manifest and Analysis

This requires the user to perform the following steps:

Generate the Illumina LIMS Manifest using the button provided called "Generate Illumina LIMS Manifest".
Download the manifest and import this to IlluminaLIMS Project Manager under the correct institution.
Run the appropriate lab workflow on Illumina LIMS
After the Illumina LIMS analysis is complete, allow 10minutes and come back to Clarity LIMS to find the step in progress and ensure the following derived sample UDFs are populated:
- Autocall Version
- Call Rate, Cluster File
- GC 10
- GC 50
- Gender
- Imaging Date
- LogR dev
- Number of Calls
- Number of No Calls
- SNP Manifest
- Sample Plate
- Sample Well, 50th Percentiles in X
- 50th Percentiles in Y
- 5th Percentiles in X
- 5th Percentiles in Y
- 95th Percentiles in X
- 95th Percentiles in Y
- Number of Calls Number of Intensity Only Calls
- Number of No Calls

Attachments

IlluminaBeadArrayFiles.py:

IlluminaLIMSIntegration.xml:

IlluminaLIMS_Manifest_Template.csv:

clarity_gtc_parser_v2.py:

Parsing Sequencing Meta-Data into Clarity LIMS

Once a sequencing run has occurred, there is often a requirement to store the locations of the FASTQ / BAM files in Clarity LIMS.

For paired-end sequencing, it is likely that the meta-data file that describes the locations of these files will contain two rows for each sample sequenced: one for the first read, and another for the second read.

Such a file is illustrated below:

Column 2 of the file, Sample ID, contains the LIMS IDs of the artifacts for which we want to store the FASTQ file values listed in column 3 (Fastq File).

This example discusses the strategy for parsing and storing data against process inputs, when that data is represented by multiple lines in a data file.

Solution

The attached script will parse a data file containing multiple lines of FASTQ file values, and will store the locations of those FASTQ files in user-defined fields.

Process configuration

In this example, the process is configured to have a single shared ResultFile output.

Parameters

The EPP command is configured to pass the following parameters:

An example of the full syntax to invoke the script is as follows:

/usr/bin/python /opt/gls/clarity/customextensions/parseMetadata.py -l 24-9953 -u admin -p securepassword -s http://192.168.9.123:8080/api/v2/steps/24-9953 -i 92-20553

User Interaction

The user-interaction is comprised of the following steps:

Step 1

The user runs the process up to the Record Details screen as shown in the following image. Note that initially:

The Sequencing meta-data file is still to be uploaded.
The values for the R1 Filename and R2 Filename fields are empty.

Step 2

The user clicks Upload file and attaches the meta-data file. Once attached, the user's screen will resemble this:

Step 3

Now that the meta-data file is attached, the user clicks Parse Meta-data File. This invokes the parsing script.

If parsing was successful, the user's screen will resemble Figure 4 below.

Note that the values for the R1 Filenames and R2 Filenames have been parsed from the file and will be stored in Clarity LIMS.

About the code

The key methods of interest are main(), parseFile() and fetchFile(). The main() method calls parseFile(), which in turn calls fetchFile().

fetchFile() method

The fetchFile() method relies upon the fact that the script is running on the Clarity LIMS server, and as such has access to the local file system in which the file (uploaded in Step 2) now resides.

Thus, fetchFile() can use the API to:

Convert the LIMSID of the file to the location on disk.
Copy the file to the local working directory, ready to be parsed by parseFile().

parseFile() method

The parseFile() method creates two data structures that are used in the subsequent code within the script:
- The COLS dictionary has the column names from the first line of the file as its key, and the index of the column as the value.
- The DATA array contains each subsequent line of the file as a single element. Note that this parsing logic is overly-simplistic, and would need to be supplemented in a production environment. For example, if the CSV file being parsed does not have the column names in the first row, then exceptions would likely occur. Similarly, we assume the file being parsed is CSV, and likewise any data elements which themselves contain commas would likely cause a problem. For the sake of clarity such exception handling has been omitted from the script.
Once parseFile() has executed successfully the inputs to the process that has invoked the script are gathered, and 'batch' functionality is used to gather all of the artifacts in a single batch-retrieve transaction.
All that remains is to step through the elements within the DATA array, and for each line, gather the values of the Fastq File and Sample ID columns. For each Sample ID value:
- The corresponding artifact is retrieved.
- Depending on whether the value of the Fastq File column represents the filename for the first or second read, either the R1 Filename or the R2 Filename user-defined field is updated.
Once the modified artifacts have been saved, the values will display in the Clarity LIMS Web Interface.

Assumptions and Notes

Both of the attached files are placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder.
You will need to update the HOSTNAME global variable such that it points to your Clarity LIMS server.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

parseMetadata.py:

Generic CSV Parser Template (Python)

Compatibility: API version 2

Many different types of CSV files are attached to BaseSpace Clarity LIMS. This example provides a template for a script that can parse a wide range of simple CSV files into Clarity LIMS. The user can change parameters to read the format of the file to be parsed.

The Lab Instrument Tool Kit includes the parseCSV script, which allows for parsing of CSV files. However, this tool has strict limitations in its strategy for mapping data from the file to corresponding samples in Clarity LIMS. For information on the parseCSV script, refer to the Clarity LIMS Integrations and Tool Kits documentation.

Solution

Protocol step configuration

CSV files are attached to a protocol step. Artifact UDFs, where data will be written to, need to be configured for artifacts and for the protocol step.

Parameters

The script accepts the following parameters:

An example of the full syntax to invoke the script is as follows:

/usr/bin/python /opt/gls/clarity/customextensions/genericParser.py -u {username} -p {password} -s {stepURI:v2} -r {compoundOutputFileLuid0}

About the Code

The script contains an area with a number of configurable variables. This allows a FAS or bioinformatician to customize the script to parse their specific txt file. The following variables within the script are configurable:

MAPPING MODES

There are many attributes of samples which can be used to map the data in the text file with the corresponding derived samples in Clarity LIMS. The script should be configured such that one of these modes is set to True.

Three modes are available:

For any of the three modes, a mapping column value must be explicitly given. The value is the index of the column containing the mapping data (either artifact name, well location, or UDF value).

If using the mode MapTo_UDFValue, a UDFName must be given. This is the name of the UDF in clarity which will be used to match the value found in the mapping column.

artifactUDFMap

artifactUDFMap = {
    "Concentration" : "Concentration",
    "Avg. Size" : "Average Size"
}

This Python dictionary maps the name of columns in the txt file to artifact UDFs for the outputs of the step. The data from these columns in the file will be written to these UDFs for the output artifacts. The dictionary can contain an unlimited number of UDFs. The dictionary keys, (left side), are the names of the columns in the txt file, and the dictionary values, (right side), are the names of the UDFs as configured for the artifacts.

Assumptions and Notes

You are running a version of Python that is supported by Clarity LIMS, as documented in the Clarity LIMS Technical Requirements.
The attached files are placed on the LIMS server, in the /opt/gls/clarity/customextensions folder.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

genericParser.py:

glsfileutil.py: