Obfuscating Sensitive Data in Scripts

If a BaseSpace Clarity LIMS script is run in an automation context, it is easy to obfuscate usernames and passwords by choosing the appropriate tokens ({username} or {password}) to be passed in as run-time arguments.

However, this type of functionality is not easily available outside of automations, and it is often necessary to store various credentials on machines that need to interact with the LIMS API, database, or some other protected resource. This article explains how to use cryptography in Python to protect and obfuscate these important authentication tokens.

Background

Many of the API Cookbook examples use a simple auth_tokens.py file that has usernames and passwords stored in plain text. This file can be compiled in Python, simply by importing it at a Python console:

import auth_tokens

print(auth_tokens.username)  # sanity check only

Importing this file creates an auth_tokens.pyc file, a byte-compiled version of the source file. The source file can now be deleted, providing a first, rudimentary level of security. However, the credentials can still be retrieved quite easily, because a byte-compiled file stores string constants unchanged. Even if the permissions on this file are restricted, this solution does not present a suitable level of security for most IT administrators. It does, however, allow us to easily prototype our code, hence its use in Cookbook examples.
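To see why deleting the source file offers little protection, consider the following sketch, which assumes Python 3. It compiles a two-line stand-in for auth_tokens.py in memory, serializes the resulting code object with marshal (the serialization used for the body of a .pyc file, after a short header), and reads the string constants straight back out. The inline source string and its values are illustrative only.

```python
import marshal

# Illustrative stand-in for the contents of auth_tokens.py.
source = "username = 'testuser'\npassword = 'testpass'\n"

# Byte-compile the source, as importing the module would.
code = compile(source, "auth_tokens.py", "exec")

# Serialize and reload the code object, mimicking a round trip
# through the body of a .pyc file.
blob = marshal.dumps(code)
recovered = marshal.loads(blob)

# String constants survive compilation unchanged.
leaked = [const for const in recovered.co_consts if isinstance(const, str)]
print(leaked)
```

Anyone who can read the .pyc file can do the same, which is why the rest of this article encrypts the values before they are stored.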

Assumptions

You have pycrypto installed (either through the OS package manager or pip).
You have generated a secret key of random ASCII characters (the easiest way to do this is to button-mash on a US-layout keyboard and include a lot of symbols).
You already have a plain-text auth_tokens.py file. An example is attached at the bottom of this article.
You have access to the Python or iPython command line console.
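As an alternative to mashing keys by hand, a key of random printable ASCII characters can be generated with the secrets module from the Python standard library. This is a minimal sketch assuming Python 3.6 or later. The function name generate_secret_key and the default length of 32 are illustrative choices and are not part of cred.py.

```python
import secrets
import string

# Letters, digits, and symbols available on a US-layout keyboard.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_secret_key(length=32):
    """Return a random key drawn from printable ASCII characters."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


key = generate_secret_key()
# repr() adds quotes and escapes, so the output can be pasted into Python source.
print(repr(key))
```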

Cryptography in Python

The third-party pycrypto library can easily be installed using the operating system's package manager or the pip installation tool. It contains many different encryption algorithms and gives us a straightforward interface around which to wrap our own encryption objects and accessor functions.

Towards a Better auth_tokens.py

The goal is to be able to create a flat text file containing obfuscated usernames, passwords, hostnames, and so on. To do this, use a utility class called ClarityCred that provides encryption and decryption functionality using the ARC4 cipher from pycrypto. The ClarityCred class is provided in cred.py, attached at the bottom of this article.

While the use of ARC4 is considered deprecated in favor of stronger encryption algorithms, such as AES, the ARC4 example lends itself to easier understanding. ARC4 simply requires a secret key and a salt size to be specified. The secret key can be generated at random using any preferred method. Purely for ease of demonstration, both the secret key and the salt size are hard-coded in cred.py. Ideally, they should be stored externally.

After applying the ARC4 encryption, the ClarityCred class wraps base64 encoding around it to obfuscate the data further.

Assume that you need to store a username, password, and hostname in auth_tokens.py, and that this information is currently stored in plain text in another file called auth_tokens_plain.py. The usage is as follows.

1. Open a Python console, and import ClarityCred from cred.py.
2. Call the ClarityCred.encrypt() static method on the plain-text username, password, and hostname strings.
3. Copy and paste the resulting values into auth_tokens.py.
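The steps above can also be scripted rather than copied by hand. The sketch below, assuming Python 3.7 or later, builds the text of an auth_tokens.py file from a dictionary of plain-text values. The helper name render_auth_tokens is hypothetical. So that the example runs without cred.py, it uses a base64 placeholder that is NOT encryption. In real use, ClarityCred.encrypt would be passed in its place.

```python
import base64


def render_auth_tokens(plain_values, encrypt):
    """Return the text of an auth_tokens.py file with every value encrypted."""
    lines = []
    for name, value in plain_values.items():
        # repr() quotes the token so the line is a valid Python assignment.
        lines.append("{} = {!r}".format(name, encrypt(value)))
    return "\n".join(lines) + "\n"


def placeholder_encrypt(text):
    # Stand-in only: base64 is an encoding, not encryption.
    # Pass ClarityCred.encrypt here instead when cred.py is available.
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


plain = {
    "username": "testuser",
    "password": "testpass",
    "hostname": "https://encryptiontest.claritylims.com",
}

rendered = render_auth_tokens(plain, placeholder_encrypt)
print(rendered)
```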

The following image illustrates steps 1 and 2, using an existing auth_tokens_plain.py file:

The old auth_tokens_plain.py looked like this:

username = 'testuser'
password = 'testpass'
hostname = 'https://encryptiontest.claritylims.com'

The new auth_tokens.py looks like this:

username = 'zq1AwnqIkfA=$YFY1UuO1r6edu7qPnN9/l3kMI15ZG1JAsH7IhnxnNvYulMndhYh6lxjVBfFwjN9sZEqPM0Qlx6kjq3fbht/FlRrgklDL79H7NiUP6uYM2qVltPloRA4g8SiphF3KHx4gVTE93Ku58sFCgu1rnH5u6tkCz98v0R7PsuIOW1CDMi9zSToIu+IkcYDPPYcD1b4z8ojez/7lczunaDfrmPhwopyyUiETu9BR49Bwp5fz4XSWICZFGCd9AjoEg/FTE+/X18f+0pIz0viXQyN+JjE3vJkpNsRY2Z3d72sPgQmFFZhd48m+POUtD1UXLXhaijdxp78QTcEp7AHY+TiM8hsXT7BX1Q=='

password = '9qW5BftGyXY=$6GL1t/Zl1CbSmB7Qq54uf2TJ5fI8GUlW9NdBnumkTtF/X27WLEsr1+C0ilXQX6jnLm4kzR+5pCVgnz4xz6/80/dMLMlTll6tOvCJgPU4ZkRpkUYmcPVbrp+X3azR7I024O8UjV/JeJYV869h3kvdPyWJGXRH4oJgs5NTJKI2y6URBs0wlrlgBuZ2YkO855ZGPw9J07UMM606q9xERRzQ+LT1XLRzSCuFnuSoDVEhshhYqZ/jpYWDHvA6Z5+YTYI/i099iYZ+WQdJAiU9hcgkUnWCybjcwivvHG6vAIROroLqlOefo+hrJsVFBA3uDaPS8pkgMVsKMPUGeft6vx4NgN/jaw=='

hostname = 'Q+oyq2m9Nv8=$rhgeJOMdm/M+dDNlSbBA3RCsUoo0Ts65G7lePvuajRmsLSNC5Qo5bwagRuyat0ztpeZrUmD8xTxTvhUBvZYDlM6GBLsq5drBP6PFh/lplxb6O8YiSRXrboFov8tRnu6GbaTfGR8WV7s8vBZsXhrhlPn67p7yalJLnHWb9VOKhx8AgCTtytQkkEwmpm2vbDwDha9kMdK63IrOSp2jmRaI/9X3xsd4upqaxvX7zrEJ8ruGU/szN0ITxTK1rprnowpyXfBRiOEcrI7uh1bg73oqOETn3pB/uTrGkhGETKYB2aHaewwWMccbeZTgEPT0kDmuJdpoGYy+p+gxSoR9Arh3JtREIA=='

Examples of the plain-text auth_tokens_plain.py and encrypted auth_tokens.py are attached at the bottom of this article.

Using the New auth_tokens.py in Your Scripts

Now that the new auth_tokens.py is ready to use, you can import it and create the corresponding PYC file to provide that extra level of security, as previously discussed. You can remove the PY file and ship the PYC file everywhere it is required.

It may also be a good idea to restrict the read/write/execute permissions on the file to the system user that is calling the file (usually glsai in Clarity LIMS installations).
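One way to apply such a restriction from Python is sketched below. It assumes a POSIX system and that the script runs as the user who owns the file (for example, glsai). The function name restrict_to_owner is hypothetical, and the mode chosen here (owner read and write only) is one reasonable reading of the advice above, not a Clarity LIMS requirement.

```python
import os
import stat


def restrict_to_owner(path):
    """Allow only the file's owner to read and write it (mode 0600)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    # Return the resulting permission bits so the caller can verify them.
    return stat.S_IMODE(os.stat(path).st_mode)


# Example call, once the file exists:
#   restrict_to_owner("auth_tokens.pyc")
```

Changing the owner of the file typically requires administrator privileges and is left out of this sketch.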

To use the values in this file in the code, we need to use the decrypt() function in ClarityCred. Consider the simple example of initializing a glsapiutil api object. For reference, the example current directory listing looks like this:

Notice the .py source files are removed wherever possible.

Using a Python console, the normal api invocation (using a plain-text auth_tokens file) would look as follows.

import glsapiutil
import auth_tokens_plain
api = glsapiutil.glsapiutil2()
api.setHostname( auth_tokens_plain.hostname )
api.setVersion( 'v2' )
api.setup( auth_tokens_plain.username, auth_tokens_plain.password )

Now, however, with our encrypted tokens, we import ClarityCred and wrap each value in a call to ClarityCred.decrypt() to decrypt it on the fly:

import glsapiutil
import auth_tokens
from cred import ClarityCred
api = glsapiutil.glsapiutil2()
api.setHostname( ClarityCred.decrypt( auth_tokens.hostname ) )
api.setVersion( 'v2' )
api.setup( ClarityCred.decrypt( auth_tokens.username ), ClarityCred.decrypt( auth_tokens.password ) )

This method provides a relatively robust solution for encrypting and obfuscating sensitive data and can be used in any Python context, not just for Clarity LIMS API initialization. By further ensuring that only the auth_tokens.pyc file is shipped and copied with restricted read/write/execute permissions, this method should help satisfy IT security requirements.

However, the matter of storing the secret key externally remains. One idea is to store the secret key in a separate file and encrypt that file using openssl or an OpenPGP key. While the problem of storing each piece of information in encrypted format likely never fully goes away, the use of multiple methods of encryption can offer better protection and peace of mind.
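As a starting point for keeping the key out of the code, the sketch below reads the secret key from a separate file whose location is given by an environment variable. It assumes Python 3. The variable name CLARITY_CRED_KEY_FILE and the function name load_secret_key are hypothetical and are not part of cred.py. The sketch covers only the separation of key and code. Decrypting a key file protected with openssl or OpenPGP is not shown.

```python
import os

KEY_FILE_ENV = "CLARITY_CRED_KEY_FILE"  # hypothetical variable name


def load_secret_key(env=None):
    """Read the secret key from the file named by the environment variable."""
    env = os.environ if env is None else env
    path = env.get(KEY_FILE_ENV)
    if not path:
        raise RuntimeError("{} is not set".format(KEY_FILE_ENV))
    with open(path, "rb") as handle:
        # Drop only a trailing newline, which text editors often add.
        key = handle.read().rstrip(b"\r\n")
    if not key:
        raise ValueError("secret key file is empty")
    return key


# Example call, once the variable points at a key file:
#   secret_key = load_secret_key()
```

The key file can then be given the same restricted permissions as auth_tokens.pyc and kept out of version control.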