Convert CSV to Excel

Available from: BaseSpace Clarity LIMS v3.1.0

Included as part of the NGS Extensions package, the convertToExcel script is designed to convert separated-value files (eg CSV) to Microsoft Excel spreadsheets of type XLS and XLSX.

  • The script can be run on comma- and tab-separated files with any file extension. The original file is not edited, unless its name matches the name given for the output file.

  • The script can update an existing Excel spreadsheet or produce an entirely new one.

  • When updating an existing Excel spreadsheet, if this spreadsheet does not have a file extension XLSX will be used by default.

  • A single worksheet is updated with the input file contents. When producing a new Excel spreadsheet, this worksheet name may optionally be specified. Otherwise, The default name will be used.

  • The worksheet name must be provided when updating an existing Excel spreadsheet. If the worksheet exists, its contents will be overwritten with the contents from the input file. Otherwise, a new worksheet will be added.

  • Each line in the input file becomes a row in the output file, and its values are placed into the cells of that row. The first value in the input file becomes the value of cell A:1, and so forth.

  • When updating an existing worksheet, cells that are not overwritten by values from the input file are left untouched. For example, there may be a footer section that is not updated.

  • The Excel file produced may be written to a location accessible from the LIMS server (a location on the server, a mounted drive, or a shared file store for example) and also attached in the LIMS via quick Attach. If both options are specified, the script will warn if the file cannot be written and report an error if the file cannot be uploaded.

  • The cell types currently supported are Numeric, Boolean, Blank, and String.

  • Supported number formats include period (.) as the decimal point and numbers that include an exponential (eg, 1e-8 or 4E2).

  • Boolean values are case-insensitive.

Macros and equations are not supported when updating an existing Excel file. Other cells in the file that depend on the new values will not be updated when the worksheet is.

Script Parameters and Usage

Parameter

Description

Remark

-u

LIMS username (Required)

-p

LIMS password (Required)

-i

LIMS step URI (Required)

-logFileLIMSID

LIMSID of the log file that will be generated (Required)

-logFileName

Custom name to be given to the generated log file (Optional) The name must begin with 'logFileLIMSID', or else the file will not be attached to the LIMS.

  • If the name does not end in '.html', then '-LogFile.html' will be appended to the end of the file name.

  • Default is ${logFileLIMSID}-LogFile.html.

-srcLIMSID

The LIMSID of the file to be converted (Required if not providing inputFileName)

  • If no local file is found in the current directory whose name starts with srcLIMSID, an attempt will be made to download the file from the LIMS.

  • You may only provide one of either inputFileName or srcLIMSID.

-inputFileName

Name of the input file to be converted (Required if not providing srcLIMSID)

  • Assumes that the file is accessible from the LIMS server (either local to the server or available in a mounted drive or shared file store).

  • You may only provide one of either inputFileName or srcLIMSID.

-destLIMSID

The LIMSID of the result file to attach the output XLS/XLSX file to. (Required if not providing outputFileName, or if using quickAttach.)

  • At least one of outputFileName and destLIMSID are required.

-outputFileName

Name to be given to the output XLS/XLSX file. (Required if not providing destLIMSID.)

  • Default name is {destLIMSID}.xls(x), based off the file type being created.

  • At least one of outputFileName and destLIMSID are required.

-q quickAttach

Typically used when the EPP invocation calls a subsequent script that must access the file via the API. (Optional)

  • If true, destLIMSID is required and the following occurs:

    • When the script has completed running, it will attach the file to the LIMS via FTP, rather than writing it to disc to be attached by AI (see note for information about file names and AI).

  • Default is false.

-s separator

The separator used in the input file. (Optional)

  • Options are COMMA or TAB.

  • Default is COMMA.

-xls generateXLS

Determines the output file type and extension. XLS and XLSX are supported. (Optional)

  • If true, then the input file will be converted to an XLS file.

  • If false, then the input file will be converted to an XLSX file.

  • Default is false.

  • This parameter is ignored if updateFileName or updateFileLIMSID are provided. The file provided to update will be used to determine the output file type, and if not available XLSX will be used by default.

-worksheet worksheetName

Name of the worksheet to create or update. Must be provided when updating an existing file. (Optional)

  • The worksheet name must be provided when either updateFileName or updateFileLIMSID are provided.

  • The worksheet name must not exceed 32 characters and must not contain any of the following: ? [ ] / \ : *

-updateFileName

Name of an existing Excel file to update. (Optional)

  • Assumes that the file is accessible from the LIMS server (either local to the server or available in a mounted drive or shared file store).

  • You may only provide one of either updateFileName or updateFileLIMSID.

-updateFileLIMSID

LIMSID of an existing Excel file to update. (Optional)

  • If no local file is found in the current directory whose name starts with updateFileLIMSID, an attempt will be made to download the file from the LIMS.

  • You may only provide one of either updateFileName or updateFileLIMSID.

  • Automated Informatics (AI) will only pick up and attach files in BaseSpace Clarity LIMS if the outputFileName begins with the LIMSID of a file placeholder that exists in the LIMS.

  • For more information on AI, refer to the topics in the Automated Informatics forum.

  • The quickAttach option makes use of FTP to upload the file to the result file placeholder in the LIMS. This means that this script may not be run using this option on remote AI nodes, as it requires access to the LIMS database to retrieve the FTP credentials.

Configuration

  • The convertToExcel script can be run on any step, provided there is a way to supply it with an input file to convert.

  • The recommended configuration is to use a minimum of two shared result files on the step: One result file used to attach the final converted file; the other the log file. The input file placeholder may be the same as the final file destination, if the input file is to be overwritten with the script results, and likewise for an existing Excel file to be updated.

  • Configure an automation trigger, usually on the Record Details view, to use the script. The input file may be attached manually or produced automatically by another script such as the sample sheet generator.

  • To configure the script to both attach its output file to a placeholder in the LIMS and to write it to a location on the server (or in a directory with shared access), provide both outputFileName and destLIMSID with quickAttach. Include the destination path in outputFileName, for example

    -outputFileName '/opt/gls/clarity/customextensions/example/output.xls'.

Example Script Automation Strings

The following examples include various options for file handling. These options exist to reduce the FTP/Automated Informatics (AI) overhead so that the script executes faster.

For example, if quickAttach is set to true, the script will attach the file directly to the LIMS through FTP. It will only write the file locally if upload/attachment via the API fails.

Example 1: Typical Use

bash -c "/opt/gls/clarity/bin/java -jar /opt/gls/clarity/extensions/ngs-common/v5/EPP/ngs-extensions.jar script:convertToExcel \
-i {stepURI:v2:http} \
-u {username} \
-p {password} \
-srcLIMSID {compoundOutputFileLuid0} \
-outputFileName {compoundOutputFileLuid0}-converted.xlsx \
-logFileLIMSID {compoundOutputFileLuid1}"

In this example:

  1. The file currently attached in the LIMS with LIMS ID {compoundOutputFileLuid0} is downloaded.

  2. The file is converted to an XLSX file with the name {compoundOutputFileLuid0}-converted.xlsx.

    • This file is left in the current local directory for AI to attach to the LIMS automatically.

  3. When attached, the file overwrites the file with LIMS ID {compoundOutputFileLuid0} that was originally downloaded.

  4. Finally, the log file is uploaded to the LIMS with the name {compoundOutputFileLuid1}-LogFile.html.

Example 2: Updating an attached Excel file and both writing it to a specific location and uploading the result

bash -c "/opt/gls/clarity/bin/java -jar /opt/gls/clarity/extensions/ngs-common/v5/EPP/ngs-extensions.jar script:convertToExcel \
-i {stepURI:v2:http} \
-u {username} \
-p {password} \
-logFileLIMSID {compoundOutputFileLuid2} \
-srcLIMSID {compoundOutputFileLuid0} \
-outputFileName '/opt/gls/clarity/customextensions/example/{compoundOutputFileLuid1}.xls' \
-destLIMSID {compoundOutputFileLuid1} \
-worksheet 'Samples' \
-updateFileLIMSID {compoundOutputFileLuid1} \
-q ‘true’”

In this example:

  1. The input file currently attached in the LIMS with LIMS ID {compoundOutputFileLuid0} is downloaded.

  2. The Excel file to update, currently attached in the LIMS with LIMS ID {compoundOutputFileLuid1}, is downloaded.

  3. The file to update has a worksheet with the name Samples updated (or overwritten, if already present in the file) using the contents of the input file.

  4. The resulting file is written to /opt/gls/clarity/customextensions/example as {compoundOutputFileLuid1}.xls.

  5. Because quickAttach is passed as true:

    • The file is added to the LIMS directly with FTP with the LIMS ID {compoundOutputFileLuid1}.

    • This overwrites the Excel file to update, which was previously attached here.

  6. Finally, the log file is uploaded to the LIMS with the name {compoundOutputFileLuid2}-LogFile.html.

Example 3: Use with sample sheet generator

bash -c "/opt/gls/clarity/bin/java -cp /opt/gls/clarity/extensions/ngs-common/v5/EPP/DriverFileGenerator.jar driver_file_generator \
-i {processURI:v2:http} \
-u {username} \
-p {password} \
-t /opt/gls/clarity/extensions/ngs-common/v5/EPP/conf/readonly/bioA_driver_file_template.csv \
-o {compoundOutputFileLuid0}.csv \
-l {compoundOutputFileLuid1}-LogFile.html \
&& /opt/gls/clarity/bin/java -jar /opt/gls/clarity/extensions/ngs-common/v5/EPP/ngs-extensions.jar script:convertToExcel \
-i {stepURI:v2:http} \
-u {username} \
-p {password} \
-inputFileName {compoundOutputFileLuid0}.csv \
-destLIMSID {compoundOutputFileLuid8} \
-quickAttach true \
-logFileLIMSID {compoundOutputFileLuid1}"

In this example:

  1. A driver file is generated with the name {compoundOutputFileLuid0}.csv, and the {compoundOutputFileLuid1}-LogFile.html log file is created by the sample sheet generator.

  2. The conversion script is executed on {compoundOutputFileLuid0}.csv.

  3. Because quickAttach is passed as true and no outputFileName was provided, after the file has been converted:

    • It is added to the LIMS directly with FTP with the LIMS ID {compoundOutputFileLuid8}.

    • No file is created locally.

  4. As the input file is converted, log messages are appended to the {compoundOutputFileLuid1}-LogFile.html file.

Because the destLIMSID for the convertToExcel script differs from the LIMS ID in the name of the input CSV file, the CSV file is also uploaded by AI to the LIMS and both files will be available separately.

Example 4: Use with sample sheet generator and add blank lines

bash -c "/opt/gls/clarity/bin/java -cp /opt/gls/clarity/extensions/ngs-common/v5/EPP/DriverFileGenerator.jar driver_file_generator \
-i {processURI:v2:http} \
-u {username} \
-p {password} \
-t /opt/gls/clarity/extensions/ngs-common/v5/EPP/conf/readonly/bioA_driver_file_template.csv \
-o {compoundOutputFileLuid0}.csv \
-l {compoundOutputFileLuid1}-LogFile.html \
&& /opt/gls/clarity/bin/java -jar /opt/gls/clarity/extensions/ngs-common/v5/EPP/ngs-extensions.jar \
-i {stepURI:v2:http} \
-u {username} \
-p {password} \
script:addBlankLines \
-f {compoundOutputFileLuid0}.csv \
-l {compoundOutputFileLuid1}-LogFile.html \
-sep COMMA \
-b ', False,' \
-h 1 \
-c LIMSID \
-pre 'Sample ' \
script:convertToExcel \
-inputFileName {compoundOutputFileLuid0}.csv \
-destLIMSID {compoundOutputFileLuid0} \
-quickAttach false \
-xls true \
-logFileLIMSID {compoundOutputFileLuid1}"

In this example:

  1. Sample sheet generator creates the base driver file with name {compoundOutputFileLuid0}.csv.

  2. The add blank lines script takes that file and adds extra lines for empty wells in the container, editing the file in place.

  3. Finally, the convertToExcel script is run on that result.

    • In this case, the final output is an XLS file named {compoundOutputFileLuid8}.xls.

    • Because quickAttach is false, this file is written in the current local directory and it is assumed that AI will upload it to the LIMS.

    • The sample log file is appended to by all three programs that are run, and is attached to the LIMS.

Because the destLIMSID for the convertToExcel script is the same as the LIMS ID in the name of the input CSV file, the CSV file is not uploaded by AI to the LIMS and only the final XLS file will be available in the LIMS.

Rules and constraints

  • The input file separators supported are comma and tab.

  • Spaces between entries in the input file are not supported (eg "Sample Name, A:1" must instead be "Sample Name,A:1")

Logging

  • A short message is logged after each successful action by the script.

  • Any errors that occur will be logged in the log file before the script terminates.

  • Warnings will also be captured in the log file, and if any occur, a notification will be sent on script completion.

  • If a local log file exists that matches the log file name configured for the script, or if a file exists in the LIMS with the associated log file LIMSID, the log messages will be appended to these files. Otherwise, a new file will be created.

Last updated