Commands for the ORA Helper Suite software

oraHelper Interactive User Guide

The oraHelper interactive user guide helps you determine which DRAGEN ORA Helper Suite tool to use with your intended downstream bioinformatics software. It also displays the correct syntax to use.

Info To get the relevant information about the intended downstream bioinformatics software, the bioinformatics software must be in the PATH variable.

Use the oraHelper interactive user guide as follows:

1. Open a terminal window.

2. Add the oraHelperSuite directory to your PATH as follows:

PATH=<oraHelperSuite Directory>:$PATH

3. Enter oraHelper followed by the intended downstream software command and the input *.fastq.ora file name.

$ oraHelper <command> <file name>.fastq.ora
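The Info note above requires the downstream tool to be on PATH before oraHelper is called. A minimal sketch of that check follows. The install location /opt/oraHelperSuite and the helper name require_on_path are assumptions for illustration, not part of the suite.

```shell
# Assumed install location; replace with your oraHelperSuite directory.
ORA_HELPER_DIR="${ORA_HELPER_DIR:-/opt/oraHelperSuite}"
PATH="$ORA_HELPER_DIR:$PATH"
export PATH

# Hypothetical helper: succeed only if the given command resolves on PATH.
require_on_path() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "error: '$1' is not on PATH" >&2
    return 1
}

# Example: check the downstream tool before asking oraHelper about it.
# require_on_path <command> && oraHelper <command> <file name>.fastq.ora
```

Setting PATH this way lasts only for the current shell session; add the assignment to your shell profile if you want it to persist.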

Examples

The following example shows oraHelper used with the bwa command.

$ oraHelper bwa -i <filename>.fastq.ora

The output prints the proper syntax for each tool in the DRAGEN ORA Helper Suite.

The following example shows oraHelper used with the head command.

$ oraHelper head <file name>.fastq.ora


Command Line Options

oraFuse Software

The oraFuse software requires root privileges during installation. Refer to Installation Requirements.

The oraFuse software creates a virtual FASTQ file for each FASTQ.ORA file in the directory. Each virtual file has a *.fastq file extension instead of a *.fastq.gz file extension.

Info Make sure you indicate a *.fastq file extension as input.
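The naming rule above can be sketched as a small shell function. The helper name virtual_fastq_name is hypothetical, and the sketch assumes the only change is dropping the trailing .ora suffix.

```shell
# Hypothetical helper: derive the virtual file name from a FASTQ.ORA name
# by stripping the trailing ".ora" suffix.
virtual_fastq_name() {
    case "$1" in
        *.fastq.ora) printf '%s\n' "${1%.ora}" ;;
        *) echo "error: '$1' is not a .fastq.ora file" >&2; return 1 ;;
    esac
}

virtual_fastq_name r10K_1.fastq.ora   # prints r10K_1.fastq
```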

Local Environment

1. The oraFuse software requires the following dependencies: fuse, fuse-libs, libcurl, and openssl. If the dependencies were not installed during the DRAGEN ORA Helper Suite installation, run the following command.

$ sudo yum install fuse fuse-libs libcurl openssl

2. Add the oraHelperSuite directory to your PATH as follows.

PATH=<oraHelperSuite DIR>:$PATH

3. Enter the following command to mount the oraFuse software to the directory where the FASTQ.ORA files are located.

$ oraFuse

4. Run the desired command on the virtual *.fastq file.

5. When finished, enter the following command to unmount the oraFuse software from the directory.

$ oraFuse --unmount
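Steps 3 to 5 can be wrapped so that the unmount always runs, even when the downstream command fails. This is a sketch only: run_with_orafuse is a hypothetical helper name, and it assumes oraFuse is on PATH and is invoked exactly as shown in the steps above.

```shell
# Hypothetical wrapper around the mount / run / unmount steps above.
run_with_orafuse() {
    oraFuse || return 1        # step 3: mount in the current directory
    rc=0
    "$@" || rc=$?              # step 4: run the desired command, keep its status
    oraFuse --unmount          # step 5: unmount, even if the command failed
    return "$rc"
}

# Usage, from the directory that holds the FASTQ.ORA files:
# run_with_orafuse head <file name>.fastq
```

The wrapper returns the exit status of the wrapped command, so a calling script can still detect a failed run.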

Examples

The following examples show the results of the oraFuse software with the ls command.

  • Before oraFuse has been mounted to the current directory:

r10K_1.fastq.ora

  • After oraFuse has been mounted to the directory:

https: r10K_1.fastq r10K_1.fastq.ora s3:

  • After oraFuse has been mounted to the current directory with ls -l:

The following example shows the head command on a virtual FASTQ file after oraFuse has been mounted.

$ head <file name>.fastq

The following example shows the bwa command on a virtual FASTQ file after oraFuse has been mounted.

$ bwa mem -t 8 -M <FASTA> -o <SAM> <file name>.fastq

Remote Environment

The oraFuse software works on FASTQ.ORA files located in AWS S3 or in Azure Blob Storage. The AWS and Azure Blob Storage account configurations and credentials are used for authentication. Refer to the Remote and Local Usage section.

The steps to use the oraFuse software in the remote environment are the same as those used in the local environment.

Info Azure Blob Storage is not supported if you are using DRAGEN v3.7 as the downstream software.

The following command shows the remote and virtual files on S3.

$ ls -l s3://bucket/<path>

Examples

The following example runs BWA on S3 files.

$ bwa mem -t 8 -M <FASTA> -o <SAM> s3://bucket/path/<file name>.fastq

When using DRAGEN v3.7 as the downstream software, you can use oraFuse in the remote environment by passing the file location in the following form:

.virtual/ora/s3:/bucket/<file name>.fastq
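Judging from the path above, the virtual path appears to be the S3 URL with the double slash after s3: collapsed, the .virtual/ora/ prefix added, and the .ora suffix dropped. The sketch below encodes that reading. The helper name s3_to_virtual_path is hypothetical and the mapping is inferred from this one example, so verify it against your own mount.

```shell
# Hypothetical helper: map s3://bucket/key.fastq.ora to the virtual path
# form shown above (.virtual/ora/s3:/bucket/key.fastq).
s3_to_virtual_path() {
    case "$1" in
        s3://*) ;;
        *) echo "error: expected an s3:// URL" >&2; return 1 ;;
    esac
    rest="${1#s3://}"          # bucket/key.fastq.ora
    rest="${rest%.ora}"        # drop a trailing .ora, if present
    printf '.virtual/ora/s3:/%s\n' "$rest"
}

s3_to_virtual_path s3://bucket/sample.fastq.ora
# prints .virtual/ora/s3:/bucket/sample.fastq
```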

The following is an example of using DRAGEN v3.7 on a single FASTQ.ORA located on AWS S3:

$ dragen -f \
    --ref-dir=<path to hash table directory on /ephemeral> \
    -1 .virtual/ora/s3:/bucket/<remote file>.fastq \
    --output-directory=<path to output directory on /ephemeral> \
    --output-file-prefix=<prefix name> \
    --output-format BAM

Command Line Options

Info Multiple users of the same file are not supported.

Error Messages

If you receive an error message while using the oraFuse software, use the following command to get more information.

$ cat .virtual/ora/.error
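In a script, the same check can be made conditional so that it prints only when oraFuse has actually recorded something. This sketch assumes the error file lives at .virtual/ora/.error relative to the mounted directory, as in the command above. The name show_orafuse_errors is a hypothetical helper.

```shell
# Hypothetical helper: print the oraFuse error file if it exists and is
# non-empty. The optional argument is the mounted directory (default: ".").
show_orafuse_errors() {
    err_file="${1:-.}/.virtual/ora/.error"
    if [ -s "$err_file" ]; then
        cat "$err_file"
    else
        echo "no oraFuse errors recorded in $err_file" >&2
    fi
}

# Usage, from the mounted directory:
# show_orafuse_errors
```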

ora LD-Preload Software

The ora LD-Preload software creates a virtual FASTQ file for each FASTQ.ORA file in the directory. Each virtual file has a *.fastq file extension instead of a *.fastq.gz file extension.

Info Make sure you indicate a *.fastq file extension as input.

Local Environment

1. Add the oraHelperSuite directory to your PATH as follows.

PATH=<oraHelperSuite Directory>:$PATH

2. Run the command with the ora LD-Preload shared library on the *.fastq file as follows.

$ LD_PRELOAD=<oraHelperSuite directory>/ora-ldpreload.so <command> <input file name>.fastq
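Step 2 can be wrapped so that the library path is checked before the command runs. This is a sketch, assuming the shared library is named ora-ldpreload.so and sits directly in the oraHelperSuite directory, as in the command above. The name run_with_ora_preload is a hypothetical helper.

```shell
# Hypothetical wrapper: run a command with the ora LD-Preload library.
# $1 is the oraHelperSuite directory; the rest is the command to run.
run_with_ora_preload() {
    lib="$1/ora-ldpreload.so"
    shift
    if [ ! -f "$lib" ]; then
        echo "error: $lib not found" >&2
        return 1
    fi
    # LD_PRELOAD is set for this one command only, not for the whole shell.
    LD_PRELOAD="$lib" "$@"
}

# Usage:
# run_with_ora_preload <oraHelperSuite directory> head <file name>.fastq
```

Scoping LD_PRELOAD to a single command avoids loading the library into unrelated programs started from the same shell.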

Examples

The following is an example of the ora LD-Preload software with the bwa command.

$ LD_PRELOAD=<oraHelperSuite DIR>/ora-ldpreload.so bwa mem -t 8 -M <FASTA> -o <SAM> <file name>.fastq

The following is an example of the ora LD-Preload software with the ls command.

$ LD_PRELOAD=<oraHelperSuite DIR>/ora-ldpreload.so ls

The following shows the output of the ora LD-Preload software with the ls command.

<file name>.fastq.ora <file name>.fastq (virtual file)

Remote Environment

The ora LD-Preload software works on FASTQ.ORA files located in AWS S3 or in Azure Blob Storage. The software reuses the AWS or Azure Blob Storage account configuration and credentials. Refer to the Remote and Local Usage section.

The steps to use the ora LD-Preload software in the remote environment are the same as those used in the local environment. The features are the same, with the following exceptions:

  • In the remote environment, the ora LD-Preload software does not work on statically compiled bioinformatics software.

  • In the remote environment, the Linux shell auto-completion feature does not work on virtual files.

orad Software

orad is the executable of the DRAGEN ORA decompression software. It uses pipes or process substitution to reduce reads and writes to the disk. If the downstream bioinformatics software does not work with pipes or process substitution, and neither the oraFuse software nor the ora LD-Preload software can be used, then fully decompressed temporary files are required. Refer to the Troubleshooting section for more information.

The steps are the same for local and remote environments.

Local Environment

1. Add the oraHelperSuite directory to your PATH as follows.

PATH=<oraHelperSuite DIR>:$PATH

2. Enter the command with orad and the pipe or process substitution on the *.fastq.ora files. See the following examples.

Examples

Example with pipe:

When the downstream command or bioinformatics software can read from the standard input, a pipe is used. The following is an example with the head command. The -c option decompresses to standard output.

$ orad <file name>.fastq.ora -c --raw | head

Example with process substitution:

When the downstream bioinformatics software cannot read from the standard input, for example md5sum, process substitution is used. The following example shows the command with process substitution.

$ md5sum <( orad -c --raw "<file name>.fastq.ora" )
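When the downstream tool supports neither pipes nor process substitution, the fallback mentioned in the introduction to this section is a fully decompressed temporary file. The sketch below uses only the orad options shown above (-c and --raw). The name with_decompressed_copy is a hypothetical helper, and the temporary file is removed once the command returns.

```shell
# Hypothetical helper: decompress a FASTQ.ORA file to a temporary file,
# run a command on it, then delete the temporary file.
# $1 is the .fastq.ora file; the rest is the command, which receives the
# temporary file path as its last argument.
with_decompressed_copy() {
    ora_file="$1"
    shift
    tmp_fastq=$(mktemp) || return 1
    rc=0
    if orad -c --raw "$ora_file" > "$tmp_fastq"; then
        "$@" "$tmp_fastq" || rc=$?
    else
        rc=1
    fi
    rm -f "$tmp_fastq"
    return "$rc"
}

# Usage:
# with_decompressed_copy <file name>.fastq.ora md5sum
```

The temporary file created by mktemp has no .fastq extension, so tools that key on the file extension may need it renamed first. The temporary file also needs as much free disk space as the uncompressed FASTQ.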

Remote Environment

The orad software works on FASTQ.ORA files located in AWS S3 or in Azure Blob Storage. The software reuses the AWS or Azure Blob Storage account configuration and credentials. Refer to the Remote and Local Usage section.
