
DRAGEN ORA Decompression v2.7

Warning Do not use this documentation for decompression on the DRAGEN Server. The DRAGEN Server has its own integrated DRAGEN ORA Decompression component.

DRAGEN ORA Decompression Software decompresses fastq.ora files into fastq.gz files. Fastq.ora files are generated by the lossless compression technology that is part of DRAGEN. The DRAGEN ORA Decompression Software is standalone software, and orad is the executable that runs it.

With DRAGEN v4.3, compression of FASTQ files derived from supported non-human species and from human bisulfite data (methylated DNA application) is available. DRAGEN ORA Decompression v2.7 can decompress FASTQ.ORA files compressed with any of the supported references. Refer to the References Installation and Supported References sections.

The DRAGEN ORA Decompression Software is available as three separate executables, one for each of the following operating systems:

Linux
Mac
Windows

Decompression of FASTQ.ORA stored on local storage is supported on Linux, Mac, and Windows. Decompression of FASTQ.ORA stored on AWS S3 is only supported on Linux.

Decompression of FASTQ.ORA stored on Azure Blob storage is only supported on Linux.

When FASTQ.ORA is located on AWS S3 or Azure Blob storage, the decompression runs in streaming mode: the FASTQ.ORA file does not need to be fully transferred before decompression can start.

Software Installation

Installation Requirements

The following are the minimum requirements for the DRAGEN ORA Decompression Software:

Component

Minimum Requirement

Installing the DRAGEN ORA Decompression Software

Use the following steps to install the DRAGEN ORA Decompression Software after downloading it from the ORA Support Site.

Linux or Mac

1. Extract the archive files using the following command:

tar -xzvf orad.2.7.0.linux.tar.gz (Linux) 
tar -xzvf orad.2.7.0.mac.tar.gz (Mac)

2. Navigate to the Orad directory as follows:

cd orad.2.7.0.linux

3. Move the executable to your preferred location as follows:

mv orad your_preferred_location/

4. Add Orad to your path as follows:

echo 'PATH=$PATH:your_preferred_location/' >> ~/.bashrc
source ~/.bashrc

5. Move the oradata folder into your home directory as follows:

mv oradata ~

To store the folder in a different location, use the following command:

mv oradata ~/otherlocation/

When oradata has been moved to another location, you can do either of the following:

Point to the reference by using the ORA_REF_PATH environment variable as follows:

export ORA_REF_PATH=~/otherlocation/oradata/

Use the following option at decompression:

--ora-reference ~/otherlocation/
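A quick sanity check after these steps can catch a mistyped path before the first decompression. The sketch below is not part of the DRAGEN ORA distribution: the function name check_orad_setup is hypothetical, and it only tests that an orad executable is found on PATH and that a reference folder exists. Treating ~/oradata as the default location when ORA_REF_PATH is unset is an assumption inferred from step 5.

```shell
#!/bin/sh
# Hypothetical helper: checks that the installation steps above took effect.
check_orad_setup() {
    # 1. orad must be reachable through PATH.
    if ! command -v orad >/dev/null 2>&1; then
        echo "orad not found on PATH" >&2
        return 1
    fi
    # 2. The reference folder must exist: ORA_REF_PATH when set,
    #    otherwise ~/oradata (assumed default, inferred from step 5).
    ref_dir=${ORA_REF_PATH:-$HOME/oradata}
    if [ ! -d "$ref_dir" ]; then
        echo "reference folder not found: $ref_dir" >&2
        return 2
    fi
    echo "orad: $(command -v orad)"
    echo "references: $ref_dir"
}
```

Calling check_orad_setup in a new terminal (so that ~/.bashrc has been re-read) returns 0 when both checks pass, 1 when orad is not on PATH, and 2 when the reference folder is missing.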

Windows

1. Extract the downloaded archive with software that can handle gzipped tarballs, such as 7-Zip. Right-click on the archive and select Extract with. The following two files are extracted:

orad.exe
refbin

The following steps use C:\Users\user1 as an example location. Change C:\Users\user1 to the location where you extracted the archive.

2. Open the Command Prompt application.

3. Set the environment variables to use the orad.exe and the refbin file with the set command or the setx command. The set command configures the variables temporarily (for the current console window) while the setx command configures the variables permanently.

4. Set the path to the orad.exe to the PATH environment variable as follows:

set PATH=%PATH%;C:\Users\user1

setx PATH "%PATH%;C:\Users\user1"

5. Set the path to the refbin file to an ORA_REF_PATH environment variable as follows:

set ORA_REF_PATH=C:\Users\user1

setx ORA_REF_PATH C:\Users\user1

References Installation

To decompress FASTQ.ORA files derived from specific species/models, download the reference that was used at compression from the ORA Support Site. The whole reference database can also be downloaded.

Info This extra step is not needed for the default human reference, which is already included in the DRAGEN ORA Decompression Software.

Linux or Mac

Extract a specific reference

1. Move the downloaded archive to the oradata directory

mv yourdownloadpath/<genus_specificname>.tar.gz yourpath/oradata

2. Navigate to the oradata directory

cd yourpath/oradata

3. Extract the archive file using the following command:

tar -xzvf <genus_specificname>.tar.gz (Linux)
tar -xzvf <genus_specificname>.tar.gz (Mac)

When the extraction of the specific reference is complete on Linux, the orad.2.7.0.linux folder should be structured as follows (example with the Gallus_gallus reference):

orad.2.7.0.linux
|___orad
|___oradata
    |___refbin
    |___Gallus_gallus
        |___refbin
        |___readme_gallus_gallus

Note The oradata folder can be moved to another location but should keep its structure.
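Since the oradata folder must keep its structure, a small check after extraction or after moving the folder can confirm that each reference subfolder still holds its refbin file. The following is a minimal sketch under stated assumptions: the function name list_incomplete_refs is hypothetical, and the check assumes the layout shown above, where every subfolder of oradata contains a file named refbin.

```shell
#!/bin/sh
# Hypothetical helper: prints every reference subfolder of an oradata
# directory that lacks a refbin file. Returns 1 if any is incomplete.
list_incomplete_refs() {
    oradata_dir=${1:-$HOME/oradata}
    oradata_dir=${oradata_dir%/}
    missing=0
    for ref in "$oradata_dir"/*/; do
        # Skip the unexpanded pattern when there are no subfolders.
        [ -d "$ref" ] || continue
        if [ ! -f "${ref}refbin" ]; then
            echo "missing refbin: ${ref%/}"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, list_incomplete_refs yourpath/oradata prints nothing and returns 0 when every reference subfolder is complete.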

Extract the full database

1. Move the downloaded archive to the orad.2.7.0.linux directory

mv yourdownloadpath/oradata.tar.gz yourpath/orad.2.7.0.linux

2. Navigate to the orad.2.7.0.linux directory

cd yourpath/orad.2.7.0.linux

3. Delete existing oradata folder

rm -r oradata

4. Extract the archive file using the following command:

tar -xzvf oradata.tar.gz (Linux)
tar -xzvf oradata.tar.gz (Mac)

When the extraction of the full database is complete, the orad.2.7.0.linux folder should be structured as follows:

orad.2.7.0.linux
|___orad
|___oradata
    |___Homo_sapiens
        |___refbin
        |___readme_homo_sapiens
    |___Homo_sapiens_bisulfite
        |___refbin
        |___readme_homo_sapiens_bisulfite
    |___<Genus_specificname>
        |___refbin
        |___readme_<Genus_specificname>

Note The oradata folder can be moved to another location but should keep its structure.

Windows

Extract the downloaded archive with software that can handle gzipped tarballs, such as 7-Zip. Right-click on the archive and select Extract with.

When a specific reference is downloaded, a folder named <Genus_specificname> is extracted. This folder contains the corresponding refbin and readme files. Move this folder to the location where orad and the default human refbin file were extracted during installation of the DRAGEN ORA Decompression Software.

When the full database is downloaded, a folder named oradata is extracted. This folder contains one subfolder per species, each of which contains the corresponding refbin and readme files. Move this oradata folder to the location where orad and the default human refbin file were extracted during installation of the DRAGEN ORA Decompression Software.

Commands

Required commands

Use the following commands to decompress the files:

orad FILE [args]

or

orad [args] FILE

A folder that contains FASTQ.ORA files can also be provided for batch decompression of the FASTQ.ORA files located at the top level of that directory:

orad ./pathtofolder/ [args]

On Windows, replace orad with orad.exe.

orad.exe FILE [args]
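To preview which files a folder-level call would cover, the sketch below lists the FASTQ.ORA files found directly inside a directory and ignores subfolders. It only illustrates the top-level behavior described above and is not orad's own logic; the function name list_top_level_ora is hypothetical, and matching on the .fastq.ora suffix is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: lists *.fastq.ora files directly inside a folder,
# ignoring anything stored in subfolders.
list_top_level_ora() {
    folder=${1:-.}
    folder=${folder%/}
    for f in "$folder"/*.fastq.ora; do
        # The -f test also skips the unexpanded pattern when nothing matches.
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

For example, list_top_level_ora ./pathtofolder/ prints one path per line, which can be reviewed before running orad on the same folder.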

To decompress FASTQ.ORA compressed with a reference other than the default human reference, ensure the specific reference is available locally.

No change to the command line is required. The decompression software automatically detects which species/model was used.

Command Line Options

Command

Description

Command Examples

On Windows, replace orad with orad.exe. For example: orad.exe myfile.fastq.ora --check

Combine with downstream analysis

For a fully transparent usage of fastq.ora files (no changes in the command, no overhead, no additional footprint) with third-party bioinformatics software, DRAGEN ORA Helper Suite Software is recommended and available for download on the ORA Support Site. This software is only supported on Linux.

For a semi-transparent usage of fastq.ora files with third-party bioinformatics software, use DRAGEN ORA Decompression with the pipe function or process substitution. This method improves system performance by reducing disk reads and writes compared with a full decompression step.

If the analysis software can read from the standard input, such as BWA, use the following command:

orad file.fastq.ora -c --raw | bwa mem humanref.fasta - > resu.sam

The -c option decompresses to standard output. The result is piped (|) to BWA, which uses the dash argument - to read from standard input. This also works for paired reads, in which case the -p option of BWA specifies that the input contains interleaved paired reads.

If the analysis software cannot read from the standard input, you can use process substitution:

bwa mem humanref.fasta <(orad file.fastq.ora -c --raw) > resu.sam

In place of the file name, use the <( ) syntax containing the command that writes the file content to standard output, in this case orad with the -c option as in the command above. This method does not work when the third-party software checks the input file name or when the third-party software does not read the file sequentially.

Info On Windows, replace orad with orad.exe

Check Losslessness

DRAGEN ORA Compression is a lossless compression.

If you wish to verify that no data was lost during the compression of the fastq.ora file, compare the MD5 checksum of the decompressed fastq.ora file and the MD5 checksum of the decompressed fastq.gz file.

1. Compute the MD5 checksum of the uncompressed FASTQ.ORA content as follows:

md5sum <(orad myfile.fastq.ora --raw -c )

2. Compute the MD5 checksum of the uncompressed FASTQ.GZ content as follows:

md5sum <(gzip -d -c myfile.fastq.gz)
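The two checksums can also be compared automatically instead of by eye. The following sketch wraps the comparison in a function; same_md5 is a hypothetical name, and it takes two shell command strings that each write FASTQ content to standard output (in practice, the orad and gzip commands from steps 1 and 2). It uses ordinary pipes rather than process substitution, and it treats empty output as a failure so that a command that produced nothing is not reported as a match.

```shell
#!/bin/sh
# Hypothetical helper: runs two commands and compares the MD5 of their output.
# Returns 0 when both checksums match and the output is not empty.
same_md5() {
    sum_a=$(sh -c "$1" | md5sum | cut -d ' ' -f 1)
    sum_b=$(sh -c "$2" | md5sum | cut -d ' ' -f 1)
    # Checksum of empty input, used to detect a command that wrote nothing.
    empty=$(: | md5sum | cut -d ' ' -f 1)
    echo "first:  $sum_a"
    echo "second: $sum_b"
    [ "$sum_a" != "$empty" ] && [ "$sum_a" = "$sum_b" ]
}
```

A possible call, reusing the file names from the steps above:

same_md5 'orad myfile.fastq.ora --raw -c' 'gzip -d -c myfile.fastq.gz' && echo "lossless"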
