DRAGEN Array v1.0

Overview

Welcome to DRAGEN Array

DRAGEN (Dynamic Read Analysis for GENomics) Array secondary analysis is a powerful bioinformatics software for Illumina Infinium array-based assays. DRAGEN Array uses cutting-edge data analysis tools to provide accurate, comprehensive, and highly efficient secondary analysis to maximize genomic insights and meet your research needs across multiple applications.

DRAGEN Array is offered as a local package with a command-line interface (no specialized server or hardware required) and as a cloud-based package with an intuitive graphical user interface, as summarized in the table below.

Description

Key features

Local analysis

Cloud analysis

This product documentation describes the installation and setup, analysis execution, and result outputs. For the latest updates and release details, see the . See for additional details on DRAGEN Array genotyping, PGx CNV calling and PGx star allele annotation.

DRAGEN Array Applications

The following types of analysis are currently supported by DRAGEN Array:

DRAGEN Array – Genotyping
DRAGEN Array – PGx – CNV calling

Product Guides

Input Files

The following section describes the input files required by DRAGEN Array.

IDAT Files

For each sample, a pair of raw intensity files (.idat) is generated by the iScan System or NextSeq550 (for non-methylation arrays). The files provide the red- and green-channel intensities for each probe on the Infinium array.

An IDAT file is identified by the BeadChip Barcode (a 12-digit unique Sentrix ID, e.g. 123456789101), the BeadChip Position (row and column of the sample, e.g. R01C01), and Grn (green) or Red for the specific channel.
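The three identifying parts can be pulled out of a file name with a short Python sketch. It assumes the parts are joined with underscores, as in `123456789101_R01C01_Grn.idat`; that separator and the helper name `parse_idat_name` are assumptions for illustration, not something this guide specifies.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: <12-digit barcode>_<RxxCxx position>_<Grn|Red>.idat
IDAT_NAME = re.compile(
    r"^(?P<barcode>\d{12})_(?P<position>R\d{2}C\d{2})_(?P<channel>Grn|Red)\.idat$"
)


class IdatName(NamedTuple):
    barcode: str   # BeadChip Barcode (Sentrix ID)
    position: str  # BeadChip Position, e.g. R01C01
    channel: str   # "Grn" or "Red"


def parse_idat_name(filename: str) -> Optional[IdatName]:
    """Split an IDAT file name into its parts, or return None if it does not match."""
    match = IDAT_NAME.match(filename)
    if match is None:
        return None
    return IdatName(match["barcode"], match["position"], match["channel"])
```

A name that does not follow the assumed layout, such as `sample1_Grn.idat`, yields `None` rather than a partial result.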

Manifest Files

The CSV and BPM manifest files can be found on the Illumina Support Site for all commercial Infinium BeadChips or on MyIllumina for custom and semi-custom designs. For instructions on obtaining manifest files from MyIllumina, see Illumina Knowledge article, .

The CSV manifest file (.csv) provides complementary data to the BPM manifest file in a human-readable format. It is a required input to the genotype gtc-to-vcf command to enable VCF generation for insertion/deletion variants.

Cluster File

The cluster file (.egt) is a standard product file provided by Illumina for commercial genotyping products and it is a required input for the genotype call command in DRAGEN Array. Custom cluster files may be required for optimal genotyping performance. See section for additional details.

CN Model File

The CN (Copy Number) model file (.dat) is a required input to the copy-number call command to enable accurate copy number calling for pharmacogenomics. Illumina provides a standard CN model file for each PGx array product. See section for additional details.

PGx Database File

The PGx database file (.zip) contains the variant mapping information from Infinium PGx arrays to PGx variants. For each gene and each variant used in the star allele definitions of the gene, there is a mapping to the ID field in the SNV VCF file. Each line in the gene mapping file represents a single variant and contains the SNV VCF ID for that variant followed by the HGVS (Human Genome Variation Society) tag for the variant. The PGx database file is array specific and is one of the product files provided by Illumina for each PGx array product.
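As a rough illustration of the mapping described above, the following Python sketch reads gene mapping lines into a dictionary keyed by SNV VCF ID. The field delimiter is not stated in this guide, so the sketch assumes the two fields are separated by whitespace; the function name `parse_gene_mapping` is hypothetical, and the IDs and HGVS tags in the usage note are made-up placeholders.

```python
from typing import Dict


def parse_gene_mapping(text: str) -> Dict[str, str]:
    """Map each SNV VCF ID to its HGVS tag, one variant per line.

    Assumption: the ID comes first and is separated from the HGVS tag
    by a tab or spaces. Blank lines are skipped.
    """
    mapping: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        vcf_id, hgvs = line.split(None, 1)  # split on the first whitespace run
        mapping[vcf_id] = hgvs.strip()
    return mapping
```

For example, `parse_gene_mapping("id1\tNC_X:g.1A>G")` returns `{"id1": "NC_X:g.1A>G"}`. A line with only one field raises `ValueError`, which makes malformed input visible early.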

Genome FASTA Files

The genome FASTA file (.fa) is a text file with the reference genome sequences. The FASTA index file (.fai) contains metadata about chromosomal orchestration within the FASTA file for a particular species. DRAGEN Array PGx calling supports human genome builds 37 and 38. The genome FASTA file and FASTA index file are both provided by Illumina for human species and should be stored together in the same input folder.

IDAT Sample Sheet

For local analysis, the IDAT sample sheet can be a CSV or JSON formatted file with direct paths to sample IDAT files. It enables easy analysis of samples from different directories.

Example CSV format:

Green IDAT Path,Red IDAT Path
/path/to/sample1_Grn.idat,/path/to/sample1_Red.idat
/path/to/sample2_Grn.idat,/path/to/sample2_Red.idat
/path/to/sample3_Grn.idat,/path/to/sample3_Red.idat

Example JSON format:

[
  {
    "Green IDAT Path": "/path/to/sample1_Grn.idat",
    "Red IDAT Path": "/path/to/sample1_Red.idat"
  },
  {
    "Green IDAT Path": "/path/to/sample2_Grn.idat",
    "Red IDAT Path": "/path/to/sample2_Red.idat"
  },
  {
    "Green IDAT Path": "/path/to/sample3_Grn.idat",
    "Red IDAT Path": "/path/to/sample3_Red.idat"
  }
]
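For samples spread across several directories, a sample sheet in either format can be generated rather than typed by hand. The Python sketch below is one possible approach and is not part of DRAGEN Array. It assumes each sample's files share a name and differ only in the `_Grn.idat` / `_Red.idat` suffix; the function names are hypothetical, while the column names are taken from the examples above.

```python
import csv
import json
from pathlib import Path


def pair_idats(folders):
    """Collect (green, red) path pairs from each folder, in the order given.

    Within a folder, files are visited in sorted order. A green file
    without a matching red file is skipped.
    """
    pairs = []
    for folder in folders:
        for green in sorted(Path(folder).glob("*_Grn.idat")):
            red = green.with_name(green.name[: -len("_Grn.idat")] + "_Red.idat")
            if red.exists():
                pairs.append((str(green), str(red)))
    return pairs


def write_csv_sheet(pairs, out_path):
    """Write the pairs as a CSV IDAT sample sheet."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Green IDAT Path", "Red IDAT Path"])
        writer.writerows(pairs)


def write_json_sheet(pairs, out_path):
    """Write the pairs as a JSON IDAT sample sheet."""
    rows = [{"Green IDAT Path": green, "Red IDAT Path": red} for green, red in pairs]
    with open(out_path, "w") as handle:
        json.dump(rows, handle, indent=2)
```

Writing through the `csv` and `json` modules avoids hand-editing slips such as an unbalanced quote or a trailing comma, either of which makes a JSON sheet unparseable.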

For cloud analysis, the IDAT sample sheet can be a CSV formatted file.

beadChipName,sampleSectionName
Beadchip 1 barcode (204753010023), sample section (R01C01)
Beadchip 1 barcode (204753010023), sample section (R02C01)
Beadchip 2 barcode (204753010024), sample section (R01C01)
Beadchip 2 barcode (204753010024), sample section (R02C01)

For DRAGEN Array Methylation QC on cloud, additional optional sample sheet fields are available.

Following Sample_Group, any number of additional columns can be added to include metadata fields such as sex, sample type, plate and well information, etc. Additional columns added after the Sample_Group column may have user-defined column header values. The Sample_ID field and any additional metadata added will be replicated in the Sample QC Summary output files.

The Sample_Group field will be used to populate the PCA Control Plot within the Sample QC Summary Plots file and the Principal Component Summary file. For the PCA Control Plot, each sample group will be assigned a unique color. Samples assigned to the same Sample_Group value will be the same color in the PCA Control Plot.

beadChipName,sampleSectionName,Sample_ID,Sample_Group,MetaData1
Beadchip 1 barcode (204753010023), sample section (R01C01),NA1231,Group1,F
Beadchip 1 barcode (204753010023), sample section (R02C01),NA1232,Group2,F
Beadchip 2 barcode (204753010024), sample section (R01C01),NA1233,Group2,M
Beadchip 2 barcode (204753010024), sample section (R02C01),NA1234,Group1,M
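A cloud sample sheet with the optional Methylation QC columns could be assembled as in the Python sketch below. It assumes that each cell holds the bare value (for example `204753010023` and `R01C01`) rather than the descriptive wording used in the example rows above. The function name `cloud_sample_sheet` is hypothetical; the column headers come from this section.

```python
import csv
import io

REQUIRED = ["beadChipName", "sampleSectionName"]
OPTIONAL = ["Sample_ID", "Sample_Group"]


def cloud_sample_sheet(samples, extra_columns=()):
    """Return CSV text for a cloud sample sheet.

    `samples` is a list of dicts keyed by column name. User-defined
    metadata columns are placed after Sample_Group. Missing optional
    or metadata values are written as empty cells.
    """
    fieldnames = REQUIRED + OPTIONAL + list(extra_columns)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for sample in samples:
        writer.writerow({name: sample.get(name, "") for name in fieldnames})
    return buffer.getvalue()
```

Keeping the user-defined columns last matches the rule that additional metadata columns follow Sample_Group.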

GTC Sample Sheet

The GTC sample sheet is a CSV or JSON formatted file with direct paths to sample GTC files. It enables easy analysis of samples from different directories.

Example CSV format:

GTC Path
/path/to/sample1.gtc
/path/to/sample2.gtc
/path/to/sample3.gtc

Example JSON format:

[
  {
    "GTC Path": "/path/to/sample1.gtc"
  },
  {
    "GTC Path": "/path/to/sample2.gtc"
  },
  {
    "GTC Path": "/path/to/sample3.gtc"
  }
]
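Before starting an analysis, it can be useful to check that a GTC sample sheet parses and lists the expected files. The Python sketch below reads either format into a list of paths. It is an illustration rather than part of DRAGEN Array: it chooses the parser from the file extension, which is an assumption, and the function name `read_gtc_sheet` is hypothetical.

```python
import csv
import json
from pathlib import Path


def read_gtc_sheet(sheet_path):
    """Return the GTC paths listed in a CSV or JSON GTC sample sheet.

    Files ending in .json are parsed as JSON; anything else is read as
    CSV with a "GTC Path" header row.
    """
    path = Path(sheet_path)
    if path.suffix.lower() == ".json":
        rows = json.loads(path.read_text())
    else:
        with path.open(newline="") as handle:
            rows = list(csv.DictReader(handle))
    return [row["GTC Path"] for row in rows]
```

A sheet that lacks the `GTC Path` column raises `KeyError`, and malformed JSON raises `json.JSONDecodeError`, so formatting problems surface before the analysis is launched.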

Input File Summary Table

In addition to the input files, there is a set of intermediate files, including GTC, SNV VCF, CNV VCF and PGx CSV, which are outputs of some DRAGEN Array Local commands and inputs to other commands.

The table below summarizes the input and intermediate files, their sources, and the associated DRAGEN Array Local commands and options.

Input File

Source

Command

Option

Reference

Support and Additional Resources

Technical Support

For support, questions, and feedback on DRAGEN Array, please contact Illumina Tech Support at [email protected].

Additional Resources

Resource

Description

Frequently Asked Questions

Is DRAGEN Array analysis a local (on-premises) or cloud solution?

DRAGEN Array analysis is available both locally (on-premises) and in the cloud.

DRAGEN Array Local Analysis uses a command-line interface that gives power users granular control and the flexibility to support large-scale microarray genomic studies. Deployed on Windows or Linux operating systems, the local package is CPU-based and does not require a specialized server or hardware.

DRAGEN Array Cloud Analysis uses the user-friendly graphical interface of BaseSpace Sequence Hub to simplify analysis setup and kickoff.

Release Notes

The following versions of DRAGEN Array have been released:

DRAGEN Array v1.0.0 Release Notes

RELEASE DATE

December 2023

RELEASE HIGHLIGHTS

DRAGEN Array Genotyping Cloud v1.0.0 Release Notes

RELEASE DATE

March 2024

RELEASE HIGHLIGHTS

Ability to genotype and produce related reports for human and non-human arrays in the cloud.
Configurable interfaces in BaseSpace that allow for flexibility and easy kickoff.

NEW FEATURES IN DETAIL

KNOWN ISSUES

Some multi-nucleotide variant (MNV) designs reverse complement the "Allele1/2 Top" fields in the Final Report.

KNOWN LIMITATIONS

Genotyping only works on diploid organisms at this time. Polyploid genotyping is not currently supported.

DRAGEN Array Methylation QC Cloud v1.0.0 Release Notes

RELEASE DATE

May 2024

RELEASE HIGHLIGHTS

Adjustable thresholds to determine pass/fail status
Data summary plots for a quick visual check of each analysis batch
Determining detection p-value, beta-values, and m-values from each methylation sample
Deployment on BaseSpace™ Sequence Hub user interface for easy analysis kickoff
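For readers unfamiliar with the two methylation measures named above, the Python sketch below shows the relationship commonly used in the methylation array literature, where the m-value is the base-2 logit of the beta-value. This guide does not state the exact formula DRAGEN Array applies, so treat the conversion as a general illustration only; the function name `beta_to_m` is hypothetical.

```python
import math


def beta_to_m(beta: float) -> float:
    """Convert a beta-value to an m-value using M = log2(beta / (1 - beta)).

    This is the conventional logit relationship, assumed here for
    illustration. Beta must lie strictly between 0 and 1.
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return math.log2(beta / (1.0 - beta))
```

Under this relationship, a beta-value of 0.5 corresponds to an m-value of 0, and values move toward large positive or negative m-values as beta approaches 1 or 0.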

NEW FEATURES IN DETAIL

Adjustable thresholds for 21 built-in controls, p-value detection, proportion of probes passing, and offset correction within BaseSpace Sequence Hub, to customize for the user's study needs
- Thresholds are used to assign pass (1) or fail (0) status to each sample
  - Failed metrics can be highlighted for easy viewing
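The pass (1) or fail (0) assignment can be pictured with the Python sketch below. It illustrates the thresholding idea only and is not the product's implementation. The metric names in the usage note, the "min"/"max" convention for expressing a threshold's direction, and the function name `qc_status` are all assumptions.

```python
from typing import Dict, List, Tuple

# Each threshold is ("min", limit) when the metric must be at least the
# limit, or ("max", limit) when it must be at most the limit.
Threshold = Tuple[str, float]


def qc_status(
    metrics: Dict[str, float], thresholds: Dict[str, Threshold]
) -> Tuple[int, List[str]]:
    """Return (1, []) if every metric passes, else (0, names of failed metrics)."""
    failed = []
    for name, (kind, limit) in thresholds.items():
        value = metrics[name]
        passed = value >= limit if kind == "min" else value <= limit
        if not passed:
            failed.append(name)
    return (0 if failed else 1, failed)
```

With a hypothetical threshold `{"proportion_probes_passing": ("min", 0.9)}`, a sample scoring 0.85 returns `(0, ["proportion_probes_passing"])`. The returned list of failed metric names is what a report could use for highlighting.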

KNOWN ISSUES

KNOWN LIMITATIONS

Standard thresholds may not be applicable for all discontinued, semi-custom or custom BeadChips and IDATs originating from NextSeq550
Built-in controls may not be available on all discontinued, semi-custom or custom BeadChips

PGx CNV Coverage

Copy number variation can be detected for genes and regions listed below. The chromosome locations are GRCh38 based.

Gene

Region Name

Chromosome

Start

End

GSTM1

109687842

109693526

PGx Allele Definitions and PGx Guidelines

DRAGEN Array star allele calling uses the star allele definitions provided by PharmVar and PharmGKB. Star allele phenotype annotation, performed with the “star-allele annotate” command, is a direct lookup into a public PGx guideline, either CPIC or DPWG, which the user selects when running DRAGEN Array.

See table below for details of the data sources.

Data Source

Version

URL

PharmVar

6.0.5

https://www.pharmvar.org

PharmGKB

The DRAGEN Array “star-allele annotate” command provides both metabolizer status and activity score annotations for genes covered by the CPIC and DPWG guidelines.

Specifically, CPIC metabolizer/phenotype annotations are supported for CACNA1S, CYP2B6, CYP2C19, CYP2C9, CYP2D6, CYP3A5, DPYD, G6PD, MT-RNR1, NUDT15, RYR1, SLCO1B1, TPMT, UGT1A1, CFTR, IFNL3/IFNL4 and VKORC1; of these, activity scores are supported for CYP2C9, CYP2D6, and DPYD. DPWG metabolizer/phenotype annotations are supported for CYP1A2, CYP2B6, CYP2C19, CYP2C9, CYP2D6, CYP3A4, CYP3A5, DPYD, NUDT15, SLCO1B1, TPMT, UGT1A1, VKORC1 and F5; of these, activity scores are supported for CYP2D6 and DPYD.
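The gene lists above can be captured in a small lookup, which helps when deciding which guideline to select for a given gene. The Python sketch below is a convenience illustration built only from the lists in this section; the function name `annotation_support` is hypothetical and is not part of DRAGEN Array.

```python
# Gene lists transcribed from the paragraph above.
PHENOTYPE_GENES = {
    "CPIC": {
        "CACNA1S", "CYP2B6", "CYP2C19", "CYP2C9", "CYP2D6", "CYP3A5", "DPYD",
        "G6PD", "MT-RNR1", "NUDT15", "RYR1", "SLCO1B1", "TPMT", "UGT1A1",
        "CFTR", "IFNL3/IFNL4", "VKORC1",
    },
    "DPWG": {
        "CYP1A2", "CYP2B6", "CYP2C19", "CYP2C9", "CYP2D6", "CYP3A4", "CYP3A5",
        "DPYD", "NUDT15", "SLCO1B1", "TPMT", "UGT1A1", "VKORC1", "F5",
    },
}

ACTIVITY_SCORE_GENES = {
    "CPIC": {"CYP2C9", "CYP2D6", "DPYD"},
    "DPWG": {"CYP2D6", "DPYD"},
}


def annotation_support(guideline: str, gene: str):
    """Return (phenotype_supported, activity_score_supported) for a gene."""
    return (
        gene in PHENOTYPE_GENES[guideline],
        gene in ACTIVITY_SCORE_GENES[guideline],
    )
```

For instance, CYP2C9 has phenotype annotation under both guidelines, but an activity score only under CPIC.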

PGx Star Allele Coverage

The genes and star alleles listed below can be detected by DRAGEN Array v1.0 if they are available on the microarray. Known and novel star alleles that are not in the list below will not be reported. Star allele definitions are sourced from PharmVar and PharmGKB.

Gene

PGx Alleles

ADH1B

Reference;rs1229984.T>C;rs1229985.A>G;rs17033.T>C;rs1789891.C>A;rs2018417.C>A;rs2066702.G>A;rs75967634.C>T

ALDH2

Reference;rs671.G>A

ANK3

Reference;rs143414470.T>C

ANKK1
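The PGx Alleles cells in the table are semicolon-separated lists, which makes them easy to split programmatically. The Python sketch below handles only the entry shapes visible in the rows shown here: the literal `Reference` and entries of the form `<variant ID>.<change>`, such as `rs671.G>A`. How other entry styles are written is not shown in this section, so the sketch simply keeps anything else as an ID with no change. The names `Allele` and `parse_pgx_alleles` are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Allele(NamedTuple):
    variant_id: Optional[str]  # None for the "Reference" entry
    change: Optional[str]      # e.g. "G>A"; None when no change is given


def parse_pgx_alleles(cell: str) -> List[Allele]:
    """Split a PGx Alleles cell into its individual entries."""
    alleles = []
    for entry in cell.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        if entry == "Reference":
            alleles.append(Allele(None, None))
            continue
        # Split at the first "." into the ID and the change as written.
        variant_id, _, change = entry.partition(".")
        alleles.append(Allele(variant_id, change or None))
    return alleles
```

Applied to the ALDH2 row, `parse_pgx_alleles("Reference;rs671.G>A")` yields the reference entry followed by `Allele("rs671", "G>A")`.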

Document Revision History

The version history for DRAGEN Array product documentation:

Version

Date

Description of Change