DRAGEN Array v1.3

Overview

Welcome to DRAGEN Array

DRAGEN (Dynamic Read Analysis for GENomics) Array secondary analysis is a powerful bioinformatics software for Illumina Infinium array-based assays. DRAGEN Array uses cutting-edge data analysis tools to provide accurate, comprehensive, and highly efficient secondary analysis to maximize genomic insights and meet your research needs across multiple applications.

DRAGEN Array is offered as a local package with a command-line interface (no specialized server or hardware required) and as a cloud-based package with an intuitive graphical user interface, as summarized in the table below.

Description

Key features

Local analysis

Cloud analysis

Product Guides

DRAGEN Array Methylation QC

Methylation QC Threshold Adjustment

When using DRAGEN Array – Methylation – QC cloud analysis type, additional customization options will appear after product files are selected within Configuration Settings. Adjustments to these thresholds will be saved as part of the Configuration Setting. Thresholds can be adjusted based on study objectives. Adjusting thresholds will impact the pass or fail status of samples in the output files.

Illumina recommends thresholds for MethylationEPIC v1 & v2 and Methylation Screening Array (MSA). Users may use these thresholds as a starting point when defining thresholds for their custom or semi-custom BeadChip or other Infinium Methylation arrays. Further tuning may be required based on the BeadChip used, laboratory conditions, iScan settings, bisulfite conversion methods, FFPE sample type, etc. A dataset deemed acceptable to the user based on proportion probes passing can be used for these additional threshold adjustments.

Reference

Support and Additional Resources

Technical Support

For support, questions, and feedback on DRAGEN Array, please contact Illumina Tech Support at [email protected].

Additional Resources

Resource

Description

Frequently Asked Questions

Is DRAGEN Array analysis a local (on-premises) or cloud solution? DRAGEN Array analysis is available both locally (on-premises) and in the cloud.
DRAGEN Array Local Analysis utilizes a command-line interface for power users to have granular control and flexibility to support large scale microarray genomic studies. Deployed on Windows or Linux operating systems, the local package is CPU-based and does not require a specialized server or hardware.
DRAGEN Array Cloud Analysis utilizes the user-friendly, graphical interface of BaseSpace Sequence Hub to simplify analysis setup and kickoff.

Release Notes

The following versions of DRAGEN Array have been released:

DRAGEN Array v1.3.0 Release Notes
- DRAGEN Array v1.3.0 + Emedgene V100.39.0 Release Notes

DRAGEN Array v1.3.0 + Emedgene V100.39.0 Release Notes

RELEASE DATE

October 2025

RELEASE HIGHLIGHTS

DRAGEN Array v1.2.0 Release Notes

RELEASE DATE

February 2025

RELEASE HIGHLIGHTS

DRAGEN Array v1.1.0 Release Notes

RELEASE DATE

September 2024

RELEASE HIGHLIGHTS

New EX PGx beadchips enabled for PGx analysis
Increased coverage of high priority PGx genes
Custom optimized .egt files accepted in PGx analysis
Up-to-date database reflecting latest versions of public PGx resources

NEW FEATURES IN DETAIL

DRAGEN Array supports multiple PGx products
- Two new EX PGx beadchips enabled through genotyping, PGx CNV calling, and star allele annotation
  - Infinium Global Screening Array with Enhanced PGx-48 v4.0 Kit

KNOWN ISSUES

Some simple variants have REF and ALT delimited by _ instead of > in the star_alleles.csv and metabolizer status JSON files (e.g., "ryr1.38577931a_c" instead of "ryr1.38577931a>c")
Some multi-nucleotide variant (MNV) designs reverse complement the "Allele1/2 Top" fields in the Final Report
Occasional star-allele solution score discordance between Linux and Windows OS with concordant solution ranking.
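
The delimiter issue in the first known issue above can be handled downstream with a small normalization step. The following is a minimal sketch, not part of DRAGEN Array. It assumes that affected identifiers end in a single-base REF, an underscore, and a single-base ALT, as in the example given above; the function name is hypothetical.

```python
import re

# Assumption: affected IDs end in "<digits><ref>_<alt>" with single-base alleles,
# matching the example "ryr1.38577931a_c" from the known issue.
_SIMPLE_SNV = re.compile(r"^(?P<prefix>.*\d)(?P<ref>[acgtACGT])_(?P<alt>[acgtACGT])$")


def normalize_variant_id(variant_id):
    """Return the ID with '_' replaced by '>' between REF and ALT, if it matches."""
    match = _SIMPLE_SNV.match(variant_id)
    if match is None:
        # Leave anything that does not look like a simple variant untouched.
        return variant_id
    return f"{match['prefix']}{match['ref']}>{match['alt']}"


print(normalize_variant_id("ryr1.38577931a_c"))
```

Identifiers that already use > or that contain underscores elsewhere (for example in a coordinate range) do not match the pattern and are returned unchanged.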

KNOWN LIMITATIONS

Star allele calling does not support novel alleles; only alleles defined in the PharmVar and PharmGKB databases are called.
CYP2D6 non-*36 star alleles with exon 9 conversion, such as *83, are reported as *36 with *83 as an underlying allele.
Genotyping only supports diploid organisms. Polyploid genotyping is currently not supported.

DRAGEN Array Methylation QC Cloud v1.0.1 Release Notes

RELEASE DATE

February 2026

RELEASE HIGHLIGHTS

Hot-fix release patching known issue from v1.0 where analysis could fail for the 80-100 sample size range when using large (>900K probes) arrays. Large arrays are now supported in 80-100 sample size range as well.

KNOWN LIMITATIONS

Standard thresholds may not be applicable for all discontinued, semi-custom or custom BeadChips and IDATs originating from NextSeq550
Built-in controls may not be available on all discontinued, semi-custom or custom BeadChips

DRAGEN Array v1.0.0 Release Notes

RELEASE DATE

December 2023

RELEASE HIGHLIGHTS

Improved star allele calling accuracy for Global Diversity Array with enhanced PGx (GDA-ePGx) BeadChips.
Reports star allele calls with quality scores for greater transparency and confidence.
Provides missing variant reporting to improve data quality.

NEW FEATURES IN DETAIL

Star Allele Calling
- Star allele calling for genes listed in
  - For in-silico datasets, call rate ≥99%, diplotyping accuracy ≥ 90%

KNOWN ISSUES

Corrupt or invalid GTC files cause the run to abort with an error instead of being skipped. The corrupt or invalid GTC files need to be removed before proceeding.
In the gtc-to-vcf subcommand, a mismatch between the BPM and CSV manifests does not cause the command to abort with an error. The mismatch needs to be addressed before proceeding.
For gtc-to-vcf, multi-allelic variants designed with multiple assays might not always collapse into one variant correctly and may instead be reported as two separate variants. Some indel variants are missing from the SNV VCF due to a mapping issue between the designed indels and the reference genome.

There is a workaround to disable globalization and produce valid GTC files:

Locate the dragena.runtimeconfig.json file inside the installation directory of DRAGEN Array (i.e., where the .zip or .tar.gz file was downloaded and unzipped).
Add the key System.Globalization.Invariant to that file and set its value to true. (i.e., step #2 here: https://github.com/dotnet/corefx/blob/master/Documentation/architecture/globalization-invariant-mode.md#enabling-the-invariant-mode)
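
The two workaround steps above can also be scripted. This is a minimal sketch under the assumption that the key belongs under runtimeOptions.configProperties, which is where the linked .NET documentation places it; the function name is hypothetical, and it is worth backing up the file before editing it.

```python
import json
from pathlib import Path


def enable_invariant_globalization(config_path):
    """Set System.Globalization.Invariant to true in a .NET runtimeconfig.json file."""
    config_path = Path(config_path)
    config = json.loads(config_path.read_text(encoding="utf-8"))
    # Assumption: the key lives under runtimeOptions.configProperties,
    # following the .NET invariant-mode documentation linked above.
    runtime_options = config.setdefault("runtimeOptions", {})
    properties = runtime_options.setdefault("configProperties", {})
    properties["System.Globalization.Invariant"] = True
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Usage would be a single call such as enable_invariant_globalization("dragena.runtimeconfig.json"), run from the installation directory. Existing settings in the file are preserved.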

KNOWN LIMITATIONS

PGx CNV calling and star allele calling and annotation were only validated and intended to be used with GDA_PGx_E2 product files.
The subcommands “unsquash-duplicates” and “filter loci” should not be used during gtc-to-vcf conversion when star allele calling is desired.
Only CPIC guidelines are available for star allele annotation (metabolizer status calling) for the cloud offering. For local, CPIC and DPWG are available.

DRAGEN Array Genotyping Cloud v1.0.0 Release Notes

RELEASE DATE

March 2024

RELEASE HIGHLIGHTS

Ability to genotype and produce related reports for human and non-human arrays in the cloud.
Configurable interfaces in BaseSpace that allow for flexibility and easy kickoff.

NEW FEATURES IN DETAIL

KNOWN ISSUES

Some multi-nucleotide variant (MNV) designs reverse complement the "Allele1/2 Top" fields in the Final Report

KNOWN LIMITATIONS

Genotyping only works on diploid organisms at this time. Polyploid genotyping is not currently supported.

DRAGEN Array Methylation QC Cloud v1.0.0 Release Notes

RELEASE DATE

May 2024

RELEASE HIGHLIGHTS

Adjustable thresholds to determine pass/fail status
Data summary plots for a quick visual check of each analysis batch
Determining detection p-value, beta-values, and m-values from each methylation sample
Deployment on BaseSpace™ Sequence Hub user interface for easy analysis kickoff
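
For readers unfamiliar with the beta-value and m-value metrics named above, the sketch below shows the definitions commonly used for methylation arrays. It is an illustration only: the source does not state how DRAGEN Array computes these values, and the offset of 100 is a widely used convention that is assumed here, not a documented DRAGEN Array default.

```python
import math


def beta_value(methylated, unmethylated, offset=100.0):
    """Fraction of methylated signal; the offset stabilizes low-intensity probes."""
    # Assumption: offset=100 is a common convention, not a confirmed product setting.
    return methylated / (methylated + unmethylated + offset)


def m_value(beta):
    """Logit (base 2) transform of a beta-value; requires 0 < beta < 1."""
    return math.log2(beta / (1.0 - beta))


beta = beta_value(methylated=4000.0, unmethylated=900.0)
print(round(beta, 3), round(m_value(beta), 3))
```

Beta-values are bounded between 0 and 1 and are easy to read as a proportion, while m-values are unbounded and are often preferred for statistical testing.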

NEW FEATURES IN DETAIL

Adjustable thresholds for 21 built-in controls, p-value detection, proportion probes passing, and offset correction within BaseSpace Sequence Hub to customize for the user's study needs
- Thresholds are used to assign pass (1) or fail (0) status to each sample
  - Failed metrics can be highlighted for easy viewing

KNOWN ISSUES

Analysis may fail for the 80-100 sample size range when using large (>900K probes) arrays. Users who encounter this issue are advised to increase the sample size as a workaround. The issue does not affect sample sizes strictly greater than 100.

KNOWN LIMITATIONS

Standard thresholds may not be applicable for all discontinued, semi-custom or custom BeadChips and IDATs originating from NextSeq550
Built-in controls may not be available on all discontinued, semi-custom or custom BeadChips

PGx CNV Coverage

Copy number variation can be detected for genes and regions listed below. The chromosome locations are GRCh38 based.

Gene

Region Name

Chromosome

Start

End

GSTM1

109687842

109693526

UGT2B17

Document Revision History

The version history for DRAGEN Array product documentation:

Version

Date

Description of Change

December 2023

Initial release

March 2024

Added details for DRAGEN Array v1.0.0 cloud genotype pipeline release.

Re-generate the GTC using the genotype call subcommand.

fail:  ArrayAnalysis.CLI.App[0] 
        [07:17:07 6620]: System.IO.EndOfStreamException: Unable to read beyond the end of the stream. 
             at System.IO.BinaryReader.ReadString() 
             at ArrayAnalysis.Core.GtcFileLoader..ctor(String filePath) in /src/ArrayAnalysis.Core/GtcFileLoader.cs:line 161 
             at ArrayAnalysis.Services.GtcFactory.CreateGtcFromSample(Sample sample, Boolean log) 
             at ArrayAnalysis.Services.GtcToVcfService.Run(GtcToVcfInput input) 
             at ArrayAnalysis.CLI.App.RunCliServiceAndReturnExitCode(BaseOptions opts) in /src/ArrayAnalysis.CLI/App.cs:line 110

If a sample's sex estimate is called as unknown in the genotyping module, the cytogenetic caller will assume the sample is male. Consequently, detection results on sex chromosomes could be inaccurate if the sample is actually female.
ISCN annotations in the cytogenetic annotation JSON output file are only provided for variants greater than 1 Kb in length. This is often cited as a minimum size limit used to define copy number variants.
ISCN annotations are not provided for LOH variants in the cytogenetic annotation JSON output file.
Centromere regions typically have low sequence complexity and are prone to artifacts. As a result, cytogenetic calling results in these regions are likely to be false positives.
The cyto annotate subcommand produces extraneous logs (e.g., No credential is provided) that can be safely ignored.
During cyto call, there is a log for the CytoPlatform currently hardcoded to LCG regardless of the product used. This has no bearing on the underlying algorithm and is just what is reported in the log. It can be safely ignored.
A non-default value of the --smoothing parameter for the command triggers a bug that causes wrong values in the LogR Ratios (LRR) bedgraph. Users are advised to use the default (0), which produces a valid LRR bedgraph with raw signal for visualization purposes. The --smoothing parameter will be disabled in the next release of DRAGEN Array.
The cyto call command may throw an overflow error in very rare cases when no variants are detected in noisy or low-quality samples. Contact [email protected] if you encounter this issue.
The minimum deletion/LOH/duplication thresholds shown in the cyto annotation JSON may be shown in the wrong units when set higher than the cyto calling thresholds.
Cyto CNV/LOH variants with quality scores of 0 seen in the cyto call VCF files cannot be passed into the annotation output json files.
CYP2A6 *1 definition incorrectly includes NC_000019.10:g.40848264_40848265delinsT.
DRAGEN Array – Cytogenetics Calling and DRAGEN Array - Cytogenetics analysis + Emedgene interpretation cloud analyses may fail around 200 samples in one batch due to high memory usage. Recommended workaround is to run smaller batches.
Sample with a reference allele for ABCG2 genes will have missing phenotype annotations when running the command pgx star-allele annotate.
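
The smaller-batch workaround recommended above for cloud cytogenetics analyses can be prepared with a simple helper. This is a sketch only; the batch size is an assumption to be chosen by the user, since the source states only that failures may occur around 200 samples.

```python
def split_into_batches(samples, batch_size):
    """Split a list of sample identifiers into consecutive batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]


# Hypothetical sample names; 250 samples split into batches of at most 100.
samples = [f"sample_{n:03d}" for n in range(250)]
batches = split_into_batches(samples, batch_size=100)
print([len(batch) for batch in batches])
```

Each batch can then be written to its own sample sheet and launched as a separate analysis.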

Rare intermittent memory issues can occur during star allele calling. Example error message: The model has been changed since the solution was last computed. To work around the issue, restart star allele calling or run it on a machine with more memory.

Input Files

The following section describes the input files required by DRAGEN Array. Product files (anything other than the IDATs) can be found on the support site.

IDAT Files

For each sample, a pair of raw intensity files (.idat) is generated by the iScan System or NextSeq550 (for select arrays). They provide intensities in the red and green channels for each probe on the Infinium array. More information on which arrays can be used with NextSeq550 can be found on the Illumina Knowledge page on NextSeq550.

An IDAT file is identified by the BeadChip Barcode (12-digit unique Sentrix ID, e.g., 123456789101), BeadChip Position (row and column of the sample, e.g., R01C01), and Grn (Green) or Red for the specific channel.
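
Because each sample needs both channels, it can be useful to check a folder listing for incomplete pairs before starting an analysis. The sketch below assumes the three identifiers are joined with underscores in the file name (barcode_position_channel.idat); that layout and the function name are assumptions for illustration, not taken from the text above.

```python
import re
from collections import defaultdict

# Assumption: file names look like 123456789101_R01C01_Grn.idat
IDAT_NAME = re.compile(
    r"^(?P<barcode>\d{12})_(?P<position>R\d{2}C\d{2})_(?P<channel>Grn|Red)\.idat$"
)


def find_incomplete_samples(filenames):
    """Return (barcode, position) keys that lack a Grn or a Red file."""
    channels = defaultdict(set)
    for name in filenames:
        match = IDAT_NAME.match(name)
        if match:
            channels[(match["barcode"], match["position"])].add(match["channel"])
    return sorted(key for key, seen in channels.items() if seen != {"Grn", "Red"})


files = [
    "123456789101_R01C01_Grn.idat",
    "123456789101_R01C01_Red.idat",
    "123456789101_R02C01_Grn.idat",
]
print(find_incomplete_samples(files))
```

File names that do not match the assumed pattern are ignored by this check.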

Manifest Files

The CSV and BPM manifest files can be found on the Illumina Support Site for all commercial Infinium BeadChips or on for custom and semi-custom designs. DRAGEN Array only supports manifest files from the Illumina Support site. For instructions on obtaining manifest files from MyIllumina, see Illumina Knowledge article, .

The CSV manifest file (.csv) provides complementary data to the BPM manifest file in a human readable format. It is a required input to the genotype gtc-to-vcf command to enable VCF generation for insertion/deletion variants. gtc-to-vcf depends on the presence of accurate mapping information within the manifest, and may produce inaccurate results if the mapping information is incorrect. Mapping information follows the implicit dbSNP standard, where

Positions are reported with 1-based indexing.
Positions in the PAR are reported with mapping position to the X chromosome.
For an insertion relative to the reference, the position of the base immediately 5' to the insertion (on the plus strand) is given.

Cluster File

The cluster file (.egt) is a standard product file provided by Illumina for commercial genotyping products and it is a required input for the genotype call command in DRAGEN Array. Custom cluster files may be required for optimal genotyping performance. See section for additional details.

PGx CN Model File

The PGx CN (Copy Number) model file (.dat) is a required input to the pgx copy-number call command to enable accurate copy number calling for pharmacogenomics. Illumina provides a standard CN model file for each PGx array product. See section for additional details.

Cytogenetics Model File

The cytogenetics CN (Copy Number) model file (.dat) is a required input to the cyto call command to enable accurate Cytogenetics analysis. Illumina provides a standard CN model file for each supported array product. For custom or other products, please contact Tech Support to request a CN model file and include the product BPM manifest.

Note: The CN model file needs to be updated upon manifest revisions since probes can be added or removed during manifest revisions. A mismatch between the CN model file and the manifest will cause an error during pgx copy-number call and cyto call.

Mask File

The mask file (.msk) is a required input to the pgx copy-number train command to enable accurate pgx copy number training for pharmacogenomics. It does not need to be provided as an explicit input to the command line interface but should reside in the same folder as the BPM manifest. It should have the same base name as the manifest for the product. Illumina provides a mask file for each PGx array product and these can be found on the

PGx Database File

The PGx database file (.zip) contains the variant mapping information from Infinium PGx arrays to PGx variants. Each line in this file represents a single probe ID mapping to a variant's HGVS (Human Genome Variation Society) tag. This creates a map of many probes to one variant. DRAGEN Array cross references this map with SNV VCF IDs during runtime to do star allele calling. It works across all supported PGx products, even though the probes and variant coverage differ across them.

Cytogenetics Database File

The cytogenetics database file (.zip) contains information from Ensembl and RefSeq data sources used in the generation of Cytogenetics Annotation JSON File. This file can be used across products (beadchip/manifest types and versions). It is only necessary for input to local analysis (i.e., cyto annotate) as it is already stored in the cloud for cloud analysis. It may be updated in the future to accommodate changes in the underlying Ensembl and RefSeq data sources.

Genome FASTA Files

The genome FASTA file (.fa) is a text file with the reference genome sequences. The FASTA index file (.fai) contains metadata about chromosomal orchestration within the FASTA file for a particular species. DRAGEN Array PGx calling supports human genome builds 37 and 38. The genome FASTA file and FASTA index file are both provided by Illumina for human species and should be stored together in the same input folder. For custom reference genomes, the contig identifiers in the provided genome FASTA file must match exactly the chromosome identifiers specified in the provided manifest. For a standard human product manifest, this means that the contig headers should read ">1" rather than ">chr1".

Note: The Genome FASTA file is only required for the dragen-array-local-analysis workflow. If you're using dragen-array-cloud-analysis, you do not need to provide this file.
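
The contig naming requirement for custom reference genomes can be checked ahead of time against the FASTA index. The sketch below relies on the standard .fai layout, in which the first tab-separated column is the contig name. The function names are hypothetical, and the manifest chromosome list is supplied by the caller, since reading a manifest is outside the scope of this example.

```python
def read_fai_contigs(fai_text):
    """Return contig names from the text of a FASTA index (.fai) file."""
    return [line.split("\t", 1)[0] for line in fai_text.splitlines() if line.strip()]


def contigs_missing_from_fasta(manifest_chromosomes, fai_text):
    """Return manifest chromosome identifiers with no exactly matching contig."""
    contigs = set(read_fai_contigs(fai_text))
    return sorted(set(manifest_chromosomes) - contigs)


# A "chr"-prefixed index does not match a manifest that uses "1", "2", ...
fai_text = "chr1\t248956422\t6\t60\t61\nchr2\t242193529\t253105708\t60\t61\n"
print(contigs_missing_from_fasta(["1", "2"], fai_text))
```

An empty result means every chromosome identifier in the manifest has an identically named contig in the FASTA file.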

Sample Sheet

The sample sheet is a CSV formatted input file that utilizes a couple required fields for sample lookup (SentrixBarcode_A, SentrixPosition_A for local, beadChipName, sampleSectionName for cloud) to enable adding optional metadata and analyzing a filtered list of samples within a folder. It is intended to be flexible and the local version should be backwards compatible with most GenomeStudio samplesheets.

The root folder in which DRAGEN Array searches for the sample files can be set either via the --idat-folder or --gtc-folder options (where applicable) or by setting the RootFolder field in the [Header] section. RootFolder should be the full absolute path to the sample files. e.g.,

Note: In the case of conflict between RootFolder and the CLI options (--idat-folder or --gtc-folder), the CLI options take precedence.
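
The precedence rule in the note above amounts to the following logic. This is an illustrative sketch of the documented behavior, not DRAGEN Array code; the function and argument names are hypothetical.

```python
def resolve_root_folder(cli_folder=None, header_root_folder=None):
    """Pick the folder to search: the CLI option wins over the sample sheet RootFolder."""
    if cli_folder:
        # --idat-folder or --gtc-folder was given on the command line.
        return cli_folder
    if header_root_folder:
        # Fall back to RootFolder from the [Header] section.
        return header_root_folder
    raise ValueError("No root folder given on the command line or in the sample sheet")


print(resolve_root_folder(cli_folder="/data/run1/idats", header_root_folder="/data/old"))
```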

The following are all examples of valid samplesheets:

Most basic (no sections, one sample)

Medium complexity (no sections, multiple samples, optional data)

High complexity (sections, multiple samples, optional data)

Notes:

The column names are case insensitive. For example, the columns Sample_Name and sample_name would be considered the same, and the software would produce an error like this: Duplicate column sample_name found. Column names are case-insensitive. Please remove or rename the column from the samplesheet and re-process.
Because user-provided fields get output in the , the column names cannot conflict with those fields. For example, if the user provides a column named Sex Estimate in their samplesheet, DRAGEN Array will produce the following error: Sex Estimate is a reserved keyword. Please remove or rename the column from the samplesheet and re-process.
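
Both rules in the notes above can be checked before submitting a samplesheet. The following is a minimal sketch: only Sex Estimate is confirmed as a reserved name by the text, so the reserved set is deliberately incomplete, and treating reserved names as case-insensitive is an assumption. The function name is hypothetical.

```python
# Only "Sex Estimate" is named in the documentation; extend this set as needed.
RESERVED_COLUMNS = {"sex estimate"}


def validate_columns(columns):
    """Return a list of problems found in samplesheet column names."""
    errors = []
    seen = set()
    for column in columns:
        key = column.strip().lower()
        if key in seen:
            errors.append(f"Duplicate column {key} found.")
        seen.add(key)
        # Assumption: reserved names are matched case-insensitively.
        if key in RESERVED_COLUMNS:
            errors.append(f"{column} is a reserved keyword.")
    return errors


print(validate_columns(["Sample_ID", "Sample_Name", "sample_name", "Sex Estimate"]))
```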

For cloud analyses (i.e., for use in sample selection in ), the samplesheet does not currently support sections such as [Header] and [Data] and instead of using SentrixBarcode_A and SentrixPosition_A columns as the sample's keys, it uses beadChipName and sampleSectionName. i.e., a valid cloud samplesheet could look like this:

There is also a template available on the sample selection interface on Basespace.

Methylation QC sample sheet

For DRAGEN Array Methylation QC on cloud, the additional optional sample sheet fields are used in analysis.

Following Sample_Group, any number of additional columns can be added to include meta data fields such as sex, sample type, plate and well information, etc. Additional columns added after the Sample_Group column may have user-defined column header values. The Sample_ID field and any additional metadata added will be replicated in the Sample QC Summary output files.

The Sample_Group field will be used to populate the PCA Control Plot within the Sample QC Summary Plots file and the Principal Component Summary file. For the PCA Control Plot, each sample group will be assigned a unique color. Samples assigned to the same Sample_Group value will be the same color in the PCA Control Plot. e.g.,

Cytogenetics analysis + Emedgene interpretation sample sheet

For Cytogenetics analysis + Emedgene interpretation on cloud, an additional column, demographicSex, is compared against the Sex Estimate output from the DRAGEN Array genotyping module and displayed in Emedgene. The allowed values for this field are M (Male), F (Female), or U (Unknown).

Example:

Input File Summary Table

In addition to the input files, there is a set of intermediate files, including GTC, SNV VCF, CNV VCF and PGx CSV, which are outputs of some DRAGEN Array Local commands and inputs to other commands.

The table below summarizes the input and intermediate files, their sources, and the associated DRAGEN Array Local commands and options.

Input File

Source

Command

Option

beadChipName,sampleSectionName,Sample_ID,Sample_Group,MetaData1
204753010023,R01C01,NA1231,Group1,F
204753010023,R02C01,NA1232,Group2,F
204753010024,R01C01,NA1233,Group2,M
204753010024,R02C01,NA1234,Group1,M
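
A sample sheet in this layout can be inspected with standard CSV tooling before upload, for example to confirm how samples are distributed across Sample_Group values. The sketch below uses only the column names shown in the example above; the function name and the barcode_position key format are illustrative choices.

```python
import csv
import io
from collections import defaultdict

SHEET = """\
beadChipName,sampleSectionName,Sample_ID,Sample_Group,MetaData1
204753010023,R01C01,NA1231,Group1,F
204753010023,R02C01,NA1232,Group2,F
204753010024,R01C01,NA1233,Group2,M
204753010024,R02C01,NA1234,Group1,M
"""


def samples_by_group(sheet_text):
    """Map each Sample_Group value to the samples (barcode_position) assigned to it."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(sheet_text)):
        sample_key = f"{row['beadChipName']}_{row['sampleSectionName']}"
        groups[row["Sample_Group"]].append(sample_key)
    return dict(groups)


for group, members in samples_by_group(SHEET).items():
    print(group, members)
```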

PGx Allele Definitions and PGx Guidelines

DRAGEN Array star allele calling leverages the star allele definitions provided by PharmVar and PharmGKB. DRAGEN Array star allele phenotype annotation, using the “star-allele annotate” command, is achieved through direct lookup into public PGx guidelines CPIC or DPWG, which is selected by the user when running DRAGEN Array.

See table below for details of the data sources.

Data Source

Version

URL

DRAGEN Array “star-allele annotate” command provides both metabolizer status and activity score annotations for genes covered by the CPIC and DPWG guidelines.

Specifically, CPIC metabolizer/phenotype annotations are supported for CACNA1S, CYP2B6, CYP2C19, CYP2C9, CYP2D6, CYP3A5, DPYD, G6PD, MT-RNR1, NUDT15, RYR1, SLCO1B1, TPMT, UGT1A1, CFTR, IFNL3/IFNL4 and VKORC1; of these, activity scores are supported for CYP2C9, CYP2D6, and DPYD. DPWG metabolizer/phenotype annotations are supported for CYP1A2, CYP2B6, CYP2C19, CYP2C9, CYP2D6, CYP3A4, CYP3A5, DPYD, NUDT15, SLCO1B1, TPMT, UGT1A1, VKORC1 and F5; of these, activity scores are supported for CYP2D6 and DPYD.
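
The gene lists above can be captured as a lookup table, which is convenient when deciding which guideline to select for a given gene panel. The gene sets below are transcribed from the paragraph above; the function name is hypothetical.

```python
PHENOTYPE_GENES = {
    "CPIC": {
        "CACNA1S", "CYP2B6", "CYP2C19", "CYP2C9", "CYP2D6", "CYP3A5", "DPYD",
        "G6PD", "MT-RNR1", "NUDT15", "RYR1", "SLCO1B1", "TPMT", "UGT1A1",
        "CFTR", "IFNL3/IFNL4", "VKORC1",
    },
    "DPWG": {
        "CYP1A2", "CYP2B6", "CYP2C19", "CYP2C9", "CYP2D6", "CYP3A4", "CYP3A5",
        "DPYD", "NUDT15", "SLCO1B1", "TPMT", "UGT1A1", "VKORC1", "F5",
    },
}
ACTIVITY_SCORE_GENES = {
    "CPIC": {"CYP2C9", "CYP2D6", "DPYD"},
    "DPWG": {"CYP2D6", "DPYD"},
}


def annotation_support(gene, guideline):
    """Return (phenotype_supported, activity_score_supported) for a gene and guideline."""
    return (
        gene in PHENOTYPE_GENES[guideline],
        gene in ACTIVITY_SCORE_GENES[guideline],
    )


print(annotation_support("CYP2C9", "CPIC"), annotation_support("CYP2C9", "DPWG"))
```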

Extended Multi-allelic variants based on the designs in the supported PGx products

DRAGEN Array PGx extends any single allele variant definitions obtained from PharmVar or PharmGKB that have multiple alleles in Illumina's product files to include all alleles of the Multi Allelic Variant (MAV). The table below shows the MAVs that were extended in the DRAGEN Array Database to cover all alleles for that MAV that are in the product files. Allele Name describes the allele that was added to the database.

Gene Symbol

Allele Name

Hgvs

Exceptions to Star Allele Definitions

G6PD

When the reference genome changes, the definition for a star allele sometimes needs to be updated accordingly.

Mediterranean Haplotype and Mediterranean, Dallas, Panama, Sassari, Cagliari, Birmingham are defined by two variants rs5030868 and rs2230037. In genome build GRCh37, Mediterranean Haplotype is defined by rs2230037 G>A and rs5030868 G>A, and Mediterranean, Dallas, Panama, Sassari, Cagliari, Birmingham is defined by rs5030868 G>A, with rs2230037 reference allele G.

In genome build GRCh38, Mediterranean Haplotype is defined by rs5030868 G>A, with rs2230037 reference allele A, and Mediterranean, Dallas, Panama, Sassari, Cagliari, Birmingham is defined by rs2230037 A>G and rs5030868 G>A.

Variant rs2230037 is ignored in all other G6PD alleles except in the two Mediterranean alleles.
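
The build-dependent definitions above describe the same underlying bases in both genome builds; only the reference allele at rs2230037 differs. The sketch below encodes the definitions exactly as written in the two preceding paragraphs and resolves them to the bases carried at each site. It is an illustration of the bookkeeping, not DRAGEN Array's implementation, and all names are hypothetical.

```python
MED_HAPLOTYPE = "Mediterranean Haplotype"
MED_GROUP = "Mediterranean, Dallas, Panama, Sassari, Cagliari, Birmingham"

# Reference base at each site per build, as implied by the definitions above.
REFERENCE_BASE = {
    "GRCh37": {"rs2230037": "G", "rs5030868": "G"},
    "GRCh38": {"rs2230037": "A", "rs5030868": "G"},
}

# Non-reference bases that define each allele in each build.
DEFINITIONS = {
    "GRCh37": {
        MED_HAPLOTYPE: {"rs2230037": "A", "rs5030868": "A"},
        MED_GROUP: {"rs5030868": "A"},
    },
    "GRCh38": {
        MED_HAPLOTYPE: {"rs5030868": "A"},
        MED_GROUP: {"rs2230037": "G", "rs5030868": "A"},
    },
}


def resolved_bases(build, allele):
    """Return the base carried at each site: reference unless the definition changes it."""
    bases = dict(REFERENCE_BASE[build])
    bases.update(DEFINITIONS[build][allele])
    return bases


for build in ("GRCh37", "GRCh38"):
    print(build, resolved_bases(build, MED_HAPLOTYPE), resolved_bases(build, MED_GROUP))
```

In both builds the Mediterranean Haplotype resolves to A at rs2230037 and A at rs5030868, while the second allele resolves to G at rs2230037 and A at rs5030868.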

*0 Star Allele Definition

A *0 allele refers to a full gene deletion of the analyzed gene, if there is no existing star allele name for the deletion allele from source databases, such as PharmVar and PharmGKB.

DRAGEN Array Cloud Analysis

DRAGEN Array Cloud Analysis Overview

DRAGEN Array Cloud utilizes the user-friendly graphical interface of BaseSpace Sequence Hub to simplify DRAGEN Array analysis setup and kickoff. Optional integration with the iScan System allows data to be streamed directly from the instrument to the cloud platform. Analysis data is stored on the Illumina Connected Platform providing secure storage for both microarray and sequencing data.

Getting Started

The following prerequisites are needed to get started with DRAGEN Array Cloud:

Illumina Connected Analytics subscription: An ICA Basic, Professional, or Enterprise subscription can be used; each includes access to BaseSpace Sequence Hub. Follow the to register the software.
Workgroup setup: Administrator must create a workgroup before users can log in. Using a workgroup allows all members of the workgroup to share access to resources, analyses, and data. Learn more about .
- Designating a workgroup as ‘Collaborative’ allows projects to be shared with collaborators or Illumina Tech Support to assist with troubleshooting. To create a collaborative workgroup, select the Enable collaborators outside of this domain checkbox during workgroup creation.

Running Analysis

Before beginning analysis, ensure workgroup context is being used so analysis can be viewed by all members of your workgroup. The name of your workgroup should appear in the top right corner.

Use the following steps to run the Microarray Analysis Setup on BaseSpace Sequence Hub:

Select the Runs tab
Select New Run
Select Microarray Analysis Setup

Use the Select Project link to choose the project for your output files. To select an existing project, click the radio button next to the desired project name. You can also create a project by clicking the New button in the project selection window.
Select the Type of Analysis. Further detail of each Type of Analysis is available in section . Note: For PGx CNV calling, it is recommended that 96 or more samples passing LogRDev <= 0.2 are included in the analysis. For PGx star allele calling, it is recommended to QC the samples and review the samples that have Log R Dev > 0.2, call rate < 0.99, or TGA Control probe < 1.0 to assess the reliability of the analysis. These metrics are provided in the genotyping sample summary file (gt_sample_summary.csv). For more details, see the explanation for the .

DRAGEN Array – Genotyping provides flexibility for turning off/on specific output files and adjusting GenCall score cutoff. It is recommended to turn off VCF output for non-human species and Final Report output for large sample numbers.
DRAGEN Array – Cytogenetics analysis provides options to adjust thresholds as detailed on the page. The GTC and SNV VCF options listed on that page are configured in the "Add Custom Configuration" window at the bottom under "Additional Output Options".
DRAGEN Array - Cytogenetics analysis + Emedgene interpretation shares the same options detailed on the page.

Select your preferred option in the Configuration Settings drop-down menu. Configuration setup will vary based on the Type of Analysis selected. More details are available in section .
Select Next
Select either Import Sample Sheet, Select BeadChips, or Import IDAT Files (Figure 3)

Import Sample Sheet presents a link to upload sample sheet. Users may download a template sample sheet by selecting the Download Template link.
Select BeadChips allows users to select BeadChips from the displayed list of available BeadChips. To select specific samples within a BeadChip, use the Import Sample Sheet option instead.
Import IDAT Files allows users to upload the IDAT files from a local folder to the cloud platform for use with the current and future analyses by users within the same workgroup.

Select Launch Analysis
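
The sample QC recommendations in the Type of Analysis step (Log R Dev > 0.2, call rate < 0.99, or TGA Control probe < 1.0) can be applied to the genotyping sample summary file with a short script. This is a sketch only: the column names below are hypothetical placeholders and need to be replaced with the actual headers in gt_sample_summary.csv.

```python
import csv
import io

# Hypothetical column names; check the header row of your gt_sample_summary.csv.
COL_SAMPLE = "Sample ID"
COL_LOG_R_DEV = "Log R Dev"
COL_CALL_RATE = "Call Rate"
COL_TGA = "TGA Control"


def flag_samples_for_review(summary_csv_text):
    """Return {sample: [reasons]} for samples outside the recommended QC ranges."""
    flagged = {}
    for row in csv.DictReader(io.StringIO(summary_csv_text)):
        reasons = []
        if float(row[COL_LOG_R_DEV]) > 0.2:
            reasons.append("Log R Dev > 0.2")
        if float(row[COL_CALL_RATE]) < 0.99:
            reasons.append("call rate < 0.99")
        if float(row[COL_TGA]) < 1.0:
            reasons.append("TGA Control probe < 1.0")
        if reasons:
            flagged[row[COL_SAMPLE]] = reasons
    return flagged
```

Flagged samples are candidates for review, not automatic exclusions; the thresholds come from the recommendation above and can be adapted to the study.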

View Outputs

On the Analyses tab, view the analysis status, e.g., initializing or complete.
After the analysis is complete, select the analysis and select the Files tab.
From the Files tab, select the Output folder.

Manage Data

The data management tab allows you to view and manage all your scanned IDAT files in the cloud. Before viewing, ensure workgroup context is being used so all data available to your workgroup can be seen. The name of your workgroup should appear in the top right corner. For more information, see .

Troubleshooting and Additional Support

Troubleshooting iScan integration

The firewall protects the iScan control computer by filtering incoming traffic to remove potential threats. The firewall is enabled by default to block all inbound connections. Keep the firewall enabled and allow outbound connections.

For the instrument to connect to BaseSpace Sequence Hub, you will need to add regional platform endpoints and instrument specific endpoints to the allow list on your firewall. Regional endpoints and further detail can be found in .

The following table shows the applicable endpoints for the iScan.

Endpoint

Sharing a project

Project sharing allows a user to share files with users outside the workgroup for collaboration or with Illumina Tech Support for troubleshooting. To share a project on BaseSpace Sequence Hub, first set the Workgroup type as ‘Collaborative’ during , and then use the following steps to obtain a link to your project. The project can then be accessed by anyone with the link. All files in the project are shared.

Navigate to the Projects tab
Click the button next to the desired project
Select the Share button above the list (Figure 3)
Select the Get Link option to activate a link for the project

Additional Notes:

The project owner maintains ownership and write access. If the project owner deletes the data, the collaborators lose access to it.
Either the sending or the receiving domain must be collaborative. See "Workgroup setup".
Must be in the same (i.e."Data cannot be transferred directly between instances, however you can download and share data separately." )

PGx Star Allele Coverage

Theoretical Coverage

The PGx genes and star/variant alleles listed below can be detected by DRAGEN Array v1.1 if available on the microarray. PGx coverage for specific PGx microarrays can be found here: PGx Star Allele Coverage for Specific PGx Products. Known and novel star alleles not in the below list will not be reported. Star allele definitions are sourced from PharmVar and PharmGKB.

Among the PGx genes, HLA-A, HLA-B, and IFNL3/IFNL4 alleles are covered through tagging variants, specifically HLA-A,*31:01 (rs1061235.A>T); HLA-B,*15:02 (rs144012689.T>A); HLA-B,*57:01 (rs2395029.T>G); HLA-B,*58:01 (rs9263726.G>A); IFNL3/4, rs12979860 variant (T). Reliability of the tagging SNPs varies depending on the population. Additional information on PGx gene types, variant type versus star allele type, can be found here: Introducing-dragen-array-1-0-for-infinium-array-based-pharmacogenomics-analysis

Gene

PGx Alleles

PGx Star Allele Coverage for Specific PGx Products

PGx star alleles can only be called when the related variants in the star allele definition are present in a PGx product. An auxiliary file ([Product]_GS_import.txt) is provided for each product with the PGx variants and associated star alleles. The product files pages that contain the auxiliary files are listed in the table below. The auxiliary file covers SNPs and indels only; it does not contain SV-defined star alleles.

Instructions on how to use the auxiliary file can be found here: .

Product

GS Import File Name

Product Files Link

Known Limitations of GDA-ePGx, GSAv4-ePGx, and GCRA-ePGx.

APOE: GSAv4-ePGx and GCRA-ePGx do not support calling E2 and E4 due to the lack of functional probes for rs7412 and rs429358.
CYP2A6: GDA-ePGx does not support *5 due to lack of coverage for *5 core variants.
CYP4F2: for all three products

PGx Variants Masked in DRAGEN Array

During DRAGEN Array star allele calling, poorly performing PGx variants are masked and treated as "No Calls". Star alleles that are defined solely by masked variants will NOT be called by DRAGEN Array. The tables below list the variants that are masked per product, with each row representing a single variant. The Variant_ID matches the ID field of the corresponding SNV VCF entry of the PGx product.

GDA-ePGx

Manifest

Gene_Symbol

Variant_ID

GSAv4-ePGx

Manifest

Gene_Symbol

Variant_ID

GCRA-ePGx

Manifest

Gene_Symbol

Variant_ID

DPYD

Reference;rs111858276.T>C;rs112766203.1.G>A;rs112766203.2.G>C;rs114096998.1.G>T;rs114096998.2.G>A;rs114096998.2.G>C;rs115232898.T>C;rs116364703.T>A;rs1180771326.T>C;rs137878450.C>A;rs137999090.C>T;rs138391898.C>T;rs138545885.C>A;rs138616379.C>T;rs139459586.A>C;rs139834141.C>T;rs140039091.C>G;rs140114515.C>T;rs140602333.G>A;rs140602333.G>T;rs140989814.C>G;rs141044036.T>C;rs141439344.C>T;rs141462178.T>C;rs141726921.C>T;rs142512579.C>T;rs142619737.C>G;rs142619737.C>T;rs143154602.G>A;rs143154602.G>T;rs143815742.1.C>A;rs143815742.2.C>T;rs143879757.1.G>T;rs143879757.2.G>A;rs143986398.G>C;rs144395748.1.G>C;rs144395748.2.G>T;rs144935781.T>C;rs145112791.G>A;rs145529148.T>C;rs145548112.C>A;rs145548112.C>T;rs145773863.C>T;rs146356975.T>C;rs146529561.G>A;rs147545709.G>A;rs147601618.A>G;rs148799944.C>G;rs148994843.C>T;rs150036960.G>C;rs150385342.1.C>T;rs150385342.2.C>A;rs150437414.A>G;rs151074666.C>T;rs17376848.A>G;rs1801158.C>T;rs1801159.T>C;rs1801160.C>T;rs1801265.A>G;rs1801266.G>A;rs1801267.C>T;rs1801268.C>A;rs183105782.A>G;rs183385770.C>T;rs186169810.A>C;rs187713395.A>G;rs188052243.T>C;rs190577302.G>C;rs190951787.G>C;rs190951787.G>T;rs199549923.G>T;rs199634007.G>T;rs199646142.C>T;rs199777072.C>T;rs200064537.A>T;rs200296941.T>C;rs200562975.T>C;rs200643089.A>C;rs200687447.1.C>T;rs200687447.2.C>A;rs200687447.2.C>G;rs200693895.A>G;rs200709381.T>G;rs201018345.C>T;rs201035051.T>G;rs201268750.G>T;rs201433243.C>T;rs201615754.1.C>A;rs201615754.2.C>T;rs201648613.C>G;rs201785202.G>A;rs202144771.G>A;rs202212118.C>A;rs2297595.T>C;rs267598785.G>A;rs267598786.C>T;rs267598789.G>A;rs367619008.T>C;rs368146607.T>G;rs368152149.T>C;rs368327291.C>G;rs368519011.T>C;rs368970772.G>T;rs369103276.A>G;rs369575517.G>A;rs370569731.1.C>G;rs370569731.2.C>T;rs370615432.C>A;rs370707404.A>G;rs371258350.C>T;rs371313778.C>T;rs371587702.1.G>A;rs371587702.2.G>C;rs371792178.1.G>A;rs371792178.2.G>C;rs372058915.T>C;rs372307932.A>T;rs372909322.T>C;rs374527058.A>G;rs374531732.C>T;rs374825099.1.G>T;rs374825099.2.G>C;rs
374827081.G>C;rs375436137.C>T;rs375990187.A>G;rs376073289.1.C>T;rs376073289.2.C>A;rs376128878.G>T;rs376273539.G>C;rs377143350.C>T;rs377169736.C>G;rs3918289.G>A;rs3918289.G>C;rs3918290.C>G;rs3918290.C>T;rs45589337.T>C;rs527580106.T>C;rs528152707.C>A;rs528430685.G>A;rs528768620.C>T;rs529019871.T>C;rs532341730.A>T;rs536577604.T>C;rs538336580.T>A;rs538703919.G>A;rs547099198.G>A;rs548783838.C>T;rs55674432.C>A;rs556933127.A>C;rs557220418.G>A;rs558354142.G>A;rs55886062.1.A>C;rs55886062.2.A>T;rs559427764.C>A;rs55971861.T>G;rs56005131.G>T;rs56038477.C>T;rs568169006.T>C;rs568367673.C>A;rs569661196.A>G;rs570122671.G>A;rs571114616.A>G;rs573299212.C>T;rs575763449.G>A;rs575853463.C>T;rs576409484.T>A;rs57918000.G>A;rs59086055.G>A;rs60139309.T>C;rs60511679.A>C;rs61622928.C>T;rs61757362.G>A;rs6670886.C>A;rs6670886.C>T;rs672601273.1.C>A;rs672601273.2.C>T;rs672601275.T>G;rs672601276.C>A;rs672601282.G>A;rs672601284.C>T;rs672601285.T>C;rs672601287.T>G;rs672601288.C>A;rs67376798.T>A;rs72547601.T>C;rs72547602.T>A;rs72549303.del;rs72549304.G>A;rs72549304.G>C;rs72549304.G>T;rs72549305.T>C;rs72549306.1.C>A;rs72549306.2.C>T;rs72549307.T>C;rs72549308.T>G;rs72549309.ATGA[1];rs72549310.G>A;rs72975710.1.G>A;rs72975710.2.G>C;rs745512069.G>A;rs745704371.G>C;rs745833535.T>C;rs745911874.C>T;rs745982505.1.T>C;rs745982505.2.T>A;rs746115989.C>T;rs746329786.T>A;rs746777181.C>T;rs747132274.C>G;rs747161261.C>T;rs747627716.A>C;rs747633945.C>T;rs747858350.G>A;rs747872037.C>A;rs748214188.A>T;rs748235192.1.T>A;rs748235192.2.T>C;rs748266854.G>A;rs748320430.A>C;rs748620513.C>A;rs748620513.C>G;rs748639205.A>C;rs748639205.A>G;rs748853941.T>C;rs748958293.G>A;rs748974194.G>A;rs749157068.C>A;rs749269410.C>T;rs749354734.A>T;rs749586100.T>A;rs749699298.A>C;rs749982106.G>A;rs750147471.T>C;rs75017182.G>C;rs750224169.G>A;rs750423752.A>C;rs750687600.C>T;rs750721736.A>T;rs751049055.C>A;rs751104498.T>C;rs751113340.G>A;rs751190912.G>A;rs751340819.A>G;rs751374989.T>A;rs751399062.G>T;rs751841116.1.C>A;rs751841116.2.C>T;rs751848
058.T>A;rs752020412.C>T;rs752228747.G>A;rs752388408.C>T;rs752518145.C>A;rs752985272.C>A;rs753166888.C>G;rs753217888.G>C;rs753296078.C>G;rs753419296.C>G;rs753527420.C>G;rs753707032.G>A;rs753710779.G>A;rs753820482.T>C;rs753950237.G>A;rs754028972.A>G;rs754125729.1.G>A;rs754125729.2.G>T;rs754467630.G>A;rs754786483.T>C;rs755155824.C>A;rs755407188.T>G;rs755416212.C>T;rs755428442.C>G;rs755645831.A>C;rs755692084.T>G;rs755729055.T>C;rs756020314.G>C;rs756372042.A>G;rs756613407.T>C;rs756684474.T>C;rs756890859.T>C;rs756992995.C>T;rs757155354.T>C;rs757227327.C>T;rs757342874.C>T;rs757376267.C>A;rs757695236.C>T;rs757954074.C>T;rs757958938.T>C;rs757994597.G>A;rs758154803.A>G;rs758489611.C>T;rs758514990.C>T;rs758649719.C>T;rs758699471.T>C;rs759082282.C>A;rs759249769.G>T;rs759424419.A>T;rs759479759.T>C;rs759562628.T>G;rs759766897.T>C;rs759967863.A>G;rs760038956.C>T;rs760222167.T>C;rs760235888.C>T;rs760485592.G>A;rs760553268.G>C;rs760570391.A>G;rs760663364.G>A;rs760663364.G>C;rs761302217.T>C;rs761351410.G>A;rs761479700.G>C;rs761555670.T>C;rs761609256.T>G;rs762083671.T>A;rs762102298.A>C;rs762198241.G>A;rs762430779.G>T;rs762446803.A>C;rs762468894.G>C;rs762523739.T>A;rs762533012.C>T;rs762598766.T>C;rs762779297.T>C;rs762858106.C>T;rs762911226.T>A;rs763008163.T>G;rs763061658.A>G;rs763449831.C>T;rs763506271.T>C;rs763557204.A>G;rs763572567.T>G;rs763623595.A>C;rs763784786.G>C;rs763862486.C>T;rs763893877.T>C;rs763984510.G>C;rs764111543.C>T;rs764270260.G>A;rs764555085.A>G;rs764635955.G>T;rs764666241.C>A;rs764679468.A>C;rs764945792.C>T;rs765001324.C>T;rs765034707.C>A;rs765075551.T>C;rs765131182.G>A;rs765247038.G>A;rs765309287.G>T;rs765465250.T>C;rs765640386.C>A;rs765990958.G>A;rs766411970.A>C;rs766438205.T>C;rs766635900.C>T;rs766700777.C>G;rs766761199.T>G;rs766833304.G>C;rs766885021.A>C;rs767200577.T>C;rs767376585.C>G;rs767437717.G>T;rs767464878.C>A;rs767468952.C>T;rs767482279.A>G;rs767547827.G>C;rs767818267.C>T;rs767836989.T>C;rs767986711.T>G;rs768117152.T>C;rs768157853.G>C;rs768200107.T>G;rs76
8288280.T>C;rs768501828.T>C;rs768507975.A>T;rs768680499.G>T;rs768915005.C>T;rs769190350.T>A;rs769306962.C>T;rs769466648.1.T>G;rs769466648.2.T>C;rs769514867.G>T;rs769696395.T>C;rs769709846.T>C;rs769820114.C>T;rs769847078.T>C;rs769932607.G>A;rs770229152.T>A;rs770566506.A>G;rs770958862.G>A;rs771194906.A>G;rs771534236.T>C;rs771536388.C>T;rs771573678.T>A;rs771646887.C>T;rs771648776.T>C;rs771885007.A>G;rs771930534.1.A>T;rs771930534.2.A>G;rs772097379.G>A;rs772264512.G>A;rs772320654.T>C;rs772358811.C>G;rs772544099.G>T;rs772826416.A>G;rs772906420.C>T;rs773159364.C>G;rs773407491.T>C;rs773584401.C>A;rs773652644.T>C;rs773815814.1.C>A;rs773815814.2.C>T;rs773868825.C>T;rs773983635.A>T;rs774134971.T>C;rs774500505.A>T;rs774579695.1.C>T;rs774799003.G>A;rs774883578.A>C;rs775494607.G>A;rs775526810.C>A;rs775570841.G>C;rs775601164.G>A;rs775926386.G>C;rs776082092.C>T;rs776236081.C>T;rs776289153.C>T;rs776321529.G>C;rs776662759.T>G;rs776973423.C>T;rs776984091.T>C;rs777220476.1.C>T;rs777220476.2.C>A;rs777238016.T>C;rs777347164.C>T;rs777368221.A>C;rs777425216.C>A;rs777425216.C>T;rs777560627.G>A;rs777673186.G>C;rs777902288.T>A;rs778022685.C>T;rs778054451.C>T;rs778141885.T>C;rs778298325.C>T;rs778601245.C>T;rs778754188.A>G;rs778760295.C>G;rs778776264.T>C;rs778867644.T>C;rs778911905.A>C;rs779465366.A>G;rs779557503.G>A;rs779573574.T>A;rs779728902.A>T;rs779925747.T>G;rs779967271.T>C;rs780025995.G>A;rs780047918.T>C;rs780120302.T>C;rs78060119.C>A;rs780813130.C>T;rs780873985.T>C;rs780885126.T>C;rs781184141.T>C;rs80081766.C>T;rs866110709.C>T;rs866869468.C>A;rs867143119.C>A;rs867226255.C>T;rs867232786.C>T;rs867600987.C>T;rs868047175.C>T;rs868235016.C>T

G6PD

202G>A_376A>G_1264C>G;A;A- 202A_376G;A- 680T_376G;A- 968C_376G;Aachen;Abeno;Acrokorinthos;Alhambra;Amazonia;Amiens;Amsterdam;Anadia;Ananindeua;Andalus;Arakawa;Asahi;Asahikawa;Aures;Aveiro;B (reference);Bajo Maumere;Bangkok;Bangkok Noi;Bao Loc;Bari;Belem;Beverly Hills, Genova, Iwate, Niigata, Yamaguchi;Brighton;Buenos Aires;Cairo;Calvo Mackenna;Campinas;Canton, Taiwan-Hakka, Gifu-like, Agrigento-like;Cassano;Chatham;Chikugo;Chinese-1;Chinese-5;Cincinnati;Cleveland Corum;Clinic;Coimbra Shunde;Cosenza;Costanzo;Covao do Lobo;Crispim;Dagua;Durham;Farroupilha;Figuera da Foz;Flores;Fukaya;Fushan;Gaohe;Georgia;Gidra;Gond;Guadalajara;Guangzhou;Haikou;Hammersmith;Harilaou;Harima;Hartford;Hechi;Hermoupolis;Honiara;Ierapetra;Ilesha;Insuli;Iowa, Walter Reed, Springfield;Iwatsuki;Japan, Shinagawa;Kaiping, Anant, Dhon, Sapporo-like, Wosera;Kalyan-Kerala, Jamnaga, Rohini;Kambos;Kamiube, Keelung;Kamogawa;Kawasaki;Kozukata;Krakow;La Jolla;Lages;Lagosanto;Laibin;Lille;Liuzhou;Loma Linda;Ludhiana;Lynwood;Madrid;Mahidol;Malaga;Manhattan;Mediterranean Haplotype;Mediterranean, Dallas, Panama, Sassari, Cagliari, Birmingham;Metaponto;Mexico City;Miaoli;Minnesota, Marion, Gastonia, LeJeune;Mira d'Aire;Mizushima;Montalbano;Montpellier;Mt Sinai;Munich;Murcia Oristano;Musashino;Namouru;Nankang;Nanning;Naone;Nara;Nashville, Anaheim, Portici;Neapolis;Nice;Nilgiri;No name;North Dallas;Olomouc;Omiya;Orissa;Osaka;Palestrina;Papua;Partenope;Pawnee;Pedoplis-Ckaro;Piotrkow;Plymouth;Praha;Puerto Limon;Quing Yan;Radlowo;Rehevot;Rignano;Riley;Riverside;Roubaix;S. 
Antioco;Salerno Pyrgos;Santa Maria;Santiago;Santiago de Cuba, Morioka;Sao Borja;Seattle, Lodi, Modena, Ferrara II, Athens-like;Seoul;Serres;Shenzen;Shinshu;Sibari;Sierra Leone;Sinnai;Songklanagarind;Split;Stonybrook;Sugao;Sumare;Sunderland;Surabaya;Suwalki;Swansea;Taipei, Chinese-3;Telti, Kobe;Tenri;Tokyo, Fukushima;Toledo;Tomah;Tondela;Torun;Tsukui;Ube Konan;Union,Maewo, Chinese-2, Kalo;Urayasu;Utrecht;Valladolid;Vancouver;Vanua Lava;Viangchan, Jammu;Villeurbanne;Volendam;Wayne;West Virginia;Wexham;Wisconsin;Yunan

OPRM1

Reference;rs10457090.A>G;rs10457090.A>T;rs10485057.A>G;rs10485058.A>G;rs10485060.C>A;rs1074287.A>G;rs11575856.G>A;rs12190259.A>C;rs12205732.G>A;rs12209447.C>T;rs12210856.T>G;rs1294092.A>G;rs1319339.T>A;rs1319339.T>C;rs13195018.A>C;rs13195018.A>T;rs13203628.A>G;rs1323040.A>G;rs1323042.G>C;rs1323042.G>T;rs1381376.C>A;rs1381376.C>G;rs1381376.C>T;rs1461773.G>A;rs17174629.A>G;rs17174794.C>G;rs17174794.C>T;rs17174801.A>G;rs17180982.dup;rs17181352.A>G;rs1799971.A>G;rs1799972.C>A;rs1799972.C>G;rs1799972.C>T;rs1852629.T>A;rs1852629.T>C;rs1852629.T>G;rs2010884.G>A;rs2075572.G>C;rs2236256.C>A;rs2236257.G>C;rs2236258.C>G;rs2236258.C>T;rs2236259.T>A;rs2236259.T>C;rs2236259.T>G;rs2281617.C>G;rs2281617.C>T;rs3778148.G>T;rs3778150.T>C;rs3778151.T>C;rs3778152.A>G;rs3778156.A>G;rs3798676.C>T;rs3798677.A>G;rs3798678.A>C;rs3798678.A>G;rs3798683.G>A;rs3798688.G>T;rs3823010.G>A;rs483481.G>A;rs483481.G>C;rs4870266.G>A;rs495491.A>G;rs497976.G>A;rs497976.G>T;rs499796.A>G;rs506247.A>C;rs510769.C>T;rs511435.C>G;rs511435.C>T;rs518596.G>A;rs524731.C>A;rs527434.T>A;rs527434.T>C;rs538174.T>C;rs540825.A>C;rs540825.A>G;rs540825.A>T;rs544093.G>A;rs544093.G>T;rs548646.T>A;rs548646.T>C;rs548646.T>G;rs553202.C>T;rs558025.A>G;rs558948.C>G;rs558948.C>T;rs562859.C>A;rs562859.C>G;rs562859.C>T;rs563649.C>T;rs569284.A>C;rs583664.T>C;rs589046.C>T;rs598160.G>A;rs598160.G>C;rs598682.A>C;rs598682.A>G;rs598682.A>T;rs599548.G>A;rs606545.G>A;rs606545.G>C;rs609148.G>A;rs609148.G>T;rs609623.T>A;rs609623.T>C;rs610231.G>A;rs610231.G>C;rs613355.C>A;rs613355.C>G;rs613355.C>T;rs618207.A>C;rs618207.A>G;rs618207.A>T;rs62436463.C>T;rs62638690.G>T;rs632499.A>C;rs632499.A>G;rs632499.A>T;rs639855.C>A;rs639855.C>G;rs642489.G>A;rs642489.G>T;rs644261.G>A;rs644261.G>C;rs644261.G>T;rs645027.A>G;rs647192.G>A;rs647192.G>C;rs648007.A>C;rs648007.A>G;rs648893.A>G;rs650825.G>A;rs6557337.C>A;rs6557337.C>T;rs658156.A>C;rs658156.A>G;rs658156.A>T;rs671531.A>G;rs671531.A>T;rs675026.A>C;rs675026.A>G;rs677830.C>A;rs677830.C>G;rs677830.C>T;rs6812
43.T>A;rs681243.T>C;rs6902403.T>C;rs6912029.G>T;rs73576470.A>G;rs7748401.T>G;rs7763748.C>A;rs7763748.C>T;rs7776341.A>C;rs79910351.C>T;rs9282815.C>A;rs9282815.C>T;rs9322446.G>A;rs9322447.A>C;rs9322447.A>G;rs9322447.A>T;rs9322453.G>C;rs9371773.G>A;rs9371776.G>A;rs9384174.C>G;rs9384174.C>T;rs9384179.G>A;rs9384179.G>T;rs9397685.A>G;rs9397685.A>T;rs9397687.C>T;rs9479757.G>A;rs9479779.A>G

RYR1

NC_000019.10:g.38440818G>C;NC_000019.10:g.38444179C>A;NC_000019.10:g.38444252G>T;NC_000019.10:g.38444257A>C;NC_000019.10:g.38444257A>G;NC_000019.10:g.38448680_38448681insGGA;NC_000019.10:g.38448715G>A;NC_000019.10:g.38451785C>A;NC_000019.10:g.38452985C>T;NC_000019.10:g.38455253C>G;NC_000019.10:g.38455254T>C;NC_000019.10:g.38455347T>C;NC_000019.10:g.38455504G>T;NC_000019.10:g.38466392G>A;NC_000019.10:g.38469404A>C;NC_000019.10:g.38485679T>C;NC_000019.10:g.38486095A>G;NC_000019.10:g.38490642A>C;NC_000019.10:g.38494454G>A;NC_000019.10:g.38496455G>A;NC_000019.10:g.38499234T>C;NC_000019.10:g.38499642C>A;NC_000019.10:g.38499667G>A;NC_000019.10:g.38499667G>T;NC_000019.10:g.38499680T>A;NC_000019.10:g.38499683G>A;NC_000019.10:g.38499696C>G;NC_000019.10:g.38499719A>G;NC_000019.10:g.38499730G>A;NC_000019.10:g.38499985A>T;NC_000019.10:g.38500000G>A;NC_000019.10:g.38502669C>G;NC_000019.10:g.38504298G>A;NC_000019.10:g.38506508C>G;NC_000019.10:g.38506865C>T;NC_000019.10:g.38507821C>T;NC_000019.10:g.38512279G>A;NC_000019.10:g.38515052C>T;NC_000019.10:g.38516181T>C;NC_000019.10:g.38516208G>C;NC_000019.10:g.38517470T>C;NC_000019.10:g.38517523T>A;NC_000019.10:g.38519424C>A;NC_000019.10:g.38519432A>T;NC_000019.10:g.38519447A>G;NC_000019.10:g.38525432C>T;NC_000019.10:g.38527710G>C;NC_000019.10:g.38528372G>T;NC_000019.10:g.38529002G>C;NC_000019.10:g.38529042C>T;NC_000019.10:g.38543380A>T;NC_000019.10:g.38543566G>A;NC_000019.10:g.38543810C>T;NC_000019.10:g.38548253A>T;NC_000019.10:g.38561140G>C;NC_000019.10:g.38561213C>T;NC_000019.10:g.38561362G>A;NC_000019.10:g.38561363G>T;NC_000019.10:g.38565023T>G;NC_000019.10:g.38570649C>G;NC_000019.10:g.38577931A>C;NC_000019.10:g.38578205G>T;NC_000019.10:g.38580039_38580040delinsAA;NC_000019.10:g.38580041C>A;NC_000019.10:g.38580126C>G;NC_000019.10:g.38580397G>C;NC_000019.10:g.38580416C>T;NC_000019.10:g.38585078A>G;NC_000019.10:g.38585099G>A;NC_000019.10:g.38586190A>G;NC_000019.10:g.38587362G>C;NC_000019.10:g.38587363G>C;Reference;rs111272095.C>T;rs11
1364296.G>A;rs111565359.G>A;rs111657878.T>C;rs111888148.G>A;rs112151058.G>A;rs112196644.A>G;rs112563513.G>A;rs112596687.T>A;rs112772310.G>A;rs113210953.A>G;rs113332073.G>A;rs113332073.G>T;rs117886618.C>G;rs118192113.C>A;rs118192116.C>G;rs118192116.C>T;rs118192121.A>C;rs118192122.G>A;rs118192123.T>C;rs118192124.C>T;rs118192126.A>G;rs118192130.G>A;rs118192135.G>A;rs118192140.C>T;rs118192151.G>A;rs118192151.G>C;rs118192158.G>A;rs118192159.C>G;rs118192160.G>A;rs118192160.G>T;rs118192161.C>T;rs118192162.A>C;rs118192162.A>G;rs118192163.G>A;rs118192163.G>C;rs118192163.G>T;rs118192167.A>G;rs118192168.G>A;rs118192170.T>C;rs118192172.C>T;rs118192175.C>T;rs118192176.G>A;rs118192177.C>G;rs118192177.C>T;rs118192178.C>G;rs118192178.C>T;rs118192181.C>T;rs118204421.C>T;rs118204422.T>C;rs118204423.G>A;rs118204423.G>C;rs121918592.G>A;rs121918592.G>C;rs121918593.G>A;rs121918594.G>A;rs121918594.G>T;rs121918595.C>T;rs121918596._38499648delGAG;rs137932199.G>A;rs137933390.A>G;rs138874610.G>A;rs139161723.G>A;rs139647387.A>G;rs140152019.G>A;rs140616359.G>A;rs141646642.C>G;rs141942845.G>A;rs142474192.G>A;rs142474192.G>T;rs143398211.G>A;rs143520367.C>T;rs143987857.G>A;rs143988412.A>G;rs143988412.A>T;rs144336148.G>A;rs144685735.C>T;rs145573319.A>G;rs145801146.C>T;rs146306934.G>A;rs146429605.A>G;rs146504767.G>A;rs146876145.C>T;rs147136339.A>G;rs147213895.A>G;rs147303895.G>A;rs147707463.C>T;rs147723844.A>G;rs148399313.G>A;rs148623597.G>A;rs150396398.G>C;rs151029675.C>T;rs151119428.G>A;rs1801086.G>A;rs1801086.G>C;rs1801086.G>T;rs180714609.G>A;rs186983396.C>G;rs186983396.C>T;rs192863857.C>T;rs193922744.T>G;rs193922745._38440752delTGA;rs193922746.A>G;rs193922747.T>C;rs193922748.C>T;rs193922749.C>A;rs193922750.C>A;rs193922751.G>A;rs193922752.A>G;rs193922753.G>A;rs193922753.G>T;rs193922754.G>A;rs193922755.G>A;rs193922756.A>G;rs193922757.C>T;rs193922759.G>A;rs193922760.A>T;rs193922761.G>T;rs193922762.C>A;rs193922762.C>T;rs193922764.C>A;rs193922764.C>G;rs193922764.C>T;rs193922766.G>A;rs193922766.G>T;rs
193922767.G>A;rs193922767.G>T;rs193922768.C>A;rs193922768.C>T;rs193922769.T>C;rs193922769.T>G;rs193922770.C>T;rs193922772.G>A;rs193922772.G>T;rs193922775.C>T;rs193922776.C>T;rs193922777.C>T;rs193922781.C>T;rs193922782.T>G;rs193922783.T>A;rs193922788.G>C;rs193922789.G>A;rs193922790.A>T;rs193922791.C>T;rs193922792.G>T;rs193922793.T>A;rs193922795.G>A;rs193922797.G>A;rs193922798.G>C;rs193922799.G>A;rs193922801.A>G;rs193922802.G>A;rs193922803.C>T;rs193922804.A>G;rs193922805.T>G;rs193922806.C>G;rs193922807.G>C;rs193922809.G>A;rs193922810.G>A;rs193922810.G>T;rs193922812.C>T;rs193922813.G>C;rs193922815.G>A;rs193922815.G>C;rs193922816.C>T;rs193922817.C>T;rs193922818.G>A;rs193922819.T>C;rs193922822.C>G;rs193922822.C>T;rs193922824.C>T;rs193922826.C>G;rs193922826.C>T;rs193922827.G>C;rs193922828.G>A;rs193922829.G>A;rs193922830.C>T;rs193922831.T>A;rs193922832.G>A;rs193922833.G>A;rs193922834.G>A;rs193922838.G>A;rs193922838.G>T;rs193922839.G>A;rs193922840.T>G;rs193922842.C>G;rs193922842.C>T;rs193922843.G>T;rs193922844.C>A;rs193922848.A>T;rs193922849.C>A;rs193922850.T>C;rs193922852.G>C;rs193922852.G>T;rs193922853.A>T;rs193922855.C>T;rs193922860.G>A;rs193922862._38572267delinsCT;rs193922863.C>T;rs193922864.T>C;rs193922865.T>G;rs193922866.G>A;rs193922867.C>T;rs193922868.G>A;rs193922873.G>A;rs193922873.G>T;rs193922874.T>C;rs193922876.C>T;rs193922877.delA;rs193922878.C>G;rs193922879.G>A;rs193922880.C>G;rs193922883.T>C;rs193922888.G>A;rs193922895.C>A;rs193922896.G>T;rs193922898.T>A;rs199738299.A>G;rs199870223.C>T;rs200766617.G>A;rs201321695.A>G;rs2145447772.G>A;rs2145447772.G>C;rs28933396.G>A;rs28933396.G>T;rs28933397.C>T;rs34390345.A>G;rs34694816.A>G;rs34934920.C>T;rs35180584.C>G;rs35364374.G>T;rs370634440.G>A;rs370634440.G>T;rs372958050.T>C;rs373406011.C>T;rs375626634.T>C;rs375915752.C>T;rs376149732.C>T;rs4802584.C>G;rs537994744.G>A;rs549201486.C>T;rs551223467.C>T;rs553055844.G>A;rs55876273.G>C;rs587784372.C>T;rs63749869.G>A;rs727504129.C>T;rs746818096.T>A;rs747177274.G>C;rs748575133.T
>A;rs749040743.G>A;rs751180702.G>A;rs752652072.C>T;rs754476250.C>T;rs754785770.A>G;rs755088027.G>A;rs756850145.A>G;rs757753317.G>A;rs759500310.T>C;rs761616815.G>A;rs762401851.G>A;rs763112609.C>T;rs763352221.C>T;rs767553612.A>G;rs768360593.G>A;rs768535909.T>C;rs769482889.C>T;rs770593660.G>C;rs771058055.G>A;rs771741606.C>T;rs773040531.A>G;rs778241277.G>A;rs781104539.A>G;rs781126470.C>T;rs901087791.G>A;rs914804033.G>A;rs914804033.G>C;rs917523269.C>T;rs936513262.G>A;rs959170123.G>A;rs976108591.A>G;rs995399438.T>C

Output Files

The following section describes the outputs produced by DRAGEN Array.

PGx CNV VCF File

DRAGEN Array produces one PGx CNV variant call file (VCF) (*.cnv.vcf) per sample to report the copy number (CN) status at the gene and sub-gene level, along with the CN events for PGx targets.

The PGx CNV VCF output file follows the standard VCF format. The QUAL field in the VCF file measures the CNV call quality, a Phred-scaled score that ranges from 0 to 60 (capped at 60). Low-quality calls (QUAL<7) are flagged by the Q7 filter. Low-quality samples with a LogRDev greater than the threshold of 0.2 are flagged with the SampleQuality flag.

The PGx CNV VCF files are bgzipped (Block GZIP) by default and have the “.gz” extension. Compression saves storage space and, when the file is indexed with the TBI Index File, enables efficient lookup. To view these files as plain text, uncompress them with tools from Samtools or other third-party tools. The CNV VCF must be bgzipped and indexed to be used in downstream DRAGEN Array commands, such as star allele calling.
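Because the BGZF (bgzip) format is a series of standard gzip blocks, a bgzipped VCF can also be read in place without writing an uncompressed copy. The following is a minimal sketch using only the Python standard library; the function name read_vcf_lines and the file name in the usage comment are hypothetical and are not part of DRAGEN Array.

```python
import gzip


def read_vcf_lines(path):
    """Return (meta_lines, header_columns, records) from a bgzipped or plain VCF.

    bgzip output is gzip-compatible, so gzip.open can read it directly.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    meta, header, records = [], [], []
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            if line.startswith("##"):
                meta.append(line)  # meta-information lines
            elif line.startswith("#"):
                header = line.lstrip("#").split("\t")  # column header line
            else:
                records.append(line.split("\t"))  # one data record per line
    return meta, header, records


# Hypothetical usage:
# meta, header, records = read_vcf_lines("sample.cnv.vcf.gz")
```

Reading the file this way leaves the original bgzipped file and its index untouched, so it can still be used in downstream commands.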

The PGx CNV VCF output file includes the following content.

##fileformat=VCFv4.1

##source=dragena 1.3.0

##genomeBuild=38

##reference=file:///hg38_with_alt/hg38_nochr_MT.fa

##FORMAT=<ID=CN,Number=1,Type=Integer,Description="Copy number genotype for imprecise events. CN=5 indicates 5 or 5+">

##FORMAT=<ID=NR,Number=1,Type=Float,Description="Aggregated normalized intensity">

##ALT=<ID=CNV,Description="Copy number variant region">

##FILTER=<ID=Q7,Description="Quality below 7">

##FILTER=<ID=SampleQuality,Description="Sample was flagged as potentially low-quality due to high noise levels.">

##INFO=<ID=CNVLEN,Number=1,Type=Integer,Description="Number of bases in CNV hotspot">

##INFO=<ID=PROBE,Number=1,Type=Integer,Description="Number of probes assayed for CNV hotspot">

##INFO=<ID=END,Number=1,Type=Integer,Description="End position of CNV hotspot">

##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Structural Variant Type">

##OverallPloidy=1.8

##GCCorrect=True

##contig=<ID=1,length=248956422>

##contig=<ID=4,length=190214555>

##contig=<ID=10,length=133797422>

##contig=<ID=16,length=90338345>

##contig=<ID=19,length=58617616>

##contig=<ID=22,length=50818468>

##contig=<ID=22_KI270879v1_alt,length=304135>

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 204619760001_R01C01

1 109687842 CNV:GSTM1:chr1:109687842:109693526 N <CNV> 60 PASS CNVLEN=5685;PROBE=124;END=109693526;SVTYPE=CNV CN:NR 2:0.966631132771593

4 68537222 CNV:UGT2B17:chr4:68537222:68568499 N <CNV> 60 PASS CNVLEN=31278;PROBE=383;END=68568499;SVTYPE=CNV CN:NR 0:0.376696837881692

10 133527374 CNV:CYP2E1:chr10:133527374:133539096 N <CNV> 60 PASS CNVLEN=11723;PROBE=194;END=133539096;SVTYPE=CNV CN:NR 2:0.980059731860893

16 28615068 CNV:SULT1A1:chr16:28603587:28613544 N <CNV> 57 PASS CNVLEN=8315;PROBE=164;END=28623382;SVTYPE=CNV CN:NR 2:0.980552325552963

19 40844791 CNV:CYP2A6.intron.7:chr19:40844791:40845293 N <CNV> 60 PASS CNVLEN=503;PROBE=38;END=40845293;SVTYPE=CNV CN:NR 2:0.9663775484762

19 40850267 CNV:CYP2A6.exon.1:chr19:40850267:40850414 N <CNV> 60 PASS CNVLEN=148;PROBE=21;END=40850414;SVTYPE=CNV CN:NR 2:0.9663775484762

22 42126498 CNV:CYP2D6.exon.9:chr22:42126498:42126752 N <CNV> 48 PASS CNVLEN=255;PROBE=370;END=42126752;SVTYPE=CNV CN:NR 2:0.981703411438716

22 42129188 CNV:CYP2D6.intron.2:chr22:42129188:42129734 N <CNV> 10 PASS CNVLEN=547;PROBE=333;END=42129734;SVTYPE=CNV CN:NR 2:0.965498002434641

22 42130886 CNV:CYP2D6.p5:chr22:42130886:42131379 N <CNV> 60 PASS CNVLEN=494;PROBE=172;END=42131379;SVTYPE=CNV CN:NR 2:0.970341562236357

22_KI270879v1_alt 270316 CNV:GSTT1:chr22_KI270879v1_alt:270316:278477 N <CNV> 60 PASS CNVLEN=8162;PROBE=91;END=278477;SVTYPE=CNV CN:NR 2:1.01191145130511
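As an illustration of how the fields above fit together, the sketch below parses one of the example records with the Python standard library. It assumes the file is tab-delimited, as in standard VCF (the example above is shown with spaces), and it assumes the conventional Phred definition Q = -10 log10(p) to translate QUAL into an approximate error probability. The function names are hypothetical.

```python
def parse_info(info_field):
    """Turn 'K1=V1;K2=V2' into a dict; keys without a value map to True."""
    info = {}
    for item in info_field.split(";"):
        key, sep, value = item.partition("=")
        info[key] = value if sep else True
    return info


def phred_to_error_probability(qual):
    """Conventional Phred relation: Q = -10 * log10(p)."""
    return 10 ** (-qual / 10)


def parse_cnv_record(line):
    """Parse one tab-delimited PGx CNV VCF record with a single sample column."""
    chrom, pos, vid, _ref, _alt, qual, flt, info, fmt, sample = line.split("\t")
    info_d = parse_info(info)
    sample_d = dict(zip(fmt.split(":"), sample.split(":")))
    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": vid,
        "qual": float(qual),
        "filter": flt,
        "end": int(info_d["END"]),
        "cnvlen": int(info_d["CNVLEN"]),
        "probes": int(info_d["PROBE"]),
        "cn": int(sample_d["CN"]),
        "nr": float(sample_d["NR"]),
    }


# UGT2B17 record from the example above, re-joined with tabs.
record = "\t".join(
    "4 68537222 CNV:UGT2B17:chr4:68537222:68568499 N <CNV> 60 PASS "
    "CNVLEN=31278;PROBE=383;END=68568499;SVTYPE=CNV CN:NR 0:0.376696837881692".split()
)
call = parse_cnv_record(record)
print(call["id"], "CN =", call["cn"], "FILTER =", call["filter"])
print("Approximate error probability at the Q7 cutoff:", phred_to_error_probability(7))
```

In this example record, CNVLEN equals END - POS + 1, which gives a quick consistency check when post-processing records.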

Cytogenetics VCF File

DRAGEN Array produces one cytogenetics Variant Call File (VCF) (*.cnv.vcf) per sample to report the CN and LOH status of the detected variants.

The cytogenetics CNV VCF output file follows the standard VCF format. The QUAL field in the VCF file measures the CNV/LOH call quality, a Phred-scaled score that ranges from 0 to 60 (capped at 60). Low-quality calls (QUAL<10) are flagged by the Q10 filter. Low-quality samples with a LogRDev greater than the threshold of 0.2 are flagged with the SampleQuality flag.

The cytogenetics CNV VCF files are bgzipped (Block GZIP) by default and have the “.gz” extension. Compression saves storage space and, when the file is indexed with the TBI Index File, enables efficient lookup. To view these files as plain text, uncompress them with tools from Samtools or other third-party tools. The CNV VCF must be bgzipped and indexed to be used in downstream DRAGEN Array commands, such as cyto annotate.

One example file can be found below:

##fileformat=VCFv4.1

##source=dragena 1.3.0 Cyto

##genomeBuild=37

##product=GDACyto-8v1-0_A

##reference=file://genome.fa

##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">

##FORMAT=<ID=CN,Number=1,Type=Integer,Description="Copy number genotype. CN=4 indicates 4 or 4+">

##FORMAT=<ID=NR,Number=1,Type=Float,Description="Aggregated normalized intensity">

##FORMAT=<ID=LRD,Number=1,Type=Float,Description="Standard deviation of logR ratios">

##platform=cytoplatform

##ALT=<ID=DEL,Description="Copy number loss region">

##ALT=<ID=DUP,Description="Copy number gain heterozygous region">

##ALT=<ID=LOH,Description="AOH/LOH/ROH, absence of heterozygosity region, or, loss of heterozygosity region">

##FILTER=<ID=Q10,Description="Quality below 10">

##FILTER=<ID=SampleQuality,Description="Sample was flagged as potentially low-quality due to high noise levels.">

##INFO=<ID=SVLEN,Number=1,Type=Integer,Description="Number of bases in CNV/LOH region">

##INFO=<ID=PROBE,Number=1,Type=Integer,Description="Number of probes assayed for CNV/LOH region">

##INFO=<ID=END,Number=1,Type=Integer,Description="End position of CNV/LOH region">

##INFO=<ID=LOHTYPE,Number=A,Type=String,Description="Type of LOH (Loss/absence of heterozygosity). Valid values are AOH (germline, copy number neutral or gain LOH), CNLOH (somatic, copy number neutral LOH), GAINLOH (somatic, copy number gain LOH)">

##OverallPloidy=1.9

##GCCorrect=True

##contig=<ID=1,length=249250621>

##contig=<ID=2,length=243199373>

##contig=<ID=3,length=198022430>

##contig=<ID=4,length=191154276>

##contig=<ID=5,length=180915260>

##contig=<ID=6,length=171115067>

##contig=<ID=7,length=159138663>

##contig=<ID=8,length=146364022>

##contig=<ID=9,length=141213431>

##contig=<ID=10,length=135534747>

##contig=<ID=11,length=135006516>

##contig=<ID=12,length=133851895>

##contig=<ID=13,length=115169878>

##contig=<ID=14,length=107349540>

##contig=<ID=15,length=102531392>

##contig=<ID=16,length=90354753>

##contig=<ID=17,length=81195210>

##contig=<ID=18,length=78077248>

##contig=<ID=19,length=59128983>

##contig=<ID=20,length=63025520>

##contig=<ID=21,length=48129895>

##contig=<ID=22,length=51304566>

##contig=<ID=X,length=155270560>

##contig=<ID=Y,length=59373566>

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 208588190001_R02C01

1 109687842 DEL:chr1:109687842:109693526 N <DEL> 60 PASS SVLEN=5685;PROBE=99;END=109693526 GT:CN:NR:LRD 1/1:1:0.8860:0.21

16 28603587 DUP:chr16:28603587:28613544 N <DUP> 60 PASS SVLEN=9958;PROBE=197;END=28613544 GT:CN:NR:LRD 1/1:3:1.1666:0.11

22 42129188 AOH:chr22:42129188:42129734 N <LOH> 37 PASS SVLEN=547;PROBE=198;END=42129734;LOHTYPE=AOH GT:CN:NR:LRD 1/1:2:1.0208:0.25
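The three example records can be summarized programmatically by pairing the FORMAT keys (GT:CN:NR:LRD) with the sample values and reading the event type from ALT, using LOHTYPE for LOH records. The sketch below is one possible way to do this in Python; it assumes tab-delimited input, and the function name parse_cyto_record is hypothetical.

```python
def parse_cyto_record(line):
    """Parse one tab-delimited cytogenetics VCF record with a single sample column."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _vid, _ref, alt, qual, flt, info, fmt, sample = fields[:10]
    info_d = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
    sample_d = dict(zip(fmt.split(":"), sample.split(":")))
    event = alt.strip("<>")  # DEL, DUP, or LOH
    if event == "LOH":
        # Use the more specific LOH type (AOH, CNLOH, GAINLOH) when present.
        event = info_d.get("LOHTYPE", event)
    return {
        "event": event,
        "chrom": chrom,
        "start": int(pos),
        "end": int(info_d["END"]),
        "qual": float(qual),
        "filter": flt,
        "cn": int(sample_d["CN"]),
        "lrd": float(sample_d["LRD"]),
    }


# The three example records, re-joined with tabs.
example = [
    "1 109687842 DEL:chr1:109687842:109693526 N <DEL> 60 PASS "
    "SVLEN=5685;PROBE=99;END=109693526 GT:CN:NR:LRD 1/1:1:0.8860:0.21",
    "16 28603587 DUP:chr16:28603587:28613544 N <DUP> 60 PASS "
    "SVLEN=9958;PROBE=197;END=28613544 GT:CN:NR:LRD 1/1:3:1.1666:0.11",
    "22 42129188 AOH:chr22:42129188:42129734 N <LOH> 37 PASS "
    "SVLEN=547;PROBE=198;END=42129734;LOHTYPE=AOH GT:CN:NR:LRD 1/1:2:1.0208:0.25",
]
events = [parse_cyto_record("\t".join(rec.split())) for rec in example]
for ev in events:
    print(ev["event"], ev["chrom"], ev["start"], ev["end"], "CN =", ev["cn"])
```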

SNV VCF File

The software produces one genotyping variant call file (*.snv.vcf) per sample, covering single nucleotide variants (SNVs) and indels for the sample. It reports the GenCall score (GS), B Allele Frequency (BAF), and Log R Ratio (LRR) per variant. The VCF file output follows .

Some additional details:

The FILTER column is hardcoded to PASS and is not dependent on the GT value. It does not reflect the underlying quality of the call. Refer to the GS value for quality information.
Genotypes are adjusted to reflect the sample ploidy. Calls are haploid for loci on Y, MT, and non-PAR chromosome X for males.

The SNV VCF output file includes the following content. The last row shows an example variant call.

##fileformat=VCFv4.1

##source=dragena 1.3.0

##genomeBuild=38

##reference=file:///genomes/38/genome.fa

##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">

##FORMAT=<ID=GS,Number=1,Type=Float,Description="GenCall score. For merged multi-assay or multi-allelic records, min GenCall score is reported.">

##FORMAT=<ID=BAF,Number=1,Type=Float,Description="B Allele Frequency">

##FORMAT=<ID=LRR,Number=1,Type=Float,Description="LogR ratio">

##contig=<ID=1,length=248956422>

##contig=<ID=2,length=242193529>

##contig=<ID=3,length=198295559>

##contig=<ID=4,length=190214555>

##contig=<ID=5,length=181538259>

##contig=<ID=6,length=170805979>

##contig=<ID=7,length=159345973>

##contig=<ID=8,length=145138636>

##contig=<ID=9,length=138394717>

##contig=<ID=10,length=133797422>

##contig=<ID=11,length=135086622>

##contig=<ID=12,length=133275309>

##contig=<ID=13,length=114364328>

##contig=<ID=14,length=107043718>

##contig=<ID=15,length=101991189>

##contig=<ID=16,length=90338345>

##contig=<ID=17,length=83257441>

##contig=<ID=18,length=80373285>

##contig=<ID=19,length=58617616>

##contig=<ID=20,length=64444167>

##contig=<ID=21,length=46709983>

##contig=<ID=22,length=50818468>

##contig=<ID=MT,length=16569>

##contig=<ID=X,length=156040895>

##contig=<ID=Y,length=57227415>

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 202937470021_R06C01

1 2290399 rs878093 G A . PASS . GT:GS:BAF:LRR 0/1:0.7923:0.50724137:0.14730307
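Because the FILTER column is always PASS, any quality filtering has to be based on the GS value. The sketch below shows one way to do that in Python for the example record above. The cutoff value of 0.15 and the function names are illustrative assumptions, not DRAGEN Array defaults, and the sketch assumes that no-call genotypes are written with "." alleles as in standard VCF.

```python
def parse_snv_sample(fmt, sample):
    """Pair FORMAT keys (GT:GS:BAF:LRR) with the sample column values."""
    values = dict(zip(fmt.split(":"), sample.split(":")))
    return {
        "GT": values["GT"],
        "GS": float(values["GS"]),
        "BAF": float(values["BAF"]),
        "LRR": float(values["LRR"]),
    }


def passes_gencall(call, min_gs):
    """Keep a call only if it has a genotype and GS meets the chosen cutoff."""
    has_genotype = "." not in call["GT"]  # assumption: no-calls contain '.'
    return has_genotype and call["GS"] >= min_gs


call = parse_snv_sample("GT:GS:BAF:LRR", "0/1:0.7923:0.50724137:0.14730307")
MIN_GS = 0.15  # hypothetical cutoff; choose one appropriate for the study
print(call, "kept" if passes_gencall(call, MIN_GS) else "dropped")
```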

Note on Multi-Allelic Variants (MAV) calling limitations

DRAGEN Array can combine multiple assays with different target bases but the same genomic position to make MAV calls. However, Illumina Microarrays are inherently bi-allelic assays made up of which require special design considerations and have some inherent limitations.

The MAV calling algorithm currently filters across all overlapping assays, retaining only genotypes whose alleles are present in the intersection of all assays. If multiple genotypes remain after filtering, the result is considered ambiguous and reported as a NoCall to avoid false positives. This ambiguity often arises when one assay is a NoCall due to presumed probe failure rather than missing signal. In such cases, its potential genotypes are not excluded, contributing to ambiguity. When DRAGEN Array outputs a NoCall because of the described behaviors, they are logged as (e.g., Failed to combine genotypes due to ambiguity...).

Overall, the current algorithm errs on the side of caution to ensure quality calls, but produces some idiosyncratic behavior and potential false NoCalls when genotypes are biologically consistent but differ due to probe designs. We hope to improve this behavior in future versions of DRAGEN Array.

Some illustrative examples are below to help understand the current limitations:

Scenario

Expected MAV Call

Actual MAV Call

Explanation

Note on delimiters in the "ID" field

By default, when multiple probes are present for a given variant, all probe names are included in the "ID" field of the resulting VCF file.

For SNP entries, probe names are separated by commas (,).
For Indel entries, probe names are separated by semicolons (;).
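When post-processing the ID field, the delimiter therefore depends on the variant type. The following Python sketch splits the field accordingly. It assumes an indel can be recognized by REF and ALT alleles of different lengths, and the probe names in the example are made up for illustration.

```python
def split_probe_names(id_field, ref, alt):
    """Split the VCF ID field into probe names.

    SNP entries use ',' and indel entries use ';' as the delimiter.
    Assumption: an entry is an indel if any allele is longer than one base.
    """
    alleles = [ref] + alt.split(",")
    is_indel = any(len(allele) != 1 for allele in alleles)
    delimiter = ";" if is_indel else ","
    return [name for name in id_field.split(delimiter) if name]


# Hypothetical probe names, for illustration only.
print(split_probe_names("probeA,probeB", "G", "A"))
print(split_probe_names("probeC;probeD", "ATCG", "A"))
```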

Note on REF/ALT "flipping" for INDELs

In rare cases, the expected REF and ALT alleles for INDELs may not match the dbSNP annotations. For example, an expected "Deletion" with REF = "ATCG" and ALT = "A" may be "flipped" to an "Insertion" variant with REF = "A" and ALT = "ATCG". The corresponding genotype output takes this into account, so the actual VCF is still correct. This is simply a notation issue in some of the manifest files.

Note on PLINK compatibility

It is possible to make DRAGEN Array genotype VCF files compatible for conversion to PED/MAP format with preprocessing using tabix (v1.19.1), BCFtools (v1.21) and PLINK (v1.9). The following three commands demonstrate the basic process.

bcftools merge -l vcf_list.txt -Oz -o merged.vcf.gz creates a single compressed VCF from the individual sample VCFs listed in the vcf_list.txt text file.
tabix -p vcf merged.vcf.gz creates a binary index for the merged file.

Some optional arguments may be provided to PLINK depending on the content of the VCFs to be converted and the downstream analysis.

For VCFs containing non-standard human chromosomes (e.g. haplotype chromosomes or unplaced contigs), the --allow-extra-chr flag can be used.
If using non-human data, refer to the PLINK documentation for the --chr-set argument and supported options.
By default, PLINK will only consider the most common ALT allele for multi-allelic variants. The --biallelic-only flag can be used to skip multi-allelic variants instead.

For more information on the options described and others, refer to the PLINK documentation.

Genotype Call (GTC) File

The genotype call algorithm produces one genotype call file (.gtc) per sample analyzed. The Genotype Call (GTC) file contains the small variant (SNV and indel) genotype for each marker specified by the product, along with sample quality metrics. The marker location is not included and must be extracted from the manifest file. The proprietary binary format can be parsed using an open-source tool from Illumina.

Note on lack of i18n: GTCs are binary, fixed-format files designed before modern internationalization and localization tools. There is a related issue that makes the GTCs unusable in downstream analyses. Refer to the same issue to see a workaround.

Note on legacy GTCs: Other Illumina software (such as AutoConvert and Beeline) also produces GTC files. These "legacy GTC" files will work in DRAGEN Array genotyping commands such as genotype gtc-to-vcf, but they will not work with all other downstream analyses. We recommend using DRAGEN Array end-to-end starting from IDATs for these analyses.

BedGraph Files

The BedGraph files contain the Log R Ratios (LRR.bedgraph) and B-Allele Frequencies (BAF.bedgraph) from the genotyping algorithm for use in visual tools.

Star Allele CSV File

The Star Allele CSV file is an intermediate file generated by the pgx star-allele call command and serves as the input to the pgx star-allele annotate command. It contains all the star allele calls for all samples in a run. Each row in the file provides either a star allele diplotype or simple variant call for a PGx-related gene. Star allele diplotype calls for a sample and a gene may span multiple lines where alternative solutions can be listed.

The Star Allele CSV file also contains meta information marked by # at the top of the file for the genome build and PGx database used for the star allele calling.

The star_allele.csv file contains the following details per sample:

Field

Description

Below is an example of the first 5 columns from a star allele CSV file:

Sample,Rank,Gene or Variant,Type,Solution

204650490282_R02C01,1,CYP2C9,Haplotype,*9/*11

204650490282_R02C01,1,CYP2C19,Haplotype,*2/*10
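Because the file mixes meta information lines marked by # with ordinary CSV rows, a reader needs to separate the two before parsing. The following is a minimal Python sketch under stated assumptions: the function name read_star_allele_csv is illustrative, and the meta line in the example text is a hypothetical placeholder, since the exact meta line wording is not shown here.

```python
import csv
import io


def read_star_allele_csv(text: str):
    """Return (meta lines, data rows) from star allele CSV text."""
    meta, data_lines = [], []
    for line in text.splitlines():
        if line.startswith("#"):
            meta.append(line.lstrip("#").strip())
        elif line.strip():
            data_lines.append(line)
    rows = list(csv.DictReader(io.StringIO("\n".join(data_lines))))
    return meta, rows


# The first line is a hypothetical meta line; the rest is the example above.
example = """#Genome build: 38
Sample,Rank,Gene or Variant,Type,Solution
204650490282_R02C01,1,CYP2C9,Haplotype,*9/*11
204650490282_R02C01,1,CYP2C19,Haplotype,*2/*10
"""

meta, rows = read_star_allele_csv(example)
print(meta)
print([(row["Gene or Variant"], row["Solution"]) for row in rows])
```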

Genotype Summary Files

The software produces genotype summary files (gt_sample_summary.csv and gt_sample_summary.json) that contain the following details per sample:

Sample ID
Sample Name
Sample Folder
Autosomal Call Rate

The TGA_Ctrl_5716 Norm R field is specific to PGx products (e.g., Global Diversity Array with enhanced PGx). The field value is the Normalized R value of one probe and is meant as an assay control where < 1 indicates the sample failed in the TGA (Targeted Gene Amplification) process. If the product does not have this probe, it is not included in the gt_sample_summary.

User-defined fields from the sample sheet appear as-is in the gt_sample_summary files. For example, for the given sample sheet:

It would produce something like the following gt_sample_summary.csv:

And something like the following gt_sample_summary.json:

Note: As of v1.3, samples that fail during genotyping will still be present in this file. See the details in the .

Final Report

DRAGEN Array Cloud produces a Final Report (gtc_final_report.csv) per analysis batch similar to the one available in GenomeStudio. It contains the following details per locus per sample:

Field

Description

Note: Analyses on products with large numbers of loci (>1 million) and large numbers of samples (>100) yield a large (50+ gigabyte) Final Report that is difficult to download and review. It is recommended to create analysis configurations that do not produce this report if large batches are desired.

For more information on interpreting DNA strand and allele information, see Illumina Knowledge article .

Locus Summary

DRAGEN Array Cloud produces a Locus Summary (locus_summary.csv) per analysis batch similar to the one available in GenomeStudio. It contains the following details per locus:

Field

Description

CN Summary File

The CN summary file contains key statistics for each sample in a batch, including the following details per sample:

Sample ID
Sample Name
Sample Folder

Copy Number Batch File

The copy number batch summary file (cn_batch_summary.csv) shows the total copy number gain, loss, and neutral (CN=2) values for each target region across all the samples in the analysis.

Example copy number batch summary file content:

Target Region,Total CN gain,Total CN loss,Total CN neutral

CYP2A6.exon.1,0,1,47

CYP2A6.intron.7,0,1,47

CYP2D6.exon.9,2,4,42

CYP2D6.intron.2,7,2,39

CYP2D6.p5,13,2,33

CYP2E1,2,0,46

GSTM1,0,42,6

GSTT1,0,33,15

SULT1A1,0,0,48

UGT2B17,0,34,14

All Target Regions,24,119,337
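As a quick consistency check, the "All Target Regions" row should equal the column-wise sum of the individual target regions. The following Python sketch verifies this for the example content above. The function name summarize is an illustrative assumption; the data is copied from the example.

```python
import csv
import io

BATCH_SUMMARY = """\
Target Region,Total CN gain,Total CN loss,Total CN neutral
CYP2A6.exon.1,0,1,47
CYP2A6.intron.7,0,1,47
CYP2D6.exon.9,2,4,42
CYP2D6.intron.2,7,2,39
CYP2D6.p5,13,2,33
CYP2E1,2,0,46
GSTM1,0,42,6
GSTT1,0,33,15
SULT1A1,0,0,48
UGT2B17,0,34,14
All Target Regions,24,119,337
"""

COLUMNS = ["Total CN gain", "Total CN loss", "Total CN neutral"]
TOTAL_LABEL = "All Target Regions"


def summarize(text):
    """Return (summed per-region counts, counts reported in the total row)."""
    rows = list(csv.DictReader(io.StringIO(text)))
    regions = [r for r in rows if r["Target Region"] != TOTAL_LABEL]
    total = next(r for r in rows if r["Target Region"] == TOTAL_LABEL)
    summed = {c: sum(int(r[c]) for r in regions) for c in COLUMNS}
    reported = {c: int(total[c]) for c in COLUMNS}
    return summed, reported


summed, reported = summarize(BATCH_SUMMARY)
print(summed == reported)
```

In this example, the gain, loss, and neutral counts of each target region add up to 48, which is consistent with every sample in the batch being counted once per region.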

Warning/Error Messages and Logs

The following scenarios result in a warning or error message:

Manifest file used to generate GTC is not the same as the manifest file used to generate the CN model.
FASTA files and FASTA index files do not match.

For the following scenarios, the software reports messages to the terminal output (as either a warning or an error):

Indel processing for GTC to VCF conversion failed.
The input folder does not contain the required input files.
An input file is corrupt.

Examples of such notifications can include the following:

Error

Type

Cause

Star allele JSON File

The star allele JSON file is produced per sample. It contains the fields present in the Star Allele CSV file as well as additional metadata and annotations.

Fields included in the star allele JSON header are described below.

Field

Description

Fields included in the star allele call (locusAnnotations) information are described below.

Field

Description

Fields included in the candidateSolution section, only available for star allele call type, are described below.

Field

Description

Example of JSON file content:

Guidance on alternative star-allele results

Typically, the star allele solution with the highest quality score is accepted as the final genotype (i.e., star allele diplotype) for the PGx locus. In rare cases, there are lower-ranked star allele solutions with quality scores no less than 50% of the highest quality score. These lower-ranked solutions are considered feasible, and they are all listed in the genotype field of the locus annotation of the PGx gene in the PGx JSON file. Alternative solutions should also be considered if there are supporting variants for those solutions with low (less than 0.15) GS scores. The clustering of low GS scoring supporting variants should also be evaluated for cluster quality and any potential cluster shift.
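The two numeric rules in this guidance (quality score at least 50% of the top score, and GS score below 0.15) can be applied to parsed JSON content. The following is a minimal Python sketch, not the software's own logic. The function names are illustrative, and the candidate and variant values are invented for the example; only the field names qualityScore, genotype, rank, and gs come from the JSON description.

```python
def feasible_solutions(candidates, fraction=0.5):
    """Keep candidate solutions scoring at least `fraction` of the top score."""
    if not candidates:
        return []
    top = max(float(c["qualityScore"]) for c in candidates)
    return [c for c in candidates if float(c["qualityScore"]) >= fraction * top]


def low_gs_variants(supporting_variants, threshold=0.15):
    """Return supporting variants whose GS score is below the threshold."""
    return [v for v in supporting_variants if float(v["gs"]) < threshold]


# Hypothetical candidateSolutions entries (values invented for illustration).
candidates = [
    {"rank": 1, "genotype": "*1/*2", "qualityScore": 0.60},
    {"rank": 2, "genotype": "*1/*35", "qualityScore": 0.35},
    {"rank": 3, "genotype": "*2/*17", "qualityScore": 0.10},
]
print([c["genotype"] for c in feasible_solutions(candidates)])

# Hypothetical supporting variants; gs is stored as a string in the JSON.
variants = [{"id": "variant_a", "gs": "0.2191"}, {"id": "variant_b", "gs": "0.10"}]
print([v["id"] for v in low_gs_variants(variants)])
```

Scores are converted with float() because the JSON example stores some scores as strings and others as numbers.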

Cytogenetics Annotation JSON File

DRAGEN Array produces one cytogenetics annotation JSON (*.json) per sample to report more sample-level, chromosome-level, and event-level metrics and annotations.

Example of JSON file content:

The fields in the annotation JSON for each sample are described as follows.

Field

Description

The fields for each chromosome under the chromosomeAnnotations field of the Cyto annotation JSON are described below.

Field

Description

The fields within each variant (CNV/LOH event) under the locusAnnotations field of the Cyto annotation JSON are described below.

Field

Description

The traditionalNomenclature field is used to describe individual and cumulative copy number variants at a coarse resolution. Its values show gains (dup) and losses (del) according to their chromosome number, arm (p or q), and band (e.g., p36.13). The microarrayNomenclature field follows the Comparative Genomic Hybridization or SNP array conventions. Values are prefixed with arr[] to indicate array data as the source, along with the genome build, e.g., GRCh38. These values describe the location (start_end in bp) and copy number (x1 for loss, x3 for gain, etc.) more precisely.
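To work with microarrayNomenclature values programmatically, each comma-separated entry can be split into its band range, coordinates, and copy number. The following is a minimal Python sketch based only on the entry shapes seen in the example JSON. The regular expression, the function name, and the reading of the "hmz" suffix as a homozygosity (LOH) flag are assumptions, and entries that do not fit the pattern are skipped rather than interpreted.

```python
import re

# Assumed entry shape: <chrom><bands>(<start>_<end>)x<copy number>[ hmz]
_EVENT = re.compile(
    r"^(?P<bands>[0-9XY]+[pq][0-9.pq]*)"
    r"\((?P<start>\d+)_(?P<end>\d+)\)"
    r"x(?P<copy_number>\d+)"
    r"(?:\s+(?P<flag>hmz))?$"
)


def parse_microarray_nomenclature(value):
    """Parse a comma-separated microarrayNomenclature string into dicts."""
    # Drop an optional leading arr[...] prefix, if present.
    value = re.sub(r"^arr\[[^\]]*\]\s*", "", value.strip())
    events = []
    for item in value.split(","):
        match = _EVENT.match(item.strip())
        if match is None:
            continue  # skip anything this simple pattern does not cover
        events.append({
            "bands": match.group("bands"),
            "start": int(match.group("start")),
            "end": int(match.group("end")),
            "copy_number": int(match.group("copy_number")),
            "homozygous": match.group("flag") == "hmz",
        })
    return events


example = "2q37.3(238204076_238283050)x1,1p12q21.1(120311442_144549929)x2 hmz"
for event in parse_microarray_nomenclature(example):
    print(event)
```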

The genes list field for each variant includes genes with transcript coordinates that intersect the described variant. The values are a combination of HGNC gene symbols taken from the NCBI RefSeq database and the subset of Ensembl gene accessions unmatched to a gene symbol.

TBI Index File

The TBI (TABIX) index file is associated with the bgzipped VCF files. It allows for data line lookup in VCF files for quick data retrieval. The format is a tab-delimited genome index file developed by Samtools as part of the HTSlib utilities. For more information, visit the website.

Methylation Control Probe Output File

The software produces a control probe output file ({BeadChipBarcode}_{Position}_ctrl.tsv.gz) per sample that includes the raw methylated and unmethylated values for each control probe.

Each control probe has an address, type, color channel, name, and probe ID. The file also provides the raw signal for methylated green (MG), methylated red (MR), unmethylated green (UG), and unmethylated red (UR).

The file can help identify which probes are available on a given BeadChip.

Methylation CG Output File

The software produces a CG output file ({BeadChipBarcode}_{Position}_cgs.tsv.gz) per sample that includes beta values, m-values and detection p-values for each CG site.

Beta values measure methylation levels in a linear fashion for easy interpretation. Unmethylated probes are close to zero and methylated probes are close to 1.

M-values are log-transformed beta values, which provide a more representative measure of methylation.

Detection p-values measure the likelihood that the signal is background noise. It is recommended that probes with detection p-values > 0.05 be excluded from analysis, as they are likely background noise.

See the software tech note for further detail on the calculation of these metrics.
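To illustrate how the three metrics relate, the sketch below uses the commonly used logit relationship M = log2(beta / (1 - beta)) and the recommended 0.05 detection p-value cutoff. This is a minimal Python sketch under stated assumptions: the exact calculation used by the software is described in the tech note and may differ (for example, in offsets applied to the intensities), and the function names, the eps clamp, and the dictionary keys "cg", "beta", and "pval" are hypothetical rather than the actual column names of the CG output file.

```python
import math


def beta_to_m(beta, eps=1e-6):
    """Base-2 logit transform of a beta value.

    The value is clamped to [eps, 1 - eps] to avoid taking the log of 0.
    """
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


def keep_detected(rows, max_pval=0.05):
    """Keep rows whose detection p-value does not exceed max_pval."""
    return [row for row in rows if row["pval"] <= max_pval]


# Hypothetical CG rows, for illustration only.
rows = [
    {"cg": "cg_example_1", "beta": 0.8, "pval": 0.001},
    {"cg": "cg_example_2", "beta": 0.5, "pval": 0.20},
]
for row in keep_detected(rows):
    print(row["cg"], round(beta_to_m(row["beta"]), 3))
```

A beta value of 0.5 maps to an M-value of 0, and values near 0 or 1 map to large negative or positive M-values, respectively.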

Methylation Sample QC Summary Files

The software produces methylation sample QC summary in .xlsx and .tsv file formats (sample_qc_summary.xlsx and sample_qc_summary.tsv) per analysis batch, which provides per sample QC data for all samples in the batch.

The QC summary provides details on 21 control metrics (see tables below), which are computed in the same way as in the BeadArray Controls Reporter software from Illumina. In addition, it provides average red and green raw and normalized signals, time of scanning, proportion of probes passing, overall sample pass/fail status, and the failure codes for control metrics that did not pass. The sample pass status is defined as the passing of all 21 control metrics. The QC summary .xlsx file further highlights failing parameters for easy viewing.

The QC summary files contain the following fields:

Field

Description

The control metrics in the QC summary files are calculated as follows. The default value of 3,000 for the background correction offset (x) can be modified and applies to all background calculations indicated with (bkg + x). Note that the table uses the default thresholds for EPIC arrays as an example; the default thresholds vary with the methylation array. See the Methylation QC Threshold Adjustment section for additional details.

Methylation Sample QC Summary Plots

The software produces a methylation sample QC summary plots file (sample_qc_summary.pdf) per analysis batch, which contains two QC summary plots for quick visual review.

The file contains the following control plots:

Control Plot

Description

Methylation Principal Component Summary

The software produces a methylation principal component summary file (pcs.tsv.gz) per analysis batch, which provides principal component data for each sample within the batch. This can be used to identify the specific samples associated with points on the PCA control plot within the Methylation Sample QC Summary Plots output file.

The files contain the following fields:

Field

Description

Methylation Manifest Files

The software produces two methylation manifest files:

Manifest in Sesame format (probes.csv)
Additional information for control probes (controls.csv)

The probes.csv file has the following columns:

Field

Description

The controls.csv file has the following columns:

Field

Description

Methylation Warning/Error Messages and Logs

The following scenarios result in a warning or error message:

Missing IDATs or manifest
Incorrect sample sheet formatting
Duplicate BeadChip Barcode and Position within the sample sheet

Examples of such notifications can include the following:

plink --vcf merged.vcf.gz --recode --out merged

--out

bcftools norm -m - in.vcf.gz -Oz -o out.vcf.gz

Sample ID,Sample Name,Sample Folder,Autosomal Call Rate,Call Rate,Log R Ratio Std Dev,Sex Estimate,SentrixBarcode_A,SentrixPosition_A,Sample_Group,MetaData1
204753010023_R01C01,204753010023_R01C01,/sample/folder,0.99414575,0.98843694,0.14829777,F,204753010023,R01C01,Group1,F
204753010024_R01C01,204753010024_R01C01,/sample/folder,0.99415575,0.98943694,0.14929777,M,204753010024,R01C01,Group2,M

[
  {
    "Sample ID": "204753010023_R01C01",
    "Sample Name": "204753010023_R01C01",
    "Sample Folder": "/sample/folder",
    "Autosomal Call Rate": 0.99414575,
    "Call Rate": 0.98843694,
    "Log R Ratio Std Dev": 0.14829777,
    "Sex Estimate": "F",
    "SentrixBarcode_A": "204753010023",
    "SentrixPosition_A": "R01C01",
    "Sample_Group": "Group1",
    "MetaData1": "F"
  },
  {
    "Sample ID": "204753010024_R01C01",
    "Sample Name": "204753010024_R01C01",
    "Sample Folder": "/sample/folder",
    "Autosomal Call Rate": 0.99415575,
    "Call Rate": 0.98943694,
    "Log R Ratio Std Dev": 0.14929777,
    "Sex Estimate": "M",
    "SentrixBarcode_A": "204753010024",
    "SentrixPosition_A": "R01C01",
    "Sample_Group": "Group2",
    "MetaData1": "M"
  }
]

{
  "softwareVersion": "dragena 1.3.0",
  "genomeBuild": "38",
  "starAlleleDatabaseSources": [
    "PharmVar Version: 6.1",
    "PharmGKB Database Version: Snapshot-2024.05.16",
    "UGT Alleles Nomenclature: 2010.12.21",
    "The Human Cytochrome P450 (CYP) Allele Nomenclature Database, July 2024"
  ],
  "phenotypeDatabaseSources": [
    "CPIC Database Version: 1.38.0",
    "DPWG Database Version: June 2023"
  ],
  "mappingFile": "DRAGENA-549-fix-annotate-sha.e56e884ed1f2d118e796cdab578ab895456bb94e.zip",
  "pgxGuideline": "CPIC",
  "sampleId": "207883050020_R08C03",
  "locusAnnotations": [
    {
      "gene": "CYP2C9",
      "callType": "Star Allele",
      "genotype": "*1/*1",
      "activityScore": "2",
      "phenotypeDatabaseAnnotation": "CYP2C9 Normal Metabolizer",
      "qualityScore": "0.9999",
      "rawScore": "0.9999",
      "supportingVariants": [],
      "candidateSolutions": [
        {
          "rank": 1,
          "genotype": "*1/*1",
          "activityScore": "2",
          "phenotypeDatabaseAnnotation": "CYP2C9 Normal Metabolizer",
          "qualityScore": 0.9999,
          "rawScore": 0.9999,
          "alleles": [
            {
              "solutionLong": "Complete: *1",
              "supportingVariants": [],
              "missingVariantSites": [],
              "collapsedAlleles": ""
            }
          ],
          "copyNumberRegions": "p5,exon.1,intron.1,exon.2,intron.2,exon.3,intron.3,exon.4,intron.4,exon.5,intron.5,exon.6,intron.6,exon.7,intron.7,exon.8,intron.8,exon.9,p3",
          "copyNumberSolution": "2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2"
        }
      ],
      "missingVariantSites": [
        {
          "id": "NC_000010.11:g.94938719T>G",
          "alleleIds": "*80"
        },
        {
          "id": "NC_000010.11:g.94938788C>T",
          "alleleIds": "*83"
        },
        {
          "id": "NC_000010.11:g.94938800G>A",
          "alleleIds": "*76"
        },
        {
          "id": "NC_000010.11:g.94941975G>A",
          "alleleIds": "*77"
        },
        {
          "id": "NC_000010.11:g.94942243T>G",
          "alleleIds": "*78"
        },
        {
          "id": "NC_000010.11:g.94942306C>T",
          "alleleIds": "*72"
        },
        {
          "id": "NC_000010.11:g.94942308C>T",
          "alleleIds": "*73"
        },
        {
          "id": "NC_000010.11:g.94942309G>T",
          "alleleIds": "*27"
        },
        {
          "id": "NC_000010.11:g.94947939G>T",
          "alleleIds": "*74"
        },
        {
          "id": "NC_000010.11:g.94949145C>T",
          "alleleIds": "*82"
        },
        {
          "id": "NC_000010.11:g.94949163del",
          "alleleIds": "*85"
        },
        {
          "id": "NC_000010.11:g.94972183A>T",
          "alleleIds": "*81"
        },
        {
          "id": "NC_000010.11:g.94981258C>T",
          "alleleIds": "*79"
        },
        {
          "id": "NC_000010.11:g.94986136A>C",
          "alleleIds": "*75"
        },
        {
          "id": "NC_000010.11:g.94986174G>C",
          "alleleIds": "*84"
        }
      ],
      "allelesTested": "*1,*2,*3,*4,*5,*6,*7,*8,*9,*10,*11,*12,*13,*14,*15,*16,*17,*18,*19,*20,*21,*22,*23,*24,*25,*26,*27,*28,*29,*30,*31,*32,*33,*34,*35,*36,*37,*38,*39,*40,*41,*42,*43,*44,*45,*46,*47,*48,*49,*50,*51,*52,*53,*54,*55,*56,*57,*58,*59,*60,*61,*62,*63,*64,*65,*66,*67,*68,*69,*70,*71,*72,*73,*74,*75,*76,*77,*78,*79,*80,*81,*82,*83,*84,*85"
    },
    {
      "gene": "CYP2C19",
      "callType": "Star Allele",
      "genotype": "*1/*2",
      "activityScore": "n/a",
      "phenotypeDatabaseAnnotation": "CYP2C19 Intermediate Metabolizer",
      "qualityScore": "0.9999",
      "rawScore": "0.9958",
      "supportingVariants": [
        {
          "chrom": "10",
          "pos": "94842866",
          "ref": "A",
          "alt": "G",
          "gt": "1/1",
          "gs": "0.2669",
          "baf": "1",
          "id": "NC_000010.11:g.94842866A>G",
          "alleleIds": "*1"
        },
        {
          "chrom": "10",
          "pos": "94775367",
          "ref": "A",
          "alt": "G",
          "gt": "0/1",
          "gs": "0.2191",
          "baf": "0.4690612",
          "id": "NC_000010.11:g.94775367A>G",
          "alleleIds": "*2"
        },
        {
          "chrom": "10",
          "pos": "94781859",
          "ref": "G",
          "alt": "A",
          "gt": "0/1",
          "gs": "0.3351",
          "baf": "0.66212183",
          "id": " NC_000010.11:g.94781859G>A",
          "alleleIds": "*2"
        },
        {
          "chrom": "10",
          "pos": "94842866",
          "ref": "A",
          "alt": "G",
          "gt": "1/1",
          "gs": "0.2669",
          "baf": "1",
          "id": " NC_000010.11:g.94842866A>G",
          "alleleIds": "*2"
        }
      ],
      "candidateSolutions": [
        {
          "rank": 1,
          "genotype": "*1/*2",
          "activityScore": "n/a",
          "phenotypeDatabaseAnnotation": "CYP2C19 Intermediate Metabolizer",
          "qualityScore": 0.9999,
          "rawScore": 0.9958,
          "alleles": [
            {
              "solutionLong": "Complete: *1",
              "supportingVariants": [
                {
                  "chrom": "10",
                  "pos": "94842866",
                  "ref": "A",
                  "alt": "G",
                  "gt": "1/1",
                  "gs": "0.2669",
                  "baf": "1",
                  "id": "NC_000010.11:g.94842866A>G"
                }
              ],
              "missingVariantSites": [],
              "collapsedAlleles": ""
            },
            {
              "solutionLong": "Complete: *2",
              "supportingVariants": [
                {
                  "chrom": "10",
                  "pos": "94775367",
                  "ref": "A",
                  "alt": "G",
                  "gt": "0/1",
                  "gs": "0.2191",
                  "baf": "0.4690612",
                  "id": "NC_000010.11:g.94775367A>G"
                },
                {
                  "chrom": "10",
                  "pos": "94781859",
                  "ref": "G",
                  "alt": "A",
                  "gt": "0/1",
                  "gs": "0.3351",
                  "baf": "0.66212183",
                  "id": " NC_000010.11:g.94781859G>A"
                },
                {
                  "chrom": "10",
                  "pos": "94842866",
                  "ref": "A",
                  "alt": "G",
                  "gt": "1/1",
                  "gs": "0.2669",
                  "baf": "1",
                  "id": " NC_000010.11:g.94842866A>G"
                }
              ],
              "missingVariantSites": [],
              "collapsedAlleles": "*2.001"
            }
          ],
          "copyNumberRegions": "p5,exon.1,intron.1,exon.2,intron.2,exon.3,intron.3,exon.4,intron.4,exon.5,intron.5,exon.6,intron.6,exon.7,intron.7,exon.8,intron.8,exon.9,p3",
          "copyNumberSolution": "2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2"
        }
      ],
      "missingVariantSites": [
        {
          "id": "NC_000010.11:g.94762715T>C",
          "alleleIds": "*34"
        }
      ],
      "allelesTested": "*1,*2,*3,*4,*5,*6,*7,*8,*9,*10,*11,*12,*13,*14,*15,*16,*17,*18,*19,*22,*23,*24,*25,*26,*28,*29,*30,*31,*32,*33,*34,*35,*38,*39"
    }
  ]
}

{
  "annotateDb": "CytoAnnotateData_DAv1.2.0.zip",
  "softwareVersion": "dragena 1.3.0 Cyto",
  "referenceGenome": "file://genome.fa",
  "annotationType": "Constitutional",
  "genomeBuild": "hg19",
  "databaseSources": "RefSeq (Version: GCF_000001405.40-RS_2023_10; Release Date: 2023-10-07),Ensembl (Version: 112; Release Date: 2024-05-14)",
  "iscnVersion": "ISCN 2020",
  "sampleId": "208662410005_R01C01",
  "gcCorrect": true,
  "minDelProbes": 10,
  "minDupProbes": 10,
  "minLOHProbes": 500,
  "minDelSize": "20kb",
  "minDupSize": "20kb",
  "minLOHSize": "3000kb",
  "minQual": 20,
  "overallPloidy": 2.012,
  "callRate": 0.9847303032875061,
  "logRDev": 0.19977793097496033,
  "medianLogRDev": 0.1329595363075463,
  "bafDev": {
    "AA": 0.015344353947021572,
    "AB": 0.05078388443393606,
    "BB": 0.026002263878862304
  },
  "numLOHOver1M": 4,
  "numLOHOver8M": 3,
  "totalSizeLOHOver1M": 49319297,
  "copyNumberMedian": 2.0,
  "percentLOH": "1.59%",
  "sexEstimate": "Female",
  "traditionalNomenclature": "dup(2)(q32.3q33.1),dup(2)(q33.1q37.1),dup(2)(q37.1q37.3),del(2)(q37.3q37.3),del(2)(q37.3q37.3),dup(3)(p24.3p24.3),del(13)(q34q34)",
  "microarrayNomenclature": "1p12q21.1(120311442_144549929)x2 hmz,2q32.3q33.1(197045077_201353083)x3,2q33.1q37.1(201356309_234652155)x3,2q37.1q37.3(234653107_238195820)x3,2q37.3(238204076_238283050)x1,2q37.3(238283403_243062047)x1,3p24.3(23235392_23403815)x3,5p12q11.1(44708357_49847659)x2 hmz,11p11.2q12.1(47912150_56507812)x2 hmz,13q34(111358236_111423865)x1,Xp11.22q12(53907828_65253670)x2 hmz",
  "chromosomeAnnotations": [
    {
      "id": "chr1",
      "size": 249250621,
      "percentHet": "10.9587%",
      "hasMosaicism": false,
      "lrrMedian": -0.009397780522704124,
      "lrrDev": 0.16622741086471732,
      "numLOHOver1M": 0,
      "numLOHOver8M": 0,
      "totalSizeLOHOver1M": 0,
      "percentLOH": "0%",
      "copyNumberMedian": 2.0,
      "copyNumberMean": 2.0,
      "minLogRRatio": -2.739042392000556,
      "maxLogRRatio": 1.631125334650278,
      "medianMosaicFraction": ".",
      "numberDel": 0,
      "numberDup": 0,
      "numberLOH": 0,
      "numberMosaic": 0
    },
    ...
  ],
  "locusAnnotations": [
    {
      "id": "AOH:1:120311442:144549929",
      "chrom": "chr1",
      "start": 120311441,
      "end": 144549929,
      "callType": "LOH",
      "mosaicState": false,
      "mosaicFraction": ".",
      "copyNumber": 2,
      "qualityScore": 35.0,
      "size": 24238488,
      "effectiveSize": 151008838,
      "probeCount": 701,
      "percentHet": "1.01%",
      "lrrMedian": 0.05862508801510572,
      "lrrDev": 0.08330211160585141,
      "bafDev": 0.47622189059186604,
      "startCytoBand": "1p12",
      "endCytoBand": "1q21.1",
      "traditionalNomenclature": "N/A",
      "microarrayNomenclature": "1p12q21.1(120311442_144549929)x2 hmz",
      "geneCount": 96,
      "genes": [
        "HMGCS2",
        "REG4",
        "NBPF7P",
        "PFN1P9",
        "NOTCH2P1",
        "ADAM30",
        "RP5-1042I8.7",
        "NOTCH2",
        "RP11-114O18.1",
        ...
      ]
    },
    ...
  ]
}

If the genotyping module reports an unknown sex and the cytogenetic caller cannot resolve it, the caller assumes the sample is male. As a result, sex chromosome detection may be inaccurate if the sample is actually female. This behavior is not currently output in the log.
ISCN annotations in the cytogenetic annotation JSON output file are only provided for variants greater than 1 Kb in length. This is often cited as a minimum size limit used to define copy number variants.
Centromere regions typically have low sequence complexity and are prone to artifacts. As a result, cytogenetic calling results in these regions are likely to be false positives.
ISCN annotations are not provided for LOH variants in the cytogenetic annotation JSON output file.
DRAGEN Array Cytogenetics analysis is intended for constitutional samples only; oncology samples are not supported at this time.
DRAGEN Array Cytogenetics analysis is validated only for specific array platforms: Infinium Global Diversity Array with Cytogenetics-8, Infinium Global Screening Array with Cytogenetics-24, and Infinium CytoSNP-850K BeadChip (iScan System).
- Note: DRAGEN Array can process IDAT files from the NextSeq550 for cytogenetic analysis, but this setup hasn’t been formally validated. If you're interested in trying it, check out the demo data in the ‘Demo Data’ section on BaseSpace, which was generated using the iScan system.
DRAGEN Array Cytogenetics analysis may call large events that are broken into smaller pieces and require visual confirmation.
GT is hardcoded to homozygous alt (1/1) for cyto VCF entries.
Tabix indexing from DRAGEN Array is not exactly the same as the indexing produced by bcftools or tabix. For instance, if you run bcftools index --stats in.vcf.gz or bcftools index --nrecords in.vcf.gz with certain versions of bcftools, you may get the following error: index of in.snv.vcf.gz does not contain any count metadata. Please re-index with a newer version of bcftools or tabix. If these tools are critical to your bioinformatics pipelines, a workaround is to unzip and re-index the DRAGEN Array VCFs using bcftools or tabix. Note, however, that these index files may not work in downstream VCF-based DRAGEN Array commands like pgx star-allele call. Please use DRAGEN Array end-to-end for analysis flows like the ones detailed in the guide.
There can be some minor differences when running pgx star-allele call on Windows vs. Linux. During verification testing, out of 1576 samples, we noticed the following discordance:

{
   "SentrixBarcode_A": "204753010023",
   "SentrixPosition_A": "R02C01",
   "Sample ID": "204753010023_R02C01",
   "Sample Name": "204753010023_R02C01",
   "Sample Folder": "/tmp",
   "Autosomal Call Rate": 0.99,
   "Call Rate": 0.99,
   "Log R Ratio Std Dev": 0.15,
   "Sex Estimate": "F",
   "": ""
}

DRAGEN Array Local Analysis

DRAGEN Array Local Overview

DRAGEN Array provides accurate, comprehensive, and efficient analysis of Infinium microarray data. The local command-line interface gives power users granular control and the flexibility to support large-scale microarray genomic studies.

Getting Started

DRAGEN Array Local utilizes a command-line interface which allows full user control of software functionality and easy automation of tasks. The software is designed to be used by power users and bioinformaticians. If new to using command-line interface, please review the .

Computing Requirements

Before downloading and installing the software, ensure the following specifications are met for best performance:

Quota Specifications

The star-allele call command in DRAGEN Array Local requires quota to run. The quota is charged per sample analyzed and can be purchased on the . Quota is used for all samples analyzed including re-analysis or low-quality samples. Quota is checked before and after analysis but not after updating the usage. Users will need to re-run the command to re-check the current usage after a run.

The credential provided in the activation email after purchasing should be used as an input to the star-allele call command through the "--license-server-url" option. During runtime, the will record the remaining quota at the beginning and the end of the analysis.

Internet is required to do a software license check and ensure paid quota is available for all samples in the analysis batch. For the software license check, the following endpoints are used:

In v1.0 and v1.1: license.edicogenome.com
In v1.2+: license.dragen.illumina.com
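The version-to-endpoint rule above can be expressed as a small lookup, for example to decide which host a firewall must allow. This is a minimal Python sketch: the function name license_host is an illustrative assumption, and it returns only the host name listed above. The full value for --license-server-url comes from the activation email and is not constructed here.

```python
def license_host(version: str) -> str:
    """Return the license check host for a DRAGEN Array version string."""
    major, minor = (int(part) for part in version.split(".")[:2])
    if (major, minor) >= (1, 2):
        return "license.dragen.illumina.com"
    return "license.edicogenome.com"


for version in ("1.0", "1.1.0", "1.2", "1.3.0"):
    print(version, license_host(version))
```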

NOTES:

Do not use license.dragen.illumina.com license server urls when running DRAGEN Array v1.0 and v1.1 as that domain only works with v1.2+ versions. This is described in the and known issues.
In v1.1+, precomputed quota is no longer checked during analysis. As a result, an analysis run can be over quota but will not fail until the end of the run. For example, if there is quota for only 6 samples but the analysis run contains 8 samples, the analysis proceeds as normal until usage is updated at the end of the run. The software then produces the following error and does not write the results to disk: Error updating usage. HTTP error status code: 409
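Because an over-quota run only fails at the end, it can be worth comparing the number of samples in a batch with the remaining quota before starting. The following is a minimal Python sketch of such a pre-flight check, not a DRAGEN Array feature. It assumes the standard IDAT naming in which each sample has one file ending in _Grn.idat and one ending in _Red.idat, and it assumes the remaining quota is a number the user already knows (for example, from the output of a previous run). The function names are illustrative.

```python
def count_idat_samples(file_names):
    """Count samples that have both a _Grn.idat and a _Red.idat file."""
    green = {n[: -len("_Grn.idat")] for n in file_names if n.endswith("_Grn.idat")}
    red = {n[: -len("_Red.idat")] for n in file_names if n.endswith("_Red.idat")}
    return len(green & red)


def within_quota(sample_count, remaining_quota):
    """Return True when the batch fits within the remaining quota."""
    return sample_count <= remaining_quota


# Hypothetical batch of 8 samples, checked against a quota of 6.
names = []
for row in range(1, 9):
    prefix = f"204753010023_R{row:02d}C01"
    names += [f"{prefix}_Grn.idat", f"{prefix}_Red.idat"]

samples = count_idat_samples(names)
print(samples, within_quota(samples, remaining_quota=6))
```

To count samples in a real folder, the file names could be collected with os.listdir on the IDAT folder before calling count_idat_samples.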

Installation

Please follow the steps below to install the software on your compute infrastructure:

Click on the latest DRAGEN Array version installation package for the platform of your choice. Installers for Windows and Linux are available on the . Once download is completed, move the DRAGEN Array installation package to the desired folder. Administrative permissions may be required for system folders, for example /usr/local/bin for Linux, and C:\Program Files for Windows. Note: Throughout the remainder of the document, Linux will be assumed in the examples.
Unzip and extract the package. The executable can be found in the dragena subfolder of the software download after extraction.

If the installation was successful, the version of the software is displayed in the terminal window.

Run DRAGEN Array Local

For genotyping or cytogenetic analysis, there is no sample minimum required to run analysis.

For CNV PGx analysis, a minimum of 24 samples is required to run analysis. For a successful analysis, at least 22 samples must pass QC, defined as having log R dev < 0.2. With the standard hardware specification in the Computing Requirements section, up to 500 GDA-ePGx samples can be processed per analysis batch.
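The sample minimums above can be checked before launching a CNV PGx batch. The following is a minimal Python sketch, not part of the software. It assumes the per-sample log R deviation values are already available as a list of numbers (for example, taken from the Log R Ratio Std Dev field of gt_sample_summary, which is an assumption about which metric the QC rule refers to). The function name cnv_batch_ready is illustrative.

```python
def cnv_batch_ready(log_r_devs, min_samples=24, min_passing=22, max_log_r_dev=0.2):
    """Check the batch size and the number of samples passing QC.

    A sample passes QC when its log R deviation is below max_log_r_dev.
    """
    passing = sum(1 for value in log_r_devs if value < max_log_r_dev)
    return len(log_r_devs) >= min_samples and passing >= min_passing


# Hypothetical batches, for illustration only.
print(cnv_batch_ready([0.15] * 22 + [0.30] * 2))  # 24 samples, 22 passing
print(cnv_batch_ready([0.15] * 21 + [0.30] * 3))  # 24 samples, 21 passing
print(cnv_batch_ready([0.15] * 23))               # only 23 samples
```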

To optimize performance of the targeted PGx CNV caller and minimize batch effect, it is recommended to:

Group samples in the same assay batch (e.g., whole genome amplification and targeted gene amplification assay batch) into the same analysis batch.
Avoid combining sample batches processed on different reagent lots.
Analyze batches of 96 samples or more.

Quick Start

Review section for information on input files to use, sample minimums per analysis type and other best practices.

Command examples show analysis on a Linux system using folders instead of sample sheets. For Windows users, make sure to adapt the file paths in the commands to Windows conventions, e.g., using a backslash (\) instead of a forward slash (/). A sample sheet can be used to select specific samples out of a folder.

Note: DRAGEN Array will overwrite older files if using the same --output-folder from a previous analysis. If this is not desired, use different --output-folder for re-analyses.
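One simple way to avoid overwriting earlier results is to give every re-analysis its own timestamped output folder and pass that path to --output-folder. The following is a minimal Python sketch of that idea. The function name unique_output_folder and the run_YYYYMMDD_HHMMSS naming scheme are illustrative assumptions, and POSIX-style paths are used because Linux is assumed in the examples.

```python
from datetime import datetime
from pathlib import PurePosixPath


def unique_output_folder(base, now=None):
    """Return a timestamped subfolder path under `base`."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return str(PurePosixPath(base) / f"run_{stamp}")


# The resulting path can be passed to --output-folder.
print(unique_output_folder("/user/gtc"))
```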

PGx

Use the following instructions to start the full PGx analysis, covering genotyping, PGx CNV, and PGx star allele calling. Refer to the Command Index for parameters for all commands.

Open a command prompt (Windows) or terminal window (Linux) and navigate to the directory where the software was installed, or to any other desired directory if the executable was added to the PATH environment variable.
Use the genotype call command to call genotypes and generate GTC files using IDAT files as input. dragena genotype call --bpm-manifest /user/productfiles/manifest.bpm --cluster-file /user/productfiles/clusterfile.egt --idat-folder /user/IDATs --output-folder /user/gtc
Use the genotype gtc-to-vcf command to create SNV VCF files from the GTC files generated by the genotype call command. dragena genotype gtc-to-vcf --bpm-manifest /user/productfiles/manifest.bpm --csv-manifest /user/productfiles/manifest.csv --genome-fasta-file /user/productfiles/genome.fa --gtc-folder /user/gtc --output-folder /user/vcf
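
The two genotyping commands are linked by the GTC folder: the --output-folder of genotype call becomes the --gtc-folder of genotype gtc-to-vcf. The Python sketch below builds both argument lists from the flags shown in the steps above so the folder is written only once. The function name and the PRODUCT_FILES constant are hypothetical, and the paths are the example paths from this guide.

```python
PRODUCT_FILES = "/user/productfiles"  # example location used in this guide


def pgx_genotyping_commands(idat_folder, gtc_folder, vcf_folder):
    """Build argument lists for genotype call followed by genotype gtc-to-vcf."""
    genotype_call = [
        "dragena", "genotype", "call",
        "--bpm-manifest", f"{PRODUCT_FILES}/manifest.bpm",
        "--cluster-file", f"{PRODUCT_FILES}/clusterfile.egt",
        "--idat-folder", idat_folder,
        "--output-folder", gtc_folder,
    ]
    gtc_to_vcf = [
        "dragena", "genotype", "gtc-to-vcf",
        "--bpm-manifest", f"{PRODUCT_FILES}/manifest.bpm",
        "--csv-manifest", f"{PRODUCT_FILES}/manifest.csv",
        "--genome-fasta-file", f"{PRODUCT_FILES}/genome.fa",
        "--gtc-folder", gtc_folder,  # same folder that genotype call wrote to
        "--output-folder", vcf_folder,
    ]
    return [genotype_call, gtc_to_vcf]


for command in pgx_genotyping_commands("/user/IDATs", "/user/gtc", "/user/vcf"):
    print(" ".join(command))
```

Each list can be passed to subprocess.run(command, check=True) to run the steps in order and stop at the first failure.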

Cytogenetics

Use the following instructions to start the full cytogenetics analysis, covering genotyping, CNV and LOH calling, and annotation. Refer to the Command Index for parameters for all commands.

Open a command prompt (Windows) or terminal window (Linux) and navigate to the directory where the software was installed, or to any other directory if the executable was added to the PATH environment variable.
Use the genotype call command to call genotypes and generate GTC files using IDAT files as input.
dragena genotype call --bpm-manifest /user/productfiles/manifest.bpm --cluster-file /user/productfiles/clusterfile.egt --idat-folder /user/IDATs --output-folder /user/gtc
Use the cyto call command to determine copy number variants and loss of heterozygosity given genotypes.
dragena cyto call --cn-model /user/productfiles/cyto_model.dat --gtc-folder /user/gtc --output-folder /user/vcf

Command Index

Use the following syntax when using the command-line interface:

dragena [module] [sub-module (not needed for cyto)] [command] [required parameters] [optional parameters]

Module

Description

help

Displays the first-layer help information.

version

Displays current DRAGEN Array Local version.

genotype

The root command for genotype calling.

Command

Description

genotype call

Determines genotype calls (GTC) from IDAT files.

Option

Description

Note: Either --idat-folder, --sample-sheet, or both are required inputs.

genotype gtc-to-bedgraph

Converts GTC to BedGraph files, producing BedGraph formatted visualization files from the Log R Ratio and B-allele frequency data contained in the GTC intermediate files.

Option

Description

Note: Either --gtc-folder, --sample-sheet, or both are required inputs.

genotype gtc-to-vcf

Converts GTC (v5) to VCF. The command is only applicable for GTC files produced by DRAGEN Array.

Option

Description

Squashing duplicates

In the manifest, the same variant may be probed by multiple assays. These assays may share the same design or use alternate designs for the same locus. By default, these duplicates are "squashed" into a single record in the VCF so that the record reflects the true variant genotype rather than individual probe genotypes. The method used to combine information across multiple assays is documented separately. When the --unsquash-duplicates option is provided, this "squashing" behavior is disabled, and each duplicate assay is reported as a separate entry in the VCF file. This option is helpful for investigating or validating the performance of individual assays, rather than generating genotypes for specific variants. Note that duplicates are not unsquashed for multi-allelic variants, i.e., when a locus has more than two alleles and is also queried with duplicated designs. Do not use the --unsquash-duplicates option if star allele calling will be run downstream, because that command expects squashed variants.

Genome cache

By default, the entire reference genome will be read into memory. Generally, this will be more efficient than reading data from the indexed reference on disk at the expense of greater memory utilization. For situations in which the genome caching is not desirable (low memory availability or a small input manifest), it is possible to disable this default behavior with the --disable-genome-cache option.

Auxiliary loci

Certain classes of variant types (such as multi-nucleotide variants) are not currently supported in the upstream analysis software that produces GTC files. However, it is possible to query this type of variant by creating a SNP design that differentiates the specific multi-nucleotide alleles of interest. For example, if the true source sequence is

ATGC[AT/CG]GTAA

This assay could be designed as a SNP assay with the following source sequence

ATGC[A/C]NNNN

The gtc-to-vcf command provides an option (--auxiliary-loci) to supply a list of auxiliary records (in VCF format) that restore the true alleles for these cases in the output VCF. This function has several restrictions:

The auxiliary definition must NOT be a multi-allelic variant.
The auxiliary definition must be a multi-nucleotide variant.
There must NOT be multiple array assays (e.g., duplicates) for the locus.
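
These restrictions can be screened before passing a file to --auxiliary-loci. The Python sketch below is an illustration under stated assumptions, not the validation DRAGEN Array performs. It treats a multi-nucleotide variant as a record whose REF and ALT alleles have the same length, greater than one, as in the AT/CG example above. The function name is hypothetical, and the set of loci with duplicate assays must be supplied by the caller.

```python
def check_auxiliary_record(vcf_line, duplicated_loci=frozenset()):
    """Return a list of restriction violations for one auxiliary VCF data line.

    duplicated_loci: set of (chrom, pos) tuples with more than one array assay
    (hypothetical input; how it is derived from the manifest is not shown).
    """
    chrom, pos, _id, ref, alt = vcf_line.rstrip("\n").split("\t")[:5]
    alts = alt.split(",")
    problems = []
    if len(alts) > 1:
        problems.append("multi-allelic definition")
    # Assumed MNV definition: REF and every ALT share one length greater than 1.
    if not all(len(ref) > 1 and len(a) == len(ref) for a in alts):
        problems.append("not a multi-nucleotide variant")
    if (chrom, int(pos)) in duplicated_loci:
        problems.append("multiple array assays for the locus")
    return problems


print(check_auxiliary_record("chr1\t100\t.\tAT\tCG"))
print(check_auxiliary_record("chr1\t100\t.\tA\tC"))
```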

Notes:

Either --gtc-folder, --sample-sheet, or both are required inputs.
The genome FASTA files for human genomes are provided by Illumina.

genotype help

Displays the help information for a genotype command.

genotype version

Displays current DRAGEN Array Local version.

pgx

The root command for the PGx module.

Command

Description

pgx copy-number

The root command for actions that act on pgx copy number variants.

Command

Description

pgx copy-number call

The command used to call copy number variants. A batch of 24 samples or more is required for analysis. For a successful analysis, at least 22 samples must pass QC, defined as having log R dev < 0.2.

Option

Description

pgx copy-number train

Trains a PGx copy number (CN) model for a set of samples. Generate a new PGx CN model when using a customized cluster file (.egt) that was optimized for a specific data set.

Execute the train command using the data sets that were used to optimize the cluster file.
To use a pgx CN model generated by the train command, the mask file for the manifest must be saved in the same directory as the manifest.
A minimum of 96 samples is required to use the copy-number train command. For optimal performance, at least 150 is recommended.

See Optimizing cluster files and copy number models for further details.

Option

Description

pgx copy-number help

Displays help information for the copy-number command.

pgx copy-number version

Displays version information for pgx copy-number command.

pgx star-allele

The root command for PGx star allele calling.

Command

Description

pgx star-allele call

Calls PGx star allele diplotypes. The SNV VCF files should be generated using the DRAGEN Array gtc-to-vcf command with unsquash-duplicates off (default) and without filter loci.

Option

Description

pgx star-allele annotate

Annotates the star alleles with metabolizer status and summarizes the results in a consolidated JSON report. Metabolizer status is determined by direct lookup in the public PGx guidelines (CPIC or DPWG) specified by the user.

Option

Description

pgx star-allele help

Displays help information for a star-allele command.

pgx star-allele version

Displays version information for star-allele.

cyto

The root command for Cytogenetics analysis and annotation.

Command

Description

cyto call

Determines copy number variants (CNV) and loss/absence of heterozygosity (LOH/AOH) given genotypes.

Option

Description

Notes:

More than 10 events (DEL/DUP/AOH) on a chromosome indicates that visual inspection is needed.
If the mosaic fraction cannot be estimated due to insufficient informative probes, it is set to NaN.
Mosaic events that surpass the --max-mosaic-fraction limit have the MOSAIC tag in the INFO field of the VCF replaced with a HIGHFRACTION tag.
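
The first note can be turned into a simple screen over a cyto VCF. The Python sketch below counts data records per chromosome and lists chromosomes with more than 10. It assumes that every non-header record is one DEL, DUP, or AOH event, and the function name is hypothetical. It is not a DRAGEN Array feature.

```python
from collections import Counter

MAX_EVENTS_PER_CHROMOSOME = 10  # more than this suggests visual inspection


def chromosomes_needing_inspection(vcf_lines):
    """Return chromosomes with more than 10 records, given VCF text lines."""
    counts = Counter()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom = line.split("\t", 1)[0]  # CHROM is the first VCF column
        counts[chrom] += 1
    return sorted(c for c, n in counts.items() if n > MAX_EVENTS_PER_CHROMOSOME)


example = ["#CHROM\tPOS\tID\tREF\tALT"]
example += [f"chr1\t{1000 * i}\t.\tN\t<DEL>" for i in range(1, 12)]  # 11 events
example += [f"chr2\t{1000 * i}\t.\tN\t<DUP>" for i in range(1, 11)]  # 10 events
print(chromosomes_needing_inspection(example))
```

For compressed VCF files, the lines can be read with gzip.open(path, "rt") from the Python standard library.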

cyto annotate

Annotates samples and generates cytogenetic JSON reports.

Option

Description

Notes:

The cyto.cnv.dat metadata file that cyto call generates in the VCF folder must remain in that folder for cyto annotate.
The VCF files must be compressed and indexed for cyto annotate, so the --no-bgzip flag cannot be used when generating cyto VCF files that will be passed to the cyto annotate command.
The cyto annotate step needs at least 5 GB of free space on the hard drive.
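
The three notes above can be checked before running cyto annotate. The Python sketch below is a hypothetical pre-flight check, not part of DRAGEN Array. It assumes that index files sit next to each VCF with a .tbi or .csi suffix, that 5 GB means 5 GiB, and that the free space is needed on the drive holding the VCF folder. None of these details are stated in this guide.

```python
import shutil
from pathlib import Path

MIN_FREE_BYTES = 5 * 1024**3       # assumed interpretation of "5 GB"
INDEX_SUFFIXES = (".tbi", ".csi")  # assumed index file suffixes


def check_cyto_annotate_inputs(vcf_folder, min_free_bytes=MIN_FREE_BYTES):
    """Return a list of problems found in the folder passed to cyto annotate."""
    folder = Path(vcf_folder)
    problems = []
    if not (folder / "cyto.cnv.dat").is_file():
        problems.append("cyto.cnv.dat is missing from the VCF folder")
    for vcf in sorted(folder.glob("*.vcf")):
        problems.append(f"{vcf.name} is not compressed")
    for vcf_gz in sorted(folder.glob("*.vcf.gz")):
        has_index = any(Path(str(vcf_gz) + s).is_file() for s in INDEX_SUFFIXES)
        if not has_index:
            problems.append(f"{vcf_gz.name} has no index file")
    if shutil.disk_usage(folder).free < min_free_bytes:
        problems.append("not enough free disk space")
    return problems
```

An empty list means none of the checked conditions failed. It does not guarantee that cyto annotate will succeed.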

cyto help

Displays more information on a specific command.

cyto version

Displays version information.

Troubleshooting and Additional Support

Tips for using the Command-line interface

When using the command-line interface, consider the following tips:

A file name that contains spaces must be enclosed in quotes when used in a command.
To correct a typing error in a previously entered command, use the up arrow to recall the previous command, then correct the error before re-entering it.
Double-check the command. Misspellings, extra or missing dashes, and similar errors make the command unrecognizable to the software.
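
The quoting tip can be illustrated with Python's standard shlex module, which applies POSIX shell quoting rules (Linux terminals). The folder name containing a space is a made-up example. Windows command prompts use double quotes instead and are not covered by this sketch.

```python
import shlex

args = [
    "dragena", "genotype", "call",
    "--bpm-manifest", "/user/productfiles/manifest.bpm",
    "--cluster-file", "/user/productfiles/clusterfile.egt",
    "--idat-folder", "/user/my IDATs",  # hypothetical folder name with a space
    "--output-folder", "/user/gtc",
]

# shlex.quote leaves safe arguments unchanged and wraps the others in quotes.
command_line = " ".join(shlex.quote(arg) for arg in args)
print(command_line)
```

Only the argument containing a space is quoted in the printed command line. The other arguments are left as typed.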

Optimizing cluster files and copy number models

A cluster file (.egt) contains the cluster positions of every probe used for genotyping analysis. Illumina provides a standard cluster file for all commercial Infinium BeadChips. It may be desirable to create a custom cluster file if the one provided does not fit the data well, or if a semi-custom or custom BeadChip that does not come with a cluster file is used. GenomeStudio is the software used to create custom cluster files.

To facilitate the review and optimization of PGx variant GenTrain cluster positions, a GenomeStudio auxiliary file is provided for each PGx array product through the array product files page. The auxiliary file is a tab-delimited text file that can be imported into GenomeStudio through Column Import. The file contains the Infinium assay to PGx star allele mapping, covering the variants involved in DRAGEN Array PGx star allele calling.

When updating the cluster file for pharmacogenomic applications, understand the specifications for the copy number model file before beginning.

Before creating a custom cluster file, review the , the , and .

A CN model file (.dat) contains the data needed to make accurate copy number calls for pharmacogenomics. This file is used in the creation of CNV VCFs, which are inputs to the star allele calling command. Illumina provides a standard CN model file for all commercial PGx Infinium BeadChips. If the cluster file needs to be customized, the CN model file should also be updated using the copy-number train command, which is available with DRAGEN Array Local only, as follows:

Use GenomeStudio 2.0 to generate a new cluster file.
Use the genotype call command to call genotypes and generate GTC files using IDAT files as input.
dragena genotype call --bpm-manifest /user/productfiles/manifest.bpm --cluster-file /user/productfiles/new_clusterfile.egt --idat-folder /user/IDATs --output-folder /user/new_gtcs
Use the copy-number train command to retrain the copy number model. Note: The --platform option can be found in the

Note the difference in the cluster file requirement based upon the version of DRAGEN Array used:

Version 1.1+: If a CN model is used with a different cluster file, the software provides a warning but proceeds with copy number calling. As a result, a user can choose to keep using the commercial CN model from Illumina in combination with a custom, updated EGT file in the PGx analysis.
Version 1.0: The same cluster file used for copy number training must be used to generate GTC files for copy number calling. Otherwise, the software will produce an error and exit.

For reference, see the Command Index for details of the copy-number train command.

To retrain the CN model file, at least 96 samples must be used, with at least 90 of those samples passing QC, defined as a Log R Dev less than or equal to 0.2. Training with at least 150 samples is recommended. A greater number of samples can be advantageous, but returns diminish and computation times grow beyond 3,000 samples.

It is recommended to manually QC the training samples and remove samples that have Log R Dev > 0.2, call rate < 0.99, or TGA Control probe < 1.0 so only the highest quality samples are used in the training. The same samples used to create the new cluster file should be used to retrain the CN Model. To minimize batch effect in the training sample set, the samples should be analyzed in as few batches as possible and come from the same reagent lots.
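
The manual QC described above amounts to a per-sample filter followed by a sample-count check. The Python sketch below shows one way to express it. The SampleQC record, its field names, and the status labels are hypothetical, and the metric values are assumed to come from an existing QC report.

```python
from dataclasses import dataclass

MIN_TRAINING_SAMPLES = 96           # required minimum for copy-number train
RECOMMENDED_TRAINING_SAMPLES = 150  # recommended for optimal performance


@dataclass
class SampleQC:
    sample_id: str
    log_r_dev: float
    call_rate: float
    tga_control: float


def is_training_quality(sample):
    """Keep a sample unless Log R Dev > 0.2, call rate < 0.99, or TGA control < 1.0."""
    return (
        sample.log_r_dev <= 0.2
        and sample.call_rate >= 0.99
        and sample.tga_control >= 1.0
    )


def select_training_samples(samples):
    """Return (kept_samples, status) after removing lower-quality samples."""
    kept = [s for s in samples if is_training_quality(s)]
    if len(kept) < MIN_TRAINING_SAMPLES:
        status = "insufficient"
    elif len(kept) < RECOMMENDED_TRAINING_SAMPLES:
        status = "minimum met"
    else:
        status = "recommended size met"
    return kept, status
```

Because every kept sample has a Log R Dev of 0.2 or less, a kept set of 96 or more samples also satisfies the requirement that at least 90 samples pass QC.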

The copy-number train algorithm is designed with the assumption that the copy number distribution resembles the standard population distributions. This ensures the updated CN model file is representative of the normal populations in which it will be used to calculate copy number for key pharmacogenomic targets.

Pharmacogenomic analysis for semi-custom arrays

Semi-custom arrays add custom or other pre-designed content to enhance the commercial array content. This additional content can be analyzed to obtain information on SNV and indel calls.

PGx CNV and star allele calls are limited to content included on the commercial Infinium PGx arrays. Additional semi-custom content will not be included in the pharmacogenomic results.

When designing a semi-custom array using a commercial Infinium PGx array backbone, such as the Global Diversity Array with enhanced PGx, it is important to retain all backbone content in the design, because removing content could decrease the quality of the results.

Pharmacogenomic analysis for semi-custom arrays should be run using . Because the PGx CNV calling and PGx star allele calling algorithms are only compatible with commercial product files, some steps of the pipeline can be run twice to fully analyze semi-custom PGx BeadChips: once with the semi-custom product files (to get complete semi-custom SNV VCF files), and once with the commercial product files (to get the PGx CNV VCF files, PGx star allele output, and metabolizer report).

The semi-custom product files can be used with the genotype call and genotype gtc-to-vcf commands in the command-line interface, and in GenomeStudio, as follows:

Use GenomeStudio 2.0 to prepare a custom cluster file for the semi-custom array, following the guidance outlined in Optimizing cluster files and copy number models.
Open a command prompt (Windows) or terminal window (Linux) and navigate to the directory where the software was installed, or to any other directory if the executable was added to the PATH environment variable.
Use the genotype call command to call all semi-custom genotypes and generate custom content GTC files using IDAT files as input.
dragena genotype call --bpm-manifest /user/productfiles/semi_custom_manifest.bpm --cluster-file /user/productfiles/semi_custom_clusterfile.egt --idat-folder /user/IDATs --output-folder /user/semi_custom_gtcs

Keep the GTC files and SNV VCF files generated using the semi-custom product files in clearly labelled folders to distinguish them from the GTC and SNV VCF files generated using the commercial product files. Note that the GTC and SNV VCFs generated using the commercial product files will not contain genotypes for the semi-custom/add-on content. The GTC and SNV VCFs generated using the semi-custom product files cannot be used for downstream PGx analysis commands.

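Example copy-number train command for retraining the copy number model (see Optimizing cluster files and copy number models):

dragena pgx copy-number train --bpm-manifest /user/productfiles/manifest.bpm --genome-fasta-file /user/productfiles/genome.fa --gtc-folder /user/new_gtcs --platform LCG --output-folder /user/productfiles/new_cnmodel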
