DRAGEN Array v1.3

Welcome to DRAGEN Array

DRAGEN (Dynamic Read Analysis for GENomics) Array secondary analysis is a powerful bioinformatics software for Illumina Infinium array-based assays. DRAGEN Array uses cutting-edge data analysis tools to provide accurate, comprehensive, and highly efficient secondary analysis to maximize genomic insights and meet your research needs across multiple applications.

DRAGEN Array is offered as a local package with a command-line interface (no specialized server or hardware required) and as a cloud-based package with an intuitive graphical user interface, as summarized in the table below.

Description
Key features
Local analysis
Cloud analysis

Genotyping

This product documentation describes the installation and setup, analysis execution, and result outputs. For the latest updates and release details, see the DRAGEN Array Release Notes. See Introducing DRAGEN™ Array 1.0 for Infinium™ Array-Based Pharmacogenomics Analysis for additional details on DRAGEN Array genotyping, PGx CNV calling, and PGx star allele annotation.

Release Notes

The following versions of DRAGEN Array have been released:

  • DRAGEN Array v1.3.0 Release Notes

    • DRAGEN Array v1.3.0 + Emedgene V100.39.0 Release Notes

  • DRAGEN Array v1.2.0 Release Notes

    • DRAGEN Array v1.2.0 EMGv38 Automatic Case Creation Release Notes

  • DRAGEN Array v1.1.0 Release Notes

  • DRAGEN Array v1.0.0 Release Notes

    • DRAGEN Array Genotyping Cloud v1.0.0 Release Notes

    • DRAGEN Array Methylation QC Cloud v1.0.0 Release Notes

    Provides genotyping results for any human Infinium genotyping array.

    • Greater than 99.5% genotyping accuracy

    • Genotyping VCF in as little as 35 seconds per sample

    PGx – CNV calling

    Provides CNV calling on 7 target PGx genes across 10 target regions, plus genotyping outputs for Infinium microarrays with enhanced PGx content.

    • Greater than 95% PGx CNV accuracy

    PGx – star allele annotation

    Provides PGx star allele and variant coverage across 2400+ targets for over 50 genes, plus PGx CNV and genotyping outputs for Infinium microarrays with enhanced PGx content.

    • Assess hard to discern PGx genes, including the elusive CYP2D6 with greater than 97% call rate

    • Obtain all PGx analysis results in ~1 minute per sample

    Methylation QC

    Provides high-throughput, quantitative methylation quality control for Infinium methylation arrays.

    • 21 algorithm-based quantitative control metrics with adjustable thresholds

    • Data summary plots

    • Proportion of CG probes passing with user defined p-value threshold

    Cytogenetics analysis

    Provides cytogenetic CNV calling and LOH (loss of heterozygosity) detection for human Infinium arrays.

    • Multiple output formats including CNV/LOH VCFs, annotated QC JSONs, and bedgraph files for Log R Ratio and B-Allele Frequency visualization

    • Adjustable algorithm thresholds such as minimum deletion, duplication, and LOH sizes and smoothing parameters

    Cytogenetics analysis + Emedgene interpretation

    Provides cytogenetic CNV calling and LOH (loss of heterozygosity) detection for human Infinium arrays with added visualization and case management in Emedgene.

    • Multiple output formats including CNV/LOH VCFs, annotated QC JSONs, and bedgraph files for Log R Ratio and B-Allele Frequency visualization

    • Adjustable algorithm thresholds such as minimum deletion, duplication, and LOH sizes and smoothing parameters

    DRAGEN Array Release Notes
    Introducing DRAGEN™ Array 1.0 for Infinium™ Array-Based Pharmacogenomics Analysis

    DRAGEN Array Methylation QC Cloud v1.0.0 Release Notes

    RELEASE DATE

    May 2024

    RELEASE HIGHLIGHTS

    • Adjustable thresholds to determine pass/fail status

    • Data summary plots for a quick visual check of each analysis batch

    • Determining detection p-value, beta-values, and m-values from each methylation sample

    • Deployment on BaseSpace™ Sequence Hub user interface for easy analysis kickoff
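
The highlights above mention beta-values and m-values. As a minimal sketch, the widely used relationship between the two is M = log2(beta / (1 - beta)). The function names and the `eps` clamp below are illustrative assumptions; the source does not state how DRAGEN Array computes these values internally.

```python
import math


def beta_to_m(beta, eps=1e-6):
    """Convert a methylation beta-value (0..1) to an M-value.

    Assumption: uses the common definition M = log2(beta / (1 - beta)).
    The eps clamp (hypothetical) avoids division by zero at 0 and 1.
    """
    clamped = min(max(beta, eps), 1.0 - eps)
    return math.log2(clamped / (1.0 - clamped))


def m_to_beta(m_value):
    """Convert an M-value back to a beta-value."""
    return 2.0 ** m_value / (2.0 ** m_value + 1.0)


if __name__ == "__main__":
    for beta in (0.1, 0.5, 0.8):
        print(beta, round(beta_to_m(beta), 4))
```

Beta-values are easier to read as a methylated fraction, while M-values are unbounded and often preferred for downstream statistics.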

    NEW FEATURES IN DETAIL

    • Adjustable thresholds for 21 built in controls, p-value detection, proportion probes passing, and offset correction within BaseSpace Sequence Hub to customize for user’s study needs

      • Thresholds are used to assign pass (1) or fail (0) status to each sample

        • Failed metrics can be highlighted for easy viewing
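
The pass (1) / fail (0) assignment described above can be sketched as follows. This is an illustration only: the metric names are hypothetical, and the assumption that a metric passes when its value is greater than or equal to its threshold is not confirmed by the source.

```python
def assign_status(metrics, thresholds):
    """Return {metric_name: 1 or 0} for every metric that has a threshold.

    Assumption: a metric passes (1) when value >= threshold, else fails (0).
    """
    return {
        name: int(metrics[name] >= limit)
        for name, limit in thresholds.items()
    }


def failed_metrics(status):
    """List the metrics with fail status so they can be highlighted."""
    return sorted(name for name, flag in status.items() if flag == 0)


if __name__ == "__main__":
    # Hypothetical metric names and values, for illustration only.
    sample_metrics = {"control_a": 5.2, "control_b": 0.8}
    sample_thresholds = {"control_a": 4.0, "control_b": 1.0}
    status = assign_status(sample_metrics, sample_thresholds)
    print(status, failed_metrics(status))
```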

    KNOWN ISSUES

    • Analysis may fail for batches of 80 to 100 samples when using large arrays (>900K probes). If you encounter this issue, increase the batch size to more than 100 samples as a workaround. Batches strictly larger than 100 samples are not affected.

    KNOWN LIMITATIONS

    • Standard thresholds may not be applicable for all discontinued, semi-custom or custom BeadChips and IDATs originating from NextSeq550

    • Built-in controls may not be available on all discontinued, semi-custom or custom BeadChips

    Frequently Asked Questions

    1. Is DRAGEN Array analysis a local (on-premises) or cloud solution? DRAGEN Array analysis is available both locally (on-premises) and in the cloud.

      DRAGEN Array Local Analysis utilizes a command-line interface for power users to have granular control and flexibility to support large scale microarray genomic studies. Deployed on Windows or Linux operating systems, the local package is CPU-based and does not require a specialized server or hardware.

      DRAGEN Array Cloud Analysis utilizes the user-friendly, graphical interface of BaseSpace Sequence Hub to simplify analysis setup and kickoff.

    2. Which Infinium arrays is DRAGEN Array compatible with? Refer to the Product and Analysis Compatibility table in the Applications section.

    3. How many samples are needed per analysis? Cytogenetics: As few as one sample can be used for cytogenetics. Multiple analysis batches can be kicked off and run in parallel.

      Genotyping: As few as one sample can be used for genotyping. Multiple analysis batches can be kicked off and run in parallel.

      Pharmacogenomics: A minimum of 24 samples is required for PGx CNV calling with 22 passing QC. Passing QC is defined as Log R Dev < 0.2. 96 samples are recommended for the most accurate CNV results. Multiple analysis batches can be kicked off and run in parallel.

    4. Which PGx CNVs and star alleles are available? Refer to the DRAGEN Array release notes.

    5. Where can I find demo data? Demo data is available in BaseSpace under the “Demo Data” section. All array data starts with “iScan:” and includes the name of the type of analysis. Supported types of analysis can be found in the Applications section.

    6. Where can I find assay QC metrics? DRAGEN Array currently provides functional QC metrics such as call rate and Log R Dev. See Support and Additional Resources for instructions on how to evaluate assay QC using GenomeStudio.
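
The PGx batch requirement in question 3 (at least 24 samples, at least 22 passing QC, where passing means Log R Dev < 0.2) can be checked before kicking off an analysis. The sketch below assumes you already have a Log R Dev value per sample; the function name and input structure are hypothetical and not part of DRAGEN Array.

```python
import math

MIN_SAMPLES = 24       # minimum batch size for PGx CNV calling (from the FAQ)
MIN_PASSING = 22       # minimum number of samples that must pass QC
MAX_LOG_R_DEV = 0.2    # passing QC is defined as Log R Dev < 0.2


def pgx_batch_status(log_r_dev_by_sample):
    """Return (eligible, n_total, n_passing) for a candidate PGx CNV batch.

    log_r_dev_by_sample: dict of sample ID -> Log R Dev (hypothetical input).
    NaN values are counted as not passing.
    """
    n_total = len(log_r_dev_by_sample)
    n_passing = sum(
        1
        for value in log_r_dev_by_sample.values()
        if not math.isnan(value) and value < MAX_LOG_R_DEV
    )
    eligible = n_total >= MIN_SAMPLES and n_passing >= MIN_PASSING
    return eligible, n_total, n_passing


if __name__ == "__main__":
    batch = {f"sample_{i:02d}": 0.1 for i in range(22)}
    batch.update({"sample_22": 0.3, "sample_23": 0.25})
    print(pgx_batch_status(batch))
```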

    Document Revision History

    The version history for DRAGEN Array product documentation:

    Version
    Date
    Description of Change

    DRAGEN Array Genotyping Cloud v1.0.0 Release Notes

    RELEASE DATE

    March 2024

    RELEASE HIGHLIGHTS

    DRAGEN Array v1.3.0 + Emedgene V100.39.0 Release Notes

    RELEASE DATE

    October 2025

    RELEASE HIGHLIGHTS

  • Pinpoint areas of failure including bisulfite conversion, staining, hybridization, etc. to identify assay steps in need of troubleshooting

  • Quantitative values for each control removing ambiguity with manual interpretation

  • Data summary plots with information on passing p-value detection and principal component analysis of beta values

  • Provides detection p-value, beta-values and m-values for each CG site per sample to use in downstream analysis

  • Ability to genotype and produce related reports for human and non-human arrays in the cloud.

  • Configurable interfaces in BaseSpace that allow for flexibility and easy kickoff.

    NEW FEATURES IN DETAIL

    • SNV VCF File

    • Final Report

    • Locus Summary

    KNOWN ISSUES

    • Some multi-nucleotide variant (MNV) designs reverse complement the "Allele1/2 Top" fields in the Final Report

    KNOWN LIMITATIONS

    • Genotyping only works on diploid organisms at this time. Polyploid genotyping is not currently supported.

  • Release of DRAGEN Array – Cytogenetics analysis + Emedgene interpretation 1.3.0

  • VCFs generated on Windows now compatible with Emedgene

    NEW FEATURES IN DETAIL

    • See new features for DRAGEN Array – Cytogenetics analysis in the 1.3.0 release notes.

    • See Emedgene V100.39.0 new features in the V100.39.0 release notes.

    KNOWN ISSUES

    • See known issues for DRAGEN Array – Cytogenetics analysis in the 1.3.0 release notes.

    • See known issues for Emedgene in the V100.39.0 release notes.

    KNOWN LIMITATIONS

    • See known limitations for DRAGEN Array – Cytogenetics analysis in the 1.3.0 release notes.

    • See Emedgene V100.39.0 limitations in the V100.39.0 release notes.

    • See the Prerequisites section in the cloud setup guide for detailed setup instructions. The following limitation applies if these prerequisites are not met:

      • The "DRAGEN Array - Cytogenetics analysis + Emedgene interpretation" analysis type is available to all users regardless of Emedgene (EMG) subscription status or SNS notification settings. The software does not enforce an EMG subscription in the workgroup. Without an EMG subscription and SNS configuration, the analysis will start and run as the "DRAGEN Array – Cytogenetics analysis" type; however, "Automatic Case Creation on EMG" will not occur.

    Applications
    release notes
    Applications
    Support and Additional Resources

    Version   Date             Description of Change
    01        December 2023    Initial release.
    02        March 2024       Added details for DRAGEN Array v1.0.0 cloud genotype pipeline release.
    03        May 2024         Added details for DRAGEN Array methylation QC pipeline v1.0.0 release. Error correction in the CNV VCF example (CN=4 to CN=5).
    04        September 2024   DRAGEN Array v1.1.0 release.
    05        February 2025    DRAGEN Array v1.2.0 release.
    06        February 2025    Updated DRAGEN Array v1.2.0 release notes.
    07        June 2025        Updated DRAGEN Array v1.2.0 release notes: added gtc-to-bedgraph LRR smoothing bug. "Bedgraph Smoothing window size" disabled in cloud interface. Added details for DRAGEN Array - Cytogenetics analysis + Emedgene interpretation pipeline 1.2.0 release and corresponding release notes.
    08        August 2025      DRAGEN Array v1.3.0 release. Rename of "CNV and LOH Calling" to "Cytogenetics analysis".
    09        October 2025     DRAGEN Array - Cytogenetics analysis + Emedgene interpretation cloud pipeline v1.3.0 release.

    DRAGEN Array v1.2.0 Emedgene V38.0 Automatic Case Creation Release Notes

    RELEASE DATE

    June 2025

    RELEASE HIGHLIGHTS

    • Automatic case creation in Emedgene (EMG) following the successful completion of a DRAGEN Array - Cytogenetics analysis + Emedgene interpretation analysis from BaseSpace (powered by ICA).

    NEW FEATURES IN DETAIL

    • See existing features for DRAGEN Array Cytogenetics analysis in the 1.2.0 release notes.

    • For more details on the EMGv38 features, see these release notes.

    KNOWN ISSUES

    • See existing issues for DRAGEN Array Cytogenetics analysis in the 1.2.0 release notes.

    • For more details on the EMGv38 known issues, see these release notes.

    KNOWN LIMITATIONS

    • See the Prerequisites section in the cloud setup page for detailed guidance on how to set up this analysis. The following limitation applies if these prerequisites are not met:

      • The "DRAGEN Array - Cytogenetics analysis + Emedgene interpretation" analysis type is available to all users regardless of Emedgene (EMG) subscription status or SNS notification settings. The software does not enforce an EMG subscription in the workgroup. Without an EMG subscription and SNS configuration, the analysis will start and run as the "DRAGEN Array – Cytogenetics analysis" type; however, "Automatic Case Creation on EMG" will not occur.

    DRAGEN Array v1.2.0 Release Notes

    RELEASE DATE

    February 2025

    RELEASE HIGHLIGHTS

    DRAGEN Array v1.1.0 Release Notes

    RELEASE DATE

    September 2024

    RELEASE HIGHLIGHTS

    Support and Additional Resources

    Technical Support

    For support, questions, and feedback on DRAGEN Array, please contact Illumina Tech Support at .

    Additional Resources

    PGx CNV Coverage

    Copy number variation can be detected for genes and regions listed below. The chromosome locations are GRCh38 based.

    Gene      Region Name       Chromosome          Start       End
    UGT2B17   UGT2B17           4                   68537222    68568499
    CYP2E1    CYP2E1            10                  133527374   133539096
    SULT1A1   SULT1A1           16                  28603587    28613544
    CYP2A6    CYP2A6.intron.7   19                  40844791    40845293
    CYP2A6    CYP2A6.exon.1     19                  40850267    40850414
    CYP2D6    CYP2D6.exon.9     22                  42126498    42126752
    CYP2D6    CYP2D6.intron.2   22                  42129188    42129734
    CYP2D6    CYP2D6.p5         22                  42130886    42131379
    GSTT1     GSTT1             22_KI270879v1_alt   270316      278477
    GSTM1     GSTM1             1                   109687842   109693526

  • Whole-genome copy number and loss of heterozygosity (LOH) calling, with VCF output format, for any human genotyping array.

  • B-allele frequency bedgraph output file to power informative CNV visualizations.

  • Additional outputs including ISCN and cytoband nomenclature to support cytogenetics applications.

    NEW FEATURES IN DETAIL

    • Cytogenetics Calling and VCF Output

      • Ability to obtain output files for any human genotyping array. Detection abilities vary by array probe density and spacing.

      • Detects copy number up to 4+.

      • Provides Phred scaled quality score to assess the event quality.

      • Addition of mosaic tagging to detect mosaic deletions and duplications.

      • Three arrays tested for performance, including:

        • Infinium Global Diversity Array with Cytogenetics-8

        • Infinium Global Screening Array with Cytogenetics-24

        • Infinium CytoSNP-850K BeadChip using the iScan System

      • Ability to adjust minimum size and probe number for copy number and LOH event calling

    • BAF and LRR Bedgraph files

      • Additional bedgraph file output for B-allele frequency (BAF) for use in visualization. Updated file extensions to differentiate BAF.bedgraph and LRR.bedgraph files.

      • Added a smoothing parameter to the genotype gtc-to-bedgraph command for LRR.bedgraph (log R ratio bedgraph file) generation for improved visualization.

    • Cytogenetic annotation and JSON Output

      • Provides summary statistics per sample and per CNV/LOH event. Includes gene count and gene names within each event based on the RefSeq database.

      • Annotates each event using International System for Human Cytogenomic Nomenclature (ISCN) 2020 and cytoband nomenclature based on Ensembl database.

    • Pharmacogenomics

      • Added root command pgx for grouping PGx copy number and star allele calling.

      • Fixed issue causing pgx star-allele annotate command to fail mid-analysis from version 1.1.

    KNOWN ISSUES

    • If a sample's sex estimate is called as unknown in the genotyping module, the cytogenetic caller will assume the sample is male. Consequently, detection results on sex chromosomes could be inaccurate if the sample is actually female.

    • ISCN annotations in the cytogenetic annotation JSON output file are only provided for variants greater than 1 Kb in length. This is often cited as a minimum size limit used to define copy number variants.

    • ISCN annotations are not provided for LOH variants in the cytogenetic annotation JSON output file.

    • Centromere regions typically have low sequence complexity and are prone to artifacts. As a result, cytogenetic calling results in these regions are likely to be false positives.

    • The cyto annotate subcommand produces extraneous logs (e.g., No credential is provided) that can be safely ignored.

    • During cyto call, there is a log for the CytoPlatform currently hardcoded to LCG regardless of the product used. This has no bearing on the underlying algorithm and is just what is reported in the log. It can be safely ignored.

    • A non-default value of the --smoothing parameter for the genotype gtc-to-bedgraph command triggers a bug that writes wrong values to the Log R Ratio (LRR) bedgraph. Use the default (0), which produces a valid LRR bedgraph with raw signal for visualization purposes. The --smoothing parameter will be disabled in the next release of DRAGEN Array.

    • The cyto call command may throw an overflow error in very rare cases when no variants are detected in noisy or low-quality samples. Contact [email protected] if you encounter this issue.

    • The minimum deletion/LOH/duplication thresholds shown in the cyto annotation JSON may be shown in the wrong units when set higher than the cyto calling thresholds.

    • Cyto CNV/LOH variants with quality scores of 0 seen in the cyto call VCF files cannot be passed into the annotation output json files.

    • CYP2A6 *1 definition incorrectly includes NC_000019.10:g.40848264_40848265delinsT.

    • DRAGEN Array – Cytogenetics Calling and DRAGEN Array - Cytogenetics analysis + Emedgene interpretation cloud analyses may fail around 200 samples in one batch due to high memory usage. Recommended workaround is to run smaller batches.

    • Samples with a reference allele for the ABCG2 gene will have missing phenotype annotations when running the pgx star-allele annotate command.
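
The workaround for the roughly 200-sample cloud batch limit listed above is to run smaller batches. A minimal sketch of splitting a sample list follows. The function name and the default batch size of 96 are illustrative assumptions; the source only says to run smaller batches.

```python
def split_into_batches(sample_ids, max_batch_size=96):
    """Split sample IDs into consecutive batches of at most max_batch_size.

    Assumption: 96 is an arbitrary illustrative default, chosen to stay
    well below the ~200-sample range where analyses may fail.
    """
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    return [
        sample_ids[start:start + max_batch_size]
        for start in range(0, len(sample_ids), max_batch_size)
    ]


if __name__ == "__main__":
    ids = [f"sample_{i:03d}" for i in range(200)]
    print([len(batch) for batch in split_into_batches(ids)])
```

Each resulting batch can then be kicked off as its own analysis. As noted in the FAQ, multiple analysis batches can run in parallel.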

    KNOWN LIMITATIONS

    • DRAGEN Array Cytogenetics analysis is intended for constitutional samples only; oncology samples are not supported at this time.

    • DRAGEN Array Cytogenetics analysis was only validated for specific array platforms (Infinium Global Diversity Array with Cytogenetics-8, Infinium Global Screening Array with Cytogenetics-24, Infinium CytoSNP-850K BeadChip using the iScan System).

    • DRAGEN Array Cytogenetics analysis may call large events that are broken into smaller pieces and require visual confirmation.

    • DRAGEN Array Cytogenetics analysis does not produce mosaic fraction estimation or mosaic ISCN notation at this time.

    • When using CytoSNP-850Kv1-4_iScan_B, GSACyto-24v1_20044998_C, or GDACyto-8v1-0_20047166_E manifests, DRAGEN Array Cytogenetics analysis will be unable to call events or visualize probes in the PAR (pseudo-autosomal regions). Please reach out to [email protected] for additional details.

    • GT is hardcoded to homozygous alt (1/1) for cyto VCF entries.

    • IDATs originating from NextSeq550 not tested.

  • New EX PGx beadchips enabled for PGx analysis

  • Increased coverage of high priority PGx genes

  • Custom optimized .egt files accepted in PGx analysis

  • Up-to-date database reflecting latest versions of public PGx resources

  • DPWG guidelines now available for metabolizer status calling on cloud analysis

    NEW FEATURES IN DETAIL

    • DRAGEN Array supports multiple PGx products

      • Two new EX PGx beadchips enabled through genotyping, PGx CNV calling, and star allele annotation

        • Infinium Global Screening Array with Enhanced PGx-48 v4.0 Kit

        • Infinium Global Clinical Research Array with Enhanced PGx-24 v1.0 Kit

      • In total, 3 PGx products are supported: GDA-ePGx, GSAv4-ePGx, GCRA-ePGx. See the Product & Analysis Compatibility table for more details.

      • Increased coverage of high priority PGx genes

      • Star allele annotation now covers CYP2E1, CYP1A2, ABCG2, CYP2C8, HMGCR, UGT1A4, UGT2B15, F13A1, and HLA-B*15:02

      • CNV calling now covers SULT1A1

      • Extended bi-allelic PGx variants from source databases to multi-allelic variants based on the designs in the supported PGx products.

      • See PGx Star Allele Coverage and PGx CNV Coverage for the full coverage lists.

    • Allows flexibility for GTCs generated with a custom cluster file (.egt) to be used with the commercial CN model file (.dat). This alleviates the burden to retrain the CN model file.

      • The cluster file is a required input for the genotype call command in DRAGEN Array. The CN (Copy Number) model file is a required input to the copy-number call command to enable accurate copy number calling for pharmacogenomics. Custom cluster files and CN model files may be required for optimal genotyping and PGx performance. See section Optimizing cluster files and copy number models for additional details.

    • Database revision reflecting updates.

    • Standardization of star allele JSON output file

      • Renamed databaseSources to phenotypeDatabaseSources and starAlleleDatabaseSources

      • Renamed Phenotype to PhenotypeDatabaseAnnotation

    • Updated VCF tabix indexing, improving performance and disk usage for SNV VCF.

    KNOWN ISSUES

    • Some simple variants have REF and ALT delimited by _ instead of > in the star_alleles.csv and metabolizer status JSON files (e.g., "ryr1.38577931a_c" instead of "ryr1.38577931a>c").

    • Some multi-nucleotide variant (MNV) designs reverse complement the "Allele1/2 Top" fields in the Final Report.

    • Occasional star-allele solution score discordance between Linux and Windows OS, with concordant solution ranking.

    • Rare intermittent memory issues during star allele calling. Example error message: The model has been changed since the solution was last computed. To work around the issue, restart star allele calling or run it on a machine with more memory.

    • The new license server (license.dragen.illumina.com) will not work (i.e., returns "No valid licenses found.") for local star allele calling. Users should continue to point to license.edicogenome.com.

    • Star allele annotation can fail mid-analysis in rare circumstances when a particular allele is unknown (e.g. for CYP2E1). The observed cases have all been mis-calls for CYP2E1 due to cluster drift. See for more details.
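
For downstream scripts affected by the first known issue above (REF and ALT delimited by _ instead of >), the names can be normalized after the fact. This is a sketch under assumptions: the regular expression only targets names that end in digits followed by bases, an underscore, and bases, as in the example from the release notes. Other naming patterns in the output files are not covered by the source and are left untouched.

```python
import re

# Matches a trailing "<bases>_<bases>" that directly follows a digit,
# e.g. the "a_c" in "ryr1.38577931a_c".
_UNDERSCORE_SUBSTITUTION = re.compile(r"(?<=\d)([acgtACGT]+)_([acgtACGT]+)$")


def normalize_variant_name(name):
    """Rewrite 'gene.<pos><ref>_<alt>' as 'gene.<pos><ref>><alt>'.

    Names that do not match the pattern are returned unchanged.
    """
    return _UNDERSCORE_SUBSTITUTION.sub(r"\1>\2", name)


if __name__ == "__main__":
    print(normalize_variant_name("ryr1.38577931a_c"))
```

Range-style HGVS names such as NC_000019.10:g.40848264_40848265delinsT are not modified, because no bases sit between the position and the underscore.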

    KNOWN LIMITATIONS

    • Star allele calling does not support novel alleles; it supports only those defined in the PharmVar and PharmGKB databases.

    • CYP2D6 non-*36 star alleles with exon 9 conversion, such as *83, are reported as *36 with *83 as an underlying allele.

    • Genotyping only supports diploid organisms. Polyploid genotyping is currently not supported.

    • DRAGEN Array was only validated for, and is intended to be used with, commercial PGx BeadChips with specified manifests (see table above). PGx star allele annotation is not backwards compatible with the v1.0 manifest version; e.g., GDA_PGx-8v1-0_20042614_E2 is supported in DRAGEN Array v1.0, while GDA_PGx-8v1-0_20042614_G2 is supported in DRAGEN Array v1.1.

    • Command line options unsquash-duplicates and filter-loci for gtc-to-vcf conversion should not be used when star allele calling is desired. In addition, VCFs must be gzipped and tabix indexed (the default for gtc-to-vcf) to be used in star allele calling.

    Resource
    Description

    Product features and benefits and allows product ordering.

    Support site for DRAGEN Array which includes installers and product documentation.

    Illumina Software Resources article with technical details on DRAGEN Array v1.0 Methylation QC.

    Illumina Software Resources article with technical details on DRAGEN Array v1.0 PGx analysis.

    Lab setup and maintenance information for Infinium assays.

    Instructions for evaluating assay controls using GenomeStudio

    [email protected]
    Infinium CytoSNP-850K BeadChip using the iScan System
    Bedgraph files are compatible with IGV (Integrative Genomics Viewer) for visualization purposes.
    genotype gtc-to-bedgraph
    Combined missingVariants and allMissingVariants to missingVariantSites
  • JSONized supportingVariants and missingVariants at the gene and candidate solution allele levels

  • Removed redundant info in the Alleles fields

  • Product & Analysis Compatibility table
    PGx Star Allele Coverage
    PGx CNV Coverage
    PGx Allele Definitions and PGx Guidelines
    Optimizing cluster files

    GenomeStudio Genotyping: Evaluating Infinium Assay Controls

    Video instructions for evaluating assay controls using GenomeStudio

    Infinium Assay Consumables & Equipment List

    List of consumables and equipment used in Infinium assays.

    iScan System Product Documentation

    Instructions for operating and maintaining the iScan System.

    Polygenic Risk Score – Predict

    Instructions for using the Polygenic Risk Score – Predict Module.

    Illumina Connected Analytics

    Instructions for using the hosted environment Illumina Connected Analytics.

    BaseSpace Sequence Hub

    Instructions for using the hosted environment BaseSpace Sequence Hub.

    Emedgene

    Instructions for using Emedgene software

    DRAGEN Array Webpage
    DRAGEN Array Support Site
    DRAGEN Array Methylation QC analysis
    DRAGEN Array PGx Analysis
    Infinium Lab Setup and Best Practices
    Evaluation of Infinium Genotyping Assay Controls using GenomeStudio

    DRAGEN Array v1.3.0 Release Notes

    RELEASE DATE

    August 2025

    RELEASE HIGHLIGHTS

    • Provides mosaic fraction estimation for mosaic events.

    • Improved accuracy of sex chromosome calling, including pseudo-autosomal regions (PAR).

    • New QC metrics available in cytogenetics JSON output.

    NEW FEATURES IN DETAIL

    • Genotyping & Core

      • GenomeStudio backwards compatible samplesheet support and related deprecation of separate IDAT and GTC samplesheets.

      • User-defined data from the samplesheet will get passed to gt_sample_summary files during genotyping.

    KNOWN ISSUES

    • The does not handle empty columns. For example this samplesheet:

    Will throw the following error: System.ArgumentException : Duplicate column found. Column names are case-insensitive. Please remove or rename the column from the samplesheet and re-process. And this example:

    Will produce an empty column/field in the , e.g.,

    KNOWN LIMITATIONS

    • If the genotyping module reports an unknown sex and the cytogenetic caller cannot resolve it, the caller assumes the sample is male. As a result, sex chromosome detection may be inaccurate if the sample is actually female. This behavior is not currently output in the log.

    • ISCN annotations in the cytogenetic annotation JSON output file are only provided for variants greater than 1 Kb in length. This is often cited as a minimum size limit used to define copy number variants.

    • Centromere regions typically have low sequence complexity and are prone to artifacts. As a result, cytogenetic calling results in these regions are likely to be false positives.

    Field name
    Number of differences
    • Note: All overall solutions tested for comparison were found to be concordant.

    • DRAGEN Array v1.3 is not compatible with Emedgene (EMG) v38: it does not support automatic case creation, and v1.3 output cannot be manually uploaded into EMG. Users should continue to use DRAGEN Array - Cytogenetics analysis + Emedgene interpretation 1.2.0 for DRAGEN Array + EMG cyto analyses.

    DRAGEN Array v1.0.0 Release Notes

    RELEASE DATE

    December 2023

    RELEASE HIGHLIGHTS

    • Improved star allele calling accuracy for Global Diversity Array with enhanced PGx (GDA-ePGx) BeadChips.

    • Reports star allele calls with quality scores for greater transparency and confidence.

    • Provides missing variant reporting to improve data quality.

    NEW FEATURES IN DETAIL

    • Star Allele Calling

      • Star allele calling for genes listed in

        • For in-silico datasets, call rate ≥99%, diplotyping accuracy ≥ 90%

    KNOWN ISSUES

    • Corrupt or invalid GTC files will abort with an error instead of skipping. The corrupt or invalid GTC files will need to be removed before proceeding.

    • In the gtc-to-vcf subcommand a mismatch between BPM and CSV manifests will not cause the command to abort with an error. The mismatch will need to be addressed before proceeding.

    • For gtc-to-vcf, multi-allelic variants designed with multiple assays might not always collapse into one variant correctly and be reported as two separate variants instead. Some indel variants are missing from SNV VCF due to mapping issue between the designed indels and the reference genome.

    There is a workaround to disable globalization and produce valid GTC files:

    1. Locate the dragena.runtimeconfig.json file inside the installation directory of DRAGEN Array (i.e., where the .zip or .tar.gz file was downloaded and unzipped).

    2. Add the key System.Globalization.Invariant to that file and set its value to true (i.e., step #2 here: https://github.com/dotnet/corefx/blob/master/Documentation/architecture/globalization-invariant-mode.md#enabling-the-invariant-mode).
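
The two steps above can be scripted. The sketch below is a hedged illustration: it assumes the key belongs under runtimeOptions.configProperties, which is where the linked .NET document places it in a runtimeconfig.json file. The function name is hypothetical, and the path to dragena.runtimeconfig.json must be supplied by you. Consider backing up the file before editing it.

```python
import json
from pathlib import Path


def enable_invariant_globalization(config_path):
    """Set System.Globalization.Invariant to true in a runtimeconfig.json file.

    Assumption: the key is placed under runtimeOptions.configProperties,
    following the linked .NET globalization-invariant-mode document.
    Existing settings in the file are preserved.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    properties = config.setdefault("runtimeOptions", {}).setdefault(
        "configProperties", {}
    )
    properties["System.Globalization.Invariant"] = True
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, calling `enable_invariant_globalization("dragena.runtimeconfig.json")` from inside the installation directory would update the file in place.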

    KNOWN LIMITATIONS

    • PGx CNV calling and star allele calling and annotation were only validated and intended to be used with GDA_PGx_E2 product files.

    • Subcommands “unsquash-duplicates” and “filter loci” should not be used during gtc-to-vcf conversion when star allele calling is desired.

    • Only CPIC guidelines are available for star allele annotation (metabolizer status calling) for the cloud offering. For local, CPIC and DPWG are available.

    DRAGEN Array Methylation QC

    Methylation QC Threshold Adjustment

    When using the DRAGEN Array – Methylation – QC cloud analysis type, additional customization options appear after product files are selected within Configuration Settings. Adjustments to these thresholds are saved as part of the Configuration Setting. Thresholds can be adjusted based on study objectives. Adjusting thresholds affects the pass or fail status of samples in the output files.

    Illumina recommends thresholds for MethylationEPIC v1 & v2 and Methylation Screening Array (MSA). Users may use these thresholds as a starting point when defining thresholds for their custom or semi-custom BeadChip or other Infinium Methylation arrays. Further tuning may be required based on the BeadChip used, laboratory conditions, iScan settings, bisulfite conversion methods, FFPE sample type, etc. A dataset deemed acceptable to the user based on proportion probes passing can be used for these additional threshold adjustments.

    DRAGEN Array Cytogenetics Analysis

    DRAGEN Array - Cytogenetics Analysis

    Cytogenetics Threshold Adjustment

    When using

    Samples that fail IDAT->GTC conversion during genotype call will be added to the gt_sample_summary instead of skipped. For these samples, the Autosomal Call Rate and Call Rate will be set to 0 while the Log R Ratio Std Dev and TGA_Ctrl_5716 Norm R (when applicable for PGx products) are set to NaN.
  • Cytogenetics

    • Fixed an issue causing cyto calling to crash due to overflow errors for noisy samples in v1.2.

    • Fixed a memory issue in v1.2 that limited the number of samples able to run to about 200.

    • Improved accuracy of length normalized median copy number calculation by removing lower limit on included variant size (1 Kbp).

    • Added reporting of mosaic fraction for mosaic events.

    • Added a method for promoting mosaic events above user-defined mosaic fraction.

    • Reduced verbosity in STDOUT messages produced by annotate command.

    • Added sample-level Median Log R Dev statistic to annotate JSON output.

    • Added chromosome-level QC metrics to the annotation JSON output.

    • Added an event-level QC metric effective size to the JSON output.

    • Variants are filtered by their effective size in the cyto call command. In the cyto annotate command, they are filtered by the raw size.

    • Fixed a bug whereby the minimum deletion/LOH/duplication thresholds were shown in the wrong units in the annotation JSON, when set higher than the calling thresholds.

    • Fixed a bug that prevented cyto CNV variants with quality scores of 0 from appearing in the output json files.

    • The cytogenetic caller now attempts to resolve sample sex if previously classified as unknown by the upstream genotyping module, enabling more accurate results. A log message is generated when sex is resolved, e.g., "Sample XXX sex updated from Unknown to Male."

  • Pharmacogenomics

    • Fixed a bug in the pgx star-allele annotate command: samples with a reference allele for the ABCG2 gene are now annotated properly using the default annotation "Normal" for reference alleles.

    • Corrected the CYP2A6 *1 definition. Removed the NC_000019.10:g.40848264_40848265delinsT variant that was incorrectly added to the CYP2A6 *1 definition.

  • ISCN annotations are not provided for LOH variants in the cytogenetic annotation JSON output file.

  • DRAGEN Array Cytogenetics analysis is intended for constitutional samples only; oncology samples are not supported at this time.

  • DRAGEN Array Cytogenetics analysis is validated only for specific array platforms: Infinium Global Diversity Array with Cytogenetics-8, Infinium Global Screening Array with Cytogenetics-24, and Infinium CytoSNP-850K BeadChip (iScan System).

    • Note: DRAGEN Array can process IDAT files from the NextSeq550 for cytogenetic analysis, but this setup hasn’t been formally validated. If you're interested in trying it, check out the demo data in the ‘Demo Data’ section on BaseSpace, which was generated using the iScan system.

  • DRAGEN Array Cytogenetics analysis may call large events that are broken into smaller pieces and require visual confirmation.

  • GT is hardcoded to homozygous alt (1/1) for cyto VCF entries.

  • Tabix indexing from DRAGEN Array is not exactly the same as bcftools index --tbi. For instance, if you run bcftools index --stats in.vcf.gz or bcftools index --nrecords in.vcf.gz with certain versions of bcftools, you may get the following error: index of in.snv.vcf.gz does not contain any count metadata. Please re-index with a newer version of bcftools or tabix. If these tools are critical to your bioinformatics pipelines, a workaround is to unzip and re-index the DRAGEN Array VCFs using bcftools or tabix. Note, however, that these index files may not work in downstream VCF-based DRAGEN Array commands like pgx star-allele call. Please use DRAGEN Array end-to-end for analysis flows like the ones detailed in the Quick Start guide.

  • There can be some minor differences when running pgx star-allele call on Windows vs. Linux. During verification testing, out of 1576 samples, we noticed the following discordance:

    • Collapsed Star-Alleles: 2

    • Missing/Masked Core Variants: 1

    • Solution Long: 1

    • Supporting Variants: 2

    samplesheet
    Genotype Sample Summary files
    Cytogenetics VCF Files

    Includes reporting of the hybrid star alleles and allelic specific copy number

  • Provides quality score that estimates confidence in the star allele call as an additional quality metric

  • Star allele call rate increased through more robust error tolerance and missing data tolerance

    • Supporting variants and missing variants are listed and can be further reviewed

    • Quality score indicates confidence in result considering the missing data

  • Reports alternative ranked PGx star allele solutions

    • Allows an alternative to be investigated which may be desirable for samples with low confidence calls

    • Provides quality score (negative log likelihood) for alternative solutions

  • Function annotations for PGx genes listed in section PGx Allele Definitions and PGx Guidelines

    • Metabolizer and function annotations are supported for two sets of guidelines from CPIC and DPWG respectively

    • Activity scores are provided for CYP2C9, CYP2D6, and DPYD

  • CNV VCF

    • CNV coverage for genes listed in PGx CNVs Coverage

    • Compressed and indexed files for size reduction and faster reading

    • Updated VCF header description to indicate copy number of 5 may be reported by the software

    • Revised filter field delimiter to comply with VCF 4.3 specification which allows VCF parsing software to parse the file successfully

  • Genotyping VCF

    • Compressed and indexed files for size reduction and faster reading

  • Manifest names greater than 80 characters will cause failure when converting IDATs to GTCs.

  • Symbolic links for VCFs are not supported as the inputs to the “star-allele call” subcommand.

  • The local Linux CLI and Cloud offering do not sort the star_alleles.csv and various fields in the metabolizer_status.json. The local Windows CLI does.

  • The new license server (license.dragen.illumina.com) will not work (i.e., returns "No valid licenses found.") for local star allele calling. Users should continue to point to license.edicogenome.com.

  • GTC files do not support non-ASCII characters. This is especially problematic when running DRAGEN Array local if operating system locale settings are not English-based (e.g., en-US) as internal datetime fields could write non-ASCII characters. This will result in the following error:

  • Re-generate the GTC using the genotype call subcommand.
    PGx Star Allele Coverage
    SentrixBarcode_A,SentrixPosition_A,,
    204753010023,R02C01,,
    SentrixBarcode_A,SentrixPosition_A,
    204753010023,R02C01,
    {
       "SentrixBarcode_A": "204753010023",
       "SentrixPosition_A": "R02C01",
       "Sample ID": "204753010023_R02C01",
       "Sample Name": "204753010023_R02C01",
       "Sample Folder": "/tmp",
       "Autosomal Call Rate": 0.99,
       "Call Rate": 0.99,
       "Log R Ratio Std Dev": 0.15,
       "Sex Estimate": "F",
       "": ""
    }
    fail:  ArrayAnalysis.CLI.App[0] 
            [07:17:07 6620]: System.IO.EndOfStreamException: Unable to read beyond the end of the stream. 
                 at System.IO.BinaryReader.ReadString() 
                 at ArrayAnalysis.Core.GtcFileLoader..ctor(String filePath) in /src/ArrayAnalysis.Core/GtcFileLoader.cs:line 161 
                 at ArrayAnalysis.Services.GtcFactory.CreateGtcFromSample(Sample sample, Boolean log) 
                 at ArrayAnalysis.Services.GtcToVcfService.Run(GtcToVcfInput input) 
                 at ArrayAnalysis.CLI.App.RunCliServiceAndReturnExitCode(BaseOptions opts) in /src/ArrayAnalysis.CLI/App.cs:line 110

    To customize thresholds, use the toggle to allow additional thresholds to be displayed and adjust as desired by typing in a numeric value or using the arrows to adjust up or down. Further detail of these thresholds including calculation method can be found in the Methylation Sample QC Summary Files section.

    The recommended thresholds are pre-set within the software for MethylationEPIC and Methylation Screening Array with the following values:

    Threshold
    Methylation Screening Array
    MethylationEPIC

    Restoration

    0

    0

    StainingGreen

    5

    5

    StainingRed

    5

    5

    ExtensionGreen

    5

    The first 21 rows in the tables correspond to the 21 control metrics used in the methylation sample QC. See section Methylation Sample QC Summary Files for details.

    DRAGEN Array Methylation QC and GenomeStudio Methylation Module Differences

    DRAGEN Array Methylation QC software provides automated methylation sample QC using assay control probes on the Infinium Methylation Arrays. Unlike the manual visual QC in GenomeStudio, DRAGEN Array utilizes 21 numerical metrics defined based on the control probes and uses standard thresholds to determine the pass/fail status of a sample. Unlike GenomeStudio, probe detection rate (proportion of probes passing at a given p-value threshold) is not utilized to determine sample pass/fail status in DRAGEN Array. For more information, see the High-throughput Infinium methylation array QC using DRAGEN Array Methylation QC software tech note.

    DRAGEN Array Methylation QC performs background normalization, dye bias correction, and detection p-value calculation differently in comparison to the GenomeStudio Methylation module, leading to differences in probe detection p-values and detection rates. For the GenomeStudio Methylation Module, non-cancer samples at standard DNA input typically have detection rate > 96%. The detection rates from DRAGEN Array Methylation QC are typically lower compared to GenomeStudio, because the detection p-value from DRAGEN Array is more stringent than that from the GenomeStudio Methylation Module. The table below shows example detection rates from the DRAGEN Array Methylation QC software from MSA (Methylation Screening Array) datasets.

    Dataset
    Min detection rate
    Mean detection rate
    Sample Count

    A

    86%

    93%

    220

    B

    61%

    83%

    951

    C

    63%

    85%

    Note that only samples passing QC are included and all samples are at or above 50 ng DNA input. The detection p-value threshold is 0.05.

    DRAGEN Array – Cytogenetics analysis or DRAGEN Array - Cytogenetics analysis + Emedgene interpretation cloud analysis types, additional customization options will appear after product files are selected within Configuration Settings. Adjustments to these thresholds will be saved as part of the Configuration Setting. Thresholds can be adjusted based on results objectives. Adjusting thresholds will impact the number of events called and thus the output in the VCF and JSON files.

    The recommended thresholds/settings are pre-set within the software for any new configurations:

    Threshold
    New Config
    Min Value
    Max Value

    GTC Output

    False

    N/A

    N/A

    SNV VCF Output

    False

    N/A

    N/A

    CNV minimum size (kb)

    0

    0

    DRAGEN Array - Cytogenetics analysis + Emedgene interpretation

    This analysis type integrates with Emedgene to display results in a user-friendly interface.

    Note: Only specific versions of Emedgene and DRAGEN Array are compatible with each other. For more details, see the compatibility table on the Emedgene help site.

    Prerequisites

    • You'll need an additional Emedgene subscription at the "Array", "Professional", or "Enterprise" tier. You can follow the Illumina Software Registration Guide to obtain that subscription.

    • To ensure proper integration with Emedgene (EMG), ICA notifications must be enabled for the specific ICA BSSH-managed project. EMG relies on these notifications to detect when an analysis has successfully completed. To configure SNS (Amazon Web Services Simple Notification Service) events in your ICA BSSH-managed project, follow these steps:

      • In the ICA portal, in the ICA BSSH-managed project (e.g. "BSSH Your Workgroup Name"), navigate to the Notifications section via the left-hand menu.

      • Click + Create, then select ICA Event.

      • Fill in the required fields as follows:

        • Event: Analysis Success

        • Type: SNS

    Create Subscription

    For more details on the prerequisites for this analysis, see the Automatic Case Creation from ICA section in the Emedgene User Guide. For more details on limitations for this analysis, see the release notes.

    5

    ExtensionRed

    5

    5

    HybridizationHighMedium

    1

    1

    HybridizationMediumLow

    1

    1

    TargetRemoval1

    1

    1

    TargetRemoval2

    1

    1

    BisulfiteConversion1Green

    1

    1

    BisulfiteConversion1BackgroundGreen

    0.5

    1

    BisulfiteConversion1Red

    1

    1

    BisulfiteConversion1BackgroundRed

    0.5

    1

    BisulfiteConversion2

    0.5

    1

    BisulfiteConversion2Background

    0.5

    1

    Specificity1Green

    1

    1

    Specificity1Red

    1

    1

    Specificity2

    1

    1

    Specificity2Background

    1

    1

    NonpolymorphicGreen

    2.5

    5

    NonpolymorphicRed

    3

    5

    BgCorrectionOffset

    3000

    3000

    PvalThreshold

    0.05

    0.05

    34

    D

    77%

    85%

    22

    Address: Provide the correct address based on your region (contact [email protected] if unsure).
  • Payload Version: v4

  • AWS Region: This will be auto-populated based on the provided address.

  • (Recommended) Click Send Test Message to verify the configuration.

  • Click Save to complete the setup.

  • 250000

    CNV minimum probes

    10

    0

    250000

    LOH minimum size (kb)

    3000

    0

    250000

    LOH minimum probes

    500

    0

    250000

    CNV Smoothing window size

    5

    0

    1000

    ICA Notifications Menu
    ICA Event

    PGx Allele Definitions and PGx Guidelines

    PGx Allele Definitions and PGx Guidelines

    DRAGEN Array star allele calling leverages the star allele definitions provided by PharmVar and PharmGKB. DRAGEN Array star allele phenotype annotation, using the “star-allele annotate” command, is achieved through direct lookup into public PGx guidelines CPIC or DPWG, which is selected by the user when running DRAGEN Array.

    See table below for details of the data sources.

    Data Source
    Version
    URL

    DRAGEN Array “star-allele annotate” command provides both metabolizer status and activity score annotations for genes covered by the CPIC and DPWG guidelines.

    Specifically, CPIC metabolizer/phenotype annotations are supported for CACNA1S, CYP2B6, CYP2C19, CYP2C9, CYP2D6, CYP3A5, DPYD, G6PD, MT-RNR1, NUDT15, RYR1, SLCO1B1, TPMT, UGT1A1, CFTR, IFNL3/IFNL4 and VKORC1; of these, activity scores are supported for CYP2C9, CYP2D6, and DPYD. DPWG metabolizer/phenotype annotations are supported for CYP1A2, CYP2B6, CYP2C19, CYP2C9, CYP2D6, CYP3A4, CYP3A5, DPYD, NUDT15, SLCO1B1, TPMT, UGT1A1, VKORC1 and F5; of these, activity scores are supported for CYP2D6 and DPYD.

    Extended Multi-allelic variants based on the designs in the supported PGx products

    • DRAGEN Array PGx extends any single allele variant definitions obtained from PharmVar or PharmGKB that have multiple alleles in Illumina's product files to include all alleles of the Multi Allelic Variant (MAV). The table below shows the MAVs that were extended in the DRAGEN Array Database to cover all alleles for that MAV that are in the product files. Allele Name describes the allele that was added to the database.

    Gene Symbol
    Allele Name
    Hgvs

    Exceptions to Star Allele Definitions

    G6PD

    When the reference genome changes, the definition of a star allele sometimes needs to be updated accordingly.

    Mediterranean Haplotype and Mediterranean, Dallas, Panama, Sassari, Cagliari, Birmingham are defined by two variants rs5030868 and rs2230037. In genome build GRCh37, Mediterranean Haplotype is defined by rs2230037 G>A and rs5030868 G>A, and Mediterranean, Dallas, Panama, Sassari, Cagliari, Birmingham is defined by rs5030868 G>A, with rs2230037 reference allele G.

    In genome build GRCh38, Mediterranean Haplotype is defined by rs5030868 G>A, with rs2230037 reference allele A, and Mediterranean, Dallas, Panama, Sassari, Cagliari, Birmingham is defined by rs2230037 A>G and rs5030868 G>A.

    Variant rs2230037 is ignored in all other G6PD alleles except in the two Mediterranean alleles.
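
    The build-dependent definitions above can be sketched as a small lookup. This Python example is illustrative only: the dictionary layout and the function name g6pd_mediterranean_allele are assumptions made for this sketch and do not describe how DRAGEN Array stores or evaluates its definitions. It takes the bases observed at the two variants on one haplotype, expresses them relative to the reference of the chosen build, and looks up the allele name.

```python
MED_HAPLOTYPE = "Mediterranean Haplotype"
MED_GROUP = "Mediterranean, Dallas, Panama, Sassari, Cagliari, Birmingham"

# Reference base at each variant per genome build, as stated in the text.
REFERENCE = {
    "GRCh37": {"rs2230037": "G", "rs5030868": "G"},
    "GRCh38": {"rs2230037": "A", "rs5030868": "G"},
}

# Allele definitions expressed as variants relative to each build's reference.
DEFINITIONS = {
    "GRCh37": {
        frozenset({"rs2230037 G>A", "rs5030868 G>A"}): MED_HAPLOTYPE,
        frozenset({"rs5030868 G>A"}): MED_GROUP,
    },
    "GRCh38": {
        frozenset({"rs5030868 G>A"}): MED_HAPLOTYPE,
        frozenset({"rs2230037 A>G", "rs5030868 G>A"}): MED_GROUP,
    },
}


def g6pd_mediterranean_allele(build, observed):
    """Return the allele name for bases observed on one haplotype, or None.

    `observed` maps rsID -> base. Hypothetical helper for illustration.
    """
    ref = REFERENCE[build]
    variants = frozenset(
        f"{rsid} {ref[rsid]}>{base}"
        for rsid, base in observed.items()
        if base != ref[rsid]
    )
    return DEFINITIONS[build].get(variants)


haplotype = {"rs2230037": "A", "rs5030868": "A"}
for build in ("GRCh37", "GRCh38"):
    print(build, "->", g6pd_mediterranean_allele(build, haplotype))
```

    The same observed haplotype resolves to the same allele name in both builds, even though the set of non-reference variants differs, which is why the definition has to follow the reference genome.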

    *0 Star Allele Definition

    A *0 allele refers to a full gene deletion of the analyzed gene, if there is no existing star allele name for the deletion allele from source databases, such as PharmVar and PharmGKB.

    Input Files

    The following section describes the input files required by DRAGEN Array. Product files (anything other than the IDATs) can be found on the support site.

    IDAT Files

    For each sample, a pair of raw intensity files (.idat) is generated from the iScan System or NextSeq550 (for select arrays). They provide intensities in the red and green channels for each probe on the Infinium array. More information on which arrays can be used with NextSeq550 can be found on the Illumina Knowledge page on NextSeq550.

    An IDAT file is identified by the BeadChip Barcode (12-digit unique Sentrix ID, e.g., 123456789101), BeadChip Position (row and column of the sample, e.g., R01C01), and Grn (Green) or Red for the specific channel.
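
    As a quick pre-flight check, the sketch below groups IDAT file names by sample and reports which samples have both channels. It assumes the file-name pattern <barcode>_<position>_<Grn|Red>.idat, which is a common convention for IDAT files but is not spelled out in this guide; the function name find_complete_samples is hypothetical.

```python
import re
from collections import defaultdict

# Assumed file-name pattern: <12-digit barcode>_<RxxCxx>_<Grn|Red>.idat
IDAT_NAME = re.compile(r"^(\d{12})_(R\d{2}C\d{2})_(Grn|Red)\.idat$")


def find_complete_samples(file_names):
    """Return sorted sample IDs (barcode_position) that have both channels."""
    channels = defaultdict(set)
    for name in file_names:
        match = IDAT_NAME.match(name)
        if match:
            barcode, position, channel = match.groups()
            channels[f"{barcode}_{position}"].add(channel)
    return sorted(s for s, seen in channels.items() if seen == {"Grn", "Red"})


names = [
    "123456789101_R01C01_Grn.idat",
    "123456789101_R01C01_Red.idat",
    "123456789101_R02C01_Grn.idat",  # Red channel missing for this sample
]
print(find_complete_samples(names))  # prints ['123456789101_R01C01']
```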

    https://www.pharmgkb.org/page/dpwgMapping

    CFTR.rs121908755

    rs121908755.G>T

    NC_000007.14:g.117587800G>T

    CFTR.rs121909005

    rs121909005.T>C

    NC_000007.14:g.117587801T>C

    CFTR.rs121909020

    rs121909020.G>C

    NC_000007.14:g.117611640G>C

    CFTR.rs150212784

    rs150212784.T>C

    NC_000007.14:g.117611595T>C

    CFTR.rs193922525

    rs193922525.G>C

    NC_000007.14:g.117664770G>C

    CFTR.rs267606723

    rs267606723.G>T

    NC_000007.14:g.117642451G>T

    CFTR.rs397508288

    rs397508288.A>C

    NC_000007.14:g.117590409A>C

    CFTR.rs397508759

    rs397508759.G>T

    NC_000007.14:g.117534363G>T

    CFTR.rs74551128

    rs74551128.C>T

    NC_000007.14:g.117548795C>T

    CFTR.rs75039782

    rs75039782.C>G

    NC_000007.14:g.117639961C>G

    CFTR.rs77834169

    rs77834169.C>A

    NC_000007.14:g.117530974C>A

    CFTR.rs77834169

    rs77834169.C>G

    NC_000007.14:g.117530974C>G

    CFTR.rs77932196

    rs77932196.G>C

    NC_000007.14:g.117540270G>C

    CFTR.rs77932196

    rs77932196.G>T

    NC_000007.14:g.117540270G>T

    CFTR.rs78655421

    rs78655421.G>C

    NC_000007.14:g.117530975G>C

    CFTR.rs78655421

    rs78655421.G>T

    NC_000007.14:g.117530975G>T

    COMT.rs13306278

    rs13306278.C>G

    NC_000022.11:g.19941504C>G

    DPYD.rs114096998

    rs114096998.2.G>C

    NC_000001.11:g.97078987G>C

    DPYD.rs140602333

    rs140602333.G>T

    NC_000001.11:g.97573919G>T

    DPYD.rs142619737

    rs142619737.C>G

    NC_000001.11:g.97515851C>G

    DPYD.rs143154602

    rs143154602.G>T

    NC_000001.11:g.97593289G>T

    DPYD.rs145548112

    rs145548112.C>A

    NC_000001.11:g.97306195C>A

    DPYD.rs190951787

    rs190951787.G>T

    NC_000001.11:g.97515889G>T

    DPYD.rs200687447

    rs200687447.2.C>A

    NC_000001.11:g.97193209C>A

    DPYD.rs3918289

    rs3918289.G>A

    NC_000001.11:g.97450059G>A

    DPYD.rs3918290

    rs3918290.C>G

    NC_000001.11:g.97450058C>G

    DPYD.rs6670886

    rs6670886.C>A

    NC_000001.11:g.97699506C>A

    DPYD.rs72549304

    rs72549304.G>C

    NC_000001.11:g.97549609G>C

    DPYD.rs72549304

    rs72549304.G>T

    NC_000001.11:g.97549609G>T

    DPYD.rs748620513

    rs748620513.C>A

    NC_000001.11:g.97573799C>A

    DPYD.rs748639205

    rs748639205.A>G

    NC_000001.11:g.97082415A>G

    DPYD.rs760663364

    rs760663364.G>C

    NC_000001.11:g.97515928G>C

    DPYD.rs777425216

    rs777425216.C>A

    NC_000001.11:g.97515815C>A

    RYR1.38499667G>A

    NC_000019.10:g.38499667G>T

    NC_000019.10:g.38499667G>T

    RYR1.rs118192116

    rs118192116.C>T

    NC_000019.10:g.38451850C>T

    RYR1.rs118192151

    rs118192151.G>C

    NC_000019.10:g.38584974G>C

    RYR1.rs118204423

    rs118204423.G>A

    NC_000019.10:g.38457539G>A

    RYR1.rs142474192

    rs142474192.G>T

    NC_000019.10:g.38443790G>T

    RYR1.rs143988412

    rs143988412.A>G

    NC_000019.10:g.38580066A>G

    RYR1.rs1801086

    rs1801086.G>T

    NC_000019.10:g.38446710G>T

    RYR1.rs186983396

    rs186983396.C>G

    NC_000019.10:g.38442434C>G

    RYR1.rs193922762

    rs193922762.C>A

    NC_000019.10:g.38448673C>A

    RYR1.rs193922767

    rs193922767.G>A

    NC_000019.10:g.38452996G>A

    RYR1.rs193922772

    rs193922772.G>A

    NC_000019.10:g.38457546G>A

    RYR1.rs193922826

    rs193922826.C>G

    NC_000019.10:g.38504319C>G

    RYR1.rs193922838

    rs193922838.G>A

    NC_000019.10:g.38529036G>A

    RYR1.rs193922842

    rs193922842.C>T

    NC_000019.10:g.38543821C>T

    RYR1.rs370634440

    rs370634440.G>T

    NC_000019.10:g.38463499G>T

    PharmVar

    6.1

    https://www.pharmvar.org

    PharmGKB

    Snapshot-2024.05.16

    https://www.pharmgkb.org/

    UGT Alleles Nomenclature

    2010.12.21

    https://www.pharmacogenomics.pha.ulaval.ca/ugt-alleles-nomenclature/

    Human Cytochrome P450 (CYP) Allele Nomenclature Database Legacy Content

    July 2024

    https://www.pharmvar.org/htdocs/archive/index_original.htm

    CPIC guidelines

    1.38.0

    https://cpicpgx.org/guidelines/

    https://github.com/cpicpgx/cpic-data/

    DPWG guidelines

    CACNA1S.rs1800559

    rs1800559.C>A

    NC_000001.11:g.201060815C>A

    CFTR.rs113993958

    rs113993958.G>A

    NC_000007.14:g.117530953G>A

    CFTR.rs113993958

    rs113993958.G>T

    NC_000007.14:g.117530953G>T

    CFTR.rs11971167

    rs11971167.G>T

    NC_000007.14:g.117642528G>T

    June 2023

    Manifest Files

    The CSV and BPM manifest files can be found on the Illumina Support Site for all commercial Infinium BeadChips or on MyIllumina for custom and semi-custom designs. DRAGEN Array only supports manifest files from the Illumina Support site. For instructions on obtaining manifest files from MyIllumina, see the Illumina Knowledge article How to access custom array product files (manifest and product definition files) in MyIllumina.

    The CSV manifest file (.csv) provides complementary data to the BPM manifest file in a human readable format. It is a required input to the genotype gtc-to-vcf command to enable VCF generation for insertion/deletion variants. gtc-to-vcf depends on the presence of accurate mapping information within the manifest, and may produce inaccurate results if the mapping information is incorrect. Mapping information follows the implicit dbSNP standard, where

    • Positions are reported with 1-based indexing.

    • Positions in the PAR are reported with mapping position to the X chromosome.

    • For an insertion relative to the reference, the position of the base immediately 5' to the insertion (on the plus strand) is given.

    • For a deletion relative to the reference, the position of the most 5' deleted base (on the plus strand) is given.
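
    To make the position convention above concrete, the sketch below derives the expected mapping position from a simple VCF-style record, in which an indel is anchored on the base immediately 5' of the event. The function name manifest_position is hypothetical, and this is not how gtc-to-vcf works internally; it only shows how the two coordinate conventions relate on the plus strand with 1-based positions.

```python
def manifest_position(vcf_pos, ref, alt):
    """Map a simple left-anchored VCF record to the mapping position convention.

    Only SNVs and indels that share their first (anchor) base are handled.
    """
    if len(ref) == 1 and len(alt) > 1 and alt[0] == ref:
        # Insertion: VCF POS is already the base immediately 5' of the insertion.
        return vcf_pos
    if len(alt) == 1 and len(ref) > 1 and ref[0] == alt:
        # Deletion: the most 5' deleted base directly follows the anchor base.
        return vcf_pos + 1
    if len(ref) == 1 and len(alt) == 1:
        # SNV: the position of the substituted base itself.
        return vcf_pos
    raise ValueError("only SNVs and simple anchored indels are handled")


print(manifest_position(100, "A", "ATG"))  # insertion after base 100
print(manifest_position(100, "ATG", "A"))  # deletion of bases 101-102
```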

    Cluster File

    The cluster file (.egt) is a standard product file provided by Illumina for commercial genotyping products and it is a required input for the genotype call command in DRAGEN Array. Custom cluster files may be required for optimal genotyping performance. See section Optimizing cluster files and copy number models for additional details.

    PGx CN Model File

    The PGx CN (Copy Number) model file (.dat) is a required input to the pgx copy-number call command to enable accurate copy number calling for pharmacogenomics. Illumina provides a standard CN model file for each PGx array product. See section Optimizing cluster files and copy number models for additional details.

    Cytogenetics Model File

    The cytogenetics CN (Copy Number) model file (.dat) is a required input to the cyto call command to enable accurate Cytogenetics analysis. Illumina provides a standard CN model file for each supported array product. For custom or other products, please contact Tech Support to request a CN model file and include the product BPM manifest.

    Note: The CN model file needs to be updated upon manifest revisions since probes can be added or removed during manifest revisions. A mismatch between the CN model file and the manifest will cause an error during pgx copy-number call and cyto call.

    Mask File

    The mask file (.msk) is a required input to the pgx copy-number train command to enable accurate pgx copy number training for pharmacogenomics. It does not need to be provided as an explicit input to the command line interface but should reside in the same folder as the BPM manifest. It should have the same base name as the manifest for the product. Illumina provides a mask file for each PGx array product and these can be found on the product files support page.

    PGx Database File

    The PGx database file (.zip) contains the variant mapping information from Infinium PGx arrays to PGx variants. Each line in this file represents a single probe ID mapping to a variant's HGVS (Human Genome Variation Society) tag. This creates a map of many probes to one variant. DRAGEN Array cross references this map with SNV VCF IDs during runtime to do star allele calling. It works across all supported PGx products, even though the probes and variant coverage differ across them.

    Cytogenetics Database File

    The cytogenetics database file (.zip) contains information from Ensembl and RefSeq data sources used in the generation of the Cytogenetics Annotation JSON File. This file can be used across products (BeadChip/manifest types and versions). It is only necessary as an input to local analysis (i.e., cyto annotate) because it is already stored in the cloud for cloud analysis. It may be updated in the future to accommodate changes in the underlying Ensembl and RefSeq data sources.

    Genome FASTA Files

    The genome FASTA file (.fa) is a text file with the reference genome sequences. The FASTA index file (.fai) contains metadata about chromosomal orchestration within the FASTA file for a particular species. DRAGEN Array PGx calling supports human genome builds 37 and 38. The genome FASTA file and FASTA index file are both provided by Illumina for human species and should be stored together in the same input folder. For custom reference genomes, the contig identifiers in the provided genome FASTA file must exactly match the chromosome identifiers specified in the provided manifest. For a standard human product manifest, this means that the contig headers should read ">1" rather than ">chr1". Note: The Genome FASTA file is only required for the dragen-array-local-analysis workflow. If you're using dragen-array-cloud-analysis, you do not need to provide this file.
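
    When preparing a custom reference, a mismatch between contig and chromosome names can be spotted before running an analysis. The sketch below relies only on the fact that the first tab-separated column of each .fai line is the contig name; the function name contigs_missing_from_fasta and the example lines (including their numeric columns) are illustrative assumptions.

```python
def contigs_missing_from_fasta(fai_lines, manifest_chromosomes):
    """Return manifest chromosome names that are absent from the FASTA index."""
    # The first tab-separated column of each .fai line is the contig name.
    contigs = {line.split("\t")[0] for line in fai_lines if line.strip()}
    return sorted(set(manifest_chromosomes) - contigs)


# Illustrative index lines from a FASTA that uses "chr"-prefixed contig names.
fai = [
    "chr1\t248956422\t6\t60\t61\n",
    "chr2\t242193529\t253105708\t60\t61\n",
]
print(contigs_missing_from_fasta(fai, ["1", "2"]))  # prints ['1', '2']
```

    An empty result means every chromosome named in the manifest has a matching contig in the FASTA index.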

    Sample Sheet

    The sample sheet is a CSV-formatted input file that uses two required fields for sample lookup (SentrixBarcode_A and SentrixPosition_A for local; beadChipName and sampleSectionName for cloud). It also allows optional metadata to be added and a filtered list of samples within a folder to be analyzed. It is intended to be flexible, and the local version should be backwards compatible with most GenomeStudio samplesheets.

    The root folder in which DRAGEN Array searches for the files can be set either via the --idat-folder or --gtc-folder options (where applicable) or by setting the RootFolder field in the [Header] section. RootFolder should be the full absolute path to the sample files, e.g.,

    Note: In the case of conflict between RootFolder and the CLI options (--idat-folder or --gtc-folder), the CLI options take precedence.

    The following are examples of all valid samplesheets:

    • Most basic (no sections, one sample)

    • Medium complexity (no sections, multiple samples, optional data)

    • High complexity (sections, multiple samples, optional data)

    Notes:

    • The column names are case insensitive. For example, the columns Sample_Name and sample_name would be considered the same, and the software would produce an error like this: Duplicate column sample_name found. Column names are case-insensitive. Please remove or rename the column from the samplesheet and re-process.

    • Because user-provided fields get output in the Genotype Summary File, the column names cannot conflict with those fields. For example, if the user provides a column named Sex Estimate in their samplesheet, DRAGEN Array will produce the following error: Sex Estimate is a reserved keyword. Please remove or rename the column from the samplesheet and re-process.

    • The optional fields (i.e. not SentrixBarcode_A and SentrixPosition_A) will be output as-is in the Genotype Sample Summary file for the genotype call command.

    • The [Manifests] section (used by GenomeStudio to delineate manifests in multi-manifest analyses) is currently ignored in DRAGEN Array.

    • There is a known issue regarding empty columns in the .
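
    The column-name rules in the notes above can be checked before submitting a samplesheet. The following sketch is an assumption-laden illustration, not the validation DRAGEN Array performs: the function name check_columns is hypothetical, the header line is split naively on commas, and the reserved set contains only Sex Estimate because that is the only reserved keyword named in this section.

```python
# Only "Sex Estimate" is named as reserved in the text; the real list is
# presumably longer (other Genotype Summary File fields).
RESERVED = {"sex estimate"}


def check_columns(header_line):
    """Return a list of problems found in a samplesheet column header line."""
    problems = []
    seen = set()
    for column in header_line.strip().split(","):
        key = column.strip().lower()  # column names are case insensitive
        if not key:
            continue  # empty columns are not checked in this sketch
        if key in seen:
            problems.append(f"duplicate column: {key}")
        if key in RESERVED:
            problems.append(f"reserved keyword: {column.strip()}")
        seen.add(key)
    return problems


print(check_columns("SentrixBarcode_A,SentrixPosition_A,Sample_Name,sample_name"))
print(check_columns("SentrixBarcode_A,SentrixPosition_A,Sex Estimate"))
```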

    For cloud analyses (i.e., for sample selection when running cloud analyses), the samplesheet does not currently support sections such as [Header] and [Data]. Instead of using the SentrixBarcode_A and SentrixPosition_A columns as the sample's keys, it uses beadChipName and sampleSectionName. A valid cloud samplesheet could look like this:

    There is also a template available on the sample selection interface on BaseSpace.
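
    If a section-less local samplesheet already exists, its key columns can be renamed for cloud use. The sketch below assumes that the barcode and position values carry over unchanged to beadChipName and sampleSectionName, which this guide implies but does not state outright; the function name to_cloud_samplesheet is hypothetical, and sheets containing [Header] or [Data] sections are not handled.

```python
import csv
import io

# Local key columns (lower-cased) mapped to their cloud equivalents.
COLUMN_MAP = {
    "sentrixbarcode_a": "beadChipName",
    "sentrixposition_a": "sampleSectionName",
}


def to_cloud_samplesheet(local_text):
    """Rename the key columns of a section-less local samplesheet."""
    rows = list(csv.reader(io.StringIO(local_text)))
    header = [COLUMN_MAP.get(name.strip().lower(), name) for name in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows([header] + rows[1:])
    return out.getvalue()


local = "SentrixBarcode_A,SentrixPosition_A\n204753010023,R02C01\n"
print(to_cloud_samplesheet(local))
```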

    Methylation QC sample sheet

    For DRAGEN Array Methylation QC on cloud, the additional optional sample sheet fields are used in analysis.

    Following Sample_Group, any number of additional columns can be added to include metadata fields such as sex, sample type, plate and well information, etc. Additional columns added after the Sample_Group column may have user-defined column header values. The Sample_ID field and any additional metadata added will be replicated in the Sample QC Summary output files.

    The Sample_Group field will be used to populate the PCA Control Plot within the Sample QC Summary Plots file and the Principal Component Summary file. For the PCA Control Plot, each sample group will be assigned a unique color. Samples assigned to the same Sample_Group value will be the same color in the PCA Control Plot. e.g.,

    Cytogenetics analysis + Emedgene interpretation sample sheet

    For Cytogenetics analysis + Emedgene interpretation on cloud, an additional column, demographicSex, is compared against the Sex Estimate output from the DRAGEN Array genotyping module and displayed in Emedgene. The allowed values for this field are M (Male), F (Female), or U (Unknown).

    Example:

    Input File Summary Table

    In addition to the input files, there is a set of intermediate files, including GTC, SNV VCF, CNV VCF and PGx CSV, which are outputs of some DRAGEN Array Local commands and inputs to other commands.

    The table below summarizes the input and intermediate files, their sources, and the associated DRAGEN Array Local commands and options.

    Input File
    Source
    Command
    Option

    IDAT

    User provided from scanning instrument

    genotype call

    --idat-folder

    CSV Manifest

    Product file from Illumina

    genotype gtc-to-vcf

    --csv-manifest

    BPM Manifest

    Product file from Illumina

    pgx copy-number train

    genotype call

    genotype gtc-to-bedgraph

    genotype gtc-to-vcf

    support site
    Illumina Knowledge page on NextSeq550

    DRAGEN Array Cloud Analysis

    DRAGEN Array Cloud Analysis Overview

    DRAGEN Array Cloud utilizes the user-friendly graphical interface of BaseSpace Sequence Hub to simplify DRAGEN Array analysis setup and kickoff. Optional integration with the iScan System allows data to be streamed directly from the instrument to the cloud platform. Analysis data is stored on the Illumina Connected Platform providing secure storage for both microarray and sequencing data.

    Getting Started

    The following prerequisites are needed to get started with DRAGEN Array Cloud:

    • Illumina Connected Analytics subscription: An ICA Basic, Professional, or Enterprise subscription can be used; each includes access to BaseSpace Sequence Hub. Follow the Illumina Software Registration Guide to register the software.

    • Workgroup setup: An administrator must create a workgroup before users can log in. Using a workgroup allows all members of the workgroup to share access to resources, analyses, and data. Learn more about managing a Workgroup.

      • Designating a workgroup as ‘Collaborative’ allows projects to be shared with collaborators or Illumina Tech Support to assist with troubleshooting. To create a collaborative workgroup, select the Enable collaborators outside of this domain checkbox during workgroup creation.

    Running Analysis

    Before beginning analysis, ensure that the workgroup context is in use so that the analysis can be viewed by all members of your workgroup. The name of your workgroup should appear in the top right corner.

    Use the following steps to run the Microarray Analysis Setup on BaseSpace Sequence Hub:

    1. Select the Runs tab

    2. Select New Run

    3. Select Microarray Analysis Setup

    1. Use the Select Project link to choose the project for your output files. To select an existing project, click the radio button next to the desired project name. You can also create a project by clicking the New button in the project selection window.

    2. Select the Type of Analysis. Further detail on each Type of Analysis is available in section Applications. Note: For PGx CNV calling, it is recommended that the analysis include 96 or more samples passing Log R Dev <= 0.2. For PGx star allele calling, it is recommended to QC the samples and review any sample with Log R Dev > 0.2, call rate < 0.99, or TGA Control probe < 1.0 to assess the reliability of the analysis. These metrics are provided in the genotyping sample summary file (gt_sample_summary.csv). For more details, see the explanation for the local analysis.

    • DRAGEN Array – Genotyping provides flexibility for turning specific output files on or off and adjusting the GenCall score cutoff. It is recommended to turn off VCF output for non-human species and Final Report output for large sample numbers.

    • DRAGEN Array – Cytogenetics analysis provides options to adjust thresholds as detailed on the DRAGEN Array Cytogenetics Analysis page. The GTC and SNV VCF options listed on that page are configured in the "Add Custom Configuration" window at the bottom under "Additional Output Options".

    • DRAGEN Array – Cytogenetics analysis + Emedgene interpretation shares the same options detailed on the DRAGEN Array Cytogenetics Analysis page.

    1. Select your preferred option in the Configuration Settings drop-down menu. Configuration setup will vary based on the Type of Analysis selected. More details are available in section Applications.

    2. Select Next

    3. Select either Import Sample Sheet, Select BeadChips, or Import IDAT Files (Figure 3)

    • Import Sample Sheet presents a link to upload sample sheet. Users may download a template sample sheet by selecting the Download Template link.

    • Select BeadChips allows users to select BeadChips from the displayed list of available BeadChips. To select specific samples within a BeadChip, use the Import Sample Sheet option instead.

    • Import IDAT Files allows users to upload the IDAT files from a local folder to the cloud platform for use with the current and future analyses by users within the same workgroup.

    1. Select Launch Analysis
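The QC review recommended in the Type of Analysis step can be sketched in a few lines of Python. This is a minimal illustration, not part of DRAGEN Array: the `SampleQC` field names are hypothetical and do not claim to match the column headers of gt_sample_summary.csv; only the thresholds (Log R Dev 0.2, call rate 0.99, TGA Control probe 1.0, and 96 samples) come from the note above.

```python
from dataclasses import dataclass

# Thresholds quoted in the Type of Analysis note.
MAX_LOG_R_DEV = 0.2
MIN_CALL_RATE = 0.99
MIN_TGA_CONTROL = 1.0


@dataclass
class SampleQC:
    # Hypothetical field names; map them to the real summary columns yourself.
    sample_id: str
    log_r_dev: float
    call_rate: float
    tga_control: float


def review_reasons(sample: SampleQC) -> list[str]:
    """Return the reasons a sample should be reviewed (empty if none)."""
    reasons = []
    if sample.log_r_dev > MAX_LOG_R_DEV:
        reasons.append("Log R Dev > 0.2")
    if sample.call_rate < MIN_CALL_RATE:
        reasons.append("call rate < 0.99")
    if sample.tga_control < MIN_TGA_CONTROL:
        reasons.append("TGA Control probe < 1.0")
    return reasons


def enough_samples_for_cnv(samples: list[SampleQC], minimum: int = 96) -> bool:
    """Check the recommendation of 96 or more samples with Log R Dev <= 0.2."""
    passing = sum(1 for s in samples if s.log_r_dev <= MAX_LOG_R_DEV)
    return passing >= minimum
```

A sample with an empty reason list meets all three recommended metrics; any other sample is a candidate for manual review before its star allele calls are relied on.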

    View Outputs

    1. On the Analyses tab, view the analysis status, e.g., initializing or complete.

    2. After the analysis is complete, select the analysis and select the Files tab.

    3. From the Files tab, select the Output folder.

    Manage Data

    The data management tab allows you to view and manage all your scanned IDAT files in the cloud. Before viewing, ensure that the workgroup context is in use so that all data available to your workgroup can be seen. The name of your workgroup should appear in the top right corner. For more information, see BaseSpace Data Management.

    Troubleshooting and Additional Support

    Troubleshooting iScan integration

    The firewall protects the iScan control computer by filtering incoming traffic to remove potential threats. The firewall is enabled by default to block all inbound connections. Keep the firewall enabled and allow outbound connections.

    For the instrument to connect to BaseSpace Sequence Hub, you will need to add regional platform endpoints and instrument-specific endpoints to the allow list on your firewall. Regional endpoints and further detail can be found in Security and Networking for Illumina instrument control computers.

    The following table shows the applicable endpoints for the iScan.

    Endpoint
    Category
    Purpose

    Some notes on IDAT fail status: iScan marks a sample with a FAIL status if the registration quality is too poor for that particular section. Selected samples that are marked with FAIL status are excluded from analysis, and no results are produced for those samples even though IDATs are generated. The registration quality can be found in the metrics.txt file. More information on that file can be found in the iScan documentation.

    Some notes on Illumina LIMS: When Illumina LIMS integration is used with the iScan, it is possible to set sample names that are encoded in the IDATs and then appear in downstream analysis output files, such as VCFs, in place of the sample ID (Sentrix Barcode + Position). This can cause issues with integration for cytogenetic analysis because that sample name is used as a unique identifier. It is therefore highly recommended not to use that feature in Illumina LIMS, so that sample names remain unique throughout the various analyses.

    Some notes on running a semi-custom PGx product: Detailed notes on running this analysis locally can be found here. For cloud, a workaround is necessary because the semi-custom product samples are filtered from the BeadChip table. The steps for that workaround are:

    1. Select an existing commercial product configuration (e.g., GDA_PGx-8v1-0_G4 - GRCh38).

    2. Kick off an analysis using the Import Sample Sheet option for the semi-custom product samples.

    Sharing a project

    Project sharing allows a user to share files with users outside the workgroup for collaboration or with Illumina Tech Support for troubleshooting. To share a project on BaseSpace Sequence Hub, first set the workgroup type to ‘Collaborative’ during Workgroup setup, and then use the following steps to obtain a link to your project. Anyone with the link can then access the project, and all files in the project are shared.

    1. Navigate to the Projects tab

    2. Click the button next to the desired project

    3. Select the Share button above the list (Figure 3)

    4. Select the Get Link option to activate a link for the project

    Additional Notes:

    • The project owner maintains ownership and write access. If the project owner deletes the data, the collaborators lose access to it.

    • Either the sending or the receiving domain must be collaborative. See "Workgroup setup".

    • Both must be in the same AWS regional instance (i.e., "Data cannot be transferred directly between instances, however you can download and share data separately.")

    [Header]
    RootFolder,/test/samples
    [Data]
    ....

    SentrixBarcode_A,SentrixPosition_A
    204753010023,R02C01

    SentrixBarcode_A,SentrixPosition_A,Sample_ID,Sample_Group,MetaData1
    204753010023,R01C01,NA1231,Group1,F
    204753010024,R01C01,NA1233,Group2,M

    [Header]
    RootFolder,/tests/samples
    Date,1/1/2025
    [Data]
    SentrixBarcode_A,SentrixPosition_A,Sample_ID,Sample_Group,MetaData1
    204753010023,R01C01,NA1231,Group1,F
    204753010024,R01C01,NA1233,Group2,M

    beadChipName,sampleSectionName
    204753010023,R01C01
    204753010023,R02C01
    204753010024,R01C01
    204753010024,R02C01

    beadChipName,sampleSectionName,Sample_ID,Sample_Group,MetaData1
    204753010023,R01C01,NA1231,Group1,F
    204753010023,R02C01,NA1232,Group2,F
    204753010024,R01C01,NA1233,Group2,M
    204753010024,R02C01,NA1234,Group1,M

    beadChipName,sampleSectionName,demographicSex
    204753010023,R01C01,F
    204753010023,R02C01,F
    204753010024,R01C01,M
    204753010024,R02C01,M
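
A sample sheet like the demographicSex example above can be checked before upload. The following is a minimal sketch, assuming the sheet is plain CSV with the column names shown in the example. The function name `validate_sample_sheet` is hypothetical, and the duplicate-position check is an extra assumption that this documentation does not state as a requirement. Only the allowed values M, F, and U come from the text.

```python
import csv
import io

ALLOWED_SEX = {"M", "F", "U"}  # values listed for demographicSex


def validate_sample_sheet(text: str) -> list[str]:
    """Return a list of problems found in a cloud cytogenetics sample sheet."""
    problems = []
    seen = set()
    reader = csv.DictReader(io.StringIO(text.strip()))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        key = (row.get("beadChipName"), row.get("sampleSectionName"))
        # Assumption: a BeadChip position should appear only once per sheet.
        if key in seen:
            problems.append(f"line {line_no}: duplicate sample {key[0]}_{key[1]}")
        seen.add(key)
        sex = row.get("demographicSex")
        if sex not in ALLOWED_SEX:
            problems.append(f"line {line_no}: demographicSex {sex!r} is not M, F, or U")
    return problems


sheet = """beadChipName,sampleSectionName,demographicSex
204753010023,R01C01,F
204753010023,R02C01,F
204753010024,R01C01,M
204753010024,R02C01,M
"""
print(validate_sample_sheet(sheet))  # an empty list means no problems were found
```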

    --bpm-manifest

    | Input File | Source | Command | Option |
    | --- | --- | --- | --- |
    | Cluster File | Product file from Illumina or user created using GenomeStudio | genotype call | --cluster-file |
    | PGx CN Model | Product file from Illumina or user created using DRAGEN Array Local | pgx copy-number call | --cn-model |
    | Cytogenetics CN Model | Product file from Illumina | cyto call | --cn-model |
    | PGx Database | Product file from Illumina | pgx star-allele call | --database |
    | Cytogenetics Database | Product file from Illumina | cyto annotate | --database |
    | Genome FASTA | Product file from Illumina | genotype gtc-to-vcf; pgx copy-number train | --genome-fasta-file |
    | Sample Sheet | User provided | genotype call; genotype gtc-to-bedgraph; genotype gtc-to-vcf; pgx copy-number call; pgx copy-number train | --sample-sheet |
    | GTC | DRAGEN Array output from genotype call | genotype gtc-to-bedgraph; genotype gtc-to-vcf; pgx copy-number call; pgx copy-number train | --gtc-folder |
    | SNV and PGx CNV VCF | DRAGEN Array output from genotype gtc-to-vcf and pgx copy-number call | pgx star-allele call | --vcf-folder |
    | PGx CSV | DRAGEN Array output from pgx star-allele call | pgx star-allele annotate | --star-alleles |
    | Cytogenetics CNV VCF | DRAGEN Array output from cyto call | cyto annotate | --vcf-folder |
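
Because the intermediate files are outputs of some commands and inputs to others, the table implies an order in which the commands can run. The sketch below derives one such order with Python's standard `graphlib` module. It encodes only the producer and consumer relationships listed in the table. Reading "SNV and PGx CNV VCF" as an SNV VCF from genotype gtc-to-vcf and a PGx CNV VCF from pgx copy-number call is an interpretation of the Source column. The function name `command_order` is hypothetical, and no command-line syntax is implied.

```python
from graphlib import TopologicalSorter

# command -> intermediate files it produces (from the Source column)
PRODUCES = {
    "genotype call": ["GTC"],
    "genotype gtc-to-vcf": ["SNV VCF"],
    "pgx copy-number call": ["PGx CNV VCF"],
    "pgx star-allele call": ["PGx CSV"],
    "cyto call": ["Cytogenetics CNV VCF"],
}

# command -> intermediate files it consumes (from the Command column)
CONSUMES = {
    "genotype gtc-to-bedgraph": ["GTC"],
    "genotype gtc-to-vcf": ["GTC"],
    "pgx copy-number call": ["GTC"],
    "pgx copy-number train": ["GTC"],
    "pgx star-allele call": ["SNV VCF", "PGx CNV VCF"],
    "pgx star-allele annotate": ["PGx CSV"],
    "cyto annotate": ["Cytogenetics CNV VCF"],
}


def command_order() -> list[str]:
    """Return the commands so that every producer precedes its consumers."""
    producer = {f: cmd for cmd, files in PRODUCES.items() for f in files}
    graph = {cmd: set() for cmd in PRODUCES}
    for cmd, files in CONSUMES.items():
        graph.setdefault(cmd, set()).update(producer[f] for f in files)
    return list(TopologicalSorter(graph).static_order())


for step in command_order():
    print(step)
```

Several valid orders exist; the only guarantee is that each command appears after the commands whose output it reads.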

    genotype summary files
    v1.3 Release Notes
  • Software consumables: iCredits can be purchased for storage on the cloud platform and for analysis pipelines with a compute charge. Per-sample analysis can be purchased for relevant pipelines as listed in section Applications. Follow Example 3: Configuring Software Consumables (iCredits or Sample Analyses) in the Illumina Software Registration Guide to register the software consumables.

  • [Optional] iScan integration: The iScan System is integrated with Illumina Connected Platform and can send IDATs for further analysis. The iScan System must be running iScan Control Software version 4.2.1 or later.

    • Instructions to Use Illumina Connect Analytics (ICA) with the iScan System

    • Troubleshooting iScan integration

  • EULA acceptance: Accept all necessary End User License Agreements in BaseSpace Sequence Hub before scanning begins.

  • Internet connection: For uploading product files or IDATs, a network connection of 1 GbE or faster is recommended.

  • Enter the Analysis Name (Figure 1)
    (Optional) Create a custom configuration via the "Add Custom Configuration" option in Configuration Settings. Custom configurations must be assigned a name, and product files can be uploaded or selected (Figure 2). Details on file name constraints can be found in the ICA documentation. Custom configuration options vary by Type of Analysis, including:
  • DRAGEN Array – Methylation – QC provides options to adjust thresholds as detailed on the DRAGEN Array Methylation QC page.

  • DRAGEN Array – PGx – Star allele annotation provides an option to change the default metabolizer status database from CPIC to DPWG.

    | Endpoint | Category | Purpose |
    | --- | --- | --- |
    | ocsp.rootca1.amazontrust.com | Required | Certificate authorization |
    | ocsp.rootg2.amazontrust.com | Required | Certificate authorization |
    | ocsp.sca1b.amazontrust.com | Required | Certificate authorization |
    | fonts.gstatic.com | Required | Display fonts |
    | fonts.googleapis.com | Recommended | Display fonts |
    | cdn.walkme.com | Recommended | Telemetry |
    | cdn3.userzoom.com | Recommended | Telemetry |
    | dpm.demdex.net | Recommended | Telemetry |
    | illuminainc.demdex.net | Recommended | Telemetry |
    | illuminainc.tt.omtrdc.net | Recommended | Telemetry |
    | smetrics.illumina.com | Recommended | Telemetry |
    | google.com | Recommended | Telemetry |
    | google-analytics.com | Recommended | Telemetry |
    | stats.g.doubleclick.net | Recommended | Telemetry |
    | illumina.com | Optional | Access Illumina support material |

  • Copy the link and send it to the desired recipient(s)

  • For Enterprise domains, use this same method (share-by-link, not share-by-transfer)

    | Endpoint | Category | Purpose |
    | --- | --- | --- |
    | ica.illumina.com | Required | Send IDAT files to ICA |
    | o.ss2.us | Required | Certificate authorization |
    | ocsp.digicert.com | Required | Certificate authorization |
    | ocsp.pki.goog/gsr2 | Required | Certificate authorization |
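
When preparing a firewall allow list from the iScan endpoint table, it can help to filter the entries by category. The sketch below is illustrative only. It copies a subset of the endpoints listed in this section and omits the regional platform endpoints. It also assumes the firewall matches on hostnames, so the path in an entry such as ocsp.pki.goog/gsr2 is dropped. The function name `allow_list` is hypothetical.

```python
# (endpoint, category) pairs copied from the iScan endpoint table (subset)
ENDPOINTS = [
    ("ica.illumina.com", "Required"),
    ("o.ss2.us", "Required"),
    ("ocsp.digicert.com", "Required"),
    ("ocsp.pki.goog/gsr2", "Required"),
    ("ocsp.rootca1.amazontrust.com", "Required"),
    ("ocsp.rootg2.amazontrust.com", "Required"),
    ("ocsp.sca1b.amazontrust.com", "Required"),
    ("fonts.gstatic.com", "Required"),
    ("fonts.googleapis.com", "Recommended"),
    ("cdn.walkme.com", "Recommended"),
    ("illumina.com", "Optional"),
]


def allow_list(endpoints, categories=("Required",)):
    """Return sorted unique hostnames for the chosen categories."""
    hosts = {
        endpoint.split("/", 1)[0]  # assumption: rules are host-based, so drop any path
        for endpoint, category in endpoints
        if category in categories
    }
    return sorted(hosts)


for host in allow_list(ENDPOINTS):
    print(host)
```

Passing `categories=("Required", "Recommended")` widens the list to include the recommended endpoints.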

    Illumina Software Registration Guide
    managing a Workgroup
    Applications
    local analysis
    DRAGEN Array Cytogenetics Analysis
    DRAGEN Array Cytogenetics Analysis
    Applications
    BaseSpace Data Management
    Security and Networking for Illumina instrument control computers
    iScan documentation
    Emedgene
    here
    Workgroup setup
    here.
    AWS regional instance
    Figure 1. Configuration step of Microarray Analysis Setup
    Figure 2. Optional Custom Configuration step of Microarray Analysis Setup
    Figure 3. Sample Selection step of Microarray Analysis Setup
    Figure 3. Share data on BaseSpace Sequence Hub

    PGx Star Allele Coverage

    Theoretical Coverage

    The PGx genes and star/variant alleles listed below can be detected by DRAGEN Array v1.1 if available on the microarray. PGx coverage for specific PGx microarrays can be found here: PGx Star Allele Coverage for Specific PGx Products. Known and novel star alleles that are not in the list below are not reported. Star allele definitions are sourced from PharmVar and PharmGKB.

    Among the PGx genes, HLA-A, HLA-B, and IFNL3/IFNL4 alleles are covered through tagging variants, specifically HLA-A,*31:01 (rs1061235.A>T); HLA-B,*15:02 (rs144012689.T>A); HLA-B,*57:01 (rs2395029.T>G); HLA-B,*58:01 (rs9263726.G>A); IFNL3/4, rs12979860 variant (T). Reliability of the tagging SNPs varies depending on the population. Additional information on PGx gene types, variant type versus star allele type, can be found here: Introducing-dragen-array-1-0-for-infinium-array-based-pharmacogenomics-analysis
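
The Gene / PGx Alleles table in this section stores each gene's alleles as one semicolon-separated string, so a coverage lookup is a simple split. The sketch below is a minimal illustration using three short rows copied from that table and the HLA tagging variants named above. The names `parse_alleles`, `COVERAGE`, `TAGGING`, and `is_reportable` are hypothetical and are not part of DRAGEN Array.

```python
def parse_alleles(cell: str) -> set[str]:
    """Split one PGx Alleles cell into a set of allele names."""
    return {item.strip() for item in cell.split(";") if item.strip()}


# Three rows copied from the coverage table; a full lookup would load every row.
COVERAGE = {
    "CYP3A5": parse_alleles("*1;*3;*6;*7;*8;*9"),
    "HLA-B": parse_alleles("*15:02;*57:01;*58:01;Reference"),
    "F5": parse_alleles("Reference;rs6025.C>T"),
}

# Tagging variants for the HLA alleles, as listed in the paragraph above.
TAGGING = {
    "rs1061235.A>T": ("HLA-A", "*31:01"),
    "rs144012689.T>A": ("HLA-B", "*15:02"),
    "rs2395029.T>G": ("HLA-B", "*57:01"),
    "rs9263726.G>A": ("HLA-B", "*58:01"),
}


def is_reportable(gene: str, allele: str) -> bool:
    """True if the allele is in the theoretical coverage list for the gene."""
    return allele in COVERAGE.get(gene, set())


print(is_reportable("CYP3A5", "*3"))
print(TAGGING["rs2395029.T>G"])
```

Splitting on semicolons rather than commas matters for rows such as G6PD, where individual allele names contain commas and spaces.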

    Gene
    PGx Alleles

    PGx Star Allele Coverage for Specific PGx Products

    PGx star alleles can only be called when the related variants in the star allele definition are present in a PGx product. An auxiliary file ([Product]_GS_import.txt) is provided for each product with the PGx variants and associated star alleles. The product files pages that contain the auxiliary files are listed in the table below. The auxiliary file covers SNPs and indels only; it does not contain SV-defined star alleles.

    Instructions on how to use the auxiliary file can be found here: .

    Product
    GS Import File Name
    Product Files Link

    Known Limitations of GDA-ePGx, GSAv4-ePGx, and GCRA-ePGx.

    • APOE: GSAv4-ePGx and GCRA-ePGx do not support calling E2 and E4 due to the lack of functional probes for rs7412 and rs429358.

    • CYP2A6: GDA-ePGx does not support *5 due to lack of coverage for *5 core variants.

    • CYP4F2: for all three products

    PGx Variants Masked in DRAGEN Array

    During DRAGEN Array star allele calling, poorly performing PGx variants are masked and treated as "No Calls". Star alleles that are solely defined by the masked variants will NOT be called by DRAGEN Array. The tables below list the variants that are masked per product, with each row representing a single variant. The Variant_ID matches the ID field of the corresponding SNV VCF entry of the PGx product.
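
The masking rule above, under which a star allele is not called when all of its defining variants are masked, can be expressed as a subset test. The sketch below is illustrative only: the allele names and variant IDs are made up, the function name `callable_alleles` is hypothetical, and how DRAGEN Array treats alleles with only some defining variants masked is not described here, so the sketch simply leaves those as callable.

```python
def callable_alleles(definitions: dict[str, set[str]], masked: set[str]) -> dict[str, bool]:
    """Map each star allele to False when every defining variant is masked."""
    return {
        allele: not (bool(variants) and variants <= masked)
        for allele, variants in definitions.items()
    }


# Hypothetical definitions: allele -> Variant_IDs that define it.
definitions = {
    "*2": {"varA"},          # solely defined by a masked variant
    "*3": {"varA", "varB"},  # only partly masked
    "*4": {"varC"},          # not masked
}
masked = {"varA"}

print(callable_alleles(definitions, masked))
```

In practice the masked set would be built from the Variant_ID column of the per-product tables in this section.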

    GDA-ePGx

    Manifest
    Gene_Symbol
    Variant_ID

    GSAv4-ePGx

    Manifest
    Gene_Symbol
    Variant_ID

    GCRA-ePGx

    Manifest
    Gene_Symbol
    Variant_ID

    BDNF

    Reference;rs10835210.C>A;rs10835210.C>G;rs11030101.A>G;rs11030101.A>T;rs11030104.A>G;rs11030118.G>A;rs11030119.G>A;rs11030119.G>T;rs1491850.T>C;rs16917234.T>A;rs16917234.T>C;rs1967554.A>C;rs2030324.A>G;rs61888800.G>T;rs6265.C>T;rs7103411.C>T;rs7124442.C>G;rs7124442.C>T;rs7127507.T>C;rs7934165.G>A;rs962369.T>C;rs988748.C>G

    CACNA1C

    Reference;rs1006737.G>A;rs1034936.C>A;rs1034936.C>G;rs1034936.C>T;rs1051375.G>A;rs1051375.G>C;rs10774053.A>C;rs10774053.A>G;rs10848635.T>A;rs10848635.T>C;rs11062040.C>T;rs12813888.A>C;rs12813888.A>T;rs2041135.T>C;rs215976.C>G;rs215976.C>T;rs215994.T>C;rs216008.C>T;rs216013.A>G;rs2238032.T>C;rs2238032.T>G;rs2238087.C>G;rs2238087.C>T;rs2239050.G>A;rs2239050.G>C;rs2239128.T>A;rs2239128.T>C;rs2283271.T>A;rs723672.C>A;rs723672.C>G;rs723672.C>T;rs7295250.T>C;rs7316246.G>A;rs7316246.G>C;rs758723.T>A;rs758723.T>C

    CACNA1S

    Reference;rs1800559.C>A;rs1800559.C>T;rs772226819.G>A

    CFTR

    Reference;rs113993958.G>A;rs113993958.G>C;rs113993958.G>T;rs115545701.C>T;rs11971167.G>A;rs11971167.G>T;rs121908752.T>G;rs121908753.G>A;rs121908755.G>A;rs121908755.G>T;rs121908757.A>C;rs121909005.T>C;rs121909005.T>G;rs121909013.G>A;rs121909020.G>A;rs121909020.G>C;rs121909041.T>C;rs141033578.C>T;rs150212784.T>C;rs150212784.T>G;rs186045772.T>A;rs193922525.G>A;rs193922525.G>C;rs200321110.G>A;rs202179988.C>T;rs267606723.G>A;rs267606723.G>T;rs368505753.C>T;rs397508256.G>A;rs397508288.A>C;rs397508288.A>G;rs397508387.G>T;rs397508442.C>T;rs397508513.A>C;rs397508537.C>A;rs397508759.G>A;rs397508759.G>T;rs397508761.A>G;rs74503330.G>A;rs74551128.C>A;rs74551128.C>T;rs75039782.C>G;rs75039782.C>T;rs75527207.G>A;rs75541969.G>C;rs76151804.A>G;rs77834169.C>A;rs77834169.C>G;rs77834169.C>T;rs77932196.G>A;rs77932196.G>C;rs77932196.G>T;rs78655421.G>A;rs78655421.G>C;rs78655421.G>T;rs78769542.G>A;rs80224560.G>A;rs80282562.G>A

    COMT

    Reference;rs13306278.C>T;rs165599.G>A;rs165599.G>C;rs165722.C>T;rs165728.C>A;rs165728.C>G;rs165728.C>T;rs165774.G>A;rs174675.T>C;rs174696.C>A;rs174696.C>T;rs174699.C>T;rs2020917.C>T;rs2075507.G>A;rs2075507.G>C;rs2075507.G>T;rs2239393.A>G;rs4633.C>T;rs4646312.T>C;rs4646316.C>G;rs4646316.C>T;rs4680.G>A;rs4818.C>G;rs4818.C>T;rs5746849.A>G;rs5993882.T>C;rs5993882.T>G;rs5993883.T>G;rs6267.G>A;rs6267.G>T;rs6269.A>G;rs6269.A>T;rs7287550.T>C;rs7287550.T>G;rs737865.A>G;rs737866.T>A;rs737866.T>C;rs740603.A>G;rs9332377.C>A;rs9332377.C>T;rs933271.T>A;rs933271.T>C;rs9606186.C>A;rs9606186.C>G;rs9606186.C>T

    CYP1A2

    *10;*11;*12;*13;*14;*15;*16;*17;*18;*19;*1A;*1B;*1C;*1D;*1E;*1F;*1G;*1J;*1K;*1L;*1M;*1N;*1P;*1Q;*1R;*1S;*1T;*1U;*1V;*2;*20;*21;*3;*4;*5;*6;*7;*8;*9

    CYP2A6

    *1;*10;*11;*12;*13;*14;*15;*16;*17;*18;*19;*1x2;*2;*20;*21;*22;*23;*24;*25;*26;*27;*28;*31;*34;*35;*36;*37;*38;*39;*4;*40;*41;*42;*43;*44;*45;*46;*48;*49;*5;*50;*51;*52;*53;*54;*55;*56;*6;*7;*8;*9

    CYP2B6

    *1;*10;*11;*12;*13;*14;*15;*17;*18;*19;*2;*20;*21;*22;*23;*24;*25;*26;*27;*28;*3;*31;*32;*33;*34;*35;*36;*37;*38;*39;*4;*40;*41;*42;*43;*44;*45;*46;*47;*48;*49;*5;*6;*7;*8;*9

    CYP2C19

    *1;*10;*11;*12;*13;*14;*15;*16;*17;*18;*19;*2;*22;*23;*24;*25;*26;*28;*29;*3;*30;*31;*32;*33;*34;*35;*38;*39;*4;*5;*6;*7;*8;*9

    CYP2C8

    *1;*10;*11;*12;*13;*14;*15;*16;*17;*18;*2;*3;*4;*5;*6;*7;*8;*9

    CYP2C9

    *1;*10;*11;*12;*13;*14;*15;*16;*17;*18;*19;*2;*20;*21;*22;*23;*24;*25;*26;*27;*28;*29;*3;*30;*31;*32;*33;*34;*35;*36;*37;*38;*39;*4;*40;*41;*42;*43;*44;*45;*46;*47;*48;*49;*5;*50;*51;*52;*53;*54;*55;*56;*57;*58;*59;*6;*60;*61;*62;*63;*64;*65;*66;*67;*68;*69;*7;*70;*71;*72;*73;*74;*75;*76;*77;*78;*79;*8;*80;*81;*82;*83;*84;*85;*9

    CYP2D6

    *1;*1-*90;*10;*100;*101;*102;*103;*104;*105;*106;*107;*108;*109;*10x2;*11;*110;*111;*112;*113;*114;*115;*116;*117;*118;*119;*12;*120;*121;*122;*123;*124;*125;*126;*127;*128;*129;*13;*13-*1;*13-*2;*13-*4-*68;*130;*131;*132;*133;*134;*135;*136;*137;*138;*139;*13x2-*1;*13x2-*2;*14;*140;*141;*142;*143;*144;*145;*146;*147;*148;*149;*15;*150;*151;*152;*153;*154;*155;*156;*157;*158;*159;*160;*161;*162;*163;*164;*165;*166;*167;*168;*169;*17;*170;*171;*172;*17x2;*18;*19;*1x2;*2;*20;*21;*22;*23;*24;*25;*26;*27;*28;*29;*29x2;*2x2;*3;*30;*31;*32;*33;*34;*35;*35x2;*36;*36;*36-*10;*36-*10x2;*36x2-*10;*36x3-*10;*37;*38;*39;*4;*40;*41;*42;*43;*43x2;*44;*45;*46;*47;*48;*49;*4M;*4N-*4;*4x2;*5;*50;*51;*52;*53;*54;*55;*56;*58;*59;*6;*60;*62;*64;*65;*68;*68-*4;*69;*7;*70;*71;*72;*73;*74;*75;*8;*81;*82;*83;*84;*85;*86;*87;*88;*89;*9;*90;*91;*92;*93;*94;*95;*96;*97;*98;*99;*9x2

    CYP2E1

    *1A;*1B;*2;*3;*4;*5A;*5B;*6;*7A;*7B;*7C

    CYP3A4

    *1;*10;*11;*12;*13;*14;*15;*16;*17;*18;*19;*2;*20;*21;*22;*23;*24;*26;*28;*29;*3;*30;*31;*32;*33;*34;*35;*37;*38;*39;*4;*40;*41;*42;*43;*44;*45;*46;*47;*48;*5;*6;*7;*8;*9

    CYP3A5

    *1;*3;*6;*7;*8;*9

    CYP4F2

    *1;*10;*11;*12;*13;*14;*15;*17;*2;*3;*4;*5;*6;*7;*8;*9

    DPYD

    Reference;rs111858276.T>C;rs112766203.1.G>A;rs112766203.2.G>C;rs114096998.1.G>T;rs114096998.2.G>A;rs114096998.2.G>C;rs115232898.T>C;rs116364703.T>A;rs1180771326.T>C;rs137878450.C>A;rs137999090.C>T;rs138391898.C>T;rs138545885.C>A;rs138616379.C>T;rs139459586.A>C;rs139834141.C>T;rs140039091.C>G;rs140114515.C>T;rs140602333.G>A;rs140602333.G>T;rs140989814.C>G;rs141044036.T>C;rs141439344.C>T;rs141462178.T>C;rs141726921.C>T;rs142512579.C>T;rs142619737.C>G;rs142619737.C>T;rs143154602.G>A;rs143154602.G>T;rs143815742.1.C>A;rs143815742.2.C>T;rs143879757.1.G>T;rs143879757.2.G>A;rs143986398.G>C;rs144395748.1.G>C;rs144395748.2.G>T;rs144935781.T>C;rs145112791.G>A;rs145529148.T>C;rs145548112.C>A;rs145548112.C>T;rs145773863.C>T;rs146356975.T>C;rs146529561.G>A;rs147545709.G>A;rs147601618.A>G;rs148799944.C>G;rs148994843.C>T;rs150036960.G>C;rs150385342.1.C>T;rs150385342.2.C>A;rs150437414.A>G;rs151074666.C>T;rs17376848.A>G;rs1801158.C>T;rs1801159.T>C;rs1801160.C>T;rs1801265.A>G;rs1801266.G>A;rs1801267.C>T;rs1801268.C>A;rs183105782.A>G;rs183385770.C>T;rs186169810.A>C;rs187713395.A>G;rs188052243.T>C;rs190577302.G>C;rs190951787.G>C;rs190951787.G>T;rs199549923.G>T;rs199634007.G>T;rs199646142.C>T;rs199777072.C>T;rs200064537.A>T;rs200296941.T>C;rs200562975.T>C;rs200643089.A>C;rs200687447.1.C>T;rs200687447.2.C>A;rs200687447.2.C>G;rs200693895.A>G;rs200709381.T>G;rs201018345.C>T;rs201035051.T>G;rs201268750.G>T;rs201433243.C>T;rs201615754.1.C>A;rs201615754.2.C>T;rs201648613.C>G;rs201785202.G>A;rs202144771.G>A;rs202212118.C>A;rs2297595.T>C;rs267598785.G>A;rs267598786.C>T;rs267598789.G>A;rs367619008.T>C;rs368146607.T>G;rs368152149.T>C;rs368327291.C>G;rs368519011.T>C;rs368970772.G>T;rs369103276.A>G;rs369575517.G>A;rs370569731.1.C>G;rs370569731.2.C>T;rs370615432.C>A;rs370707404.A>G;rs371258350.C>T;rs371313778.C>T;rs371587702.1.G>A;rs371587702.2.G>C;rs371792178.1.G>A;rs371792178.2.G>C;rs372058915.T>C;rs372307932.A>T;rs372909322.T>C;rs374527058.A>G;rs374531732.C>T;rs374825099.1.G>T;rs374825099.2.G>
C;rs374827081.G>C;rs375436137.C>T;rs375990187.A>G;rs376073289.1.C>T;rs376073289.2.C>A;rs376128878.G>T;rs376273539.G>C;rs377143350.C>T;rs377169736.C>G;rs3918289.G>A;rs3918289.G>C;rs3918290.C>G;rs3918290.C>T;rs45589337.T>C;rs527580106.T>C;rs528152707.C>A;rs528430685.G>A;rs528768620.C>T;rs529019871.T>C;rs532341730.A>T;rs536577604.T>C;rs538336580.T>A;rs538703919.G>A;rs547099198.G>A;rs548783838.C>T;rs55674432.C>A;rs556933127.A>C;rs557220418.G>A;rs558354142.G>A;rs55886062.1.A>C;rs55886062.2.A>T;rs559427764.C>A;rs55971861.T>G;rs56005131.G>T;rs56038477.C>T;rs568169006.T>C;rs568367673.C>A;rs569661196.A>G;rs570122671.G>A;rs571114616.A>G;rs573299212.C>T;rs575763449.G>A;rs575853463.C>T;rs576409484.T>A;rs57918000.G>A;rs59086055.G>A;rs60139309.T>C;rs60511679.A>C;rs61622928.C>T;rs61757362.G>A;rs6670886.C>A;rs6670886.C>T;rs672601273.1.C>A;rs672601273.2.C>T;rs672601275.T>G;rs672601276.C>A;rs672601282.G>A;rs672601284.C>T;rs672601285.T>C;rs672601287.T>G;rs672601288.C>A;rs67376798.T>A;rs72547601.T>C;rs72547602.T>A;rs72549303.del;rs72549304.G>A;rs72549304.G>C;rs72549304.G>T;rs72549305.T>C;rs72549306.1.C>A;rs72549306.2.C>T;rs72549307.T>C;rs72549308.T>G;rs72549309.ATGA[1];rs72549310.G>A;rs72975710.1.G>A;rs72975710.2.G>C;rs745512069.G>A;rs745704371.G>C;rs745833535.T>C;rs745911874.C>T;rs745982505.1.T>C;rs745982505.2.T>A;rs746115989.C>T;rs746329786.T>A;rs746777181.C>T;rs747132274.C>G;rs747161261.C>T;rs747627716.A>C;rs747633945.C>T;rs747858350.G>A;rs747872037.C>A;rs748214188.A>T;rs748235192.1.T>A;rs748235192.2.T>C;rs748266854.G>A;rs748320430.A>C;rs748620513.C>A;rs748620513.C>G;rs748639205.A>C;rs748639205.A>G;rs748853941.T>C;rs748958293.G>A;rs748974194.G>A;rs749157068.C>A;rs749269410.C>T;rs749354734.A>T;rs749586100.T>A;rs749699298.A>C;rs749982106.G>A;rs750147471.T>C;rs75017182.G>C;rs750224169.G>A;rs750423752.A>C;rs750687600.C>T;rs750721736.A>T;rs751049055.C>A;rs751104498.T>C;rs751113340.G>A;rs751190912.G>A;rs751340819.A>G;rs751374989.T>A;rs751399062.G>T;rs751841116.1.C>A;rs751841116.2.C>T;rs75
1848058.T>A;rs752020412.C>T;rs752228747.G>A;rs752388408.C>T;rs752518145.C>A;rs752985272.C>A;rs753166888.C>G;rs753217888.G>C;rs753296078.C>G;rs753419296.C>G;rs753527420.C>G;rs753707032.G>A;rs753710779.G>A;rs753820482.T>C;rs753950237.G>A;rs754028972.A>G;rs754125729.1.G>A;rs754125729.2.G>T;rs754467630.G>A;rs754786483.T>C;rs755155824.C>A;rs755407188.T>G;rs755416212.C>T;rs755428442.C>G;rs755645831.A>C;rs755692084.T>G;rs755729055.T>C;rs756020314.G>C;rs756372042.A>G;rs756613407.T>C;rs756684474.T>C;rs756890859.T>C;rs756992995.C>T;rs757155354.T>C;rs757227327.C>T;rs757342874.C>T;rs757376267.C>A;rs757695236.C>T;rs757954074.C>T;rs757958938.T>C;rs757994597.G>A;rs758154803.A>G;rs758489611.C>T;rs758514990.C>T;rs758649719.C>T;rs758699471.T>C;rs759082282.C>A;rs759249769.G>T;rs759424419.A>T;rs759479759.T>C;rs759562628.T>G;rs759766897.T>C;rs759967863.A>G;rs760038956.C>T;rs760222167.T>C;rs760235888.C>T;rs760485592.G>A;rs760553268.G>C;rs760570391.A>G;rs760663364.G>A;rs760663364.G>C;rs761302217.T>C;rs761351410.G>A;rs761479700.G>C;rs761555670.T>C;rs761609256.T>G;rs762083671.T>A;rs762102298.A>C;rs762198241.G>A;rs762430779.G>T;rs762446803.A>C;rs762468894.G>C;rs762523739.T>A;rs762533012.C>T;rs762598766.T>C;rs762779297.T>C;rs762858106.C>T;rs762911226.T>A;rs763008163.T>G;rs763061658.A>G;rs763449831.C>T;rs763506271.T>C;rs763557204.A>G;rs763572567.T>G;rs763623595.A>C;rs763784786.G>C;rs763862486.C>T;rs763893877.T>C;rs763984510.G>C;rs764111543.C>T;rs764270260.G>A;rs764555085.A>G;rs764635955.G>T;rs764666241.C>A;rs764679468.A>C;rs764945792.C>T;rs765001324.C>T;rs765034707.C>A;rs765075551.T>C;rs765131182.G>A;rs765247038.G>A;rs765309287.G>T;rs765465250.T>C;rs765640386.C>A;rs765990958.G>A;rs766411970.A>C;rs766438205.T>C;rs766635900.C>T;rs766700777.C>G;rs766761199.T>G;rs766833304.G>C;rs766885021.A>C;rs767200577.T>C;rs767376585.C>G;rs767437717.G>T;rs767464878.C>A;rs767468952.C>T;rs767482279.A>G;rs767547827.G>C;rs767818267.C>T;rs767836989.T>C;rs767986711.T>G;rs768117152.T>C;rs768157853.G>C;rs768200107.T>G;
rs768288280.T>C;rs768501828.T>C;rs768507975.A>T;rs768680499.G>T;rs768915005.C>T;rs769190350.T>A;rs769306962.C>T;rs769466648.1.T>G;rs769466648.2.T>C;rs769514867.G>T;rs769696395.T>C;rs769709846.T>C;rs769820114.C>T;rs769847078.T>C;rs769932607.G>A;rs770229152.T>A;rs770566506.A>G;rs770958862.G>A;rs771194906.A>G;rs771534236.T>C;rs771536388.C>T;rs771573678.T>A;rs771646887.C>T;rs771648776.T>C;rs771885007.A>G;rs771930534.1.A>T;rs771930534.2.A>G;rs772097379.G>A;rs772264512.G>A;rs772320654.T>C;rs772358811.C>G;rs772544099.G>T;rs772826416.A>G;rs772906420.C>T;rs773159364.C>G;rs773407491.T>C;rs773584401.C>A;rs773652644.T>C;rs773815814.1.C>A;rs773815814.2.C>T;rs773868825.C>T;rs773983635.A>T;rs774134971.T>C;rs774500505.A>T;rs774579695.1.C>T;rs774799003.G>A;rs774883578.A>C;rs775494607.G>A;rs775526810.C>A;rs775570841.G>C;rs775601164.G>A;rs775926386.G>C;rs776082092.C>T;rs776236081.C>T;rs776289153.C>T;rs776321529.G>C;rs776662759.T>G;rs776973423.C>T;rs776984091.T>C;rs777220476.1.C>T;rs777220476.2.C>A;rs777238016.T>C;rs777347164.C>T;rs777368221.A>C;rs777425216.C>A;rs777425216.C>T;rs777560627.G>A;rs777673186.G>C;rs777902288.T>A;rs778022685.C>T;rs778054451.C>T;rs778141885.T>C;rs778298325.C>T;rs778601245.C>T;rs778754188.A>G;rs778760295.C>G;rs778776264.T>C;rs778867644.T>C;rs778911905.A>C;rs779465366.A>G;rs779557503.G>A;rs779573574.T>A;rs779728902.A>T;rs779925747.T>G;rs779967271.T>C;rs780025995.G>A;rs780047918.T>C;rs780120302.T>C;rs78060119.C>A;rs780813130.C>T;rs780873985.T>C;rs780885126.T>C;rs781184141.T>C;rs80081766.C>T;rs866110709.C>T;rs866869468.C>A;rs867143119.C>A;rs867226255.C>T;rs867232786.C>T;rs867600987.C>T;rs868047175.C>T;rs868235016.C>T

    DRD2

    Reference;rs1076560.C>A;rs1076560.C>G;rs1076563.A>C;rs1079596.C>A;rs1079596.C>T;rs1079597.C>T;rs1079598.A>G;rs1079598.A>T;rs1110976.T>G;rs11214607.T>G;rs1124491.G>A;rs1124491.G>C;rs1124493.T>G;rs1125394.T>C;rs12364283.A>G;rs12574471.C>G;rs12574471.C>T;rs17601612.G>C;rs1799732._113475530insG;rs1799732.dup;rs1799978.T>C;rs1800497.G>A;rs1800498.G>A;rs1801028.G>C;rs2075652.G>A;rs2234689.G>C;rs2283265.C>A;rs2440390.T>C;rs2514218.C>T;rs2587548.G>A;rs2587548.G>C;rs2734833.G>A;rs2734841.A>C;rs2734841.A>G;rs2734841.A>T;rs2734842.G>C;rs4274224.G>A;rs4274224.G>C;rs4436578.C>G;rs4436578.C>T;rs4460839.C>G;rs4460839.C>T;rs4648317.G>A;rs4648318.T>A;rs4648318.T>C;rs4648318.T>G;rs4936274.A>G;rs4936274.A>T;rs6275.A>G;rs6277.G>A;rs6279.G>C;rs7122246.G>A;rs7131056.A>C;rs7131056.A>G;rs7131440.C>T

    F13A1

    Reference;rs5985.C>A;rs5985.C>T

    F2

    Reference;rs1799963.G>A;rs3136516.G>A;rs5896.C>G;rs5896.C>T

    F5

    Reference;rs6025.C>T

    FKBP5

    Reference;rs1360780.T>A;rs1360780.T>C;rs17614642.T>C;rs3800373.C>A;rs3800373.C>G;rs4713916.A>C;rs4713916.A>G;rs4713916.A>T;rs73748206.C>T;rs9380524.C>A;rs9380524.C>T

    G6PD

    202G>A_376A>G_1264C>G;A;A- 202A_376G;A- 680T_376G;A- 968C_376G;Aachen;Abeno;Acrokorinthos;Alhambra;Amazonia;Amiens;Amsterdam;Anadia;Ananindeua;Andalus;Arakawa;Asahi;Asahikawa;Aures;Aveiro;B (reference);Bajo Maumere;Bangkok;Bangkok Noi;Bao Loc;Bari;Belem;Beverly Hills, Genova, Iwate, Niigata, Yamaguchi;Brighton;Buenos Aires;Cairo;Calvo Mackenna;Campinas;Canton, Taiwan-Hakka, Gifu-like, Agrigento-like;Cassano;Chatham;Chikugo;Chinese-1;Chinese-5;Cincinnati;Cleveland Corum;Clinic;Coimbra Shunde;Cosenza;Costanzo;Covao do Lobo;Crispim;Dagua;Durham;Farroupilha;Figuera da Foz;Flores;Fukaya;Fushan;Gaohe;Georgia;Gidra;Gond;Guadalajara;Guangzhou;Haikou;Hammersmith;Harilaou;Harima;Hartford;Hechi;Hermoupolis;Honiara;Ierapetra;Ilesha;Insuli;Iowa, Walter Reed, Springfield;Iwatsuki;Japan, Shinagawa;Kaiping, Anant, Dhon, Sapporo-like, Wosera;Kalyan-Kerala, Jamnaga, Rohini;Kambos;Kamiube, Keelung;Kamogawa;Kawasaki;Kozukata;Krakow;La Jolla;Lages;Lagosanto;Laibin;Lille;Liuzhou;Loma Linda;Ludhiana;Lynwood;Madrid;Mahidol;Malaga;Manhattan;Mediterranean Haplotype;Mediterranean, Dallas, Panama, Sassari, Cagliari, Birmingham;Metaponto;Mexico City;Miaoli;Minnesota, Marion, Gastonia, LeJeune;Mira d'Aire;Mizushima;Montalbano;Montpellier;Mt Sinai;Munich;Murcia Oristano;Musashino;Namouru;Nankang;Nanning;Naone;Nara;Nashville, Anaheim, Portici;Neapolis;Nice;Nilgiri;No name;North Dallas;Olomouc;Omiya;Orissa;Osaka;Palestrina;Papua;Partenope;Pawnee;Pedoplis-Ckaro;Piotrkow;Plymouth;Praha;Puerto Limon;Quing Yan;Radlowo;Rehevot;Rignano;Riley;Riverside;Roubaix;S. 
Antioco;Salerno Pyrgos;Santa Maria;Santiago;Santiago de Cuba, Morioka;Sao Borja;Seattle, Lodi, Modena, Ferrara II, Athens-like;Seoul;Serres;Shenzen;Shinshu;Sibari;Sierra Leone;Sinnai;Songklanagarind;Split;Stonybrook;Sugao;Sumare;Sunderland;Surabaya;Suwalki;Swansea;Taipei, Chinese-3;Telti, Kobe;Tenri;Tokyo, Fukushima;Toledo;Tomah;Tondela;Torun;Tsukui;Ube Konan;Union,Maewo, Chinese-2, Kalo;Urayasu;Utrecht;Valladolid;Vancouver;Vanua Lava;Viangchan, Jammu;Villeurbanne;Volendam;Wayne;West Virginia;Wexham;Wisconsin;Yunan

    GRIK1

    Reference;rs2832407.C>A;rs2832407.C>T

    GRIK4

    Reference;rs12800734.G>A;rs1954787.T>C

    GRIN2B

    Reference;rs1019385.C>A;rs1072388.G>A;rs1072388.G>C;rs1806191.G>A;rs1806191.G>T;rs1806201.G>A;rs2058878.T>A;rs2058878.T>C;rs2160733.A>C;rs2160734.C>G;rs2160734.C>T;rs2284411.C>T;rs890.A>C;rs890.A>G

    HLA-A

    *31:01;Reference

    HLA-B

    *15:02;*57:01;*58:01;Reference

    HMGCR

    Reference;rs10474433.T>C;rs10474433.T>G;rs12654264.A>T;rs17238540.T>G;rs17244841.A>T;rs17671591.C>T;rs3846662.A>G;rs3846662.A>T

    HTR2A

    Reference;rs17288723.T>C;rs17289304.T>C;rs17289304.T>G;rs1928040.G>A;rs1928040.G>C;rs2274639.C>G;rs2274639.C>T;rs2770296.C>G;rs2770296.C>T;rs3742278.A>G;rs3803189.T>G;rs6305.G>A;rs6311.C>A;rs6311.C>T;rs6312.C>A;rs6312.C>G;rs6312.C>T;rs6313.G>A;rs6313.G>C;rs6314.G>A;rs659734.G>A;rs659734.G>C;rs659734.G>T;rs7997012.A>C;rs7997012.A>G;rs7997012.A>T;rs9316233.C>A;rs9316233.C>G;rs9316233.C>T;rs9567746.A>C;rs9567746.A>G

    HTR2C

    Reference;rs1023574.C>G;rs1023574.C>T;rs12836771.A>G;rs1414334.C>G;rs2497538.A>C;rs3813928.G>A;rs3813929.C>G;rs3813929.C>T;rs498207.G>A;rs518147.C>A;rs518147.C>G;rs539748.C>T;rs6318.C>G;rs6318.C>T;rs9698290.T>A;rs9698290.T>C

    IFNL3/4

    Reference;rs12979860 variant (T)

    IL6

    Reference;rs10242595.G>A;rs10242595.G>C;rs10242595.G>T;rs10499563.T>C;rs1524107.C>G;rs1524107.C>T;rs1800795.C>G;rs1800795.C>T;rs1800796.G>A;rs1800796.G>C;rs1800797.A>C;rs1800797.A>G;rs1800797.A>T;rs2066992.G>A;rs2066992.G>C;rs2066992.G>T;rs2069835.T>C;rs2069837.A>C;rs2069837.A>G;rs2069840.C>G

    ITGB3

    Reference;rs11871251.G>A;rs11871251.G>C;rs2317676.A>G;rs3785873.G>A;rs3785873.G>T;rs58847127.G>A;rs58847127.G>C;rs58847127.G>T;rs5918.T>C;rs8069732.C>A;rs8069732.C>T

    KIF6

    Reference;rs20455.A>G;rs9462535.C>A;rs9462535.C>G;rs9462535.C>T;rs9471077.A>G

    LPA

    Reference;rs10455872.A>G;rs3798220.T>C

    MT-RNR1

    NC_012920.1:m.1520T>C;NC_012920.1:m.1537C>T;NC_012920.1:m.1556C>T;NC_012920.1:m.669T>C;NC_012920.1:m.747A>G;NC_012920.1:m.786G>A;NC_012920.1:m.807A>C;NC_012920.1:m.807A>G;NC_012920.1:m.839A>G;NC_012920.1:m.896A>G;NC_012920.1:m.930A>G;NC_012920.1:m.960delC;NC_012920.1:m.988G>A;Reference;rs1556422499.delT;rs200887992.G>A;rs267606617.A>G;rs267606618.T>C;rs267606619.C>T;rs28358569.A>G;rs28358571.T>C;rs28358572.T>C;rs3888511.T>G;rs56489998.A>G

    MTHFR

    Reference;rs1476413.C>G;rs1476413.C>T;rs17367504.A>G;rs17421511.G>A;rs1801131.T>G;rs1801133.G>A;rs1801133.G>C;rs2274976.C>T;rs3737967.G>A;rs4846051.G>A;rs4846051.G>C;rs4846051.G>T

    NUDT15

    *1;*10;*11;*12;*13;*14;*15;*16;*17;*18;*19;*2;*20;*3;*4;*5;*6;*7;*8;*9

    OPRD1

    Reference;rs1042114.G>C;rs1042114.G>T;rs10753331.G>A;rs10753331.G>T;rs12749204.A>G;rs204047.G>C;rs204047.G>T;rs204055.T>A;rs204055.T>C;rs204069.A>G;rs204076.T>A;rs204076.T>C;rs204076.T>G;rs2234918.C>G;rs2234918.C>T;rs2236855.C>A;rs2236855.C>G;rs2236857.T>C;rs2236861.G>A;rs2298895.A>T;rs2298896.T>G;rs2298897.C>G;rs3766951.T>C;rs419335.A>G;rs421300.A>C;rs421300.A>G;rs4654327.G>A;rs4654327.G>T;rs482387.G>A;rs482387.G>C;rs508448.A>G;rs529520.A>C;rs529520.A>G;rs533123.G>A;rs533123.G>C;rs569356.A>G;rs581111.A>C;rs581111.A>G;rs581111.A>T;rs6669447.T>C;rs678849.C>G;rs678849.C>T;rs680090.G>A;rs760589.G>A;rs797397.G>A

    OPRK1

    Reference;rs10111937.C>T;rs1051660.C>A;rs1051660.C>G;rs1051660.C>T;rs16918842.C>A;rs16918842.C>T;rs16918875.G>A;rs16918909.A>G;rs16918941.A>G;rs3802279.C>T;rs3802281.T>C;rs3808627.C>G;rs3808627.C>T;rs6473797.T>C;rs6473799.A>G;rs6985606.T>A;rs6985606.T>C;rs7016778.A>T;rs702764.T>C;rs702764.T>G;rs7813478.T>C;rs963549.C>T;rs997917.T>C

    OPRM1

    Reference;rs10457090.A>G;rs10457090.A>T;rs10485057.A>G;rs10485058.A>G;rs10485060.C>A;rs1074287.A>G;rs11575856.G>A;rs12190259.A>C;rs12205732.G>A;rs12209447.C>T;rs12210856.T>G;rs1294092.A>G;rs1319339.T>A;rs1319339.T>C;rs13195018.A>C;rs13195018.A>T;rs13203628.A>G;rs1323040.A>G;rs1323042.G>C;rs1323042.G>T;rs1381376.C>A;rs1381376.C>G;rs1381376.C>T;rs1461773.G>A;rs17174629.A>G;rs17174794.C>G;rs17174794.C>T;rs17174801.A>G;rs17180982.dup;rs17181352.A>G;rs1799971.A>G;rs1799972.C>A;rs1799972.C>G;rs1799972.C>T;rs1852629.T>A;rs1852629.T>C;rs1852629.T>G;rs2010884.G>A;rs2075572.G>C;rs2236256.C>A;rs2236257.G>C;rs2236258.C>G;rs2236258.C>T;rs2236259.T>A;rs2236259.T>C;rs2236259.T>G;rs2281617.C>G;rs2281617.C>T;rs3778148.G>T;rs3778150.T>C;rs3778151.T>C;rs3778152.A>G;rs3778156.A>G;rs3798676.C>T;rs3798677.A>G;rs3798678.A>C;rs3798678.A>G;rs3798683.G>A;rs3798688.G>T;rs3823010.G>A;rs483481.G>A;rs483481.G>C;rs4870266.G>A;rs495491.A>G;rs497976.G>A;rs497976.G>T;rs499796.A>G;rs506247.A>C;rs510769.C>T;rs511435.C>G;rs511435.C>T;rs518596.G>A;rs524731.C>A;rs527434.T>A;rs527434.T>C;rs538174.T>C;rs540825.A>C;rs540825.A>G;rs540825.A>T;rs544093.G>A;rs544093.G>T;rs548646.T>A;rs548646.T>C;rs548646.T>G;rs553202.C>T;rs558025.A>G;rs558948.C>G;rs558948.C>T;rs562859.C>A;rs562859.C>G;rs562859.C>T;rs563649.C>T;rs569284.A>C;rs583664.T>C;rs589046.C>T;rs598160.G>A;rs598160.G>C;rs598682.A>C;rs598682.A>G;rs598682.A>T;rs599548.G>A;rs606545.G>A;rs606545.G>C;rs609148.G>A;rs609148.G>T;rs609623.T>A;rs609623.T>C;rs610231.G>A;rs610231.G>C;rs613355.C>A;rs613355.C>G;rs613355.C>T;rs618207.A>C;rs618207.A>G;rs618207.A>T;rs62436463.C>T;rs62638690.G>T;rs632499.A>C;rs632499.A>G;rs632499.A>T;rs639855.C>A;rs639855.C>G;rs642489.G>A;rs642489.G>T;rs644261.G>A;rs644261.G>C;rs644261.G>T;rs645027.A>G;rs647192.G>A;rs647192.G>C;rs648007.A>C;rs648007.A>G;rs648893.A>G;rs650825.G>A;rs6557337.C>A;rs6557337.C>T;rs658156.A>C;rs658156.A>G;rs658156.A>T;rs671531.A>G;rs671531.A>T;rs675026.A>C;rs675026.A>G;rs677830.C>A;rs677830.C>G;rs677830.C>T;rs
681243.T>A;rs681243.T>C;rs6902403.T>C;rs6912029.G>T;rs73576470.A>G;rs7748401.T>G;rs7763748.C>A;rs7763748.C>T;rs7776341.A>C;rs79910351.C>T;rs9282815.C>A;rs9282815.C>T;rs9322446.G>A;rs9322447.A>C;rs9322447.A>G;rs9322447.A>T;rs9322453.G>C;rs9371773.G>A;rs9371776.G>A;rs9384174.C>G;rs9384174.C>T;rs9384179.G>A;rs9384179.G>T;rs9397685.A>G;rs9397685.A>T;rs9397687.C>T;rs9479757.G>A;rs9479779.A>G

    RYR1

    NC_000019.10:g.38440818G>C;NC_000019.10:g.38444179C>A;NC_000019.10:g.38444252G>T;NC_000019.10:g.38444257A>C;NC_000019.10:g.38444257A>G;NC_000019.10:g.38448680_38448681insGGA;NC_000019.10:g.38448715G>A;NC_000019.10:g.38451785C>A;NC_000019.10:g.38452985C>T;NC_000019.10:g.38455253C>G;NC_000019.10:g.38455254T>C;NC_000019.10:g.38455347T>C;NC_000019.10:g.38455504G>T;NC_000019.10:g.38466392G>A;NC_000019.10:g.38469404A>C;NC_000019.10:g.38485679T>C;NC_000019.10:g.38486095A>G;NC_000019.10:g.38490642A>C;NC_000019.10:g.38494454G>A;NC_000019.10:g.38496455G>A;NC_000019.10:g.38499234T>C;NC_000019.10:g.38499642C>A;NC_000019.10:g.38499667G>A;NC_000019.10:g.38499667G>T;NC_000019.10:g.38499680T>A;NC_000019.10:g.38499683G>A;NC_000019.10:g.38499696C>G;NC_000019.10:g.38499719A>G;NC_000019.10:g.38499730G>A;NC_000019.10:g.38499985A>T;NC_000019.10:g.38500000G>A;NC_000019.10:g.38502669C>G;NC_000019.10:g.38504298G>A;NC_000019.10:g.38506508C>G;NC_000019.10:g.38506865C>T;NC_000019.10:g.38507821C>T;NC_000019.10:g.38512279G>A;NC_000019.10:g.38515052C>T;NC_000019.10:g.38516181T>C;NC_000019.10:g.38516208G>C;NC_000019.10:g.38517470T>C;NC_000019.10:g.38517523T>A;NC_000019.10:g.38519424C>A;NC_000019.10:g.38519432A>T;NC_000019.10:g.38519447A>G;NC_000019.10:g.38525432C>T;NC_000019.10:g.38527710G>C;NC_000019.10:g.38528372G>T;NC_000019.10:g.38529002G>C;NC_000019.10:g.38529042C>T;NC_000019.10:g.38543380A>T;NC_000019.10:g.38543566G>A;NC_000019.10:g.38543810C>T;NC_000019.10:g.38548253A>T;NC_000019.10:g.38561140G>C;NC_000019.10:g.38561213C>T;NC_000019.10:g.38561362G>A;NC_000019.10:g.38561363G>T;NC_000019.10:g.38565023T>G;NC_000019.10:g.38570649C>G;NC_000019.10:g.38577931A>C;NC_000019.10:g.38578205G>T;NC_000019.10:g.38580039_38580040delinsAA;NC_000019.10:g.38580041C>A;NC_000019.10:g.38580126C>G;NC_000019.10:g.38580397G>C;NC_000019.10:g.38580416C>T;NC_000019.10:g.38585078A>G;NC_000019.10:g.38585099G>A;NC_000019.10:g.38586190A>G;NC_000019.10:g.38587362G>C;NC_000019.10:g.38587363G>C;Reference;rs111272095.C>T;
rs111364296.G>A;rs111565359.G>A;rs111657878.T>C;rs111888148.G>A;rs112151058.G>A;rs112196644.A>G;rs112563513.G>A;rs112596687.T>A;rs112772310.G>A;rs113210953.A>G;rs113332073.G>A;rs113332073.G>T;rs117886618.C>G;rs118192113.C>A;rs118192116.C>G;rs118192116.C>T;rs118192121.A>C;rs118192122.G>A;rs118192123.T>C;rs118192124.C>T;rs118192126.A>G;rs118192130.G>A;rs118192135.G>A;rs118192140.C>T;rs118192151.G>A;rs118192151.G>C;rs118192158.G>A;rs118192159.C>G;rs118192160.G>A;rs118192160.G>T;rs118192161.C>T;rs118192162.A>C;rs118192162.A>G;rs118192163.G>A;rs118192163.G>C;rs118192163.G>T;rs118192167.A>G;rs118192168.G>A;rs118192170.T>C;rs118192172.C>T;rs118192175.C>T;rs118192176.G>A;rs118192177.C>G;rs118192177.C>T;rs118192178.C>G;rs118192178.C>T;rs118192181.C>T;rs118204421.C>T;rs118204422.T>C;rs118204423.G>A;rs118204423.G>C;rs121918592.G>A;rs121918592.G>C;rs121918593.G>A;rs121918594.G>A;rs121918594.G>T;rs121918595.C>T;rs121918596._38499648delGAG;rs137932199.G>A;rs137933390.A>G;rs138874610.G>A;rs139161723.G>A;rs139647387.A>G;rs140152019.G>A;rs140616359.G>A;rs141646642.C>G;rs141942845.G>A;rs142474192.G>A;rs142474192.G>T;rs143398211.G>A;rs143520367.C>T;rs143987857.G>A;rs143988412.A>G;rs143988412.A>T;rs144336148.G>A;rs144685735.C>T;rs145573319.A>G;rs145801146.C>T;rs146306934.G>A;rs146429605.A>G;rs146504767.G>A;rs146876145.C>T;rs147136339.A>G;rs147213895.A>G;rs147303895.G>A;rs147707463.C>T;rs147723844.A>G;rs148399313.G>A;rs148623597.G>A;rs150396398.G>C;rs151029675.C>T;rs151119428.G>A;rs1801086.G>A;rs1801086.G>C;rs1801086.G>T;rs180714609.G>A;rs186983396.C>G;rs186983396.C>T;rs192863857.C>T;rs193922744.T>G;rs193922745._38440752delTGA;rs193922746.A>G;rs193922747.T>C;rs193922748.C>T;rs193922749.C>A;rs193922750.C>A;rs193922751.G>A;rs193922752.A>G;rs193922753.G>A;rs193922753.G>T;rs193922754.G>A;rs193922755.G>A;rs193922756.A>G;rs193922757.C>T;rs193922759.G>A;rs193922760.A>T;rs193922761.G>T;rs193922762.C>A;rs193922762.C>T;rs193922764.C>A;rs193922764.C>G;rs193922764.C>T;rs193922766.G>A;rs193922766.G>
T;rs193922767.G>A;rs193922767.G>T;rs193922768.C>A;rs193922768.C>T;rs193922769.T>C;rs193922769.T>G;rs193922770.C>T;rs193922772.G>A;rs193922772.G>T;rs193922775.C>T;rs193922776.C>T;rs193922777.C>T;rs193922781.C>T;rs193922782.T>G;rs193922783.T>A;rs193922788.G>C;rs193922789.G>A;rs193922790.A>T;rs193922791.C>T;rs193922792.G>T;rs193922793.T>A;rs193922795.G>A;rs193922797.G>A;rs193922798.G>C;rs193922799.G>A;rs193922801.A>G;rs193922802.G>A;rs193922803.C>T;rs193922804.A>G;rs193922805.T>G;rs193922806.C>G;rs193922807.G>C;rs193922809.G>A;rs193922810.G>A;rs193922810.G>T;rs193922812.C>T;rs193922813.G>C;rs193922815.G>A;rs193922815.G>C;rs193922816.C>T;rs193922817.C>T;rs193922818.G>A;rs193922819.T>C;rs193922822.C>G;rs193922822.C>T;rs193922824.C>T;rs193922826.C>G;rs193922826.C>T;rs193922827.G>C;rs193922828.G>A;rs193922829.G>A;rs193922830.C>T;rs193922831.T>A;rs193922832.G>A;rs193922833.G>A;rs193922834.G>A;rs193922838.G>A;rs193922838.G>T;rs193922839.G>A;rs193922840.T>G;rs193922842.C>G;rs193922842.C>T;rs193922843.G>T;rs193922844.C>A;rs193922848.A>T;rs193922849.C>A;rs193922850.T>C;rs193922852.G>C;rs193922852.G>T;rs193922853.A>T;rs193922855.C>T;rs193922860.G>A;rs193922862._38572267delinsCT;rs193922863.C>T;rs193922864.T>C;rs193922865.T>G;rs193922866.G>A;rs193922867.C>T;rs193922868.G>A;rs193922873.G>A;rs193922873.G>T;rs193922874.T>C;rs193922876.C>T;rs193922877.delA;rs193922878.C>G;rs193922879.G>A;rs193922880.C>G;rs193922883.T>C;rs193922888.G>A;rs193922895.C>A;rs193922896.G>T;rs193922898.T>A;rs199738299.A>G;rs199870223.C>T;rs200766617.G>A;rs201321695.A>G;rs2145447772.G>A;rs2145447772.G>C;rs28933396.G>A;rs28933396.G>T;rs28933397.C>T;rs34390345.A>G;rs34694816.A>G;rs34934920.C>T;rs35180584.C>G;rs35364374.G>T;rs370634440.G>A;rs370634440.G>T;rs372958050.T>C;rs373406011.C>T;rs375626634.T>C;rs375915752.C>T;rs376149732.C>T;rs4802584.C>G;rs537994744.G>A;rs549201486.C>T;rs551223467.C>T;rs553055844.G>A;rs55876273.G>C;rs587784372.C>T;rs63749869.G>A;rs727504129.C>T;rs746818096.T>A;rs747177274.G>C;rs7485751
33.T>A;rs749040743.G>A;rs751180702.G>A;rs752652072.C>T;rs754476250.C>T;rs754785770.A>G;rs755088027.G>A;rs756850145.A>G;rs757753317.G>A;rs759500310.T>C;rs761616815.G>A;rs762401851.G>A;rs763112609.C>T;rs763352221.C>T;rs767553612.A>G;rs768360593.G>A;rs768535909.T>C;rs769482889.C>T;rs770593660.G>C;rs771058055.G>A;rs771741606.C>T;rs773040531.A>G;rs778241277.G>A;rs781104539.A>G;rs781126470.C>T;rs901087791.G>A;rs914804033.G>A;rs914804033.G>C;rs917523269.C>T;rs936513262.G>A;rs959170123.G>A;rs976108591.A>G;rs995399438.T>C

    SLCO1B1

    *1;*10;*11;*12;*13;*14;*15;*16;*19;*2;*20;*23;*24;*25;*26;*27;*28;*29;*3;*30;*31;*32;*33;*34;*36;*37;*38;*39;*4;*40;*41;*42;*43;*44;*45;*46;*47;*5;*6;*7;*8;*9

    TNF

    Reference;rs1799724.C>T;rs1799964.T>C;rs1800610.G>A;rs1800629.G>A;rs1800630.C>A;rs1800750.G>A;rs2736195.A>G;rs3093548.C>T;rs3093662.A>G;rs3093726.T>C;rs361525.G>A;rs4248158.C>T;rs4248159.C>A;rs4248160.G>A;rs4248163.C>A;rs4248163.C>G;rs4248163.C>T;rs4647198.C>T;rs4987086.G>A;rs55634887.G>A;rs55994001.C>A;rs55994001.C>T

    TPMT

    *1;*10;*11;*12;*13;*14;*15;*16;*17;*18;*19;*2;*20;*21;*22;*23;*24;*25;*26;*27;*28;*29;*30;*31;*32;*33;*34;*35;*36;*37;*38;*39;*3A;*3B;*3C;*4;*40;*41;*42;*43;*44;*5;*6;*7;*8;*9

    UGT1A1

    *1;*27;*28;*36;*37;*6;*80;*80+*28;*80+*37

    UGT1A4

    *1a;*1b;*1c;*2;*3a;*3b;*4;*7

    UGT2B15

    *1;*2;*3;*4;*5;*6;*7

    VKORC1

    Reference;rs9923231 variant (T)

    YEATS4

    Reference;rs7297610.C>T
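The allele cells above are semicolon-separated lists in which many entries follow an `rsID.REF>ALT` pattern, while others are star alleles, named haplotypes, HGVS-style strings, or `Reference`. As a minimal sketch, assuming you have copied one cell into a string, the following Python splits it into entries and extracts the rsID and bases where the simple SNV pattern applies. The names `AlleleEntry` and `parse_allele_list` are hypothetical and not part of DRAGEN Array.

```python
import re
from typing import List, NamedTuple, Optional


class AlleleEntry(NamedTuple):
    raw: str                 # token exactly as listed in the table cell
    rsid: Optional[str]      # e.g. "rs2832407", or None if not a simple SNV token
    ref: Optional[str]
    alt: Optional[str]


# Matches only the simple "rsID.REF>ALT" single-base form seen in the table.
_SNV_TOKEN = re.compile(r"^(rs\d+)\.([ACGT])>([ACGT])$")


def parse_allele_list(cell: str) -> List[AlleleEntry]:
    """Split a semicolon-separated allele cell into structured entries."""
    entries = []
    for token in cell.split(";"):
        token = token.strip()
        if not token:
            continue
        match = _SNV_TOKEN.match(token)
        if match:
            entries.append(AlleleEntry(token, *match.groups()))
        else:
            # Star alleles, "Reference", indels, and HGVS strings are kept raw.
            entries.append(AlleleEntry(token, None, None, None))
    return entries


if __name__ == "__main__":
    for entry in parse_allele_list("Reference;rs2832407.C>A;rs2832407.C>T"):
        print(entry)
```

Tokens that do not match the simple pattern are deliberately left unparsed rather than guessed at.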

    *1 and *2 are not distinguishable due to the lack of probes for rs30193105. Samples with *2 will be called as *1.
  • *3 and *4 are not distinguishable due to the lack of probes for rs30193105, while *3 core variant rs2108622 is covered by all three products. Samples with *4 will be called as *3.

  • UGT1A1: *28 (rs8175347 [TA]8) and *37 (rs8175347 [TA]9) are not covered in all three PGx products due to the lack of functional probes.

  • UGT2B15: GSAv4-ePGx and GCRA-ePGx do not support *4 or *5 due to the lack of probes for rs4148269 and rs1902023.

  • CYP2D6: Due to the design of the probes, *40 (rs72549356[AAAGGGGCG]3) and *58 (rs72549356[AAAGGGGCG]2) cannot be distinguished. As a result, both alleles are reported as *40.

  • NUDT15: Due to the design of the probes, *6 (rs746071566dupGAGTCG) and *9 (rs746071566delGAGTCG) cannot be distinguished. As a result, both alleles are reported as *6.
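When comparing star allele results against an independent truth set, the merged alleles noted above need to be taken into account. The sketch below, in Python, collapses expected alleles to the label that would be reported before comparing diplotypes. Only the two gene-specific merges stated explicitly above (CYP2D6 and NUDT15) are encoded. The function names and the `*A/*B` diplotype string format are assumptions made for illustration.

```python
# Alleles that cannot be distinguished and the label they are reported as,
# taken from the CYP2D6 and NUDT15 notes above.
REPORTED_AS = {
    ("CYP2D6", "*58"): "*40",
    ("NUDT15", "*9"): "*6",
}


def as_reported(gene: str, allele: str) -> str:
    """Return the allele label expected in the output for a given true allele."""
    return REPORTED_AS.get((gene, allele), allele)


def diplotypes_match(gene: str, expected: str, observed: str) -> bool:
    """Compare two 'A/B' diplotype strings, ignoring allele order.

    The 'A/B' format is an assumption of this sketch.
    """
    def normalize(diplotype: str):
        return sorted(as_reported(gene, a.strip()) for a in diplotype.split("/"))

    return normalize(expected) == normalize(observed)


if __name__ == "__main__":
    print(diplotypes_match("CYP2D6", "*1/*58", "*40/*1"))
```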

  • GSA-PGx-48v4-0_20079540_E

    CYP2C9

    10:96701973,ilmnseq_rs774607211_ilmnfwd_ilmndup1,ilmnseq_rs774607211_ilmnfwd_ilmndup2

    GSA-PGx-48v4-0_20079540_E

    CYP2D6

    ilmnseq_rs1135836_ilmnrev_deg3a3b0

    GSA-PGx-48v4-0_20079540_E

    CYP2D6

    PGX_IlmnSeq_rs769157652_BEST,ilmnseq_rs769157652_ilmnrev_F2BT,ilmnseq_rs769157652_ilmnrev_deg3a1b0

    GSA-PGx-48v4-0_20079540_E

    CYP4F2

    ilmnseq_rs4020346_ilmnfwd

    GSA-PGx-48v4-0_20079540_E

    OPRM1

    ilmnseq_rs9384179.1_F2BT

    GSA-PGx-48v4-0_20079540_E

    UGT1A1

    ilmnseq_rs8175347.2_ilmnrev_F2BTindel_cei_ilmndup31

    GCRA-PGx-24v1-0_20084467_C

    CYP2C19

    IlmnSeq_rs113934938,ilmnseq_rs113934938_ilmnfwd,ilmnseq_rs113934938_ilmnfwd_ilmndup1,ilmnseq_rs113934938_ilmnfwd_ilmndup2,rs113934938

    GCRA-PGx-24v1-0_20084467_C

    CYP2C9

    ilmnseq_rs774607211_ilmnrev,ilmnseq_rs774607211_ilmnrev_ilmndup2

    GCRA-PGx-24v1-0_20084467_C

    CYP2D6

    ilmnseq_rs2004511_dup1

    GCRA-PGx-24v1-0_20084467_C

    CYP2D6

    PGX_IlmnSeq_rs769157652_BEST,ilmnseq_rs769157652_ilmnrev,ilmnseq_rs769157652_ilmnrev_deg3a1b0,ilmnseq_rs769157652_ilmnrev_deg3a1b0_ilmndup1,ilmnseq_rs769157652_ilmnrev_ilmndup1,seq-rs61737947

    GCRA-PGx-24v1-0_20084467_C

    CYP4F2

    ilmnseq_rs4020346_ilmnfwd

    GCRA-PGx-24v1-0_20084467_C

    OPRM1

    ilmnseq_rs9384179.1_F2BT

    ABCG2

    Reference;rs2231142.G>T

    ADH1B

    Reference;rs1229984.T>C;rs1229984.T>G;rs1229985.A>G;rs17033.T>C;rs1789891.C>A;rs2018417.C>A;rs2018417.C>T;rs2066702.G>A;rs75967634.C>T

    ALDH2

    Reference;rs671.G>A

    ANK3

    Reference;rs143414470.T>C

    ANKK1

    Reference;rs1800497.G>A;rs2587550.G>A;rs2734849.A>C;rs2734849.A>G;rs4938013.A>C;rs4938013.A>G;rs4938013.A>T;rs7118900.G>A;rs7118900.G>C

    APOE

    E2;E3;E4

    ATM

    GDA-ePGx

    GDAePGx_G2_GS_import.txt

    GDA-ePGx G2 product files

    GSAv4-ePGx

    GSAePGx_E2_GS_import.txt

    GSAv4-ePGx product files

    GCRA-ePGx

    GCRAePGx_E2_GS_import.txt

    GCRA-ePGx product files

    GDA_PGx-8v1-0_20042614_G

    CYP1A2

    ilmnseq_rs35694136_ilmnfwd;ilmnseq_rs35694136_ilmnfwd_ilmndup1;ilmnseq_rs35694136_ilmnfwd_ilmndup2;ilmnseq_rs35694136_ilmnfwd_ilmndup3;ilmnseq_rs35694136_ilmnfwd_ilmndup4;ilmnseq_rs35694136_ilmnfwd_ilmndup5;ilmnseq_rs35694136_ilmnfwd_ilmndup6;ilmnseq_rs35694136_ilmnfwd_ilmndup7

    GDA_PGx-8v1-0_20042614_G

    CYP2D6

    ilmnseq_rs72549352_ilmnrev_F2BTindel_deg3a3b3_IlmnRep;ilmnseq_rs72549352_ilmnrev_F2BTindel_deg3a3b3_ilmndup1;ilmnseq_rs72549352_ilmnrev_F2BTindel_deg3a3b3_ilmndup3;ilmnseq_rs72549352_ilmnrev_F2BTindel_ilmndup1;ilmnseq_rs72549352_ilmnrev_F2BTindel_ilmndup3

    GDA_PGx-8v1-0_20042614_G

    CYP4F2

    ilmnseq_rs4020346_ilmnfwd

    GDA_PGx-8v1-0_20042614_G

    UGT1A1

    GSA-PGx-48v4-0_20079540_E

    CYP1A2

    IlmnSeq_rs35694136_IlmnFWD;ilmnseq_rs35694136_ilmnfwd_ilmndup2;ilmnseq_rs35694136_ilmnfwd_ilmndup3;ilmnseq_rs35694136_ilmnfwd_ilmndup4;ilmnseq_rs35694136_ilmnfwd_ilmndup5;ilmnseq_rs35694136_ilmnfwd_ilmndup6;ilmnseq_rs35694136_ilmnfwd_ilmndup7

    GSA-PGx-48v4-0_20079540_E

    CYP2C19

    IlmnSeq_rs367543002,ilmnseq_rs367543002_ilmnfwd,ilmnseq_rs367543002_ilmnfwd_ilmndup1,ilmnseq_rs367543002_ilmnrev_deg3a1b0_ilmndup1,rs367543002

    GSA-PGx-48v4-0_20079540_E

    CYP2C19

    ilmnseq_rs17882687_ilmnfwd_ilmndup2,ilmnseq_rs17882687_ilmnrev,ilmnseq_rs17882687_ilmnrev_ilmndup1,ilmnseq_rs17882687_ilmnrev_ilmndup2

    GSA-PGx-48v4-0_20079540_E

    CYP2C19

    GCRA-PGx-24v1-0_20084467_C

    COMT

    ilmnseq_rs7287550_ilmnfwd_F2BT

    GCRA-PGx-24v1-0_20084467_C

    CYP1A2

    IlmnSeq_rs35694136;IlmnSeq_rs35694136_IlmnFWD;ilmnseq_rs35694136_ilmnfwd_ilmndup2;ilmnseq_rs35694136_ilmnfwd_ilmndup5;ilmnseq_rs35694136_ilmnfwd_ilmndup6;rs35694136

    GCRA-PGx-24v1-0_20084467_C

    CYP2C19

    ilmnseq_rs367543002_ilmnfwd

    GCRA-PGx-24v1-0_20084467_C

    CYP2C19

    How to use the auxiliary file

    Reference;rs11212570.G>A;rs11212570.G>T;rs11212617.C>A;rs1801516.G>A;rs620815.T>A;rs620815.T>C

    ilmnseq_rs8175347_ilmnfwd_F2BTindel;ilmnseq_rs8175347_ilmnfwd_F2BTindel_ilmndup1;ilmnseq_rs8175347_ilmnrev;ilmnseq_rs8175347_ilmnrev_ilmndup1;ilmnseq_rs8175347_ilmnrev_ilmndup2;ilmnseq_rs8175347_ilmnrev_ilmndup3

    IlmnSeq_rs113934938,ilmnseq_rs113934938_ilmnfwd,ilmnseq_rs113934938_ilmnfwd_ilmndup1,ilmnseq_rs113934938_ilmnfwd_ilmndup2,rs113934938

    ilmnseq_rs17882687_ilmnfwd_ilmndup2,ilmnseq_rs17882687_ilmnrev_ilmndup1

    DRAGEN Array Applications

    The following Types of Analysis are currently supported by DRAGEN Array:

    • DRAGEN Array – Genotyping

    • DRAGEN Array – PGx – CNV calling

    • DRAGEN Array – PGx – Star allele annotation

    • DRAGEN Array – Methylation QC

    • DRAGEN Array – Cytogenetics analysis

    • DRAGEN Array - Cytogenetics analysis + Emedgene interpretation

    Product & Analysis Compatibility

    These products/beadchips have been verified to be compatible with the following analyses and versions of DRAGEN Array:

    Manifest Name
    DRAGEN Array Cloud Version(s)
    DRAGEN Array Local Version(s)
    Analysis
    Genome(s)

    DRAGEN Array – Genotyping

    Item
    Description

    DRAGEN Array – PGx – CNV calling

    Item
    Description

    DRAGEN Array – PGx – Star Allele Annotation

    Item
    Description

    DRAGEN Array – Methylation QC

    Item
    Description

    DRAGEN Array – Cytogenetics analysis

    Item
    Description

    DRAGEN Array - Cytogenetics analysis + Emedgene interpretation

    Item
    Description

    v1.0, v1.1

    v1.0+

    DRAGEN Array – Genotyping

    GRCh37, GRCh38

    v1.0, v1.1

    v1.0+

    DRAGEN Array – PGx – CNV calling

    GRCh37, GRCh38

    v1.0

    v1.0

    DRAGEN Array – PGx – Star allele annotate

    GRCh38

    v1.0, v1.1

    v1.0+

    DRAGEN Array – Genotyping

    GRCh38

    v1.0, v1.1

    v1.0+

    DRAGEN Array – PGx – CNV Calling

    GRCh38

    v1.1+

    v1.1+

    DRAGEN Array – PGx – Star allele annotate

    GRCh38

    v1.0, v1.1

    v1.0+

    DRAGEN Array – Genotyping

    GRCh37, GRCh38

    v1.0, v1.1

    v1.0+

    DRAGEN Array – Genotyping

    GRCh38

    v1.0, v1.1

    v1.0+

    DRAGEN Array – PGx – CNV Calling

    GRCh38

    v1.1+

    v1.1+

    DRAGEN Array – PGx – Star allele annotate

    GRCh38

    v1.0, v1.1

    v1.0+

    DRAGEN Array – Genotyping

    GRCh38

    v1.0, v1.1

    v1.0+

    DRAGEN Array – PGx – CNV Calling

    GRCh38

    v1.1+

    v1.1+

    DRAGEN Array – PGx – Star allele annotate

    GRCh38

    v1.1+

    v1.0+

    DRAGEN Array – Genotyping

    GRCh37

    v1.0

    N/A

    DRAGEN Array – Methylation – QC

    GRCh38

    v1.0

    N/A

    DRAGEN Array – Methylation – QC

    GRCh38

    v1.0

    N/A

    DRAGEN Array – Methylation – QC

    GRCh38

    v1.2+

    v1.2+

    DRAGEN Array – Cytogenetics analysis + Emedgene interpretation

    GRCh37, GRCh38

    v1.3+

    v1.3+

    DRAGEN Array – Cytogenetics analysis + Emedgene interpretation

    GRCh37, GRCh38

    v1.2+

    v1.2+

    DRAGEN Array – Cytogenetics analysis + Emedgene interpretation

    GRCh37, GRCh38

    v1.2+

    v1.2+

    DRAGEN Array – Cytogenetics analysis + Emedgene interpretation

    GRCh37, GRCh38

    v1.2+

    v1.2+

    DRAGEN Array – Cytogenetics analysis

    GRCh37, GRCh38

    v1.3+

    v1.3+

    DRAGEN Array – Cytogenetics analysis

    GRCh37, GRCh38

    v1.2+

    v1.2+

    DRAGEN Array – Cytogenetics analysis

    GRCh37, GRCh38

    v1.2+

    v1.2+

    DRAGEN Array – Cytogenetics analysis

    GRCh37, GRCh38

    Inputs

    •

    • [may be pre-setup on cloud]

    • [may be pre-setup on cloud]

    • [pre-setup on cloud]

    • [optional on cloud and local]

    Outputs

    Per sample:

    •

    • [optional on cloud and local]

    • [optional on cloud and local]

    Per analysis batch:

    •

    • [cloud only]

    • [cloud only]

    Cost

    Local: No cost download from .

    Cloud: to analyze and store data as needed.

    Inputs

    •

    • [may be pre-setup on cloud]

    • [may be pre-setup on cloud]

    • [pre-setup on cloud]

    • [pre-setup on cloud]

    • [optional on cloud and local]

    Outputs

    Per sample:

    •

    • [optional on local]

    • [optional on local]

    •

    • [optional on local]

    Per analysis batch:

    •

    Cost

    Local: No cost download from .

    Cloud: to analyze and store data as needed.

    Inputs

    •

    • [may be pre-setup on cloud]

    • [may be pre-setup on cloud]

    • [pre-setup on cloud]

    • [pre-setup on cloud]

    • [pre-setup on cloud]

    • [optional on cloud and local]

    Outputs

    Per sample:

    •

    • [optional on local]

    • [optional on local]

    •

    • [optional on local]

    •

    Per analysis batch:

    Cost

    Local: Per sample analysis.

    Cloud: Per sample analysis. to store data as needed.

    Visit the to learn more.

    Inputs

    • [from iScan instrument] • [may be pre-setup on cloud] • [optional on cloud]

    Outputs

    Per sample: • • Per analysis batch: • • • • •

    Cost

    Cloud: to analyze and store data as needed.

    Inputs

    •

    • [may be pre-setup on cloud]

    • [may be pre-setup on cloud]

    • [pre-setup on cloud]

    • [only necessary for local]

    • [optional]

    Outputs

    Per sample:

    • [optional on cloud]

    • [optional on local and cloud]

    • [optional on local and cloud for snv vcf]

    •

    •

    • [optional on local]

    Per analysis batch:

    Cost

    Local: No cost download from .

    Cloud: to analyze and store data as needed.

    Inputs

    •

    • [may be pre-setup]

    • [may be pre-setup]

    • [may be pre-setup]

    • [optional]

    Outputs

    Per sample:

    • [optional]

    • [optional]

    • [optional for snv vcf]

    •

    •

    •

    Per analysis batch:

    Cost

    Cloud: to analyze and store data as needed. As well as additional sample-based costs if uploaded into the interface.

    BovineSNP50_v3_A

    v1.0, v1.1

    v1.0+

    DRAGEN Array – Genotyping

    UMD3

    GDA-8v1-0_D

    v1.0, v1.1

    v1.0+

    DRAGEN Array – Genotyping

    GRCh37, GRCh38

    Summary

    Provides genotyping results for any human Infinium genotyping array.

    Variant types detected

    SNV

    Indel

    Sample minimum

    1 sample

    Arrays supported

    Any human Infinium genotyping array, including custom and semi-custom arrays, can be used to create an SNV VCF output. Illumina provides the Genome FASTA Files required to map to the human reference genome (builds 37 and 38). DRAGEN Array Cloud offers additional output formats, including Locus Summary and Final Report, which are applicable to Infinium arrays for human and non-human species.

    Related Local Commands

    genotype call

    genotype gtc-to-vcf

    Related Cloud Specifics

    Select Type of Analysis DRAGEN Array – Genotyping from the dropdown. A maximum of 1152 samples is supported.

    Summary

    Provides CNV calling on 7 target PGx genes across 10 target regions, plus genotyping outputs.

    Variant types detected

    SNV

    Indel

    CNV

    Sample minimum

    Minimum of 24 samples, of which at least 22 must pass QC (defined as Log R Dev < 0.2). For best results, 96 samples are recommended.

    Arrays supported

    See Product & Analysis Compatibility for the supported arrays.

    See Pharmacogenomic Analysis for semi-custom arrays for further detail.

    Related Local Commands

    genotype call

    genotype gtc-to-vcf [optional]

    pgx copy-number call

    Related Cloud Specifics

    Select Type of Analysis DRAGEN Array – PGx – CNV calling from the dropdown. A maximum of 384 samples is supported.

    Summary

    Provides PGx annotation on over 50 genes, plus PGx CNV and genotyping outputs.

    Variant types detected

    SNV

    Indel

    CNV

    Star allele diplotype

    Sample minimum

    Minimum of 24 samples, of which at least 22 must pass QC (defined as Log R Dev < 0.2). For best results, 96 samples are recommended.

    Arrays supported

    See Product & Analysis Compatibility for the supported arrays.

    See Pharmacogenomic Analysis for semi-custom arrays for further detail.

    Related Local Commands

    genotype call

    genotype gtc-to-vcf

    pgx copy-number call

    pgx star-allele call

    pgx star-allele annotate

    Related Cloud Specifics

    Select Type of Analysis DRAGEN Array – PGx – Star Allele Annotation from the dropdown. A maximum of 384 samples is supported.
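The PGx sample minimum stated above (24 samples, 22 of them passing QC with Log R Dev < 0.2, and 96 recommended) can be checked before submitting a batch. The following Python is a minimal sketch under the assumption that you already have a Log R Dev value per sample, for example from a genotype summary file. The function name and the input dictionary are hypothetical, and the sketch treats 22 as a lower bound on passing samples.

```python
from typing import Dict, Mapping

MIN_BATCH_SIZE = 24          # minimum samples per PGx batch
MIN_PASSING = 22             # samples that must pass QC
LOG_R_DEV_LIMIT = 0.2        # a sample passes when Log R Dev is below this
RECOMMENDED_BATCH_SIZE = 96  # recommended for best results


def check_pgx_batch(log_r_dev: Mapping[str, float]) -> Dict[str, object]:
    """Summarize whether a batch meets the stated PGx sample minimum.

    log_r_dev maps a sample ID to its Log R Dev value (hypothetical input).
    """
    total = len(log_r_dev)
    passing = sum(1 for value in log_r_dev.values() if value < LOG_R_DEV_LIMIT)
    return {
        "total": total,
        "passing": passing,
        "meets_minimum": total >= MIN_BATCH_SIZE and passing >= MIN_PASSING,
        "meets_recommended": total >= RECOMMENDED_BATCH_SIZE,
    }


if __name__ == "__main__":
    batch = {f"sample_{i}": 0.12 for i in range(24)}
    print(check_pgx_batch(batch))
```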

    Summary

    Provides methylation QC for Infinium methylation arrays.

    Variant types detected

    N/A

    Sample minimum

    1 sample

    Arrays supported

    Recommended thresholds and all built-in control probes are available for Methylation Screening Array (MSA) and MethylationEPIC (v1 & v2) originating from iScan. In non-human and custom arrays, availability of built-in QC probes may vary, and failure thresholds must be defined by the user.

    Related Local Commands

    Not available on DRAGEN Array Local.

    Related Cloud Specifics

    Select Type of Analysis DRAGEN Array – Methylation – QC from the dropdown. Adjust customizable thresholds as desired. Further detail can be found in Additional information for DRAGEN Array Methylation QC. A maximum of 1152 samples are supported.

    Summary

    Provides cytogenetic genome-wide copy number and loss of heterozygosity calling.

    Variant types detected

    CNV

    LOH

    Sample minimum

    Minimum of 1 sample.

    Arrays supported

    See Product & Analysis Compatibility for the supported arrays.

    Related Local Commands

    genotype call

    genotype gtc-to-vcf [optional]

    genotype gtc-to-bedgraph

    cyto call

    cyto annotate

    Related Cloud Specifics

    Select Type of Analysis DRAGEN Array – Cytogenetics analysis from the dropdown. A maximum of 1152 samples is supported.

    Summary

    Provides cytogenetic genome-wide copy number and loss of heterozygosity calling. This analysis type integrates with Emedgene via Automatic Case Creation from ICA on cloud only.

    Variant types detected

    CNV

    LOH

    Sample minimum

    Minimum of 1 sample.

    Arrays supported

    See Product & Analysis Compatibility for the supported arrays.

    Related Local Commands

    Not available on DRAGEN Array Local.

    Related Cloud Specifics

    Select Type of Analysis DRAGEN Array - Cytogenetics analysis + Emedgene interpretation from the dropdown. A maximum of 1152 samples is supported.
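Each cloud analysis type above has a maximum number of samples per analysis (1152 or 384). As a rough sketch, a larger sample list could be divided into submissions that respect those limits as follows. The short keys in `MAX_SAMPLES` and the function name are hypothetical labels chosen for this example, not identifiers used by DRAGEN Array.

```python
from typing import Dict, List, Sequence

# Per-analysis sample maximums as stated in the Related Cloud Specifics above.
# The keys are illustrative short names, not product identifiers.
MAX_SAMPLES: Dict[str, int] = {
    "genotyping": 1152,
    "pgx_cnv": 384,
    "pgx_star_allele": 384,
    "methylation_qc": 1152,
    "cyto": 1152,
    "cyto_emedgene": 1152,
}


def split_into_batches(sample_ids: Sequence[str], analysis: str) -> List[List[str]]:
    """Split sample IDs into consecutive batches no larger than the limit."""
    limit = MAX_SAMPLES[analysis]
    return [list(sample_ids[i:i + limit]) for i in range(0, len(sample_ids), limit)]


if __name__ == "__main__":
    samples = [f"sample_{i}" for i in range(400)]
    print([len(batch) for batch in split_into_batches(samples, "pgx_cnv")])
```

For the PGx analyses, a simple split like this can leave a final batch below the 24-sample minimum, so batch sizes may need rebalancing before submission.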


    GDA_PGx-8v1-0_20042614_E
    GDA_PGx-8v1-0_20042614_E
    GDA_PGx-8v1-0_20042614_E
    GDA_PGx-8v1-0_20042614_G
    GDA_PGx-8v1-0_20042614_G
    GDA_PGx-8v1-0_20042614_G
    GSA-24v3-0_A
    GSA-PGx-48v4-0_20079540_E
    GSA-PGx-48v4-0_20079540_E
    GSA-PGx-48v4-0_20079540_E
    GCRA-PGx-24v1-0_20084467_C
    GCRA-PGx-24v1-0_20084467_C
    GCRA-PGx-24v1-0_20084467_C
    PRSbooster_20083382_A
    EPIC-8v1-0_B5
    EPIC-8v2-0_A2
    MSA-48v1-0_20102838_A1
    CytoSNP-850Kv1-4_iScan_B
    CytoSNP-850Kv1-4_NS550_B
    GSACyto-24v1_20044998_C
    GDACyto-8v1-0_20047166_E
    CytoSNP-850Kv1-4_iScan_B
    CytoSNP-850Kv1-4_NS550_B
    GSACyto-24v1_20044998_C
    GDACyto-8v1-0_20047166_E
    IDAT(s)
    Manifest Files
    Cluster File
    Genome FASTA Files
    Sample Sheet
    Genotype Call (GTC) File
    SNV VCF File
    TBI Index File
    Genotype Summary Files
    Final Report
    Locus Summary
    Illumina Support Site
    iCredits
    IDAT(s)
    Manifest Files
    Cluster File
    Genome FASTA Files
    PGx CN Model File
    Sample Sheet
    Genotype Call (GTC) File
    SNV VCF File
    TBI Index File
    PGx CNV VCF File
    BedGraph Files
    Genotype Summary Files
    Illumina Support Site
    iCredits
    IDAT(s)
    Manifest Files
    Cluster File
    Genome FASTA Files
    PGx CN Model File
    PGx Database File
    Sample Sheet
    Genotype Call (GTC) File
    SNV VCF File
    TBI Index File
    PGx CNV VCF File
    BedGraph Files
    Star Allele JSON File
    iCredits
    Illumina Product Page
    IDAT(s)
    Manifest Files
    Sample Sheet
    Methylation Control Probe Output File
    Methylation CG Output File
    Methylation Sample QC Summary Files
    Methylation Sample QC Summary Plots
    Methylation Principal Component Summary
    Methylation Manifest Files
    Methylation Logs and Error Files
    iCredits
    IDAT(s)
    Manifest Files
    Cluster File
    Cytogenetics Model File
    Cytogenetics Database File
    Sample Sheet
    Genotype Call (GTC) File
    SNV VCF File
    TBI Index File
    Cytogenetics CNV VCF File
    Cytogenetics Annotation JSON File
    BedGraph Files
    Illumina Support Site
    iCredits
    IDAT(s)
    Manifest Files
    Cluster File
    Cytogenetics Model File
    Sample Sheet
    Genotype Call (GTC) File
    SNV VCF File
    TBI Index File
    Cytogenetics CNV VCF File
    Cytogenetics Annotation JSON File
    BedGraph Files
    iCreditsarrow-up-right
    Emedgenearrow-up-right
    Warning/Error Messages
    CN Summary File
    Copy Number Batch File
    Warning/Error Messages
    Star Allele CSV File
    Genotype Summary Files
    CN Summary File
    Copy Number Batch File
    Warning/Error Messages
    Genotype Summary Files
    Warning/Error Messages
    Genotype Summary Files
    Warning/Error Messages

    DRAGEN Array Local Analysis

    DRAGEN Array Local Overview

    DRAGEN Array provides accurate, comprehensive, and efficient analysis of Infinium microarray data. The local command-line interface gives power users the granular control and flexibility needed to support large-scale microarray genomic studies.

    Getting Started

    DRAGEN Array Local uses a command-line interface, which allows full user control of software functionality and easy automation of tasks. The software is designed for power users and bioinformaticians. If you are new to the command-line interface, review Tips for using the Command-line interface.

    Computing Requirements

    Before downloading and installing the software, ensure the following specifications are met for best performance:

    Category
    Recommendation

    Note on Cybersecurity: DRAGEN Array does not need to run as an administrative user. We recommend that you do not run it with elevated permissions.

    Quota Specifications

    The star-allele call command in DRAGEN Array Local requires quota to run. Quota is charged per sample analyzed and can be purchased on the . Quota is consumed for every sample analyzed, including re-analyzed and low-quality samples. Quota is checked before and after analysis but not after updating the usage. To see the current usage after a run, re-run the command.

    The credential provided in the activation email after purchasing should be used as an input to the star-allele call command through the "--license-server-url" option. During runtime, the will record the remaining quota at the beginning and the end of the analysis.

    Internet is required to do a software license check and ensure paid quota is available for all samples in the analysis batch. For the software license check, the following endpoints are used:

    • In v1.0 and v1.1: license.edicogenome.com

    • In v1.2+: license.dragen.illumina.com
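    As a minimal sketch, the endpoint list above can be expressed as a small shell helper, for example to decide which host to allow through a firewall. The function name license_host is hypothetical, and the sketch assumes a version string of the form major.minor or major.minor.patch, and that versions after v1.x continue to use the v1.2+ endpoint.

```shell
# Sketch: pick the license-check endpoint for a given DRAGEN Array version.
# Host names come from the list above; license_host is a hypothetical helper.
license_host() {
    version="$1"                 # e.g. "1.3.0" or "v1.1.0"
    version="${version#v}"       # drop an optional leading "v"
    major="${version%%.*}"       # text before the first dot
    rest="${version#*.}"         # text after the first dot
    minor="${rest%%.*}"          # text before the next dot, if any
    if [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 2 ]; }; then
        echo "license.dragen.illumina.com"
    else
        echo "license.edicogenome.com"
    fi
}
```

    For example, license_host 1.1.0 prints license.edicogenome.com, and license_host 1.3.0 prints license.dragen.illumina.com.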

    NOTES:

    • Do not use license.dragen.illumina.com license server URLs when running DRAGEN Array v1.0 or v1.1, because that domain works only with v1.2 and later. This is described in the and known issues.

    • In v1.1+, precomputed quota is no longer checked during analysis. As a result, an over-quota analysis run does not fail until the end of the run. For example, if quota is available for only 6 samples but the analysis run contains 8 samples, the analysis proceeds as normal until usage is updated at the end of the run. The software then produces the error "Error updating usage. HTTP error status code: 409" and does not write the results to disk.
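    Because an over-quota run fails only at the end, it can help to count the samples in a batch before starting. The following is a minimal sketch, not part of DRAGEN Array. It assumes each sample has exactly one file ending in _Grn.idat (paired with a _Red.idat file), and that you supply the remaining quota yourself. The function names are hypothetical.

```shell
# Sketch: count samples in an IDAT folder and compare with the remaining quota.
# Assumes each sample has one *_Grn.idat file (paired with a *_Red.idat file).
count_samples() {
    # Search subfolders too, since --idat-folder includes subdirectories.
    find "$1" -type f -name '*_Grn.idat' | wc -l | tr -d ' '
}

# Returns 0 when the batch fits within the quota you know is left.
batch_fits_quota() {
    idat_folder="$1"
    remaining_quota="$2"        # number taken from your own records
    samples=$(count_samples "$idat_folder")
    if [ "$samples" -le "$remaining_quota" ]; then
        echo "OK: $samples samples, quota $remaining_quota"
    else
        echo "OVER QUOTA: $samples samples, quota $remaining_quota"
        return 1
    fi
}
```

    With the example from the note above (8 samples, quota for 6), batch_fits_quota /user/IDATs 6 reports OVER QUOTA and returns a non-zero status, so the analysis can be split before any compute time is spent.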

    Installation

    Please follow the steps below to install the software on your compute infrastructure:

    1. Click on the latest DRAGEN Array version installation package for the platform of your choice. Installers for Windows and Linux are available on the . Once the download is complete, move the DRAGEN Array installation package to the desired folder. Administrative permissions may be required for system folders, for example /usr/local/bin on Linux and C:\Program Files on Windows. Note: The examples in the remainder of this document assume Linux.

    2. Unzip and extract the package. The executable can be found in the dragena subfolder of the software download after extraction.

    If the installation was successful, running the dragena version command displays the version of the software in the terminal window.
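    As a rough sketch, the installation check can be wrapped in a small shell function that adds the extracted dragena subfolder to PATH for the current session and then runs dragena version. The function name check_dragena and its folder argument are hypothetical. Only the dragena version command itself comes from this guide.

```shell
# Sketch: confirm that the dragena executable is reachable and print its version.
# The argument is the extracted "dragena" subfolder of the software download.
check_dragena() {
    dragena_dir="$1"
    PATH="$dragena_dir:$PATH"       # applies to the current shell session only
    export PATH
    if ! command -v dragena >/dev/null 2>&1; then
        echo "dragena not found in $dragena_dir" >&2
        return 1
    fi
    dragena version
}
```

    For example: check_dragena /path/to/dragena. To make the PATH change permanent, the same PATH line would typically go into the shell profile.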

    Run DRAGEN Array Local

    For genotyping or cytogenetic analysis, there is no sample minimum required to run analysis.

    For PGx CNV analysis, a minimum of 24 samples is required to run analysis. For a successful analysis, 22 samples must pass QC, defined as having log R dev < 0.2. With the standard hardware specification in Computing Requirements, up to 500 GDA-ePGx samples can be processed per analysis batch.

    To optimize performance of the targeted PGx CNV caller and minimize batch effect, it is recommended to:

    • Group samples in the same assay batch (e.g., whole genome amplification and targeted gene amplification assay batch) into the same analysis batch.

    • Avoid combining sample batches processed on different reagent lots.

    • Analyze batches of 96 samples or more.
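    The sample-count guidance above can be checked before starting a PGx CNV run. The following is a minimal sketch with a hypothetical function name. The counts themselves (24 samples required, 22 passing QC, 96 or more recommended) are the ones stated in this section, and the number of passing samples must come from your own QC review.

```shell
# Sketch: check a planned PGx CNV batch against the sample counts stated above.
# Arguments: total samples in the batch, samples passing QC (log R dev < 0.2).
check_pgx_batch() {
    total="$1"
    passing="$2"
    if [ "$total" -lt 24 ]; then
        echo "FAIL: $total samples; at least 24 are required"
        return 1
    fi
    if [ "$passing" -lt 22 ]; then
        echo "FAIL: $passing samples pass QC; at least 22 are required"
        return 1
    fi
    if [ "$total" -lt 96 ]; then
        echo "OK: minimum met; batches of 96 or more are recommended"
    else
        echo "OK: recommended batch size met"
    fi
}
```

    For example, check_pgx_batch 24 22 meets the minimum but prints the reminder about batches of 96 or more, while check_pgx_batch 30 21 fails because too few samples pass QC.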

    Quick Start

    Review section Run DRAGEN Array Local for information on input files to use, sample minimums per analysis type, and other best practices.

    Command examples show analysis on a Linux system using folders instead of sample sheets. Windows users should adapt the file paths in the commands to Windows conventions, e.g., using a backslash (\) instead of a forward slash (/). A sample sheet can be used to select specific samples out of a folder.

    Note: DRAGEN Array overwrites older files if the same --output-folder from a previous analysis is used. To keep earlier results, use a different --output-folder for re-analyses.
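    One way to avoid overwriting earlier results is to create a new output folder for every run. The sketch below is only an illustration. The function name and the run_<label> naming scheme are assumptions, not a DRAGEN Array convention.

```shell
# Sketch: build a run-specific output folder so a re-analysis does not
# overwrite earlier results. The folder naming scheme is an assumption.
new_output_folder() {
    base="$1"                                   # e.g. /user/gtc
    stamp="${2:-$(date +%Y%m%d_%H%M%S)}"        # optional fixed label
    folder="$base/run_$stamp"
    mkdir -p "$folder" && echo "$folder"
}
```

    For example, out=$(new_output_folder /user/gtc) creates a timestamped folder, and "$out" can then be passed to --output-folder in the commands that follow.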

    PGx

    Use the following instructions to start the full PGx analysis, covering genotyping, PGx CNV calling, and PGx star allele calling. Refer to Command Index for the parameters of all commands.

    1. Open a command prompt (Windows) or terminal window (Linux) and navigate to the directory where the software was installed, or to any other directory if the executable was added to the PATH environment variable.

    2. Use the genotype call command to call genotypes and generate GTC files using IDAT files as input.
       dragena genotype call --bpm-manifest /user/productfiles/manifest.bpm --cluster-file /user/productfiles/clusterfile.egt --idat-folder /user/IDATs --output-folder /user/gtc

    3. Use the genotype gtc-to-vcf command to create SNV VCF files from the GTC files generated by the genotype call command.
       dragena genotype gtc-to-vcf --bpm-manifest /user/productfiles/manifest.bpm --csv-manifest /user/productfiles/manifest.csv --genome-fasta-file /user/productfiles/genome.fa --gtc-folder /user/gtc --output-folder /user/vcf

    Cytogenetics

    Use the following instructions to start the full cytogenetics analysis, covering genotyping, CNV and LOH calling, and annotation. Refer to Command Index for the parameters of all commands.

    1. Open a command prompt (Windows) or terminal window (Linux) and navigate to the directory where the software was installed, or to any other directory if the executable was added to the PATH environment variable.

    2. Use the genotype call command to call genotypes and generate GTC files using IDAT files as input.
       dragena genotype call --bpm-manifest /user/productfiles/manifest.bpm --cluster-file /user/productfiles/clusterfile.egt --idat-folder /user/IDATs --output-folder /user/gtc

    3. Use the cyto call command to determine copy number variants and loss of heterozygosity given genotypes.
       dragena cyto call --cn-model /user/productfiles/cyto_model.dat --gtc-folder /user/gtc --output-folder /user/vcf

    Command Index

    Use the following syntax when using the command-line interface:

    dragena [module] [sub-module (not needed for cyto)] [command] [required parameters] [optional parameters]

    Module
    Description

    help

    Displays the first-layer help information.

    version

    Displays current DRAGEN Array Local version.

    genotype

    The root command for genotype calling.

    Command
    Description

    genotype call

    Determines genotype calls (GTC) from IDAT files.

    Option
    Description

    Note: Either --idat-folder, --sample-sheet, or both are required inputs.

    genotype gtc-to-bedgraph

    Converts GTC to BedGraph files, producing BedGraph formatted visualization files from the Log R Ratio and B-allele frequency data contained in the GTC intermediate files.

    Option
    Description

    Note: Either --gtc-folder, --sample-sheet, or both are required inputs.

    genotype gtc-to-vcf

    Converts GTC (v5) to VCF. The command is only applicable to GTC files produced by DRAGEN Array.

    Option
    Description

    Squashing duplicates

    In the manifest, the same variant can be probed by multiple different assays. These assays may be the same design or alternate designs for the same locus. By default, these duplicates are "squashed" into a single record in the VCF to reflect a true variant rather than a probe genotype. The method used to incorporate information across multiple assays is defined further in the .

    When the --unsquash-duplicates option is provided, this "squashing" behavior is disabled, and each duplicate assay is reported in a separate entry in the VCF file. This option is helpful when you want to investigate or validate the performance of individual assays rather than generate genotypes for specific variants. Note that if a locus has more than two alleles (i.e., a multi-allelic variant) and is also queried with duplicated designs, the duplicates are not unsquashed.

    Do not use the --unsquash-duplicates option if star allele calling will be performed downstream, because that command expects squashed variants.

    Genome cache

    By default, the entire reference genome will be read into memory. Generally, this will be more efficient than reading data from the indexed reference on disk at the expense of greater memory utilization. For situations in which the genome caching is not desirable (low memory availability or a small input manifest), it is possible to disable this default behavior with the --disable-genome-cache option.

    Auxiliary loci

    Certain classes of variant types (such as multi-nucleotide variants) are not currently supported in the upstream analysis software that produces GTC files. However, it is possible to query this type of variant by creating a SNP design that differentiates the specific multi-nucleotide alleles of interest. For example, if the true source sequence is

    ATGC[AT/CG]GTAA

    This assay could be designed as a SNP assay with the following source sequence

    ATGC[A/C]NNNN

    gtc-to-vcf provides an option (--auxiliary-loci) to supply a list of auxiliary records (in VCF format) to restore the true alleles for these cases in the output VCF. The following restrictions apply to this function:

    • The auxiliary definition must NOT be a multi-allelic variant.

    • The auxiliary definition must be a multi-nucleotide variant.

    • There must NOT be multiple array assays (e.g., duplicates) for the locus.
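    The first two restrictions can be checked on an auxiliary VCF before it is passed to --auxiliary-loci. The sketch below is illustrative only. It assumes an uncompressed, tab-delimited VCF, treats a comma in the ALT column as multi-allelic, and treats a record as multi-nucleotide when both REF and ALT are at least two bases long (as in the AT/CG example above). The third restriction depends on the manifest and is not checked here. The function name is hypothetical.

```shell
# Sketch: check auxiliary VCF records against two of the restrictions above.
# Reads an uncompressed VCF; columns 4 and 5 are REF and ALT in the VCF format.
check_auxiliary_vcf() {
    awk -F '\t' '
        /^#/ { next }                       # skip header lines
        $5 ~ /,/ {                          # comma in ALT: multi-allelic
            print "multi-allelic: " $1 ":" $2; bad = 1; next
        }
        length($4) < 2 || length($5) < 2 {  # not a multi-nucleotide variant
            print "not multi-nucleotide: " $1 ":" $2; bad = 1
        }
        END { exit bad + 0 }                # non-zero status if any record failed
    ' "$1"
}
```

    The function prints one line per offending record and returns a non-zero status if any record fails, so it can gate a gtc-to-vcf run in a script.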

    Notes:

    • Either --gtc-folder, --sample-sheet, or both are required inputs.

    • The genome fasta files for human genomes are provided by Illumina on the .

    genotype help

    Displays the help information for a genotype command.

    genotype version

    Displays current DRAGEN Array Local version.

    pgx

    The root command for the pgx module.

    Command
    Description

    pgx copy-number

    The root command for actions that act on pgx copy number variants.

    Command
    Description

    pgx copy-number call

    The command used to call copy number variants. A batch of 24 or more samples is required for analysis. For a successful analysis, 22 samples must pass QC, defined as having log R dev < 0.2.

    Option
    Description

    pgx copy-number train

    Trains a pgx copy number (CN) model for a set of samples. Generate a new pgx CN model if you use a customized cluster file (.egt) optimized for a specific data set.

    • Execute the train command using the data sets that were used to optimize the cluster file.

    • To use a pgx CN model generated by the train command, the mask file for the manifest must be saved in the same directory as the manifest.

    • A minimum of 96 samples is required to use the copy-number train command. For optimal performance, at least 150 is recommended.

    See Optimizing cluster files and copy number models for further details.

    Option
    Description

    pgx copy-number help

    Displays help information for the copy-number command.

    pgx copy-number version

    Displays version information for pgx copy-number command.

    pgx star-allele

    The root command for PGx star allele calling.

    Command
    Description

    pgx star-allele call

    Calls PGx star allele diplotypes. The SNV VCF files should be generated using the DRAGEN Array gtc-to-vcf command with --unsquash-duplicates off (default) and without --filter-loci.

    Option
    Description

    pgx star-allele annotate

    Annotates and summarizes the star alleles with metabolizer statuses and outputs the result in a consolidated JSON report. Metabolizer status is determined by direct lookup in the public PGx guidelines (CPIC or DPWG) specified by the user.

    Option
    Description

    pgx star-allele help

    Displays help information for a star-allele command.

    pgx star-allele version

    Displays version information for star-allele.

    cyto

    The root command for Cytogenetics analysis and annotation.

    Command
    Description

    cyto call

    Determines copy number variants (CNV) and loss/absence of heterozygosity (LOH/AOH) given genotypes.

    Option
    Description

    Notes:

    • More than 10 events (DEL/DUP/AOH) per chromosome indicates that the sample needs visual inspection.

    • If mosaic fraction cannot be estimated due to insufficient informative probes, it will be set to NaN.

    • Mosaic events that surpass the --max-mosaic-fraction limit have the MOSAIC tag in the INFO field of the VCF replaced with a HIGHFRACTION tag.
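    The first note above (more than 10 events per chromosome) lends itself to a quick scripted check. The sketch below is not part of DRAGEN Array. It assumes an uncompressed VCF in which every data line is one reported event, and the function name is hypothetical.

```shell
# Sketch: list chromosomes with more than 10 records in a cyto CNV VCF.
# Assumes each non-header line of the VCF is one reported event.
# Reads the file given as an argument, or standard input if none is given.
flag_busy_chromosomes() {
    awk -F '\t' '
        /^#/ { next }                       # skip header lines
        { count[$1]++ }                     # column 1 is the chromosome
        END {
            for (chrom in count)
                if (count[chrom] > 10) print chrom, count[chrom]
        }
    ' "$@"
}
```

    For a bgzip-compressed VCF, the data can be piped in, for example: gzip -dc sample.vcf.gz | flag_busy_chromosomes. Any chromosome printed is a candidate for visual inspection.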

    cyto annotate

    Annotates samples and generates cytogenetic JSON reports.

    Option
    Description

    Notes:

    • The metadata file "cyto.cnv.dat", which cyto call generates in the vcf-folder, must be kept in the vcf-folder for cyto annotate.

    • The VCF files must be bgzip-compressed and indexed for cyto annotate. Do not use the "--no-bgzip" flag when generating cyto VCF files that will be used with the cyto annotate command.

    • The "cyto annotate" step needs at least 5 GB of free space on the hard drive.
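    The three notes above can be turned into a pre-flight check that runs before cyto annotate. This is a minimal sketch under stated assumptions: the compressed VCFs are assumed to end in .vcf.gz with an index named <file>.vcf.gz.tbi, and because the notes do not say which drive needs the 5 GB, the location to check is passed as a second argument. The function name is hypothetical.

```shell
# Sketch: pre-flight check of a VCF folder before running cyto annotate.
# Assumes VCFs are named *.vcf.gz and their indexes *.vcf.gz.tbi.
check_cyto_annotate_inputs() {
    vcf_folder="$1"
    work_dir="${2:-.}"           # location whose free space is checked (assumption)
    status=0
    if [ ! -f "$vcf_folder/cyto.cnv.dat" ]; then
        echo "missing: $vcf_folder/cyto.cnv.dat"; status=1
    fi
    for vcf in "$vcf_folder"/*.vcf.gz; do
        if [ ! -f "$vcf" ]; then     # glob matched nothing
            echo "no bgzipped VCF files (*.vcf.gz) in $vcf_folder"; status=1
            break
        fi
        if [ ! -f "$vcf.tbi" ]; then
            echo "missing index: $vcf.tbi"; status=1
        fi
    done
    # df -Pk prints sizes in 1 KB blocks; column 4 is the available space.
    free_kb=$(df -Pk "$work_dir" | awk 'NR == 2 { print $4 }')
    if [ "$free_kb" -lt $((5 * 1024 * 1024)) ]; then
        echo "less than 5 GB free in $work_dir"; status=1
    fi
    return $status
}
```

    For example: check_cyto_annotate_inputs /user/vcf /user/cyto-annotations. The function prints one line per problem found and returns a non-zero status if any check fails.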

    cyto help

    Displays more information on a specific command.

    cyto version

    Displays version information.

    Troubleshooting and Additional Support

    Tips for using the Command-line interface

    DRAGEN Array Local utilizes a command-line interface which allows full user control of software functionality and easy automation of tasks. The software is designed to be used by power users and bioinformaticians.

    When using the command line, consider the following tips:

    • An unquoted space cannot be part of a file name in a command. If a file name contains spaces, enclose the file name in quotes.

    • To correct a typing error in a previously entered command, use the up arrow to recall the previous command, then correct the error before re-entering it.

    • Double-check the command. Misspellings, extra or missing dashes, and similar errors make the command unrecognizable to the software.
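    The effect of quoting can be seen with a small experiment in a Linux shell. The helper count_args and the folder name are made up for illustration. The point is only that an unquoted path containing a space is split into two arguments.

```shell
# Sketch: a path with a space is split into two arguments unless it is quoted.
count_args() { echo "$#"; }                  # prints how many arguments it received

idat_folder="/user/my IDATs"                 # hypothetical folder name with a space
count_args --idat-folder $idat_folder        # unquoted: prints 3
count_args --idat-folder "$idat_folder"      # quoted: prints 2
```

    In the unquoted case the software would receive /user/my and IDATs as separate arguments, which is why the command is not recognized.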

    Optimizing cluster files and copy number models

    A cluster file (.egt) contains the cluster positions of every probe used for genotyping analysis. Illumina provides a standard cluster file for all commercial Infinium BeadChips. It may be desirable to create a custom cluster file if the one provided does not fit the data well, or if a semi-custom or custom BeadChip that does not come with a cluster file is used. GenomeStudio is the software used to create custom cluster files.

    To facilitate the review and optimization of PGx variant GenTrain cluster positions, a GenomeStudio auxiliary file is provided for each PGx Array product through the and array product files page, e.g. . The auxiliary file is a tab-delimited text file that can be imported into GenomeStudio through Column Import. The file contains the Infinium Assay to PGx star allele mapping, covering the variants involved in DRAGEN Array PGx star allele calling.

    When updating the cluster file for pharmacogenomic applications, understand the specifications for the copy number model file before beginning.

    Before creating a custom cluster file, review the , the , and .

    A CN model file (.dat) contains the data needed to make accurate copy number calls for pharmacogenomics. This file is used to create the CNV VCFs, which are inputs to the star allele calling command. Illumina provides a standard CN model file for all commercial PGx Infinium BeadChips. If the cluster file needs to be customized, the CN model file should also be updated using the copy-number train command, which is available with DRAGEN Array Local only. The steps are as follows:

    1. Use GenomeStudio 2.0 to generate a new cluster file.

    2. Use the genotype call command to call genotypes and generate GTC files using IDAT files as input.
       dragena genotype call --bpm-manifest /user/productfiles/manifest.bpm --cluster-file /user/productfiles/new_clusterfile.egt --idat-folder /user/IDATs --output-folder /user/new_gtcs

    3. Use the copy-number train command to retrain the copy number model. Note: The value for the --platform option can be found in the Assay Format heading value of the CSV manifest.

    Note the difference in the cluster file requirement based upon the version of DRAGEN Array used:

    • Version 1.1+: If a CN model is used with a different cluster file, the software provides a warning but proceeds with copy number calling. As a result, a user can choose to keep using the commercial CN model from Illumina in combination with a custom updated EGT file in the PGx analysis.

    • Version 1.0: The same cluster file used for copy number training must be used to generate GTC files for copy number calling. Otherwise, the software will produce an error and exit.

    For reference, see the Command Index for details of the copy-number train command.

    To retrain the CN model file, 96 samples must be used at minimum with 90 of those samples passing QC defined as Log R Dev less than or equal to 0.2. It is recommended to train with at least 150 samples. A greater number of samples can be advantageous, but diminishing returns and longer computation times are seen after 3,000 samples.

    It is recommended to manually QC the training samples and remove samples that have Log R Dev > 0.2, call rate < 0.99, or TGA Control probe < 1.0 so only the highest quality samples are used in the training. The same samples used to create the new cluster file should be used to retrain the CN Model. To minimize batch effect in the training sample set, the samples should be analyzed in as few batches as possible and come from the same reagent lots.
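    The manual QC described above can be partly scripted. The following sketch filters a sample summary CSV down to samples that meet all three thresholds (Log R Dev of 0.2 or less, call rate of 0.99 or more, TGA Control probe of 1.0 or more). The column layout of the real summary file is not assumed: the column numbers are passed as arguments, the file is assumed to be a simple comma-separated file with one header row and no quoted commas, and the function name is hypothetical.

```shell
# Sketch: keep only samples that meet the training QC thresholds above.
# Arguments: CSV file, then the column numbers of the sample ID, Log R Dev,
# call rate, and TGA Control probe values (layout supplied by the caller).
filter_training_samples() {
    csv="$1"; id_col="$2"; lrd_col="$3"; cr_col="$4"; tga_col="$5"
    awk -F ',' -v id="$id_col" -v lrd="$lrd_col" -v cr="$cr_col" -v tga="$tga_col" '
        NR == 1 { next }                                    # skip the header row
        $lrd <= 0.2 && $cr >= 0.99 && $tga >= 1.0 { print $id }
    ' "$csv"
}
```

    Piping the output through wc -l gives the number of passing samples, which can be compared with the minimum of 90 passing samples (out of at least 96) required for retraining.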

    The copy-number train algorithm is designed with the assumption that the copy number distribution resembles the standard population distributions. This ensures the updated CN model file is representative of the normal populations in which it will be used to calculate copy number for key pharmacogenomic targets.

    Pharmacogenomic analysis for semi-custom arrays

    Semi-custom arrays add additional content or other pre-designed to enhance the commercial array content. This additional content can be analyzed for to obtain information on SNV and indel calls.

    For , PGx CNV and star allele calls are limited to content included on the commercial Infinium PGx arrays. Additional semi-custom content will not be included in the pharmacogenomic results.

    When designing a semi-custom array using a commercial Infinium PGx array backbone, such as the Global Diversity Array with enhanced PGx, it is important to retain all backbone content in the design, because removing content could decrease the quality of the results.

    Pharmacogenomic analysis for semi-custom arrays should be run using . Because the PGx CNV calling and PGx star allele calling algorithms are only compatible with commercial product files (see ), to fully analyze semi-custom PGx BeadChips some steps of the pipeline can be run twice: once with the semi-custom product files (to get complete semi-custom SNV VCF files), and once with the commercial product files (to get the PGx CNV VCF files, PGx star allele output, and metabolizer report).

    The semi-custom product files can be used with the genotype call and genotype gtc-to-vcf commands of the command-line interface and in GenomeStudio, as follows:

    1. Use GenomeStudio 2.0 to prepare a custom cluster file for the semi-custom array, following the guidance outlined in Optimizing cluster files and copy number models.

    2. Open a command prompt (Windows) or terminal window (Linux) and navigate to the directory where the software was installed, or to any other directory if the executable was added to the PATH environment variable.

    3. Use the genotype call command to call all semi-custom genotypes and generate custom content GTC files using IDAT files as input.
       dragena genotype call --bpm-manifest /user/productfiles/semi_custom_manifest.bpm --cluster-file /user/productfiles/semi_custom_clusterfile.egt --idat-folder /user/IDATs --output-folder /user/semi_custom_gtcs

    Keep the GTC files and SNV VCF files generated using the semi-custom product files in clearly labelled folders to distinguish them from the GTC and SNV VCF files generated using the commercial product files. Note that the GTC and SNV VCFs generated using the commercial product files will not contain genotypes for the semi-custom/add-on content. The GTC and SNV VCFs generated using the semi-custom product files cannot be used for downstream PGx analysis commands.

    To check that the DRAGEN Array installation was successful, follow these steps:
    • Open a command prompt (Windows) or terminal (Linux).

    • [Optional] Add /path/to/dragena/, e.g. /usr/local/bin/dragena-linux-x64-DAv1.1.0/dragena/, to your PATH so that the executable can be run from any directory.

    • Execute the following command: /path/to/dragena/dragena version, or, if the PATH environment variable is set: dragena version

    Samples processed in a two-week period from multiple library preparation batches can be grouped together to meet the size requirement of an analysis batch. In such cases, it is recommended that the same reagent lots and instruments be used throughout the workflow.
  • Use the CN Model and PGx Database File provided as part of the standard product files

  • Use the pgx copy-number call command to call PGx CNVs from the GTC files and produce CNV VCF files. It is recommended to use the same output folder used for SNV VCF since the star-allele call command accepts one VCF folder with SNV and PGx CNV VCFs.
    dragena pgx copy-number call --cn-model /user/productfiles/cnv_model.dat --gtc-folder /user/gtc --output-folder /user/vcf
    Note: For PGx CNV calling, it is recommended that 96 or more samples passing LogRDev <= 0.2 are included in the analysis.

  • Use the pgx star-allele call command to generate star allele calls using the CNV and SNV VCF files generated by the gtc-to-vcf and copy-number call commands.
    dragena pgx star-allele call --vcf-folder /user/vcf --database /user/productfiles/DAv1.3.0-rc3.zip --output-folder /user/star-alleles --license-server-url https://username:[email protected]
    Note: For PGx star allele calling, it is recommended to QC the samples and review the samples that have Log R Dev > 0.2, call rate < 0.99, or TGA Control probe < 1.0 to assess the reliability of the analysis. These metrics are provided in the genotyping sample summary file (gt_sample_summary.csv).

  • Use the pgx star-allele annotate command to summarize the star alleles and add metabolizer statuses to the star alleles generated by the star-allele call command. Guidelines (CPIC or DPWG) can be specified.
    dragena pgx star-allele annotate --star-alleles star_alleles.csv --guidelines CPIC --output-folder /user/metabolizer-statuses

  • [Optional] Use the pgx copy-number train command to retrain the copy number model.
    dragena pgx copy-number train --bpm-manifest /user/productfiles/manifest.bpm --genome-fasta-file /user/productfiles/genome.fa --gtc-folder /user/gtc --platform LCG --output-folder /user/productfiles/cnmodelnew
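    The PGx Quick Start commands can be chained in a small script so that a failure in one step stops the later steps. The sketch below is one possible arrangement, not an Illumina-provided script. The function names, the DRY_RUN variable, and the argument layout are hypothetical. The dragena commands, options, and product file names are the ones used in the Quick Start examples. The annotate step is left out because the location of the star allele CSV depends on your output folder.

```shell
# Sketch: chain the PGx Quick Start commands and stop at the first failure.
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

pgx_pipeline() {
    files="$1"      # product files folder, e.g. /user/productfiles
    idats="$2"      # IDAT folder, e.g. /user/IDATs
    out="$3"        # parent folder for all outputs
    license="$4"    # credential URL from the activation email
    run dragena genotype call --bpm-manifest "$files/manifest.bpm" \
        --cluster-file "$files/clusterfile.egt" \
        --idat-folder "$idats" --output-folder "$out/gtc" || return 1
    run dragena genotype gtc-to-vcf --bpm-manifest "$files/manifest.bpm" \
        --csv-manifest "$files/manifest.csv" \
        --genome-fasta-file "$files/genome.fa" \
        --gtc-folder "$out/gtc" --output-folder "$out/vcf" || return 1
    # SNV and PGx CNV VCFs share one folder, as recommended for star-allele call.
    run dragena pgx copy-number call --cn-model "$files/cnv_model.dat" \
        --gtc-folder "$out/gtc" --output-folder "$out/vcf" || return 1
    run dragena pgx star-allele call --vcf-folder "$out/vcf" \
        --database "$files/DAv1.3.0-rc3.zip" \
        --output-folder "$out/star-alleles" \
        --license-server-url "$license" || return 1
}
```

    Running the function first with DRY_RUN=1 prints the four commands for review without consuming quota. Paths are quoted so that folder names containing spaces are passed intact.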

  • Use the cyto annotate command to generate JSON annotation files with gene annotations, cytogenetic bands, various QC fields, and the variant information from the VCFs.
    dragena cyto annotate --annotation-db /user/productfiles/CytoAnnotateData_DAv1.2.0.zip --vcf-folder /user/vcf --output-folder /user/cyto-annotations

  • --help

    Displays help information for the genotype call command.

    --json-log

    Outputs logs in JSON format. Default is false.

    --num-threads

    Number of parallel threads to run.

    --output-folder

    Specifies the path to the folder where the output files are saved.

    --normalize-sample-name

    The Sample Name from the IDATs is ignored and the Sample Name is instead normalized to SentrixBarcode_Position in downstream outputs. Default is false.

    --version

    Displays version information.

    --output-folder

    Specifies the path to the folder where the output files are saved.

    --version

    Displays version information.

    --debug

    Include stack traces in logs. Default is false.

    --disable-genome-cache

    Disables the reference genome cache.

    --filter-loci

    Generates a text file containing a list of probe names to be filtered.

    --unsquash-duplicates

    Generates unique VCF records for duplicate assays. Default is false.

    --help

    Displays help information for the genotype gtc-to-vcf command.

    --json-log

    Outputs logs in JSON format. Default is false.

    --no-bgzip

    VCFs are not bgzip compressed (.gz) and no tabix index files (.tbi) are output. Default is false.

    --output-folder

    Specifies the path to the folder where the output files are saved.

    --version

    Displays version information.

    --no-bgzip

    VCFs are not bgzip compressed (.gz) and no tabix index files (.tbi) are output. Default is false.

    --output-folder

    [Optional] Specifies the path to the folder where the output files are saved.

    --version

    Displays version information.

    For best performance, validate the pgx CN model using truth data before using in pgx CN calling.

    --disable-genome-cache

    Disables the reference genome cache.

    --help

    Displays help information for the copy-number train command.

    --json-log

    Outputs logs in JSON format. Default is false.

    --version

    Displays version information.

    --output-folder

    The location to output the CN model. By default, the output folder is the current working directory.

    --json-log

    Outputs logs in JSON format. Default is false.

    --output-folder

    Directory path to output files. Default is the current working directory.

    --version

    Displays version information.

    --version

    Displays version information.

    --no-bgzip

    VCFs are not bgzip compressed (.gz) and no tabix index files (.tbi) are output. Default is false.

    --output-folder

    [Optional] Directory path to output files. Default is the current working directory.

    --version

    Displays version information.

    --min-cnv-probes

    CNV size limit (probes). Only CNV events with at least this number of probes are reported in the VCF file. Default is 10.

    --min-cnv-size

    CNV size limit (kb). Only CNV events with an effective size of at least this value are reported in the VCF file. Default is 20.

    --min-loh-probes

    LOH size limit (probes). Only LOH events with at least this number of probes are reported in the VCF file. Default is 500.

    --min-loh-size

    LOH size limit (kb). Only LOH events with an effective size of at least this value are reported in the VCF file. Default is 3000.

    --max-mosaic-fraction

    The maximum allowable mosaic fraction. Mosaic variants at or above this value are promoted. Promoted variants are marked with a HIGHFRACTION INFO tag and retain their mosaic fraction value. Default is 1.0.

    --smoothing

    Smoothing window size, specifying the number of probes on each side of the center probe used for smoothing LRR values. Default is 5.

    Mosaic events with a fraction below 20% may be missed, and events under 5% mosaicism will not be called.

  • LogRDev > 0.2 is indicative of a low-quality sample.

  • For samples with LogRDev > 0.3, profiles are typically very noisy. In such cases, only whole-chromosome-level events are detected and reported to prevent excessive false positives.

  • --version

    Displays version information.

    --min-del-probes

    Deletion CNV size limit (probes). Only deletions with at least this number of probes are reported in the JSON file. Default is 10.

    --min-del-size

    Deletion CNV size limit (kb). Only deletions with a size of at least this value are reported in the JSON file. Default is 0.

    --min-dup-probes

    Duplication CNV size limit (probes). Only duplications with at least this number of probes are reported in the JSON file. Default is 10.

    --min-dup-size

    Duplication CNV size limit (kb). Only duplications with a size of at least this value are reported in the JSON file. Default is 0.

    --min-loh-probes

    LOH size limit (probes). Only LOH events with at least this number of probes are reported in the JSON file. Default is 500.

    --min-loh-size

    LOH size limit (kb). Only LOH events with a size of at least this value are reported in the JSON file. Default is 3000.

    --min-qual

    Minimum CNV and LOH quality scores. Default is 20.

    When entering paths or long names, copy and paste the values to help avoid errors.

  • If using Windows, use a File Explorer window to navigate to the product file or folder that is needed by the DRAGEN Array Local command. While holding down the shift button on the keyboard, right click the file and select the 'Copy as Path' option. Then paste the copied path into the command prompt to use the file or folder.

  • To cancel a command while it is running, press Control + C on the keyboard.

  • Assay Format
    heading value from the CSV manifest.
    dragena pgx copy-number train --bpm-manifest /user/productfiles/manifest.bpm --genome-fasta-file /user/productfiles/genome.fa --gtc-folder /user/new_gtcs --platform LCG --output-folder /user/productfiles/new_cnmodel
  • Use the new_cnmodel for subsequent copy-number call commands.

  • Use the genotype gtc-to-vcf command to create custom content SNV VCF files from the custom content GTC files generated by the genotype call command.
    dragena genotype gtc-to-vcf --bpm-manifest /user/productfiles/semi_custom_manifest.bpm --csv-manifest /user/productfiles/semi_custom_manifest.csv --genome-fasta-file /user/productfiles/genome.fa --gtc-folder /user/semi_custom_gtcs --output-folder /user/semi_custom_vcfs

  • Perform Quick Start steps 1-6 using the commercial Infinium PGx array product files to obtain PGx CNV VCFs, star allele calls, and metabolizer status annotations.

  • CPU

    8 cores

    Memory

    16 GB available or more

    Hard Drive

    30 GB or more of free disk space

    Operating System

    One of the following:

    • Windows 10 or later – win10-x64

    • CentOS 7 or later, Ubuntu 20.04 or later – linux-x64

    genotype

    Call genotypes, single nucleotide variants, and various related file conversions.

    pgx

    Pharmacogenomics CNV, star allele calling and metabolizer status annotation.

    cyto

    Cytogenetics CNV/LOH/mosaic calling and annotation.

    help

    Display more information on a specific command.

    version

    Display version information.

    genotype call

    Determines genotype calls (GTC) from IDAT files.

    genotype gtc-to-bedgraph

    Converts GTC to BedGraphs, producing BedGraph formatted visualization files from the log R ratio data contained in the GTC intermediate files.

    genotype gtc-to-vcf

    Converts GTC to VCF.

    genotype help

    Displays the help information for the genotype command.

    genotype version

    Displays version information for the genotype command.

    --bpm-manifest

    [Required] Specifies the path to the bead pool manifest in BPM format.

    --cluster-file

    [Required] Specifies the path to the EGT cluster file to use.

    --idat-folder

    Specifies the path to the directory where all intensity data IDATs (for the samples to be processed) are located. If using the --sample-sheet option in conjunction, this value will be used to override the RootFolder in the samplesheet.

    This path also includes the contents of all subdirectories.

    --sample-sheet

    Sample sheet that allows for filtering and providing sample metadata.

    --debug

    Includes stack traces in logs. Default is false.

    --gencall-cutoff

    GenCall score cutoff to label a NoCall. Default is 0.15.

    --bpm-manifest

    [Required] Specifies the path to the bead pool manifest in BPM format.

    --gtc-folder

    Folder containing genotype files (.gtc). If using the --sample-sheet option in conjunction, this value will be used to override the RootFolder in the samplesheet.

    --sample-sheet

    Sample sheet that allows for filtering and providing sample metadata.

    --debug

    Include stack traces in logs. Default is false.

    --help

    Displays help information for the genotype gtc-to-bedgraph command.

    --json-log

    Outputs logs in JSON format. Default is false.

    --bpm-manifest

    [Required] Specifies the path to the bead pool manifest in BPM format.

    --csv-manifest

    [Required] Specifies the path to the CSV manifest with SourceSeq column.

    --genome-fasta-file

    [Required] Specifies the path to the genome FASTA file (.fa). Assumes FASTA index file (.fai) is in the same directory.

    --gtc-folder

    Folder containing genotype files (.gtc). If using the --sample-sheet option in conjunction, this value will be used to override the RootFolder in the samplesheet.

    --sample-sheet

    Sample sheet that allows for filtering and providing sample metadata.

    --auxiliary-loci

    Specifies the path to the VCF file with auxiliary definitions of loci, such as for multi-nucleotide variants.

    copy-number

    Call and train copy number variants.

    star-allele

    Star Allele Caller for Illumina Microarrays

    help

    Display more information on a specific command

    version

    Display version information.

    pgx copy-number call

    Determines copy number variants given genotypes (GTC to CNV VCF).

    pgx copy-number help

    Displays help information for a copy-number command.

    pgx copy-number train

    Trains copy number model for a set of samples (GTC to CN Model File).

    pgx copy-number version

    Displays version information for copy-number.

    --cn-model

    [Required] Specifies the path to the copy number model parameters file (.dat).

    --gtc-folder

    Folder containing genotype files (.gtc). If using the --sample-sheet option in conjunction, this value will be used to override the RootFolder in the samplesheet.

    --sample-sheet

    Sample sheet that allows for filtering and providing sample metadata.

    --debug

    Includes stack traces in logs. Default is false.

    --help

    Displays help information for the copy-number call command.

    --json-log

    Outputs logs in JSON format. Default is false.

    --bpm-manifest

    [Required] Specifies the path to the bead pool manifest in BPM format. Assumes mask file (.msk) is in the same directory.

    --genome-fasta-file

    [Required] Specifies the path to the genome FASTA file (.fa). Assumes FASTA index file (.fai) is in the same directory.

    --platform

    [Required] Specifies which microarray platform generated the data. Set this to 'LCG' for GDA-ePGx, or 'EX' for GSAv4-ePGx or GCRA-ePGx.

    --gtc-folder

    Folder containing genotype files (.gtc). If using the --sample-sheet option in conjunction, this value will be used to override the RootFolder in the samplesheet.

    --sample-sheet

    Sample sheet that allows for filtering and providing sample metadata.

    --debug

    Includes stack traces in logs. Default is false.

    pgx star-allele call

    Determines PGx star allele and variant genotypes.

    pgx star-allele annotate

    Annotates PGx gene functions and produces the JSON report.

    pgx star-allele help

    Displays help information for a star allele command.

    pgx star-allele version

    Displays version information for star allele.

    --database

    [Required] The PGx database file (.zip).

    --license-server-url

    [Required] The license server url with credentials.

    --vcf-folder

    [Required] The directory containing *.snv.vcf.gz and *.cnv.vcf.gz files.

    --query-license-quota

    At the beginning and end of the analysis, queries the license server for the quotas on the valid license(s) and displays the result.

    --debug

    Includes stack traces in logs. Default is false.

    --help

    Displays help information for the star-allele call command.

    --star-alleles

    [Required] Path to star alleles file (.csv) generated by the call subcommand.

    --guidelines

    PGx guidelines to use for annotation. Valid values are ‘CPIC’ and ‘DPWG’. Default is ‘CPIC’.

    --debug

    Includes stack traces in logs. Default is false.

    --help

    Displays help information for the star-allele annotate command.

    --json-log

    Outputs logs in JSON format. Default is false.

    --output-folder

    Directory path to output files. Default is the current working directory.

    cyto call

    Determines copy number variants and loss of heterozygosity given genotypes.

    cyto annotate

    Annotates samples and generates cytogenetics json reports.

    cyto help

    Display more information on a specific command.

    cyto version

    Displays version information.

    --cn-model

    [Required] Path to cyto model parameters file (.dat).

    --gtc-folder

    Folder containing genotype files (.gtc). If using the --sample-sheet option in conjunction, this value will be used to override the RootFolder in the samplesheet.

    --sample-sheet

    Sample sheet that allows for filtering and providing sample metadata.

    --debug

    Logs will include stack traces. Default is false.

    --help

    Display this help screen.

    --json-log

    Logs will be output in JSON format. Default is false.

    --debug

    Logs will include stack traces. Default is false.

    --help

    Display this help screen.

    --json-log

    Logs will be output in JSON format. Default is false.

    --annotation-db

    [Required] Database for variant annotations.

    --vcf-folder

    [Required] The directory containing the *.cnv.vcf.gz files.

    --output-folder

    [Optional] Directory path to output files. Default is the current working directory.


    Output Files

    The following section describes the outputs produced by DRAGEN Array.

    PGx CNV VCF File

    DRAGEN Array produces one PGx CNV variant call file (VCF) (*.cnv.vcf) per sample to report the CN status at the gene and sub-gene level, along with the CN events for PGx targets.

    The PGx CNV VCF output file follows the standard VCF format. The QUAL field in the VCF file measures the CNV call quality as a Phred-scaled score with a minimum of 0 and a cap of 60. Low-quality calls (QUAL<7) are flagged by the Q7 filter. Low-quality samples with LogRDev greater than the 0.2 threshold are flagged with the SampleQuality flag.
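As a rough aid to interpreting these scores, the sketch below converts a Phred-scaled value into the error probability it conventionally represents, P = 10^(-Q/10). It assumes that the standard Phred definition applies to the CNV QUAL field, which this documentation does not spell out. The function names are hypothetical and not part of DRAGEN Array.

```python
QUAL_CAP = 60      # maximum QUAL reported in the PGx CNV VCF
Q7_THRESHOLD = 7   # calls with QUAL < 7 receive the Q7 filter


def phred_to_error_probability(qual: float) -> float:
    """Convert a Phred-scaled score to an error probability (standard definition)."""
    return 10 ** (-qual / 10)


def is_low_quality(qual: float, threshold: float = Q7_THRESHOLD) -> bool:
    """Return True when a call falls below the quality threshold."""
    return qual < threshold


for q in (Q7_THRESHOLD, 20, QUAL_CAP):
    print(f"QUAL={q}: error probability ~ {phred_to_error_probability(q):.2e}")
```

Under this assumption, a call right at the Q7 boundary corresponds to an error probability of roughly 0.2, while a call at the cap of 60 corresponds to about one in a million.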

    The PGx CNV VCF files are bgzipped (Block GZIP) by default and have the “.gz” extension. The compression saves storage space and facilitates efficient lookup when indexed with the TBI Index File. To view these files as plain text, uncompress them with Samtools utilities or other third-party tools. The CNV VCF must be bgzipped and indexed to be used in downstream DRAGEN Array commands, such as star allele calling.

    The PGx CNV VCF output file includes the following content.

    ##fileformat=VCFv4.1

    ##source=dragena 1.3.0

    ##genomeBuild=38

    ##reference=file:///hg38_with_alt/hg38_nochr_MT.fa

    ##FORMAT=<ID=CN,Number=1,Type=Integer,Description="Copy number genotype for imprecise events. CN=5 indicates 5 or 5+">

    ##FORMAT=<ID=NR,Number=1,Type=Float,Description="Aggregated normalized intensity">

    ##ALT=<ID=CNV,Description="Copy number variant region">

    ##FILTER=<ID=Q7,Description="Quality below 7">

    ##FILTER=<ID=SampleQuality,Description="Sample was flagged as potentially low-quality due to high noise levels.">

    ##INFO=<ID=CNVLEN,Number=1,Type=Integer,Description="Number of bases in CNV hotspot">

    ##INFO=<ID=PROBE,Number=1,Type=Integer,Description="Number of probes assayed for CNV hotspot">

    ##INFO=<ID=END,Number=1,Type=Integer,Description="End position of CNV hotspot">

    ##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Structural Variant Type">

    ##OverallPloidy=1.8

    ##GCCorrect=True

    ##contig=<ID=1,length=248956422>

    ##contig=<ID=4,length=190214555>

    ##contig=<ID=10,length=133797422>

    ##contig=<ID=16,length=90338345>

    ##contig=<ID=19,length=58617616>

    ##contig=<ID=22,length=50818468>

    ##contig=<ID=22_KI270879v1_alt,length=304135>

    #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 204619760001_R01C01

    1 109687842 CNV:GSTM1:chr1:109687842:109693526 N <CNV> 60 PASS CNVLEN=5685;PROBE=124;END=109693526;SVTYPE=CNV CN:NR 2:0.966631132771593

    4 68537222 CNV:UGT2B17:chr4:68537222:68568499 N <CNV> 60 PASS CNVLEN=31278;PROBE=383;END=68568499;SVTYPE=CNV CN:NR 0:0.376696837881692

    10 133527374 CNV:CYP2E1:chr10:133527374:133539096 N <CNV> 60 PASS CNVLEN=11723;PROBE=194;END=133539096;SVTYPE=CNV CN:NR 2:0.980059731860893

    16 28615068 CNV:SULT1A1:chr16:28603587:28613544 N <CNV> 57 PASS CNVLEN=8315;PROBE=164;END=28623382;SVTYPE=CNV CN:NR 2:0.980552325552963

    19 40844791 CNV:CYP2A6.intron.7:chr19:40844791:40845293 N <CNV> 60 PASS CNVLEN=503;PROBE=38;END=40845293;SVTYPE=CNV CN:NR 2:0.9663775484762

    19 40850267 CNV:CYP2A6.exon.1:chr19:40850267:40850414 N <CNV> 60 PASS CNVLEN=148;PROBE=21;END=40850414;SVTYPE=CNV CN:NR 2:0.9663775484762

    22 42126498 CNV:CYP2D6.exon.9:chr22:42126498:42126752 N <CNV> 48 PASS CNVLEN=255;PROBE=370;END=42126752;SVTYPE=CNV CN:NR 2:0.981703411438716

    22 42129188 CNV:CYP2D6.intron.2:chr22:42129188:42129734 N <CNV> 10 PASS CNVLEN=547;PROBE=333;END=42129734;SVTYPE=CNV CN:NR 2:0.965498002434641

    22 42130886 CNV:CYP2D6.p5:chr22:42130886:42131379 N <CNV> 60 PASS CNVLEN=494;PROBE=172;END=42131379;SVTYPE=CNV CN:NR 2:0.970341562236357

    22_KI270879v1_alt 270316 CNV:GSTT1:chr22_KI270879v1_alt:270316:278477 N <CNV> 60 PASS CNVLEN=8162;PROBE=91;END=278477;SVTYPE=CNV CN:NR 2:1.01191145130511
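To illustrate how the columns of a record fit together, the following sketch parses one of the example records above into a dictionary. It is not part of DRAGEN Array: the function name is hypothetical, the input is assumed to be an already uncompressed line, and fields are split on whitespace because the example is shown space-separated (real VCF files are tab-delimited, which whitespace splitting also handles here).

```python
from typing import Dict


def parse_pgx_cnv_record(line: str) -> Dict[str, object]:
    """Parse one single-sample PGx CNV VCF data line into a dictionary."""
    fields = line.split()
    chrom, pos, var_id, _ref, _alt, qual, flt, info, fmt, sample = fields[:10]
    # INFO is a semicolon-separated list of key=value pairs.
    info_map = dict(item.split("=", 1) for item in info.split(";"))
    # FORMAT names the colon-separated values in the sample column.
    sample_map = dict(zip(fmt.split(":"), sample.split(":")))
    return {
        "target": var_id.split(":")[1],  # e.g. UGT2B17 or CYP2D6.exon.9
        "chrom": chrom,
        "start": int(pos),
        "end": int(info_map["END"]),
        "probes": int(info_map["PROBE"]),
        "qual": float(qual),
        "filter": flt,
        "copy_number": int(sample_map["CN"]),
        "normalized_intensity": float(sample_map["NR"]),
    }


record = parse_pgx_cnv_record(
    "4 68537222 CNV:UGT2B17:chr4:68537222:68568499 N <CNV> 60 PASS "
    "CNVLEN=31278;PROBE=383;END=68568499;SVTYPE=CNV CN:NR 0:0.376696837881692"
)
print(record["target"], "CN =", record["copy_number"], "QUAL =", record["qual"])
```

For production use, an established VCF library is usually a better choice than hand-written parsing.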

    Cytogenetics VCF File

    DRAGEN Array produces one cytogenetics Variant Call File (VCF) (*.cnv.vcf) per sample to report the CN and LOH status of the detected variants.

    The cytogenetics CNV VCF output file follows the standard VCF format. The QUAL field in the VCF file measures the CNV/LOH call quality as a Phred-scaled score with a minimum of 0 and a cap of 60. Low-quality calls (QUAL<10) are flagged by the Q10 filter. Low-quality samples with LogRDev greater than the 0.2 threshold are flagged with the SampleQuality flag.

    The cytogenetics CNV VCF files are bgzipped (Block GZIP) by default and have the “.gz” extension. The compression saves storage space and facilitates efficient lookup when indexed with the TBI Index File. To view these files as plain text, uncompress them with Samtools utilities or other third-party tools. The CNV VCF must be bgzipped and indexed to be used in downstream DRAGEN Array commands, such as cyto annotate.

    One example file can be found below:

    ##fileformat=VCFv4.1

    ##source=dragena 1.3.0 Cyto

    ##genomeBuild=37

    ##product=GDACyto-8v1-0_A

    ##reference=file://genome.fa

    ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">

    ##FORMAT=<ID=CN,Number=1,Type=Integer,Description="Copy number genotype. CN=4 indicates 4 or 4+">

    ##FORMAT=<ID=NR,Number=1,Type=Float,Description="Aggregated normalized intensity">

    ##FORMAT=<ID=LRD,Number=1,Type=Float,Description="Standard deviation of logR ratios">

    ##platform=cytoplatform

    ##ALT=<ID=DEL,Description="Copy number loss region">

    ##ALT=<ID=DUP,Description="Copy number gain heterozygous region">

    ##ALT=<ID=LOH,Description="AOH/LOH/ROH, absence of heterozygosity region, or, loss of heterozygosity region">

    ##FILTER=<ID=Q10,Description="Quality below 10">

    ##FILTER=<ID=SampleQuality,Description="Sample was flagged as potentially low-quality due to high noise levels.">

    ##INFO=<ID=SVLEN,Number=1,Type=Integer,Description="Number of bases in CNV/LOH region">

    ##INFO=<ID=PROBE,Number=1,Type=Integer,Description="Number of probes assayed for CNV/LOH region">

    ##INFO=<ID=END,Number=1,Type=Integer,Description="End position of CNV/LOH region">

    ##INFO=<ID=LOHTYPE,Number=A,Type=String,Description="Type of LOH (Loss/absence of heterozygosity). Valid values are AOH (germline, copy number neutral or gain LOH), CNLOH (somatic, copy number neutral LOH), GAINLOH (somatic, copy number gain LOH)">

    ##OverallPloidy=1.9

    ##GCCorrect=True

    ##contig=<ID=1,length=249250621>

    ##contig=<ID=2,length=243199373>

    ##contig=<ID=3,length=198022430>

    ##contig=<ID=4,length=191154276>

    ##contig=<ID=5,length=180915260>

    ##contig=<ID=6,length=171115067>

    ##contig=<ID=7,length=159138663>

    ##contig=<ID=8,length=146364022>

    ##contig=<ID=9,length=141213431>

    ##contig=<ID=10,length=135534747>

    ##contig=<ID=11,length=135006516>

    ##contig=<ID=12,length=133851895>

    ##contig=<ID=13,length=115169878>

    ##contig=<ID=14,length=107349540>

    ##contig=<ID=15,length=102531392>

    ##contig=<ID=16,length=90354753>

    ##contig=<ID=17,length=81195210>

    ##contig=<ID=18,length=78077248>

    ##contig=<ID=19,length=59128983>

    ##contig=<ID=20,length=63025520>

    ##contig=<ID=21,length=48129895>

    ##contig=<ID=22,length=51304566>

    ##contig=<ID=X,length=155270560>

    ##contig=<ID=Y,length=59373566>

    #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 208588190001_R02C01

    1 109687842 DEL:chr1:109687842:109693526 N <DEL> 60 PASS SVLEN=5685;PROBE=99;END=109693526 GT:CN:NR:LRD 1/1:1:0.8860:0.21

    16 28603587 DUP:chr16:28603587:28613544 N <DUP> 60 PASS SVLEN=9958;PROBE=197;END=28613544 GT:CN:NR:LRD 1/1:3:1.1666:0.11

    22 42129188 AOH:chr22:42129188:42129734 N <LOH> 37 PASS SVLEN=547;PROBE=198;END=42129734;LOHTYPE=AOH GT:CN:NR:LRD 1/1:2:1.0208:0.25
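The INFO column of each record is a semicolon-separated list of key=value pairs. The sketch below (helper names are hypothetical) parses it and checks that, for the three example records, SVLEN equals END - POS + 1. This relationship is observed in the example data and is not stated as a guarantee by this documentation.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO string such as 'SVLEN=547;PROBE=198' into a dict."""
    return dict(item.split("=", 1) for item in info.split(";"))


def svlen_matches_span(pos: int, info: str) -> bool:
    """Check whether SVLEN equals the inclusive span from POS to END."""
    parsed = parse_info(info)
    return int(parsed["SVLEN"]) == int(parsed["END"]) - pos + 1


# (POS, INFO) pairs copied from the example cytogenetics records.
examples = [
    (109687842, "SVLEN=5685;PROBE=99;END=109693526"),
    (28603587, "SVLEN=9958;PROBE=197;END=28613544"),
    (42129188, "SVLEN=547;PROBE=198;END=42129734;LOHTYPE=AOH"),
]
results = [svlen_matches_span(pos, info) for pos, info in examples]
print(results)
```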

    SNV VCF File

    The software produces one genotyping variant call file (*.snv.vcf) per sample, covering single nucleotide variants (SNVs) and indels for the sample. It reports the GenCall score (GS), B Allele Frequency (BAF), and Log R Ratio (LRR) per variant. The VCF output follows the VCF format; the example header below declares VCFv4.1.

    Some additional details:

    • The FILTER column is hardcoded to PASS and is not dependent on the GT value. It does not reflect the underlying quality of the call. Refer to the GS value for quality information.

    • Genotypes are adjusted to reflect the sample ploidy. Calls are haploid for loci on Y, MT, and non-PAR chromosome X for males.

    The SNV VCF output file includes the following content. The last row shows an example of variant call.

    ##fileformat=VCFv4.1

    ##source=dragena 1.3.0

    ##genomeBuild=38

    ##reference=file:///genomes/38/genome.fa

    ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">

    ##FORMAT=<ID=GS,Number=1,Type=Float,Description="GenCall score. For merged multi-assay or multi-allelic records, min GenCall score is reported.">

    ##FORMAT=<ID=BAF,Number=1,Type=Float,Description="B Allele Frequency">

    ##FORMAT=<ID=LRR,Number=1,Type=Float,Description="LogR ratio">

    ##contig=<ID=1,length=248956422>

    ##contig=<ID=2,length=242193529>

    ##contig=<ID=3,length=198295559>

    ##contig=<ID=4,length=190214555>

    ##contig=<ID=5,length=181538259>

    ##contig=<ID=6,length=170805979>

    ##contig=<ID=7,length=159345973>

    ##contig=<ID=8,length=145138636>

    ##contig=<ID=9,length=138394717>

    ##contig=<ID=10,length=133797422>

    ##contig=<ID=11,length=135086622>

    ##contig=<ID=12,length=133275309>

    ##contig=<ID=13,length=114364328>

    ##contig=<ID=14,length=107043718>

    ##contig=<ID=15,length=101991189>

    ##contig=<ID=16,length=90338345>

    ##contig=<ID=17,length=83257441>

    ##contig=<ID=18,length=80373285>

    ##contig=<ID=19,length=58617616>

    ##contig=<ID=20,length=64444167>

    ##contig=<ID=21,length=46709983>

    ##contig=<ID=22,length=50818468>

    ##contig=<ID=MT,length=16569>

    ##contig=<ID=X,length=156040895>

    ##contig=<ID=Y,length=57227415>

    #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 202937470021_R06C01

    1 2290399 rs878093 G A . PASS . GT:GS:BAF:LRR 0/1:0.7923:0.50724137:0.14730307
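Because the FILTER column is always PASS, any quality filtering has to use the per-sample GS value. The sketch below pairs the FORMAT keys with the sample values from the example record and applies a GS threshold. The helper name is hypothetical, and reusing 0.15 (the documented default of --gencall-cutoff) as a downstream threshold is an assumption made only for illustration.

```python
GS_THRESHOLD = 0.15  # assumption: mirrors the default --gencall-cutoff value


def parse_sample_fields(format_col: str, sample_col: str) -> dict:
    """Pair FORMAT keys with sample values and convert numeric fields."""
    call = dict(zip(format_col.split(":"), sample_col.split(":")))
    for key in ("GS", "BAF", "LRR"):
        # "." marks a missing value in VCF and is left untouched.
        if key in call and call[key] != ".":
            call[key] = float(call[key])
    return call


call = parse_sample_fields("GT:GS:BAF:LRR", "0/1:0.7923:0.50724137:0.14730307")
passes_gs = call["GS"] >= GS_THRESHOLD
print(call, "passes GS threshold:", passes_gs)
```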

    Note on Multi-Allelic Variants (MAV) calling limitations

    DRAGEN Array can combine multiple assays with different target bases but the same genomic position to make MAV calls. However, Illumina microarrays are inherently bi-allelic assays, which require special design considerations and have some inherent limitations.

    The MAV calling algorithm currently filters across all overlapping assays, retaining only genotypes whose alleles are present in the intersection of all assays. If multiple genotypes remain after filtering, the result is considered ambiguous and reported as a NoCall to avoid false positives. This ambiguity often arises when one assay is a NoCall due to presumed probe failure rather than missing signal. In such cases, its potential genotypes are not excluded, contributing to ambiguity. When DRAGEN Array outputs a NoCall because of the described behaviors, the event is logged (e.g., Failed to combine genotypes due to ambiguity...).

    Overall, the current algorithm errs on the side of caution to ensure quality calls, but produces some idiosyncratic behavior and potential false NoCalls when genotypes are biologically consistent but differ due to probe designs. We hope to improve this behavior in future versions of DRAGEN Array.

    Some illustrative examples are below to help understand the current limitations:

    Scenario
    Expected MAV Call
    Actual MAV Call
    Explanation

    Note on delimiters in the "ID" field

    By default, when multiple probes are present for a given variant, all probe names are included in the "ID" field of the resulting VCF file.

    • For SNP entries, probe names are separated by commas (,).

    • For Indel entries, probe names are separated by semicolons (;).
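When post-processing the VCF, a script therefore needs to accept both delimiters to recover the individual probe names. A minimal sketch follows; the function name and the probe names in the example are hypothetical.

```python
import re


def split_probe_names(id_field: str) -> list:
    """Split a VCF ID field on commas (SNPs) or semicolons (indels)."""
    return [name for name in re.split(r"[,;]", id_field) if name]


print(split_probe_names("probeA,probeB"))  # SNP-style ID field
print(split_probe_names("probeC;probeD"))  # indel-style ID field
```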

    Note on REF/ALT "flipping" for INDELs

    In rare cases, the expected REF and ALT alleles for indels may not match the dbSNP annotations. For example, an expected deletion with REF="ATCG" and ALT="A" may be "flipped" to an insertion with REF="A" and ALT="ATCG". The corresponding genotype output takes this into account, so the VCF is still correct. This is simply a notation issue in some of the manifest files.
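When comparing DRAGEN Array output against an external annotation, it can help to classify each record from the allele lengths as written, so that flipped records are easy to spot. The sketch below is a generic length-based classification and not DRAGEN Array's own logic; the function name is hypothetical.

```python
def classify_indel(ref: str, alt: str) -> str:
    """Classify a REF/ALT pair by comparing allele lengths."""
    if len(ref) > len(alt):
        return "deletion"
    if len(ref) < len(alt):
        return "insertion"
    return "substitution"


# The same event written in both notations gives opposite classifications.
print(classify_indel("ATCG", "A"))  # deletion
print(classify_indel("A", "ATCG"))  # insertion
```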

    Note on PLINK compatibility

    It is possible to make DRAGEN Array genotype VCF files compatible for conversion to PED/MAP format with preprocessing using tabix (v1.19.1), BCFtools (v1.21) and PLINK (v1.9). The following three commands demonstrate the basic process.

    • bcftools merge -l vcf_list.txt -Oz -o merged.vcf.gz creates a single compressed VCF from the individual sample VCFs listed in the vcf_list.txt text file.

    • tabix -p vcf merged.vcf.gz creates a binary index for the merged file.

    • plink --vcf merged.vcf.gz --recode --out merged

    Some optional arguments may be provided to PLINK depending on the content of the VCFs to be converted and the downstream analysis.

    • For VCFs containing non-standard human chromosomes (e.g. haplotype chromosomes or unplaced contigs), the --allow-extra-chr flag can be used.

    • If using non-human data, refer to the PLINK documentation for the --chr-set argument and supported options.

    • By default, PLINK only considers the most common ALT allele for multi-allelic variants. The --biallelic-only flag can be used to skip variants with two or more ALT alleles instead.

    For more information on the options described and others, refer to the PLINK documentation.
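The vcf_list.txt file passed to bcftools merge -l contains one VCF path per line. The sketch below shows one way to build it from a folder of per-sample SNV VCFs. The function name and folder layout are hypothetical, and the *.snv.vcf.gz pattern assumes the default compressed output naming. The demonstration uses empty placeholder files in a temporary directory.

```python
import tempfile
from pathlib import Path


def write_vcf_list(vcf_folder: Path, list_path: Path) -> int:
    """Write one *.snv.vcf.gz path per line and return the number of files listed."""
    vcf_paths = sorted(vcf_folder.glob("*.snv.vcf.gz"))
    list_path.write_text("".join(f"{path}\n" for path in vcf_paths))
    return len(vcf_paths)


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # Empty placeholder files standing in for real DRAGEN Array outputs.
    for name in ("s1.snv.vcf.gz", "s2.snv.vcf.gz", "s1.cnv.vcf.gz"):
        (folder / name).touch()
    count = write_vcf_list(folder, folder / "vcf_list.txt")
    listed = (folder / "vcf_list.txt").read_text().splitlines()

print(count, "SNV VCFs listed")
```

CNV VCFs are deliberately excluded by the glob pattern, since only genotype VCFs are relevant for PED/MAP conversion.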

    Genotype Call (GTC) File

    The genotype call algorithm produces one genotype call file (.gtc) per sample analyzed. The Genotype Call (GTC) file contains the small variant (SNV and indel) genotype for each marker specified by the product, along with sample quality metrics. The marker locations are not included and must be extracted from the manifest file. The proprietary binary format can be parsed using an open-source tool from Illumina.

    Note on lack of i18n: GTCs are binary, fixed-format files designed before modern internationalization and localization tools. There is a related known issue that makes the GTCs unable to be used in downstream analyses. Refer to the same issue for a workaround.

    Note on legacy GTCs: Other Illumina software (such as AutoConvert and Beeline) also produces GTC files. These "legacy GTC" files work in DRAGEN Array genotyping commands such as genotype gtc-to-vcf, but they do not work with all other downstream analyses. We recommend using DRAGEN Array end-to-end starting from IDATs for these analyses.

    BedGraph Files

    The BedGraph files contain the Log R Ratios (LRR.bedgraph) and B Allele Frequencies (BAF.bedgraph) from the genotyping algorithm for use in visualization tools.

    Star Allele CSV File

    The Star Allele CSV file is an intermediate file generated by the pgx star-allele call command and serves as the input to the pgx star-allele annotate command. It contains all the star allele calls for all samples in a run. Each row in the file provides either a star allele diplotype or simple variant call for a PGx-related gene. Star allele diplotype calls for a sample and a gene may span multiple lines where alternative solutions can be listed.

    The Star Allele CSV file also contains meta information marked by # at the top of the file for the genome build and PGx database used for the star allele calling.

    The star_allele.csv file contains the following details per sample:

    Field
    Description

    Below is an example of the first 5 columns from a star allele CSV file:

    Sample,Rank,Gene or Variant,Type,Solution

    204650490282_R02C01,1,CYP2C9,Haplotype,*9/*11

    204650490282_R02C01,1,CYP2C19,Haplotype,*2/*10
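A file with this layout can be read with a standard CSV parser once the meta lines marked by # are skipped. The sketch below uses the example rows above. The function name is hypothetical, and the content of the meta line is a placeholder, since the exact meta information format is not shown here.

```python
import csv
import io

EXAMPLE = """\
# placeholder meta line (actual content not shown in this documentation)
Sample,Rank,Gene or Variant,Type,Solution
204650490282_R02C01,1,CYP2C9,Haplotype,*9/*11
204650490282_R02C01,1,CYP2C19,Haplotype,*2/*10
"""


def read_star_allele_rows(handle) -> list:
    """Skip '#' meta lines and return the remaining rows as dictionaries."""
    data_lines = (line for line in handle if not line.startswith("#"))
    return list(csv.DictReader(data_lines))


rows = read_star_allele_rows(io.StringIO(EXAMPLE))
for row in rows:
    print(row["Sample"], row["Gene or Variant"], row["Solution"])
```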

    Genotype Summary Files

    The software produces genotype summary files (gt_sample_summary.csv and gt_sample_summary.json) that contain the following details per sample:

    • Sample ID

    • Sample Name

    • Sample Folder

    • Autosomal Call Rate

    The TGA_Ctrl_5716 Norm R field is specific to PGx products (e.g., Global Diversity Array with enhanced PGx). The field value is the normalized R value of one probe and serves as an assay control: a value below 1 indicates that the sample failed in the TGA (Targeted Gene Amplification) process. If the product does not have this probe, the field is not included in the gt_sample_summary files.

    The user defined fields from the samplesheet will appear as-is in the gt_sample_summary files. e.g. for the given samplesheet:

    It would produce something like the following gt_sample_summary.csv:

    And something like the following gt_sample_summary.json:

    Note: As of v1.3, samples that fail during genotyping will still be present in this file. See the details in the .

    Final Report

    DRAGEN Array Cloud produces a Final Report (gtc_final_report.csv) per analysis batch similar to the one available in GenomeStudio. It contains the following details per locus per sample:

    Field
    Description

    Note: Analyses on products with large numbers of loci (>1 million) and large numbers of samples (>100) yield a large (50+ gigabyte) Final Report that is difficult to download and review. If large batches are desired, it is recommended to create analysis configurations that do not produce this report.

    For more information on interpreting DNA strand and allele information, see Illumina Knowledge article .

    Locus Summary

    DRAGEN Array Cloud produces a Locus Summary (locus_summary.csv) per analysis batch similar to the one available in GenomeStudio. It contains the following details per locus:

    Field
    Description

    CN Summary File

    The sample summary contains key statistics for each sample in a batch, including the following details per sample:

    • Sample ID

    • Sample Name

    • Sample Folder

    Copy Number Batch File

    The copy number batch summary file (cn_batch_summary.csv) shows the total copy number gain, loss, and neutral (CN=2) values for each target region across all the samples in the analysis.

    Example copy number batch summary file content:

    Target Region,Total CN gain,Total CN loss,Total CN neutral

    CYP2A6.exon.1,0,1,47

    CYP2A6.intron.7,0,1,47

    CYP2D6.exon.9,2,4,42

    CYP2D6.intron.2,7,2,39

    CYP2D6.p5,13,2,33

    CYP2E1,2,0,46

    GSTM1,0,42,6

    GSTT1,0,33,15

    SULT1A1,0,0,48

    UGT2B17,0,34,14

    All Target Regions,24,119,337
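In the example above, the All Target Regions row equals the column-wise sum of the individual target regions. The sketch below checks that relationship on the example content. It is an observation about this example rather than a documented guarantee, and the function name is hypothetical.

```python
import csv
import io

SUMMARY = """\
Target Region,Total CN gain,Total CN loss,Total CN neutral
CYP2A6.exon.1,0,1,47
CYP2A6.intron.7,0,1,47
CYP2D6.exon.9,2,4,42
CYP2D6.intron.2,7,2,39
CYP2D6.p5,13,2,33
CYP2E1,2,0,46
GSTM1,0,42,6
GSTT1,0,33,15
SULT1A1,0,0,48
UGT2B17,0,34,14
All Target Regions,24,119,337
"""

COLUMNS = ("Total CN gain", "Total CN loss", "Total CN neutral")


def totals_match(text: str) -> bool:
    """Check that the summary row equals the sum of the per-region rows."""
    rows = list(csv.DictReader(io.StringIO(text)))
    regions = [r for r in rows if r["Target Region"] != "All Target Regions"]
    total = next(r for r in rows if r["Target Region"] == "All Target Regions")
    return all(
        sum(int(r[col]) for r in regions) == int(total[col]) for col in COLUMNS
    )


summary_is_consistent = totals_match(SUMMARY)
print(summary_is_consistent)
```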

    Warning/Error Messages and Logs

    The following scenarios result in a warning or error message:

    • Manifest file used to generate GTC is not the same as the manifest file used to generate the CN model.

    • FASTA files and FASTA index files do not match.

    For the following scenarios, the software reports messages to the terminal output (as either a warning or an error):

    • Indel processing for GTC to VCF conversion failed.

    • The input folder does not contain the required input files.

    • An input file is corrupt.

    Examples of such notifications can include the following:

    Error
    Type
    Cause

    Star allele JSON File

    One star allele JSON file is produced per sample. It contains the fields present in the Star Allele CSV file as well as additional metadata and annotations.

    Fields included in the star allele JSON header are described below.

    Field
    Description

    Fields included in the star allele call (locusAnnotations) information are described below.

    Field
    Description

    Fields included in the candidateSolution section, only available for star allele call type, are described below.

    Field
    Description

    Example of JSON file content:

    Guidance on alternative star-allele results

    Typically, the star allele solution with the highest quality score is accepted as the final genotype (i.e., star allele diplotype) for the PGx locus. In rare cases, lower-ranked star allele solutions have quality scores of at least 50% of the highest quality score. These lower-ranked solutions are considered feasible, and all of them are listed in the genotype field of the locus annotation of the PGx gene in the PGx JSON file. Alternative solutions should also be considered if there are supporting variants for those solutions with low GS scores (less than 0.15). The clustering of these low-GS supporting variants should also be evaluated for cluster quality and any potential cluster shift.
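The 50% rule can be sketched as follows. This is an illustration of the rule as described above, not DRAGEN Array's implementation. The function name, the diplotypes, and the quality scores in the example are hypothetical.

```python
def feasible_solutions(scored_solutions, ratio=0.5):
    """Return solutions scoring at least `ratio` times the top score, best first.

    scored_solutions: list of (solution, quality_score) tuples.
    """
    if not scored_solutions:
        return []
    top_score = max(score for _, score in scored_solutions)
    ranked = sorted(scored_solutions, key=lambda item: item[1], reverse=True)
    return [solution for solution, score in ranked if score >= ratio * top_score]


# Hypothetical candidate diplotypes with hypothetical quality scores.
candidates = [("*1/*2", 40.0), ("*2/*10", 10.0), ("*1/*10", 25.0)]
feasible = feasible_solutions(candidates)
print(feasible)
```

Here the top score is 40, so any solution scoring 20 or more is kept as feasible, and the solution scoring 10 is dropped.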

    Cytogenetics Annotation JSON File

    DRAGEN Array produces one cytogenetics annotation JSON (*.json) per sample to report more sample-level, chromosome-level, and event-level metrics and annotations.

    Example of JSON file content:

    The fields in the annotation JSON for each sample are described as follows.

    Field
    Description

    The fields for each chromosome under the chromosomeAnnotations field of the Cyto annotation JSON are described below.

    Field
    Description

    The fields within each variant (CNV/LOH event) under the locusAnnotations field of the Cyto annotation JSON are described below.

    Field
    Description

    The traditionalNomenclature field is used to describe individual and cumulative copy number variants at a coarse resolution. They show gains (dup) and losses (del) according to their chromosome number, arm (p or q), and band (e.g., p36.13). The microarrayNomenclature field follows the Comparative Genomic Hybridization or SNP array conventions. Values are prefixed with arr[] to indicate array data as the source, along with the genome build e.g. GRCh38. These data can more precisely describe the location (start_end in bp) and copy number (x1 for loss, x3 for gain, etc).

    The genes list field for each variant includes the genes whose transcript coordinates intersect the described variant. The values are a combination of HGNC gene symbols taken from the NCBI RefSeq database and the subset of Ensembl gene accessions that are unmatched to a gene symbol.

    TBI Index File

    The TBI (TABIX) index file is associated with the bgzipped VCF files. It allows for data line lookup in VCF files for quick data retrieval. The format is a tab-delimited genome index file developed by Samtools as part of the HTSlib utilities. For more information, visit the website.

    Methylation Control Probe Output File

    The software produces a control probe output file ({BeadChipBarcode}_{Position}_ctrl.tsv.gz) per sample that includes the raw methylated and unmethylated values for each control probe.

    Each control probe has an address, type, color channel, name, and probe ID. It also provides the raw signal for methylated green (MG), methylated red (MR), unmethylated green (UG) and unmethylated red (UR).

    The file can help identify which probes are available on a given BeadChip.

    Methylation CG Output File

    The software produces a CG output file ({BeadChipBarcode}_{Position}_cgs.tsv.gz) per sample that includes beta values, M-values, and detection p-values for each CG site.

    Beta values measure methylation levels in a linear fashion for easy interpretation. Unmethylated probes are close to zero and methylated probes are close to 1.

    M-values are log-transformed beta values, which provide a more representative measure of methylation.

    Detection p-values measure the likelihood that the signal is background noise. It is recommended that probes with a detection p-value > 0.05 be excluded from analysis, as they are likely background noise.

    See the software tech note for further detail on the calculation of these metrics.
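
As a rough illustration of how these three metrics relate, the sketch below converts a beta value to an M-value with the commonly used logit transform, M = log2(beta / (1 - beta)), and masks probes whose detection p-value exceeds 0.05. The function names, the clipping constant, and the input values are assumptions made for illustration; the calculation the software actually uses is the one described in the tech note.

```python
import math


def beta_to_m(beta, eps=1e-6):
    """Convert a beta value (0..1) to an M-value with the logit transform.

    eps is an assumed clipping constant that keeps the logarithm finite
    when beta is exactly 0 or 1.
    """
    clipped = min(max(beta, eps), 1.0 - eps)
    return math.log2(clipped / (1.0 - clipped))


def mask_undetected(betas, pvals, max_p=0.05):
    """Replace beta values with None where the detection p-value > max_p."""
    return [b if p <= max_p else None for b, p in zip(betas, pvals)]
```

A beta value of 0.5 maps to an M-value of 0, and values closer to 0 or 1 map to increasingly negative or positive M-values.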

    Methylation Sample QC Summary Files

    The software produces a methylation sample QC summary in .xlsx and .tsv file formats (sample_qc_summary.xlsx and sample_qc_summary.tsv) per analysis batch, which provides per-sample QC data for all samples in the batch.

    The QC summary provides details on 21 control metrics (see tables below), which are computed in the same way as in the BeadArray Controls Reporter software from Illumina. In addition, it provides average red and green raw and normalized signals, time of scanning, proportion of probes passing, overall sample pass/fail status, and the failure codes for control metrics that did not pass. A sample passes only if all 21 control metrics pass. The QC summary .xlsx file further highlights failing parameters for easy viewing.

    The QC summary files contain the following fields:

    Field
    Description

    The control metrics in the QC summary files are calculated as follows. The background correction offset (x) defaults to 3,000, can be modified, and applies to all background calculations indicated with (bkg + x). Note that the table uses the default thresholds for EPIC arrays as an example; the default thresholds vary with the methylation array. See section for additional details.
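
The background-corrected ratios used by several control metrics can be sketched as follows. This is a minimal illustration, assuming the form ((bkg + x)/ctrl) > threshold with the default offset x = 3,000 and a threshold of 1; the function names and intensity values are hypothetical, and the thresholds that apply depend on the array type.

```python
def background_ratio(bkg, ctrl, offset=3000):
    """Return (bkg + offset) / ctrl for one control probe.

    bkg    -- background intensity (for example, the highest A or T
              intensity of the green extension controls)
    ctrl   -- intensity of the control being tested
    offset -- background correction offset x (default 3,000)
    """
    if ctrl <= 0:
        raise ValueError("control intensity must be positive")
    return (bkg + offset) / ctrl


def metric_passes(ratio, threshold=1.0):
    """A metric passes when its ratio is strictly above the threshold."""
    return ratio > threshold
```

With a background of 500 and a control intensity of 700, the ratio is 3,500/700 = 5, which passes a threshold of 1. Raising or lowering the offset changes every (bkg + x) metric at once.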

    Methylation Sample QC Summary Plots

    The software produces a methylation sample QC summary plots file (sample_qc_summary.pdf) per analysis batch, which contains two QC summary plots for quick visual review.

    The file contains the following control plots:

    Control Plot
    Description

    Methylation Principal Component Summary

    The software produces a methylation principal component summary file (pcs.tsv.gz) per analysis batch, which provides principal component data for each sample within the batch. This can be used to identify the specific samples associated with points on the PCA control plot within the Methylation Sample QC Summary Plots output file.

    The files contain the following fields:

    Field
    Description

    Methylation Manifest Files

    The software produces two methylation manifest files:

    1. Manifest in Sesame format (probes.csv)

    2. Additional information for control probes (controls.csv)

    The probes.csv file has the following columns:

    Field
    Description

    The controls.csv file has the following columns:

    Field
    Description

    Methylation Warning/Error Messages and Logs

    The following scenarios result in a warning or error message:

    • Missing IDATs or manifest

    • Incorrect sample sheet formatting

    • Duplicate BeadChip Barcode and Position within the sample sheet

    Examples of such notifications can include the following:

    Multiple SNPs in the input manifest which are mapped to the same chromosomal coordinate (e.g. tri-allelic loci or duplicated sites) are collapsed into one VCF entry and a combined genotype generated. To produce the combined genotype, the set of all possible genotypes is enumerated based on the queried alleles. Genotypes which are not possible based on called alleles and assay design limitations (e.g. Infinium II designs cannot distinguish between A/T and C/G calls) are filtered. If only one consistent genotype remains after the filtering process, then the site is assigned this genotype. Otherwise, the genotype is ambiguous (more than 1) or inconsistent (less than 1) and a no-call is returned.
  • Certain SNV and indel calls are skipped and not reported in the VCF. Skipped data can include unmapped loci (i.e., Chr is 0 in the manifest), intensity-only probes used for CNV identification, and indels that do not map back to the genome. See Warning/Error Messages and Logs for messages that may be seen with DRAGEN Array Local related to the skipped data.

  • The BAF and LRR are oriented with Ref as A and Alt as B relative to the reference genome, while GS is agnostic to the reference genome. Users familiar with GenomeStudio may observe BAF and LRR reported in the VCF as 1 minus the value reported in GenomeStudio depending on the Ref Alt allele orientation with the reference genome. GenomeStudio reports these values based on the information in the manifest without knowledge of the reference genome.

  • The SNV VCF files are by default bgzipped (Block GZIP) and have the “.gz” extension. The compression saves storage space and facilitates efficient lookup when indexed with the TBI Index File. To view these files as plain text, they can be uncompressed with bgzip from Samtools or other third-party tools. The SNV VCF must be bgzipped and indexed to be used in downstream DRAGEN Array commands, such as star allele calling.

  • creates .ped and .map files with the prefix provided to --out. argument can be provided to exclude multi-allelic variants altogether. As an alternative, bcftools norm -m - in.vcf.gz -Oz -o out.vcf.gz can be used upstream of PLINK to split multi-allelic variants into bi-allelic records to retain them for downstream processing.
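
Because a bgzipped file is also a valid gzip file, the SNV VCF can be inspected without writing an uncompressed copy. The sketch below uses Python's standard gzip module to separate header lines from the first few data lines; the function name is hypothetical and the example is only a convenience for quick inspection, not part of DRAGEN Array.

```python
import gzip


def read_vcf_lines(path, max_records=5):
    """Return (header_lines, records) from a gzip- or bgzip-compressed VCF.

    header_lines -- all lines starting with '#', without the newline
    records      -- up to max_records data lines, split on tabs
    """
    header, records = [], []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith("#"):
                header.append(line)
            else:
                records.append(line.split("\t"))
                if len(records) >= max_records:
                    break
    return header, records
```

This reads the file sequentially from the start. Fast lookup of a genomic region still requires the TBI index and a tool that understands it.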

    Supporting Variants

    All variants present in the array that support the star allele solution. The field has the following format: Long Solution Star Allele: (Supporting Variants).

    Each supporting variant is listed with essential information extracted from the SNV VCF to assist with troubleshooting, including Chromosome, Location, Reference allele, Alternative allele, Genotype, GenCall score (GS), and B-allele frequency (BAF).

    Missing/Masked Core Variants

    All variants not present in the array or not called in the SNV VCF file for the star allele. The field has the following format: Long Solution Star-Allele: (Missing Variants).

    All Missing Variants in Array

    All core definition variants that are not on the array or are not called in the SNV VCF along with the associated star alleles that are impacted. The field has the following format: Missing Variant: (List of impacted star alleles).

    Collapsed Star-Alleles

    Star alleles that cannot be distinguished from the solution star allele given the input array’s content. The field has the following format: Long Solution Star-Allele: (List of collapsed star alleles).

    The most frequent star allele based on the population frequency of PGx alleles will be the star allele in the solution.

    Score

    Quality score of the solution including the population frequency of PGx alleles. The score ranges from 0 to 1.

    Raw Score

    Raw quality score of the solution without including the population frequency of PGx alleles. The score ranges from 0 to 1.

    Copy Number Solution

    Estimated copy number for each gene region. The field has the following format: Gene Region: Copy Number.

    Call Rate

  • Log R Ratio Std Dev

  • Sex Estimate

  • TGA_Ctrl_5716 Norm R

  • (Optional) User defined fields from the samplesheet

  • Allele 2 – Forward

    Allele 2 corresponds to Allele B and is reported on the Forward strand.

    Allele 1 – Plus

    Allele 1 corresponds to Allele A and is reported on the Plus strand.

    Allele 2 – Plus

    Allele 2 corresponds to Allele B and is reported on the Plus strand.

    GC Score

    Quality metric calculated for each genotype (data point), and ranges from 0 to 1.

    GT Score

    The SNP cluster quality. Score for a SNP from the GenTrain clustering algorithm.

    Log R Ratio

    Base-2 log of the normalized R value over the expected R value for the theta value (interpolated from the R-values of the clusters). For loci categorized as intensity only, the value is adjusted so that the expected R value is the mean of the cluster.

    B Allele Freq

    B allele frequency for this sample as interpolated from known B allele frequencies of 3 canonical clusters: 0, 0.5 and 1 if it is equal to or greater than the theta mean of the BB cluster. B Allele Freq is between 0 and 1, or set to NaN for loci categorized as intensity only.

    Chr

    Chromosome containing the SNP.

    Position

    SNP chromosomal position.
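
The Log R Ratio and B Allele Freq descriptions above can be made concrete with a small sketch. It assumes the commonly described piecewise-linear interpolation between the AA, AB, and BB cluster theta means and takes the expected R value as a given input; the function names and arguments are illustrative and are not taken from the software.

```python
import math


def log_r_ratio(r_observed, r_expected):
    """Base-2 log of the observed normalized R over the expected R."""
    return math.log2(r_observed / r_expected)


def b_allele_freq(theta, t_aa, t_ab, t_bb):
    """Interpolate BAF from the theta means of the three canonical clusters.

    The AA, AB, and BB clusters are anchored at 0, 0.5, and 1. Values at or
    beyond the AA or BB cluster mean are clamped to 0 or 1.
    """
    if theta <= t_aa:
        return 0.0
    if theta >= t_bb:
        return 1.0
    if theta < t_ab:
        return 0.5 * (theta - t_aa) / (t_ab - t_aa)
    return 0.5 + 0.5 * (theta - t_ab) / (t_bb - t_ab)
```

A sample whose theta sits exactly on the AB cluster mean gets a BAF of 0.5, and an observed R twice the expected R gives a Log R Ratio of 1.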

    A/B_Freq

    Frequency of heterozygote calls.

    B/B_Freq

    Frequency of homozygote allele B calls.

    Minor_Freq

    Frequency of the minor allele.

    Gentrain_Score

    Quality score for samples clustered for this locus.

    50%_GC_Score

    50th percentile GenCall score for all samples.

    10%_GC_Score

    10th percentile GenCall score for all samples.

    Het_Excess_Freq

    Heterozygote excess frequency, calculated as (Observed - Expected)/Expected for the heterozygote class. If $f_{ab}$ is the heterozygote frequency observed at a locus, and p and q are the major and minor allele frequencies, then the het excess calculation is the following: $(f_{ab} - 2pq)/(2pq + \varepsilon)$

    ChiTest_P100

    Hardy-Weinberg p-value estimate calculated using genotype frequency. The value is calculated with 1 degree of freedom and is normalized to 100 individuals.

    Cluster_Sep

    Cluster separation score.

    AA_T_Mean

    Normalized theta angles mean for the AA genotype.

    AA_T_Std

    Normalized theta angles standard deviation for the AA genotype.

    AB_T_Mean

    Normalized theta angles mean for the AB genotype.

    AB_T_Std

    Standard deviation of the normalized theta angles for the AB genotype.

    BB_T_Mean

    Normalized theta angles mean for the BB genotypes.

    BB_T_Std

    Standard deviation of the normalized theta angles for the BB genotypes.

    AA_R_Mean

    Normalized R value mean for the AA genotypes.

    AA_R_Std

    Standard deviation of the normalized R value for the AA genotypes.

    AB_R_Mean

    Normalized R value mean for the AB genotypes.

    AB_R_Std

    Standard deviation of the normalized R value for the AB genotypes.

    BB_R_Mean

    Normalized R value mean for the BB genotypes.

    BB_R_Std

    Standard deviation of the normalized R value for the BB genotypes.

    Plus/Minus Strand

    Designated "+" or "-" with respect to the reference genome strand. "U" designates unknown.
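
The Het_Excess_Freq formula in the table above can be worked through in a few lines. This sketch assumes that the allele frequencies are derived from the genotype counts at the locus and that the epsilon term is a small constant guarding against division by zero; its value and the function names here are assumptions.

```python
def het_excess(f_ab, p, q, eps=1e-9):
    """(f_ab - 2pq) / (2pq + eps) for observed heterozygote frequency f_ab."""
    expected = 2.0 * p * q
    return (f_ab - expected) / (expected + eps)


def het_excess_from_counts(n_aa, n_ab, n_bb, eps=1e-9):
    """Compute het excess from genotype counts at one locus."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one called genotype is required")
    freq_a = (2 * n_aa + n_ab) / (2 * n)
    freq_b = 1.0 - freq_a
    p, q = max(freq_a, freq_b), min(freq_a, freq_b)  # major, minor
    return het_excess(n_ab / n, p, q, eps)
```

A locus in Hardy-Weinberg proportions (for example 25 AA, 50 AB, 25 BB) gives a value near 0, while a locus where every sample is called AB gives a value near 1.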

    Skipping indel: {identifier}

    Warning

    Indel context (deletion/insertion) could not be determined.

    Failed to process entry for record: {identifier}

    Warning

    Unable to determine reference allele for indel.

    Incomplete match of source sequence to genome for indel: {identifier}

    Warning

    Indel not properly mapped to the reference genome.

    Failed to combine genotypes due to ambiguity - exm1068284 (InfiniumII): TT, ilmnseq_rs1131690890_mnv (InfiniumII): AA, rs1131690890_mnv (InfiniumII): AA

    Warning

    Detailed information about a NoCall ("./.") in the VCF that results from combining multiple probes that assay the same variant with conflicting results. The example here is two probes with homozygous REF genotypes (AA) and one probe with a homozygous ALT genotype (TT).

    Cluster file ({GTC.egt}) is not the same as CN Model Cluster file ({CN_Model.egt}).

    Warning

    The cluster file used to generate the GTCs for copy number calling is not the same as the one used to generate the GTCs for the copy number training that created the input CN model. Though the CNV model is robust to minor cluster file updates, CNV training should be considered when there are significant updates in the cluster file. To remove the warning, do one of the following: re-run copy number training with new GTCs generated with the new cluster file during genotyping; use a different CN model built with the expected cluster file; or use GTCs for copy number calling that were generated with the same cluster file that was used to generate the input CN model.

    {numPassingSamples} sample(s) passed QC.

    Requires at least {minPassingSamples} samples to proceed.

    Error

    CNV calling is batch dependent and requires a minimum number of high-quality samples to make accurate calls. Add more high-quality samples to the analysis batch to resolve the error.

    Invalid manifest file path {manifestPath}

    Error

    The application could not find the manifest file at the provided path. Check that the path was entered correctly.

    Failed to load cluster file: {e.Message}

    Error

    Corrupted file or unsupported version.

    System.IO.EndOfStreamException: Unable to read beyond the end of the stream.

    Error

    Likely failure to read a GTC file, see this for more details on root cause and a workaround

    sampleId

    Sentrix barcode and position of the sample.

    locusAnnotations

    The star allele call information.

    rawScore

    Raw quality score of the solution without including the population frequency of PGx alleles. The score ranges from 0 to 1.

    supportingVariants

    All variants present in the array that support the star allele solution. The field provides an array (list) of supporting Variants.

    Each supporting variant is listed with essential information extracted from the SNV VCF to assist with troubleshooting, including Chromosome (chrom), Location (pos), Reference allele (ref), Alternative allele (alt), Genotype (gt), GenCall score (gs), B-allele frequency (baf), the variant ID (id), and the associated star allele IDs (alleleIds).

    candidateSolutions

    The set of alternative star allele calling solutions. This is only relevant for genes of the ‘Star Allele’ call type.

    missingVariantSites

    All core variants that are not available (e.g. not on the array, or no calls in the SNV VCF) for star allele calling for this gene. For star alleles, the field provides an array (list) of variant "id" and impacted "alleleIds" pairs

    allelesTested

    Alleles that are covered by the star allele caller. The capability to call star alleles is also dependent on array content coverage and data quality. This field is defined by the array's content and will be the same across all samples.

    alleles

    The composite alleles of the candidate genotype solution.

    solutionLong

    Long format solution for star alleles. The field has the following format: Structural Variant Type: Underlying Star allele.

    An example of a long solution is: Complete: CYP2D6*4, Complete: CYP2D6*10, CYP2D6*68: CYP2D6*4 where there are two complete alleles that have CYP2D6*4 and CYP2D6*10 haplotypes and one CYP2D6*68 structural variant that has a CYP2D6*4 haplotype configuration.

    supportingVariants

    All variants present in the array that support the star allele solution. The field provides an array (list) of supporting Variants.

    Each supporting variant is listed with essential information extracted from the SNV VCF to assist with troubleshooting, including Chromosome (chrom), Location (pos), Reference allele (ref), Alternative allele (alt), Genotype (gt), GenCall score (gs), B-allele frequency (baf), and the variant ID (id).

    missingVariantSites

    All variants not present in the array or not called in the SNV VCF file for the star allele solution. The field provides an array (list) of missing variants.

    collapsedAlleles

    Star alleles that cannot be distinguished from the solution star allele given the input array’s content. The field has the following format: Long Solution Star-Allele: (List of collapsed star alleles).

    The most frequent star allele based on the population frequency of PGx alleles will be the star allele in the solution.

    copyNumberRegions

    Gene regions for the copy numbers listed in copyNumberSolution.

    copyNumberSolution

    Estimated copy number for each gene region listed in copyNumberRegions.

    iscnVersion

    Release date of ISCN formatting used.

    sampleId

    ID string assigned to the sample.

    gcCorrect

    Boolean indicating whether GC correction was enabled.

    minDelProbes

    Deletions must contain this many probes to be reported.

    minDupProbes

    Duplications must contain this many probes to be reported.

    minLOHProbes

    LOH variants must contain this many probes to be reported.

    minDelSize

    Minimum length filter for reporting deletions in kilobases (kb).

    minDupSize

    Minimum length filter for reporting a duplication in kb.

    minLOHSize

    Minimum length filter for reporting a loss-of-heterozygosity (LOH) variant in kb.

    minQual

    Minimum quality score filter for reporting a variant.

    overallPloidy

    Arithmetic mean of the ploidy across the genome. This value accounts for the length of all variant calls. The baseline ploidy value without any variants will differ by sex.

    callRate

    Frequency of expected calls i.e. #Calls/(#No_Calls + #Calls).

    logRDev

    Standard deviation of the Log R ratio values for all probes.

    bafDev

    Standard deviation of the B allele frequency values for each assigned genotype (AA/AB/BB).

    numLOHOver1M

    Count of LOH variants detected > 1 Mbp in length.

    numLOHOver8M

    Count of LOH variants detected > 8 Mbp in length.

    totalSizeLOHOver1M

    Cumulative length of all detected LOH variants > 1 Mbp in length.

    copyNumberMedian

    Length-normalized genome-wide median copy number value. Copy number values are assigned to contiguous segments of variable size in the genome by the algorithm. The length-weighted copy numbers of each variant are aggregated to calculate the overall median.

    percentLOH

    Percent of the genome comprised of LOH variants.

    sexEstimate

    Detected sex of the sample.

    traditionalNomenclature

    Simplified ISCN format designation for all detected variants in the sample.

    microarrayNomenclature

    ISCN format designation for all detected variants in the sample.

    chromosomeAnnotations

    Counts of each type of variant detected per chromosome, including mosaic calls.

    locusAnnotations

    Locus level statistics (see additional table for locus-level statistics).

    numLOHOver1M

    Count of LOH variants detected > 1 Mbp in length.

    numLOHOver8M

    Count of LOH variants detected > 8 Mbp in length.

    totalSizeLOHOver1M

    Cumulative length of all detected LOH variants > 1 Mbp in length.

    percentLOH

    Percent of the genome comprised of LOH variants.

    copyNumberMedian

    Copy number median of the chromosome.

    copyNumberMean

    Copy number mean of the chromosome.

    minLogRRatio

    Minimum Log R ratio of the chromosome.

    maxLogRRatio

    Maximum Log R ratio of the chromosome.

    medianMosaicFraction

    Median mosaic fraction of mosaic events on the chromosome.

    numberDel

    Number of deletion events on the chromosome.

    numberDup

    Number of duplication events on the chromosome.

    numberLOH

    Number of LOH events on the chromosome.

    numberMosaic

    Number of mosaic events on the chromosome.

    copyNumber

    Copy number of the locus.

    qualityScore

    Phred-scaled score of the variant call quality.

    size

    Length of the variant.

    effectiveSize

    Gap-excluded length of the variant. A gap is defined when probe spacing is more than 150 times the median probe spacing. In that case, the gap is replaced with the median probe spacing.

    probeCount

    Number of probes contained in the called variant region.

    percentHet

    Percent of probes in the region called as heterozygous, i.e. AB.

    lrrMedian

    Median log R ratio value of the probes within the variant.

    lrrDev

    Standard deviation of the log R ratio values of the probes within the variant.

    bafDev

    Standard deviation of the B allele frequency values of the probes within the variant.

    startCytoBand

    Cytoband in which the variant starts.

    endCytoBand

    Cytoband in which the variant ends.

    traditionalNomenclature

    Simplified ISCN format designation for the detected variant.

    microarrayNomenclature

    ISCN format designation for the detected variant.

    geneCount

    Count of annotated genes within the variant region.

    genes

    List of names of all annotated genes within the variant region.
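
The effectiveSize rule described above (a gap is any probe spacing more than 150 times the median spacing, and each gap is replaced by the median spacing) can be sketched as follows. The sketch assumes the median is taken over the probes supplied to the function; whether the software uses a per-variant or a genome-wide median is not stated here, and the function name is hypothetical.

```python
from statistics import median


def effective_size(positions, gap_factor=150):
    """Gap-excluded length covered by a set of probe positions (in bp).

    Any spacing larger than gap_factor times the median spacing is treated
    as a gap and counted as one median spacing instead.
    """
    ordered = sorted(positions)
    spacings = [b - a for a, b in zip(ordered, ordered[1:])]
    if not spacings:
        return 0
    med = median(spacings)
    return sum(med if s > gap_factor * med else s for s in spacings)
```

For probes at 0, 10, 20, 30, 40, 10040, and 10050 bp, the plain size is 10,050 bp, while the effective size is 60 bp because the 10,000 bp gap is counted as a single median spacing of 10 bp.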

    extension_green

    extension_red

    • Extension controls test the extension efficiency of A, T, C, and G nucleotides from a hairpin probe, and are therefore sample independent.

    • In the green channel, the lowest intensity for C or G is always greater than the highest intensity for A or T.

    • The metric provided is the (lowest of the C or G intensity)/ (highest of A or T extension) for a single sample.

    hybridization_high_medium

    hybridization_medium_low

    • Hybridization controls test the overall performance of the Infinium Assay using synthetic targets instead of amplified DNA. These synthetic targets complement the sequence on the array, allowing the probe to extend on the synthetic target as a template. Synthetic targets are present in the Hybridization Buffer at 3 levels, monitoring the response from high-concentration (5 pM), medium concentration (1 pM), and low concentration (0.2 pM) targets. All bead type IDs result in signals with various intensities, corresponding to the concentrations of the initial synthetic targets.

    • The value for high concentration is always higher than medium and the value for medium concentration is always higher than low.

    • The metric provided is the value of high/medium and the value of medium/low.

    target_removal1

    target_removal2

    • Target removal controls test the efficiency of the stripping step after the extension reaction. In contrast to allele-specific extension, the control oligos are extended using the probe sequence as a template. This process generates labeled targets. The probe sequences are designed such that extension from the probe does not occur. All target removal controls result in low signal compared to the hybridization controls, indicating that the targets were removed efficiently after extension. Target removal controls are present in the Hybridization Buffer.

    • The Background for the same sample is close to or larger than either control.

    • The metric provided is

    bisulfite_conversion1_green

    bisulfite_conversion1_background_green

    bisulfite_conversion1_red

    bisulfite_conversion1_background_red

    • These controls assess the efficiency of bisulfite conversion of the genomic DNA. The Infinium Methylation probes query a [C/T] polymorphism created by bisulfite conversion of non-CpG cytosines in the genome.

    • These controls use Infinium I probe design and allele-specific single base extension to monitor efficiency of bisulfite conversion. If the bisulfite conversion reaction was successful, the "C" (Converted) probes match the converted sequence and get extended. If the sample has unconverted DNA, the "U" (Unconverted) probes get extended. There are no underlying C bases in the primer landing sites, except for the query site itself.

    • The calculation is done in both the green and red channels separately to provide 2 unique sets of values:

    bisulfite_conversion2

    bisulfite_conversion2_background

    • These controls assess the efficiency of bisulfite conversion of the genomic DNA. The Infinium Methylation probes query a [C/T] polymorphism created by bisulfite conversion of non-CpG cytosines in the genome.

    • These controls use Infinium II probe design and single base extension to monitor efficiency of bisulfite conversion. If the bisulfite conversion reaction was successful, the "A" base gets incorporated and the probe has intensity in the red channel. If the sample has unconverted DNA, the "G" base gets incorporated across the unconverted cytosine, and the probe has elevated signal in the green channel.

    • The calculation is done using both channels for 1 set of numbers returned.

    specificity1_green

    specificity1_red

    • Specificity controls are designed to monitor potential nonspecific primer extension for Infinium I and Infinium II assay probes. Specificity controls are designed against nonpolymorphic T sites.

    • These controls are designed to monitor allele-specific extension for Infinium I probes. The methylation status of a particular cytosine is determined following bisulfite treatment of DNA by using query probes for the unmethylated and methylated states of each CpG locus. In assay oligo design, the A/T match corresponds to the unmethylated status of the interrogated C, and the G/C match corresponds to the methylated status of C. G/T mismatch controls check for nonspecific detection of methylation signal over unmethylated background. PM controls correspond to A/T perfect match and give high signal. MM controls correspond to G/T mismatch and give low signal.

    specificity2

    specificity2_background

    • Specificity controls are designed to monitor potential nonspecific primer extension for Infinium I and Infinium II assay probes. Specificity controls are designed against nonpolymorphic T sites.

    • These controls are designed to monitor extension specificity for Infinium II probes and check for potential nonspecific detection of methylation signal over unmethylated background. Specificity II probes incorporate the "A" base across the nonpolymorphic T and have intensity in the Red channel. If there was nonspecific incorporation of the "G" base, the probe has elevated signal in the Green channel.

    • The following metrics are provided:

    nonpolymorphic_green

    nonpolymorphic_red

    • Nonpolymorphic controls test the overall performance of the assay, from amplification to detection, by querying a particular base in a nonpolymorphic region of the genome. They let you compare assay performance across different samples. One nonpolymorphic control has been designed for each of the 4 nucleotides (A, T, C, and G).

    • In the green channel, the lowest intensity of C or G is always greater than the highest intensity of A or T.

    • The metric provided is the (lowest intensity for C or G) /(highest intensity for A or T) for a single sample.

    avg_green_raw

    avg_red_raw

    • Average green and red raw signal for the given sample.

    avg_green_norm

    avg_red_norm

    • Average green and red signal after dye bias correction and noob normalization for the given sample.

    ScanTime

    • The date (MM/DD/YY) and time (HH:MM) that the sample was scanned by the iScan system.

    NProbes

    • Number of probes on the BeadChip, including SNP and CG probes

    NPassDetection

    • Number of probes on the BeadChip that passed detection p-value at the threshold defined.

    prop_probes_passing

    • The proportion of probes passing defined as the number of probes passing detection p-value divided by the total number of probes on the BeadChip.

    passQC

    • 1 = sample passed all QC metrics for the thresholds defined

    • 0 = sample did not pass all QC metrics for the thresholds defined

    failCodes

    • The list of parameters that failed QC for the thresholds defined.
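
The relationship between the summary fields above can be sketched in code. This is a simplified illustration that assumes every control metric is present and passes when its value is strictly greater than its threshold; the function name and the way failCodes are represented are hypothetical, and only two of the 21 metrics are shown.

```python
def summarize_sample(metrics, thresholds, n_probes, n_pass_detection):
    """Derive prop_probes_passing, passQC, and failCodes for one sample.

    metrics    -- dict of control metric name -> computed value
    thresholds -- dict of control metric name -> minimum ratio to pass
    """
    fail_codes = [name for name, limit in thresholds.items()
                  if not metrics[name] > limit]
    return {
        "prop_probes_passing": n_pass_detection / n_probes,
        "passQC": 0 if fail_codes else 1,
        "failCodes": fail_codes,
    }


example = summarize_sample(
    metrics={"extension_green": 12.0, "hybridization_high_medium": 0.8},
    thresholds={"extension_green": 5, "hybridization_high_medium": 1},
    n_probes=1000,
    n_pass_detection=900,
)
```

In this example the sample fails because hybridization_high_medium is not above its threshold, so passQC is 0 and that metric is the only entry in failCodes.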

    (A or T/C or G) > 5

    Red channel—Lowest A or T intensity is used; highest C or G intensity is used.

    Hybridization Green High > Medium > Low

    (High/Med) > 1 (Med/Low) > 1

    Target Removal Green ctrl 1 ≤ bkg

    ((bkg + x)/ctrl) > 1

    bkg = Extension Green highest A or T intensity

    Target Removal Green ctrl 2 ≤ bkg

    ((bkg + x)/ctrl) > 1

    bkg = Extension Green highest A or T intensity

    Bisulfite Conversion I Green

    C1, 2 > U1, 2

    (C/U) > 1

    • Lowest C intensity is used. Highest U intensity is used.

    Bisulfite Conversion I Green

    U ≤ bkg

    ((bkg + x)/U) > 1

    • For MSA arrays, the default is 0.5

    • Highest U intensity is used.

    • Green channel—bkg = Extension Green highest AT

    Bisulfite Conversion I Red C3, 4, 5 > U3, 4, 5

    (C/U) >1

    • Lowest C intensity is used. Highest U intensity is used.

    Bisulfite Conversion I Red U ≤ bkg

    ((bkg + x)/U) > 1

    • For MSA arrays, the default is 0.5

    • Highest U intensity is used.

    • Red Channel—bkg = Extension Red highest CG

    Bisulfite Conversion II C Red > C Green

    (C Red/ C Green) > 1

    • For MSA arrays, the default is 0.5

    • Lowest C Red intensity is used. Highest C Green intensity is used.

    Bisulfite Conversion II C green ≤ bkg

    ((bkg + x)/C Green) > 1

    • For MSA arrays, the default is 0.5

    • Highest C Green intensity is used.

    • Green channel—bkg = Extension Green highest AT

    Specificity I Green PM > MM

    (PM/MM) > 1

    • Lowest PM intensity is used. Highest MM intensity is used

    Specificity I Red PM > MM

    (PM/MM) > 1

    • Lowest PM intensity is used. Highest MM intensity is used

    Specificity II

    S Red > S Green

    (S Red/ S Green) > 1

    • Lowest S Red intensity is used. Highest S Green intensity is used.

    Specificity II

    S Green ≤ bkg

    ((bkg + x)/ S green) > 1

    • bkg = Extension Green highest A or T intensity

    • Highest S Green intensity is used.

    Nonpolymorphic Green Lowest CG/ Highest AT

    (C or G/ A or T) > 5

    • Lowest C or G intensity is used; highest A or T intensity is used

    • For MSA arrays, the default threshold is 2.5

    Nonpolymorphic Red Lowest AT/ Highest CG

    (A or T/ C or G) >5

    • Lowest A or T intensity is used; highest C or G intensity is used

    • For MSA arrays, the default threshold is 3

    Missing control or assay probes
  • Missing required columns in the manifest

  • Unable to compute certain metrics

  • format_samplesheet.log

    beadChipName and sampleSectionName columns are required for the sample sheet.

    Error

    Sample sheet does not contain required columns: beadChipName and sampleSectionName.

    format_samplesheet.log

    Warning: <Number> samples have duplicate Sample_ID

    Warning

    X lines in the sample sheet have duplicate <beadChipName>_<sampleSectionName>. Duplicates are dropped from analysis.

    convert_manifest_ilmn_sesame.log

    Missing control probes in manifest

    Error

    Missing “[Controls]” line in CSV manifest

    convert_manifest_ilmn_sesame.log

    Probe section not found

    Error

    Missing “[Assay]” line in CSV manifest

    convert_manifest_ilmn_sesame.log

    Missing required columns: IlmnID, AddressA_ID, AddressB_ID, Color_Channel

    Error

    One of the required columns is missing from the Assay section of the manifest

    convert_manifest_ilmn_sesame.log

    Controls not formatted correctly. Must have 4 columns (Address,Type,Color_Channel,Name)

    Error

    One of the required columns is missing from the Control section of the manifest

    run_sesame_gs.log

    Missing sample: <Sample_ID>

    Error

    Missing IDATs for a particular sample

    run_sesame_gs.log

    No scan time available

    Warning

    No scan time in the IDAT

    run_sesame_gs.log

    Prep failed

    Error

    Dye bias correction or noob failure for sample

    run_sesame_gs.log

    Warning: missing control probe types <Missing probes>

    Warning

    Missing control probe types to compute a BACR metric. Metric will be set to NA.

    run_sesame_gs.log

    Warning: missing control probe names <Missing probe types>

    Warning

    Missing control probes to compute a BACR metric. Metric will be set to NA.

    qc.log

    No features, skipping PCA plot

    Warning

    No common betas found in all samples. This may occur if a sample has no signal intensity in the IDAT files.

    Inf II [A/G] -> AA + Inf I [T/G] -> TT

    AT

    NoCall

    The Inf II assay cannot differentiate A versus T alleles, hence AA call for the Inf II assay is consistent with AA, AT, or TT genotypes. The Inf I assay TT call, however, is consistent with AT or TT genotypes. Combining the two assays results in a NoCall due to the AT or TT ambiguity.

    Inf I [T/A] -> AA + Inf II [A/G] -> NC

    AA

    NoCall

    NoCall for Inf II probe leaves possibility of AG. Ambiguity between AG or AA leads to NoCall.
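One way to read the examples above is as a set intersection: each assay result is consistent with some set of genotypes, and a combined call is only possible when exactly one genotype remains. The following is a minimal Python sketch of that reasoning, not DRAGEN Array's implementation. The function name `combine_calls` and the set encoding are assumptions made for illustration, and the candidate sets are supplied by the caller.

```python
def combine_calls(*candidate_sets):
    """Intersect per-assay candidate genotype sets (hypothetical helper).

    Returns the single genotype consistent with every assay, or "NoCall"
    when zero or several genotypes remain.
    """
    remaining = set.intersection(*map(set, candidate_sets))
    return next(iter(remaining)) if len(remaining) == 1 else "NoCall"


# First example above: the Inf II [A/G] AA call cannot separate A from T,
# and the Inf I [T/G] TT call is consistent with AT or TT.
inf2_candidates = {"AA", "AT", "TT"}
inf1_candidates = {"AT", "TT"}
print(combine_calls(inf2_candidates, inf1_candidates))  # NoCall

# Hypothetical unambiguous case: only one genotype survives the intersection.
print(combine_calls({"AA", "AG"}, {"AA"}))  # AA
```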

    Sample

    Sentrix barcode and position of the sample.

    Rank

    Rank of a single star allele solution for a gene. The top solution based on quality score is ranked as 1 with the alternative solutions ranked lower.

    Gene or Variant

    The gene symbol, or gene symbol plus rsID for variants.

    Type

    ‘Haplotype’ (star allele) or ‘Variant’ PGx calling type.

    Solution

    Star allele or variant solution. If diploid, variant solutions have the format of Allele1/Allele2.

    Solution Long

    Long format solution for star alleles. The field has the following format: Structural Variant Type: Underlying Star allele.

    An example of a long solution is: Complete: CYP2D6*4, Complete: CYP2D6*10, CYP2D6*68: CYP2D6*4, where there are two complete alleles that have CYP2D6*4 and CYP2D6*10 haplotypes and one CYP2D6*68 structural variant that has a CYP2D6*4 haplotype configuration.

    SNP Name

    SNP identifier.

    SNP

    SNP alleles as reported by assay probes. Alleles on the Design strand (the ILMN strand) are listed in order of Allele A/B.

    Sample ID

    Sample identifier.

    Allele 1 – Top

    Allele 1 corresponds to Allele A and is reported on the Top strand.

    Allele 2 – Top

    Allele 2 corresponds to Allele B and is reported on the Top strand.

    Allele 1 – Forward

    Allele 1 corresponds to Allele A and is reported on the Forward strand.

    Locus_Name

    Locus name from the manifest file.

    Illumicode_Name

    Locus ID from the manifest file.

    #No_Calls

    Number of loci with GenCall scores below the call region threshold.

    #Calls

    Number of loci with GenCall scores above the call region threshold.

    Call_Freq

    Call frequency or call rate calculated as follows: #Calls/(#No_Calls + #Calls)

    A/A_Freq

    Frequency of homozygote allele A calls.
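The Call_Freq formula above can be sketched in a few lines of Python. The function name and the example counts are illustrative assumptions, not values from a real run.

```python
def call_freq(num_calls: int, num_no_calls: int) -> float:
    """Call frequency as defined above: #Calls / (#No_Calls + #Calls)."""
    total = num_calls + num_no_calls
    if total == 0:
        raise ValueError("no calls or no-calls to compute a frequency from")
    return num_calls / total


# Hypothetical counts: 985 calls and 15 no-calls.
print(call_freq(num_calls=985, num_no_calls=15))  # 0.985
```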

    Failed to normalize and gencall sample: {sample_id}, it will be skipped. Error: The given key '{loci_id}' was not present in the dictionary.

    Warning

    This generally occurs because of a mismatch between the manifest (bpm) and cluster file (egt) (i.e., the cluster file was generated via a different manifest). To remedy the issue, use the manifest and cluster files intended for use together.

    Reference allele is not queried for locus: {identifier}

    Warning

    True reference allele does not match any alleles in the manifest. The error is common for MNVs and will be addressed in future versions of the software.

    Skipping non-mapped locus: {identifier}

    Warning

    Locus has no chromosome position (usually 0). These loci may be used for quality purposes or CNV calling only.

    Skipping intensity only locus: {identifier}

    Warning

    Similar to non-mapped loci, intensity only probes have applications outside creating variants for SNV VCFs such as CNV calling.

    softwareVersion

    DRAGEN Array software version, e.g. dragena 1.0.0.

    genomeBuild

    Genome build, e.g. hg38.

    starAlleleDatabaseSources

    Public databases with versions used as the sources of the star allele definitions and population frequencies.

    phenotypeDatabaseSources

    Public databases with versions used as the sources of the star allele phenotypes.

    mappingFile

    The PGx database file used for the star allele calling.

    pgxGuideline

    The PGx guidelines used for metabolizer status/phenotype annotations, e.g. CPIC or DPWG.

    gene

    The gene symbol.

    callType

    ‘Star Allele’ or ‘Variant’ PGx calling type.

    genotype

    Most likely star allele or variant solution. If diploid, variant solutions have the format of Allele1/Allele2. More than one solution meeting threshold requirements can be reported. Multiple top solutions are separated by a semi-colon.

    activityScore

    Activity score annotation for the determined genotype of the gene, based on the public PGx guidelines (CPIC or DPWG).

    phenotypeDatabaseAnnotation

    Metabolizer status and function annotations of the determined genotype of the gene based on lookup into public PGx guidelines CPIC or DPWG per user choice.

    qualityScore

    Quality score of the solution including the population frequency of PGx alleles. The score ranges from 0 to 1.

    rank

    Rank of a single star allele solution for a gene. The top solution based on quality score is ranked as 1 with the alternative solutions ranked lower.

    genotype

    Star allele or variant solution. If diploid, variant solutions have the format of Allele1/Allele2.

    activityScore

    Activity score annotation for the determined genotype of the gene, based on the public PGx guidelines (CPIC or DPWG).

    phenotype

    Metabolizer status and function annotations of the determined genotype of the gene based on lookup into public PGx guidelines CPIC or DPWG per user choice.

    qualityScore

    Quality score of the solution including the population frequency of PGx alleles. The score ranges from 0 to 1.

    rawScore

    Raw quality score of the solution without including the population frequency of PGx alleles. The score ranges from 0 to 1.

    annotateDb

    File name of the database annotation file.

    softwareVersion

    Version of DRAGEN Array used for the analysis.

    referenceGenome

    File name of the reference genome.

    annotationType

    Integer representing annotation methodology where 0=Constitutional and 1=Oncology.

    genomeBuild

    Genome build, e.g. hg19 or hg38.

    databaseSources

    Release versions of annotation data.

    id

    Chromosome name.

    size

    Chromosome size.

    percentHet

    Percent of probes in the chromosome called as heterozygous, i.e. AB.

    hasMosaicism

    Boolean indicating presence of any mosaic variant on the chromosome.

    lrrMedian

    Median log R ratio value of the probes within the chromosome.

    lrrDev

    Standard deviation of the log R ratio values of the probes within the chromosome.

    id

    Unique variant ID containing variant type, chromosome and the start and end positions.

    chrom

    Chromosome of the variant.

    start

    Variant start position.

    end

    Variant end position.

    callType

    Variant class (DEL, DUP, LOH).

    mosaicState

    Boolean indicating whether the locus is a mosaic variant.

    Sentrix_ID

    12-digit BeadChip Barcode associated with the sample.

    Sentrix_Position

    Row and column on the BeadChip, e.g. R01C01.

    Sample_ID

    Optional field that can be specified in the IDAT Sample Sheet.

    User Defined Meta Data

    Optional field(s) that can be indicated using IDAT Sample Sheet. Any number of fields indicated will appear in this output file.

    restoration

    • The default threshold is 0.

    • If using the FFPE DNA Restore Kit, the restoration control identifies success of the FFPE restoration chemistry. Change the threshold from 0 to 1 if the FFPE DNA Restore Kit was used.

    • The green channel intensity is higher than Background. Therefore, the metric provided is the Green Channel Intensity/Background.

    staining_green

    staining_red

    • Staining controls are used to examine the efficiency of the staining step in both the red and green channels. These controls are independent of the hybridization and extension step.

    • The green channel shows a higher signal for biotin staining when compared to biotin background, whereas the red channel shows higher signal for DNP staining when compared to DNP background.

    • The metric provided for green is (Biotin High value)/(Biotin Bkg value), and the metric provided for red is (DNP High value)/(DNP Bkg value).

    • The default threshold is 5. This threshold can be increased on some scanners.

    Control

    Calculation

    Additional Information

    Restoration Green > bkg

    (Green/(bkg+x))> 0

    • If using the FFPE Restore kit, change the default threshold from 0 to 1.

    • bkg = Extension Green highest A or T intensity

    Staining Green

    Biotin High > Biotin Bkg

    (High/Biotin Bkg) > 5

    Staining Red

    DNP High > DNP Bkg

    (High/DNP Bkg) > 5

    Extension Green Lowest CG/Highest AT

    (C or G/A or T) > 5

    Green channel—Lowest C or G intensity is used; highest A or T intensity is used.
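The calculations in the table above all have the same shape: a signal intensity divided by a background intensity, compared against a threshold. The following Python sketch shows that pattern under stated assumptions. The function names and the intensity values are hypothetical, and the `offset` argument stands in for the `x` term in the restoration calculation.

```python
def ratio_passes(signal: float, background: float,
                 threshold: float, offset: float = 0.0) -> bool:
    """True when signal / (background + offset) exceeds the threshold."""
    denominator = background + offset
    if denominator <= 0:
        raise ValueError("background + offset must be positive")
    return signal / denominator > threshold


def extension_green_passes(cg_intensities, at_intensities,
                           threshold: float = 5.0) -> bool:
    """Extension Green: lowest C or G over highest A or T must exceed threshold."""
    return ratio_passes(min(cg_intensities), max(at_intensities), threshold)


# Hypothetical intensities, for illustration only.
print(ratio_passes(signal=6000, background=1000, threshold=5))   # True  (Staining Green style check)
print(ratio_passes(signal=4000, background=1000, threshold=5))   # False
print(extension_green_passes([20000, 18000], [1500, 2000]))      # True  (18000 / 2000 = 9)
```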

    Proportion of Probes Passing Threshold

    Histogram of the proportion of probes passing the p-value detection threshold. Samples passing QC are shown in one color, and samples failing QC are shown in another color.

    Principal Component Analysis (PCA)

    Uses beta values for all analytical probes to compare samples. Principal component analysis (PCA) is applied to the beta values to reduce the dimensionality of the data to two “principal components” that reflect the most variation across samples. If more than 100 samples are used in the analysis, a random subset of 10,000 probes is used for the PCA to reduce the computational burden. The PCA control plot assigns a unique color to each sample group defined in the IDAT Sample Sheet. If no groups were assigned, all samples appear in the same color. Samples in the same group may cluster together, which can help explain some of the variation. The coordinates used to plot each sample in the PCA control plot are provided in the pcs.tsv.gz output file (see below).

    blank

    BeadChip Barcode and Position, e.g. 123456789101_R01C01.

    principal component 1

    The variable of the first axis for the Principal Component Analysis

    principal component 2

    The variable of the second axis for the Principal Component Analysis

    Sample_Group

    Sample group defined by the user in the IDAT Sample Sheet. If no sample group was defined, all samples will show NA.
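The probe subsampling rule described for the PCA plot (a random subset of 10,000 probes when more than 100 samples are analyzed) can be sketched as follows. This is an illustration only: the function name, the fixed seed, and the probe IDs are assumptions, and the actual sampling procedure used by the software is not documented here.

```python
import random


def select_pca_probes(probe_ids, n_samples,
                      sample_cutoff=100, max_probes=10_000, seed=0):
    """Return the probes to feed into PCA.

    All probes are kept for runs of up to `sample_cutoff` samples. Larger
    runs get a random subset of at most `max_probes` probes.
    """
    probe_ids = list(probe_ids)
    if n_samples <= sample_cutoff or len(probe_ids) <= max_probes:
        return probe_ids
    return random.Random(seed).sample(probe_ids, max_probes)


# Hypothetical probe IDs.
probes = [f"cg{i:08d}" for i in range(20_000)]
print(len(select_pca_probes(probes, n_samples=50)))    # 20000
print(len(select_pca_probes(probes, n_samples=150)))   # 10000
```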

    Probe_ID

    This is a unique identifier for each probe. It corresponds to the IlmnID column in the standard Illumina manifest format or ctl_[AddressA_ID] for control probes.

    U

    This corresponds to the AddressA_ID column in the standard Illumina manifest format.

    M

    This corresponds to the AddressB_ID column in the standard Illumina manifest format.

    col

    This is the color channel for Infinium I probes (R/G). For Infinium II probes, this column will be NA.

    Address

    The address of the probe

    Type

    The control probe type

    Color_Channel

    A color used to denote certain control probes in legacy software

    Name

    A human readable identifier for certain control probes

    Probe_ID

    This is a unique identifier for each probe. It corresponds to the IlmnID column in the standard Illumina manifest format or ctl_[AddressA_ID] for control probes.

    Log

    Error

    Type

    Cause

    write_samplesheet.log

    No IDATs found

    Error

    No IDATs provided for analysis

    format_samplesheet.log

    No samples in sample sheet

    Error

    No samples in user’s sample sheet input

    format_samplesheet.log

    Sample sheet not correctly formatted

    Error

    Sample sheet is not in CSV format or header lines do not start with “<”

    Threshold Adjustment

    Extension Red

    Lowest AT/Highest CG

    SentrixBarcode_A,SentrixPosition_A,Sample_ID,Sample_Group,MetaData1
    204753010023,R01C01,NA1231,Group1,F
    204753010024,R01C01,NA1233,Group2,M
    Sample ID,Sample Name,Sample Folder,Autosomal Call Rate,Call Rate,Log R Ratio Std Dev,Sex Estimate,SentrixBarcode_A,SentrixPosition_A,Sample_Group,MetaData1
    204753010023_R01C01,204753010023_R01C01,/sample/folder,0.99414575,0.98843694,0.14829777,F,204753010023,R01C01,Group1,F
    204753010024_R01C01,204753010024_R01C01,/sample/folder,0.99415575,0.98943694,0.14929777,M,204753010024,R01C01,Group2,M
    [
      {
        "Sample ID": "204753010023_R01C01",
        "Sample Name": "204753010023_R01C01",
        "Sample Folder": "/sample/folder",
        "Autosomal Call Rate": 0.99414575,
        "Call Rate": 0.98843694,
        "Log R Ratio Std Dev": 0.14829777,
        "Sex Estimate": "F",
        "SentrixBarcode_A": "204753010023",
        "SentrixPosition_A": "R01C01",
        "Sample_Group": "Group1",
        "MetaData1": "F"
      },
      {
        "Sample ID": "204753010024_R01C01",
        "Sample Name": "204753010024_R01C01",
        "Sample Folder": "/sample/folder",
        "Autosomal Call Rate": 0.99415575,
        "Call Rate": 0.98943694,
        "Log R Ratio Std Dev": 0.14929777,
        "Sex Estimate": "M",
        "SentrixBarcode_A": "204753010024",
        "SentrixPosition_A": "R01C01",
        "Sample_Group": "Group2",
        "MetaData1": "M"
      }
    ]
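As a usage illustration for the JSON summary above, the following Python sketch loads records of that shape and lists samples whose call rate falls below a cutoff. The function name and the 0.989 cutoff are assumptions chosen for the example, not recommended QC values. Only a subset of the fields shown above is included.

```python
import json

# Two records trimmed from the example above.
summary_json = """
[
  {"Sample ID": "204753010023_R01C01", "Call Rate": 0.98843694, "Log R Ratio Std Dev": 0.14829777},
  {"Sample ID": "204753010024_R01C01", "Call Rate": 0.98943694, "Log R Ratio Std Dev": 0.14929777}
]
"""


def low_call_rate_samples(records, min_call_rate):
    """Return the Sample IDs whose 'Call Rate' is below min_call_rate."""
    return [r["Sample ID"] for r in records if r["Call Rate"] < min_call_rate]


records = json.loads(summary_json)
print(low_call_rate_samples(records, min_call_rate=0.989))
# ['204753010023_R01C01']
```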
    {
      "softwareVersion": "dragena 1.3.0",
      "genomeBuild": "38",
      "starAlleleDatabaseSources": [
        "PharmVar Version: 6.1",
        "PharmGKB Database Version: Snapshot-2024.05.16",
        "UGT Alleles Nomenclature: 2010.12.21",
        "The Human Cytochrome P450 (CYP) Allele Nomenclature Database, July 2024"
      ],
      "phenotypeDatabaseSources": [
        "CPIC Database Version: 1.38.0",
        "DPWG Database Version: June 2023"
      ],
      "mappingFile": "DRAGENA-549-fix-annotate-sha.e56e884ed1f2d118e796cdab578ab895456bb94e.zip",
      "pgxGuideline": "CPIC",
      "sampleId": "207883050020_R08C03",
      "locusAnnotations": [
        {
          "gene": "CYP2C9",
          "callType": "Star Allele",
          "genotype": "*1/*1",
          "activityScore": "2",
          "phenotypeDatabaseAnnotation": "CYP2C9 Normal Metabolizer",
          "qualityScore": "0.9999",
          "rawScore": "0.9999",
          "supportingVariants": [],
          "candidateSolutions": [
            {
              "rank": 1,
              "genotype": "*1/*1",
              "activityScore": "2",
              "phenotypeDatabaseAnnotation": "CYP2C9 Normal Metabolizer",
              "qualityScore": 0.9999,
              "rawScore": 0.9999,
              "alleles": [
                {
                  "solutionLong": "Complete: *1",
                  "supportingVariants": [],
                  "missingVariantSites": [],
                  "collapsedAlleles": ""
                }
              ],
              "copyNumberRegions": "p5,exon.1,intron.1,exon.2,intron.2,exon.3,intron.3,exon.4,intron.4,exon.5,intron.5,exon.6,intron.6,exon.7,intron.7,exon.8,intron.8,exon.9,p3",
              "copyNumberSolution": "2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2"
            }
          ],
          "missingVariantSites": [
            {
              "id": "NC_000010.11:g.94938719T>G",
              "alleleIds": "*80"
            },
            {
              "id": "NC_000010.11:g.94938788C>T",
              "alleleIds": "*83"
            },
            {
              "id": "NC_000010.11:g.94938800G>A",
              "alleleIds": "*76"
            },
            {
              "id": "NC_000010.11:g.94941975G>A",
              "alleleIds": "*77"
            },
            {
              "id": "NC_000010.11:g.94942243T>G",
              "alleleIds": "*78"
            },
            {
              "id": "NC_000010.11:g.94942306C>T",
              "alleleIds": "*72"
            },
            {
              "id": "NC_000010.11:g.94942308C>T",
              "alleleIds": "*73"
            },
            {
              "id": "NC_000010.11:g.94942309G>T",
              "alleleIds": "*27"
            },
            {
              "id": "NC_000010.11:g.94947939G>T",
              "alleleIds": "*74"
            },
            {
              "id": "NC_000010.11:g.94949145C>T",
              "alleleIds": "*82"
            },
            {
              "id": "NC_000010.11:g.94949163del",
              "alleleIds": "*85"
            },
            {
              "id": "NC_000010.11:g.94972183A>T",
              "alleleIds": "*81"
            },
            {
              "id": "NC_000010.11:g.94981258C>T",
              "alleleIds": "*79"
            },
            {
              "id": "NC_000010.11:g.94986136A>C",
              "alleleIds": "*75"
            },
            {
              "id": "NC_000010.11:g.94986174G>C",
              "alleleIds": "*84"
            }
          ],
          "allelesTested": "*1,*2,*3,*4,*5,*6,*7,*8,*9,*10,*11,*12,*13,*14,*15,*16,*17,*18,*19,*20,*21,*22,*23,*24,*25,*26,*27,*28,*29,*30,*31,*32,*33,*34,*35,*36,*37,*38,*39,*40,*41,*42,*43,*44,*45,*46,*47,*48,*49,*50,*51,*52,*53,*54,*55,*56,*57,*58,*59,*60,*61,*62,*63,*64,*65,*66,*67,*68,*69,*70,*71,*72,*73,*74,*75,*76,*77,*78,*79,*80,*81,*82,*83,*84,*85"
        },
        {
          "gene": "CYP2C19",
          "callType": "Star Allele",
          "genotype": "*1/*2",
          "activityScore": "n/a",
          "phenotypeDatabaseAnnotation": "CYP2C19 Intermediate Metabolizer",
          "qualityScore": "0.9999",
          "rawScore": "0.9958",
          "supportingVariants": [
            {
              "chrom": "10",
              "pos": "94842866",
              "ref": "A",
              "alt": "G",
              "gt": "1/1",
              "gs": "0.2669",
              "baf": "1",
              "id": "NC_000010.11:g.94842866A>G",
              "alleleIds": "*1"
            },
            {
              "chrom": "10",
              "pos": "94775367",
              "ref": "A",
              "alt": "G",
              "gt": "0/1",
              "gs": "0.2191",
              "baf": "0.4690612",
              "id": "NC_000010.11:g.94775367A>G",
              "alleleIds": "*2"
            },
            {
              "chrom": "10",
              "pos": "94781859",
              "ref": "G",
              "alt": "A",
              "gt": "0/1",
              "gs": "0.3351",
              "baf": "0.66212183",
              "id": "NC_000010.11:g.94781859G>A",
              "alleleIds": "*2"
            },
            {
              "chrom": "10",
              "pos": "94842866",
              "ref": "A",
              "alt": "G",
              "gt": "1/1",
              "gs": "0.2669",
              "baf": "1",
              "id": "NC_000010.11:g.94842866A>G",
              "alleleIds": "*2"
            }
          ],
          "candidateSolutions": [
            {
              "rank": 1,
              "genotype": "*1/*2",
              "activityScore": "n/a",
              "phenotypeDatabaseAnnotation": "CYP2C19 Intermediate Metabolizer",
              "qualityScore": 0.9999,
              "rawScore": 0.9958,
              "alleles": [
                {
                  "solutionLong": "Complete: *1",
                  "supportingVariants": [
                    {
                      "chrom": "10",
                      "pos": "94842866",
                      "ref": "A",
                      "alt": "G",
                      "gt": "1/1",
                      "gs": "0.2669",
                      "baf": "1",
                      "id": "NC_000010.11:g.94842866A>G"
                    }
                  ],
                  "missingVariantSites": [],
                  "collapsedAlleles": ""
                },
                {
                  "solutionLong": "Complete: *2",
                  "supportingVariants": [
                    {
                      "chrom": "10",
                      "pos": "94775367",
                      "ref": "A",
                      "alt": "G",
                      "gt": "0/1",
                      "gs": "0.2191",
                      "baf": "0.4690612",
                      "id": "NC_000010.11:g.94775367A>G"
                    },
                    {
                      "chrom": "10",
                      "pos": "94781859",
                      "ref": "G",
                      "alt": "A",
                      "gt": "0/1",
                      "gs": "0.3351",
                      "baf": "0.66212183",
                      "id": "NC_000010.11:g.94781859G>A"
                    },
                    {
                      "chrom": "10",
                      "pos": "94842866",
                      "ref": "A",
                      "alt": "G",
                      "gt": "1/1",
                      "gs": "0.2669",
                      "baf": "1",
                      "id": "NC_000010.11:g.94842866A>G"
                    }
                  ],
                  "missingVariantSites": [],
                  "collapsedAlleles": "*2.001"
                }
              ],
              "copyNumberRegions": "p5,exon.1,intron.1,exon.2,intron.2,exon.3,intron.3,exon.4,intron.4,exon.5,intron.5,exon.6,intron.6,exon.7,intron.7,exon.8,intron.8,exon.9,p3",
              "copyNumberSolution": "2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2"
            }
          ],
          "missingVariantSites": [
            {
              "id": "NC_000010.11:g.94762715T>C",
              "alleleIds": "*34"
            }
          ],
          "allelesTested": "*1,*2,*3,*4,*5,*6,*7,*8,*9,*10,*11,*12,*13,*14,*15,*16,*17,*18,*19,*22,*23,*24,*25,*26,*28,*29,*30,*31,*32,*33,*34,*35,*38,*39"
        }
      ]
    }
    {
      "annotateDb": "CytoAnnotateData_DAv1.2.0.zip",
      "softwareVersion": "dragena 1.3.0 Cyto",
      "referenceGenome": "file://genome.fa",
      "annotationType": "Constitutional",
      "genomeBuild": "hg19",
      "databaseSources": "RefSeq (Version: GCF_000001405.40-RS_2023_10; Release Date: 2023-10-07),Ensembl (Version: 112; Release Date: 2024-05-14)",
      "iscnVersion": "ISCN 2020",
      "sampleId": "208662410005_R01C01",
      "gcCorrect": true,
      "minDelProbes": 10,
      "minDupProbes": 10,
      "minLOHProbes": 500,
      "minDelSize": "20kb",
      "minDupSize": "20kb",
      "minLOHSize": "3000kb",
      "minQual": 20,
      "overallPloidy": 2.012,
      "callRate": 0.9847303032875061,
      "logRDev": 0.19977793097496033,
      "medianLogRDev": 0.1329595363075463,
      "bafDev": {
        "AA": 0.015344353947021572,
        "AB": 0.05078388443393606,
        "BB": 0.026002263878862304
      },
      "numLOHOver1M": 4,
      "numLOHOver8M": 3,
      "totalSizeLOHOver1M": 49319297,
      "copyNumberMedian": 2.0,
      "percentLOH": "1.59%",
      "sexEstimate": "Female",
      "traditionalNomenclature": "dup(2)(q32.3q33.1),dup(2)(q33.1q37.1),dup(2)(q37.1q37.3),del(2)(q37.3q37.3),del(2)(q37.3q37.3),dup(3)(p24.3p24.3),del(13)(q34q34)",
      "microarrayNomenclature": "1p12q21.1(120311442_144549929)x2 hmz,2q32.3q33.1(197045077_201353083)x3,2q33.1q37.1(201356309_234652155)x3,2q37.1q37.3(234653107_238195820)x3,2q37.3(238204076_238283050)x1,2q37.3(238283403_243062047)x1,3p24.3(23235392_23403815)x3,5p12q11.1(44708357_49847659)x2 hmz,11p11.2q12.1(47912150_56507812)x2 hmz,13q34(111358236_111423865)x1,Xp11.22q12(53907828_65253670)x2 hmz",
      "chromosomeAnnotations": [
        {
          "id": "chr1",
          "size": 249250621,
          "percentHet": "10.9587%",
          "hasMosaicism": false,
          "lrrMedian": -0.009397780522704124,
          "lrrDev": 0.16622741086471732,
          "numLOHOver1M": 0,
          "numLOHOver8M": 0,
          "totalSizeLOHOver1M": 0,
          "percentLOH": "0%",
          "copyNumberMedian": 2.0,
          "copyNumberMean": 2.0,
          "minLogRRatio": -2.739042392000556,
          "maxLogRRatio": 1.631125334650278,
          "medianMosaicFraction": ".",
          "numberDel": 0,
          "numberDup": 0,
          "numberLOH": 0,
          "numberMosaic": 0
        },
        ...
      ],
      "locusAnnotations": [
        {
          "id": "AOH:1:120311442:144549929",
          "chrom": "chr1",
          "start": 120311441,
          "end": 144549929,
          "callType": "LOH",
          "mosaicState": false,
          "mosaicFraction": ".",
          "copyNumber": 2,
          "qualityScore": 35.0,
          "size": 24238488,
          "effectiveSize": 151008838,
          "probeCount": 701,
          "percentHet": "1.01%",
          "lrrMedian": 0.05862508801510572,
          "lrrDev": 0.08330211160585141,
          "bafDev": 0.47622189059186604,
          "startCytoBand": "1p12",
          "endCytoBand": "1q21.1",
          "traditionalNomenclature": "N/A",
          "microarrayNomenclature": "1p12q21.1(120311442_144549929)x2 hmz",
          "geneCount": 96,
          "genes": [
            "HMGCS2",
            "REG4",
            "NBPF7P",
            "PFN1P9",
            "NOTCH2P1",
            "ADAM30",
            "RP5-1042I8.7",
            "NOTCH2",
            "RP11-114O18.1",
            ...
          ]
        },
        ...
      ]
    }
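A report with the structure shown above can be summarized by variant class with a short script. The following Python sketch counts `locusAnnotations` entries per `callType`, optionally skipping entries below a quality cutoff. The function name is hypothetical, and the embedded record is trimmed from the example above.

```python
from collections import Counter


def count_by_call_type(report: dict, min_quality: float = 0.0) -> Counter:
    """Count locusAnnotations entries per callType (DEL, DUP, LOH)."""
    return Counter(
        locus["callType"]
        for locus in report.get("locusAnnotations", [])
        if locus.get("qualityScore", 0.0) >= min_quality
    )


# One locus annotation trimmed from the example above.
report = {
    "sampleId": "208662410005_R01C01",
    "locusAnnotations": [
        {
            "id": "AOH:1:120311442:144549929",
            "callType": "LOH",
            "qualityScore": 35.0,
            "size": 24238488,
        }
    ],
}

print(count_by_call_type(report))                    # Counter({'LOH': 1})
print(count_by_call_type(report, min_quality=40.0))  # Counter()
```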

    The default threshold is 5. This threshold can be increased on some scanners.

  • The default thresholds are 1. Do not change the default threshold.

  • Background/Control Intensity.
  • The default threshold is 1. Do not change the default threshold; however, the offset correction can be changed.

    • Green Channel

      • Lowest value of C1 or C2 / Highest value of U1 or U2. The default threshold is 1. This value can be increased for some scanners.

      • Background/(U1, or U2). The default threshold is 1. Do not change the default threshold; however, the offset correction can be changed.

    • Red Channel

      • Lowest value of C3, 4, or 5 / Highest value of U3, 4, or 5. The default threshold is 1. This value can be increased for some scanners.

      • Background /(Highest value of U4, U5, or U6). The default threshold is 1. Do not change the default threshold; however, the offset correction can be changed.

    The following metrics are provided:

    • (Lowest of red C 1, 2, 3, or 4) / (Highest of green C 1, 2, 3, or 4). The default threshold is 1. This value can be increased for some scanners.

    • Background/(Highest C1, C2, C3, or C4 green). The default threshold is 1. Do not change the default threshold; however, the offset correction can be changed.

    The metrics provided are the ratio of the lowest PM/highest MM in each channel.
  • The default threshold is 1. Do not change the default threshold.

  • (Lowest intensity of S1, S2, or S3 red) / (Highest intensity of S1, S2, or S3 green). The default threshold is 1. Do not change the default threshold.

  • Background/(Highest intensity S1, S2, S3, or S4 green). The default threshold is 1. Do not change the default threshold; however, the offset correction can be changed.

  • The default threshold is 5. This value can be increased for some scanners.
