# DRAGEN Array Local Analysis

## DRAGEN Array Local Overview

DRAGEN Array provides accurate, comprehensive, and efficient analysis of Infinium microarray data. The local command-line interface gives power users granular control and flexibility to support large-scale microarray genomic studies.

## Getting Started

DRAGEN Array Local uses a command-line interface, which allows full user control of software functionality and easy automation of tasks. The software is designed for power users and bioinformaticians. If you are new to the command-line interface, review the [Command-line interface Basics](#toc150786129).

### Computing Requirements

Before downloading and installing the software, ensure the following specifications are met for best performance:

| Category         | Recommendation |
| ---------------- | -------------- |
| CPU              | 8 cores |
| Memory           | 16 GB available or more |
| Hard Drive       | 30 GB or more of free disk space |
| Operating System | One of the following: Windows 10 or later (win10-x64); CentOS 7 or later or Ubuntu 20.04 or later (linux-x64) |

### Quota Specifications

The star-allele call command in DRAGEN Array Local requires quota to run. Quota is charged per sample analyzed and can be purchased on the [Illumina Product Page](https://www.illumina.com/products/by-type/informatics-products/dragen-array-secondary-analysis.html). Quota is used for all samples analyzed, including re-analyzed or low-quality samples.

The credential provided in the activation email after purchasing should be passed to the star-allele call command through the `--license-server-url` option. During runtime, the [logs](/dragen-array/dragen-array-v1.2/product-guides/output-files.md#toc150786153) record the remaining quota at the beginning and the end of the analysis.

An internet connection is required to perform the software license check and ensure paid quota is available for all samples in the analysis batch. For the software license check, the following endpoints are used:

* In v1.0 and v1.1: `license.edicogenome.com`
* In v1.2+: `license.dragen.illumina.com`

**NOTE:** Do not use `license.dragen.illumina.com` license server URLs when running DRAGEN Array v1.0 and v1.1, as that domain only works with v1.2+ versions.\
This is described in the [1.0.0](/dragen-array/dragen-array-v1.2/reference/release-notes/dragen-array-v1.1.0-release-notes.md#known-issues) and [1.1.0](/dragen-array/dragen-array-v1.2/reference/release-notes/dragen-array-v1.1.0-release-notes.md#known-issues) known issues.

## Installation

Please follow the steps below to install the software on your compute infrastructure:

1. Click on the latest DRAGEN Array version installation package for the platform of your choice. Installers for Windows and Linux are available on the [Illumina Support Site](https://support.illumina.com/array/array_software/dragen-array-secondary-analysis/downloads.html).

   Once download is completed, move the DRAGEN Array installation package to the desired folder.
   Administrative permissions may be required for system folders, for example `/usr/local/bin` for Linux and `C:\Program Files` for Windows.

   **Note**: Throughout the remainder of the document, Linux will be assumed in the examples.
2. Unzip and extract the package. The executable can be found in the dragena subfolder of the software download after extraction.
3. To check that the DRAGEN Array installation was successful, follow these steps:
   * Open a command prompt (Windows) or terminal (Linux).
   * \[Optional] Add `/path/to/dragena/`, e.g. `/usr/local/bin/dragena-linux-x64-DAv1.1.0/dragena/`, to your PATH so the executable can be accessed from anywhere in the folder structure.
   * Execute the following command: `/path/to/dragena/dragena version`, or, if the PATH environment variable is set, `dragena version`.

   If the installation was successful, the software version is displayed in the terminal window.

## Run DRAGEN Array Local

For genotyping or cytogenetic analysis, there is no minimum number of samples required to run analysis. For CNV PGx analysis, a minimum of 24 samples is required. For a successful analysis, 22 samples must pass QC, defined as having log R dev < 0.2.

With the standard hardware specification in section [Computing Requirements](#computing_requirements), up to 500 GDA-ePGx samples can be processed per analysis batch.

To optimize performance of the targeted PGx CNV caller and minimize batch effect, it is recommended to:

* Group samples in the same assay batch (e.g. whole genome amplification and targeted gene application assay batch) into the same analysis batch.
* Avoid combining sample batches processed on different reagent lots.
* Analyze batches of 96 samples or more.
* Samples processed in a two-week period from multiple library preparation batches can be grouped together to meet the size requirement of an analysis batch. In such cases, it is recommended to use the same lot of reagents and instruments used in the workflow.
* Use the CN Model and PGx Database File provided as part of the standard product files.

## Quick Start

Review section [DRAGEN Array Applications](/dragen-array/dragen-array-v1.2/overview/our-features.md) for information on input files to use, sample minimums per analysis type, and other best practices.

Command examples show analysis on a Linux system using folders instead of sample sheets. Windows users should adapt the file paths in the commands to Windows conventions, e.g., using a backslash (\\) instead of a forward slash (/). A sample sheet can be used to select specific samples out of a folder.

**Note**: DRAGEN Array will overwrite older files if the same `--output-folder` from a previous analysis is used. If this is not desired, use a different `--output-folder` for re-analyses.

### PGx

Use the following instructions to start the full PGx analysis, covering genotyping, PGx CNV calling, and PGx star allele calling. Refer to [Command Index](#command_index_1) for parameters for all commands.

1. Open a command prompt (Windows) or terminal window (Linux) and navigate to the directory where the software was installed, or to a different directory if the executable was added to the PATH environment variable.
2. Use the genotype call command to call genotypes and generate GTC files using IDAT files as input.\
   `dragena genotype call --bpm-manifest /user/productfiles/manifest.bpm --cluster-file /user/productfiles/clusterfile.egt --idat-folder /user/IDATs --output-folder /user/gtc`
3. Use the genotype gtc-to-vcf command to create SNV VCF files from the GTC files generated by the genotype call command.\
   `dragena genotype gtc-to-vcf --bpm-manifest /user/productfiles/manifest.bpm --csv-manifest /user/productfiles/manifest.csv --genome-fasta-file /user/productfiles/genome.fa --gtc-folder /user/gtc --output-folder /user/vcf`
4. Use the pgx copy-number call command to call PGx CNVs from the GTC files and produce CNV VCF files.
   It is recommended to use the same output folder used for the SNV VCFs, since the star-allele call command accepts one VCF folder containing both SNV and PGx CNV VCFs.\
   `dragena pgx copy-number call --cn-model /user/productfiles/cnv_model.dat --gtc-folder /user/gtc --output-folder /user/vcf`

   **Note**: For PGx CNV calling, it is recommended that 96 or more samples passing LogRDev <= 0.2 are included in the analysis.
5. Use the pgx star-allele call command to generate star allele calls using the CNV and SNV VCF files generated by the gtc-to-vcf and copy-number call commands.\
   `dragena pgx star-allele call --vcf-folder /user/vcf --database /user/productfiles/GDA_ePGx_E2_DAv1.0.0.zip --output-folder /user/star-alleles --license-server-url https://username:password@license.dragen.illumina.com`

   **Note**: For PGx star allele calling, it is recommended to QC the samples and review those that have Log R Dev > 0.2, call rate < 0.99, or TGA Control probe < 1.0 to assess the reliability of the analysis. These metrics are provided in the genotyping sample summary file (gt\_sample\_summary.csv).
6. Use the pgx star-allele annotate command to summarize the star alleles and add metabolizer statuses to the star alleles generated by the star-allele call command. Guidelines (CPIC or DPWG) can be specified.\
   `dragena pgx star-allele annotate --star-alleles star_alleles.csv --guidelines CPIC --output-folder /user/metabolizer-statuses`
7. \[Optional] Use the pgx copy-number train command to retrain the copy number model.\
   `dragena pgx copy-number train --bpm-manifest /user/productfiles/manifest.bpm --genome-fasta-file /user/productfiles/genome.fa --gtc-folder /user/gtc --platform LCG --output-folder /user/productfiles/cnmodelnew`

### Cytogenetics

Use the following instructions to start the full cytogenetics analysis, covering genotyping, CNV and LOH calling, and annotation. Refer to [Command Index](#command_index_1) for parameters for all commands.

1. Open a command prompt (Windows) or terminal window (Linux) and navigate to the directory where the software was installed, or to a different directory if the executable was added to the PATH environment variable.
2. Use the genotype call command to call genotypes and generate GTC files using IDAT files as input.\
   `dragena genotype call --bpm-manifest /user/productfiles/manifest.bpm --cluster-file /user/productfiles/clusterfile.egt --idat-folder /user/IDATs --output-folder /user/gtc`
3. Use the cyto call command to determine copy number variants and loss of heterozygosity given genotypes.\
   `dragena cyto call --cn-model /user/productfiles/cyto_model.dat --gtc-folder /user/gtc --output-folder /user/vcf`
4. Use the cyto annotate command to generate JSON annotation files with gene annotations, cytogenetic bands, various QC fields, and the variant information from the VCFs.\
   `dragena cyto annotate --annotation-db /user/productfiles/CytoAnnotateData_DAv1.2.0.zip --vcf-folder /user/vcf --output-folder /user/cyto-annotations`

## Command Index

Use the following syntax when using the command-line interface:

`dragena [module] [sub-module (not needed for cyto)] [command] [required parameters] [optional parameters]`

### **pgx**

The root command for the pgx module.

| Command     | Description                                    |
| ----------- | ---------------------------------------------- |
| copy-number | Call and train copy number variants.           |
| star-allele | Star Allele Caller for Illumina Microarrays    |
| help        | Display more information on a specific command |
| version     | Display version information.                   |

### **pgx copy-number**

The root command for actions that act on pgx copy number variants.

| Command                 | Description                                                           |
| ----------------------- | --------------------------------------------------------------------- |
| pgx copy-number call    | Determines copy number variants given genotypes (GTC to CNV VCF).     |
| pgx copy-number help    | Displays help information for a copy-number command.                  |
| pgx copy-number train   | Trains copy number model for a set of samples (GTC to CN Model File). |
| pgx copy-number version | Displays version information for copy-number.                         |

### **pgx copy-number call**

The command used to call copy number variants. A batch of 24 samples or more is required for analysis. For a successful analysis, 22 samples must pass QC, defined as having log R dev < 0.2.

| Option | Description |
| ------------------ | ------------------------------------------------------------------------------- |
| --cn-model | \[Required] Specifies the path to the copy number model parameters file (.dat). |
| --gtc-folder |

\[Required] Specifies the path to the directory where all genotype files (.gtc) are located. The command cannot be used with --gtc-sample-sheet.