# Quick Start Guide

### Sequencing

**Illumina TruPath Genome Software Solution** leverages on-flow cell library prep and advanced DRAGEN informatics algorithms to enhance human whole genome sequencing with long-range insights. Sequencing with the Illumina TruPath Genome prep is available on the NovaSeq X Sequencing Systems and analysis may be deployed on cloud or on-prem:

* The NovaSeq X instrument must be upgraded to the [NovaSeq X v1.4 Digital Package Software](https://support.illumina.com/content/dam/illumina-support/documents/downloads/software/novaseq-x-series/200079828_00_NovaSeq_X_Series_1_4_0_Release_Notes.pdf). See the [Prerequisites section](https://help.connected.illumina.com/illumina-trupath-genome/sequencing-run-setup/instrument-run-setup) for additional information.
* DRAGEN Germline v4.5 in Proximity Mode is intended for use only with the Illumina TruPath Genome kit and flow cell. It is available on:
  * Standalone DRAGEN server
  * Illumina Connected Analytics (ICA)
  * Autolaunch via BaseSpace Sequence Hub (BSSH) Run Planning for NovaSeq X Sequencing Systems

### Analysis

**DRAGEN Germline v4.5 with Proximity Mode** enabled supports data analysis for the Illumina TruPath Genome Kit and flow cell. The software provides on-prem and cloud analysis for DNA extracted from blood, saliva, or buccal samples. See the [analysis overview](https://help.connected.illumina.com/illumina-trupath-genome/run-analysis-setup/cloud-analysis-settings) and the [DRAGEN User Guide](https://help.dragen.illumina.com/dragen-v4.5-trupath) for additional information. DRAGEN is optimized to leverage proximity-mapped reads from on-flow-cell library prep to:

* Provide phased small variants and ultra-long phasing blocks
* Resolve mapping ambiguities in challenging regions of the genome, including regions of high homology and segmental duplications
* Provide enhanced structural variant insights using colocation filtering

**Proximity Mapped Reads Features:**

* Reads are linked by physical proximity, enabling long-range information without long read sequencing
* Mapping is informed by sets of reads derived from the same input DNA molecule, reducing false positives and false negatives.
* Output Files: `Proximity_Model_Metrics.CSV`, `Mapping_Metrics.CSV`
* Key Metrics: `Proximity Rate`, `Proximity Coverage`, `Template Size`
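
The key metrics above are reported in CSV files. The following is a minimal sketch of pulling named metrics out of such a file. It assumes a row layout of section, sample or read group, metric name, value, and an optional percentage. That layout, the `read_metrics` helper, and the example rows are illustrative assumptions, not confirmed details of `Proximity_Model_Metrics.CSV`; check a real output file before relying on it.

```python
import csv
import io


def read_metrics(csv_text, wanted):
    """Return {metric name: value string} for the requested metric names."""
    found = {}
    for row in csv.reader(io.StringIO(csv_text)):
        # Assumed layout: section, sample/read group, metric name, value[, percent]
        if len(row) < 4:
            continue
        name, value = row[2].strip(), row[3].strip()
        if name in wanted:
            found[name] = value
    return found


# Placeholder rows for illustration only; these are not real DRAGEN output.
example = (
    "PROXIMITY SUMMARY,,Proximity Rate,0.5\n"
    "PROXIMITY SUMMARY,,Template Size,1000\n"
    "PROXIMITY SUMMARY,,Other Metric,7\n"
)
metrics = read_metrics(
    example, {"Proximity Rate", "Proximity Coverage", "Template Size"}
)
print(metrics)
```

Metrics that are absent from the file are simply missing from the returned dictionary, so a caller can detect them with an ordinary membership check.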

**Phasing Features:**

* Fully phased genes with ultra-long phase blocks help determine if variants are in cis or trans
* Haplotypes maintained across complex loci
* Output Files: `Phasing_Summary_Stats.CSV`, `Phase Blocks GTF file`, `Hard-Filtered VCF`, `Haplotagged BAM`
* Key Metrics: `Phasing NG50`, `% Fully Phased Genes`
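
To make the `Phasing NG50` metric concrete, here is a minimal sketch using the usual NG50 definition: the largest block length L such that phase blocks of length L or longer together cover at least half of a chosen genome size. How DRAGEN computes its reported value, and which genome size it uses, is not stated here, so treat this as an illustration only. The `gtf_block_lengths` helper assumes every non-comment GTF line describes one phase block, which is also an assumption.

```python
def gtf_block_lengths(gtf_lines):
    """Return block lengths from GTF lines (columns 4 and 5 are 1-based, inclusive)."""
    lengths = []
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        start, end = int(fields[3]), int(fields[4])
        lengths.append(end - start + 1)
    return lengths


def phasing_ng50(block_lengths, genome_size):
    """Return the NG50 of the blocks, or 0 if they cover less than half the genome."""
    half = genome_size / 2
    covered = 0
    for length in sorted(block_lengths, reverse=True):
        covered += length
        if covered >= half:
            return length
    return 0


# Toy numbers: 50 + 40 = 90 is below 100, adding 30 reaches 120, so NG50 is 30.
print(phasing_ng50([50, 40, 30, 20, 10], genome_size=200))
```

Because NG50 is measured against a fixed genome size rather than the total phased length, a run with a few very long blocks and large unphased gaps can still score lower than a run with uniformly long blocks.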

**Multi-Region Joint Detection (MRJD) Algorithm Regions:**

* Provide copy-number aware, haplotype-resolved small variant calls in copies of paralogous genes or gene / pseudogene pairs
* Reads phased and assigned to paralogous gene copies or gene / pseudogene pairs
* Genes covered include, but are not limited to: `SMN1/2`, `PMS2`, `RCCX (CYP21A2)`, `STRC`, `CYP2D6`, `OTOA`
* Relevant Output Files: gene-specific `MRJD VCF`, `MRJD Json`, `MRJD BAM`

**Structural Variant Calling Improvements via Colocation Filtering:**

* Proximity-mapped reads improve detection of clinically relevant structural variants that are often missed by standard short-read sequencing, particularly in repeat-rich and medically relevant loci.
* Unlike traditional short-read approaches that infer structure from isolated fragments, proximity-mapped reads detect structural variants by leveraging continuity across entire DNA molecules.
* Relevant Output Files: `Colocation Cooler`, `Colocation HiC`, `SV VCF`

#### Local and Cloud Deployments

TruPath secondary analysis runs on a standalone (on-prem) DRAGEN server. A **phase 4 server is required** for on-prem analysis.

Cloud analysis is available on Illumina Connected Analytics with auto-launch or manual launch. Both launch methods accept BCL or FASTQ input.

### Visualization

**The DRAGEN Germline** pipeline with Proximity Mode enabled produces a comprehensive set of output files that can be visualized in three complementary tools: **IGV (Integrative Genomics Viewer)**, **DRAGEN Reports**, and **HiGlass**. Each serves a distinct purpose in downstream analysis and variant validation. For more information on these tools, see the child pages of the [Data Visualization and Analysis](https://help.connected.illumina.com/illumina-trupath-genome/get-started/broken-reference) section.

**IGV (Integrative Genomics Viewer)** enables interactive inspection of phased BAMs, variant VCFs (small variants, CNVs, SVs), phase block GTFs, and MRJD paralog files.

* Data can be downloaded from ICA and loaded locally, or streamed directly via signed URLs (`icav2 projectdata downloadurl`)
* Recommended settings: group by haplotype (HP), color by phase set (PS), squished display, show soft clips, downsampling disabled
* These settings visually separate reads by haplotype and distinguish phased from unphased alignments
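
Grouping by haplotype relies on the `HP` tag carried by each read in the haplotagged BAM. As a rough illustration of what IGV separates, the sketch below counts reads per `HP` value from SAM-format text lines (for example, text produced by converting the BAM to SAM). The helper names and the example records are made up for this illustration, and reads without an `HP` tag are counted under `None`.

```python
from collections import Counter


def haplotype_of(sam_line):
    """Return the HP tag value of one SAM record, or None if the read is untagged."""
    fields = sam_line.rstrip("\n").split("\t")
    for tag in fields[11:]:  # optional TAG:TYPE:VALUE fields start at column 12
        if tag.startswith("HP:i:"):
            return int(tag[5:])
    return None


def count_by_haplotype(sam_lines):
    """Count alignment records per haplotype, skipping '@' header lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        counts[haplotype_of(line)] += 1
    return counts


# Made-up records for illustration only.
core = "\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII"
records = [
    "@HD\tVN:1.6",
    "r1" + core + "\tHP:i:1\tPS:i:100",
    "r2" + core + "\tHP:i:2\tPS:i:100",
    "r3" + core + "\tPS:i:100\tHP:i:1",
    "r4" + core,
]
counts = count_by_haplotype(records)
print(dict(counts))
```

A large share of reads under `None` in a region is a quick hint that the region is unphased, which is the same signal the IGV grouping shows visually.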

**DRAGEN Reports** integrate Multi-Region Joint Detection (MRJD) results directly into sample-level reports under the "Paralogs" tab.

* "Paralog Sets" table summarizes each paralogous region with estimated copy numbers
* "Paralogous regions" view displays haplotype-resolved variant calls with color coding: dark orange (alt alleles at reference difference sites), light orange (ref alleles), grey (non-reference difference site variants)

**HiGlass** is the recommended tool for exploring genome-wide colocation contact maps produced by TruPath in `.cooler` format.

* Workflow: install HiGlass via Docker, convert `.cooler` to multi-resolution `.mcooler` using `cooler zoomify`, then ingest into HiGlass
* Supports interactive visualization with smooth zooming, panning, and annotation tracks (chromosomes, genes, VCF, BED, BEDPE)
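
A multi-resolution file stores the same contact map at progressively coarser bin sizes, which is what lets a viewer zoom smoothly. The sketch below illustrates that idea on a small in-memory matrix by summing 2 x 2 blocks of bins at each level. It is a conceptual illustration only, not the implementation used by `cooler zoomify`, and the function names are hypothetical.

```python
def coarsen(matrix, factor=2):
    """Sum factor x factor blocks of a square contact matrix into single bins."""
    if factor < 2:
        raise ValueError("factor must be at least 2")
    n = len(matrix)
    m = (n + factor - 1) // factor  # ceiling division keeps a partial last bin
    out = [[0] * m for _ in range(m)]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            out[i // factor][j // factor] += value
    return out


def zoom_levels(matrix, factor=2):
    """Return the list of matrices from full resolution down to a single bin."""
    levels = [matrix]
    while len(levels[-1]) > 1:
        levels.append(coarsen(levels[-1], factor))
    return levels


# Toy 4 x 4 contact matrix with most signal near the diagonal.
contacts = [
    [4, 1, 0, 0],
    [1, 4, 1, 0],
    [0, 1, 4, 1],
    [0, 0, 1, 4],
]
for level in zoom_levels(contacts):
    print(level)
```

Each level preserves the total number of contacts, so coarse views show the same data at lower resolution rather than a subsample of it.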

Together, these three tools provide a complete visualization framework -- from read-level alignment inspection and paralog-resolved variant review to genome-wide proximity analysis -- enabling thorough validation and interpretation of TruPath Constellation pipeline results.

### Emedgene

[**Emedgene**](https://help.emg.illumina.com) enables **end-to-end interpretation** for the Illumina TruPath Genome by tightly integrating advanced DRAGEN TruPath outputs.

* **Phasing-aware interpretation:**
  * Compound heterozygous filtering, visualization of haplotagged BAMs, phased SNPs, and phase blocks
* **MRJD and targeted phasing visualization:**
  * Supports paralogous and difficult-to-map regions
* **Enhanced structural variant support:**
  * Visualization and evidence for complex SVs (BNDs, inversions, translocations) with improved IGV-based evidence review
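
Compound heterozygous filtering depends on knowing whether two heterozygous variants in a gene sit on the same haplotype (cis) or on opposite haplotypes (trans). The sketch below shows the general idea using the standard VCF `GT` and `PS` fields: genotypes are only comparable when both are phased (`|` separator) and share a phase set. This is not a description of Emedgene's filter logic, and `phase_relation` is a hypothetical helper.

```python
def phase_relation(gt_a, ps_a, gt_b, ps_b):
    """Classify two heterozygous calls as 'cis', 'trans', or 'unknown'."""
    # Phase is only comparable for phased genotypes within the same phase set.
    if "|" not in gt_a or "|" not in gt_b:
        return "unknown"
    if ps_a is None or ps_a != ps_b:
        return "unknown"
    a = gt_a.split("|")
    b = gt_b.split("|")
    # Keep the sketch simple: diploid calls with exactly one reference allele.
    if len(a) != 2 or len(b) != 2 or a.count("0") != 1 or b.count("0") != 1:
        return "unknown"
    alt_hap_a = 0 if a[0] != "0" else 1  # haplotype index carrying the alt allele
    alt_hap_b = 0 if b[0] != "0" else 1
    return "cis" if alt_hap_a == alt_hap_b else "trans"


print(phase_relation("0|1", 100, "1|0", 100))  # alt alleles on opposite haplotypes
print(phase_relation("0|1", 100, "0|1", 100))  # alt alleles on the same haplotype
print(phase_relation("0/1", 100, "1|0", 100))  # first call is unphased
```

Two damaging variants in trans are the compound heterozygous candidates; longer phase blocks increase the fraction of variant pairs that fall in the same phase set and can therefore be classified at all.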

For version history and release notes for Emedgene v100.40, see the [Emedgene release notes](https://help.connected.illumina.com/emedgene/release-notes/workbench-and-pipeline-updates).
