# Quick Start Guide

### Sequencing

**Illumina TruPath Genome Software Solution** leverages on-flow-cell library prep and advanced DRAGEN informatics algorithms to enhance human whole genome sequencing with long-range insights. Sequencing with the Illumina TruPath Genome prep is available on the NovaSeq X Sequencing Systems, and analysis can be deployed in the cloud or on-prem:

* The NovaSeq X instrument must be upgraded to the [NovaSeq X v1.4 Digital Package Software](https://support.illumina.com/content/dam/illumina-support/documents/downloads/software/novaseq-x-series/200079828_00_NovaSeq_X_Series_1_4_0_Release_Notes.pdf). See the [Prerequisites section](https://help.connected.illumina.com/illumina-trupath-genome/sequencing-run-setup/instrument-run-setup) for additional information.
* DRAGEN Germline v4.5 in Proximity Mode is intended for use only with the Illumina TruPath Genome kit and flow cell. It is available on:
  * Standalone DRAGEN server
  * Illumina Connected Analytics (ICA)
  * Autolaunch through BaseSpace Sequence Hub (BSSH) Run Planning on NovaSeq X Sequencing Systems

### Analysis

**DRAGEN Germline v4.5 with Proximity Mode** enabled supports data analysis for the Illumina TruPath Genome Kit and flow cell. The software provides on-prem and cloud analysis for DNA samples from blood, saliva, or buccal extraction methods. See the [analysis overview](https://help.connected.illumina.com/illumina-trupath-genome/run-analysis-setup/cloud-analysis-settings) and the [DRAGEN User Guide](https://help.dragen.illumina.com/dragen-v4.5-trupath) for additional information. DRAGEN is optimized to leverage proximity mapped reads from on-flow-cell library prep to:

* Provide phased small variants and ultra-long phasing blocks
* Resolve mapping ambiguities in challenging regions of the genome, including regions of high homology and segmental duplications
* Provide enhanced structural variant insights using colocation filtering

**Proximity Mapped Reads Features:**

* Reads are linked by physical proximity, enabling long-range information without long-read sequencing
* Mapping is informed by sets of reads derived from the same input DNA molecule, reducing false positives and false negatives
* Output Files: `Proximity_Model_Metrics.CSV`, `Mapping_Metrics.CSV`
* Key Metrics: `Proximity Rate`, `Proximity Coverage`, `Template Size`

**Phasing Features:**

* Fully phased genes with ultra-long phase blocks help determine if variants are in cis or trans
* Haplotypes maintained across complex loci
* Output Files: `Phasing_Summary_Stats.CSV`, `Phase Blocks GTF file`, `Hard-Filtered VCF`, `Haplotagged BAM`
* Key Metrics: `Phasing NG50`, `% Fully Phased Genes`
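
As a rough illustration of what an NG50-style phasing metric measures, the sketch below computes NG50 from a list of phase block lengths using the common definition: the largest block length such that blocks of at least that length together cover half of the genome. This is a minimal sketch under that assumption; the exact definition DRAGEN uses for `Phasing NG50` may differ, and the block lengths and genome length shown are hypothetical.

```python
def phasing_ng50(block_lengths, genome_length):
    """Return the NG50 of a set of phase blocks.

    NG50 here is the length of the block at which the running total of
    block lengths, taken from longest to shortest, first reaches half of
    genome_length. Returns 0 if the blocks cover less than half the genome.
    """
    half = genome_length / 2
    covered = 0
    for length in sorted(block_lengths, reverse=True):
        covered += length
        if covered >= half:
            return length
    return 0


# Hypothetical phase block lengths (bp) and genome length, for illustration only.
blocks = [500, 300, 200, 100, 50]
print(phasing_ng50(blocks, genome_length=1500))  # 500 + 300 >= 750, so prints 300
```

In practice the block lengths would come from the phase blocks GTF file (end minus start for each block), and the genome length from the reference in use.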

**Multi-Region Joint Detection (MRJD) Algorithm Regions:**

* Provide copy-number aware, haplotype-resolved small variant calls in copies of paralogous genes or gene / pseudogene pairs
* Reads phased and assigned to paralogous gene copies or gene / pseudogene pairs
* Genes covered include, but are not limited to: `SMN1/2`, `PMS2`, `RCCX (CYP21A2)`, `STRC`, `CYP2D6`, `OTOA`
* Relevant Output Files: Gene Specific `MRJD VCF`, `MRJD Json`, `MRJD BAM`

**Structural Variant Calling Improvements via Colocation Filtering:**

* Proximity-mapped reads improve detection of clinically relevant structural variants that are often missed by standard short-read sequencing, particularly in repeat-rich and medically relevant loci.
* Unlike traditional short-read approaches that infer structure from isolated fragments, proximity-mapped reads detect structural variants by leveraging continuity across entire DNA molecules.
* Relevant Output Files: `Colocation Cooler`, `Colocation HiC`, `SV VCF`

#### Local and Cloud Deployments

TruPath secondary analysis can run on a standalone (on-prem) DRAGEN server. Note that a **phase 4 server is required** for TruPath on-prem secondary analysis.

Cloud analysis is available on Illumina Connected Analytics with auto-launch or manual launch. Both launch methods can start from BCL or FASTQ files.

### Visualization

The **DRAGEN Germline** pipeline with Proximity Mode enabled produces a comprehensive set of output files that can be visualized in three complementary tools: **IGV (Integrative Genomics Viewer)**, **DRAGEN Reports**, and **HiGlass**. Each serves a distinct purpose in downstream analysis and variant validation. For more information on these tools, see the child pages of the [Data Visualization and Analysis](https://help.connected.illumina.com/illumina-trupath-genome/get-started/broken-reference) section.

**IGV (Integrative Genomics Viewer)** enables interactive inspection of phased BAMs, variant VCFs (small variants, CNVs, SVs), phase block GTFs, and MRJD paralog files.

* Data can be downloaded from ICA and loaded locally, or streamed directly via signed URLs (`icav2 projectdata downloadurl`)
* Recommended settings: group by haplotype (HP), color by phase set (PS), squished display, show soft clips, downsampling disabled
* These settings visually separate reads by haplotype and distinguish phased from unphased alignments
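
To make the haplotype grouping concrete, the sketch below does in plain Python what the IGV "group by HP" setting does visually: it reads the `HP` (haplotype) and `PS` (phase set) optional tags from SAM-format text and buckets reads by haplotype. It assumes the haplotagged BAM carries the standard `HP` and `PS` tags and has been converted to SAM text (for example with `samtools view`). The records shown are made up for illustration.

```python
from collections import defaultdict


def read_tags(sam_line):
    """Return the optional tags of one SAM record as a {tag: value} dict."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[11:]:  # optional fields start at column 12
        tag, value_type, value = field.split(":", 2)
        tags[tag] = int(value) if value_type == "i" else value
    return tags


def group_by_haplotype(sam_lines):
    """Group (read name, phase set) pairs by HP tag; untagged reads go to 'unphased'."""
    groups = defaultdict(list)
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        tags = read_tags(line)
        read_name = line.split("\t", 1)[0]
        groups[tags.get("HP", "unphased")].append((read_name, tags.get("PS")))
    return dict(groups)


# Hypothetical SAM records: two phased reads in phase set 100, one unphased read.
records = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tFFFF\tHP:i:1\tPS:i:100",
    "r2\t0\tchr1\t120\t60\t4M\t*\t0\t0\tACGT\tFFFF\tHP:i:2\tPS:i:100",
    "r3\t0\tchr1\t140\t60\t4M\t*\t0\t0\tACGT\tFFFF",
]
print(group_by_haplotype(records))
```

Reads without an `HP` tag fall into the `unphased` group, which corresponds to the unphased alignments that IGV displays separately under these settings.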

**DRAGEN Reports** integrate Multi-Region Joint Detection (MRJD) results directly into sample-level reports under the "Paralogs" tab.

* "Paralog Sets" table summarizes each paralogous region with estimated copy numbers
* "Paralogous regions" view displays haplotype-resolved variant calls with color coding: dark orange (alt alleles at reference difference sites), light orange (ref alleles), grey (non-reference difference site variants)

**HiGlass** is the recommended tool for exploring genome-wide colocation contact maps produced by TruPath in `.cooler` format.

* Workflow: install HiGlass via Docker, convert `.cooler` to multi-resolution `.mcooler` using `cooler zoomify`, then ingest into HiGlass
* Supports interactive visualization with smooth zooming, panning, and annotation tracks (chromosomes, genes, VCF, BED, BEDPE)
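
The conversion step in the workflow above can be scripted. The following is a minimal sketch that builds and runs a `cooler zoomify` command from Python; the file names are hypothetical, the `--nproc` and `--out` options should be confirmed against `cooler zoomify --help` for the installed version, and the HiGlass installation and ingest steps are not shown.

```python
import subprocess


def zoomify_command(cooler_path, out_path, nproc=1):
    """Build the argument list for converting a single-resolution cooler
    file into a multi-resolution file with `cooler zoomify`."""
    return [
        "cooler", "zoomify",
        "--nproc", str(nproc),  # number of worker processes
        "--out", out_path,      # multi-resolution output file
        cooler_path,            # single-resolution input file
    ]


def zoomify(cooler_path, out_path, nproc=1):
    """Run the conversion; raises CalledProcessError if cooler exits non-zero."""
    subprocess.run(zoomify_command(cooler_path, out_path, nproc), check=True)


# Hypothetical file names; inspect the command before running it.
print(" ".join(zoomify_command("sample.cooler", "sample.mcooler", nproc=4)))
```

Building the argument list separately from running it makes the command easy to log or review before any large contact map is processed.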

Together, these three tools provide a complete visualization framework -- from read-level alignment inspection and paralog-resolved variant review to genome-wide proximity analysis -- enabling thorough validation and interpretation of TruPath Constellation pipeline results.

### Emedgene

[**Emedgene**](https://help.emg.illumina.com) enables **end-to-end interpretation** for the Illumina TruPath Genome by tightly integrating advanced DRAGEN TruPath outputs.

* **Phasing-aware interpretation:**
  * Compound heterozygous filtering, visualization of haplotagged BAMs, phased SNPs, and phase blocks
* **MRJD and targeted phasing visualization:**
  * Supports paralogous and difficult-to-map regions
* **Enhanced structural variant support:**
  * Visualization and evidence for complex SVs (BNDs, inversions, translocations) with improved IGV-based evidence review
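
Compound heterozygous filtering depends on knowing whether two heterozygous variants sit on the same haplotype (cis) or on opposite haplotypes (trans). The sketch below shows the underlying logic using the VCF conventions for phased genotypes (`|` separator in `GT`) and phase sets (`PS`). It is an illustration only, not Emedgene's implementation; it handles only biallelic heterozygous calls, and the genotype values are hypothetical.

```python
def alt_haplotype(gt, phase_set):
    """Return (phase_set, index of the haplotype carrying the alt allele)
    for a phased biallelic heterozygous genotype, else None."""
    if "|" not in gt:  # '/' means the genotype is unphased
        return None
    first, second = gt.split("|")
    if {first, second} != {"0", "1"}:  # only 0|1 and 1|0 are handled here
        return None
    return phase_set, (0 if first == "1" else 1)


def cis_or_trans(variant_a, variant_b):
    """Classify two (GT, PS) pairs as 'cis', 'trans', or 'unknown'.

    Phase is only comparable when both variants are phased and share a
    phase set; otherwise the relationship cannot be determined.
    """
    a = alt_haplotype(*variant_a)
    b = alt_haplotype(*variant_b)
    if a is None or b is None or a[0] != b[0]:
        return "unknown"
    return "cis" if a[1] == b[1] else "trans"


# Hypothetical (GT, PS) values for pairs of variants in the same gene.
print(cis_or_trans(("0|1", 1000), ("1|0", 1000)))  # trans
print(cis_or_trans(("0|1", 1000), ("0|1", 1000)))  # cis
print(cis_or_trans(("0|1", 1000), ("0|1", 2000)))  # unknown
```

Longer phase blocks reduce the number of variant pairs that fall into the `unknown` case, because more pairs within a gene share a phase set.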

For version history and release notes for Emedgene v100.40, see the [Emedgene release notes](https://help.connected.illumina.com/emedgene/release-notes/workbench-and-pipeline-updates).


