> For the complete documentation index, see [llms.txt](https://help.connected.illumina.com/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://help.connected.illumina.com/dragen/dragen-v4.4/product-guide/dragen-v4.4/dragen-dna-pipeline/targeted-caller/hba-calling.md).

# HBA Caller

The HBA Caller is capable of genotyping the *HBA1* and *HBA2* genes from whole-genome sequencing (WGS) and whole-exome sequencing (WES) data. Due to high sequence similarity between the genes, a specialized caller is necessary to resolve the possible genotypes of the pair of genes. We consider regions surrounding the *HBA1* and *HBA2* sites to resolve the possible *HBA1* and *HBA2* genotypes.

The HBA Caller performs the following steps:

1. Determines total copy number from read depth of the regions surrounding the *HBA1* and *HBA2* sites.
2. Determines HBA genotypes based on the copy number of the regions surrounding the *HBA1* and *HBA2* sites.
3. Calls small variants in the *HBA1* and *HBA2* regions based on the region copy number derived from the genotype along with allele counts from read information.

For a comprehensive evaluation of the HBA caller, see [HBA targeted caller blog post](https://www.illumina.com/science/genomics-research/articles/HBA-targeted-caller.html).

For information about enabling the HBA caller see [Targeted Caller](broken://pages/9jH2RKZDkZGKu3Mhz1zP#targeted-caller).

## Total Copy Number of the regions surrounding the *HBA1* and *HBA2* sites

The first step of HBA calling is to determine the copy number of the regions surrounding the *HBA1* and *HBA2* sites. Reads aligned to the regions are counted. The counts in each region are corrected for GC-bias, and then normalized to a diploid baseline. The GC-bias correction and normalization factors are determined from read counts in preselected regions across the genome. Finally, a Gaussian Mixture Model (GMM) is used to obtain the integer region copy number from the region normalized counts.

## Genotyping

The genotyping step attempts to identify the two likely haplotypes described in the following table, where "a" stands for a functional copy of either *HBA1* or *HBA2*, "-" stands for a nonfunctional/missing copy of either *HBA1* or *HBA2*, while "3.7", "4.2", and "20.5" describe the recombinant event that likely caused the deletion/duplication of the functional HBA copy.

| Genotype        |
| --------------- |
| aaa3.7/aa       |
| aaa4.2/aa       |
| aaa20.5/aa      |
| aaa20.5/aaa3.7  |
| aaa20.5/aaa4.2  |
| aaa20.5/aaa20.5 |
| aa/aa           |
| -a3.7/aa        |
| -a4.2/aa        |
| -a20.5/aa       |
| --/aaa3.7       |
| --/aaa4.2       |
| --/aaa20.5      |
| -a3.7/aaa20.5   |
| -a4.2/aaa20.5   |
| -a20.5/aaa3.7   |
| -a20.5/aaa4.2   |
| -a3.7/-a3.7     |
| -a4.2/-a4.2     |
| -a20.5/-a20.5   |
| -a3.7/-a4.2     |
| -a20.5/-a3.7    |
| -a20.5/-a4.2    |
| --/aa           |
| --/-a3.7        |
| --/-a4.2        |
| --/-a20.5       |
| --/--           |

The quality of fitting the region copy number calls to one of these genotypes is reported in the `FORMAT/TargetedSVModelQual` field for each SV VCF record and the `FILTER/TargetedSVModelQual` is applied to each record when the phred-scaled quality is below the threshold specified in the VCF header.

## Small Variant Calling

18 small variants are detected from the read alignments. These variants occur in homologous regions of *HBA1* and *HBA2* where reads mapping to either *HBA1* or *HBA2* are used for variant calling.

For each variant, reads containing either the variant allele or the nonvariant allele are counted and a binomial model is used to determine the likelihood for each possible variant allele copy number up to the maximum possible as determined from the *HBA1*/*HBA2* genotyping.

## HBA Output File

The HBA Caller generates its output in the targeted caller output file `<output-file-prefix>.targeted.json` that also contains calls from other targets (see [Targeted JSON File](broken://pages/9jH2RKZDkZGKu3Mhz1zP#targeted-json-file)).

| Fields in JSON  | Explanation                                                    | Type and Possible Values                          |
| --------------- | -------------------------------------------------------------- | ------------------------------------------------- |
| totalCopyNumber | Total copy number of *HBA1* and *HBA2* genes including hybrids | nonnegative integer                               |
| genotype        | The HBA genotype.                                              | string                                            |
| genotypeFilter  | The HBA genotype filter.                                       | string, \[PASS, HBALowGQ, HBALowPValue, No\_call] |
| genotypeQual    | The HBA Phred genotype quality.                                | double                                            |
| variants        | List of detected homology region variants in *HBA1*/*HBA2*.    | Array of variants                                 |

Each variant reported in the `variants` array will have the fields below.

| Fields in JSON   | Explanation                                      | Type and Possible Values         |
| ---------------- | ------------------------------------------------ | -------------------------------- |
| alleleId         | HGVS identifier of the variant allele            | string                           |
| alleleCopyNumber | Copy number of the allele in the called genotype | nonnegative integer              |
| genotypeQuality  | Phred-scaled quality for the called genotype     | nonnegative integer              |
| filter           | Filter for the called genotype                   | string. "PASS" when not filtered |

Structural variant and homology region variants are reported in VCF format. See [Targeted VCF File](broken://pages/9jH2RKZDkZGKu3Mhz1zP#targeted-vcf-file) for details about how these variants are reported in VCF.

### Output File Example

An example of the HBA caller content in the `<output-file-prefix>.targeted.json` output file is shown below.

```
  "hba": {
    "totalCopyNumber": 4,
    "genotype": "aa/aa",
    "genotypeQuality": 93,
    "genotypeFilter": "PASS",
    "variants": [
      {
        "alleleId": "NM_000517.6:c.95+1G>A",
        "alleleCopyNumber": 1,
        "genotypeQuality": 26,
        "filter": "PASS"
      }
    ]
  }
```


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://help.connected.illumina.com/dragen/dragen-v4.4/product-guide/dragen-v4.4/dragen-dna-pipeline/targeted-caller/hba-calling.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.