# Contamination

The Contamination column reports whether a sample shows signs of DNA contamination, helping ensure data reliability before interpretation.

{% hint style="info" %}
Be mindful that **a suspected contamination signal in sequencing data can stem from various sources**, including true contamination, sample mix-up, library preparation issues, or technical artifacts.

Always confirm the issue with other quality checks.
{% endhint %}

Contamination is detected using [Peddy](https://pubmed.ncbi.nlm.nih.gov/28190455/) calculations, which estimate the proportion of reads that do not match the expected genotype. This estimate is based on the `idr_baf` score.

`idr_baf` stands for the interdecile range of the B-allele frequency: the difference between the 90th and 10th percentiles of the distribution of `alt / (ref + alt)` ratios across all variant sites.

A larger `idr_baf` value indicates greater variability in allele balance, which may suggest sample contamination, particularly from another human DNA sample.
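
As a rough illustration of the definition above, the interdecile range can be computed from per-site read counts. This is a minimal sketch under stated assumptions, not Peddy's actual implementation: the function name `idr_baf` and the input lists of per-site `ref` and `alt` read counts are hypothetical, and the percentile interpolation method may differ from the one Peddy uses.

```python
from statistics import quantiles


def idr_baf(ref_counts, alt_counts):
    """Interdecile range of alt / (ref + alt) across sites (illustrative only)."""
    bafs = [
        alt / (ref + alt)
        for ref, alt in zip(ref_counts, alt_counts)
        if ref + alt > 0  # skip sites with no coverage
    ]
    # quantiles(n=10) returns the 9 decile cut points:
    # index 0 is the 10th percentile, index 8 is the 90th percentile.
    deciles = quantiles(bafs, n=10, method="inclusive")
    return deciles[8] - deciles[0]


# Hypothetical counts: every site has a balanced 50/50 allele ratio,
# so the distribution has no spread and the interdecile range is zero.
balanced = idr_baf([50] * 11, [50] * 11)
```

A sample whose allele ratios are spread widely between 0 and 1 yields a larger value than one whose ratios cluster tightly, which is the behavior the metric relies on.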

**Contamination check results:**

* **N/A**\
  No data is available (older cases or when `idr_baf` = 0.000).
* **No**\
  No contamination detected (`idr_baf` < 0.200).
* **Unlikely**\
  Possible contamination, but evidence is weak (0.200 ≤ `idr_baf` < 0.241).
* **Likely**\
  Contamination suspected (0.241 ≤ `idr_baf` < 0.300).
* **Yes**\
  Contamination confirmed (`idr_baf` ≥ 0.300).
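
The thresholds above can be expressed as a small lookup. The following is a sketch for illustration only: `classify_contamination` is a hypothetical helper, not a function of the product or of Peddy, and representing missing data as `None` is an assumption.

```python
def classify_contamination(idr_baf):
    """Map an idr_baf value to the label shown in the Contamination column."""
    # Assumption: missing data is passed as None; 0.000 is also treated as N/A.
    if idr_baf is None or idr_baf == 0.0:
        return "N/A"
    if idr_baf < 0.200:
        return "No"
    if idr_baf < 0.241:
        return "Unlikely"
    if idr_baf < 0.300:
        return "Likely"
    return "Yes"


# Hypothetical values, one per category.
labels = [classify_contamination(v) for v in (None, 0.150, 0.220, 0.260, 0.350)]
```

Each lower bound is inclusive, so a value of exactly 0.241 is reported as "Likely" and exactly 0.300 as "Yes".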

{% hint style="success" %}
Hover over the value to display a tooltip showing the **HET ratio** (proportion of sites that are heterozygous) and the **HET count** (number of heterozygote calls in sampled sites).
{% endhint %}
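
To make the two tooltip values concrete, here is a minimal sketch of how they relate. The helper `het_stats`, the encoding of genotypes as allele-index pairs, and the choice to exclude missing calls from the denominator are all assumptions for illustration, not a description of how Peddy or the product computes them.

```python
def het_stats(genotypes):
    """Return (het_count, het_ratio) for a list of diploid genotype calls.

    Each genotype is a pair of allele indices, e.g. (0, 1) for a heterozygote.
    A missing call is represented as (None, None) (hypothetical encoding).
    """
    called = [gt for gt in genotypes if None not in gt]  # drop missing calls
    het_count = sum(1 for a, b in called if a != b)
    het_ratio = het_count / len(called) if called else 0.0
    return het_count, het_ratio


# Hypothetical calls: two heterozygous sites out of four called sites.
count, ratio = het_stats([(0, 1), (0, 0), (1, 1), (0, 1)])
```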

{% hint style="success" %}
**Tips:**

* Always review contamination results before starting interpretation to rule out technical issues that could explain unexpected variant calls.
* Cross-check contamination results with other QC metrics (e.g., depth, ploidy, sex validation) for a more complete picture of sample quality.
* For family cases, check that no contamination is flagged before relying on inheritance-based filters.
{% endhint %}

{% hint style="danger" %}
**Warnings:**

* **Panels may be less reliable:** For targeted panels, contamination estimates may be inaccurate due to the limited number of variants available for calculation. Use caution and cross-check with other QC metrics when interpreting these results.
* **Do not use in isolation:** A "Likely" or "Yes" result should not be considered diagnostic on its own. Review case setup, sequencing quality, and sample handling first.
{% endhint %}

