# VCF Input Requirement

Connected Insights imports variant calls for the following variant types in Variant Call Format (VCF), v4.1 or later:

* Small variants (SNVs, MNVs, and small indels)
* Structural variants (SVs)
* Copy number variants (CNVs)
* RNA fusion variants
* RNA splice variants

> ❗ Imported VCF files must contain at least one sample and be sorted correctly to ensure valid display of results in Connected Insights.
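
The two conditions in the note above can be pre-checked before upload. The following is a minimal sketch, not the validation Connected Insights itself performs. This page does not define "sorted correctly", so the sketch assumes it means that records for each contig are contiguous and positions never decrease within a contig. The function names `sample_names` and `is_position_sorted` are hypothetical.

```python
def sample_names(header_line):
    """Return the sample names from the '#CHROM' header line of a VCF."""
    columns = header_line.rstrip("\n").split("\t")
    # The first nine columns are fixed (CHROM through INFO, then FORMAT);
    # every column after that is a sample.
    return columns[9:]


def is_position_sorted(records):
    """Check an assumed sort order for (chrom, pos) tuples given in file order.

    Assumption: each contig appears as one contiguous block and positions
    never decrease inside a block.
    """
    seen = set()
    prev_chrom, prev_pos = None, 0
    for chrom, pos in records:
        if chrom != prev_chrom:
            if chrom in seen:
                return False  # contig reappears after a different contig
            seen.add(chrom)
            prev_chrom, prev_pos = chrom, pos
        elif pos < prev_pos:
            return False  # position went backwards within the contig
        else:
            prev_pos = pos
    return True
```

A file would pass this pre-check when `sample_names(...)` returns a non-empty list and `is_position_sorted(...)` returns `True`.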

The following sample fields are supported for each variant type:

## Small variants

| **Sample Field**                                  | **VCF Fields**          | **Details**                                                                                                                                                                                                  |
| ------------------------------------------------- | ----------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| Allele Depths                                     | AD                      | The read support for variants called at this position. Expected as a comma-separated list of values for the reference allele followed by each alternate allele. |
| Total Depth                                       | DP                      | The total read support for all alleles at this position. If not provided, it is calculated as the sum of all allele depths. |
| Variant Read Frequency / Variant Allele Frequency | VF (or derived from AD) | The proportion of reads supporting each alternate allele. Expected as a comma-separated list of values, one for each alternate allele. If not provided, it is calculated from the allele depths and total depth. |
| Genotype                                          | GT¹                     | The genotype of the sample at the given position.                                                                                                                                                            |

¹ The following GT values are interpreted as an absence of the reported variant and are not imported:

* `0`
* `0/0`
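
The fallback calculations and the genotype filter described for small variants can be sketched as follows. This is an illustrative reading of the table, not the product's implementation. The function names are hypothetical, and returning `0.0` when the total depth is zero is an assumption made to avoid a division error.

```python
# GT values this page lists as "absence of the reported variant"
# for small variants.
ABSENT_SMALL_VARIANT_GT = {"0", "0/0"}


def derive_depth_and_vf(ad, dp=None):
    """Derive total depth and per-alternate-allele frequency.

    ad: AD value as a string such as "90,10" (reference allele first).
    dp: DP value as an int, or None when the field is missing.
    """
    depths = [int(x) for x in ad.split(",")]
    # DP falls back to the sum of all allele depths when not provided.
    total = dp if dp is not None else sum(depths)
    # One frequency per alternate allele; 0.0 for zero depth is an assumption.
    vf = [alt / total if total else 0.0 for alt in depths[1:]]
    return total, vf


def is_imported_small_variant(gt):
    """Return False for GT values listed above as not imported."""
    return gt not in ABSENT_SMALL_VARIANT_GT
```

For example, `derive_depth_and_vf("90,10")` yields a total depth of 100 and a single frequency of 0.1. Phased genotypes such as `0|0` are not mentioned on this page, so the sketch does not treat them specially.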

## Copy number variants

Each VCF entry must have `SVTYPE=CNV` in its INFO field.

| **Sample Field**            | **VCF Fields**                    | **Details**                                                                                                                                            |
| --------------------------- | --------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
| Fold Change                 | FC, SM                            | Estimated fold change for the copy number variant.                                                                                                     |
| Copy Number                 | CN                                | Estimated absolute copy number for the copy number variant.                                                                                            |
| Minor-haplotype Copy Number | MCN                               | Estimated absolute copy number for the minor haplotype of a copy number variant. When MCN is zero, the copy number variant can be classified as loss of heterozygosity (LOH). |
| Genotype                    | (Derived from CN when available)¹ | The genotype of the sample at the given position.                                                                                                      |

¹ The following GT values are expected given the CN of the variant:

* `0`: The copy number is normal in a region expected to be haploid.
* `1`: The copy number differs from normal in a region expected to be haploid.
* `0/0`: The copy number is normal in a region expected to be diploid.
* `0/1`: The copy number differs from normal and is not a complete loss in a region expected to be diploid.
* `1/1`: The copy number is a complete loss in a region expected to be diploid.
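
The mapping from copy number to expected genotype listed above can be sketched as a small function. This is an illustration under a stated assumption: "normal" is taken to mean a copy number of 1 in a haploid region and 2 in a diploid region, which this page does not spell out. The names `expected_cnv_genotype` and `is_loh` are hypothetical.

```python
def expected_cnv_genotype(cn, expected_ploidy):
    """Return the GT expected for a CNV given its copy number.

    Assumption: normal copy number equals the expected ploidy
    (1 for haploid regions, 2 for diploid regions).
    """
    if expected_ploidy == 1:
        return "0" if cn == 1 else "1"
    if expected_ploidy == 2:
        if cn == 2:
            return "0/0"  # normal copy number
        if cn == 0:
            return "1/1"  # complete loss
        return "0/1"      # differs from normal, not a complete loss
    raise ValueError("expected_ploidy must be 1 or 2")


def is_loh(mcn):
    """A minor-haplotype copy number of zero indicates LOH."""
    return mcn == 0
```

Under these assumptions, a gain to four copies in a diploid region gives `0/1`, while a homozygous deletion gives `1/1`.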

## Structural variants and RNA fusion variants

Each VCF entry must have `SVTYPE` in its INFO field, set to one of DUP, DEL, INS, INV, or BND.
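
The SVTYPE requirements for this section and for copy number variants can be checked by parsing the INFO column. The sketch below is illustrative only. The helper names and the category labels `"cnv"` and `"sv_or_fusion"` are hypothetical, and returning `None` for any other value is a choice of this sketch, not documented behavior.

```python
# SVTYPE values this page lists for structural and RNA fusion variants.
SV_TYPES = {"DUP", "DEL", "INS", "INV", "BND"}


def parse_info(info):
    """Parse a VCF INFO string into a dict; flag keys map to True."""
    fields = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def variant_category(info):
    """Classify an entry by its SVTYPE, or return None if it matches neither."""
    svtype = parse_info(info).get("SVTYPE")
    if svtype == "CNV":
        return "cnv"
    if svtype in SV_TYPES:
        return "sv_or_fusion"
    return None
```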

| **Sample Field**                                  | **VCF Fields**                  | **Details**                                                                                                                                                            |
| ------------------------------------------------- | ------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Paired Reads                                      | PR                              | The paired read support for variants called at this position. Expected as a comma-separated list of values for the reference allele followed by each alternate allele. |
| Split Reads                                       | SR                              | The split read support for variants called at this position. Expected as a comma-separated list of values for the reference allele followed by each alternate allele. |
| Supporting Reads                                  | (Derived from PR and SR)        | The cumulative read support from split reads and paired reads for variants called at this position.                                                                    |
| Total Depth                                       | (Derived from PR and SR)        | The total reads for all alleles called at this position.                                                                                                               |
| Variant Read Frequency / Variant Allele Frequency | (Derived from PR and SR)        | The proportion of reads supporting each alternate allele. Calculated based on supporting reads and total depth.                                                        |
| Genotype                                          | GT¹ (or derived from PR and SR) | The genotype of the sample at the given position.                                                                                                                      |

¹ The following GT values are interpreted as an absence of the reported variant and are not imported:

* `.`
* `./.`
* `0`
* `0/0`
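
The derived fields in the table above can be sketched as follows. This is one plausible reading, not the product's documented formula: it assumes supporting reads are the per-allele sum of PR and SR for alternate alleles, and total depth is the sum of PR and SR across all alleles. The function names are hypothetical, and treating a missing field as zero counts is an assumption.

```python
from itertools import zip_longest

# GT values this page lists as "absence of the reported variant"
# for structural and RNA fusion variants.
ABSENT_SV_GT = {".", "./.", "0", "0/0"}


def _counts(value):
    """Parse a comma-separated count string; missing values give no counts."""
    if value is None or value == ".":
        return []
    return [int(x) for x in value.split(",")]


def derive_sv_support(pr=None, sr=None):
    """Return (supporting reads per alt allele, total depth, VAF per alt allele)."""
    # Combine paired and split reads per allele (reference allele first).
    per_allele = [p + s for p, s in zip_longest(_counts(pr), _counts(sr), fillvalue=0)]
    total_depth = sum(per_allele)
    supporting = per_allele[1:]
    # 0.0 for zero depth is an assumption to avoid a division error.
    vaf = [n / total_depth if total_depth else 0.0 for n in supporting]
    return supporting, total_depth, vaf


def is_imported_sv(gt):
    """Return False for GT values listed above as not imported."""
    return gt not in ABSENT_SV_GT
```

With `PR=30,10` and `SR=50,10`, this gives 20 supporting reads, a total depth of 100, and a frequency of 0.2. How GT is derived from PR and SR when it is absent is not described on this page, so the sketch leaves that out.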

