# Filter Variants

Variant detection can identify large numbers of variants, depending on both the size of the regions being interrogated and the parameters used during detection. Filtering is therefore often necessary to identify the variants that may be relevant for downstream investigation. The *Filter variants* task filters variant data by the quality metrics generated during detection and by annotation information. The task can be invoked from any *Variants* or *Annotated variants* data node.

## Filter variants dialog

The *Filter variants* task dialog can contain two to five sections, depending on the variant caller used for detection and the level of annotation. All instances of the task include *Include region overlapping variants* and a section for *Quality*.

<figure><img src="/files/h7VCBdhIY6mxMfYpRgA7" alt=""><figcaption></figcaption></figure>

Selecting *Include region overlapping variants* opens a dialog for including only variants located within genomic regions of interest, such as transcript models or amplicons. If variant detection was performed in Connected Multiomics, the *Assembly* is displayed as text in the section and the reference cannot be changed. If variant detection was performed outside of Connected Multiomics, select the *Assembly* that was used for variant detection from the drop-down list. Previously added assemblies are available for selection, or *New assembly…* can be used to import the reference sequence from within the task. In the *Annotation model* section, choose any annotation model from the drop-down menu, or import one from within the task by selecting *Add annotation model*. If the selected annotation contains gene-level information, this filter includes both intronic and exonic regions.

<figure><img src="/files/PhG1bKArB8QRE6IMBxtq" alt=""><figcaption></figcaption></figure>
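
The overlap test behind a region filter can be illustrated with a minimal Python sketch. This is not the implementation used by the task. The `Region` and `Variant` types, the 1-based inclusive coordinates, and the rule that a deletion counts as overlapping when any of its reference bases falls inside a region are all assumptions made for illustration.

```python
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive (assumed convention)
    end: int    # 1-based, inclusive (assumed convention)


class Variant(NamedTuple):
    chrom: str
    pos: int    # 1-based position of the first reference base
    ref: str
    alt: str


def overlaps(variant: Variant, region: Region) -> bool:
    """Return True if the variant's reference span intersects the region."""
    var_end = variant.pos + len(variant.ref) - 1
    return (
        variant.chrom == region.chrom
        and variant.pos <= region.end
        and var_end >= region.start
    )


def filter_by_regions(variants, regions):
    """Keep variants that overlap at least one region of interest."""
    return [v for v in variants if any(overlaps(v, r) for r in regions)]


# Hypothetical data: one gene-level region spanning exons and introns.
regions = [Region("chr1", 1000, 5000)]
variants = [
    Variant("chr1", 999, "A", "G"),    # just upstream of the region, excluded
    Variant("chr1", 998, "ACT", "A"),  # deletion reaching position 1000, kept
    Variant("chr1", 3200, "C", "T"),   # inside the region, kept
    Variant("chr2", 3200, "C", "T"),   # different chromosome, excluded
]
kept = filter_by_regions(variants, regions)
```

A linear scan over all regions is enough for a handful of amplicons. For genome-wide annotation models, an interval tree or sorted-interval search would be the more usual choice.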

If the filter is invoked from an *Annotated variants* data node, the *Variant Novelty* section can be used to filter on whether a variant is known, as identified in the variant database used for annotation. Selecting *Known only*, *Novel only*, or *All* includes only the corresponding variants in the filtered result.

<figure><img src="/files/wF07dutgenBv2mr8y32n" alt=""><figcaption></figcaption></figure>
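
As a rough illustration of the three novelty options, the Python sketch below treats a variant as known when it carries a database identifier. The `db_id` key and the sample records are hypothetical; how the task itself records novelty is not described on this page.

```python
def filter_by_novelty(variants, mode="All"):
    """Filter variants by novelty.

    mode is one of "Known only", "Novel only", or "All".
    A variant counts as known when its (hypothetical) "db_id" is non-empty.
    """
    if mode == "All":
        return list(variants)
    if mode == "Known only":
        return [v for v in variants if v.get("db_id")]
    if mode == "Novel only":
        return [v for v in variants if not v.get("db_id")]
    raise ValueError(f"unknown novelty mode: {mode!r}")


# Hypothetical annotated variants.
annotated = [
    {"name": "v1", "db_id": "rs0000001"},  # found in the annotation database
    {"name": "v2", "db_id": None},         # not found, treated as novel
    {"name": "v3"},                        # no annotation at all, treated as novel
]
known = filter_by_novelty(annotated, "Known only")
novel = filter_by_novelty(annotated, "Novel only")
```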

Variants annotated with a transcript model will include a filter for *Variant Type*. For variants in coding regions, *Mutation type* allows for the inclusion of *Synonymous*, *Missense*, and/or *Nonsense* variants when selecting the appropriate type. For variants located outside of coding regions, *Feature section* allows for the inclusion of 5-prime splice site (*Splice-5*), 3-prime splice site (*Splice-3*), *Non-coding RNA*, *5-prime UTR*, *3-prime UTR*, *Intron*, *Promoter*, and/or *Intergenic* variants by selecting the appropriate type.

<figure><img src="/files/uncumz2AlC2AEP3udY4s" alt=""><figcaption></figcaption></figure>
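
The category selection can be sketched in Python as a membership test. The record keys `mutation_type` and `feature_section` are hypothetical, and the sketch assumes that a variant is kept when it matches any selected category; the page does not state how the task combines the two groups.

```python
def filter_by_variant_type(variants, mutation_types=(), feature_sections=()):
    """Keep variants whose annotation matches any selected category."""
    mutation_types = set(mutation_types)
    feature_sections = set(feature_sections)
    return [
        v for v in variants
        if v.get("mutation_type") in mutation_types
        or v.get("feature_section") in feature_sections
    ]


# Hypothetical annotated variants: coding variants carry a mutation type,
# non-coding variants carry a feature section.
typed_variants = [
    {"name": "v1", "mutation_type": "Missense"},
    {"name": "v2", "mutation_type": "Synonymous"},
    {"name": "v3", "feature_section": "Splice-5"},
    {"name": "v4", "feature_section": "Intron"},
]
selected = filter_by_variant_type(
    typed_variants,
    mutation_types=["Missense", "Nonsense"],
    feature_sections=["Splice-5", "Splice-3"],
)
```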

When the *filter by field* option is checked, the available fields are shown in the drop-down list. The list differs from one data node to another, depending on the variant detection algorithm, the annotation database, and other factors.

<figure><img src="/files/khLXErRrI7745OTSWZO8" alt=""><figcaption></figcaption></figure>

For instance, the VarQual field is a metric generated during variant detection, and metrics of this kind depend on the method used for variant detection.

Fields can be searched in the drop-down list. Hovering over a field displays its description.

<figure><img src="/files/sASFz0cIhga4DSWjLyhB" alt=""><figcaption></figcaption></figure>
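
Conceptually, a field filter compares one named value per variant against a threshold. The Python sketch below shows that idea only. The comparator set, the threshold of 30, the sample values, and the choice to drop variants that lack the field are all assumptions, not documented behavior of the task.

```python
import operator

# Comparators offered by this sketch (an assumed set).
COMPARATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
    "==": operator.eq,
}


def filter_by_field(variants, field, comparator, threshold):
    """Keep variants whose value for `field` satisfies the comparison.

    Variants that do not carry the field are dropped (an assumption).
    """
    compare = COMPARATORS[comparator]
    return [
        v for v in variants
        if v.get(field) is not None and compare(v[field], threshold)
    ]


# Hypothetical records; the VarQual values are made up for illustration.
records = [
    {"name": "v1", "VarQual": 12.5},
    {"name": "v2", "VarQual": 48.0},
    {"name": "v3"},  # field missing for this variant
]
high_quality = filter_by_field(records, "VarQual", ">=", 30)
```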

Decisions on quality filtering parameters should be based on the sequencing assay design as well as the goal of the study, either identification of all potential variants or identification of high-confidence variants. At the very least, consider filtering on *Minimum read depth* to ensure that sufficient read evidence was available to call a variant. Where paired variant detection was performed in [SAMtools](https://github.com/illumina-swi/icm-docs/blob/icm-prod/docs/analyses/analysis-functionality/task-menu/variant-callers/samtools.md), *Minimum genotype log ratio* may be used to ensure sufficient evidence of genotype differences between case and control sample pairs. Refer to the [SAMtools](http://www.htslib.org/doc/#publications), [FreeBayes](https://github.com/ekg/freebayes), and [LoFreq](http://csb5.github.io/lofreq/) documentation for further details on any of these parameters.
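
For readers who want to check the effect of a depth threshold outside the application, the following minimal Python sketch applies a minimum read depth to VCF data lines. It assumes that depth is reported in the `DP` key of the INFO column, which is the reserved VCF key for combined depth; callers may also report per-sample depth elsewhere. The example lines and the threshold of 20 are hypothetical, and the sketch does not reproduce how the task computes *Minimum read depth*.

```python
def info_depth(vcf_line):
    """Return the INFO/DP value of a VCF data line, or None if absent."""
    fields = vcf_line.rstrip("\n").split("\t")
    info = fields[7]  # INFO is the eighth tab-separated VCF column
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key == "DP" and value:
            return int(value)
    return None


def passes_min_depth(vcf_line, min_depth):
    """True if the line reports a depth of at least min_depth."""
    depth = info_depth(vcf_line)
    return depth is not None and depth >= min_depth


# Hypothetical VCF data lines (CHROM POS ID REF ALT QUAL FILTER INFO).
vcf_lines = [
    "chr1\t1000\t.\tA\tG\t50\tPASS\tDP=35;AF=0.5",
    "chr1\t2000\t.\tC\tT\t20\tPASS\tDP=8;AF=0.25",
    "chr1\t3000\t.\tG\tA\t30\tPASS\tAF=0.1",  # no DP reported
]
MIN_DEPTH = 20  # hypothetical threshold; choose based on assay design
deep_enough = [line for line in vcf_lines if passes_min_depth(line, MIN_DEPTH)]
```

Lines without a `DP` entry are excluded here because their depth cannot be verified; whether that is appropriate depends on the goal of the study.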

