Connected Insights provides users with the flexibility to apply a selection of filter criteria to each variant category supported in the software. The selection of variant categories impacts the set of filtering criteria that can be selected for a given filter group.
The following table summarizes filters available for each variant category:
Variant Category | Small Variants | Structural Variants | Copy Number Variants | RNA Splice Variants | RNA Fusion Variants |
COSMIC | + | | | | |
CGC | + | + | + | + | + |
Cancer Hotspots | + | + | + | + | + |
Change (Copy Number) | | | + | | |
Change (Fold Change) | | | + | | |
ClinVar (VCV, RCV) | + | + | + | + | + |
Consequences | + | + | + | + | + |
Constraint Metrics (gnomAD) | + | + | + | + | + |
Filters | + | + | + | + | + |
Flags | + | + | + | + | + |
Genes | + | + | + | + | + |
Haploinsufficiency (ClinGen) | + | + | + | + | + |
Length | + | + | + | + | + |
Low Complexity Region (gnomAD) | + | | | | |
OMIM | + | + | + | + | + |
Origin | + | + | + | + | + |
Population | + | + | + | | |
Position (Chromosome) | + | + | + | + | + |
Position (Genomic Regions) | + | + | + | + | + |
PrimateAI-3D | + | | | | |
LOH Overlap | + | + | + | + | + |
Sample Metrics | + | + | + | + | + |
Splice AI | + | | | | |
Triplosensitivity (ClinGen) | + | + | + | + | + |
Variant Type | + | + | + | | |
This page summarizes filters related to variant quality. Filter availability can vary depending on the selected variant categories. If filters are applied to more than one variant category in the same condition group, only filters relevant for all variant categories are available. For more information, refer to Filter by Variant Category.
Filters data by the value provided for the variant in the FILTER column of the VCF file. Refer to the variant caller documentation to confirm possible values and recommended thresholds.
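As a rough illustration of what filtering on the FILTER column means, the following sketch reads the seventh tab-separated field of a VCF data line and keeps only records whose value is PASS. The function names, the sample records, and the filter codes `LowQual` and `LowDP` are hypothetical and are not part of Connected Insights. The actual codes depend on the variant caller.

```python
def vcf_filter_values(vcf_line: str) -> list[str]:
    """Return the FILTER values (7th column) of a VCF data line."""
    fields = vcf_line.rstrip("\n").split("\t")
    # VCF fixed columns: CHROM POS ID REF ALT QUAL FILTER INFO ...
    # Multiple failed filters are separated by semicolons.
    return fields[6].split(";")


def passes(vcf_line: str) -> bool:
    """True when the record's FILTER column is exactly PASS."""
    return vcf_filter_values(vcf_line) == ["PASS"]


# Hypothetical records for illustration only.
records = [
    "chr1\t12345\t.\tA\tG\t50\tPASS\tDP=100",
    "chr1\t22345\t.\tC\tT\t12\tLowQual;LowDP\tDP=8",
]
kept = [record for record in records if passes(record)]
```

Only the first record is kept here because the second one lists two failed filter codes instead of PASS.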
Filters data by read metrics, for example, VAF, allele depth, paired and read counts, split read count, and others.
Excludes small variants in low complexity regions (LCRs).
For more information, refer to Acronyms and Terms.
Connected Insights includes the Population filter that filters small variants based on the population allele frequency provided in the gnomAD database or custom annotations (refer to ).
The same Population filter is used to filter CNVs and SVs. For these variant categories, the filter uses the aggregate population allele frequency that Connected Insights calculates from the data in the 1000 Genomes Project database. Aggregation addresses the significant variability in how CNV and SV boundaries are called, which makes it necessary to combine the frequencies of variants with close boundaries (for example, the allele frequencies of a 1000–2000 CNV gain and a 1005–2000 CNV gain are summed). Specifically:
We selected alleles with 300 samples or more given the population group in the 1000 Genomes Project.
We aggregated alleles based on their similarity defined by the reciprocal overlap being equal or exceeding 0.75.
We calculated the aggregate frequency as the sum of all allele frequencies of the similar CNVs per given population group.
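The aggregation steps above can be sketched as follows. This is a minimal illustration, not the Connected Insights implementation. The `PopulationAllele` record layout, the function names, and the example frequencies are assumptions. The sketch also assumes that "300 samples or more" refers to the number of samples backing an allele in the population group, and that only alleles of the same type (for example, gain) are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PopulationAllele:
    """One CNV/SV allele in a single population group (hypothetical layout)."""
    start: int
    end: int
    kind: str          # for example "gain" or "loss"
    frequency: float   # allele frequency in the population group
    sample_count: int  # samples backing this allele (assumed meaning)


def reciprocal_overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> float:
    """Return the smaller of the two overlap fractions, or 0.0 without overlap."""
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a_end - a_start), overlap / (b_end - b_start))


def aggregate_frequency(start, end, kind, alleles, min_samples=300, min_overlap=0.75):
    """Sum the frequencies of catalog alleles that are similar to the query variant."""
    total = 0.0
    for allele in alleles:
        if allele.sample_count < min_samples or allele.kind != kind:
            continue
        if reciprocal_overlap(start, end, allele.start, allele.end) >= min_overlap:
            total += allele.frequency
    return total


# Hypothetical catalog for one population group.
catalog = [
    PopulationAllele(1000, 2000, "gain", 0.01, 400),
    PopulationAllele(1005, 2000, "gain", 0.02, 350),
    PopulationAllele(1000, 1200, "gain", 0.50, 500),  # reciprocal overlap 0.2: not similar
    PopulationAllele(1000, 2000, "gain", 0.30, 100),  # fewer than 300 samples: skipped
]
aggregate = aggregate_frequency(1000, 2000, "gain", catalog)
```

With this catalog, only the first two alleles contribute, so the aggregate frequency for the 1000–2000 gain is approximately 0.03.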
❗ The population frequency filter is only selectable in a filter group with a single selected variant category, as the population frequencies are tied to specific variant categories. The filter is not available for RNA Splice Variants and RNA Fusion Variants.
❗ The population frequency for copy number variants and structural variants is not displayed for alleles found in fewer than 300 samples per given population group in the 1000 Genomes Project.
Connected Insights includes the Flags filter that filters variants by the custom flags defined in Test Components. For more information, refer to
This page summarizes filters related to variant-disease and gene-disease associations. Filter availability can vary depending on the selected variant categories. If filters are applied to more than one variant category in the same condition group, only filters relevant for all variant categories are available. For more information, refer to .
In addition to filtering by the gene list, variants can be filtered by genes based on their associations with diseases and phenotypes.
Select the genes filter.
In the displayed dialog box, select the Include genes from the diseases checkbox.
Start typing a phenotype or disease to display a list of potential matches to add to the list.
Select a checkbox next to Resource to include it in the list.
Select a high, medium, or low confidence score.
Select an overlap distance between 0.00 and 1.00.
The disease and related diseases display in the Related Diseases area with the distance and gene count. Deselect any unnecessary related diseases.
Add any other genes to the gene list.
Add additional genes in the Additional genes area.
Select Apply to save changes to the gene list.
The following tables detail the ontology sources that Connected Insights uses to determine relationships between genes and diseases.
Phenotype to gene search finds similar phenotypes and diseases across various ontology sources, independent of the underlying vocabulary in each source. If an equivalent concept does not exist across the sources, Connected Insights calculates the distance between nodes in the ontological hierarchies and assigns a score from 0 to 1, where:
Values closer to 0 indicate that the concepts are more equivalent. A value of 0 indicates that the concepts are the same.
Values closer to 1 indicate that the concepts are more dissimilar. A value of 1 indicates that the concepts can only be connected at the root node and are therefore excluded from query results.
The determination of distance accounts for the fact that sibling concepts on leaf nodes (eg, hypertrophic cardiomyopathy and dilated cardiomyopathy) are more closely related to each other than siblings close to the root (eg, abnormal vascular morphology and abnormal heart morphology).
Confidence scores for gene-disease associations are calculated using the following rules:
Expert-curated data from OMIM, HPO, Phenopedia, and ClinVar are assigned a high confidence score.
High, moderate, or low confidence scores are converted from GeL PanelApp strong, medium, and low scores, respectively.
GTR confidence scores are based on information content metrics, which measure the specificity of a genetic test for a particular phenotype and a gene.
GeneRIF associations, which are derived using data mining, are assigned a medium confidence score.
Filters variants based on gene role in cancer annotated by COSMIC Cancer Gene Census (CGC).
Filters variants based on genes with known gene-disease associations in the OMIM database.
Present in OMIM — An OMIM entry exists for the gene.
Has associated OMIM phenotypes (including ?) — A relationship exists between the phenotype and a matching gene at the transcript level. Provisional relationships, indicated by "?" in OMIM, are included.
Has associated OMIM phenotypes (excluding ?) — A relationship exists between the phenotype and a matching gene at the transcript level. Provisional relationships, indicated by "?" in OMIM, are excluded.
Selecting associated phenotypes enables options to refine the filter by mode of inheritance.
❕ The COSMIC filter is only selectable for small variants.
Filters by number of samples in cancer hotspots.
Filters on interpretation categories and the review status provided in the ClinVar database.
To filter by ClinVar review status, use the definitions provided from the ClinVar status review guidelines on the National Center for Biotechnology Information website.
This page summarizes filters related to the functional impact of a variant. Filter availability can vary depending on the selected variant categories. If filters are applied to more than one variant category in the same condition group, only filters relevant for all variant categories are available. For more information, refer to .
Filters by gnomAD constraint metrics: LOEUF, misZ, pLI, pNull, pRec, and synZ. For more information, refer to .
Filters by the haploinsufficiency and triplosensitivity evidence classification. It represents the strength of evidence supporting a relationship between a gene and disease and whether loss (haploinsufficiency) or gain (triplosensitivity) of individual genes or genomic regions is a mechanism for disease (Riggs et al., Clin Genet. 81, 403–412 (2012)).
The evidence categories can be used for clinical interpretation of copy number variants using the categories recommended by ClinGen.
Filters by presence in the COSMIC database. For more information, refer to .
Filters by the SpliceAI score. For more information, refer to .
Filters by the PrimateAI-3D score. For more information, refer to .
Resource | Description |
OMIM | Online Mendelian Inheritance in Man |
HPO | Human Phenotype Ontology |
Phenopedia | Human Genome Epidemiology (HuGE) |
GEL PanelApp | Genomics England PanelApp |
ILMN | • ClinVar – NCBI ClinVar • MedGen – NCBI portal to information about conditions and phenotypes related to medical genetics • GTR – NCBI Genetic Testing Registry • GeneRIF – NCBI Gene Reference into Function |
Resource | Description |
ICD-9 | International Classification of Diseases, Ninth Revision |
ICD-10 | International Classification of Diseases, Tenth Revision |
MeSH | Medical Subject Headings |
UMLS | Unified Medical Language System. A repository of ontology resources. |
SNOMEDCT | Systematized Nomenclature of Medicine Clinical Terms |
Role in Cancer | Description |
TSG | Known tumor suppressor gene (TSG). |
Oncogene | Known oncogene. |
Fusion | Known fusion gene. |
Mode of Inheritance | Description |
AD | Autosomal Dominant |
AR | Autosomal Recessive |
XL | X-linked |
XLD | X-linked Dominant |
XLR | X-linked Recessive |
YL | Y-linked |
MI | Mitochondrial |
Mu | Multifactorial |
DD | Digenic Dominant |
DR | Digenic Recessive |
SMu | Somatic Mutation |
SMo | Somatic Mosaicism |
IC | Isolated Cases |
Interpretation Category | Definition in Connected Insights |
Pathogenic | The variant has at least one aggregate variant record (VCV entry) or aggregate variant-disease record (RCV) with classification category Pathogenic in the ClinVar database. |
Likely Pathogenic, Uncertain Significance, Likely Benign, Benign | The variant has at least one aggregate variant record (VCV entry) or aggregate variant-disease record (RCV) with the corresponding classification category (Likely Pathogenic, Uncertain Significance, Likely Benign, or Benign) in the ClinVar database. |
None | The variant has no records in ClinVar or has at least one aggregate variant record (VCV entry) or aggregate variant-disease record (RCV) with interpretation categories Drug Response, Protective, and others (any other categories excluding Pathogenic, Likely Pathogenic, Uncertain Significance, Likely Benign, and Benign). |
Number of Stars | Definition in Connected Insights | Review Status Descriptions |
Four | The highest review status across all VCV and RCV records for the variant is four stars. | Practice guideline. For more information, refer to the ClinVar status review guidelines on the National Center for Biotechnology Information website. |
Three | The highest review status across all VCV and RCV records for the variant is three stars. | Reviewed by expert panel. For more information, refer to the ClinVar status review guidelines on the National Center for Biotechnology Information website. |
Two | The highest review status across all VCV and RCV records for the variant is two stars. | Criteria provided, multiple submitters, no conflicts. Two or more submitters with assertion criteria and evidence (or a public contact) provided the same interpretation. |
One | The highest review status across all VCV and RCV records for the variant is one star. | Criteria provided, conflicting interpretations. Multiple submitters provided assertion criteria and evidence (or a public contact) but there are conflicting interpretations. The independent values are enumerated for clinical significance. |
None | The highest review status across all VCV and RCV records for the variant is no stars. | No assertion criteria provided. For more information, refer to the ClinVar status review guidelines on the National Center for Biotechnology Information website. |
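The two ClinVar tables above can be read as simple rules over the VCV and RCV records of a variant. The sketch below is one literal reading of those definitions, not the Connected Insights implementation. The function names and the idea of passing the records as plain lists are assumptions. It also assumes that a variant without any ClinVar records has no stars.

```python
CORE_CATEGORIES = {
    "Pathogenic",
    "Likely Pathogenic",
    "Uncertain Significance",
    "Likely Benign",
    "Benign",
}


def interpretation_categories(record_classifications: list[str]) -> set[str]:
    """Return every filter category the variant matches, given its VCV/RCV classifications."""
    found = set(record_classifications)
    matched = found & CORE_CATEGORIES
    # "None" applies when there are no records, or when at least one record
    # carries a classification outside the five core categories.
    if not found or found - CORE_CATEGORIES:
        matched.add("None")
    return matched


def review_stars(record_stars: list[int]) -> int:
    """Return the highest star rating across all VCV and RCV records (0 if none)."""
    return max(record_stars, default=0)


categories = interpretation_categories(["Pathogenic", "Drug Response"])
stars = review_stars([1, 3, 2])
```

Under this reading, a variant with one Pathogenic record and one Drug Response record matches both the Pathogenic and the None categories, and its review status is three stars.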
Evidence Classification | Haploinsufficiency and Triplosensitivity Score | Suggested Classification |
Sufficient Evidence | 3 | Pathogenic |
Emerging Evidence | 2 | Likely Pathogenic |
Little Evidence | 1 | VUS |
No Evidence | 0 | VUS |
Sensitivity Unlikely | 40: Dosage Sensitivity Unlikely | Likely Benign/Benign |
Autosomal Recessive Phenotype | 30: Autosomal Recessive | Not applicable |
Variant filters provide options for applying any combination of inclusion and exclusion criteria to the variants in a case. Filter criteria can vary depending on the selected variant categories. If filters are applied to more than one variant category in the same filter group, only filters relevant for all variant categories are available. For more information, refer to Filter by Variant Category.
Each filter combination resides on a different tab in the variant grid. Default filter views are defined in the test definition. You can create filter tabs in the grid for as many additional views as necessary. Filters applied in the variant grid are specific to the selected case.
For more information about filter options, refer to Filter by Variant Category and Filtering Logic.
When you configure a new test, you can add one or more specific filters to the test definition. The filters become default filters and are applied to every case in your workgroup. The default filters are locked and shown in the first tabs of the variant grid. For more information, refer to Configuration.
The default filter tabs, indicated by a lock, cannot be altered or deleted for the cases already processed. To change or delete default filters, you must update the filters that are used in the test definition and reprocess the case and upcoming cases through the updated test definition.
Connected Insights includes a demonstration filter as a template for defining your own filter views. In the primary filter group, the filter is set up to return all variant categories (ie, small variants, copy number variants, structural variants, RNA fusion variants, and RNA splice variants) and to require that they are called with PASS by each of the variant callers. In the subsequent filter groups, the filter is set up to apply the following logic for each of the variant categories:
Small variants with coding consequences and population frequency ≤ 0.05 in gnomAD for AFR, AMR, EAS, NFE, SAS
Any copy number variants
Structural variants resulting in a unidirectional gene fusion with at least three supporting reads
RNA fusion variants resulting in a unidirectional gene fusion
RNA splice variants resulting in an exon loss
Configure and modify case-specific filter views in new or unlocked variant grid tabs. The default filter tabs, indicated by a lock, cannot be altered or deleted.
Create a tab using one of the following methods:
To create a filter, select New Filter.
To copy an existing filter, select the tab drop-down arrow and then select Duplicate Filter.
To load a new filter, select Load Filter.
Select or search for a filter to load from the list of compatible filters created and saved by all users in your workgroup. Filters with variant flags are only compatible with cases that use the same flag list. Select Apply.
[Optional] Double-click the tab label and enter a new name.
Select Edit Variant Filters.
Build and edit variant filters by applying various filtering criteria to the gene and variants. For more information, refer to Filtering Logic.
Select Apply.
To lock a filter view, select the tab drop-down arrow, and then select Lock Filter. Locked filter views are indicated by a blue lock and cannot be deleted.
To edit a filter name, select the tab drop-down arrow, and then select Edit Filter Name.
After editing the name, select Save.
To save a filter view, select Edit Variant Filters, then select Save As. The filter is saved and can be configured in the test definitions to be a default filter and be used across cases. Column configurations and filter dependencies are saved with the filter.
To remove a filter view, delete the tab. Default tabs are indicated by a lock and cannot be deleted.
Export a list of variants and variant data to a tab-delimited file.
The maximum number of exported variants in a list is 7500. If the list exceeds the maximum, only the first 7500 results are included in the exported file.
Configure the filters and flags to show only the variants to export.
Select the tab drop-down arrow, and then select Export Grid as TSV.
Connected Insights applies two levels of filtering. The first level is a single condition group that filters data by general information about the variants and is applied to all variants in the view. The second level consists of additional condition groups with more specific filtering criteria. Each second-level condition group is applied independently of the others.
❗ If you are building a filter that covers multiple variant categories, make sure that the second-level filters cover each of the variant categories that you intend to return with the filter. Including a variant category in the first-level filter (eg, copy number variants) but omitting it from the second-level filters excludes that variant category from the filtering results, so a second-level condition group is needed even if it has no filtering conditions. For example, the filtering logic (Small Variants, CNVs, SVs) AND ((Small Variants) OR (CNVs)) excludes SVs from the filtering results. Refer to the following filter examples for more details.
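The two-level logic described above can be modeled as the first-level condition group combined by AND with the OR of all second-level condition groups. The following sketch is only a model of that logic under stated assumptions. The type aliases, function names, and the dictionary representation of a variant are hypothetical, and the sketch assumes that at least one second-level condition group exists.

```python
from typing import Callable

Variant = dict[str, object]
Condition = Callable[[Variant], bool]
ConditionGroup = tuple[set[str], list[Condition]]  # (variant categories, conditions)


def in_group(variant: Variant, group: ConditionGroup) -> bool:
    """A variant matches a group if its category is listed and every condition holds."""
    categories, conditions = group
    return variant["category"] in categories and all(cond(variant) for cond in conditions)


def keep(variant: Variant, first_level: ConditionGroup, second_level: list[ConditionGroup]) -> bool:
    """First-level group AND (second-level group 1 OR second-level group 2 OR ...)."""
    return in_group(variant, first_level) and any(in_group(variant, g) for g in second_level)


variants = [{"category": "Small Variants"}, {"category": "CNVs"}, {"category": "SVs"}]

# (Small Variants, CNVs, SVs) AND ((Small Variants) OR (CNVs))
first_level = ({"Small Variants", "CNVs", "SVs"}, [])
second_level = [({"Small Variants"}, []), ({"CNVs"}, [])]
kept = [v["category"] for v in variants if keep(v, first_level, second_level)]

# Adding a second-level group for SVs, even without conditions, brings SVs back.
second_level.append(({"SVs"}, []))
kept_with_svs = [v["category"] for v in variants if keep(v, first_level, second_level)]
```

In the first evaluation, SVs are dropped because no second-level group lists them. After the empty SV group is added, all three categories are returned.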
Use the Exclude selector for a given filter to exclude matching variants. The default filter behavior is to include variants.
The first-level condition group:
Does not specify any conditions for genes, so variants from all genes are included in the filtering results.
Includes all variant categories that Connected Insights supports in the filtering results (small variants, SVs, CNVs, RNA splice variants, and RNA fusion variants).
Sets a condition to include only variants that have PASS in the VCF variant filters, thus excluding all variants that do not have a PASS value.
The first-level condition group is connected to the second-level condition groups via operator AND.
The first second-level condition group specifies inclusion criteria for small variants to be returned in the filtering results. The condition group sets criteria to capture rare small variants with specific consequences of interest:
Variant category is set to Small Variant
Population frequency is specified as less than or equal to 0.05 in five population groups in gnomAD
The Consequences filter lists categories of interest: Start Loss, Stop Gained, Stop Loss, and others
The small variants condition group is connected to other second-level condition groups via the operator OR.
The third second-level condition group provides inclusion criteria for RNA Splice Variants. Only PASS-ing (per first-level condition group) RNA Splice Variants with consequence Exon Loss Variant are included in the filtering results.
The next second-level condition group provides inclusion criteria for RNA Fusion Variants. This filtering logic only includes PASS-ing (per first-level condition group) Unidirectional Gene Fusions in the filtering results.
The last second-level condition group lists structural variants and copy number variants as variant categories but does not provide any filtering condition. In the current version of Connected Insights, such a condition group is required to include structural variants and copy number variants in the filtering results even though they are already specified in the first-level filter.
This page summarizes filters related to variant details. Filter availability can vary depending on the selected variant categories. If filters are applied to more than one variant category in the same condition group, only filters relevant for all variant categories are available. For more information, refer to Filter by Variant Category.
Filters variants by Suspected Somatic or Predicted Germline origin.
You can select these options when creating or editing a variant filter by updating the Origin criterion. For example, if you do not want predicted germline variants, then add or update the Origin criterion to only include Suspected Somatic. For more information, refer to Variant Filters.
You can also add or edit a test definition to include either somatic or predicted germline variants through selecting the applicable variant filters in the Variant Filter(s) field. For more information, refer to Test Definition Setup.
For tumor-only analyses, when enabled in DRAGEN, variant origin is determined for small variants based on population frequency databases.
For tumor-normal analysis, when enabled in DRAGEN, variant origin is determined for small variants based on the presence or absence of the variant in the normal sample.
Filters small variants by overlap with an LOH event when LOH data is provided.
Filters variants by genes. There are two ways to create gene lists in the filter.
Using a list of gene names. To create a gene list, type or paste gene names in the Additional Genes field.
Using gene-disease associations from several sources. For more information, refer to Disease Association Filters.
Filters small variants, structural variants, and copy number variants by their types.
❗ The variant type is only selectable in a filter group with a single selected variant category as the variant types are tied to specific variant categories.
Filters data by specific consequences.
❗ The consequence filter is only selectable in a condition group with a single selected variant category as the consequences are tied to specific variant categories.
When annotating transcripts with terms, Connected Insights uses the most specific term supported by the variant annotator.
Consequence filters return only the specified term and do not automatically include child terms. Specify the exact terms to include in the filter results.
These consequences are annotated when a variant has a biological assertion from any source with these consequences (for example, JAX-CKB or My Knowledge Base).
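The exact-term behavior of the consequence filter can be pictured with a short sketch. It is an illustration only. The function name is hypothetical, and the example assumes that Missense Variant is a more specific (child) term of Coding Sequence Variant in the annotator's term hierarchy.

```python
def matches_consequence_filter(annotated_terms: set[str], selected_terms: set[str]) -> bool:
    """Exact-term matching: no parent or child terms are expanded."""
    return not annotated_terms.isdisjoint(selected_terms)


# A variant annotated with the most specific term only.
variant_terms = {"Missense Variant"}

parent_only = matches_consequence_filter(variant_terms, {"Coding Sequence Variant"})
with_child = matches_consequence_filter(
    variant_terms, {"Coding Sequence Variant", "Missense Variant"}
)
```

Selecting only the broader term does not return the variant (`parent_only` is `False`). The specific term has to be selected as well (`with_child` is `True`).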
Start and Stop Alterations
Filters data by the presence and location of start and stop alterations.
Splice Site
Filters data by the affected splice site.
Indels
Other
Filters data by the variant relationship to a gene.
When annotating transcripts with terms, Connected Insights uses the most specific terms supported by the variant annotator. Consequence filters return only the specified term and do not automatically include child terms. Specify the exact terms to include in the filter results.
These consequences are annotated when a variant has a biological assertion from any source with these consequences (for example, JAX-CKB or My Knowledge Base).
Filters data by the transcript consequence.
Filters data by the gene fusion consequence.
When annotating transcripts with terms, Connected Insights uses the most specific term supported by the variant annotator. Consequence filters return only the specified term and do not automatically include child terms. Specify the exact terms to include in the filter results.
These consequences are annotated when a variant has a biological assertion from any source with these consequences (for example, JAX-CKB or My Knowledge Base).
Filters data by the transcript consequence.
Filters data by the copy number consequence.
The functional consequences are annotated when a variant has a biological assertion from any source with these consequences (for example, JAX-CKB or My Knowledge Base).
The functional consequences are annotated when a variant has a biological assertion from any source with these consequences (for example, CKB or My Knowledge Base).
Filters by specified chromosomes. If no chromosome is selected, the chromosome filter is not applied.
Filters by specified regions. The input format is chr#: start-stop, with multiple regions separated by spaces or new lines.
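To make the region input format concrete, the following sketch parses a block of text in the `chr#: start-stop` form into tuples. It is not the parser used by Connected Insights. The regular expression, the function names, the tolerance for an optional space after the colon, and the inclusive treatment of both bounds are assumptions.

```python
import re

# chromosome name, optional spaces, colon, start, dash, stop
REGION_PATTERN = re.compile(r"(chr\w+)\s*:\s*(\d+)\s*-\s*(\d+)")


def parse_regions(text: str) -> list[tuple[str, int, int]]:
    """Extract (chromosome, start, stop) tuples from space- or newline-separated regions."""
    regions = []
    for chrom, start, stop in REGION_PATTERN.findall(text):
        start_pos, stop_pos = int(start), int(stop)
        if start_pos > stop_pos:
            raise ValueError(f"start is greater than stop in {chrom}: {start}-{stop}")
        regions.append((chrom, start_pos, stop_pos))
    return regions


def in_any_region(chrom: str, position: int, regions: list[tuple[str, int, int]]) -> bool:
    """True when the position falls inside at least one region (bounds assumed inclusive)."""
    return any(c == chrom and s <= position <= e for c, s, e in regions)


regions = parse_regions("chr1: 1000-2000 chr7:140453100-140453200\nchrX: 5-10")
```

Note that this sketch silently skips text that does not match the pattern. A stricter parser would report such input instead.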
These values indicate a reference, deletion, or amplification of copy number variants.
❗ The change (copy number) filter is only selectable in a condition group with only the copy number variant category.
With copy number variants, the fold change value is derived from the normalized read depth of the gene in a sample. This depth is relative to the normalized read depth of diploid regions in the same sample.
❗ The change (fold change) filter is only selectable in a condition group with only the copy number variant category.
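Read as a formula, the fold change described above is a ratio of two normalized depths. The sketch below shows that ratio under the assumption that both depths are already normalized. The function name and the example values are hypothetical, and the calculation performed by the variant caller may include further corrections.

```python
def fold_change(gene_depth: float, diploid_depth: float) -> float:
    """Ratio of the gene's normalized read depth to that of diploid regions."""
    if diploid_depth <= 0:
        raise ValueError("diploid reference depth must be positive")
    return gene_depth / diploid_depth


gain = fold_change(150.0, 100.0)  # depth above the diploid baseline
loss = fold_change(50.0, 100.0)   # depth below the diploid baseline
```

A value above 1 indicates more reads than expected for a diploid region, and a value below 1 indicates fewer.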
Filters data by variant length with resolution up to one bp.
Consequence | Description |
Gain of Function Variant | The variant results in gain of function. |
Loss of Function Variant | The variant results in loss of function. |
Consequence | Description |
Start Loss | The loss of a start codon in the coding sequence. |
Stop Gained | The gain of a stop codon in the coding sequence. |
Stop Loss | The loss of a stop codon in the coding sequence. |
Incomplete Terminal Codon | A change to at least one base of the final codon of an incomplete annotated transcript. |
Feature Elongation | The variant causes the extension of a genomic feature. |
Feature Truncation | The variant causes the reduction of a genomic feature. |
Type | Description |
Splice Acceptor Variant | The variant affects the canonical splice acceptor site (last two bases of the 3' end of the intron). |
Splice Donor Variant | The variant affects the canonical splice donor site (first two bases of the 5' end of the intron). |
Splice Region Variant | An indel or substitution in a noncoding splice region of the gene. |
Type | Description |
Frameshift Variant | An insertion or deletion in which the number of base pairs is not divisible by 3, causing a frame disruption. |
Inframe Deletion | A deletion that does not disrupt the reading frame. |
Inframe Insertion | An insertion that does not disrupt the reading frame. |
Type | Description |
Missense Variant | A single base pair substitution that results in the translation of a different amino acid at the position. |
Protein Altering Variant | The variant has a protein-altering coding consequence. |
Coding Sequence Variant | The variant changes the coding sequence. |
Type | Description |
Intergenic Variant | The variant position is not covered by any gene transcript. |
Upstream Gene Variant | The variant position is within 5 kb upstream of the defined transcript start coordinate. |
Downstream Gene Variant | The variant position is within 5 kb downstream of the defined transcript end coordinate. |
Intron Variant | The variant occurs within an intron region. |
3-prime UTR Variant | The variant is in the 3' untranslated region of a gene. |
5-prime UTR Variant | The variant is in the 5' untranslated region of a gene. |
Noncoding Transcript Exon Variant | The variant changes the noncoding exon sequence in a noncoding transcript. |
Noncoding Transcript Variant | The variant occurs in a noncoding RNA gene. |
Synonymous Variant | The variant does not affect the primary amino acid sequence of the translated protein. |
Start Retained Variant | At least one base in the start codon is changed, but the start codon remains. |
Stop Retained Variant | At least one base in the terminator codon is changed, but the terminator remains. |
Mature miRNA Variant | The variant occurs within a mature miRNA sequence. |
NMD Transcript Variant | The variant is in a transcript that is the target of nonsense-mediated decay (NMD). |
Regulatory Region Ablation | A deletion of a region that contains a regulatory region. |
Regulatory Region Amplification | An amplification of a region that contains a regulatory region. |
Regulatory Region Variation | The variant occurs in a regulatory region. |
Consequence | Description |
Gain of Function Variant | The variant results in gain of function. |
Loss of Function Variant | The variant results in loss of function. |
Consequence | Description |
Transcript Variant | The variant changes the structure of the transcript. |
Intron Variant | The variant is completely within the intron region of the gene. |
Exon Variant | The variant is completely within the exon region of the gene. |
Transcript Ablation | A deletion of a region that contains a transcript feature. |
Transcript Amplification | An amplification of a region that contains a transcript. |
Feature Elongation | The variant causes the extension of a genomic feature. |
Feature Truncation | The variant causes the reduction of a genomic feature. |
5-Prime Duplicated Transcript | A partially duplicated transcript in which the 5' end of the transcript is duplicated. |
3-Prime Duplicated Transcript | A partially duplicated transcript in which the 3' end of the transcript is duplicated. |
Consequence | Description |
Unidirectional Gene Fusion | A fusion of two genes on the same strand. |
Bidirectional Gene Fusion | A fusion of two genes on opposite strands. |
Gene Fusion | A fusion of two genes with ambiguous or unknown strand. |
Consequence | Description |
Gain of Function Variant | The variant results in gain of function. |
Loss of Function Variant | The variant results in loss of function. |
Consequence | Description |
Transcript Variant | The variant changes the structure of the transcript. |
Intron Variant | The variant is completely within the intron region of the gene. |
Exon Variant | The variant is completely within the exon region of the gene. |
Transcript Ablation | A deletion of a region that contains a transcript feature. |
Transcript Amplification | An amplification of a region that contains a transcript. |
Transcript Truncation | A truncation of a region that contains a transcript. |
Feature Elongation | The variant causes the extension of a genomic feature. |
Feature Truncation | The variant causes the reduction of a genomic feature. |
5-Prime Duplicated Transcript | A partially duplicated transcript in which the 5' end of the transcript is duplicated. |
3-Prime Duplicated Transcript | A partially duplicated transcript in which the 3' end of the transcript is duplicated. |
Loss of Heterozygosity | The variant results in loss of heterozygosity of the transcript. |
Type | Description |
Copy Number Increase | The copy number is increased relative to the reference sequence. |
Copy Number Decrease | The copy number is decreased relative to the reference sequence. |
Copy Number Change | The copy number is increased or decreased. |
Intron | The variant is completely within the intron region of the gene. |
Exon | The variant is completely within the exon region of the gene. |
Consequence | Description |
Gain of Function Variant | The variant results in gain of function. |
Loss of Function Variant | The variant results in loss of function. |
Consequence | Description |
Exon Loss Consequence | A loss of one or more exons in a gene. |
Consequence | Description |
Gain of Function Variant | The variant results in gain of function. |
Loss of Function Variant | The variant results in loss of function. |
Consequence | Description |
Unidirectional Gene Fusion | A fusion of two genes on the same strand. |
Transcript Variant | The variant changes the structure of the transcript. |