Oncogenicity Prediction
The Oncogenicity Prediction feature estimates the oncogenic potential of variants based on the Standards for the classification of somatic variants in cancer (Horak et al., Genet Med. 2022). The standards include 17 criteria and allow variants to be classified into 5 categories: Oncogenic, Likely Oncogenic, Variant of Unknown Significance (VUS), Likely Benign, and Benign. Connected Insights scores variants (only PASS small variants are currently supported) against 16 of the criteria (all except OP2) based on the implementation details provided below.
❗ Use predictions as a starting point for interpretation. Exercise judgement to determine whether more or fewer criteria apply.
Estimated variant oncogenicity can be found:
In the “Oncogenicity Prediction” column of the Variants tab, where it is available for sorting
In the “Oncogenicity Prediction” section of the Biomarker page, where it is available for manual review and further interpretation
To view the evidence behind the estimated oncogenicity category and complete variant classification, follow the steps below:
Navigate to the "Oncogenicity Prediction" section in the Biomarker page
Review the estimated oncogenicity category, score, and met criteria (displayed with filled checkboxes, see OS2, OM1, OM2, OP3, and OP4 in the figure below)
Review evidence for each criterion by clicking on it and displaying an evidence map. You can move objects on the map to facilitate review.
Select Report to edit and complete the variant classification following Interpret a Variant. Information about the met oncogenicity criteria is used to pre-populate the variant summary.
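This page does not state how the met criteria are combined into the score and category. The sketch below assumes the point system recalled from Horak et al. (2022): 8, 4, 2, and 1 points for very strong, strong, moderate, and supporting oncogenic evidence, and -8, -4, and -1 for the benign counterparts, with category cut-offs at 10, 6, 0, and -6. Both the point values and the cut-offs are assumptions to verify against the publication, and the function names are hypothetical rather than part of Connected Insights.

```python
# Points per evidence-strength prefix (assumed from Horak et al., 2022).
POINTS = {"OVS": 8, "OS": 4, "OM": 2, "OP": 1, "SBVS": -8, "SBS": -4, "SBP": -1}

def criterion_points(code):
    """Map a criterion code such as 'OM2' to its points via its letter prefix."""
    return POINTS[code.rstrip("0123456789")]

def classify(met_criteria):
    """Return (score, category) for a list of met criterion codes."""
    score = sum(criterion_points(code) for code in met_criteria)
    if score >= 10:
        category = "Oncogenic"
    elif score >= 6:
        category = "Likely Oncogenic"
    elif score >= 0:
        category = "Variant of Unknown Significance"
    elif score >= -6:
        category = "Likely Benign"
    else:
        category = "Benign"
    return score, category

print(classify(["OS2", "OM1", "OM2", "OP3", "OP4"]))
```

Under these assumed values, a variant meeting OS2, OM1, OM2, OP3, and OP4 scores 4 + 2 + 2 + 1 + 1 = 10 points.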
OVS1: “Null variant (nonsense, frameshift, canonical ±1 or 2 splice sites, initiation codon, single or multi-exon deletion) in a bona fide tumor suppressor gene.”
For this criterion, we are checking the fulfillment of the following conditions:
Variant is a “null variant”. Null variants are defined as variants with a consequence of stop_gained, start_lost, frameshift_variant, splice_acceptor_variant, or splice_donor_variant, as well as variants with a high splicing prediction (SpliceAI score > 0.8) regardless of their consequence
Variant is not in the last intron or exon
Variant is in a tumor suppressor gene, according to Cancer Gene Census, JAX-CKB, OncoKB, or MyKB
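The three OVS1 conditions can be read as a single boolean check. This is a minimal sketch, not the product's code: the function name and parameters are hypothetical, and the gene-role and exon/intron position lookups are assumed to have been resolved beforehand.

```python
# Consequence terms treated as "null" in the conditions above.
NULL_CONSEQUENCES = {
    "stop_gained",
    "start_lost",
    "frameshift_variant",
    "splice_acceptor_variant",
    "splice_donor_variant",
}

def meets_ovs1(consequences, spliceai, in_last_exon_or_intron, is_tumor_suppressor):
    """Return True when all three OVS1 conditions hold.

    spliceai may be None when no prediction is available (assumption:
    a missing score never counts as a high splicing prediction).
    """
    has_null_consequence = bool(NULL_CONSEQUENCES & set(consequences))
    high_splicing = spliceai is not None and spliceai > 0.8
    is_null = has_null_consequence or high_splicing
    return is_null and not in_last_exon_or_intron and is_tumor_suppressor
```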
OS1: “Same amino acid change as a previously established oncogenic variant regardless of nucleotide change. Example: Val→Leu caused by either G>C or G>T in the same codon.”
For this criterion, we are checking the fulfillment of the following conditions:
Variant's consequences are not splice_donor_variant or splice_acceptor_variant
Variant’s splicing prediction is not high (SpliceAI score <= 0.8)
There is a previously established pathogenic / oncogenic or likely pathogenic / oncogenic variant with the same amino acid change but different nucleotide change in ClinVar (2-4 stars) or MyKB
OS2: “Well-established in vitro or in vivo functional studies supportive of an oncogenic effect of the variant.”
For this criterion, we are checking the fulfillment of the following conditions:
A variant with the same amino acid change is a gain-, loss-, or switch-of-function variant in MyKB, OncoKB, or JAX-CKB, is interpreted as pathogenic / oncogenic or likely pathogenic / oncogenic in MyKB, OncoKB, JAX-CKB, or ClinVar (2-4 stars), and OS1 is not applicable OR
A variant with the same nucleotide change is a gain-, loss-, or switch-of-function variant in MyKB, OncoKB, or JAX-CKB, is interpreted as pathogenic / oncogenic or likely pathogenic / oncogenic in ClinVar (2-4 stars), and OS1 is applicable
In this implementation, “Well-established in vitro or in vivo functional studies” is inferred based on knowledge bases having records of gain-, loss-, or switch-of-function, usually established based on functional studies. “... supportive of an oncogenic effect of the variant” is established based on the evidence of oncogenicity in knowledge sources.
OS3: “Located in one of the hotspots in cancerhotspots.org with at least 50 samples with a somatic variant at the same amino acid position, and the same amino acid change count in cancerhotspots.org in at least 10 samples.”
For this criterion, we are checking the fulfillment of the following conditions:
There are at least 50 samples with somatic variants at the same amino acid position in cancerhotspots.org, there are at least 10 samples with variants with the same amino acid change in cancerhotspots.org, and OS1 is not applicable OR
There are at least 50 samples with somatic variants at the same amino acid position in COSMIC, there are fewer than 10 samples with variants with the same amino acid change in cancerhotspots.org, and OS1 is not applicable OR
There are at least 50 samples with somatic variants at the same nucleotide position in cancerhotspots.org, there are at least 10 samples with a variant with the same nucleotide change in cancerhotspots.org, and OS1 is applicable OR
There are at least 50 samples with somatic variants at the same nucleotide position in COSMIC, there are fewer than 10 samples with a variant with the same nucleotide change in cancerhotspots.org, and OS1 is applicable
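The four alternatives above share one shape: the same count thresholds are applied at the amino acid level when OS1 is not applicable and at the nucleotide level when it is. A hedged sketch of that shape follows; the function names and the idea of passing pre-computed sample counts are assumptions for illustration.

```python
def os3_count_level(os1_applicable):
    """Pick which counts to use: nucleotide-level if OS1 applies, else amino-acid-level."""
    return "nucleotide" if os1_applicable else "amino_acid"

def meets_os3(hotspots_position_count, hotspots_change_count, cosmic_position_count):
    """Apply the OS3 thresholds to sample counts taken at the chosen level.

    hotspots_* counts come from cancerhotspots.org, cosmic_position_count from COSMIC.
    """
    if hotspots_position_count >= 50 and hotspots_change_count >= 10:
        return True
    # COSMIC acts as a fallback when the exact change is rare in cancerhotspots.org.
    return cosmic_position_count >= 50 and hotspots_change_count < 10
```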
OM1: “Located in a critical and well-established part of a functional domain (e.g., active site of an enzyme).”
For this criterion, we are checking the fulfillment of the following conditions:
The variant is located in a region of a protein domain where pathogenic variants occur. Regions are defined per gene by taking all protein domains from UniProt, mapping all pathogenic and likely pathogenic variants in ClinVar to each domain, and spanning each region from the position of the first mapped variant to the position of the last, with 2 bp of padding added on each side.
OS1 and OS3 are not applicable
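The region definition used for OM1 can be illustrated with a small helper. This is a sketch under stated assumptions: all positions are taken to be in one shared coordinate system, the padded region is not clipped to the domain boundaries (the text does not say whether it is), and the function names are hypothetical.

```python
def critical_region(domain_start, domain_end, pathogenic_positions, padding=2):
    """Span the first to last pathogenic position inside a domain, plus padding.

    Returns (start, end), or None when no pathogenic variant maps to the domain.
    """
    inside = [p for p in pathogenic_positions if domain_start <= p <= domain_end]
    if not inside:
        return None
    return (min(inside) - padding, max(inside) + padding)

def meets_om1(position, regions, os1_applies, os3_applies):
    """OM1 holds when the variant falls in any region and neither OS1 nor OS3 applies."""
    if os1_applies or os3_applies:
        return False
    return any(start <= position <= end for start, end in regions)
```

For example, a domain spanning 100 to 200 with pathogenic variants at 120 and 150 gives the region (118, 152); variants mapped outside the domain are ignored.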
OM2: “Protein length changes as a result of in-frame deletions / insertions in a known oncogene, or tumor suppressor gene or stop-loss variants in a known tumor suppressor gene.”
For this criterion, we are checking the fulfillment of the following conditions:
Variant’s consequence is inframe_deletion or inframe_insertion and it is located in an oncogene or a tumor suppressor gene (based on Cancer Gene Census, JAX-CKB, OncoKB, or MyKB), or variant’s consequence is stop_lost and it is located in a tumor suppressor gene per the same sources
OVS1 is not applicable
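A minimal sketch of the OM2 conditions, assuming the gene roles have already been looked up in the knowledge sources. The function name and parameters are hypothetical.

```python
INFRAME_CONSEQUENCES = {"inframe_deletion", "inframe_insertion"}

def meets_om2(consequences, is_oncogene, is_tumor_suppressor, ovs1_applies):
    """Return True for in-frame indels in cancer genes or stop-loss in tumor suppressors."""
    terms = set(consequences)
    inframe = bool(terms & INFRAME_CONSEQUENCES) and (is_oncogene or is_tumor_suppressor)
    stop_lost = "stop_lost" in terms and is_tumor_suppressor
    # OM2 is only considered when OVS1 does not already apply.
    return (inframe or stop_lost) and not ovs1_applies
```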
OM3: “Located in one of the hotspots in cancerhotspots.org with less than 50 samples with a somatic variant at the same amino acid position, and the same amino acid change count in cancerhotspots.org is at least 10.”
For this criterion, we are checking the fulfillment of the following conditions:
There are fewer than 50 samples with somatic variants at the same amino acid position in cancerhotspots.org, there are at least 10 samples with variants with the same amino acid change in cancerhotspots.org, and OM1 and OM4 are not applicable OR
There are 10-49 samples with somatic variants at the same amino acid position and change in COSMIC, there are fewer than 10 samples with variants with the same amino acid change in cancerhotspots.org, and OM1 and OM4 are not applicable
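The two OM3 alternatives can be sketched as follows, assuming sample counts have already been retrieved from cancerhotspots.org and COSMIC. The function and parameter names are hypothetical.

```python
def meets_om3(hotspots_position_count, hotspots_change_count, cosmic_change_count,
              om1_applies, om4_applies):
    """Apply the OM3 thresholds; OM3 is skipped when OM1 or OM4 applies."""
    if om1_applies or om4_applies:
        return False
    if hotspots_position_count < 50 and hotspots_change_count >= 10:
        return True
    # Fallback to COSMIC when the change is rare in cancerhotspots.org.
    return 10 <= cosmic_change_count <= 49 and hotspots_change_count < 10
```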
OM4: “Missense variant at an amino acid residue where a different missense variant determined to be oncogenic (using this standard) has been documented. Amino acid difference from reference amino acid should be greater or at least approximately the same as for missense change determined to be oncogenic.”
For this criterion, we are checking the fulfillment of the following conditions:
There is a missense variant at the same amino acid position with a different amino acid change interpreted as pathogenic / oncogenic or likely pathogenic / oncogenic in ClinVar (2-4 stars), OncoKB or MyKB
OS1, OS3 and OM1 are not applicable
OP1: “All utilized lines of computational evidence support an oncogenic effect of a variant (conservation / evolutionary, splicing impact, etc.).”
For this criterion, we are checking the fulfillment of the following conditions:
Variant’s effect is predicted to be damaging (REVEL > 0.75 and PrimateAI-3D score percentile > 0.8) or variant’s splicing prediction is high (SpliceAI score > 0.8)
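The OP1 thresholds translate into a short check. In this sketch, missing scores are passed as None and treated as not supporting the criterion, which is an assumption since the text does not describe the handling of missing predictions. The function name is hypothetical.

```python
def meets_op1(revel, primateai_percentile, spliceai):
    """Return True when missense predictors agree on damaging, or splicing impact is high."""
    damaging = (
        revel is not None
        and primateai_percentile is not None
        and revel > 0.75
        and primateai_percentile > 0.8
    )
    high_splicing = spliceai is not None and spliceai > 0.8
    return damaging or high_splicing
```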
OP2: “Somatic variant in a gene in a malignancy with a single genetic etiology.”
This criterion is not yet implemented.
OP3: “Located in one of the hotspots in cancerhotspots.org and the particular amino acid change count in cancerhotspots.org is below 10.”
For this criterion, we are checking the fulfillment of the following conditions:
There is at least 1 sample with a somatic variant at the same amino acid position in cancerhotspots.org or COSMIC
There are fewer than 10 samples with variants with the same amino acid change in both cancerhotspots.org and COSMIC
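Both OP3 conditions must hold together, which the following sketch makes explicit. The function name and the pre-computed sample counts are assumptions for illustration.

```python
def meets_op3(hotspots_position_count, cosmic_position_count,
              hotspots_change_count, cosmic_change_count):
    """Return True for a known position whose exact change is rare in both sources."""
    position_seen = hotspots_position_count >= 1 or cosmic_position_count >= 1
    change_rare = hotspots_change_count < 10 and cosmic_change_count < 10
    return position_seen and change_rare
```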
OP4: “Absent from controls (or at an extremely low frequency) in Genome Aggregation Database (gnomAD).”
For this criterion, we are checking the fulfillment of the following conditions:
Variant’s allele count in the global population (gnomAD) is less than 10 if the variant is not in TP53, KRAS, or PTEN
Variant is absent in the global population (gnomAD) if the variant is in TP53 or KRAS
Variant’s frequency in the global population is less than 0.001% (gnomAD) if the variant is in PTEN
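The gene-specific OP4 rules can be sketched as a simple dispatch on gene symbol. Assumptions in this sketch: frequencies are expressed as fractions (so 0.001% is 0.00001), a variant absent from gnomAD is represented by an allele count of 0, and the function name is hypothetical.

```python
def meets_op4(gene, global_allele_count, global_frequency):
    """Apply the gnomAD rarity rule that matches the variant's gene."""
    if gene in {"TP53", "KRAS"}:
        # Must be absent from the global population.
        return global_allele_count == 0
    if gene == "PTEN":
        # Frequency below 0.001%, written as a fraction.
        return global_frequency < 0.00001
    return global_allele_count < 10
```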
SBVS1: “Minor allele frequency is >5% in Genome Aggregation Database (gnomAD) in any of 5 general continental populations: African, East Asian, European (Non-Finnish), Latino, and South Asian.”
For this criterion, we are checking the fulfillment of the following conditions:
Variant’s frequency in any one population is over 5% (gnomAD) if the variant is not in TP53, KRAS or PTEN
Variant’s frequency in the global population is at least 0.1% (gnomAD) and allele count in the global population is at least 5 (gnomAD) if the variant is in TP53
Variant’s frequency in the global population is at least 0.05% (gnomAD) if the variant is in KRAS
Variant’s frequency in the global population is at least 1% (gnomAD) and allele count in the global population is at least 5 (gnomAD) if the variant is in PTEN
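A sketch of the SBVS1 rules, with frequencies written as fractions (5% is 0.05). The function name and the dictionary of per-population frequencies are hypothetical inputs, not a gnomAD or Connected Insights API.

```python
def meets_sbvs1(gene, population_frequencies, global_frequency, global_allele_count):
    """Apply the SBVS1 frequency rule that matches the variant's gene.

    population_frequencies maps a population label to its allele frequency.
    """
    if gene == "TP53":
        return global_frequency >= 0.001 and global_allele_count >= 5
    if gene == "KRAS":
        return global_frequency >= 0.0005
    if gene == "PTEN":
        return global_frequency >= 0.01 and global_allele_count >= 5
    # All other genes: over 5% in any single population.
    return any(freq > 0.05 for freq in population_frequencies.values())
```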
SBS1: “Minor allele frequency is >1% in Genome Aggregation Database (gnomAD) in any of 5 general continental populations: African, East Asian, European (Non-Finnish), Latino, and South Asian.”
For this criterion, we are checking the fulfillment of the following conditions:
Variant’s frequency in any one population is over 1% (gnomAD) if the variant is not in TP53, KRAS or PTEN
Variant’s frequency in the global population is over 0.03% (gnomAD) and allele count in the global population is at least 5 (gnomAD) if the variant is in TP53
Variant’s frequency in the global population is at least 0.025% (gnomAD) if the variant is in KRAS
Variant’s frequency in the global population is at least 0.1% (gnomAD) and allele count in the global population is at least 5 (gnomAD) if the variant is in PTEN
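The SBS1 rules differ by gene only in their thresholds, so one way to express them is a small lookup table. This is a sketch with hypothetical names. Frequencies are fractions (1% is 0.01), and the TP53 threshold is treated as exclusive because the text says "over" rather than "at least".

```python
# gene -> (global frequency threshold, threshold is inclusive, minimum global allele count)
SBS1_GENE_RULES = {
    "TP53": (0.0003, False, 5),
    "KRAS": (0.00025, True, 0),
    "PTEN": (0.001, True, 5),
}

def meets_sbs1(gene, population_frequencies, global_frequency, global_allele_count):
    """Apply the SBS1 frequency rule that matches the variant's gene."""
    if gene not in SBS1_GENE_RULES:
        # All other genes: over 1% in any single population.
        return any(freq > 0.01 for freq in population_frequencies.values())
    threshold, inclusive, min_count = SBS1_GENE_RULES[gene]
    if inclusive:
        frequent = global_frequency >= threshold
    else:
        frequent = global_frequency > threshold
    return frequent and global_allele_count >= min_count
```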
SBS2: “Well-established in vitro or in vivo functional studies show no oncogenic effects.”
For this criterion, we are checking the fulfillment of the following conditions:
Variant is interpreted as benign or likely benign in ClinVar (2-4 stars), OncoKB or MyKB
SBP1: “All used lines of computational evidence suggest no effect of a variant (conservation / evolutionary, splicing effect, etc.).”
For this criterion, we are checking the fulfillment of the following conditions:
Variant’s effect is predicted to be neutral (REVEL is equal to or less than 0.75 and PrimateAI-3D score percentile is equal to or less than 0.8)
Variant’s splicing prediction is low or unknown (SpliceAI score is equal to or less than 0.2, or unknown)
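A sketch of the two SBP1 conditions. The text allows an unknown SpliceAI score but does not say how missing REVEL or PrimateAI-3D values are handled, so this sketch assumes that both must be present for the effect to count as neutral. The function name is hypothetical.

```python
def meets_sbp1(revel, primateai_percentile, spliceai):
    """Return True when effect predictors are neutral and splicing impact is low or unknown."""
    neutral_effect = (
        revel is not None
        and primateai_percentile is not None
        and revel <= 0.75
        and primateai_percentile <= 0.8
    )
    # None stands for an unknown SpliceAI score, which the condition accepts.
    low_splicing = spliceai is None or spliceai <= 0.2
    return neutral_effect and low_splicing
```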
SBP2: “A synonymous (silent) variant for which splicing prediction algorithms predict no impact to the splice consensus sequence nor the creation of a new splice site AND the nucleotide is not highly conserved.”
For this criterion, we are checking the fulfillment of the following conditions:
Variant has a consequence synonymous_variant
Variant’s conservation prediction score is not high (PhyloP < 0.1)
Variant’s splicing prediction is low or unknown (SpliceAI score <= 0.2 or unknown)
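The three SBP2 conditions combine into one check. In this sketch, an unknown SpliceAI score is passed as None and accepted, while a missing PhyloP score is assumed not to satisfy the conservation condition. The function name and that handling of missing PhyloP values are assumptions.

```python
def meets_sbp2(consequences, phylop, spliceai):
    """Return True for synonymous variants with low conservation and low or unknown splicing impact."""
    synonymous = "synonymous_variant" in consequences
    not_conserved = phylop is not None and phylop < 0.1
    low_splicing = spliceai is None or spliceai <= 0.2
    return synonymous and not_conserved and low_splicing
```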