In silico predictions

The In silico predictions card highlights aggregated scores for missense prediction, conservation, and splicing prediction. These are algorithmic assessments of variant effects based on known biological features, protein structures, evolutionary conservation, or machine learning models trained on large-scale data. The scores are calculated by proprietary algorithms that integrate outputs from individual in silico variant pathogenicity predictors.

These scores support ACMG classification, particularly PVS1/PP3 (pathogenic evidence) and BP4/BP7 (benign evidence), and are especially useful when experimental data is lacking.

Predictions are grouped into three categories: missense prediction, conservation, and splicing prediction.

Missense prediction

The tools in this category evaluate the potential functional impact of genetic variants, with a primary focus on assessing how missense substitutions affect protein structure and function.

REVEL scores are used for automated tagging of ACMG PP3 and BP4 criteria.

Tools

Missense variant evaluation

  • LRT: Identifies deleterious missense variants via selective constraint (Chun & Fay, 2009).

  • PolyPhen-2 HDIV and HVAR: Predict the possible impact of a missense variant on protein stability and function (Adzhubei et al., 2010).

  • PrimateAI-3D: Deep learning model developed by Illumina to quantify missense variant pathogenicity (Sundaram et al., 2018).

  • REVEL:

    • Ensemble predictor combining 18 individual scores to identify rare pathogenic missense variants (Ioannidis et al., 2016).

    • REVEL scores are used for automated tagging of ACMG PP3 and BP4 criteria.

  • SIFT: Evaluates whether an amino acid substitution affects protein function based on sequence homology (Sim et al., 2012).

Evaluation of diverse variants (not restricted to missense)

  • CADD (v38.0+):

    • Ensemble predictor integrating over 60 genomic annotations (Kircher et al., 2014).

    • Assesses deleteriousness of SNVs, MNVs, and small indels in both coding and non‑coding regions.

  • DANN:

    • Deep neural network model trained on the same annotations as CADD, designed to improve upon CADD’s methodology, particularly for non‑coding variants (Quang et al., 2015).

    • Evaluates SNVs, MNVs, and small indels, both coding and non-coding.

  • MutationTaster:

    • Ensemble predictor incorporating conservation, splice‑site changes, protein features, and population frequency (Schwarz et al., 2014).

    • Assesses SNVs and small indels in both coding and non‑coding regions.

Mitochondrial variant evaluation

  • APOGEE: Predicts pathogenicity of mitochondrial missense variants (Castellana et al., 2017).

  • MitoTIP: Evaluates the likelihood that novel SNVs or indels in mitochondrial tRNA genes lead to disease (Sonney et al., 2017).

Overall missense prediction

1

REVEL score check

  1. If REVEL score is greater than 0.75, missense prediction is Damaging.

  2. If REVEL score is less than or equal to 0.75, missense prediction is Neutral.

  3. If REVEL score is not available, proceed to step 2.

2

PrimateAI-3D prediction check

  1. If PrimateAI-3D prediction is D, missense prediction is Damaging.

  2. If PrimateAI-3D prediction is not available or is B, proceed to step 3.

3
  1. If any of the following is true, missense prediction is Damaging:

    1. LRT prediction is D

    2. SIFT prediction is D

    3. MutationTaster prediction is A or D

    4. PolyPhen-2 HDIV or HVAR prediction is P or D

    5. DANN score is greater than 0.96

    6. CADD Phred score is greater than or equal to 20

  2. If none of the above conditions are met, proceed to step 4.

4
  1. If any of the following is true, missense prediction is Neutral:

    1. LRT prediction is N or U

    2. SIFT prediction is T

    3. MutationTaster prediction is N or P

    4. PolyPhen-2 HDIV or HVAR prediction is B

  2. If none of the above conditions are met, proceed to step 5.

5

If none of the conditions in steps 1-4 are met, missense prediction is Unknown.
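
The cascade in steps 1-5 can be sketched as a single function. This is a minimal illustration of the rules as written above, not the product's implementation: the function name `missense_prediction`, its parameter names, and the convention that `None` means "not available" are all assumptions made for this example.

```python
from typing import Optional


def missense_prediction(
    revel: Optional[float] = None,
    primateai_3d: Optional[str] = None,
    lrt: Optional[str] = None,
    sift: Optional[str] = None,
    mutation_taster: Optional[str] = None,
    polyphen2_hdiv: Optional[str] = None,
    polyphen2_hvar: Optional[str] = None,
    dann: Optional[float] = None,
    cadd_phred: Optional[float] = None,
) -> str:
    """Hypothetical helper; None stands for a score that is not available."""
    # Step 1: when REVEL is available it decides the outcome on its own.
    if revel is not None:
        return "Damaging" if revel > 0.75 else "Neutral"

    # Step 2: PrimateAI-3D only settles the call when it predicts D.
    if primateai_3d == "D":
        return "Damaging"

    # Step 3: any single damaging call from the remaining tools is enough.
    if (
        lrt == "D"
        or sift == "D"
        or mutation_taster in ("A", "D")
        or polyphen2_hdiv in ("P", "D")
        or polyphen2_hvar in ("P", "D")
        or (dann is not None and dann > 0.96)
        or (cadd_phred is not None and cadd_phred >= 20)
    ):
        return "Damaging"

    # Step 4: any single neutral call yields Neutral.
    if (
        lrt in ("N", "U")
        or sift == "T"
        or mutation_taster in ("N", "P")
        or polyphen2_hdiv == "B"
        or polyphen2_hvar == "B"
    ):
        return "Neutral"

    # Step 5: nothing matched.
    return "Unknown"
```

Note the ordering: a REVEL score of 0.5 returns Neutral even if SIFT predicts D, because step 1 ends the evaluation whenever REVEL is available.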

Conservation

Conservation tools assess how strongly a nucleotide or amino acid position is conserved across species, helping to determine whether a variant is located within a biologically critical region.

The individual conservation scores are considered for automated tagging of the ACMG PP3 criterion.

Tools

  • SiPhy 29 Mammals: Estimates the rate of evolution at each nucleotide based on 29 mammalian genomes to identify regions under selective constraint (Garber et al., 2009).

  • GERP RS: Quantifies the difference between the neutral substitution rate and the observed rate at a specific site (Davydov et al., 2010).

  • phastCons 100 Vertebrates: Using a phylogenetic hidden Markov model, identifies segments of the genome that are evolving more slowly than the rest (Siepel et al., 2005).

Overall conservation prediction

1

Score availability check

  1. If no conservation prediction scores are available, conservation prediction is Unknown.

  2. If at least one conservation prediction score is available, proceed to step 2.

2
  1. If any of the following is true, conservation prediction is High:

    1. GERP RS score is greater than 3

    2. phastCons 100 Vertebrates score is greater than 0.9

  2. If none of the above conditions are met, proceed to step 3.

3
  1. If any of the following is true, conservation prediction is Moderate:

    1. GERP RS score is greater than 1

    2. phastCons 100 Vertebrates score is greater than 0.2

  2. If none of the above conditions are met, proceed to step 4.

4

If none of the conditions in steps 1-3 are met, conservation prediction is Low.
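
As a rough sketch, the conservation rules above translate to the function below. The name `conservation_prediction` and its parameters are hypothetical, and `None` stands for a missing score. One assumption is worth flagging: the steps give no threshold for SiPhy 29 Mammals, so this sketch counts SiPhy only toward the availability check in step 1.

```python
from typing import Optional


def conservation_prediction(
    gerp_rs: Optional[float] = None,
    phastcons_100v: Optional[float] = None,
    siphy_29m: Optional[float] = None,
) -> str:
    """Hypothetical helper; None stands for a score that is not available."""
    # Step 1: with no scores at all, the result is Unknown.
    # Assumption: SiPhy counts as an available score even though
    # no SiPhy threshold is listed in steps 2-3.
    if gerp_rs is None and phastcons_100v is None and siphy_29m is None:
        return "Unknown"

    # Step 2: either strict threshold gives High.
    if (gerp_rs is not None and gerp_rs > 3) or (
        phastcons_100v is not None and phastcons_100v > 0.9
    ):
        return "High"

    # Step 3: either relaxed threshold gives Moderate.
    if (gerp_rs is not None and gerp_rs > 1) or (
        phastcons_100v is not None and phastcons_100v > 0.2
    ):
        return "Moderate"

    # Step 4: scores exist but none crossed a threshold.
    return "Low"
```

Because the comparisons are strict, a GERP RS score of exactly 3 falls through to step 3 and is reported as Moderate.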

Splicing prediction

Splicing prediction tools evaluate whether a variant disrupts normal RNA splicing, potentially altering transcript structure or gene expression. They are especially critical for flagging intronic, synonymous, and non‑canonical splice‑region variants with potentially high splicing impact. These variants should be prioritized for transcript‑level review or laboratory RNA testing to verify the predicted effects.

SpliceAI scores support automated tagging of the ACMG PP3 and BP7 criteria. SpliceAI-10K scores support tagging of the ACMG PVS1 and PP3 criteria.

Tools

  • dbscSNV (AdaBoost and RandomForest): Machine learning ensemble models trained to predict splice‑site disruption from sequence context (Jian et al., 2014).

  • SpliceAI:

    • Deep neural network trained on large-scale human splicing data (Jaganathan et al., 2019).

    • Provides directional delta scores:

      • DS_AG (Acceptor Gain)

      • DS_AL (Acceptor Loss)

      • DS_DG (Donor Gain)

      • DS_DL (Donor Loss)

    • SpliceAI scores support automated tagging of the ACMG PP3 and BP7 criteria.

  • SpliceAI-10K:

    • Extends SpliceAI’s window to detect broader effects such as pseudoexonization, partial intron retention, and exon skipping.

    • Scores are provided for donor and acceptor gain/loss, and high values may indicate cryptic splice site activation.

    • These annotations guide PVS1 and PP3 ACMG tag evaluation to enhance interpretation at the transcript level, especially for deep intronic variants or potential splicing disruptions.

Overall splicing prediction

1

Score availability check

  1. If no splicing prediction scores are available, splicing prediction is Unknown.

  2. If at least one splicing prediction score is available, proceed to step 2.

2
  1. If any of the following is true, splicing prediction is High:

    1. Both dbscSNV scores are greater than 0.6

    2. Any SpliceAI score is greater than 0.8

  2. If none of the above conditions are met, proceed to step 3.

3

  1. If any of the following is true, splicing prediction is Low:

    1. Both dbscSNV scores are lower than 0.5

    2. Any SpliceAI score is less than or equal to 0.2

  2. If none of the above conditions are met, proceed to step 4.

4

If none of the conditions in steps 1-3 are met, splicing prediction is Moderate.
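
The splicing rules can be sketched the same way. The function name `splicing_prediction`, its parameters, and the dictionary layout for the SpliceAI delta scores are assumptions for illustration only. The sketch also assumes that a "both dbscSNV scores" condition can only be met when both scores are actually available.

```python
from typing import Dict, Optional


def splicing_prediction(
    dbscsnv_ada: Optional[float] = None,
    dbscsnv_rf: Optional[float] = None,
    spliceai: Optional[Dict[str, float]] = None,
) -> str:
    """Hypothetical helper. `spliceai` maps delta-score names
    (DS_AG, DS_AL, DS_DG, DS_DL) to values; None means not available."""
    delta_scores = [v for v in (spliceai or {}).values() if v is not None]
    have_both_dbscsnv = dbscsnv_ada is not None and dbscsnv_rf is not None

    # Step 1: no scores of any kind means Unknown.
    if not delta_scores and dbscsnv_ada is None and dbscsnv_rf is None:
        return "Unknown"

    # Step 2: High is checked first, so it wins over Low.
    if (have_both_dbscsnv and dbscsnv_ada > 0.6 and dbscsnv_rf > 0.6) or any(
        v > 0.8 for v in delta_scores
    ):
        return "High"

    # Step 3: Low.
    if (have_both_dbscsnv and dbscsnv_ada < 0.5 and dbscsnv_rf < 0.5) or any(
        v <= 0.2 for v in delta_scores
    ):
        return "Low"

    # Step 4: everything else.
    return "Moderate"
```

Read literally, step 3 fires when any one SpliceAI delta score is at or below 0.2, so a variant with DS_AG of 0.5 and DS_AL of 0.1 is reported as Low under this sketch.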

In silico predictions per variant type

| Category | SNV | Indel | mtDNA (SNV/indel) |
|---|---|---|---|
| Missense prediction | + PolyPhen-2 HDIV, PolyPhen-2 HVAR, SIFT, MutationTaster, LRT, DANN, REVEL, PrimateAI-3D, CADD Phred | + CADD Phred | + APOGEE, MitoTIP |
| Conservation | + SiPhy 29 Mammals, GERP RS, phastCons 100 Vertebrates | + GERP RS | - |
| Splicing prediction | + dbscSNV-RF, dbscSNV-Ada, SpliceAI DS_AG, SpliceAI DS_AL, SpliceAI DS_DG, SpliceAI DS_DL | - | - |
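
For scripted checks, the coverage above can be held in a small lookup structure. This is only an illustrative encoding of the table: the names `TOOLS_BY_VARIANT_TYPE` and `tools_for` are hypothetical, and an empty list stands for a "-" cell.

```python
from typing import Dict, List

# Category -> variant type -> tools, transcribed from the table above.
TOOLS_BY_VARIANT_TYPE: Dict[str, Dict[str, List[str]]] = {
    "Missense prediction": {
        "SNV": [
            "PolyPhen-2 HDIV", "PolyPhen-2 HVAR", "SIFT", "MutationTaster",
            "LRT", "DANN", "REVEL", "PrimateAI-3D", "CADD Phred",
        ],
        "Indel": ["CADD Phred"],
        "mtDNA (SNV/indel)": ["APOGEE", "MitoTIP"],
    },
    "Conservation": {
        "SNV": ["SiPhy 29 Mammals", "GERP RS", "phastCons 100 Vertebrates"],
        "Indel": ["GERP RS"],
        "mtDNA (SNV/indel)": [],
    },
    "Splicing prediction": {
        "SNV": [
            "dbscSNV-RF", "dbscSNV-Ada", "SpliceAI DS_AG",
            "SpliceAI DS_AL", "SpliceAI DS_DG", "SpliceAI DS_DL",
        ],
        "Indel": [],
        "mtDNA (SNV/indel)": [],
    },
}


def tools_for(category: str, variant_type: str) -> List[str]:
    """Return the tools listed for a category and variant type (empty if none)."""
    return TOOLS_BY_VARIANT_TYPE.get(category, {}).get(variant_type, [])
```

For example, `tools_for("Conservation", "Indel")` returns `["GERP RS"]`, while any splicing lookup for indels or mtDNA variants returns an empty list.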
