Organization databases
Organization databases (DBs) are custom variant datasets that enhance interpretation by adding population-specific frequencies, detecting technical artifacts, and referencing curated variants.
Organization database types
By purpose
Historic DBs: Filter out variants common in the population of interest
Noise DBs: Detect technical artifacts
Curated DBs: Reference previously curated variants
Historic database
Serves as a private population frequency database, helping users evaluate variant frequencies within their population.
An internal historic database is created from cases processed in your organization’s Emedgene account.
Typically includes all unique cases at the time of creation but can be tailored to include only specific cases.
Variants that are common in the historic database are less likely to be pathogenic and can be filtered out.
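As a rough illustration of how a frequency filter of this kind works, the Python sketch below counts the fraction of cases that carry each variant and drops candidates above a cutoff. It is not Emedgene's implementation: the function names, the variant key format, the example cases, and the 50% cutoff are all hypothetical, and it uses the per-case carrier fraction as a simplified stand-in for allele frequency.

```python
from collections import Counter


def carrier_frequency(cases):
    """Return the fraction of cases carrying each variant.

    `cases` maps a case ID to the set of variant keys seen in that case
    (hypothetical structure, for illustration only).
    """
    counts = Counter()
    for variants in cases.values():
        counts.update(set(variants))  # count each variant once per case
    total = len(cases)
    return {variant: n / total for variant, n in counts.items()}


def filter_common(candidates, frequencies, max_frequency):
    """Keep candidates whose database frequency does not exceed the cutoff.

    Variants absent from the database are treated as frequency 0 and kept.
    """
    return [v for v in candidates if frequencies.get(v, 0.0) <= max_frequency]


# Toy historic database built from four made-up cases.
cases = {
    "case-1": {"chr1:100:A>G", "chr2:200:C>T"},
    "case-2": {"chr1:100:A>G"},
    "case-3": {"chr1:100:A>G", "chr3:300:G>A"},
    "case-4": {"chr1:100:A>G"},
}
frequencies = carrier_frequency(cases)

# Variants from a new case; the cutoff of 0.5 is arbitrary.
candidates = ["chr1:100:A>G", "chr2:200:C>T", "chr9:900:T>C"]
kept = filter_common(candidates, frequencies, max_frequency=0.5)
print(kept)
```

Here the variant seen in every case is removed, while the variant seen in one of four cases and the variant absent from the database are kept for review.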
Noise database
Serves as a quality control database to identify recurring artifacts introduced by the sequencing technique, sequencing platform, and analysis pipeline.
Typically, a noise database is composed of samples from unaffected individuals, such as healthy parents.
If only patient data is available, the database remains useful for filtering out high-frequency artifacts. However, filter rare variants with caution: a patient-only cohort can be enriched for disease-causing variants, so a true pathogenic variant may be excluded.
Sample size recommendations:
≥ 100 samples to filter out variants with > 5% allele frequency in the database
≥ 500 samples to filter out variants with > 1% allele frequency in the database
Multiple noise database instances can be maintained to account for different assays and calling methodologies.
Variants that recur frequently in the noise database can be filtered out as likely artifacts.
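The sample size recommendations above can be encoded as a small lookup. The sketch below is illustrative only: the function names are hypothetical, allele frequency is assumed to be counted over two alleles per sample (diploid, autosomal sites), and databases with fewer than 100 samples return no cutoff because no recommendation is given for them.

```python
from fractions import Fraction


def recommended_af_cutoff(n_samples):
    """Lowest allele-frequency cutoff the guidance supports for this database size."""
    if n_samples >= 500:
        return Fraction(1, 100)  # filter > 1% with at least 500 samples
    if n_samples >= 100:
        return Fraction(5, 100)  # filter > 5% with at least 100 samples
    return None  # no recommendation stated below 100 samples


def is_likely_artifact(allele_count, n_samples, ploidy=2):
    """Flag a variant whose database allele frequency exceeds the cutoff.

    Assumes `ploidy` alleles per sample; returns False when the database
    is too small for either recommended cutoff.
    """
    cutoff = recommended_af_cutoff(n_samples)
    if cutoff is None:
        return False
    allele_frequency = Fraction(allele_count, n_samples * ploidy)
    return allele_frequency > cutoff


print(is_likely_artifact(allele_count=11, n_samples=100))
print(is_likely_artifact(allele_count=11, n_samples=500))
```

Under the two-alleles-per-sample assumption, both recommendations correspond to the same absolute count: 5% of 200 alleles and 1% of 1,000 alleles are each 10 allele observations.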
Curated database
Serves as a reference of previously curated variants.
It is a static database of curated variants, implemented upon request. Note: This is not the same as the variants found in the dynamic Curate database.
Filtering by known variants from your curated databases helps pinpoint significant variants, keeps interpretation consistent, and speeds it up.
By origin
Internal DBs: Built automatically from cases processed within your organization’s Emedgene account. Note: Only historic and noise databases can be internal.
External DBs: Created by the organization from other sources, such as:
Cases analyzed with different software
Research cohorts or legacy data
Publicly available datasets
By included variant types
SNV DBs: Store single nucleotide variants (SNVs)
CNV DBs: Store copy number variants (CNVs), insertions > 50 bp (INS), short tandem repeats (STRs), and regions of homozygosity (ROH)