The AI-powered Emedgene platform utilizes machine learning throughout the analysis and interpretation workflow to deliver the fastest time from genomic data to decisions. We apply machine learning models that retrieve evidence-backed answers and provide exceptional decision support.
Using automated interpretation algorithms, Emedgene generates an accurate shortlist of up to 10 potential causative variants. In a joint study with Baylor Genetics of 180 solved cases, the algorithm successfully solved 96% of cases. See Meng et al., Genetics in Medicine, 2023, for more details.
The platform is not a black box, and overlays a layer of explainable AI (XAI), presenting supporting evidence from the literature and databases which significantly reduces the time to interpret a case.
The algorithms use a proprietary Emedgene knowledge graph, which incorporates information extracted from the literature with Natural Language Processing as well as from public databases, and is updated monthly.
Dozens of additional algorithms are incorporated throughout the workflow.
Overall, the system combines AI in a highly optimized and customizable workbench, in order to automate the most time-intensive aspects of genomic analysis and research.
Welcome to Emedgene, where we unlock genomic insights for hereditary disease and streamline your tertiary analysis workflows.
So you've signed in and can't wait to get started? Here we will guide you through the platform architecture, case creation, and results review. You can dive a bit deeper by following the links and exploring manuals for the platform's applications:
Analyze—Genomic analysis workbench, where you can accession, interpret, curate and report on your cases, while also efficiently managing the lab workflow
Curate—A repository for all of your organizational curated knowledge
The platform is operated from the top navigation panel.
By clicking on the corresponding buttons, you can enter:
Cases tab
Add new case page
Help dropdown menu
Settings dropdown menu
To enter the Add new case flow, click on the namesake button on the top navigation panel. Here:
Select file type
Upload files
Create a family tree
Annotate each sample with clinical information
Specify analysis details
Launch the analysis!
Select a case to review on the Cases tab. You'll be directed to the Individual case page that:
Showcases an AI-curated shortlist of variants suggested to be checked first, namely Most Likely Candidates and Candidates
Provides numerous customizable filters to help you explore the total list of genetic variants by yourself
Documents all the case-related information like Case status, sample quality metrics, and versions of all the resources used during case analysis
Investigate the evidence on the Variant page and assign appropriate tags to the variants of interest.
When you're ready to finalize the case, indicate the end result of the analysis and variants to be reported in the Case interpretation widget.
The Emedgene platform is divided into two applications:
Analyze—genomic analysis workbench
Curate—the knowledge management system
Go to the nine-dot app launcher icon located on the top navigation panel and select Curate from the dropdown menu.
Go to the nine-dot app launcher icon located on the Curate navigation panel and select Analyze from the dropdown menu.
The Dashboard tab depicts an overview of the user activity on the Emedgene platform and provides a glance at key performance indicators for an organization.
Diagnostic Yield card presents the proportion of "solved" cases out of the total number of the organization's cases of the same type.
Status Diagram card displays the total number of the organization's submitted cases as well as the numbers of cases under each status.
Stale Cases card highlights the cases that are stuck at one of the intermediate stages of the analysis, and are not finalized.
Network Activities panel displays a timeline of activities performed by multiple users within the organization. This log includes activity like creating a case, verifying a filter preset, changing a Case status, generating a report, and more.
You can sort cases by Creation date, Due date, or Quality.
A. Hover over the column header and click the up or down arrow to sort in ascending or descending order
B. Alternatively, click the column name and select Sort ascending or Sort descending from the dropdown menu
The current sort direction is indicated by a single arrow icon next to the column name.
Click on the question mark icon of the top navigation panel to open the Help dropdown menu.
From there, you can access:
Help Center. Feeling curious? Dive right in.
Feature requests. Submit your ideas.
What's New. Stay updated with the latest release notes.
The Emedgene platform utilizes the Okta Identity Management solution to control user access. This improves user management, enhances access and authentication security, and allows organizations to implement single sign-on for their users.
The case status reflects the current stage of case processing, either by the Emedgene platform or a genomic analyst.
You can view the current status of a case in the following locations:
Status column in the Cases table – for a quick overview across multiple cases
Top bar of the individual case page – for immediate visibility while reviewing a case
Case details panel, under Case-related activities – to track status updates and history
There are both out-of-the-box statuses provided by the platform and the option to create custom statuses to suit your workflow.
In the Management tab under Settings, you can tailor how case statuses appear and function to better align with your workflow:
Create custom case statuses to reflect your team's specific processes
Remove unused statuses to keep the list clean and relevant. Only statuses that are not currently assigned to any case can be removed.
Rearrange the order of statuses by drag-and-drop to match your preferred case interpretation flow. The updated order is reflected in:
The Status field dropdown in the Cases table
The top bar of the individual case page
The Sequencing lab information section reports the sequencing run details indicated during case creation:
Lab
Instrument
Reagents
Kit type
Expected coverage
Protocol
The Case quality section summarizes the data quality of the case and highlights the results of validation checks:
Chromosome validation
Confirms that each chromosome with at least 100 SNVs in defined enrichment kit or coding regions includes at least one high-quality variant
gnomAD validation
Verifies that each chromosome with at least 100 SNVs in defined enrichment kit or coding regions includes at least one variant annotated with gnomAD
ClinVar validation
Ensures that each chromosome with at least 100 SNVs in defined enrichment kit or coding regions includes at least one variant annotated with ClinVar
AI Shortlist validation
Checks that at least one variant is tagged by the AI Shortlist.
This validation is not applicable if the gene list contains fewer than 50 genes
If your workgroup uses a higher threshold, it is reflected in the Gene list threshold field
mtDNA reference validation
Confirms that the rCRS reference is used for mitochondrial DNA
Coverage metrics for a target region defined by a QC BED file (or RefSeq coding regions if no kit is provided) included in the Sample quality section:
Average coverage
Average depth of coverage for a target region
% Bases with coverage >10x
Percentage of a target region that is covered at a minimum depth of 10x
% Bases with coverage >20x
Percentage of a target region that is covered at a minimum depth of 20x
Blue bars represent each of these parameters per sample, while a vertical line represents a general metric across all the samples of the same case type in the account.
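As a rough illustration of how the three coverage metrics relate to per-base depth, the sketch below computes them from a list of depths over a target region. The function name and the input format are assumptions made for this example, and "minimum depth of 10x" is read here as depth ≥ 10. The platform takes its values from the QC output of the pipeline, not from code like this.

```python
from statistics import mean


def coverage_metrics(depths):
    """Summarize per-base depths over a target region (hypothetical helper)."""
    if not depths:
        raise ValueError("the target region contains no bases")
    total = len(depths)
    return {
        # Mean depth across every base in the target region
        "average_coverage": mean(depths),
        # Share of bases at or above each depth threshold, as a percentage
        "pct_bases_10x": 100 * sum(d >= 10 for d in depths) / total,
        "pct_bases_20x": 100 * sum(d >= 20 for d in depths) / total,
    }


# Toy region of eight bases
metrics = coverage_metrics([5, 12, 25, 30, 8, 20, 40, 10])
print(metrics)
```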
Percentage of reads mapped to the reference sequence.
Blue bars represent each of these parameters per sample, while a vertical line represents a general metric across all the samples of the same case type in the account.
Sequencing error rate refers to the frequency at which incorrect base calls are made during the sequencing process.
Blue bars represent each of these parameters per sample, while a vertical line represents a general metric across all the samples of the same case type in the account.
The Quality status provides a quick assessment of array data reliability for each sample:
High
Call rate ≥ 0.99 and Log R dev ≤ 0.2
Low
If either condition is not met
N/A
If the QC file is not available
Use the Quality status to quickly screen whether a sample meets minimal QC thresholds before starting detailed interpretation.
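The Quality status rule above can be sketched as a small function. The function name is hypothetical, and using None to represent a missing QC file is an assumption of this example.

```python
def array_quality_status(call_rate, log_r_dev):
    """Return 'High', 'Low', or 'N/A' for an array sample (illustrative only).

    None for either argument stands for "QC file not available",
    which is an assumption of this sketch.
    """
    if call_rate is None or log_r_dev is None:
        return "N/A"
    # High requires both conditions; anything else is Low
    if call_rate >= 0.99 and log_r_dev <= 0.2:
        return "High"
    return "Low"


print(array_quality_status(0.995, 0.15))
print(array_quality_status(0.980, 0.15))
print(array_quality_status(None, None))
```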
The Call rate field displays the percentage of loci on the array for which a genotype call was successfully made.
Call rate is one of the key metrics used to determine array sample quality, alongside log R deviation.
A high call rate indicates a high-quality sample and successful genotyping. Low call rates can signify problems with the DNA sample (poor quality or quantity) or issues during the array processing.
Displayed to three decimal places.
The Log R Deviation (or Log R Ratio standard deviation) quantifies the variability of the signal intensity for each SNP marker on an array, i.e., the noise level.
Log R deviation is one of the key metrics used to determine array sample quality, alongside call rate.
Lower values indicate more consistent signal intensities. A high Log R Deviation can indicate a poor-quality sample or potential issues with CNV calling.
Displayed to three decimal places.
Variants that are most promising for solving the case. This list is limited to 10 top-scored variants but may include more if more than one variant is tagged per gene (suggesting compound heterozygosity). We can change the Most Likely Candidates number limit upon request.
Several dozen highly scored variants worth considering.
The ranking of variants by AI Shortlist considers:
SNVs
CNVs
SNV + CNV compound heterozygotes
SVs
mtDNA variants
STRs
The AI Shortlist rates variants based on predicted variant effects, alternative allele frequency, familial segregation pattern, phenotypic match, in silico predictions, and other relevant information from scientific papers and databases.
During the case review, you can untag variants selected by the AI Shortlist or manually tag ones not selected by the AI Shortlist.
The overall sample quality indicator provides a quick assessment of sequencing reliability for each sample.
Sample quality is evaluated using the following metrics:
Average depth of coverage
Mean coverage across the target regions
% bases covered >20x
Percentage of bases in the target regions covered at a depth greater than 20×, indicating reliable coverage
Error rate
Sequencing error rate. Reflects general sequencing accuracy
% mapped reads
Proportion of reads successfully mapped to the reference genome
Contamination check
Detects mixed or low-quality samples that may affect interpretation
These metrics give an overall confidence level for whether the sequencing data can support accurate variant interpretation.
The Sex validation column indicates whether the biological sex inferred from genomic data matches the sex information provided during case creation. This helps identify potential sample mix-ups or metadata errors before interpretation begins.
Sex validation results:
Pass
Reported sex matches the estimated sex
Fail
A mismatch was detected between reported and estimated sex.
N/A
QC file not available; validation could not be performed.
Sex validation is performed by comparing the observed homozygous/heterozygous genotype ratio on the X chromosome with the expected ratios:
<2 for females
>2 for males
Prerequisites:
Only high-quality SNVs from targeted regions—either kit-specific or RefSeq coding regions—are used for sex validation
A minimum of 50 variants is required to generate a reliable result. If this threshold is not met, sex validation cannot be performed, and no result is displayed
If the sex was marked as unknown during case creation, the system will display the predicted sex instead of a validation status.
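The comparison described above can be sketched as follows. All names are hypothetical. The sketch assumes that the 50-variant minimum applies to the combined number of homozygous and heterozygous X-chromosome SNVs, and that a ratio of exactly 2 yields no result, since the documentation does not cover that case.

```python
MIN_VARIANTS = 50   # minimum number of high-quality X-chromosome SNVs
RATIO_CUTOFF = 2.0  # homozygous/heterozygous ratio separating the sexes


def estimate_sex(hom_count, het_count):
    """Estimate sex from X-chromosome genotype counts (illustrative only)."""
    if hom_count + het_count < MIN_VARIANTS:
        return None  # too few variants for a reliable result
    if het_count == 0:
        return "Male"  # ratio is unbounded, so it exceeds the cutoff
    ratio = hom_count / het_count
    if ratio < RATIO_CUTOFF:
        return "Female"
    if ratio > RATIO_CUTOFF:
        return "Male"
    return None  # exactly at the cutoff: not defined in the documentation


def sex_validation(reported_sex, hom_count, het_count):
    """Return 'Pass', 'Fail', the predicted sex, or None if no result."""
    estimated = estimate_sex(hom_count, het_count)
    if estimated is None:
        return None
    if reported_sex == "Unknown":
        return estimated  # show the prediction instead of a status
    return "Pass" if reported_sex == estimated else "Fail"


print(sex_validation("Female", 40, 60))
print(sex_validation("Female", 95, 5))
print(sex_validation("Unknown", 95, 5))
```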
The Autosomal call rate field displays the percentage of autosomal loci on the array for which a genotype call was successfully made.
A high call rate indicates a high-quality sample and successful genotyping. Low call rates can signify problems with the DNA sample (poor quality or quantity) or issues during the array processing.
Displayed to three decimal places.
The Sex validation column indicates whether the biological sex inferred from genomic data matches the sex information provided during case creation. This helps identify potential sample mix-ups or metadata errors before interpretation begins.
Sex validation results:
Pass
Reported sex matches the estimated sex
Fail
A mismatch was detected between reported and estimated sex.
N/A
QC file not available; validation could not be performed.
If the sex was marked as unknown during case creation, the system will display the predicted sex instead of a validation status.
The top navigation panel serves as a guide to the platform. It includes:
Case search bar
Dashboard tab
Cases tab
Add new case button
Emedgene applications dropdown menu to switch between Analyze and Curate
Help dropdown menu under a question mark icon
Settings dropdown menu activated by clicking the username or profile picture
You can use the Case search bar in the top navigation panel to search for cases by Case ID or Proband ID.
A coverage BED file is used to calculate and determine quality control (QC) metrics for your case. This file defines the genomic regions that should meet coverage requirements during sequencing.
After selecting a coverage BED file, the available reference sequences for this kit will be displayed.
Specify details such as laboratory name, sequencing machine used, sequencing reagent kit, and expected coverage.
The Top bar in the Individual case page indicates the Case ID and current Case status.
Change the Case status
Reanalyze the case
Finalize the case and write interpretation notes
Preview the case report
Sample-level DRAGEN QC metric files for all samples in a case can be downloaded by clicking the download icon next to the Sample quality section title.
For NGS cases, the report includes coverage and mapping statistics.
For array cases, metrics include array QC values such as call rate, autosomal call rate, and Log R dev.
The Case details panel provides comprehensive information about a particular case.
The Case details panel is organized into three tabs:
Case info—displays technical, operational, and clinical information about the case
Family tree—shows a graphical pedigree and sample details for each family member
Activity—provides a timeline of all actions taken within the case for audit and collaboration
Click on the row of the case you want to view. A pop-up side Case details panel will appear on the right. To close the panel, click the X icon in the top right corner.
To expand the Case details panel, click the left-pointing arrow icon on the right edge of the screen. To collapse it, click the right-pointing arrow icon at the top left of the panel.
The Cases tab provides an overview of genomic sequencing cases submitted by the organization, as well as individual case details.
Cases table—displays a list of cases along with key details
Cases table navigation panel—enables customization of the table view, including grouping and filtering of cases
Case details panel—opens when a case is selected, providing additional information
The Family tree tab includes the following information:
Pedigree diagram. Pedigree legend can be found .
Sample details for each family member:
Phenotypes. For family members other than the test subject, phenotypes are categorized as:
Related—directly match one of the proband’s phenotypes
Unrelated—do not match any of the proband’s phenotypes
Medical Condition – Indicates whether the individual is considered Healthy or Affected in the case
Sex. Specified by the user
Age. Automatically calculated in years based on the provided date of birth
Maternal and Paternal —ethnic background of the proband’s parents
BAM file location. Shown where relevant
Click Fields
In the Fields menu, use the toggle switch next to each field name to show or hide columns based on your preferred view
In the Cases table, click the column title you want to hide
From the dropdown menu, select Hide column
You can reorder columns in three ways.
Hover over the column title
Click the six-dot icon that appears to the left of the title
Drag and drop the column
Click Fields in the Cases table navigation panel
In the Fields menu, hover over the field name
Click the six-dot icon that appears to the left of the field name
Drag and drop the field
Click the column header
From the dropdown menu, select Move left or Move right
Hover over the left or right border of the column header cell
When the resize cursor appears, click and drag the border to your desired width
You can filter cases using the following fields:
Case ID
Sample ID (Proband ID)
Status
Type
Label
Resolved: Resolved or Not resolved
Participants
Go to the Filters menu in the Cases table navigation panel
Under Field, select the field you want to filter by
Under List, select a value from the dropdown or manually enter one
Click Apply to activate the filter
To add another filter, click Add new under the active filter and repeat steps 1-4
Go to the Filters menu in the Cases table navigation panel
To remove a specific filter, click the X icon next to it
In the Cases table navigation panel, click the X icon next to the Filters menu
To organize cases by status, navigate to the Cases table, click on Group on the navigation panel, and select Status. To remove the grouping, select None.
In order to prevent accidental data loss, deleting cases in Emedgene includes a staging step before permanent case deletion.
Authorized users can permanently delete all items in the trash. To do this:
Click Empty trash on the Cases table navigation panel.
Review the warning message showing the number of cases pending deletion.
Confirm to permanently delete all cases in the trash.
Warning: Emptying the trash cannot be undone. Review the cases pending deletion before proceeding!
After deletion is confirmed:
All cases marked Move to trash are permanently removed
An activity entry is recorded
Email notifications are sent to users who have opted in
Add new case page > Family tree screen > Create family tree panel
Build a pedigree via the visual tool.
Ideally, the proband selected for case analysis is affected and has one or more disease phenotypes.
You can add a Father, a Mother, a Sibling, or a Child to any family member, starting with the Proband. To do this, choose their icon, then click on the Add family member button in the bottom right corner of the pedigree builder to select a family member.
More information about the pedigree symbols can be found .
To delete a family member, choose their icon, then click on the Delete Subject button in the top right corner of the Add patient information panel.
While adding a new case, you will build a pedigree and annotate each of the samples with the data required for analysis (Add new case page > Family tree screen).
After the case has been created, the family tree is available in the Case details panel (right-hand panel of the Cases page).
1. Icon fill color in other pedigree members indicates the presence or absence of the proband's phenotypes in that individual (regardless of any additional unrelated phenotypes):
2. Icon color intensity denotes whether sample files have been uploaded for the particular individual:
3. Icon line type indicates whether the sample is considered or excluded during analysis (relevant to samples with uploaded files only):
Add new case page > Family tree screen > Add patient information panel > Patient info section
Note: The fields marked with (*) are mandatory.
Options: Male, Female, Unknown.
Indicates the family relationship of a subject to the Proband automatically inferred from the pedigree. Options: Father, Mother, Sibling, Child, Other.
Expected format: mm/dd/yyyy.
Mark the checkbox if you want to exclude the sample from the AI Shortlist analysis and Inheritance filters while preserving genotype data.
If a sample shares some phenotypes with the Proband, you can copy them by checking this box. Proband's phenotypes will appear in a newly created Related Phenotypes section. To remove any of the proband's phenotypes not observed in a current individual, click the ☒ button next to the HPO term in the Related Phenotypes section.
Phenotypes not shared with the Proband. They can be added one by one or in batch.
Please follow the steps described below for each phenotype:
Enter an HPO term (e.g., Hypoplasia of the ulna), an HPO ID (e.g., HP:0003022), or a descriptive phenotype name (e.g., Underdeveloped ulna) in the search box;
Select a matching term from a dropdown menu and press Complete after you've added all the terms.
Paste a list of comma-separated HPO terms or HPO IDs in the search box and press Complete.
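If you prepare a batch list programmatically, a small pre-check can catch malformed entries before you paste them. The sketch below splits a comma-separated string and separates HPO IDs (HP: followed by seven digits) from free-text terms. The function name is hypothetical, and the platform's own parsing may behave differently, for example for term names that themselves contain commas.

```python
import re

# HPO identifiers have the form HP: followed by seven digits, e.g. HP:0003022
HPO_ID = re.compile(r"^HP:\d{7}$")


def split_batch_phenotypes(text):
    """Split a comma-separated batch into (hpo_ids, terms); illustrative only."""
    ids, terms = [], []
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue  # skip empty entries left by stray commas
        if HPO_ID.match(item):
            ids.append(item)
        else:
            terms.append(item)
    return ids, terms


ids, terms = split_batch_phenotypes(
    "HP:0003022, Hypoplasia of the ulna, ,HP:0001250"
)
print(ids)
print(terms)
```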
Add new case page > Case info screen > Select case type
Select the case type in order to define the proper analysis of your case.
You can use a custom region of interest (ROI) BED file to limit analysis results to variants within the designated regions. An ROI BED file determines which genomic regions are included in the variant analysis.
If no custom ROI BED is selected, the system uses the .
You can select any region of interest, regardless of the case type.
When selecting a Custom BED as your region of interest, you must select a specific BED file that is already configured in your organization.
Add new case page > Case info screen > Select genes list
You can limit analysis to a gene list in the platform while creating a case. Choose between:
No limitation of the analysis.
Select one of the previously added gene lists from a dropdown list.
Generate a new virtual panel: add a List title and then add all the gene symbols one by one or in a batch.
A new gene list can be created by combining configured gene lists and/or individual genes. Each gene list can be configured to contain up to 10,000 genes.
Note: Please use the up-to-date gene symbols approved by the HUGO Gene Nomenclature Committee (HGNC). When adding gene symbols in Batch mode, genes that do not comply with HGNC standards are automatically excluded from the gene list. These genes appear for 3 seconds in a black error box at the bottom of the screen.
For each gene please follow the steps described below: Enter a gene symbol in the search box in the right panel (Candidate Genes) and select a matching symbol from a dropdown menu.
After selecting batch mode, paste a list of comma-separated gene symbols in the search box in the right panel (Candidate Genes).
You can choose between two different modes of a gene list feature:
Selected by default.
AI Shortlist is limited to the selected gene panel, no variants in other genes are considered in the results. If this in silico panel is used for analysis of exome or genome data, the gene restriction may be lifted during manual analysis to "open-up" the entire exome or genome for analysis.
Analysis is performed for variants in all the genes. Variants in the targeted genes get upgraded scores during prioritization by the AI Shortlist algorithm.
You have the flexibility to manage Case labels at any time: create, add, or remove them directly in the .
Adding labels to a case lets you quickly mark cases for specific purposes and easily filter a subset of cases on the Cases page.
Download and install node js platform via
Minimum version required: 16
Upgrade existing installation: nvm install --lts
Download the batch case create script.
Replace my-domain with your Emedgene domain.
Illumina cloud: my-domain.emg.illumina.com
Legacy Emedgene cloud: my-domain.emedgene.com
Download the CSV template file.
Edit the downloaded batchCases.csv file. See for more details.
Run the batch case creator script with Node.js using the command below.
Replace my-domain with your Emedgene domain and my-email with your user email.
A prompt for your Emedgene password will appear. Enter the password and press Enter.
In case of validation errors in the input CSV, an output CSV called batchCases_results.csv will be created in the same location with detailed error results.
The -l flag creates a log file in the same location.
More information can be found by running
The user can enter a specific case from the by clicking Full details in the corresponding row of the case table.
—displays the Case ID and current Case status, and includes Case interpretation, Edit case info, and Report preview buttons
—highlights a shortlist of variants, suggested to be reviewed first - Most Likely Candidates and Candidates
—illustrates quality metrics for the sequenced samples
—provides an interactive overview of genomic structure, ideal for analyzing CNV and ROH/LOH events
—provides numerous customizable filters to help you explore the total list of genetic variants in compliance with your organization's standard case review process. You can export shortlisted variants in .xlsx format
—documents versions of all the resources used during case analysis
In the top bar of the individual case page, click the dropdown icon next to the current case status
From the dropdown menu, select the new status you want to apply
In the Cases table, click the current case status of the relevant case
From the dropdown menu, select the new status you want to apply
To select variants with a particular tag, use the Filter candidates dropdown menu in the top right corner. You can select from Most Likely, Candidate, Incidental, Carrier, Not Reviewed, or any custom tags used in your organization.
For each variant on the Candidates tab, you can explore the suggested diagnosis, gene symbol, main variant details, and variant tag.
When a variant is found in a gene with no known association with a disease, the possible diagnosis cannot be indicated. Such variants are displayed under the Gene of Unknown Significance title.
All the relevant variants fitting a compound heterozygous mode of inheritance are presented together. This refers to both confirmed and assumed compound heterozygosity (cases with at least one parent and singleton cases, respectively).
If you want to inspect the complete variant information, click on the variant bar to continue to the Variant page. You can visualize evidence in text or graphical format (click on the interactive text in the top left corner, Show evidence as text or Show evidence graph, to toggle between the two).
The Lab tab shows sample and case-level quality metrics so you can check data reliability before starting interpretation.
—highlights the key quality indicators, with more details provided in the subsequent sections
—reports sequencing run technicalities
—summarizes the data quality of the case
—highlights quality metrics for each sample
—displays the results of the relationship validation for each pair of samples in a family tree
—highlights regions that may not have been adequately sequenced
Summary dashboard provides a quick overview of key quality indicators at both the case and sample levels.
Displays the overall case quality status
Reflects sample quality status
Evaluation kit
Specifies the QC BED kit used to evaluate coverage depth and breadth. If no kit is specified at analysis launch, NCBI RefSeqGene is used as the default reference
Custom gene coverage
Indicates whether the coverage of genes in the selected panel meets the expected threshold, as defined by the QC BED
Displays the results of relationship validation, confirming whether the submitted pedigree aligns with genetic data
The Ploidy column provides results from the DRAGEN Ploidy Estimator, which is designed to detect aneuploidies and determine the sex karyotype in whole genome cases.
Ploidy estimation results:
Pass
All autosomes fall within the expected ploidy range.
Fail
At least one autosome shows a median score outside the expected thresholds (below 0.9 or above 1.1).
When you hover over a failed result, the system displays which chromosomes are problematic.
N/A
Case type is not Whole genome or QC file not available; validation could not be performed.
The ploidy calculation uses values from the *.ploidy_estimation_metrics.csv file.
Failed results do not confirm clinical abnormalities. They only indicate a deviation in copy number estimation and should be reviewed in context of other QC metrics and visualization.
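The Pass/Fail rule above can be sketched as follows. The function name and the dictionary input are assumptions of this example. Reading the values out of the *.ploidy_estimation_metrics.csv file is deliberately left out, because the file layout is not described here.

```python
LOW, HIGH = 0.9, 1.1  # expected range for an autosome's median score


def ploidy_status(autosome_medians):
    """Return (status, failed_chromosomes); illustrative only.

    autosome_medians maps an autosome name to its median score.
    An empty or missing mapping stands for "QC file not available".
    """
    if not autosome_medians:
        return "N/A", []
    # A chromosome fails when its score falls outside the expected range
    failed = [
        chrom for chrom, score in autosome_medians.items()
        if score < LOW or score > HIGH
    ]
    return ("Fail" if failed else "Pass"), failed


print(ploidy_status({"chr1": 1.00, "chr2": 0.95}))
print(ploidy_status({"chr1": 1.00, "chr21": 1.48}))
```

Returning the list of failed chromosomes mirrors the hover tooltip, which names the problematic chromosomes for a failed result.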
The Contamination column reports whether a sample shows signs of DNA contamination, helping ensure data reliability before interpretation.
Contamination is detected using calculations, which estimate the proportion of reads that do not match the expected genotype. This estimate is based on the idr_baf score.
idr_baf stands for the interdecile range of the B-allele frequency—calculated as the difference between the 90th and 10th percentiles of the distribution of alt / (ref + alt) ratios across all variant sites.
A larger idr_baf value indicates greater variability in allele balance, which may suggest sample contamination, particularly from another human DNA sample.
Contamination check results:
N/A
No data is available (older cases or when idr_baf = 0.000).
No
No contamination detected (idr_baf < 0.200).
Unlikely
Possible contamination, but evidence is weak (0.200 ≤ idr_baf < 0.241).
Likely
Contamination suspected (0.241 ≤ idr_baf < 0.300).
Yes
Contamination confirmed (idr_baf ≥ 0.300).
Hover over the value to display a tooltip showing the HET ratio (proportion of sites that are heterozygous) and the HET count (number of heterozygote calls in sampled sites).
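To make the thresholds concrete, the sketch below computes an interdecile range from a list of alt / (ref + alt) ratios and maps an idr_baf value to the result categories listed above. The function names are hypothetical, and the percentile interpolation method is an assumption, since the exact calculation used by the pipeline is not specified here.

```python
from statistics import quantiles


def interdecile_range(bafs):
    """90th minus 10th percentile of B-allele frequencies (illustrative only)."""
    # n=10 yields nine cut points; the first is the 10th percentile
    # and the last is the 90th percentile
    deciles = quantiles(bafs, n=10, method="inclusive")
    return deciles[-1] - deciles[0]


def contamination_status(idr_baf):
    """Map an idr_baf score to the documented contamination result."""
    if idr_baf is None or idr_baf == 0.0:
        return "N/A"       # no data available
    if idr_baf < 0.200:
        return "No"
    if idr_baf < 0.241:
        return "Unlikely"
    if idr_baf < 0.300:
        return "Likely"
    return "Yes"


for score in (0.0, 0.150, 0.200, 0.241, 0.300):
    print(score, contamination_status(score))
```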
Tips:
Always review contamination results before starting interpretation to rule out technical issues that could explain unexpected variant calls.
Cross-check contamination results with other QC metrics (e.g., depth, ploidy, sex validation) for a more complete picture of sample quality.
For family cases, check that no contamination is flagged before relying on inheritance-based filters.
Warnings:
Panels may be less reliable: For targeted panels, contamination estimates may be inaccurate due to the limited number of variants available for calculation. Use caution and cross-check with other QC metrics when interpreting these results.
Do not use in isolation: A "Likely" or "Yes" result should not immediately be considered diagnostic — review case setup, sequencing quality, and sample handling first.
When available, a link appears below the sample name in the Sample quality section of the Lab tab. Clicking the link opens the detailed quality control metrics report in a new browser tab. This integration allows users to quickly assess sequencing quality and confidently interpret results—without leaving the Emedgene interface.
The DRAGEN QC report is generated by the Illumina DRAGEN Bio-IT Platform and covers the entire analysis workflow, from raw sequencing reads to variant calls.
Interactive HTML summary: A visual summary that includes interactive plots of key quality metrics. This report can be opened from the Sample quality section of the Lab tab.
CSV metric files: A set of detailed CSV files containing sample-level quality metrics. These files support in-depth review and documentation.
curl https://my-domain.emg.illumina.com/v2/js/batchCasesCreator.js --output batchCasesCreator.js
node batchCasesCreator.js saveTemplateFile
node batchCasesCreator.js create -h https://my-domain.emg.illumina.com -c batchCases.csv -u my-email -l
node batchCasesCreator.js --help
node batchCasesCreator.js create --help








* Filled - the individual is affected by all of the proband's phenotypes;
* Half-filled - the individual is affected by some of the proband's phenotypes;
* Empty - the individual is not affected by any of the proband's phenotypes.

* Full color - the sample has files loaded in the case;
* Faded color - no sample files are available.

* Solid - the sample is included in the analysis;
* Dashed - the sample is ignored by Inheritance filters and the AI Shortlist algorithm, but you can still explore its genotypes.
























The Cases table navigation panel provides several tools to help you customize your table view and manage cases. It includes the following components:
Filters menu: Use this to narrow down the list of cases.
Group by menu: Organize your cases by case status.
Fields menu: Choose which columns are visible in the table and define their order.
Empty trash button: Permanently delete cases currently in the trash. Use with caution, as this action cannot be undone.
The Case info tab includes the following information:
Case ID—a unique identifier assigned to each case by Emedgene, formatted as EMGXXXXXXXXX
Case type—the type of analysis performed:
Whole Genome
Exome
Custom Panel
Array
Sample type—the format of the sample files used in the case:
FASTQ: *.fastq.gz, *.fq.gz, *.bam, *.cram.
Project VCF: *.pvcf, *.vcf, *.vcf.gz, *.pvcf.gz
VCF: *.vcf, *.vcf.gz, *.targeted.json, *.gt_sample_summary.json
Gene list—defines whether gene list was used during analysis and how it was applied:
All genes—AI Shortlist was neither confined to nor prioritized a specific gene list
Virtual panel (In silico panel)—AI Shortlist was limited to only the genes in the gene list
Boosted gene list—AI Shortlist analyzed variants in all genes, but variants in the gene list were given higher priority
Analysis type:
If field is not present—carrier analysis was not performed
Carrier—carrier analysis was performed for the selected gene list
Human reference—the genome reference used during case analysis
Ordered by—the user who created the case and the case creation date
Signed by—the user who finalized the case
Related cases—the Case IDs of other cases that share one or more samples with the selected case
Due Date—the user-defined deadline for finalizing the case. To enter or edit the Due Date, click the calendar icon in the Due Date section
Participants—Users involved in the case, whether in submission, analysis, finalization, or those subscribed to updates. To receive email notifications, click the Subscribe icon. To unsubscribe, hover over your avatar and click the X icon
Patient Information—basic demographic details:
Sex. Specified by the user
Age. Automatically calculated in years based on the provided date of birth
Clinical Information:
Proband phenotypes—HPO terms used to describe clinical findings in the proband
Suspected disease—if provided, includes the suspected condition, penetrance (%), and severity (mild, moderate, severe, or profound)
Maternal and Paternal ethnicity—ethnic background of the proband’s parents
Parental consanguinity—indicates whether the parents are related by blood
Report secondary findings—specifies whether secondary findings analysis was requested
Clinical note—free-text notes provided at the time of launching the analysis
Additional case information can be added using custom fields, either via the API or by including extra columns in your CSV during batch case creation. This allows you to extend the case details panel with project-specific data. To enable this feature or learn more, please contact [email protected].
Log in to your Illumina private domain via URL in the following format: yourcompanyname.login.illumina.com. This opens the Connected Platform Home
In the left navigation panel: User > API keys
Name the key
Choose one of the following options:
A. Grant access to all workgroups across the domain: If your domain includes multiple workgroups and you want the API key to apply universally, select "All current and future Workgroups and roles (Global API Key)".
B. Grant access to specific workgroups: Select one or more workgroups from the list. For each selected workgroup, assign the following application roles:
Emedgene Has Access
Illumina Connected Analytics - Has Access
Platform-home Workgroup Admin
Click Generate. Once the API key is generated, copy it to your clipboard or download it as a file.
⚠️ Important: The API key is only accessible while the API Key Generated popup window is open. After closing the window, the key cannot be retrieved. If you didn’t copy or download it, you’ll need to generate a new key.
Log into your Emedgene domain and go to the workgroup where you want to link ICA storage
Click on the user avatar and select Settings from the dropdown
Select the Management tab
In the Storage card, click Add Storage
Select Illumina Connected Analytics (not Illumina Connected Analytics V1!) from the Storage type dropdown
Fill the storage credentials:
"Api_key"—the API key generated before
"Project"—the name of the Project in ICA that contains and will contain the data you want to connect
"Path"—the folder within the project where the data is located. This can be used to restrict the user to only be able to access data within the specified folder. Using only “ / “ will allow all folders within your ICA project
Click Add Storage
The CNV overall ploidy field displays the ploidy value extracted from the CNV VCF header. If no CNV VCF file is provided, "N/A" is displayed.
Displayed to three decimal places.
The value is shown as is. The system does not validate or flag abnormal ploidy values. Interpret ploidy in context.
Run a FASTQ case in Emedgene.
Because DRAGEN analysis is integrated into the Emedgene secondary analysis pipeline, QC reports are generated automatically in the system.
Run DRAGEN analysis externally.
Prepare a TAR archive containing DRAGEN QC metrics files.
Upload the TAR archive and the sample VCF file to Emedgene.
This workflow is supported for batch case upload (via UI or CLI) and for API-based case creation.
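The archive preparation step can be sketched with Python's standard tarfile module. This is a minimal example under stated assumptions: the function name and directory layout are hypothetical, and the source does not specify which DRAGEN files the archive must contain, whether it may be compressed, or what internal folder structure Emedgene expects, so check those requirements before relying on it.

```python
import tarfile
from pathlib import Path


def build_qc_tar(metrics_dir, tar_path):
    """Pack every regular file under metrics_dir into an uncompressed TAR.

    Returns the archive member names (paths relative to metrics_dir).
    tar_path should be located outside metrics_dir.
    """
    root = Path(metrics_dir)
    files = sorted(p for p in root.rglob("*") if p.is_file())
    names = [p.relative_to(root).as_posix() for p in files]
    # Mode "w" writes a plain .tar without compression (an assumption).
    with tarfile.open(str(tar_path), "w") as archive:
        for path, name in zip(files, names):
            archive.add(str(path), arcname=name)
    return names
```

Calling build_qc_tar("dragen_output/sample1", "sample1_qc.tar") (both paths are placeholders) writes the archive and returns the list of packed files, which is convenient for a quick sanity check before upload.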
Array cases start from VCF input files. DRAGEN QC for array cases is supported on Emedgene v100.39.0 and later.
Run DRAGEN analysis externally using DRAGEN Array v1.3.0.
Upload the .annotated_cyto.json DRAGEN QC metrics file, the sample VCF file, and the .gt_sample_summary.json file to Emedgene.
This workflow is supported for batch case upload (via UI or CLI) and for API-based case creation.
To directly import files from your own storage, link it to an organization's storage in Emedgene.
Click on the user initials or profile picture at the rightmost corner of the top navigation panel and select Settings
Select the Management tab and proceed to Storage card that lists currently linked storages.
To add a new storage:
Click Add Storage
Choose a storage type from:
Azure Data Lake
Azure Blob
AWS S3
File Transfer Protocol (FTP)
Google Cloud
Secure File Transfer Protocol (SFTP)
Illumina BaseSpace (BSSH)
Illumina Connected Analytics (ICA)
Fill in the required credentials
Click Add storage
Check the connection to confirm that the storage is successfully linked.
To do this, find the storage in the list and check the cloud icon status:
If it's green, the connection is set correctly
If it's red and strikethrough, something went wrong. Hover over the icon to see details
Click Manage to the right of the storage details.
Click Delete to the right of the storage details.
If data is deleted or moved from the customer's storage, it might adversely affect the case. To learn more about possible consequences, check out this table:
Add new case page > Family tree screen > Add patient information panel > Add sample section
You can choose one of the following options:
Existing sample: Pick one of the samples already loaded on the platform
Upload new sample: Upload files from your PC and enter sample name
Choose from storage: Choose files from your cloud storage and enter sample name
No sample: Postpone uploading files but proceed with case creation or skip uploading files for family members other than Proband
The Add New Case flow does not validate that sample IDs are unique or that input files are uncorrupted. Please ensure sample IDs are unique and that input files are valid before creating the case.
A case won't run if Proband sample files are missing. However, sample files are not mandatory for the rest of the family members (although highly recommended).
When choosing an existing file path, the samples used may be cached from the original run. For a top-up flow please use a new file path.
When you load sample files from your PC or choose them from storage, and there is more than one file per sample, make sure that all the necessary files are selected together in the upload pop-up. You may only select one file type per case (for example, you may not select both a .vcf and a .bam at the same time).
While creating a new case, you can choose whether to include secondary findings for the proband. This option is available on the Family Tree screen → Create family tree panel → Show Secondary Findings.
Secondary findings are genetic variants that are not related to the primary indication for testing but may have important medical implications. These variants are automatically assigned the Incidental tag when they meet American College of Medical Genetics and Genomics (ACMG)-defined criteria for reportable secondary findings.
A variant is automatically tagged as a secondary finding if it meets all of the following criteria:
Classification: Previously classified as pathogenic or likely pathogenic in ClinVar or Curate variant databases
Zygosity: Heterozygous or homozygous (only homozygous for the HFE gene)
Allele frequency: Less than 5%
Read depth: 10× or higher
Variant quality: Any value except LOW
Affected gene: Listed in the ACMG SF v3.2 or v3.3 medically actionable gene list for reporting secondary findings in clinical exome and genome sequencing (PMID: 37347242, 40568962)
ACMG SF v3.2 gene list:
ACTA2, ACTC1, ACVRL1, APC, APOB, ATP7B, BAG3, BMPR1A, BRCA1, BRCA2, BTD, CACNA1S, CALM1, CALM2, CALM3, CASQ2, COL3A1, DES, DSC2, DSG2, DSP, ENG, FBN1, FLNC, GAA, GLA, HFE, HNF1A, KCNH2, KCNQ1, LDLR, LMNA, MAX, MEN1, MLH1, MSH2, MSH6, MUTYH, MYBPC3, MYH11, MYH7, MYL2, MYL3, NF2, OTC, PALB2, PCSK9, PKP2, PMS2, PRKAG2, PTEN, RB1, RBM20, RET, RPE65, RYR1, RYR2, SCN5A, SDHAF2, SDHB, SDHC, SDHD, SMAD3, SMAD4, STK11, TGFBR1, TGFBR2, TMEM127, TMEM43, TNNC1, TNNI3, TNNT2, TP53, TPM1, TRDN, TSC1, TSC2, TTN, TTR, VHL, WT1.
ACMG SF v3.3 gene list: Includes all v3.2 genes plus three newly added genes: PLN, ABCD1, and CYP27A1. This brings the total to 84 reportable genes.
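The tagging criteria listed above can be expressed as a single predicate. The Python sketch below is only an illustration of that logic, not the platform's code: the Variant fields, their value spellings (for example "heterozygous" or "LOW"), and the function name are assumptions, and the classification is assumed to have already been looked up in ClinVar or Curate.

```python
from dataclasses import dataclass

ACMG_SF_V3_2 = frozenset("""
ACTA2 ACTC1 ACVRL1 APC APOB ATP7B BAG3 BMPR1A BRCA1 BRCA2 BTD CACNA1S CALM1
CALM2 CALM3 CASQ2 COL3A1 DES DSC2 DSG2 DSP ENG FBN1 FLNC GAA GLA HFE HNF1A
KCNH2 KCNQ1 LDLR LMNA MAX MEN1 MLH1 MSH2 MSH6 MUTYH MYBPC3 MYH11 MYH7 MYL2
MYL3 NF2 OTC PALB2 PCSK9 PKP2 PMS2 PRKAG2 PTEN RB1 RBM20 RET RPE65 RYR1 RYR2
SCN5A SDHAF2 SDHB SDHC SDHD SMAD3 SMAD4 STK11 TGFBR1 TGFBR2 TMEM127 TMEM43
TNNC1 TNNI3 TNNT2 TP53 TPM1 TRDN TSC1 TSC2 TTN TTR VHL WT1
""".split())
ACMG_SF_V3_3 = ACMG_SF_V3_2 | {"PLN", "ABCD1", "CYP27A1"}


@dataclass
class Variant:  # hypothetical record layout for this example
    gene: str
    classification: str      # prior ClinVar/Curate classification
    zygosity: str            # "heterozygous" or "homozygous"
    allele_frequency: float  # fraction, so 0.05 means 5%
    read_depth: int
    quality: str             # any label; "LOW" fails the check


def is_secondary_finding(v, gene_list=ACMG_SF_V3_3):
    """Return True only if the variant meets all listed criteria."""
    if v.classification.lower() not in {"pathogenic", "likely pathogenic"}:
        return False
    if v.gene == "HFE":
        zygosity_ok = v.zygosity == "homozygous"  # HFE: homozygous only
    else:
        zygosity_ok = v.zygosity in {"heterozygous", "homozygous"}
    return (
        zygosity_ok
        and v.allele_frequency < 0.05
        and v.read_depth >= 10
        and v.quality.upper() != "LOW"
        and v.gene in gene_list
    )
```

Passing gene_list=ACMG_SF_V3_2 restricts the check to the 81-gene v3.2 list, so a PLN variant would pass only with the default v3.3 list.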
Tips:
Enable secondary findings when clinically relevant — this ensures variants in actionable genes are surfaced automatically.
Always review findings in the context of patient consent and your institution’s reporting policies.
Warnings:
Secondary findings are limited to the ACMG-defined gene lists. Variants outside these lists will not be tagged automatically.
Only variants with adequate sequencing depth and quality are tagged. Low-quality calls may require manual review.
You can implement different combinations of Presets for different case types (for example, Presets for exome may differ from Presets for genome), as defined by your SOPs, to further streamline case review.
The combination of Presets is referred to as a Preset group.
Preset group selection is available in the Case info screen of the Add new case flow while creating or editing a case.
To manage filter Preset groups, navigate to Settings > Organization Settings > Lab Workflow:
From here, you can create (from Presets or from a JSON file), edit, hide/unhide, and download Preset groups as needed.
Here, you can set a Preset group as default, so it will be used unless another Preset group is selected during case creation.
The Sample quality section in the Lab tab gives you a quick view of the reliability of sequencing or array data used in your case. The metrics displayed in the Sample quality section and their underlying calculation vary depending on the case type:
(overall)
:
Average coverage
% Bases with coverage >10x
% Bases with coverage >20x
(overall)
For eligible cases, users can review the DRAGEN QC results: the interactive DRAGEN QC report and the DRAGEN QC metric files.
Columns can be reordered by drag-and-drop
Any column can be shown or hidden by selecting the columns in the Show/Hide columns menu in the top right corner of the page
You can choose between comfort and compact view by clicking the corresponding button
All modifications are automatically saved for each individual user and retained until new changes are made.
Whenever an organization is created, we automatically allocate bucket folders in AWS S3 cloud storage to it:
Path for upload
Folder intended to store input case files.
Authorized user has view and upload privileges.
Path for download
This folder contains a partially annotated (excluding results of proprietary algorithms) VCF file per case.
Authorized user has view and download privileges.
Path for DRAGEN output
This folder contains DRAGEN output files.
Authorized user has view and download privileges.
To get access to your upload, download, and DRAGEN output folders, you need a key pair consisting of an access key ID and a secret access key. Creating, deactivating, activating, and deleting credentials is available to users with the Manager and Manage S3 Credentials roles.
You can create and use up to two dynamic access keys at the same time.
When you require technical support, you have the option to generate a new key pair specifically for the troubleshooting process. After the issue has been resolved, you can delete the credentials to ensure security of your system.
The newly generated credentials will only be saved in AWS Identity and Access Management (IAM) and not in our database.
In Settings > Management > S3 Credentials, click on Create Access Key.
You can retrieve the secret access key only when you initially create the key pair. If you lose it, you have to create a new key pair. To immediately copy the secret access key to a secure location, use the Copy to clipboard button.
In Settings > Management > S3 Credentials, click on Deactivate in the corresponding key pair card.
In Settings > Management > S3 Credentials, click on Activate in the corresponding key pair card.
In Settings > Management > S3 Credentials, click on Delete in the corresponding key pair card. Only inactive key pairs can be deleted.
The ethnicities of the proband's mother and father can be specified during the process of UI or API case creation. Please refer to the following list of supported ethnicities.
If you're comfortable with scripting and API usage, you can upload multiple cases at once using those methods. But if you're not a technical expert, don't worry. There is a user-friendly alternative available—importing a CSV file directly through the user interface.
Please follow the steps as described below.
Caution: Please note that refreshing or leaving the page, exiting the Add new case tab, or power failure of your computer before you've completed a batch case upload will result in loss of the case creation progress.
CSV (Comma-Separated Values) is a simple file format used to store data in tabular form. A row represents a sample, and a column represents a data field.
Start by downloading a CSV template, which contains an example line with mandatory and non-mandatory fields, from the Add new case page set to Batch mode. Fill the file with your data, following the template's format.
Click on the + New case button.
Click on the Switch to batch button in the top right corner. You'll be directed to the Select file page of the Batch upload flow. Note: Here you can download a CSV template in the valid format.
Drag and drop a CSV file into the box or upload it from the file explorer. Wait for file upload and validation to finish.
After validation is complete, you will be directed to the Batch validation page. It features validation results details for you to review:
File name
Number of rows in the file
Number of cases to be created
Number of errors found
Status message
If no errors were detected, a success message will be displayed
If any errors were detected, an error message will be displayed.
You will be given the option to download a file with error details to help you diagnose and correct any issues with the data. Once you've corrected the CSV file, reupload it.
Click on Create. A progress bar will appear on the right as the cases are created (Cases creation page).
If the cases have been created successfully, the Cases summary page will display the total number of cases that were created.
If there were any errors during the batch case creation process, the Cases summary page will display a table indicating the number of cases that were successfully created and the number of cases that failed.
You will have the option to download a CSV file containing two additional columns: Errors and Case ID. The Errors column will contain error messages for samples where case creation failed, while the Case ID column will contain the Case ID of a successfully created case for the lines where case creation was successful.
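If you want to process the downloaded results file programmatically, a small script can separate created cases from failed rows. The sketch below relies only on the two column names mentioned above (Errors and Case ID); the function name is hypothetical, and the row numbering assumes that no field spans multiple lines.

```python
import csv
import io


def summarize_batch_results(csv_text):
    """Split a batch results CSV into created Case IDs and failed rows.

    Returns (created, failed), where failed holds (row_number, message)
    tuples and row 1 is the header line.
    """
    created, failed = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=2):
        error = (row.get("Errors") or "").strip()
        case_id = (row.get("Case ID") or "").strip()
        if error:
            failed.append((row_number, error))
        elif case_id:
            created.append(case_id)
    return created, failed
```

Read the downloaded file with open(path, newline="", encoding="utf-8").read() and pass the text to the function; the failed list then points at the rows to correct before re-uploading.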
API/batch upload limitations
When using the API or batch upload, note that applying multiple gene lists can inadvertently exceed a combined limit of 10,000 genes across panels. The platform may not provide an explicit error message in such cases. Plan gene-panel combinations carefully.
Combining gene lists at case creation is available via the UI only and cannot be performed through API/batch upload.
API/batch upload cannot add phenotypes for an unaffected parent.
JSON files cannot be uploaded via API/batch upload.
The Candidates tab displays all tagged variants, whether tagged by the AI Shortlist or manually by a user.
Variants are automatically tagged as:
Most Likely Candidates and Candidates
Variants prioritized by the AI Shortlist
Secondary findings
Variants that meet ACMG-defined criteria for secondary findings and are automatically tagged with an Incidental tag (if enabled)
Carrier variants
Variants identified by the carrier analysis pipeline (if enabled)
During review in the Candidates tab, additional tags can be applied to a variant alongside the original automatic tag.
A set of the most promising variants based on scores calculated by the AI Shortlist. These variants are initially tagged by the system.
Variant types assessed:
SNVs and indels
CNVs
SVs
mtDNA variants
STRs
Secondary findings are variants that are automatically assigned the Incidental tag when they meet the criteria for secondary findings as defined by the American College of Medical Genetics and Genomics (ACMG).
Tagging is applied only when the Secondary findings checkbox is selected during case creation.
A variant is automatically tagged as an incidental (secondary) finding if it meets all of the following criteria:
Classification: Previously classified as pathogenic or likely pathogenic in ClinVar or Curate variant databases
Zygosity: Heterozygous or homozygous (only homozygous for the HFE gene)
Allele frequency: Less than 5%
Read depth: 10× or higher
Variant quality: Any value but LOW
Affected gene: Listed in the ACMG SF v3.2 medically actionable gene list for reporting secondary findings in clinical exome and genome sequencing (PMID: 37347242)
ACTA2, ACTC1, ACVRL1, APC, APOB, ATP7B, BAG3, BMPR1A, BRCA1, BRCA2, BTD, CACNA1S, CALM1, CALM2, CALM3, CASQ2, COL3A1, DES, DSC2, DSG2, DSP, ENG, FBN1, FLNC, GAA, GLA, HFE, HNF1A, KCNH2, KCNQ1, LDLR, LMNA, MAX, MEN1, MLH1, MSH2, MSH6, MUTYH, MYBPC3, MYH11, MYH7, MYL2, MYL3, NF2, OTC, PALB2, PCSK9, PKP2, PMS2, PRKAG2, PTEN, RB1, RBM20, RET, RPE65, RYR1, RYR2, SCN5A, SDHAF2, SDHB, SDHC, SDHD, SMAD3, SMAD4, STK11, TGFBR1, TGFBR2, TMEM127, TMEM43, TNNC1, TNNI3, TNNT2, TP53, TPM1, TRDN, TSC1, TSC2, TTN, TTR, VHL, WT1.
Variants identified by the Carrier analysis pipeline. Carrier variants are automatically tagged only if you've selected the Carrier Analysis checkbox while creating a case. Analysis requirements and a list of targeted regions are specified by the organization's manager. This Carrier analysis flow is implemented by request.
Variants that were manually selected to be reported.
The Pedigree section displays relatedness metrics and the results of relationship validation for each pair of samples in the family tree.
Relatedness coefficient (observed)
Shows the observed coefficient of relatedness between sample pairs. The expected coefficient is available via hover tooltip for quick comparison.
IBS0
Identity by state 0—a number of genomic loci where two individuals share zero alleles. This occurs when the two individuals are opposite homozygotes for a biallelic SNP.
This metric is calculated across a set of biallelic SNPs and is inversely related to the degree of genetic similarity between the individuals. A low IBS0 count suggests a higher degree of overall genetic similarity, but it is an indirect and limited measure of genetic relatedness that requires interpretation alongside other metrics.
Relationship validation result
Summarizes the outcome of the relationship validation, confirming whether the observed data aligns with the expected pedigree structure.
Relationship validation is done by Peddy based on:
Relatedness coefficient (𝑟)—a measure of how much two individuals share alleles from a common ancestor, indicating the probability that alleles at the same genome location are identical by descent
IBS0 (Identity by state 0)—a number of genomic loci where two individuals share zero alleles, ie, they are opposite homozygotes
IBS2 (Identity by state 2)—a number of genomic loci where two individuals share two alleles, meaning they have the exact same genotype
Peddy takes the inferred relationships from the genetic data and cross-references them against the declared relationships. For every pair of individuals in a cohort, Peddy calculates a coefficient of relatedness from the genotypes observed at the sampled sites.
For each possible pair of samples in a pedigree, the expected relatedness coefficient based on the declared family relation is compared with the observed relatedness coefficient (𝑟). The IBS0 value helps to differentiate between sibling and parent–child relationships, both of which are expected to have a relatedness coefficient of ~50% (see table).
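To make the IBS0 and IBS2 definitions concrete, here is a minimal Python sketch that counts both for a pair of samples. It is not Peddy's implementation: the genotype encoding (alternate-allele count 0, 1, or 2 per biallelic SNP, None for a missing call) and the function name are assumptions chosen for the example.

```python
def ibs_counts(genotypes_a, genotypes_b):
    """Count IBS0 and IBS2 sites for two samples.

    Each list holds the alternate-allele count per biallelic SNP:
    0 = homozygous reference, 1 = heterozygous, 2 = homozygous alternate,
    None = missing call (the site is skipped).
    """
    ibs0 = ibs2 = 0
    for a, b in zip(genotypes_a, genotypes_b):
        if a is None or b is None:
            continue
        if abs(a - b) == 2:   # opposite homozygotes: zero shared alleles
            ibs0 += 1
        elif a == b:          # identical genotype: two shared alleles
            ibs2 += 1
    return ibs0, ibs2
```

A true parent and child should produce an IBS0 count at or near zero, because a child always inherits one allele from each parent; any opposite-homozygote sites in such a pair point to genotyping errors or a wrong declared relationship.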
The formatting of variant table rows provides visual cues about the variant status for the current user within a specific case.
A "Afghan Jews" "Afghani" "African" "African American" "Afro-Brazilian" "Alaska Native" "Algerian" "Algerian Jews" "Amish" "Anatolian" "Arab" "Argentinian/Paraguayan" "Armenian" "Ashkenazi Jews" "Asian" "Asian Brazilian" "Australian Native" "Azerbaijan Jews"
B "Bedouin" "Bengali/Northeast Indian" "British/Irish" "Bulgarian Jews"
C "Caribbean Australian" "Caucasus Jews" "Central African" "Central Asian" "Chilean" "Chinese" "Chinese Dai" "Christian Arab" "Circassian" "Colombia"
D "Druze" "Dutch"
E "East African" "East Asian" "East European" "Egyptian" "Egyptian Jews" "Emirates" "Ethiopia" "Ethiopian / Eritrean" "Ethiopian Jews" "Ethiopian Jews - Beta Israel" "European" "European American"
F "Fijian Australian" "Filipino" "Filipino Austronesian" "Finnish" "French" "French Canadian"
G "Georgian Jews" "Germans" "Ghanaian / Liberian / Sierra Leonean" "Greece Jews" "Greek Americans" "Greek / Balkan" "Guam/Chamorro"
H "Hawaiian"
I "Iberian" "India - Bene Israel Jews" "India - Cochin Jews" "Indian" "Indigenous Amazonian" "Indigenous peoples in Canada" "Indonesian" "Inuit" "Iranian" "Iranian Persian Jews" "Iraq" "Iraqi Jews" "Irish" "Italian" "Italian Americans" "Italian Jews"
J "Japanese" "Japanese Brazilian" "Jordan"
K "Kenyan" "Korean" "Kurdish" "Kurdish Jews"
L "Latino/Hispanic Americans" "Lebanese Jews" "Levantine" "Libyan" "Libyan Jews"
M "Maasai" "Malayali Indian" "Melanesian" "Mesoamerican and Andean" "Mexican American" "Middle Eastern" "Mongolian / Manchurian" "Mormon" "Moroccan" "Moroccan Jews" "Muslim Arab"
N "Native American" "Nepali" "Nigerian" "North African" "North and West European" "Northern Asian" "Northern Indian"
O "Other Pacific Islander"
P "Pakistani" "Papuan" "Polynesian" "Portuguese in Northern Brazil" "Portuguese in Southern Brazil"
R "Russian Jews" "Russians"
S "Samaritan" "Samoan" "Sardinian" "Saudi" "Scandinavian" "Senegambian / Guinean" "Siberian" "Somali" "South African" "South Asian" "Southern East African / Congolese" "Southern European" "Southern Indian" "Southern Indian / Sri Lankan" "Southern South Asian" "Spaniards" "Spanish Jews" "Sub-Saharan African" "Sudanese" "Swedes" "Syrian Jews" "Syrian-Lebanese"
T "Tajikistan Jews" "Thai / Cambodian / Vietnamese" "Tunisian" "Tunisian Jews" "Turkish" "Turkish / Anatolian" "Turkish Jews"
U "Ukraine" "Ukraine Jews" "Uzbekistan/ Bukharan Jews"
V "Venezuela"
W "West African"
Y "Yemenite" "Yemenite Jews"
Case ID
A unique identifier assigned to each case by Emedgene, formatted as EMGXXXXXXXXX. This field is fixed and cannot be hidden or repositioned in the table. Share this code with Tech Support when reporting issues.
Proband ID
The identifier of the proband. For single case creation, this corresponds to the Sample Name; for batch case creation, it corresponds to the BioSample Name of the test subject.
Phenotypes
Proband phenotypes as submitted by the user.
Status
The current case status in the system. Custom statuses can be added in the Management tab under Settings, and their order can be rearranged via drag-and-drop.
You can update the status directly from the Cases table by clicking the status badge and selecting a new status from the dropdown menu.
Creation date
The date the analysis was initiated. This is saved automatically. The field is sortable.
Due date
A customizable field that allows you to set, change, or remove a due date.
Click the calendar icon to set a date. To change it, click the existing date and select a new one. To remove it, click the cross icon next to the date. The field is sortable.
Quality
Indicates the overall case quality. Detailed validation results can be reviewed in the Lab tab. The field is sortable.
Type
Indicates the case type (whole genome, exome, custom panel, array).
Label
A customizable field that allows you to assign custom case labels. Click the pencil icon to add a new label, select an existing one, or remove a label from the case.
Participants
Users involved in the case who subscribed to updates.
To receive email alerts for case updates, click the Subscribe icon. To unsubscribe, hover over your avatar and click the button.
Lab directors and other authorized roles can assign cases directly to analysts, making workload management easier.
User groups
User groups as defined in Settings. Each group appears as a separate column in the table.
<Region_Cloud>-emg-auto-samples/<org_name>/upload/
<Region_Cloud>-emg-downloads/<org_name>/
<Region_Cloud>-emg-auto-results/<org_name>/








The Activity tab offers a timeline of case actions and enables users to leave comments. It supports key functions that enhance case management and review:
Traceability—Maintains a complete, time-stamped history of case actions
Error recovery—Allows users to identify and trace changes, such as variant edits or disease associations, made in error
Real-time collaboration – Enables teams to monitor each other’s updates as they happen, ensuring transparency
Training & quality control – Helps identify patterns in variant interpretation and supports consistent application of evidence criteria
Audit compliance – Supports clinical and laboratory documentation standards (e.g., CAP/CLIA) by providing a verifiable action history
Each activity entry includes:
Timestamp (date + time)
User name of the person who performed the action
Action description
Activity logs are kept for at least six years for full traceability.
Case-related
Case created
Case status changed
Case participants updated
Case labels modified
Report created
Case moved to trash
Case data edited, no reanalysis initiated
Case data edited and reanalysis launched
Comments
Comments left in the Activity tab
Variant tagging
Variant tag updated — this log entry includes a link to the relevant variant page for immediate review
Evidence notes
Evidence notes updated — this log entry includes a link to the relevant variant page for immediate review
Evidence pathogenicity
Variant pathogenicity updated — this log entry includes a link to the relevant variant page for immediate review
Evidence graph
Evidence graph updated — this log entry includes a link to the relevant variant page for immediate review
ACMG pathogenicity
ACMG evidence updated (logs any changes made via the ACMG classification wizard) — this log entry includes a link to the relevant variant page for immediate review
Transcript changes
Reference transcript updated
Viewing activity logs
In the Cases table, the Activity tab within the Case details panel displays only comments and case-related activities. To view the full list of all activities, open the Case details panel directly from the individual case page.
Edits are permanent. Even if a change is undone, the original action remains recorded for traceability
Logs are case-specific. Activity entries do not reflect changes made in other cases or in the Curate database
Time zone awareness. Timestamps follow the system’s configured time zone, which may differ from your local time—especially in international collaborations.
When creating a new case, the first step is to select the sample input type. This determines how your data will be processed and which quality metrics will be available later in the analysis.
You can choose from the following supported formats: FASTQ, Project VCF, and VCF.
Use this option if you want the platform to perform secondary analysis and variant calling.
Accepted file types:
.fastq.gz
.fq.gz
.bam
.cram. Make sure you understand the current limitation for using CRAM files by expanding the section below.
Use when working with a joint VCF file containing multiple samples.
Accepted file types:
.pvcf
.vcf
.pvcf.gz
.vcf.gz
Use for cases where variants have already been called externally, or for cytogenetic array inputs.
Accepted file types:
.vcf
.vcf.gz
.targeted.json
.gt_sample_summary.json (v37.0+, DRAGEN Array v1.2+)
.annotated_cyto.json (v100.39.0+, DRAGEN Array v1.3+)
Tips:
Choose the input type carefully — it cannot be changed after the case is created.
Keep file paths simple (avoid spaces, parentheses, or very long names >255 characters). This helps prevent errors during upload.
Warning:
If files are incomplete or corrupted, the case may still be created but will fail during processing. Double-check your files before uploading.
For large files (BAM/CRAM/FASTQ), browser upload is not recommended. Use Batch Upload, CLI, or cloud-to-cloud transfer instead to avoid incomplete or truncated uploads.
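A quick pre-upload check can catch the path and file-type problems described above before a case is created. The following Python sketch is an illustration only: the function name is hypothetical, extension matching is assumed to be case-insensitive, and the 255-character limit is applied to the whole path string, which is one possible reading of the tip.

```python
ACCEPTED_SUFFIXES = {
    "FASTQ": (".fastq.gz", ".fq.gz", ".bam", ".cram"),
    "Project VCF": (".pvcf", ".vcf", ".pvcf.gz", ".vcf.gz"),
    "VCF": (".vcf", ".vcf.gz", ".targeted.json",
            ".gt_sample_summary.json", ".annotated_cyto.json"),
}


def check_input_path(path, sample_type):
    """Return a list of problems found for one input file path."""
    problems = []
    suffixes = ACCEPTED_SUFFIXES[sample_type]
    if not path.lower().endswith(suffixes):
        problems.append(f"file type not accepted for {sample_type}")
    if " " in path:
        problems.append("path contains spaces")
    if "(" in path or ")" in path:
        problems.append("path contains parentheses")
    if len(path) > 255:
        problems.append("path is longer than 255 characters")
    return problems
```

An empty list means none of these checks failed. It does not prove the file is complete or uncorrupted, so file integrity still needs to be verified separately.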
Uploading (assigned by: System)
The sequencing file is being uploaded on the platform
In progress (assigned by: System)
The case analysis is running
Delivered (assigned by: System)
The running stage is completed, and the case is now available for users for analysis and review
Finalized (assigned by: User)
The analysis and review of the case by the analyst group have been completed. Note: Access to applying or changing the Finalized status could be restricted to specific users, such as organization managers and directors, who possess corresponding user roles
Move to trash (assigned by: User)
Indicates that the case is up for deletion and makes the case inaccessible. This status is only assigned by the user. Note: Access to applying or changing the Move to trash status could be restricted to specific users, such as organization managers and directors, who possess corresponding user roles
Pending sequencing (assigned by: System)
The case has been created but is not yet connected to any genetic data
Issue reported (assigned by: System)
The case failed to run. Please check the integrity of the uploaded files and ensure that the variant caller used is on Emedgene list of accepted variant callers.
Reanalysis (assigned by: System)
The system is re-running the AI Shortlist algorithm
Unrelated | Relatively high | 0 | < 0.2: Pass; 0.2–4: Shared ancestry; 4–15: Consanguinity; > 15: Consanguinity + Failed validation
Parent–child | 0 or close to 0 (any IBS0 sites are due to genotyping errors) | 50 | < 40: Failed validation; 40–60: Pass; > 60: Failed validation
Full siblings | Small but detectable number | 50 | < 40: Failed validation; 40–60: Pass; > 60: Failed validation

no
no
no
by the AI Shortlist
no
by a user
yes
no
yes
by the AI Shortlist
yes
by a user
Before you proceed to this article, make sure you understand data storage management basics.
In Settings > Management Tab, add or edit the required credentials: CLIENT_ID, CLIENT_SECRET, TENANT_ID, and ACCOUNT_URL.
See the table below to learn where to look for them in your Azure account.
CLIENT_ID
application_id.
Format: ########-####-####-####-############
(letters/numbers)
CLIENT_SECRET
Value of the client_secret tuple (Value, Secret ID).
Format: #####-#######-######-######
(letters/digits/special chars)
TENANT_ID
ID of the tenant.
Format: ########-####-####-####-############
(letters/numbers)
ACCOUNT_NAME
An arbitrary name that the customer must supply to define the ACCOUNT_URL.
Format: string
CONTAINER_NAME
An arbitrary name that the customer must supply to define the ACCOUNT_URL.
Format: string
ACCOUNT_URL
The account_url of the Azure account.
Format: https://account_name.blob.core.windows.net/container_name
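As a small illustration of the formats in the table above, the sketch below assembles the ACCOUNT_URL from ACCOUNT_NAME and CONTAINER_NAME and checks that an ID has the 8-4-4-4-12 shape shown for CLIENT_ID and TENANT_ID. The function names are hypothetical, and the pattern is only a shape check based on the format strings above; it does not confirm that the credentials are valid in Azure.

```python
import re

# 8-4-4-4-12 groups of letters/digits, as shown for CLIENT_ID and TENANT_ID.
ID_PATTERN = re.compile(
    r"[0-9A-Za-z]{8}-[0-9A-Za-z]{4}-[0-9A-Za-z]{4}-[0-9A-Za-z]{4}-[0-9A-Za-z]{12}"
)


def build_account_url(account_name: str, container_name: str) -> str:
    """Assemble the ACCOUNT_URL in the format given in the table."""
    return f"https://{account_name}.blob.core.windows.net/{container_name}"


def looks_like_id(value: str) -> bool:
    """Shape check only: True if the value matches the 8-4-4-4-12 layout."""
    return ID_PATTERN.fullmatch(value) is not None


print(build_account_url("myaccount", "mycontainer"))
print(looks_like_id("12345678-abcd-ef01-2345-6789abcdef01"))
```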
In Microsoft Entra ID, click on App registrations.
Select New registration.
Enter a name for the application and click "Register".
You are taken to the registered app page, where you can retrieve the Application ID (CLIENT_ID) and the Tenant ID (TENANT_ID). Both are marked in the screenshot.
Click "Certificates & secrets".
Click "New client secret".
Fill in the "Description" and set the expiry to 12 months (or according to your organization's policy), then click "Add".
Get the CLIENT_SECRET from this page.
Give this App registration roles and read access to the relevant Blob.
Go to Azure Storage accounts
Get into the relevant Storage account
Press on "containers"
Press on the relevant container
Press on "Properties"
Copy the ACCOUNT_URL
Errors from failed connections are logged in CloudWatch, in the relevant FRY log stream.
Search for: BlobApi, BlobFs, azure.
Log in to Emedgene and navigate to Settings in the upper right-hand corner of the page.
Click on the Management tab and then on Add Storage.
Choose Illumina BaseSpace storage type.
Fill in the Client Key, Client Secret, and App Token as provided by BaseSpace (how to obtain them is described below), then click Add storage to complete the setup.
Install BaseSpace CLI (Command Line Interface)
# Linux
$ wget "https://launch.basespace.illumina.com/CLI/latest/amd64-linux/bs" -O $HOME/bin/bs
# Mac
$ wget "https://launch.basespace.illumina.com/CLI/latest/amd64-osx/bs" -O $HOME/bin/bs
# or
$ brew tap basespace/basespace && brew install bs-cli
# Windows
$ wget "https://launch.basespace.illumina.com/CLI/latest/amd64-windows/bs.exe" -O bs.exe
Follow the instructions on the BaseSpace CLI Installation Page if needed. Be aware of the BaseSpace regional instance you are working on (us, euc1, aps2, euw2).
In BSSH, log in to the workgroup you want to connect as storage.
Once the BaseSpace CLI is installed, run the authentication command in the terminal.
$ bs auth
The command prints a link where you are asked to log in.
After authentication completes successfully, find the access token in the config file.
$ cat ~/.basespace/default.cfg
The result should look like this:
apiServer = https://api.basespace.illumina.com
accessToken =
Populate the App_token with the accessToken value, and Server with the apiServer URL from the BSSH config file.
Client_key will be displayed in subsequent menus, so a descriptive name such as the workgroup name can be used.
Client_secret is unused when the App_token is available and can be set to "x".
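The mapping from the BaseSpace CLI config file to the Emedgene storage fields can be sketched as follows. This assumes the config file consists of simple `key = value` lines, as in the excerpt above; the function names are hypothetical, and the token in the example is a dummy value.

```python
def parse_basespace_config(text: str) -> dict[str, str]:
    """Parse 'key = value' lines such as those in ~/.basespace/default.cfg."""
    settings = {}
    for line in text.splitlines():
        key, separator, value = line.partition("=")
        if separator:
            settings[key.strip()] = value.strip()
    return settings


def emedgene_storage_fields(settings: dict[str, str], client_key: str) -> dict[str, str]:
    """Map the config values to the storage fields described above."""
    return {
        "Client_key": client_key,            # descriptive name, e.g. the workgroup name
        "Client_secret": "x",                # unused when an App_token is available
        "App_token": settings["accessToken"],
        "Server": settings["apiServer"],
    }


# Example with a dummy token; in practice, read the text from the config file,
# e.g. pathlib.Path.home().joinpath(".basespace", "default.cfg").read_text().
example = "apiServer = https://api.basespace.illumina.com\naccessToken = dummy-token\n"
print(emedgene_storage_fields(parse_basespace_config(example), "my-workgroup"))
```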
Go to the BaseSpace developer portal and log in. Be aware of the BaseSpace regional instance you are working on (us, euc1, aps2, euw2).
Go to My Apps and click Create a new Application.
Fill in the details for the application and click on create an application.
Fill in the details and press save.
All requested fields must be filled in; enter “NA” in fields that do not apply.
Go to My Apps and click on your new app. Then go to the credentials tab.
There you will find the Client ID (Client Key), Client Secret, and App Token to enter in the Emedgene platform.
Log in to the desired Emedgene organization.
Go to Settings
Go to Management tab
Click on Add Storage
Select BaseSpace:
Add the information from your “Credentials” of the App previously created in BSSH.
Go to the Google Cloud console.
Navigate to IAM & Admin - In the left sidebar, go to IAM & Admin > Service Accounts.
Create a New Service Account: Click on the "Create Service Account" button at the top.
Fill in the Service Account Details:
Service account name: Give your service account a name.
Service account ID: This will be automatically generated based on the name.
Description: Optionally, provide a description for the service account.
Click "Create and Continue".
example:
Assign Roles to the Service Account:
In the Grant this service account access to project step, you’ll assign the necessary roles.
Grant this role:
"Storage Object Viewer" (read-only access)
Create the Service Account:
After assigning the roles, click "Done".
Generate and Download a Key:
Find your newly created service account, click the three dots on the right, and select "Manage Keys".
Click Add Key > Create New Key and choose the JSON format.
Download the key and store it securely, as it is used for authentication in your code or applications.
Encode the key in Base64:
Use the Python function: put the function and your JSON key file (here named json_file.json) in the same directory and run it.
Save the printed output.
Add the following 3 values into the appropriate fields:
Client_credentials_base64: paste the Base64 output saved in the previous step.
Bucket: the bucket name.
Path: for the default, enter /. Otherwise, enter your path in the bucket, separating directories with /.
Download and install the Google Cloud SDK from the Google Cloud SDK Install page.
Select Your Platform (Windows, macOS, or Linux), download and run.
Initialize and Authenticate with Google Cloud: In the Cloud SDK Shell/terminal, run:
gcloud init
This will open a browser window to authenticate your Google account. Follow the instructions to log in and select your project.
Set CORS Configuration via gcloud:
Create a JSON file (cors.json) on your machine with the CORS rules.
Example of what it should look like:
Note:
origin: if using Illumina cloud:
https://host_name.emg.illumina.com
otherwise, for Emedgene cloud:
https://host_name.emedgene.com
Apply CORS Configuration to Your Bucket: run the next command.
gcloud storage buckets update gs://your-bucket-name --cors-file=cors.json
Verify the CORS Configuration:
gcloud storage buckets describe gs://your-bucket-name
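The cors.json file can also be generated rather than written by hand. The sketch below reuses the rule shown in this document's example (GET method, emgauthorization response header, 3600-second max age) and switches the origin between the two host patterns above. The function names and the `illumina_cloud` flag are hypothetical.

```python
import json


def build_cors_rules(host_name: str, illumina_cloud: bool = True) -> list[dict]:
    """Build the CORS rule list with the origin matching the cloud in use."""
    domain = "emg.illumina.com" if illumina_cloud else "emedgene.com"
    return [
        {
            "origin": [f"https://{host_name}.{domain}"],
            "method": ["GET"],
            "responseHeader": ["emgauthorization"],
            "maxAgeSeconds": 3600,
        }
    ]


def write_cors_file(rules: list[dict], path: str = "cors.json") -> None:
    """Write the rules to a JSON file that can be passed to --cors-file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(rules, handle, indent=2)


print(json.dumps(build_cors_rules("host_name"), indent=2))
```

Call `write_cors_file(build_cors_rules("your-host"))` to create the file, then apply it with the `gcloud storage buckets update` command shown above.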
Add new case page > Family tree screen > Add patient information panel > Patient info section
Note: The fields marked with (*) are mandatory.
Options: Male, Female, Unknown.
Handling a proband sample with unknown sex
When a sample is user-assigned "Unknown" sex, the system assumes "Female". This affects CNV interpretation on sex chromosomes in case the genetic sex is actually male:
Chromosome X: CN = 2 is considered reference (REF) for a female genome, so CNVs with two copies are hidden by default. This may cause chromosome X duplications to be missed.
Chromosome Y: CN = 0 is considered reference (REF) for a female genome, so CNVs with zero copies are hidden by default. This may cause chromosome Y deletions to be missed.
To include these variants in the analysis, enable the corresponding setting in Workbench & Pipeline Settings.
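The effect of an "Unknown" sex assignment can be illustrated with a short sketch. It is not the platform's implementation: the function names are hypothetical, the female reference values (X: CN = 2, Y: CN = 0) come from the text above, and the male values (X: CN = 1, Y: CN = 1) are an assumption based on a standard male karyotype, ignoring pseudoautosomal regions.

```python
def reference_copy_number(chromosome: str, sex: str) -> int:
    """Copy number treated as reference (REF) for a chromosome and assigned sex."""
    # Per the text above, "Unknown" is handled as "Female".
    effective_sex = "Female" if sex == "Unknown" else sex
    if chromosome == "X":
        return 2 if effective_sex == "Female" else 1  # male value is an assumption
    if chromosome == "Y":
        return 0 if effective_sex == "Female" else 1  # male value is an assumption
    return 2  # autosomes


def hidden_by_default(chromosome: str, copy_number: int, sex: str) -> bool:
    """A CNV whose copy number equals the reference is hidden by default."""
    return copy_number == reference_copy_number(chromosome, sex)


# A chrX segment with two copies in a genetically male sample:
print(hidden_by_default("X", 2, "Unknown"))  # treated as REF for a female genome
print(hidden_by_default("X", 2, "Male"))     # would be shown as a duplication
```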
The default fixed value for Proband is Test Subject.
Expected format: mm/dd/yyyy.
Options: Affected, Healthy.
The default value for Proband is Affected, but you may change it to Healthy.
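To add all relevant phenotypes for the Proband, use one of the following methods:
Search for and select HPO terms one at a time,
Paste a list of HPO terms or HPO IDs,
Extract phenotypes from uploaded clinical notes, or
Automatically infer disease-associated phenotypes (see below).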
Warning: Select valid HPO phenotypic abnormality terms
When adding patient phenotypes, ensure that all selected HPO terms originate from the “Phenotypic abnormality (HP:0000118)” branch of the HPO ontology. Terms outside this branch are not supported for case analysis, as they do not represent clinical phenotypes and may lead to incomplete or inaccurate downstream results.
Please follow the steps described below for each phenotype:
Enter an HPO term (e.g., Hypoplasia of the ulna), an HPO ID (e.g., HP:0003022), or a descriptive phenotype name (e.g., Underdeveloped ulna) in the search box.
Select a matching term from a dropdown menu and press Complete after you've added all the terms and additional patient information below.
Paste a list of comma-separated HPO terms or HPO IDs in the search box and press Complete.
In the Clinical Notes section upload a description of the clinical presentation in .pdf, .xls, .txt, .doc, .jpeg, or .jpg format. Among the extracted HPO terms for Phenotypes and Diseases select the ones you want to add to Proband's Phenotypes.
Enter the disease name in the search box, select a matching term from a dropdown menu and press Complete. All the associated phenotypes will be automatically added to the Proband Phenotypes. To remove any phenotype described for the disease but not observed in your patient, click the ☒ button next to the HPO term in the Proband Phenotypes list.
Enter the suspected disease penetrance as a percentage.
Select the appropriate category to indicate the severity of the disease symptoms observed in the patient: Mild, Moderate, Severe, Profound.
Mark the checkbox if applicable.
Paternal and Maternal. Enter the name in the search box and select a matching term from a dropdown menu.
When creating NGS cases that start from VCF, you can create a browsable DRAGEN report from the DRAGEN metrics files. Due to security restrictions, CSV files are not directly ingested, but they can be included when packaged in a TAR file.
Navigate to the local directory containing the metrics files for a specific sample.
Define the sample name as a variable: samplename="NA12878".
Combine the find and tar commands to package the files into a tar.gz file with the following extension: *.metrics.tar.gz.
Command to find files matching the required patterns:
Upload the metrics.tar.gz file to the storage location used for creating cases.
Add metrics.tar.gz to the case creation API JSON payload using the corresponding storage ID.
If the filename does not contain the extension (e.g. files from BaseSpace), make sure that "sample_type": "dragen-metrics" is set within the JSON payload.
The DRAGEN report link becomes available once your case has been delivered.
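For environments where find and tar are not available, the packaging step can be sketched in Python. This is an alternative under stated assumptions, not the documented procedure: the function name `package_metrics` is hypothetical, and the filename patterns are copied from the find command in this document.

```python
import fnmatch
import tarfile
from pathlib import Path

# Filename patterns taken from the find command in this document.
METRICS_PATTERNS = [
    "*.csv", "*.tsv", "*.counts", "*.counts.gz",
    "*.counts.gc-corrected", "*.counts.gc-corrected.gz",
    "*.ploidy.vcf", "*.ploidy.vcf.gz",
    "*.correlation.txt", "*.correlation.txt.gz",
    "*.repeats.vcf", "*.repeats.vcf.gz", "*.annotated_cyto.json",
]


def package_metrics(directory: str, samplename: str) -> Path:
    """Collect matching files under directory into <samplename>.metrics.tar.gz."""
    root = Path(directory)
    archive = root / f"{samplename}.metrics.tar.gz"
    # Build the file list before creating the archive (case-sensitive match).
    matches = sorted(
        path for path in root.rglob("*")
        if path.is_file()
        and any(fnmatch.fnmatchcase(path.name, pattern) for pattern in METRICS_PATTERNS)
    )
    with tarfile.open(archive, "w:gz") as tar:
        for path in matches:
            tar.add(path, arcname=str(path.relative_to(root)))
    return archive
```

Calling `package_metrics(".", "NA12878")` from the sample's metrics directory produces NA12878.metrics.tar.gz in that directory.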
The Genome view tab is a powerful feature designed to give users a clear, visual overview of the genome and chromosomes in their cases. This feature is especially useful for analyzing large Copy Number Variation (CNV) events and regions of homozygosity/loss of heterozygosity (ROH/LOH) across the genome, providing intuitive filtering and interactive insights for researchers and clinicians.
The Genome View tab is a dedicated section within the Case Page for Whole Genome Sequencing (WGS), Whole Exome Sequencing (WES), and Array data. This tab offers a graphical visualization of genomic data, focusing on CNV and LOH/ROH analysis for proband cases. Users can access this tab directly from the Case Page.
Chromosome ideogram
The chromosome ideogram offers a visual representation of all 23 human chromosomes, with CNV and LOH/ROH events highlighted for intuitive analysis. Here's what you can expect:
Variant types:
Deletions (DEL): Marked in red, displayed to the left of the chromosome.
Duplications (DUP): Marked in blue, displayed to the right of the chromosome.
LOH/ROH (Loss of Heterozygosity/Regions of Homozygosity): Marked in gold, displayed over the chromosome.
Note: Variants with no coverage or reference (Ref) are excluded.
Filtering Options:
Users can filter segments by size, using a range selector with the following options: 50 bp, 1 KB, 10 KB, 50 KB, 100 KB, 1 MB, 10 MB, Max (no filter).
The default filter is set from 50 KB to Max.
Limitation: Only the 500 largest variants are displayed in this tab.
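The size filter and the 500-variant limit can be sketched as follows. The `Segment` class and `select_segments` function are hypothetical, 1 KB is taken as 1,000 bp (consistent with the hover example chr1:100000-200000 (100 KB)), and applying the limit after the size filter is an assumption, since the order is not stated here.

```python
from dataclasses import dataclass

MAX_DISPLAYED = 500          # only the 500 largest variants are displayed
DEFAULT_MIN_SIZE = 50_000    # default filter: 50 KB to Max


@dataclass
class Segment:
    chromosome: str
    start: int
    end: int
    kind: str  # "DEL", "DUP" or "LOH/ROH"

    @property
    def size(self) -> int:
        return self.end - self.start


def select_segments(segments, min_size=DEFAULT_MIN_SIZE, max_size=None):
    """Keep segments within the size range, largest first, capped at the limit."""
    in_range = [
        segment for segment in segments
        if segment.size >= min_size and (max_size is None or segment.size <= max_size)
    ]
    return sorted(in_range, key=lambda segment: segment.size, reverse=True)[:MAX_DISPLAYED]


segments = [
    Segment("chr1", 100_000, 200_000, "DEL"),
    Segment("chr2", 0, 10_000, "DUP"),
    Segment("chr3", 0, 5_000_000, "LOH/ROH"),
]
print([segment.chromosome for segment in select_segments(segments)])
```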
Hover and Click Interactions:
Hovering over a variant displays
Chromosome number, start and end positions, and size (e.g., chr1:100000-200000 (100 KB))
Cytoband range (e.g., p12.3 - q11.2) based on ISCN nomenclature
Variant type (DEL/DUP/LOH/ROH)
Number of genes affected
Clicking on a chromosome refines the genome view below to that chromosome.
Clicking on a variant opens a detailed Variant Page where many actions and further review can be made.
Legend: A clear legend explains color coding and icons for DEL, DUP, and LOH/ROH variants for quick reference.
The genome viewer provides a deeper dive into the genomic data through three interactive tracks:
Log R / TNS Track:
Displays copy number intensity data using values from the TNS BigWig file or LogR bedgraph.
Y-axis ranges from -3 to 3, with increments of 0.4.
X-axis displays the genome (whole genome view) or chromosome segments (whole chromosome view).
BAF Track:
Displays B Allele Frequency (BAF) data using values from the BAF BigWig file or BAF bedgraph.
Y-axis ranges from 0 to 1, with increments of 0.1.
X-axis aligns with the Log R track.
ROH Track:
Indicates regions of homozygosity for further analysis.
Zoom and navigation
Default View: Displays the entire genome.
Zoom-In Options: Users can zoom in to view individual chromosomes by clicking on them.
Interactive Navigation: Clickable chromosomes on the ideogram allow seamless switching between views.
import base64
import json


def encode_json_to_base64(json_file):
    # Read JSON data from file
    with open(json_file, 'r') as file:
        json_data = json.load(file)
    # Convert the JSON data to a string
    json_str = json.dumps(json_data)
    # Encode the string to bytes, then to Base64
    json_bytes = json_str.encode('utf-8')
    base64_bytes = base64.b64encode(json_bytes)
    # Convert Base64 bytes back to a string
    base64_str = base64_bytes.decode('utf-8')
    # Print the Base64-encoded string
    print(base64_str)


encode_json_to_base64('json_file.json')

[
  {
    "origin": ["https://<host_name>.emg.illumina.com"],
    "method": ["GET"],
    "responseHeader": ["emgauthorization"],
    "maxAgeSeconds": 3600
  }
]
find . \( -name "*.csv" -o -name "*.tsv" -o -name "*.counts" -o -name "*.counts.gz" -o -name "*.counts.gc-corrected" -o -name "*.counts.gc-corrected.gz" -o -name "*.ploidy.vcf" -o -name "*.correlation.txt.gz" -o -name "*.correlation.txt" -o -name "*.repeats.vcf" -o -name "*.ploidy.vcf.gz" -o -name "*.repeats.vcf.gz" -o -name "*.annotated_cyto.json" \) | xargs tar -czf "${samplename}.metrics.tar.gz"

{
  "test_data": {
    "consanguinity": false,
    "inheritance_modes": [],
    "sequence_info": {},
    "type": "Whole Genome",
    "notes": "",
    "samples": [
      {
        "bam_location": "",
        "fastq": "NA12878-PCRF450-1",
        "status": "uploaded",
        "directoryPath": "",
        "sampleFiles": [
          {
            "filename": "NA12878-PCRF450-1.metrics.tar.gz",
            "sample_type": "dragen-metrics",
            "path": "/analysis_output/demo_data_germline_v4_3_6_v2-DRAGEN_Germline_Whole_Genome_4-3-6-v2-75b081e8-a8aa-433e-862b-a20d2d65e492/NA12878-PCRF450-1/NA12878-PCRF450-1.metrics.tar.gz",
            "size": 0,
            "storage_id": 420,
            "status": "uploaded",
            "vcf_column_name": "NA12878-PCRF450-1",
            "vcf_column_names": ["NA12878-PCRF450-1"],
            "loadingSample": false
          },
          {
            "filename": "NA12878-PCRF450-1.hard-filtered.vcf.gz",
            "sample_type": "vcf",
            "path": "/analysis_output/demo_data_germline_v4_3_6_v2-DRAGEN_Germline_Whole_Genome_4-3-6-v2-75b081e8-a8aa-433e-862b-a20d2d65e492/NA12878-PCRF450-1/NA12878-PCRF450-1.hard-filtered.vcf.gz",
            "size": 0,
            "storage_id": 420,
            "status": "uploaded",
            "vcf_column_name": "NA12878-PCRF450-1",
            "vcf_column_names": ["NA12878-PCRF450-1"],
            "loadingSample": false
          }
        ],
        "storage_id": 420,
        "sampleType": "vcf"
      }
    ],
    "sample_type": "vcf",
    "patients": {
      "proband": {
        "fastq_sample": "NA12878-PCRF450-1",
        "gender": "Male",
        "healthy": false,
        "relationship": "Test Subject",
        "notes": "",
        "phenotypes": [
          {
            "id": "phenotypes/EMG_PHENOTYPE_0001324",
            "name": "Muscle weakness"
          }
        ],
        "detailed_ethnicity": {
          "maternal": [],
          "paternal": []
        },
        "zygosity": "",
        "quality": "",
        "dead": false,
        "ignore": false,
        "id": "proband"
      },
      "other": []
    },
    "diseases": [],
    "disease_penetrance": 100,
    "disease_severity": "",
    "boostGenes": false,
    "selected_preset_set": "",
    "incidental_findings": null,
    "labels": [],
    "gene_list": {
      "type": "all",
      "id": 1,
      "visible": false
    }
  },
  "should_upload": false,
  "sharing_level": 0
}
This feature applies only to saving DRAGEN output files in your own bucket when using DRAGEN through Emedgene (without ICA).
If you are looking to:
Import data from AWS S3 to Emedgene, go to Manage data storages
Integrate any data storage with Emedgene, go to Manage data storages
Download any data from Emedgene, go to Manage S3 credentials
Bring Your Own Bucket, also known as BYOB, enables you to control your DRAGEN output files.
By default, the Emedgene-managed DRAGEN solution saves the DRAGEN output files in a dedicated AWS S3 bucket that you can access using your S3 credentials.
However, if you have an Enterprise account and would like the Emedgene-managed DRAGEN solution to save the DRAGEN output files in your own bucket, reach out to [email protected] and follow these steps:
Emedgene requires access to the root folder, so a dedicated bucket may be appropriate.
The bucket policy should allow the Emedgene user access to the bucket.
Example bucket policy:
{
Coming Soon
}
Emedgene visualizes data in IGV directly from your AWS S3 bucket. To allow this, enable CORS for the Emedgene application URLs.
Example CORS policy:
{
Coming Soon
}
Emedgene will need to run a case to validate that the managed DRAGEN pipeline finishes successfully and that all features are available in the platform.
If a customer enables an AWS S3 Lifecycle policy to archive files or move them to a different S3 storage tier, it may adversely affect the platform:
FASTQ | FASTQ/BAM/CRAM (input) | Reanalysis will fail (will be fixed)
FASTQ | CRAM (Output) | Reanalysis will fail
FASTQ | VCFs | Reanalysis will fail
FASTQ | CSV, etc | Reanalysis will fail
VCF | BAM/CRAM (visualizations) | Visualization will fail
VCF | VCF (input) | Reanalysis will fail
VCF | CSV, etc | Reanalysis will fail (will be fixed)
This guide provides a step-by-step process for creating a new case via the user interface. Detailed instructions for each step are available in the corresponding pages of the section.
Caution: Please note that refreshing or leaving the page, exiting the Add new case tab, or power failure of your computer before you've completed adding a new case will result in loss of the case creation progress.
Click on the Add New Case button on the top navigation panel.
At the Select sample type page, choose the file type for your case analysis (FASTQ, gVCF, VCF, or Array).
Click Next to proceed.
The page is divided into two panels: Create family tree (left) and Add patient information (right).
Use the visual tool to build the pedigree.
Add Clinical Notes (optional) in free text, or upload a clinical presentation file (.pdf, .xls, .txt, .doc, .jpeg, .jpg).
HPO terms for phenotypes and diseases will be extracted and can be linked to the proband.
Select suspected Inheritance mode(s) (for record only; not used in the analysis).
Decide whether to include Secondary findings in the proband for the AI Shortlist (checkbox).
For each family member:
Add a sample (use a unique file path unless reusing samples).
The Add New Case flow does not validate that sample IDs are unique or that input files are uncorrupted. Please ensure sample IDs are unique and that input files are valid before creating the case.
If a QC metrics file (metrics.tar.gz) is uploaded from BSSH, it will not be processed.
Keep file names under 255 characters and avoid spaces or parentheses in file paths.
Always ensure sample IDs are unique to prevent case failure.
If using joint gVCF input, place the proband first for accurate insufficient region calculation.
The UI does not allow reusing the same gVCF file for multiple samples.
Fill in a sample name (for VCF input, this must match the header in the file).
Complete the required patient details: for a proband and for non-proband samples.
Click Next to proceed to the Case info screen.
Here you define how the analysis will run:
Case type: Choose Array, Custom Panel, Exome, Whole Genome, or Other.
For Exome cases, variants outside exons ±50 bp are automatically filtered.
Carrier Analysis: Optional checkbox. Requires a targeted gene list.
Select an enrichment kit (if applicable) or "No kit".
If provided, kit details (Lab, Machine, Reagents, Expected coverage) will be used to compare coverage depth and breadth.
If no kit is provided, RefSeq coding regions will be used as reference.
Gene list options:
All genes
Phenotype-based genes
Existing gene list
Create a new gene list
You may combine multiple gene lists into one, or add specific genes to an existing list during case creation. The merged list behaves like any other list in the platform.
Preset group: Select the Preset group appropriate for this case type.
If none is selected, the default Preset group is applied automatically (marked as default).
Consent: Confirm subject consent for extended sharing.
Additional case info (optional):
Indication for testing (free text).
Labels (choose from predefined organization labels; these cannot be changed later).
At the Summary stage, confirm case type, gene list, and other selections.
Caution: Clicking Next here will finalize case creation. After delivery, only the proband’s phenotypes can be edited without reanalysis.
After the case is created:
The Case ID is displayed.
You may add participants so colleagues receive notifications on status changes or updates.
A region of interest (ROI) BED file determines which genomic regions will be included in the variant analysis. It functions as a preprocessing filter, determining which variants proceed to annotation and interpretation.
If no custom ROI BED kit is applied to a case, the system applies a default ROI BED file based on the case type. All default ROI BED files are available for download (see Default ROI kit details).
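To make the filtering role of an ROI BED file concrete, the sketch below tests whether a variant position falls inside any BED interval. It is an illustration only: the function names are hypothetical, it assumes standard BED coordinates (0-based start, exclusive end) and a 1-based variant position, and it does not cover how the platform treats variants that span an interval boundary.

```python
from bisect import bisect_right


def load_bed_intervals(lines):
    """Parse BED lines into sorted, merged (start, end) intervals per chromosome."""
    raw = {}
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        raw.setdefault(chrom, []).append((int(start), int(end)))
    merged = {}
    for chrom, regions in raw.items():
        regions.sort()
        result = [regions[0]]
        for start, end in regions[1:]:
            last_start, last_end = result[-1]
            if start <= last_end:  # overlapping or adjacent: extend the last interval
                result[-1] = (last_start, max(last_end, end))
            else:
                result.append((start, end))
        merged[chrom] = result
    return merged


def in_roi(intervals, chrom, position):
    """Return True if a 1-based position lies inside an ROI interval."""
    regions = intervals.get(chrom, [])
    zero_based = position - 1
    index = bisect_right(regions, (zero_based, float("inf"))) - 1
    return index >= 0 and regions[index][0] <= zero_based < regions[index][1]


roi = load_bed_intervals(["chr1\t100\t200", "chr1\t150\t300", "chr2\t0\t50"])
print(in_roi(roi, "chr1", 101), in_roi(roi, "chr1", 301))
```

Merging intervals on load keeps the lookup to a single binary search per position, even when the BED file contains overlapping regions.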
Research Genome
None
Whole Genome
Exome
Custom Panel
A BED file covering a wide range of genomic regions. It contains:
"RefSeq ALL" transcripts and "GENCODE" full gene regions with 5 Kbp upstream and 5 Kbp downstream
Within this range, all “Clinical Regions” are included
All dosage regions (HI/TS sig level 1, 2 or 3)
Moreover, liftover versions of both reference regions were included, for the current and previous range versions.
Sources:
Liftover done using CrossMap (v0.5.2), chain hg19ToHg38.over.chain.gz
NCBI RefSeq regions are based on the release 105 (hg19) and 110 (hg38)
Gencode regions are based on the release V19 (hg19) and V41 (hg38)
All microRNA genes based on HGNC miRNA definition December 2022
ClinGen Dosage region Dec 2022
Promoters from EPDnew human version V6
mtDNA CRS
RNA disease genes based on OMIM and HGNC (Dec 2022): ATXN8OS, TERC, IL12A-AS1, FAAHP1, NUTM2B-AS1, GAS8-AS1, RNU12, MIR204, IGHG2, SLC7A2-IT1, MIR99A, RMRP, XIST, MEG3, DIRC3, MIR17HG, GNAS-AS1, LRTOMT, LINC00299, DUX4L1, MIR137, MIR140, MIR605, SNORD118, RNU4ATAC, HELLPAR, IGHG1, IGHM, MIR19B1, RNU7-1, LINC00237, MIR2861, MIR4718, IGHV3-21, IGHV4-34, IGKC, KCNQ1OT1, MIR184, MIR96, H19, HYMAI, PCDHA9, UGT1A1, AFG3L2P1, DISC2, SNORA31, TRU-TCA1-1, PCDHGA4, TRAC, ECEL1P3, MIAT
ClinVar variants (ClinVar Dec 2022) with any pathogenic or likely pathogenic significance (and some drug responses that are affiliated with pathogenicity)
50K STR regions based on the DRAGEN 4.0 specification file
GRC38_full_genes | 37793 | 2200286025
GRC37_full_genes | 35776 | 2368701647
This is a BED file that includes every clinically relevant region. The following are included:
“RefSeq Curated” and “GENCODE” regions with flanking areas of 50 bp on each side
5'UTR and 3'UTR regions for protein-coding genes (based on RefSeq)
OMIM disease-related RNA genes (flanking 50 bp)
All ClinVar Pathogenic variant regions (flanking 50 bp)
Promoter regions (EPDnew human version 006, flanking 50 bp)
Known STR regions (DRAGEN 4.0 specification file)
All microRNA genes (flanking 50 bp, based on HGNC)
Full mtDNA region
For consistency, the GRCh38 version includes the lifted-over regions of GRCh37 (liftover using CrossMap).
GRC38_clinical_regions | 237652 | 121694892
GRC37_clinical_regions | 230619 | 119594638
By using the multi-selection mode, you can speed up review by applying actions to multiple variants at once. This makes it easier to apply tags, assign pathogenicity, and review status without handling each variant individually.
Hover over a variant to reveal a checkbox at the start of the line.
Selecting a checkbox activates the Multi-select actions bar, replacing the Search bar.
Checkboxes appear for all variants in the current view.
Select variants individually or use the Select all checkbox.
Important: Select all applies only to variants currently displayed on the page, not all variants matching your filters.
Change the review status for multiple variants at once:
Select the variants of interest.
In the Multi-select actions bar, click the Viewed icon.
Choose Viewed or Un-viewed.
Assign or manage tags across multiple variants at once:
Assign tags
Select the variants of interest.
Click the Tag icon in the Multi-select actions bar.
Choose a tag from the dropdown menu.
If a tag is already assigned to at least one of the selected variants, a confirmation window appears showing the currently assigned tag and the new tag for each variant.
When multiple tags per variant are enabled, new tags are always added alongside existing tags, and no confirmation window appears.
All tags assigned to a variant are displayed in the Tag column.
If both user tags and AI tags exist, only user tags appear in the column.
Remove tags
Select the variants of interest.
Click the Tag icon.
Choose one of the following:
Clear – removes user-assigned tags.
Not relevant – removes tags assigned by Emedgene.
Set or remove pathogenicity classifications for multiple variants:
Assign pathogenicity
Select the variants of interest.
In the Multi-select actions bar, click the Pathogenicity icon.
Choose a classification (Pathogenic, Likely Pathogenic, VUS, Likely Benign, Benign).
Clear pathogenicity
Select the variants of interest.
In the Multi-select actions bar, click the Pathogenicity icon.
Select Clear to remove the existing classification.
Tips:
Use bulk tagging to ensure consistency across variants that share biological or interpretive context.
Combine tags and pathogenicity assignments in bulk to streamline reporting prep.
For large variant sets, apply filters first to narrow down your view before using multi-selection.
Warnings:
Bulk edits apply immediately to all selected variants. Double-check your selections before applying.
Removing tags or pathogenicity affects all team members working on the case—coordinate with your team to avoid conflicts.
Once a case is finalized, bulk actions are disabled to protect case integrity.











Bring Your Own Key (BYOK) is a security feature that allows organizations to use their own encryption keys to protect their data. This ensures that they maintain control over their encryption keys and, consequently, their data.
Illumina integrates with leading Key Management Services (KMS), including Azure Key Vault and AWS KMS, so organizations can maintain full control over their encryption keys. These integrations combine Illumina’s Bring Your Own Key (BYOK) feature with your preferred KMS provider to deliver robust key management and enhanced data security.
Azure Key Vault is a cloud service that provides a secure way to store and manage sensitive information like API keys, passwords, and certificates. It offers robust features for key management, including key generation, storage, and lifecycle management.
AWS Key Management Service (KMS) allows you to create and control encryption keys used to encrypt your data across a wide range of AWS services and applications. It provides centralized management of encryption keys and integrates seamlessly with other AWS services.
Losing the encryption key means that all data encrypted with that key will be inaccessible. This can lead to permanent loss of access to crucial information.
It is crucial to securely store and manage your keys to prevent such risks.
The API server encrypts the organization's information before storing it in the database and decrypts it when needed (e.g., during pipeline execution). The key vault is managed by the organization.
To configure encryption in Emedgene, you need the following information from Azure Key Vault:
Application tokens:
Client Id
Tenant Id
Client Secret
The key information:
Key URL
Navigate to App registrations
Click Register to create a new application and fill in the required details
After registration, copy and save the Application (Client) ID and Directory (Tenant) ID
In the left menu, select Certificates & Secrets
Click New client secret. Copy and save the Value (Client Secret) immediately, as it is shown only once.
Click New Key (Create key vault)
Specify the key vault name, region (for example, East US), and pricing tier
Click Next to go to Access Policies
Select Add access policy, and set Key permissions:
Key Management Operations
Cryptographic Operations: Decrypt, Encrypt, Unwrap Key, Wrap Key
Set Secret permissions:
Secret Permission: Get
Select Principal: select the application you created earlier
Finish with Review + create
Navigate to the newly created Key vault
In the left menu, select Keys, and then select the key
Select the current version
Copy the Key Identifier (Key URL):
https://<key-vault-name>.vault.azure.net/keys/<key-name>/<key-version>
Description is coming soon.
The API server will encrypt the client's information before storing it in a database and decrypt that information when needed (e.g., running the pipeline). The key vault is managed by the client, and Emedgene will only be provided with access to encrypt/decrypt functions in that key vault. This guarantees that clients control access to the information.
Illustration of data flow when creating a case in Emedgene platform:
Illustration of data flow when reading case data from the Emedgene platform:
A preliminary step to this solution is having a key vault owned by the client, and a key that Emedgene is given access to.
The client will create an access policy in the key vault of type “Application” and provide the matching key and secret to Emedgene. The access policy must contain permissions to perform encrypt and decrypt actions.
In order for Emedgene to integrate with the key, depending on the key vault provider, the client needs to provide the following information:
Client Id
Client Secret
Tenant Id
Key vault name
Key name
Since some of our platform search capabilities run directly on the DB, we can’t directly search any data that is encrypted. To overcome this, we will implement a hashing search functionality as follows.
The case data will still be fully encrypted in the DB as it is today
For specific fields that the customer defines as “searchable”, we will save their hash value alongside the encrypted data.
Hashing will be done using SHA-256, and will include a secure random generated salt of 32 characters, which will be added to the value.
The salt is unique and will not be used anywhere else in the platform.
When the user enters a string to search, we will hash that value using all the salt values, and search those hash values.
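The hashing search described above can be sketched as follows. This is an illustration under stated assumptions, not Emedgene's implementation: the function names are hypothetical, the salt is generated as 32 hexadecimal characters, and the salt is prepended to the value (the text only says it is "added to the value").

```python
import hashlib
import secrets


def new_salt() -> str:
    """Return a securely generated random salt of 32 characters (hex; assumption)."""
    return secrets.token_hex(16)


def hash_value(value: str, salt: str) -> str:
    """SHA-256 of salt + value. The concatenation order is an assumption."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def search_hashes(query: str, salts: list[str]) -> set[str]:
    """Hash the search string with every known salt, as described above."""
    return {hash_value(query, salt) for salt in salts}


# Storing: keep the hash next to the encrypted field.
salt = new_salt()
stored_hash = hash_value("NA24385", salt)

# Searching: a stored hash matches if it is among the candidate hashes.
candidates = search_hashes("NA24385", [salt])
print(stored_hash in candidates)
```

Because only hashes are compared, this kind of search finds exact matches only; a partial string produces a different hash.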
Illustration of data flow when searching in the Emedgene platform:
Illustration of data flow when creating a case with a searchable field in the Emedgene platform:
The Genes coverage section helps you quickly identify parts of genes that may not have been adequately sequenced in your case. This insight is particularly important when assessing sequencing quality, interpreting uncertain findings, or deciding if further validation is needed.
While variant callers provide base-by-base coverage, Emedgene simplifies the view by showing average coverage per region. This makes it easier for you to spot undercovered genes at a glance, even when individual positions may appear sufficiently covered. By smoothing out local fluctuations, average coverage helps you prioritize regions that might require further review and complements DRAGEN's fine-grained metrics with a broader, more interpretable view.
Coverage metrics are generated differently depending on the type of input data used for your case:
1. FASTQ / BAM or gVCF-based cases
If your case was started from FASTQ/BAM or gVCF, coverage is inferred from gVCF reference blocks (also called GVCFBlocks).
These blocks are segments of the genome where genotype quality (GQ) is consistent.
A new block is created whenever there's a significant change in GQ, which results in a highly segmented and detailed representation of local sequencing quality.
Coverage for a region is based on the median coverage of each gVCF block. If a region spans multiple blocks, the reported value is the average of those medians.
Within a region (like an exon), you’ll often see multiple blocks. Emedgene aggregates them to show you:
Average depth
Minimum depth
Depth range
Occasionally, some blocks may be unusually large and may miss internal variation—for example, in genes like XIAP, one block could span an entire region despite having uneven coverage inside.
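The block-based aggregation described above can be illustrated with a short sketch. It assumes each gVCF block contributes one median depth value and that "depth range" means the lowest and highest of those values; the function name is hypothetical.

```python
from statistics import mean


def summarize_region(block_median_depths):
    """Aggregate per-block median depths for one region (for example, an exon)."""
    if not block_median_depths:
        raise ValueError("region has no gVCF blocks")
    lowest = min(block_median_depths)
    highest = max(block_median_depths)
    return {
        "average_depth": mean(block_median_depths),  # average of the block medians
        "minimum_depth": lowest,                     # lowest block value, not lowest base
        "depth_range": (lowest, highest),            # assumption: reported as (min, max)
    }


print(summarize_region([30, 12, 48]))
```

Note that a single large block contributes one value regardless of how uneven the coverage inside it is, which is the limitation mentioned above.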
2. VCF + BAM or VCF + BED-based Cases
If your case includes VCF and BAM or VCF and BED, coverage is calculated directly from the aligned reads or from predefined BED intervals.
Coverage is calculated as the true base-by-base average across the entire region.
This method avoids the variability of gVCF segmentation and gives a precise coverage profile for each region.
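To see why the two calculation methods can disagree, consider this small sketch. The depth values are made up for illustration, and the function names are hypothetical; it contrasts a true base-by-base average with an average of block medians over the same bases.

```python
from statistics import mean, median


def per_base_average(depths):
    """True base-by-base average across the region."""
    return sum(depths) / len(depths)


def average_of_block_medians(blocks):
    """Average of the median depth of each block, ignoring block length."""
    return mean(median(block) for block in blocks)


# One long low-coverage block and one short high-coverage block (illustrative values).
blocks = [[10] * 8, [40] * 2]
all_bases = [depth for block in blocks for depth in block]

print(per_base_average(all_bases))       # 16.0
print(average_of_block_medians(blocks))  # 25.0
```

The block-based value is higher here because the short block counts as much as the long one.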
Tip: Before comparing coverage values across cases, check whether the case was processed from FASTQ/gVCF or VCF with alignment files (BAM/CRAM). The calculation method differs, so values may not be directly comparable.
Limitation: Coverage estimation is not supported for VCF + CRAM cases. If a CRAM file is used with a VCF file, as opposed to a BAM or a BED file, the Genes Coverage table will remain empty for virtual panel cases.
Coverage is compared against expected regions defined in:
Emedgene's reference BED file, or
Your test’s custom KIT BED file
Each region is defined by:
Chromosome
Start & end positions
Name and strand (Optional)
Emedgene uses bedtools intersect to compare each expected region against actual read coverage. The system captures:
How much of the region overlaps with sequenced data
Depth of coverage per segment
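The overlap idea can be illustrated without bedtools. The sketch below is a pure-Python stand-in, not the bedtools implementation: it assumes BED-style half-open coordinates, covered intervals that do not overlap each other, and hypothetical function names.

```python
def covered_bases(region, intervals):
    """Count bases of `region` that fall inside any covered interval.

    region and intervals are (chromosome, start, end) tuples with
    half-open coordinates; intervals are assumed not to overlap each other.
    """
    chrom, start, end = region
    total = 0
    for i_chrom, i_start, i_end in intervals:
        if i_chrom != chrom:
            continue
        overlap = min(end, i_end) - max(start, i_start)
        if overlap > 0:
            total += overlap
    return total


def covered_fraction(region, intervals):
    """Fraction of the expected region that overlaps sequenced data."""
    _, start, end = region
    return covered_bases(region, intervals) / (end - start)


expected = ("chr1", 100, 200)
sequenced = [("chr1", 50, 120), ("chr1", 150, 180), ("chr2", 100, 200)]
print(covered_bases(expected, sequenced))     # 50
print(covered_fraction(expected, sequenced))  # 0.5
```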
Each region includes these metrics:
Warning:
For FASTQ / BAM / gVCF-based cases, Minimum depth is not the true per-base minimum. It is the lowest average depth among the gVCF blocks in the region.
You can interactively explore gene-level coverage details using the Genes with Insufficient Coverage tab. This tool is currently available only for FASTQ-based cases.
Here’s what you can do:
Search for a specific gene or a list of genes.
Filter results based on coverage thresholds:
≤0x
≤5x
≤10x
≤20x
or All
Download tables or genomic coordinates for regions with poor coverage.
Click More details to open a pop-up with exact genomic coordinates of low-coverage blocks.
To check coverage for a gene:
Enter the gene symbol in the search box and select it.
Choose your desired coverage filter from the dropdown.
Review the results in the table or download the data.
Click More details to inspect the specific coordinates of undercovered regions.
Click the Add Gene List button and select any of your pre-loaded gene lists.
By maximum depth of coverage
Select Coverage, then choose the highest allowable coverage value from the dropdown list.
By percentage of bases covered >20×
Select % of Bases Gt20, then choose the highest allowable percentage from the dropdown list.
Visual review in IGV allows manual variant confirmation by inspecting aligned reads at specific genomic regions.
Click on More details in the row corresponding to the gene of interest. This opens a pop-up with coverage details for the selected gene.
In the pop-up, select View on IGV to open the region in the IGV desktop application.
Click the Download button to export the full list of low-coverage regions as a *_insufficient_regions.tsv file.
Each row corresponds to one region and includes:
Region coordinates
Calculated coverage metrics
Region length
Use this file to:
Compare multiple cases
Track sequencing gaps
Plan confirmatory testing
Share results with collaborators
| Metric | Description |
| --- | --- |
| Min Depth | Lowest depth in the region (for gVCF-based cases: lowest average depth in a block) |
| Max Depth | Highest depth observed |
| Average Coverage | Mean read depth across the region |
| % ≥3× | Percent of base pairs with at least 3x coverage |
| % ≥20× | Percent of base pairs with at least 20x coverage |
| Length | Region length in base pairs |
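For the base-by-base case, the metrics above can be reproduced from a list of per-base depths. This is a minimal sketch with hypothetical names and made-up depth values; for gVCF-based cases the minimum would instead come from block averages, as noted in the warning.

```python
def region_metrics(depths):
    """Compute the per-region metrics from per-base read depths."""
    if not depths:
        raise ValueError("region has no bases")
    n = len(depths)
    return {
        "min_depth": min(depths),
        "max_depth": max(depths),
        "average_coverage": sum(depths) / n,
        "pct_ge_3x": 100.0 * sum(d >= 3 for d in depths) / n,
        "pct_ge_20x": 100.0 * sum(d >= 20 for d in depths) / n,
        "length": n,
    }


metrics = region_metrics([0, 2, 5, 20, 25, 30, 30, 40, 18, 30])
print(metrics)
```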





The following are the general format requirements for a CSV file used to create multiple cases:
The file must have a .csv extension.
The file must contain a [Data] header.
The row after [Data] header must include the field names identifying the data in each column. The column names are case-sensitive.
Each row after the column name row represents a sample.
Each column represents a data field.
It is essential that there are no empty rows between the [Data] header and the last sample row.
A file can contain at most 50 cases.
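The general format rules above can be checked before upload with a small script. This is a sketch under stated assumptions, not an Emedgene tool: the function name is hypothetical, a row containing only commas is treated as empty, and cases are counted as distinct Family Id values.

```python
import csv
import io

MAX_CASES = 50  # limit stated in the format requirements


def read_batch_csv(filename: str, text: str) -> list[dict]:
    """Validate the general layout of a batch CSV and return its sample rows."""
    if not filename.endswith(".csv"):
        raise ValueError("file must have a .csv extension")
    lines = text.splitlines()
    data_index = next(
        (i for i, line in enumerate(lines) if line.split(",")[0].strip() == "[Data]"),
        None,
    )
    if data_index is None:
        raise ValueError("missing [Data] header")
    body = lines[data_index + 1:]
    # Blank lines after the last sample row are tolerated; blank rows before it are not.
    while body and not body[-1].strip(", \t"):
        body.pop()
    if len(body) < 2:
        raise ValueError("expected a column name row and at least one sample row")
    if any(not line.strip(", \t") for line in body):
        raise ValueError("empty row between [Data] and the last sample row")
    rows = list(csv.DictReader(io.StringIO("\n".join(body))))
    # Assumption: one case per distinct Family Id.
    if len({row.get("Family Id") for row in rows}) > MAX_CASES:
        raise ValueError(f"more than {MAX_CASES} cases in one file")
    return rows


example = "[Data]\nFamily Id,Case Type,Phenotypes Id\nRM8392,Whole Genome,HP:0007686;\n"
print(read_batch_csv("batch.csv", example))
```

Column names are matched exactly as written, in line with the case-sensitivity rule above.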
Must be present in the sample table at all times.
Case Type;
Family Id;
Phenotypes OR Phenotypes Id.
If these fields are left empty, it will result in the creation of an empty sample.
BioSample Name;
Files Names;
Storage Provider Id;
This field is mandatory if Files Names is empty:
Sample Type.
This field is required if the "auto" option is used for Files Names (only relevant for BSSH):
Default Project.
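The mandatory and conditionally mandatory rules can be expressed as a per-row check. The sketch below is illustrative only: the function name is hypothetical, it applies the "only considered for proband" notes from the field descriptions on this page, and it compares Relation case-insensitively, which is an assumption.

```python
def check_sample_row(row: dict) -> list[str]:
    """Return a list of problems found in one sample row (empty list if none)."""

    def get(name: str) -> str:
        return (row.get(name) or "").strip()

    problems = []
    # Relation defaults to "proband" when left empty.
    is_proband = (get("Relation") or "proband").lower() == "proband"

    if not get("Family Id"):
        problems.append("Family Id is mandatory")
    if is_proband and not get("Case Type"):
        problems.append("Case Type is mandatory for the proband")
    if is_proband and not (get("Phenotypes") or get("Phenotypes Id")):
        problems.append("Phenotypes or Phenotypes Id is mandatory for the proband")

    files = get("Files Names")
    if files:
        if not get("Storage Provider Id"):
            problems.append("Storage Provider Id is required when Files Names is set")
        if files == "auto" and not get("Default Project"):
            problems.append("Default Project is required when Files Names is auto")
    elif is_proband and not get("Sample Type"):
        problems.append("Sample Type is required when Files Names is empty")
    return problems


row = {
    "Family Id": "RM8392",
    "Case Type": "Whole Genome",
    "Phenotypes Id": "HP:0007686;",
    "Files Names": "auto",
    "Storage Provider Id": "208",
}
print(check_sample_row(row))
```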
The sample table may include these supported optional columns.
Boost Genes
Clinical Notes
Date Of Birth
Due Date
Execute now
Gender. See an important note
Gene List Id
Kit Id
Intersect Bed Id (38.0+)
Label Id
Opt In
Relation
Selected Preset
Visualization Files
The sample table may contain custom columns to suit your specific needs and include any relevant information that is important for your workflow.
Each custom field must be assigned a unique name without spaces. Data from custom columns is saved per case under the Additional information section of Case Info.
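| Column | Accepted values | Requirement | Example |
| --- | --- | --- | --- |
| Institution | Free text | Custom | GenoMed Solutions |
| Sample_Received_Date | Free text | Custom | 24-02-2022 |
| Sample_Type | Free text | Custom | Amniotic Fluid |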
Mandatory (highlighted in red), Conditionally mandatory (highlighted in orange), and Optional fields should be filled in according to the following rules.
| Column | Accepted values | Requirement and notes | Example |
| --- | --- | --- | --- |
| BioSample Name | Free text | Conditionally mandatory. An empty sample will be created if the field is left blank. | NA24385 |
| Boost Genes | 1. "TRUE" 2. "FALSE" | Optional. Indicates whether Boost Genes will be used. "TRUE" means that variants in the targeted genes will receive upgraded scores during prioritization by the AI Shortlist algorithm. Default value is "FALSE". Only considered for proband. | TRUE |
| Case Type | 1. "Whole Genome" 2. "Exome" 3. "Custom Panel" 4. "Array" 5. Custom case type | Mandatory. Only considered for proband. | Whole Genome |
| Clinical Notes | Free text | Optional | A 14-year-old boy with a visual acuity of 20/200 in both eyes in whom hearing loss was first noted at 5 years of age on routine screening; audiometry revealed sensorineural hearing loss. |
| Date Of Birth | Date "YYYY-MM-DD" | Optional | 2013-01-22 |
| Default Project | Free text | Conditionally mandatory. Must be filled in if the "auto" option is used for Files Names (only relevant for BSSH). | GIAB |
| Due Date | Date "YYYY-MM-DD" | Optional | 2023-05-03 |
| Execute now | 1. "TRUE" 2. "FALSE" | Optional. Default value is "TRUE". Use "FALSE" if you don’t want to run the case upon uploading the file. Only considered for proband. | FALSE |
| Family Id | Free text | Mandatory | RM8392 |
| Files Names | 1. Semicolon-separated list of paths to .fastq, .fastq.gz, .vcf, .vcf.gz, .bam, .cram, .gt_sample_summary.json, .annotated_cyto.json files without spaces 2. "existing" 3. "auto" (BSSH) | Conditionally mandatory. An empty sample will be created if the field is left blank. The "existing" option automatically locates FASTQ files based on the BioSample Name. Note: If data files for an existing case were sourced from the customer’s external bucket and later removed, attempting to create a case from those files will result in an error. Learn about the . With the "auto" option, BSSH users can automatically locate FASTQ files based on the BioSample Name and Default Project provided. When using BSSH without the "auto" option, ensure that your file path is . | /GIAB_cases/1/NA24385.dragen.hard-filtered.gvcf.gz;/QA_cases/Other/NA24385.dragen.cnv.vcf.gz;/QA_cases/Other/NA24385.dragen.repeats.vcf; |
| Gender | 1. "F" 2. "M" 3. "U" | Optional. Default value is "U". See an important note. | M |
| Gene List Id | Integer | Optional. Must be the id of a previously defined Gene List. Only considered for proband. | 12345 |
| Kit Id | Integer | Optional. <38.0: ID of a Region of interest BED. 38.0+: ID of a Coverage BED. Must be the id of a previously defined kit. Only considered for proband. | 23456 |
| Intersect Bed Id (38.0+) | Integer | Optional. ID of a Region of interest BED. Must be the id of a previously defined kit. Only considered for proband. | 78957 |
| Label Id | Integer | Optional. Must be the id of a previously defined Case Label. Only considered for proband. | 34567 |
| Opt In | 1. "TRUE" 2. "FALSE" | Optional. Indicates whether the case subject consented to the with your network(s). Default value is "TRUE". | FALSE |
| Phenotypes | Semicolon-separated list of HPO phenotype terms. "Unaffected" is used for non-affected family members. | Mandatory for proband sample if Phenotypes Id is empty. List must be under 100. It is possible to include non-HPO terms if Phenotypes Id is empty. | Abnormal pupillary function;Orthotopic os odontoideum; |
| Phenotypes Id | Semicolon-separated list of HPO phenotype IDs | Mandatory for proband sample if Phenotypes is empty. List must be under 100. | HP:0007686;HP:0025375; |
| Relation | 1. "proband" 2. "mother" 3. "father" 4. "sibling" | Optional. Default value is "proband". Values "proband", "father", "mother" can be used only once per Family ID. One sample with Relation "proband" is required per Family ID. | Mother |
| Sample Type | 1. "FASTQ" 2. "VCF" | Conditionally mandatory. Required if Files Names is empty. Only considered for proband. | FASTQ |
| Selected Preset | 1. Free text 2. "Default" | Optional. Must be the name of a previously defined Preset. If set to "Default", the default Preset will be applied. If left empty, no Preset will be applied. | High quality candidates |
| Storage Provider Id | Integer | Conditionally mandatory. Required if Files Names is not empty. Must be from the configured storage provider ID list. | 208 |
| Visualization Files | Semicolon-separated list of paths to sequence alignment data files of extension .bam, .cram; .tn.bw, .baf.bw, .roh.bed, .lrr.bedgraph, .baf.bedgraph | Optional | /giab_project/NA24385.bam |
When a sample is user-assigned "Unknown" sex, the system assumes "Female". This affects CNV interpretation on sex chromosomes in case the genetic sex is actually male:
Chromosome X: CN = 2 is considered reference (REF) for a female genome, so CNVs with two copies are hidden by default. This may cause chromosome X duplications to be missed.
Chromosome Y: CN = 0 is considered reference (REF) for a female genome, so CNVs with zero copies are hidden by default. This may cause chromosome Y deletions to be missed.
To include these variants in the analysis, enable the Include Reference Homozygosity and No Coverage Calls toggle in Workbench & Pipeline Settings.
For BSSH, it is necessary to use the actual names (numbers):
/projects/3824821/appresults/2319318/files/119675608
instead of aliases:
/projects/ABC_DEF_2022-12-22_DEv395/appresults/ABC-GM58342-def/files/ABC-GM58342-def.hard-filtered.vcf.gz
In version 37, we introduced an enhancement to the batch upload process that allows you to provide a human-readable path in your batch CSV for BSSH files.
When a batch CSV includes a human-readable path, the system performs the following validations for paths in BSSH storage:
Single File in the Path:
If the provided path contains exactly one file or dataset, the batch upload proceeds successfully.
Two Files in the Path:
If the path contains two files with the same name (for example, two pairs of FASTQs in a dataset), the system will:
Select the dataset marked as QCPassed.
Fail the batch upload if both datasets are marked as QCPassed, as this indicates conflicting data.
More Than Two Files in the Path:
If the path contains more than two files or datasets, the system fails the batch upload, as the path is considered ambiguous or invalid.
Multiple QCPassed Datasets: If two datasets in the same path are marked as QCPassed, the batch upload will fail with a descriptive error indicating the conflict.
Excessive Files in the Path: If more than two files are found for the provided path, the batch upload will fail, instructing the user to provide a more specific or valid path.
Enables customers to use intuitive, human-readable paths in their workflows.
Automatically handles dataset selection based on quality control status.
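The validation rules for human-readable BSSH paths can be summarized in a short decision function. This is a sketch of the rules as written on this page, not the platform's code: the `Dataset` type and function name are hypothetical, and the case where neither of two datasets is marked QCPassed is not described above, so the sketch treats it as an error.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    qc_passed: bool  # True if the dataset is marked as QCPassed


def select_dataset(candidates: list[Dataset]) -> Dataset:
    """Pick the dataset a human-readable path resolves to, or raise on ambiguity."""
    if not candidates:
        raise ValueError("no file or dataset found for the path")
    if len(candidates) == 1:
        return candidates[0]  # single match: upload proceeds
    if len(candidates) > 2:
        raise ValueError("more than two matches: provide a more specific path")
    passed = [d for d in candidates if d.qc_passed]
    if len(passed) == 2:
        raise ValueError("both datasets are QCPassed: conflicting data")
    if not passed:
        # Not covered by the rules above; treated as an error in this sketch.
        raise ValueError("neither dataset is QCPassed")
    return passed[0]


print(select_dataset([Dataset("run1", False), Dataset("run2", True)]).name)  # run2
```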
Emedgene provides the tightest integration with DRAGEN for germline variation analysis, providing accuracy, comprehensiveness, and efficiency, spanning variant calling through interpretation and report generation.
The Emedgene platform supports a variety of variant callers and applies specific quality parameters for each. The quality assessment is an essential step in the Emedgene pipeline because variants with low quality will not be considered by the AI components.
If the variant caller is not supported or not recognized, a default quality function will be applied. The default parameters are built on GT (genotype), depth (DP) and allele bias (AB). These fields are mandatory, and their absence will induce “Low quality” for all variants.
The following variant callers are currently supported by the Emedgene pipeline, provided that a header containing the variant caller command line is present in the VCF headers.
Additional callers can be supported on demand under license.
100.39.0+
SNV, CNV, STR, SV (del/dup/ins), Targeted, MRJD, Ploidy
36.0+
SNV, CNV, STR, SV (del/dup/ins), Targeted, MRJD, JSON PGx*
All
SNV, CNV, STR, SV (del/dup/ins), SMN, JSON PGx*
4.2
All
SNV, CNV, STR, SV (del/dup/ins), SMN
4.0
All
SNV, CNV, STR, SV (del/dup/ins)
3.10
All
SNV, CNV, STR, SV (del/dup/ins)
3.6-3.9
All
SNV
1.3
100.39.0+
Cyto
1.2
37.0+
Cyto
AED CNV
N/A
Affymetrix Extensible Data, converted to VCF
CNVReadDepth
5.12, 5.20
SmallVariant
N/A
SmallVariant
1.38
CNVReadDepth
Clair3
v37.0+
SmallVariant
N/A
SmallVariant
ClinSV
N/A
SVSplitEnd
N/A
CNVReadDepth
CNVReporter
0.01
CNVReadDepth
1.0
CNVReadDepth
N/A
CNVReadDepth
cuteSV
2.02
v37.0+
SVSplitEnd
Multi-Sample Viewer:1.0.0.71
Unknown
1.0.0
SmallVariant
N/A
SVSplitEnd
0.1
CNVReadDepth
ExomeDepthAM
0.1
Private fork of ExomeDepth
CNVReadDepth
N/A
SmallVariant
3, 3.4, 3.5, 2014, 4, 4.1
SmallVariant
GATK Mutect
N/A
SmallVariant
Scramble
Running: scramble2vcf.pl
SmallVariant
1.4
SmallVariant
4.x, 5.x and not: 5.12, 5.20
SmallVariant
IONTorrent CNV
5.16
CNVReadDepth
2.2.0
SVSplitEnd
N/A
SmallVariant
2.X
SmallVariant
2.1.1
SVSplitEnd
2.2.4
SmallVariant
2.2.4
SVSplitEnd
2.X
SVSplitEnd
5.2.9
SmallVariant
201808, 201911, 202010
SmallVariant
201808.03
SmallVariant
2.0.6, 2.0.7, 2.5
SVSplitEnd
0.0.2
SmallVariant
2.0.1
CNVReadDepth
Spectre
v37.0+
CNVReadDepth
2.4.5
SmallVariant
N/A
SmallVariant
N/A
SVSplitEnd
The variant table displays all variants identified in your case, along with key annotations, quality metrics, pathogenicity data, and interpretation details. Each column provides specific information to help you review and prioritize variants effectively.
This guide explains the meaning of each column for proband analysis and trio analysis, with sorting features and scoring details.
Variant details
Displays genomic coordinates and basic variant identifiers.
SNV/Indel: Genomic position, nucleotide change, and dbSNP ID
CNV/SV: Genomic coordinates and variant size
Allows sorting by genomic start location.
Gene
Gene identifier.
SNV/Indel/single-gene CNV: An HGNC-approved gene symbol
Multi-gene CNVs: A list of HGNC-approved gene symbols and the number of genes included if only part of the list is shown.
Tip: If only the beginning of the list is displayed in the table, you can see the full gene list in the pop-up tooltip.
Variant type
Specifies whether the variant is SNV, Indel, CNV, SV, STR, or other.
Allows alphabetical sorting.
Main effect
Predicted effect(s) of the variant on protein structure and function (transcript-specific). By default the most severe effect is presented.
Allows alphabetical sorting.
Disease
Lists the count of disease associations, mode(s) of inheritance, and the name of one of the diseases.
Tip: Hover over the line to see the full disease list in a pop-up window.
Allows alphabetical sorting.
Tag
Variant tag assigned by Emedgene or by a user.
Allows alphabetical sorting.
Known variants
Classification(s) of the variant in ClinVar and your curated variant database.
Allows alphabetical sorting.
Variant notes
Indicates if Variant interpretation notes are available.
Allows alphabetical sorting.
AI rank
Indicates potential causative variants: Most Likely Candidates and Candidates.
Variants with identical scores share the same rank. Ranges from 1 to 220; lower numbers indicate higher rank.
Case reanalysis causes AI ranks to be recalculated.
Allows numerical sorting.
Phenomatch score
Proprietary phenotypic match score ranging from 0 to 1.
Case reanalysis causes Phenomatch score to be recalculated.
Allows numerical sorting.
PhenomeId score
Proprietary phenotypic match score outperforming previous Phenomatch models. Ranges from 0 to 2. A score of 0 means no match, a score above 0.15 suggests a moderate match, and scores above 0.7 indicate a high phenotypic match.
Case reanalysis causes PhenomeId score to be recalculated.
Allows numerical sorting.
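The PhenomeId thresholds above can be read as simple bands. The sketch below is only an illustration of those thresholds: the function name is hypothetical, and the label for scores above 0 but not above 0.15 is an assumption because the text does not name that band.

```python
def phenomeid_band(score: float) -> str:
    """Map a PhenomeId score to the bands described above."""
    if score < 0:
        raise ValueError("score cannot be negative")
    if score == 0:
        return "no match"
    if score > 0.7:
        return "high match"
    if score > 0.15:
        return "moderate match"
    return "below moderate threshold"  # band not named in the text (assumption)


for value in (0, 0.1, 0.3, 0.9):
    print(value, phenomeid_band(value))
```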
Proband quality
Overall score in proband.
SNV/Indel: Based on base quality, depth, mapping quality, and genotype quality
CNV: Based on CNV quality, size, and bin count
Allows alphabetical sorting.
Depth
Variant depth in proband.
SNV/Indel: Sequencing depth of coverage at the variant position
CNV: Depth of coverage across the CNV region
Allows numerical sorting.
Alternate read
Number of alternate reads.
Available only for SNVs.
Allows numerical sorting.
Allele bias
Percentage of reads that include an alternate allele out of all reads.
Available only for SNVs.
Allows numerical sorting.
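As a worked example of the Allele bias definition, the following sketch computes the percentage of reads carrying the alternate allele. The function name and the read counts are illustrative assumptions.

```python
def allele_bias_percent(alternate_reads: int, total_reads: int) -> float:
    """Percentage of reads that include the alternate allele out of all reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alternate_reads <= total_reads:
        raise ValueError("alternate_reads must be between 0 and total_reads")
    return 100.0 * alternate_reads / total_reads


print(allele_bias_percent(12, 40))  # 30.0
```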
Bin count
Number of bins supporting CNV detection.
Allows numerical sorting.
Allele freq
Indicates variant frequency category according to the highest allele frequency in public population frequency databases:
Private: 0
Rare: <0.01
Low Frequency: 0.01-0.05
Polymorphism: >0.05
Allows alphabetical sorting.
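The frequency categories above can be expressed as a small classifier. This is a sketch, assuming the thresholds are fractions (0.01 meaning 1%) and that the boundary value 0.05 falls in Low Frequency; the function name is hypothetical.

```python
def allele_frequency_category(highest_af: float) -> str:
    """Categorize a variant by its highest population allele frequency (as a fraction)."""
    if not 0 <= highest_af <= 1:
        raise ValueError("allele frequency must be between 0 and 1")
    if highest_af == 0:
        return "Private"
    if highest_af < 0.01:
        return "Rare"
    if highest_af <= 0.05:
        return "Low Frequency"
    return "Polymorphism"


for af in (0, 0.002, 0.03, 0.2):
    print(af, allele_frequency_category(af))
```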
Emedgene DB frequency (%)
Variant frequency in the Emedgene internal control database.
Allows numerical sorting.
Emedgene DB frequency (#)
Variant allele count in the Emedgene internal control database.
Allows numerical sorting.
gnomAD All AF
Overall alternative allele frequency across gnomAD populations (also called Total AF in the ).
Allows numerical sorting.
gnomAD allele count
Number of observed alternate alleles in gnomAD dataset.
Allows numerical sorting.
gnomAD Hom/Hemi
Number of gnomAD subjects who are homozygous (autosomal or X-linked variant in a female) or hemizygous (X-linked variant in a male) for this variant.
Max AF (%)
The highest alternative allele frequency among all public population databases.
Note: Not to be confused with Max AF in that only considers gnomAD statistics.
Allows numerical sorting.
Max AF (#)
The highest alternative allele count among all public population databases.
Note: Not to be confused with Max AF in that only considers gnomAD statistics.
Allows numerical sorting.
Historic AF (%)
Variant frequency in the organization’s Historic database, containing results of previously analyzed internal cases.
SNV/Indel: Percentage of samples carrying the variant
CNV/SV: Percentage of cases containing overlapping CNV/SV events, based on internal historic calls
Allows numerical sorting.
Historic AF (#)
Variant allele count in the organization’s Historic database.
SNV/Indel: Number of probands with the same variant.
CNV/SV: Number of cases with overlapping CNV/SV regions.
Allows numerical sorting.
Prediction
Summarized in silico pathogenicity prediction score.
Tip: You can glance at the underlying scores in a pop-up tooltip.
Allows alphabetical sorting.
Conservation
Summarized nucleotide conservation score.
Tip: You can glance at the underlying scores in a pop-up tooltip.
Allows alphabetical sorting.
Splice prediction
Summarized splicing impact prediction score.
Tip: You can glance at the underlying scores in a pop-up tooltip.
Allows alphabetical sorting.
Coding change
HGVS-compliant coding sequence change notation.
Allows alphabetical sorting.
Protein change
HGVS-compliant protein change notation.
Allows alphabetical sorting.
Variant length
Variant size in kilobases (relevant for CNVs/SVs).
Allows numerical sorting.
Cytoband
Chromosomal cytogenetic band where variant is located.
Allows alphanumeric sorting.
ISCN
Cytogenetic description of a chromosomal abnormality, using the International System for Human Cytogenomic Nomenclature (ISCN).
Allows alphanumeric sorting.
Pathogenicity
Pathogenicity classification assigned in the .
Allows alphabetical sorting.
Manual classification
Displays pathogenicity classifications previously assigned by members of the organization to the same variant in earlier cases. Badge color indicates pathogenicity class while badge number indicates count.
Tip: Hover over the badge to see pathogenicity.
Allows alphabetical sorting.
Networks classification
Displays pathogenicity classifications assigned by partnering organizations. Badge color indicates pathogenicity class while badge number indicates count.
Tip: Hover over the badge to see pathogenicity.
Allows alphabetical sorting.
Proband zygosity
Variant zygosity status in the proband.
Allows alphabetical sorting.
Mother zygosity
Variant zygosity status in mother.
Allows alphabetical sorting.
Father zygosity
Variant zygosity status in father.
Allows alphabetical sorting.
Mother quality
Overall score in mother.
SNV/Indel: Based on base quality, depth, mapping quality, and genotype quality
CNV: Based on CNV quality, size, and bin count
Allows alphabetical sorting.
Father quality
Overall score in father.
SNV/Indel: Based on base quality, depth, mapping quality, and genotype quality
CNV: Based on CNV quality, size, and bin count
Allows alphabetical sorting.
Mother depth
Variant depth in mother.
SNV/Indel: Sequencing depth of coverage at the variant position
CNV: Depth of coverage across the CNV region
Allows numerical sorting.
Father depth
Variant depth in father.
SNV/Indel: Sequencing depth of coverage at the variant position
CNV: Depth of coverage across the CNV region
Allows numerical sorting.