Emedgene

Get started with Emedgene

Welcome to Emedgene, where we unlock genomic insights for hereditary disease and streamline your tertiary analysis workflows.

So you've signed in and can't wait to get started? Here we will guide you through the platform architecture, case creation, and results review. You can dive a bit deeper by following the links and exploring manuals for the platform's applications:

Analyze: a genomic analysis workbench where you can accession, interpret, curate, and report on your cases while efficiently managing the lab workflow
Curate: a repository for all of your organization's curated knowledge

Look around

The platform is operated from the top navigation panel.

From there, you can open the Dashboard and Cases tabs, the Add new case page, and the Emedgene applications, Help, and Settings dropdown menus.

Create a case

To enter the case creation flow, click the Add new case button on the top navigation panel. Here:

Select file type

Upload files

Create a family tree

Annotate each sample with clinical information

Specify analysis details

Launch the analysis!

Your case status will be In progress. You'll be notified when results are ready and the case is in status Delivered.

Examine the analysis results

Select a case to review on the Cases tab. You'll be directed to the case page, which:

Showcases an AI-curated shortlist of variants suggested to be checked first
Provides numerous customizable tools to help you explore the variants by yourself
Documents the case-related information used during case analysis

Investigate the supporting evidence and assign appropriate tags to the variants of interest.

When you're ready to finalize the case, indicate the end result of the analysis and the variants to be reported in the Case interpretation widget.

How can Emedgene help you solve a case?

The AI-powered Emedgene platform utilizes machine learning throughout the analysis and interpretation workflow to deliver the fastest time from genomic data to decisions. We apply machine learning models that retrieve evidence-backed answers and provide exceptional decision support.

Using automated interpretation algorithms, Emedgene generates an accurate shortlist of up to 10 potential causative variants. In a joint study of 180 solved cases with Baylor Genetics, 96% of cases were successfully solved by the algorithm. See the Meng et al., 2023 publication for more details.
The platform is not a black box, and overlays a layer of explainable AI (XAI), presenting supporting evidence from the literature and databases which significantly reduces the time to interpret a case.
The algorithms use a proprietary Emedgene knowledge graph which incorporates information extracted from literature with Natural Language Processing, as well as from public databases and is updated on a monthly basis.
Dozens of additional algorithms are incorporated throughout the workflow.

Overall, the system combines AI in a highly optimized and customizable workbench, in order to automate the most time-intensive aspects of genomic analysis and research.

Emedgene Analyze manual

Getting around the platform

Top navigation panel

The top navigation panel serves as a guide to the platform. It includes:

Case search bar
Dashboard tab
Cases tab
Add new case button
Emedgene applications dropdown menu to switch between Analyze and Curate
Help dropdown menu under a question mark icon
Settings dropdown menu activated by clicking the username or profile picture

Dashboard tab

The Dashboard tab depicts an overview of the user activity on the Emedgene platform and provides a glance at key performance indicators for an organization.

Lefthand panel

Diagnostic Yield card presents the proportion of "solved" cases out of the total number of the organization's cases of the same type.
Status Diagram card displays the total number of the organization's submitted cases as well as the numbers of cases under each status.
Stale Cases card highlights the cases that are stuck at one of the intermediate stages of the analysis, and are not finalized.
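
As a rough illustration of the Diagnostic Yield figure, the sketch below computes the share of solved cases per case type. The input shape (pairs of case type and a solved flag) and the function name are assumptions made for this example; it is not the platform's own calculation.

```python
from collections import Counter


def diagnostic_yield(cases):
    """Return {case_type: solved / total} for (case_type, is_solved) pairs."""
    totals = Counter()
    solved = Counter()
    for case_type, is_solved in cases:
        totals[case_type] += 1
        if is_solved:
            solved[case_type] += 1
    # Every key in totals has a count of at least 1, so no division by zero.
    return {case_type: solved[case_type] / totals[case_type] for case_type in totals}


# Hypothetical data: two exome cases (one solved) and one solved genome case.
example = [("Exome", True), ("Exome", False), ("Whole Genome", True)]
print(diagnostic_yield(example))
```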

Righthand panel

Network Activities panel displays a timeline of activities performed by multiple users within the organization. This log includes activity like creating a case, verifying a filter preset, changing a Case status, generating a report, and more.

Cases tab

The Cases tab provides an overview of genomic sequencing cases submitted by the organization, as well as individual case details.

The Cases tab includes:

Cases table—displays a list of cases along with key details
Cases table navigation panel—enables customization of the table view, including grouping and filtering of cases
Case details panel—opens when a case is selected, providing additional information

Cases table

The Cases table lists key details of all genomic sequencing cases submitted by the organization.

You can customize the table by hiding, showing, rearranging fields, or adjusting column widths, except for Case ID, which is fixed as the first column and always visible.

Cases table fields

Field

Description

Case ID

A unique identifier assigned to each case by Emedgene, formatted as EMGXXXXXXXXX. This field is fixed and cannot be hidden or repositioned in the table. Share this code with Tech Support when reporting issues.

Proband ID

The identifier of the proband. For , this corresponds to the Sample Name; for , it corresponds to the BioSample Name of the test subject.

Phenotypes

Proband phenotypes as submitted by the user.

Status

The current case status in the system. Custom statuses can be added in Settings, and their order can be rearranged via drag-and-drop.

You can update the status directly from the Cases table by clicking the status badge and selecting a new status from the dropdown menu.

Creation date

The date the analysis was initiated. This is saved automatically. The field is sortable.

Due date

A customizable field that allows you to set, change, or remove a due date.

Click the calendar icon to set a date. To change it, click the existing date and select a new one. To remove it, click the cross icon next to the date. The field is sortable.

Quality

Indicates the overall case quality. Detailed validation results can be reviewed in the Lab tab. The field is sortable.

Type

Indicates the case type (whole genome, exome, custom panel, array).

Label

A customizable field that allows you to assign custom labels. Click the pencil icon to add a new label, select an existing one, or remove a label from the case.

Participants

Users involved in the case, whether in submission, analysis, finalization, or those subscribed to updates.

To receive email notifications, click the Subscribe icon. To unsubscribe, hover over your avatar and click the X icon.

User groups

User groups as defined in Settings. Each group appears as a separate column in the table.

Cases table navigation panel

The Cases table navigation panel provides several tools to help you customize your table view and manage cases. It includes the following components:

Filters menu: use this to narrow down the list of cases
Group menu: organize your cases by case status
Fields menu: choose which columns are visible in the table and define their order
A button that permanently deletes cases currently in the trash. Use with caution, as this action cannot be undone

Case details

The Case details panel provides comprehensive information about a particular case.

The Case details panel is organized into three tabs:

Case info—displays technical, operational, and clinical information about the case
Family tree—shows a graphical pedigree and sample details for each family member
Activity—provides a timeline of all actions taken within the case for audit and collaboration

How to access the Case details panel

From the Cases table

Click on the row of the case you want to view. A pop-up side Case details panel will appear on the right. To close the panel, click the X icon in the top right corner.

From an open case

To expand the Case details panel, click the left-pointing arrow icon on the right edge of the screen. To collapse it, click the right-pointing arrow icon at the top left of the panel.

Case info

The Case info tab includes the following information:

Case ID—a unique identifier assigned to each case by Emedgene, formatted as EMGXXXXXXXXX
Case type—the type of analysis performed:
- Whole Genome
- Exome
- Custom Panel
- Array
Sample type—the format of the sample files used in the case:
- FASTQ: *.fastq.gz, *.fq.gz, *.bam, *.cram.
- Project VCF: *.pvcf, *.vcf, *.vcf.gz, *.pvcf.gz
- VCF: *.vcf, *.vcf.gz, *.targeted.json, *.gt_sample_summary.json
Gene list—defines whether gene list was used during analysis and how it was applied:
- All genes—AI Shortlist was neither confined to nor prioritized a specific gene list
- Virtual panel (In silico panel)—AI Shortlist was limited to only the genes in the gene list
- Boosted gene list—AI Shortlist analyzed variants in all genes, but variants in the gene list were given higher priority
Analysis type:
- If field is not present—carrier analysis was not performed
- Carrier—carrier analysis was performed for the selected gene list
Human reference—the genome reference used during case analysis

Ordered by—the user who created the case and the case creation date
Signed by—the user who finalized the case
Related cases—the Case IDs of other cases that share one or more samples with the selected case
Due Date—the user-defined deadline for finalizing the case. To enter or edit the Due Date, click the calendar icon in the Due Date section
Participants—Users involved in the case, whether in submission, analysis, finalization, or those subscribed to updates. To receive email notifications, click the Subscribe icon. To unsubscribe, hover over your avatar and click the X icon

Patient Information—basic demographic details:
- Sex. Specified by the user
- Age. Automatically calculated in years based on the provided date of birth
Clinical Information:
- Proband phenotypes—HPO terms used to describe clinical findings in the proband
- Suspected disease—if provided, includes the suspected condition, penetrance (%), and severity (mild, moderate, severe, or profound)
- Maternal and Paternal ethnicity—ethnic background of the proband’s parents
- Parental consanguinity—indicates whether the parents are related by blood
- Report secondary findings—specifies whether secondary findings analysis was requested
Clinical note—free-text notes provided at the time of launching the analysis

Additional case information can be added using custom fields, either via the API or by including extra columns in your CSV during batch case creation. This allows you to extend the case details panel with project-specific data. To enable this feature or learn more, please contact [email protected].

Family tree

The Family tree tab includes the following information:

Pedigree diagram. The pedigree symbols are explained in the Family tree legend section.
Sample details for each family member:
- Phenotypes. For family members other than the test subject, phenotypes are categorized as:
  - Related—directly match one of the proband’s phenotypes
  - Unrelated—do not match any of the proband’s phenotypes
- Medical Condition – Indicates whether the individual is considered Healthy or Affected in the case
- Sex. Specified by the user
- Age. Automatically calculated in years based on the provided date of birth
- Maternal and Paternal ethnicity—ethnic background of the proband’s parents
- BAM file location. Shown where relevant

How to open a case

To open a case:

A. Hover over the corresponding row in the Cases table and click on the Open case link next to the Case ID in the first column

B. Alternatively, double-click the row

How to customize Cases table view

How to select columns to be displayed

A. Use the Fields menu

Click Fields in the Cases table navigation panel

In the Fields menu, use the toggle switch next to each field name to show or hide columns based on your preferred view

B. Hide a column directly from the Cases table

In the Cases table, click the column title you want to hide

From the dropdown menu, select Hide column

How to change column order

You can reorder columns in three ways.

A. Drag and drop the column

Hover over the column title

Click the six-dot icon that appears to the left of the title

Drag and drop the column

B. Use the Fields menu

Click Fields in the Cases table navigation panel

In the Fields menu, hover over the field name

Click the six-dot icon that appears to the left of the field name

Drag and drop the field

C. Use the column dropdown menu

Click the column header

From the dropdown menu, select Move left or Move right

How to adjust column width

Hover over the left or right border of the column header cell

When the resize cursor appears, click and drag the border to your desired width

How to filter cases

Available filters

You can filter cases using the following fields:

Case ID
Sample ID (Proband ID)
Status
Type
Label
Resolved: Resolved or Not resolved
Participants

How to apply filters

Go to the Filters menu in the Cases table navigation panel

Under Field, select the field you want to filter by

Under List, select a value from the dropdown or manually enter one

Click Apply to activate the filter

To add another filter, click Add new under the active filter and repeat steps 1-4

How to remove a filter

Go to the Filters menu in the Cases table navigation panel

To remove a specific filter, click the X icon next to it

How to clear all filters

In the Cases table navigation panel, click the X icon next to the Filters menu

How to group cases

To organize cases by status, navigate to the Cases table, click on Group on the navigation panel, and select Status. To remove the grouping, select None.

How to sort cases

You can sort cases by Creation date, Due date, or Quality.

To sort cases:

A. Hover over the column header and click the up or down arrow to sort in ascending or descending order

B. Alternatively, click the column name and select Sort ascending or Sort descending from the dropdown menu

The current sort direction is indicated by a single arrow icon next to the column name.

Only one column can be used for sorting at a time.

Help

Click on the question mark icon of the top navigation panel to open the Help dropdown menu.

From there, you can access:

Help Center. Feeling curious? Dive right in.
Feature requests. Submit your ideas.
What's New. Stay updated with the latest release notes.

Okta identity management

The Emedgene platform utilizes the Okta Identity Management solution to control user access. This improves user management, enhances access and authentication security, and allows organizations to implement single sign-on for their users.

Managing data storage

Manage data storages

To directly import files from your own storage, link it to an organization's storage in Emedgene.

Note: to manage data storage, you must have the Manager and Multiple Storage roles.

How to link your storage to Emedgene:

Click on the user initials or profile picture at the rightmost corner of the top navigation panel and select Settings

Select the Management tab and proceed to the Storage card, which lists the currently linked storages.

To add a new storage:

Click Add Storage
Choose a storage type from:
1. Azure Data Lake
2. Azure Blob
3. AWS S3
4. File Transport Protocol (FTP)
5. Google Cloud
6. Secure File Transport Protocol (SFTP)
7. Illumina Basespace (BSSH)
8. Illumina Connected Analytics (ICA)
Fill in the required credentials
Click Add storage

Check the connection to confirm that the storage is successfully linked.

To do this, find the storage in the list and check the cloud icon status:

If it's green, the connection is set correctly
If it's red and crossed out, something went wrong. Hover over the icon to see details

How to edit storage information:

Click Manage to the right of the storage details.

How to remove a link to storage:

Click Delete to the right of the storage details.

If data is deleted or moved from the customer's storage, it might adversely affect the case. To learn more about possible consequences, check out this table:

Manage S3 credentials

Whenever an organization is created, we automatically allocate bucket folders in AWS S3 cloud storage to it:

Path for upload

Folder intended to store input case files.

Authorized user has view and upload privileges.

Path for download

This folder contains a partially annotated (excluding results of proprietary algorithms) VCF file per case.

Authorized user has view and download privileges.

Path for DRAGEN output

This folder contains DRAGEN output files.

Authorized user has view and download privileges.

To get access to your upload, download, and DRAGEN output folders, you need a key pair consisting of an access key ID and a secret access key. Creating, deactivating, and deleting credentials is available to users with the Manager and Manage S3 Credentials roles.

You can create and use up to two dynamic access keys at the same time.

When you require technical support, you have the option to generate a new key pair specifically for the troubleshooting process. After the issue has been resolved, you can delete the credentials to ensure security of your system.

The newly generated credentials will only be saved in AWS Identity and Access Management (IAM) and not in our database.

How to create a key pair

In Settings > Management > S3 Credentials, click on Create Access Key.
You can retrieve the secret access key only when you initially create the key pair. If you lose it, you have to create a new key pair. To immediately copy the secret access key to a secure location, use the Copy to clipboard button.
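
Because the secret access key is shown only once, it can help to store the pair right away. The sketch below formats a key pair as one profile in the AWS shared credentials file layout, using only the Python standard library. The profile name and the placeholder key values are hypothetical; substitute the values shown when you create the key pair.

```python
import configparser
import io


def credentials_profile(profile, access_key_id, secret_access_key):
    """Return text for one profile in AWS shared credentials file format."""
    # interpolation=None keeps any '%' characters in a key literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser[profile] = {
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
    }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Placeholder values only; never commit real keys to source control.
print(credentials_profile("emedgene-upload", "YOUR_ACCESS_KEY_ID", "YOUR_SECRET_ACCESS_KEY"))
```

The resulting text can be appended to a credentials file that only your user account can read.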

How to deactivate a key pair

In Settings > Management > S3 Credentials, click on Deactivate in the corresponding key pair card.

How to activate an inactive key pair

In Settings > Management > S3 Credentials, click on Activate in the corresponding key pair card.

How to delete a key pair

In Settings > Management > S3 Credentials, click on Delete in the corresponding key pair card. Only inactive key pairs can be deleted.

Manage ICA storage

Prerequisites for managing ICA storage

To manage ICA storage, the user must have:

The Storage Provider user role
Upload and download permissions on the ICA project—either granted individually or via entire workgroup

How to get your ICA credentials:

Log in to your Illumina private domain via URL in the following format: yourcompanyname.login.illumina.com. This opens the Connected Platform Home

In the left navigation panel: User > API keys

Name the key

Choose one of the following options:

A. Grant access to all workgroups across the domain. If your domain includes multiple workgroups and you want the API key to apply universally, select "All current and future Workgroups and roles (Global API Key)"

B. Grant access to specific workgroups. Select one or more workgroups from the list. For each selected workgroup, assign the following application roles:

Emedgene Has Access
Illumina Connected Analytics - Has Access
Platform-home Workgroup Admin

Click Generate. Once the API key is generated, copy it to your clipboard or download it as a file.

⚠️ Important: The API key is only accessible while the API Key Generated popup window is open. After closing the window, the key cannot be retrieved. If you didn’t copy or download it, you’ll need to generate a new key.

To connect ICA:

Log into your Emedgene domain and go to the workgroup where you want to link ICA storage

Click on the user avatar and select Settings from the dropdown

Select the Management tab

In the Storage card, click Add Storage

Select Illumina Connected Analytics (not Illumina Connected Analytics V1!) from the Storage type dropdown

Fill in the storage credentials:

"Api_key": the API key generated earlier
"Project": the name of the ICA project that contains, and will contain, the data you want to connect
"Path": the folder within the project where the data is located. This can be used to restrict access to data within the specified folder. Using only "/" allows access to all folders within your ICA project

Click Add Storage

Storage providers

Manage Azure Blob storage

Before you proceed to this article, make sure you understand data storage management basics.

Update Azure Blob Storage Credentials

In Settings > Management Tab, add or edit the required credentials: CLIENT_ID, CLIENT_SECRET, TENANT_ID, and ACCOUNT_URL.

See the table below to learn where to look for them in your Azure account.

Emedgene setting

Corresponding client (Azure) setting

CLIENT_ID

application_id.

Format: ########-####-####-####-############

(letters/numbers)

CLIENT_SECRET

Value of the client_secret tuple (Value, Secret ID).

Format: #####-#######-######-######

(letters/digits/special chars)

TENANT_ID

ID of the tenant.

Format: ########-####-####-####-############

(letters/numbers)

ACCOUNT_NAME

An arbitrary name that the customer must supply to define the ACCOUNT_URL.

Format: string

CONTAINER_NAME

An arbitrary name that the customer must supply to define the ACCOUNT_URL.

Format: string

ACCOUNT_URL

The account_url of the Azure account.

Format: https://account_name.blob.core.windows.net/container_name
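
The formats in the table above can be checked before pasting values into Settings. The following is a minimal sketch, assuming the 8-4-4-4-12 layout of letters and numbers shown for CLIENT_ID and TENANT_ID and the ACCOUNT_URL pattern shown in the last row. The function names and the sample values are illustrative only.

```python
import re

# 8-4-4-4-12 groups of letters and numbers, as given in the table.
ID_PATTERN = re.compile(r"[0-9A-Za-z]{8}(-[0-9A-Za-z]{4}){3}-[0-9A-Za-z]{12}")


def looks_like_id(value):
    """Return True if value matches the CLIENT_ID / TENANT_ID layout."""
    return ID_PATTERN.fullmatch(value) is not None


def build_account_url(account_name, container_name):
    """Compose ACCOUNT_URL from the account and container names."""
    if not account_name or not container_name:
        raise ValueError("both account_name and container_name are required")
    return f"https://{account_name}.blob.core.windows.net/{container_name}"


# Hypothetical names, for illustration only.
print(build_account_url("mylabstorage", "genomes"))
```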

Blob Integration Setup

Create an App registration

In Microsoft Entra ID, click App registrations.

Select New registration.
Enter a name for the application and click Register.
On the registered app page, copy the Application ID (CLIENT_ID) and the Tenant ID (TENANT_ID).

Click Certificates & secrets.
Click New client secret.

Fill in the Description and set the expiry to 12 months (or according to your organization's policy), then click Add.

Copy the CLIENT_SECRET value from this page.

Grant this app registration the required roles and read access to the relevant Blob.

Azure Blob configuration

Go to Azure Storage accounts

Open the relevant storage account

Click Containers

Click the relevant container
Click Properties
Copy the ACCOUNT_URL

For Internal support:

Errors for bad connections can be found in CloudWatch on particular FRY log stream

Search for: BlobApi, BlobFs, azure.

Manage Google Cloud storage

Google Cloud Storage Credentials update procedure

How to get the client credentials?

Go to the Google Cloud Console.
In the left sidebar, go to IAM & Admin > Service Accounts.

Create a New Service Account: Click on the "Create Service Account" button at the top.

Fill in the Service Account Details:
- Service account name: Give your service account a name.
- Service account ID: This will be automatically generated based on the name.
- Description: Optionally, provide a description for the service account.
Click "Create and Continue".

Assign Roles to the Service Account:
- In the "Grant this service account access to project" step, assign the necessary roles.
- Grant this role:
  - "Storage Object Viewer" (read-only access)
Create the Service Account:
- After assigning the roles, click "Done".
Generate and Download a Key:
- Find your newly created service account, click the three dots on the right, and select "Manage Keys".
- Click Add Key > Create New Key and choose the JSON format.
- Download the key and store it securely, as it is used for authentication in your code or applications.
Encode the key in Base64:
- Use a Python function: place it in the same directory as your JSON key file (here named json_file.json) and run it.
- Save the printed output.
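
The original function is not reproduced in this text, so the following is a minimal stand-in using only the Python standard library. It assumes the key file is named json_file.json, as in the step above; the function name is illustrative.

```python
import base64
from pathlib import Path


def encode_key_file(path="json_file.json"):
    """Read the service account key file and return it as Base64 text."""
    raw = Path(path).read_bytes()
    return base64.b64encode(raw).decode("ascii")


# Usage, run from the directory that contains the key file:
#   print(encode_key_file())
```

The printed string is the value to paste into the Client_credentials_base64 field.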

Add the storage provider to Emedgene platform:

Add the following three values into the appropriate fields:
- Client_credentials_base64: paste the Base64-encoded output from the encoding step.
- Bucket: the bucket name.
- Path: to use the default, enter /; otherwise, enter your path within the bucket. Separate directories with /

CORS - Visualisation

Download and install the Google Cloud SDK from the Google Cloud SDK Install page.
Select your platform (Windows, macOS, or Linux), then download and run the installer.
Initialize and authenticate with Google Cloud: in the Cloud SDK shell or terminal, run gcloud init. This opens a browser window to authenticate your Google account. Follow the instructions to log in and select your project.
Set the CORS configuration via gcloud: create a JSON file (cors.json) on your machine with the CORS rules.

Note on the origin value:

If using Illumina cloud: https://host_name.emg.illumina.com
Otherwise (Emedgene cloud): https://host_name.emedgene.com
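
The original cors.json example is not reproduced in this text. As a sketch, the script below generates a file in the Google Cloud Storage CORS JSON layout (origin, method, responseHeader, maxAgeSeconds) with the origin chosen as described above. The allowed methods, response headers, and cache lifetime are assumptions for illustration and should be confirmed with support before use.

```python
import json


def build_gcs_cors(host_name, illumina_cloud=True):
    """Return a one-rule CORS configuration for the given Emedgene host."""
    domain = "emg.illumina.com" if illumina_cloud else "emedgene.com"
    return [
        {
            "origin": [f"https://{host_name}.{domain}"],
            "method": ["GET", "HEAD"],                    # assumption: read-only access
            "responseHeader": ["Content-Type", "Range"],  # assumption
            "maxAgeSeconds": 3600,                        # assumption
        }
    ]


def write_cors_file(rules, path="cors.json"):
    """Write the rules to the file later passed to --cors-file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(rules, handle, indent=2)


# host_name is the placeholder used above; replace it with your own host.
print(json.dumps(build_gcs_cors("host_name"), indent=2))
```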

Apply the CORS configuration to your bucket by running: gcloud storage buckets update gs://your-bucket-name --cors-file=cors.json
Verify the CORS configuration by running: gcloud storage buckets describe gs://your-bucket-name

Manage BaseSpace storage

Click on the Management tab and then on Add Storage.

Choose Illumina BaseSpace storage type.

Fill Client Key, Client Secret and App Token as provided from BaseSpace (a description on how to get this information is provided below) and click Add storage to complete the setup.

Via Command Line

Prerequisite

Install BaseSpace CLI (Command Line Interface)

Follow the installation instructions in the BaseSpace CLI documentation if needed. Be aware of the BaseSpace regional instance you are working on (us, euc1, aps2, euw2).

Authenticate

On BSSH, log in to the workgroup you want to connect as the storage.

Once the BaseSpace CLI is installed, run the authentication command in the terminal.

The command directs you to a link where you need to log in.

After authentication completes successfully, find the access token in the config file.

The result should look like -

Populate the App_token with the accessToken value, and Server with the apiServer URL from the BSSH config file.

Client_key will be displayed in subsequent menus, so a descriptive name such as the workgroup name can be used.

Client_secret is unused when the App_token is available and can be set to "x".
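
The mapping from the config file to the Emedgene fields can be sketched as follows. This assumes the config file stores plain "key = value" lines containing apiServer and accessToken, as named above; the sample values and the function names are hypothetical.

```python
def parse_bs_config(text):
    """Parse simple 'key = value' lines into a dictionary."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def to_emedgene_fields(config, client_key):
    """Map BSSH config values to the storage fields described above."""
    return {
        "App_token": config["accessToken"],
        "Server": config["apiServer"],
        "Client_key": client_key,   # descriptive name, e.g. the workgroup name
        "Client_secret": "x",       # unused when App_token is available
    }


# Made-up config content for illustration; a real token is a long secret string.
sample = "apiServer   = https://api.example.invalid\naccessToken = abc123"
print(to_emedgene_fields(parse_bs_config(sample), "my-workgroup"))
```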

Via BaseSpace Developer Portal

Go to the BaseSpace Developer Portal and log in. Be aware of the BaseSpace regional instance you are working on (us, euc1, aps2, euw2).

Go to My Apps and click Create a new Application.

Fill details for the application and click on create an application.

Fill details and press save.

All requested fields must be filled in; enter "NA" where you have nothing to add.

Go to My Apps and click on your new app. Then go to the credentials tab.

You will find the Client ID (Client Key), Client Secret, and App Token to enter into the Emedgene platform.

Adding BSSH account to your Emedgene account

Log in to the desired Emedgene organization.
Go to Settings
Go to Management tab
Click on Add Storage
Select BaseSpace:

Add the information from the Credentials tab of the app previously created in BSSH.

Bring Your Own Bucket

This feature relates only to saving DRAGEN output files in your own bucket when using DRAGEN through Emedgene (without ICA).

If you are looking to import data from AWS S3 into Emedgene, integrate other data storage with Emedgene, or download data from Emedgene, see the corresponding storage management articles.

Bring Your Own Bucket is available only for Enterprise-level support accounts and requires Illumina support for setup.

Bring Your Own Bucket

Bring Your Own Bucket, also known as BYOB, enables you to control your DRAGEN file outputs.

By default, the Emedgene-managed DRAGEN solution saves the DRAGEN output files in a dedicated AWS S3 bucket that you can access using your S3 credentials.

However, if you have an Enterprise account and would like the Emedgene-managed DRAGEN solution to save the DRAGEN output files in your own bucket, reach out to [email protected] and follow these steps:

1. Create an AWS bucket

Emedgene requires access to the root folder, so a dedicated bucket may be appropriate.

2. Edit Bucket policy

The bucket policy should allow the Emedgene user to access the bucket.

Example bucket policy:

3. Allow illumina.com and emedgene.com for CORS

Emedgene reads data directly from your AWS S3 bucket. To allow this, enable cross-origin resource sharing (CORS) for the Emedgene application URLs.

Example CORS policy:
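
The original policy text is not reproduced here. As a hedged sketch, the script below prints a CORS configuration in the JSON layout accepted by AWS S3 (AllowedHeaders, AllowedMethods, AllowedOrigins, ExposeHeaders, MaxAgeSeconds) for the two domains named in this step. The wildcard origins, methods, headers, and cache lifetime are assumptions for illustration; confirm the exact values with Illumina support during setup.

```python
import json


def build_s3_cors(domains=("illumina.com", "emedgene.com")):
    """Return a one-rule S3 CORS configuration for the given domains."""
    return [
        {
            "AllowedHeaders": ["*"],                               # assumption
            "AllowedMethods": ["GET", "HEAD"],                     # assumption: read-only
            "AllowedOrigins": [f"https://*.{d}" for d in domains],
            "ExposeHeaders": ["Content-Length", "Content-Range"],  # assumption
            "MaxAgeSeconds": 3000,                                 # assumption
        }
    ]


print(json.dumps(build_s3_cors(), indent=2))
```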

4. Test and validate the configuration with Illumina support

We will need to run a case and validate that the managed DRAGEN pipeline finishes successfully and that all features are available in the platform.

With the BYOB solution, you manage your own data: if you accidentally delete or move the data, the integration with Emedgene might break. You are responsible for your disaster recovery plan (DRP) and data backup solutions.

Managing AWS S3 Lifecycle policy

If a customer enables an AWS S3 Lifecycle policy in order to archive or change the S3 tiers for different files, they might create an adverse effect on the platform.

Case Type

File Type

Expected effect

Launching analysis

Creating a single case

Select sample type

When creating a new case, the first step is to select the sample input type. This determines how your data will be processed and which quality metrics will be available later in the analysis.

You can choose from the following supported formats:

FASTQ. Accepted file types: .fastq.gz, .fq.gz, .bam, .cram. Use this option if you want the platform to perform secondary analysis and variant calling.
Project VCF. Accepted file types: .pvcf, .vcf, .vcf.gz, .pvcf.gz. Use when working with a joint VCF file containing multiple samples.
VCF. Accepted file types: .vcf, .vcf.gz, .targeted.json, .gt_sample_summary.json. Use for cases where variants have already been called externally, or for array inputs with accompanying JSON files.
Array. Supported with DRAGEN Array v1.2 VCF and quality files. Use this option for cytogenetic array cases. Array results can be visualized in Genome View and the IGV tab, and sample-level quality metrics are available under the Lab tab.

Tips:

Choose the input type carefully — it cannot be changed after the case is created.
For joint gVCF or project VCF inputs, make sure the proband sample is listed first to ensure correct downstream calculations.
Keep file paths simple (avoid spaces, parentheses, or very long names >255 characters). This helps prevent errors during upload
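
The path tip above can be turned into a quick pre-upload check. This is a minimal sketch: it assumes "/" as the path separator and applies the 255-character limit to each path component. The function name and messages are illustrative, and the platform may apply further rules.

```python
MAX_NAME_LENGTH = 255


def path_issues(path):
    """Return a list of problems found in a file path (empty if none)."""
    issues = []
    if " " in path:
        issues.append("contains spaces")
    if "(" in path or ")" in path:
        issues.append("contains parentheses")
    # Check each folder or file name separately.
    if any(len(part) > MAX_NAME_LENGTH for part in path.split("/")):
        issues.append("name longer than 255 characters")
    return issues


# Hypothetical paths, for illustration only.
for candidate in ["runs/sample_1.fastq.gz", "my runs/sample (1).vcf"]:
    print(candidate, "->", path_issues(candidate) or "ok")
```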

Warning:

If files are incomplete or corrupted, the case may still be created but will fail during processing. Double-check your files before uploading.

Create a family tree

Add new case page > Family tree screen > Create family tree panel

Build a pedigree via the visual tool.

Ideally, the proband selected for case analysis is affected and has disease phenotype(s).

You can add a Father, a Mother, a Sibling, or a Child to any family member, starting with the Proband. To do this, choose their icon, then click on the Add family member button in the bottom right corner of the pedigree builder to select a family member.

More information about the pedigree symbols can be found in the Family tree legend section.

To delete a family member, choose their icon, then click on the Delete Subject button in the top right corner of the Add patient information panel.

Note: There is no technical limit on the size or number of generations for a family tree.

Family tree legend

While adding a new case, you will build a pedigree and annotate each of the samples with data required for analysis (Add new case page > Family tree screen).

After the case has been created, the family tree is available in the Case details panel (righthand panel of the Cases page).

Family tree legend:

1. Icon fill color in other pedigree members indicates the presence or absence of the proband's phenotypes in that individual (regardless of the potential presence of additional unrelated phenotypes):

* Filled - the individual is affected by all of the proband's phenotypes;
* Half-filled - the individual is affected by some of the proband's phenotypes;
* Empty - the individual is not affected by any of the proband's phenotypes.

2. Icon color intensity denotes whether sample files have been uploaded for the particular individual:

* Full color - the sample has files loaded in the case;
* Faded color - no sample files are available.

3. Icon line type indicates whether the sample is considered or excluded during analysis (relevant to samples with uploaded files only):

* Solid - the sample is included in the analysis;
* Dashed - the sample is ignored by Inheritance filters and the AI Shortlist algorithm, but you can still explore its genotypes.
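The fill rule in the legend can be expressed as a small set comparison. The sketch below is only an illustration of that rule: the function name and the use of HPO IDs as plain strings are assumptions, not part of the platform.

```python
def icon_fill(proband_phenotypes, member_phenotypes):
    """Return the legend fill for a family member relative to the proband.

    Only the proband's phenotypes matter; phenotypes that the member has
    but the proband lacks are ignored, as the legend describes.
    """
    proband = set(proband_phenotypes)
    shared = proband & set(member_phenotypes)
    if proband and shared == proband:
        return "Filled"        # affected by all of the proband's phenotypes
    if shared:
        return "Half-filled"   # affected by some of the proband's phenotypes
    return "Empty"             # affected by none of the proband's phenotypes


proband = ["HP:0003022", "HP:0001250"]
print(icon_fill(proband, ["HP:0003022", "HP:0001250", "HP:0000118"]))
print(icon_fill(proband, ["HP:0003022"]))
print(icon_fill(proband, ["HP:0000118"]))
```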

Add a sample

Add new case page > Family tree screen > Add patient information panel > Add sample section

You can choose one of the following options:

Existing sample - pick one of the samples already loaded on the platform
Upload New Sample - upload files from your PC and enter sample name
Choose from storage - choose files from your cloud storage and enter sample name
No sample - postpone uploading files but proceed with case creation or skip uploading files for family members other than Proband

Note: A case won't run if Proband sample files are missing. However, sample files are not mandatory for the rest of the family members (although highly recommended).

Note: When choosing an existing file path, the samples used may be cached from the original run. For a top-up flow please use a new file path.

Note: When you are loading sample files from your PC or choosing them from the storage, and there is more than one file per sample, please ensure that all the necessary files are simultaneously selected in the upload pop-up. You may only select one file type per case (i.e. you may not select both a .vcf and a .bam at the same time).

Adding patient info for the non-proband samples

Add new case page > Family tree screen > Add patient information panel > Patient info section

1. Fill in the boxes:

Note: The fields marked with (*) are mandatory.

Note: Please omit the Patient ethnicities field for non-proband samples.

1. Sex (*)

Options: Male, Female, Unknown.

2. Relationship

Indicates the family relationship of a subject to the Proband automatically inferred from the pedigree. Options: Father, Mother, Sibling, Child, Other.

3. Date of Birth

Expected format: mm/dd/yyyy.

4. Ignore Sample

Mark the checkbox if you want to exclude the sample from the AI Shortlist analysis and Inheritance filters while preserving genotype data.

5. Add Proband's phenotypes

If a sample shares some phenotypes with the Proband, you can copy them by checking this box. Proband's phenotypes will appear in a newly created Related Phenotypes section. To remove any of the proband's phenotypes not observed in a current individual, click the ☒ button next to the HPO term in the Related Phenotypes section.

Note: A popup notification will appear at the bottom of the page if any input HPO term or HPO ID is unknown.

6. Unrelated Phenotypes

Phenotypes not shared with a Proband. They can be added one by one (Selection mode) or in batch (Batch mode).

Selection mode

Please follow the steps described below for each phenotype:

Enter an HPO term (e.g., Hypoplasia of the ulna), an HPO ID (e.g., HP:0003022), or a descriptive phenotype name (e.g., Underdeveloped ulna) in the search box;
Select a matching term from a dropdown menu and press Complete after you've added all the terms.

Batch mode

Paste a list of comma-separated HPO terms or HPO IDs in the search box and press Complete.

2. Click on Complete once all the information is added.
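Before pasting a long list in Batch mode, it can help to check it locally. The following is a minimal sketch that splits comma-separated input into HPO IDs (the standard `HP:` prefix followed by seven digits) and free-text terms. The function name is hypothetical, and the platform performs its own validation regardless.

```python
import re

# HPO identifiers have the form "HP:" followed by seven digits.
HPO_ID = re.compile(r"^HP:\d{7}$")


def split_batch_input(text):
    """Split comma-separated batch input into HPO IDs and free-text terms."""
    ids, terms = [], []
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue  # skip empty entries such as a doubled comma
        if HPO_ID.match(item):
            ids.append(item)
        else:
            terms.append(item)
    return ids, terms


ids, terms = split_batch_input("HP:0003022, Hypoplasia of the ulna,, HP:0001250 ")
print(ids)
print(terms)
```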

Incidental (secondary) findings

While creating a new case, you can choose whether to include secondary findings for the proband. This option is available on the Family Tree screen → Create family tree panel → Show Secondary Findings.

Secondary findings are genetic variants that are not related to the primary indication for testing but may have important medical implications. These variants are automatically assigned the Incidental tag when they meet American College of Medical Genetics and Genomics (ACMG)-defined criteria for reportable secondary findings.

In Emedgene, the terms incidental findings and secondary findings both refer to ACMG-defined secondary findings. The platform continues to use the “incidental” label in certain places for technical consistency, though the modern clinical standard is “secondary findings.”

Tagging criteria

A variant is automatically tagged as a secondary finding if it meets all of the following criteria:

Classification: Previously classified as pathogenic or likely pathogenic in ClinVar or Curate variant databases
Zygosity: Heterozygous or homozygous (only homozygous for the HFE gene)
Allele frequency: Less than 5%
Read depth: 10× or higher
Variant quality: Any value except LOW
Affected gene: Listed in the ACMG SF v3.2 or v3.3 medically actionable gene list for reporting secondary findings in clinical exome and genome sequencing (PMID: 37347242, 40568962)
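The criteria above combine with a logical AND. The sketch below restates them as a single check, purely for illustration. The `Variant` fields, the function name, and the quality label `"HIGH"` are assumptions; only `LOW` and the numeric thresholds come from the list above.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    # Hypothetical record layout used only for this illustration.
    gene: str
    classification: str      # e.g. "pathogenic" or "likely pathogenic"
    zygosity: str            # "heterozygous" or "homozygous"
    allele_frequency: float  # fraction, so 0.05 means 5%
    read_depth: int
    quality: str             # any label; only "LOW" is excluded


def meets_secondary_finding_criteria(v, actionable_genes):
    """Return True only if every listed tagging criterion is satisfied."""
    classified = v.classification.lower() in {"pathogenic", "likely pathogenic"}
    if v.gene == "HFE":
        zygosity_ok = v.zygosity == "homozygous"  # HFE: homozygous only
    else:
        zygosity_ok = v.zygosity in {"heterozygous", "homozygous"}
    return (
        classified
        and zygosity_ok
        and v.allele_frequency < 0.05
        and v.read_depth >= 10
        and v.quality != "LOW"
        and v.gene in actionable_genes
    )


genes = {"BRCA1", "HFE"}  # tiny stand-in for the ACMG SF gene list
print(meets_secondary_finding_criteria(
    Variant("BRCA1", "pathogenic", "heterozygous", 0.001, 35, "HIGH"), genes))
print(meets_secondary_finding_criteria(
    Variant("HFE", "pathogenic", "heterozygous", 0.03, 40, "HIGH"), genes))
```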

ACMG SF v3.2 gene list

ACTA2, ACTC1, ACVRL1, APC, APOB, ATP7B, BAG3, BMPR1A, BRCA1, BRCA2, BTD, CACNA1S, CALM1, CALM2, CALM3, CASQ2, COL3A1, DES, DSC2, DSG2, DSP, ENG, FBN1, FLNC, GAA, GLA, HFE, HNF1A, KCNH2, KCNQ1, LDLR, LMNA, MAX, MEN1, MLH1, MSH2, MSH6, MUTYH, MYBPC3, MYH11, MYH7, MYL2, MYL3, NF2, OTC, PALB2, PCSK9, PKP2, PMS2, PRKAG2, PTEN, RB1, RBM20, RET, RPE65, RYR1, RYR2, SCN5A, SDHAF2, SDHB, SDHC, SDHD, SMAD3, SMAD4, STK11, TGFBR1, TGFBR2, TMEM127, TMEM43, TNNC1, TNNI3, TNNT2, TP53, TPM1, TRDN, TSC1, TSC2, TTN, TTR, VHL, WT1.

ACMG SF v3.3 (2025 release; requires pipeline v100.39.0+)

Includes all v3.2 genes plus newly added genes:

PLN
ABCD1
CYP27A1

This brings the total to 84 reportable genes.

Historical note

When Emedgene was first released, the term “incidental findings” was adopted in alignment with the clinical genomics standard at the time. The 2013 ACMG recommendations defined incidental findings as “the results of a deliberate search for pathogenic or likely pathogenic alterations in genes that are not apparently relevant to a diagnostic indication for which the sequencing test was ordered” (PMID: 23788249).

As the field evolved, the ACMG and broader clinical community began to distinguish between “incidental findings” (unexpected, not actively sought) and “secondary findings” (intentionally analyzed and reportable). This shift was reflected in the updated 2016 ACMG guidance (PMID: 27854360).

To reflect this change, Emedgene introduced the term “secondary findings” into the platform. However, “incidental findings” remains in use throughout the platform for technical consistency.

Tips:

Enable secondary findings when clinically relevant — this ensures variants in actionable genes are surfaced automatically.
Always review findings in the context of patient consent and your institution’s reporting policies.

Warnings:

Secondary findings are limited to the ACMG-defined gene lists. Variants outside these lists will not be tagged automatically.
Only variants with adequate sequencing depth and quality are tagged. Low-quality calls may require manual review.

Case type and region of interest

Add new case page > Case info screen > Select case type

Select the case type in order to define the proper analysis of your case.

Select the region of interest in order to filter-in only variants located within the proper regions.

Regional information is used to process genomic data in a variety of ways. Regional information influences the variants presented in gene panels, exomes, and genomes, variant quality and variant annotations.

Regardless of the case type, you can select any region of interest.

When selecting a Custom BED as your region of interest, you must select a specific BED file that is configured in your organization.

Case type table:

Case Type

Default Region of Interest BED

Case type + regions of interest BEDs are applied as follows:

Research Genome + No Region of interest: There is no intersection and all variants are viewable (Note: data pipeline time will increase, and intergenic variants have very limited annotations).
Whole Genome + Full genes: A wide range of genomic regions BED file. It contains:
- "RefSeq ALL" transcripts and "GENCODE" full genes regions with 5Kbp upstream and 5Kbp downstream
- Within this range, all “Clinical Regions” are included
- All dosage regions (HI/TS sig level 1, 2 or 3)
Exome + Clinical regions: A comprehensive BED file that includes every clinically relevant region. The following are included:
- “RefSeq Curated” and “GENCODE” regions with 50 bp flanking on each side; 5' UTR and 3' UTR regions for protein-coding genes (based on RefSeq)
- OMIM disease-related RNA genes (50bp flanking)
- All ClinVar Pathogenic variant regions (50bp flanking)
- Promoter regions (EPDnew human version 006, 50bp flanking)
- Known STR regions (DRAGEN 4.0 specification file)
- All microRNA genes (flanking 50bp based on HGNC)
- Full mtDNA region
Custom Panel + Custom BED: You can select any existing BED file configured in your organization to restrict the analysis to its regions.

BED files

Sequencing information

Select the coverage BED file used to calculate QC data for your case. This is relevant for exome and panel cases only. Once you select a coverage BED, an indication of its availability per reference sequence will appear.

You can define a region name as an additional column within your BED. This region name will appear when reviewing regions of insufficiency in the Lab page.
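As an illustration of a BED line that carries a region name, the sketch below parses a tab-separated line in the standard BED layout (chromosome, start, end, optional name in the fourth column). Placing the region name in the fourth column is an assumption based on the general BED convention; the coordinates and the name `BRCA1_region` are made up for the example.

```python
def parse_bed_line(line):
    """Parse one tab-separated BED line into (chrom, start, end, name).

    The name is None when the line has only the three mandatory columns.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    name = fields[3] if len(fields) > 3 else None
    return chrom, start, end, name


# Illustrative line: coordinates and region name are placeholders.
print(parse_bed_line("chr17\t43044294\t43125364\tBRCA1_region\n"))
print(parse_bed_line("chr1\t100\t200"))
```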

Include lab level information such as lab name, machine used, reagents and expected coverage.

Gene list

Add new case page > Case info screen > Select genes list

Select gene list

You can limit analysis to a gene list in the platform while creating a case. Choose between:

1. All genes

No limitation of the analysis.

2. Existing gene list

Select one of the previously added gene lists from a dropdown list.

3. Create a new gene list

Generate a new virtual panel: add a List title and then add all the gene symbols one by one (Selection mode) or in a batch (Batch mode).

A new gene list can be created by combining configured gene lists and/or individual genes. Each gene list can be configured to contain up to 10,000 genes.

Note: Please use the up-to-date gene symbols approved by the HUGO Gene Nomenclature Committee (HGNC). When adding gene symbols in Batch mode, genes that do not comply with HGNC standards are automatically excluded from the gene list. These genes will appear for 3 seconds in a black error box at the bottom of the screen.

Selection mode

For each gene please follow the steps described below: Enter a gene symbol in the search box in the right panel (Candidate Genes) and select a matching symbol from a dropdown menu.

Batch mode

After selecting batch mode, paste a list of comma-separated gene symbols in the search box in the right panel (Candidate Genes).
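If your gene symbols live in a spreadsheet or text file, a short script can produce the comma-separated string for pasting. This is a minimal sketch with a hypothetical function name; it removes blanks and duplicates and checks the 10,000-gene limit mentioned above, but it does not check HGNC compliance.

```python
MAX_GENES = 10_000  # per-list limit stated in this section


def prepare_gene_batch(symbols):
    """Return a comma-separated string of unique, non-empty gene symbols."""
    seen = set()
    ordered = []
    for symbol in symbols:
        symbol = symbol.strip()
        if symbol and symbol not in seen:
            seen.add(symbol)
            ordered.append(symbol)  # keep first-seen order
    if len(ordered) > MAX_GENES:
        raise ValueError(f"{len(ordered)} genes exceed the limit of {MAX_GENES}")
    return ", ".join(ordered)


print(prepare_gene_batch(["BRCA1", " TP53", "BRCA1", ""]))
```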

Gene list modes

You can choose between two different modes of a gene list feature:

1. In silico panel

Selected by default.

AI Shortlist is limited to the selected gene panel, no variants in other genes are considered in the results. If this in silico panel is used for analysis of exome or genome data, the gene restriction may be lifted during manual analysis to "open-up" the entire exome or genome for analysis.

2. Boosted genes

Analysis is performed for variants in all the genes. Variants in the targeted genes get upgraded scores during prioritization by the AI Shortlist algorithm.

Preset group

You can implement different combinations of Presets to be used for different case types (i.e. Presets for exome may be different from Presets for genome) as defined by your SOPs to further streamline case review.

The combination of Presets is referred to as a Preset group.

Select a Preset group to display in the case

Preset group selection is available in the Case info screen of the Add new case flow while creating or editing a case.

Where can I manage Preset groups?

To manage filter Preset groups, navigate to Settings > Organization Settings > Lab Workflow:

Preset groups

From here, you can create (from Presets or from a JSON file), edit, hide/unhide, and download Preset groups as needed.

Default Preset group

Here, you can set a Preset group as default, so it will be used unless another Preset group is selected during case creation.

Supported parental ethnicities

The ethnicities of the proband's mother and father can be specified during the process of UI or API case creation. Please refer to the following list of supported ethnicities.

A "Afghan Jews" "Afghani" "African" "African American" "Afro-Brazilian" "Alaska Native" "Algerian" "Algerian Jews" "Amish" "Anatolian" "Arab" "Argentinian/Paraguayan" "Armenian" "Ashkenazi Jews" "Asian" "Asian Brazilian" "Australian Native" "Azerbaijan Jews"

B "Bedouin" "Bengali/Northeast Indian" "British/Irish" "Bulgarian Jews"

C "Caribbean Australian" "Caucasus Jews" "Central African" "Central Asian" "Chilean" "Chinese" "Chinese Dai" "Christian Arab" "Circassian" "Colombia"

D "Druze" "Dutch"

E "East African" "East Asian" "East European" "Egyptian" "Egyptian Jews" "Emirates" "Ethiopia" "Ethiopian / Eritrean" "Ethiopian Jews" "Ethiopian Jews - Beta Israel" "European" "European American"

F "Fijian Australian" "Filipino" "Filipino Austronesian" "Finnish" "French" "French Canadian"

G "Georgian Jews" "Germans" "Ghanaian / Liberian / Sierra Leonean" "Greece Jews" "Greek Americans" "Greek / Balkan" "Guam/Chamorro"

H "Hawaiian"

I "Iberian" "India - Bene Israel Jews" "India - Cochin Jews" "Indian" "Indigenous Amazonian" "Indigenous peoples in Canada" "Indonesian" "Inuit" "Iranian" "Iranian Persian Jews" "Iraq" "Iraqi Jews" "Irish" "Italian" "Italian Americans" "Italian Jews"

J "Japanese" "Japanese Brazilian" "Jordan"

K "Kenyan" "Korean" "Kurdish" "Kurdish Jews"

L "Latino/Hispanic Americans" "Lebanese Jews" "Levantine" "Libyan" "Libyan Jews"

M "Maasai" "Malayali Indian" "Melanesian" "Mesoamerican and Andean" "Mexican American" "Middle Eastern" "Mongolian / Manchurian" "Mormon" "Moroccan" "Moroccan Jews" "Muslim Arab"

N "Native American" "Nepali" "Nigerian" "North African" "North and West European" "Northern Asian" "Northern Indian"

O "Other Pacific Islander"

P "Pakistani" "Papuan" "Polynesian" "Portuguese in Northern Brazil" "Portuguese in Southern Brazil"

R "Russian Jews" "Russians"

S "Samaritan" "Samoan" "Sardinian" "Saudi" "Scandinavian" "Senegambian / Guinean" "Siberian" "Somali" "South African" "South Asian" "Southern East African / Congolese" "Southern European" "Southern Indian" "Southern Indian / Sri Lankan" "Southern South Asian" "Spaniards" "Spanish Jews" "Sub-Saharan African" "Sudanese" "Swedes" "Syrian Jews" "Syrian-Lebanese"

T "Tajikistan Jews" "Thai / Cambodian / Vietnamese" "Tunisian" "Tunisian Jews" "Turkish" "Turkish / Anatolian" "Turkish Jews"

U "Ukraine" "Ukraine Jews" "Uzbekistan/ Bukharan Jews"

V "Venezuela"

W "West African"

Y "Yemenite" "Yemenite Jews"

Labeling a case

You have the flexibility to manage Case labels at any time: create, add, or remove them directly in the .

Adding labels to a case lets you quickly mark cases for specific purposes and easily filter a subset of cases on the Cases page.

Batch case upload from platform

If you're comfortable with scripting and API usage, you can upload multiple cases at once using those methods. But if you're not a technical expert, don't worry. There is a user-friendly alternative available—importing a CSV file directly through the user interface.

Please follow the steps as described below.

Caution: Please note that refreshing or leaving the page, exiting the Add new case tab, or power failure of your computer before you've completed a batch case upload will result in loss of the case creation progress.

1. Prepare a CSV file

CSV (Comma-Separated Values) is a simple file format used to store data in tabular form. A row represents a sample, and a column represents a data field.

Start by downloading a CSV template with an example line and mandatory and non-mandatory fields from the Add new case page set to Batch mode (see step 2). Fill the file with your data according to CSV format requirements.

2. Upload a CSV file

Click on the + New case button on the top navigation panel.
Click on the Switch to batch button in the top right corner. You'll be directed to the Select file page of the Batch upload flow. Note: Here you can download a CSV template in the valid format.
Drag and drop a CSV file into the box or upload it from the file explorer. Wait for file upload and validation to finish.

3. Review file validation results

After validation is complete, you will be directed to the Batch validation page. It features validation results details for you to review:

File name
Number of rows in the file
Number of cases to be created
Number of errors found
Status message
- If no errors were detected, a success message is displayed.
- If any errors were detected, an error message is displayed. You can download a file with error details to help you diagnose and correct any issues with the data. Once you've corrected the CSV file, reupload it.

4. Create cases

Click on Create. A progress bar will appear on the right as the cases are created (Cases creation page).
If the cases have been created successfully, the Cases summary page will display the total number of cases that were created.
If there were any errors during the batch case creation process, the Cases summary page will display a table indicating the number of cases that were successfully created and the number of cases that failed.

You will have the option to download a CSV file containing two additional columns: Errors and Case ID. The Errors column will contain error messages for samples where case creation failed, while the Case ID column will contain the Case ID of a successfully created case for the lines where case creation was successful.
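The downloaded results file can be summarized with a few lines of Python. The sketch below relies only on the two column names given above, Errors and Case ID. The `Sample` column, the case ID value, and the error text in the example are placeholders, not the platform's actual output.

```python
import csv
import io


def summarize_results(csv_text):
    """Split result rows into created case IDs and error messages."""
    created, failed = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        case_id = (row.get("Case ID") or "").strip()
        error = (row.get("Errors") or "").strip()
        if case_id:
            created.append(case_id)   # case creation succeeded for this line
        elif error:
            failed.append(error)      # case creation failed for this line
    return created, failed


# Placeholder content; a real results file contains your own columns.
example = "Sample,Errors,Case ID\nS1,,CASE-1\nS2,Missing file,\n"
created, failed = summarize_results(example)
print(created)
print(failed)
```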

Batch case upload via CLI

Prerequisites

Download and install the Node.js platform from https://nodejs.org/en/download
Minimum version required: 16
Upgrade an existing installation: nvm install --lts

Batch upload via CLI (Command Line Interface)

Download the batch case create script. Replace my-domain with your Emedgene domain:
Illumina cloud: my-domain.emg.illumina.com
Legacy Emedgene cloud: my-domain.emedgene.com

curl https://my-domain.emg.illumina.com/v2/js/batchCasesCreator.js --output batchCasesCreator.js

Download the CSV template file.

node batchCasesCreator.js saveTemplateFile

Edit the downloaded batchCases.csv file. See CSV format requirements for more details.
Run the batch cases creator script with Node.js using the command below. Replace my-domain with your Emedgene domain and my-email with your user email. A prompt for your Emedgene password will appear; enter the password and press Enter.

node batchCasesCreator.js create -h https://my-domain.emg.illumina.com -c batchCases.csv -u my-email -l

In case of validation errors in the input CSV, an output CSV called batchCases_results.csv will be created in the same location with detailed error results.
-l will create a log file in the same location.

More information can be found by running

node batchCasesCreator.js --help

node batchCasesCreator.js create --help
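If you launch batch uploads from a Python script, the create command can be assembled as an argument list. This is a sketch under the assumption that Node.js and the downloaded batchCasesCreator.js are available in the working directory. It uses only the flags shown above (-h, -c, -u, -l), and the function name is hypothetical.

```python
def build_create_command(domain, csv_path, email, write_log=True):
    """Build the argument list for the documented 'create' command."""
    command = [
        "node", "batchCasesCreator.js", "create",
        "-h", f"https://{domain}",
        "-c", csv_path,
        "-u", email,
    ]
    if write_log:
        command.append("-l")  # also write a log file next to the CSV
    return command


command = build_create_command(
    "my-domain.emg.illumina.com", "batchCases.csv", "my-email")
print(" ".join(command))
# To execute it, pass the list to subprocess.run(command, check=True);
# the password prompt then appears in the same terminal.
```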

Reviewing a case

Individual case page

The user can enter a specific case from the Cases tab by clicking Full details in the corresponding row of the case table.

The Individual case page includes:

Top bar—displays a Case ID and Case status and includes Case interpretation, Edit case info, and Report preview buttons
Candidates tab—highlights a shortlist of variants, suggested to be reviewed first - Most Likely Candidates and Candidates
Lab tab—illustrates quality metrics for the sequenced samples
Genome view—provides an interactive overview of genomic structure, ideal for analyzing CNV and ROH/LOH events
Analysis tools tab—provides numerous customizable filters to help you explore the total list of genetic variants in compliance with your organization's standard case review process. You can export shortlisted variants in .xlsx format
Versions tab—documents versions of all the resources used during case analysis

Case status

The case status reflects the current stage of case processing, either by the Emedgene platform or a genomic analyst.

You can view the current status of a case in the following locations:

Status column in the – for a quick overview across multiple cases
of the individual case page – for immediate visibility while reviewing a case
panel, under Case-related activities – to track status updates and history

There are both provided by the platform and the option to create to suit your workflow.

Out-of-the-box case statuses

| Status | Meaning | Triggered by | User-changeable |
| --- | --- | --- | --- |
| Uploading | The sequencing file is being uploaded on the platform | System | |
| In progress | The case analysis is running | System | |
| Delivered | The running stage is completed, and the case is now available for users for analysis and review | System | |
| Finalized | The analysis and review of the case by the analyst group have been . Note: Access to applying or changing the Finalized status could be restricted to specific users, such as organization managers and directors, who possess corresponding | User | |
| Move to trash | Indicates that the case is up for and makes the case inaccessible. This status is only assigned by the user. Note: Access to applying or changing the Move to trash status could be restricted to specific users, such as organization managers and directors, who possess corresponding | User | |
| Pending sequencing | The case has been created but is not yet connected to any genetic data | System | |
| Issue reported | The case failed to run. Please check the integrity of the uploaded files and ensure that the variant caller used is on Emedgene list of accepted . | System | |
| Reanalysis | The system is re-running the AI Shortlist algorithm | System | |

Case statuses management

In the tab under Settings, you can tailor how case statuses appear and function to better align with your workflow:

Create custom case statuses to reflect your team's specific processes
Remove unused statuses to keep the list clean and relevant. Only statuses that are not currently assigned to any case can be removed.
Rearrange the order of statuses by drag-and-drop to match your preferred case interpretation flow. The updated order is reflected in:
- The Status field dropdown in the Cases table
- The top bar of the individual case page

Individual case page: Top bar

The Top bar in the Individual case page indicates the Case ID and current .

Options available through the Top bar:

Change the
Reanalyze the case
and write interpretation notes
Preview the

Candidates tab

The Candidates tab displays all tagged variants, whether tagged by the AI Shortlist or manually by a user.

Variant tagging by the AI Shortlist

Variants are automatically tagged as:

Most Likely Candidates and Candidates
Variants prioritized by the AI Shortlist
Secondary findings
Variants that meet ACMG-defined criteria for secondary findings and are automatically tagged with the Incidental tag (if enabled)
Carrier variants
Variants identified by the carrier analysis pipeline (if enabled)

Assigning variant tags during review

During review in the Candidates tab, additional tags can be applied to a variant alongside the original automatic tag.

The Candidates tab presents:

Most Likely Candidates and Candidates

A set of the most promising variants based on scores calculated by the AI Shortlist. These variants are initially tagged by the system.

Variant types assessed:

SNVs and indels
CNVs
SVs
mtDNA variants
STRs

Incidental (Secondary)*

Secondary findings are variants that are automatically assigned the Incidental tag when they meet the criteria for secondary findings as defined by the American College of Medical Genetics and Genomics (ACMG).

Tagging is applied only when the Secondary findings checkbox is selected during case creation.

Tagging criteria

A variant is automatically tagged as an incidental (secondary) finding if it meets all of the following criteria:

Classification: Previously classified as pathogenic or likely pathogenic in ClinVar or Curate variant databases
Zygosity: Heterozygous or homozygous (only homozygous for the HFE gene)
Allele frequency: Less than 5%
Read depth: 10× or higher
Variant quality: Any value but LOW
Affected gene: Listed in the ACMG SF v3.2 medically actionable gene list for reporting secondary findings in clinical exome and genome sequencing (PMID: 37347242)

ACMG SF v3.2 gene list

*In Emedgene, the terms "incidental findings" and "secondary findings" both refer to secondary findings as defined by the ACMG, due to historical usage.

When Emedgene was first released, the term “incidental findings” was adopted in alignment with the clinical genomics standard at the time. The 2013 ACMG recommendations defined incidental findings as “the results of a deliberate search for pathogenic or likely pathogenic alterations in genes that are not apparently relevant to a diagnostic indication for which the sequencing test was ordered” (PMID: 23788249).

As the field evolved, the ACMG and broader clinical community began to distinguish between “incidental findings” (unexpected, not actively sought) and “secondary findings” (intentionally analyzed and reportable). This shift was reflected in the updated 2016 ACMG guidance (PMID: 27854360).

To reflect this change, Emedgene introduced the term “secondary findings” into the platform. However, “incidental findings” remains in use throughout the platform for technical consistency.

Carrier

Variants identified by the Carrier analysis pipeline. Carrier variants are automatically tagged only if you've selected the Carrier Analysis checkbox while creating a case. Analysis requirements and a list of targeted regions are specified by the organization's manager. This Carrier analysis flow is implemented by request.

In Report and other custom variant tags

Variants that were manually selected to be reported.

Reviewing the Candidates tab

To select variants with a particular tag, use the Filter candidates dropdown menu in the top right corner. You can select from Most Likely, Candidate, Incidental, Carrier, Not Reviewed, or any custom tags used in your organization.

For each variant on the Candidates tab, you can explore the suggested diagnosis, gene symbol, main variant details, and variant tag.

When a variant is found in a gene with no known association with a disease, the possible diagnosis cannot be indicated. Such variants are displayed under the Gene of Unknown Significance title.

All the relevant fitting a compound heterozygous mode of inheritance are presented together. This refers to both confirmed and assumed compound heterozygosity (cases with at least one parent and singleton cases, respectively).

If you want to inspect the complete variant information, click on the variant bar to continue to the . You can visualize evidence in text or graphical format (Click on the interactive text in the top left corner: Show evidence as text or Show evidence graph to toggle between the two).

Most Likely Candidates and Candidates

To streamline case review, the AI Shortlist pre-selects the list of variants likely to be causative for each case:

Most Likely Candidates

Variants that are most promising for solving the case. This list is limited to 10 top-scored variants but may include more if more than one variant is tagged per gene (suggesting compound heterozygosity). We can change the Most Likely Candidates number limit upon request.

Candidates

Several dozen highly scored variants worth considering.

The ranking of variants by AI Shortlist considers:

SNVs
CNVs
SNV + CNV compound heterozygotes
SVs
mtDNA variants
STRs

The AI Shortlist rates variants based on predicted variant effects, alternative allele frequency, familial segregation pattern, phenotypic match, in silico predictions, and other relevant information from scientific papers and databases.

During the case review, you can untag variants selected by the AI Shortlist or manually tag ones not selected by the AI Shortlist.

Lab tab

The Lab tab shows sample and case-level quality metrics so you can check data reliability before starting interpretation.

The Lab tab includes:

—highlights the key quality indicators, with more details provided in the subsequent sections
—reports sequencing run technicalities
—summarizes the data quality of the case
—highlights quality metrics for each sample
—displays the results of the relationship validation for each pair of samples in a family tree
—highlights regions that may not have been adequately sequenced

Summary dashboard

Summary dashboard provides a quick overview of key quality indicators at both the case and sample levels.

Included metrics:

Displays the overall case quality status
Reflects sample quality status
Evaluation kit
Specifies the QC BED kit used to evaluate coverage depth and breadth. If no kit is specified at analysis launch, NCBI RefSeqGene is used as the default reference
Custom gene coverage
Indicates whether the coverage of genes in the selected panel meets the expected threshold, as defined by the QC BED
Displays the results of relationship validation, confirming whether the submitted pedigree aligns with genetic data

Case quality section

The Case quality section summarizes the data quality of the case and highlights the results of validation checks:

Chromosome validation
Confirms that each chromosome with at least 100 SNVs in defined enrichment kit or coding regions includes at least one high-quality variant
gnomAD validation
Verifies that each chromosome with at least 100 SNVs in defined enrichment kit or coding regions includes at least one variant annotated with gnomAD
ClinVar validation
Ensures that each chromosome with at least 100 SNVs in defined enrichment kit or coding regions includes at least one variant annotated with ClinVar
AI Shortlist validation
Checks that at least one variant is tagged by the AI Shortlist.
- This validation is not applicable if the gene list contains fewer than 50 genes
- If your workgroup uses a higher threshold, it is reflected in the Gene list threshold field
mtDNA reference validation
Confirms that the rCRS reference is used for mitochondrial DNA
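The chromosome, gnomAD, and ClinVar validations share one pattern: every chromosome with at least 100 SNVs must contain at least one variant with a given property. The sketch below illustrates that pattern only. The input layout (pairs of chromosome and a boolean flag) and the function name are assumptions, not the platform's implementation.

```python
def chromosomes_failing_validation(snvs, min_snvs=100):
    """Return chromosomes with >= min_snvs SNVs but no flagged variant.

    snvs is an iterable of (chromosome, flag) pairs, where flag is True
    when the SNV has the property being checked (for example, it is
    high quality, or it carries a gnomAD or ClinVar annotation).
    """
    counts = {}
    has_flag = {}
    for chrom, flag in snvs:
        counts[chrom] = counts.get(chrom, 0) + 1
        has_flag[chrom] = has_flag.get(chrom, False) or flag
    return sorted(
        chrom for chrom, n in counts.items()
        if n >= min_snvs and not has_flag[chrom]
    )


snvs = (
    [("chr1", False)] * 100                       # enough SNVs, none flagged
    + [("chr2", False)] * 99                      # below the 100-SNV threshold
    + [("chr3", False)] * 150 + [("chr3", True)]  # one flagged variant
)
print(chromosomes_failing_validation(snvs))
```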

Sample quality section

The Sample quality section in the Lab tab gives you a quick view of the reliability of sequencing or array data used in your case.

For eligible cases, users can review the results of : interactive DRAGEN report and DRAGEN QC metric files.

The metrics displayed in the Sample quality section and their underlying calculation vary depending on the case type:

NGS case

Array case

NGS sample quality metrics

NGS sex validation

The Sex validation column indicates whether the biological sex inferred from genomic data matches the sex information provided during case creation. This helps identify potential sample mix-ups or metadata errors before interpretation begins.

Sex validation results:

Pass
Reported sex matches the estimated sex.
Fail
A mismatch was detected between reported and estimated sex.
N/A
QC file not available; validation could not be performed.

Sex validation is performed by comparing the observed homozygous/heterozygous genotype ratio on the X chromosome with the expected ratios:

<2 for females
>2 for males

Prerequisites:

Only high-quality SNVs from targeted regions—either kit-specific or RefSeq coding regions—are used for sex validation
A minimum of 50 variants is required to generate a reliable result. If this threshold is not met, sex validation cannot be performed, and no result is displayed

If the sex was marked as unknown during case creation, the system will display the predicted sex instead of a validation status.
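The ratio rule can be sketched as follows. This is an illustration under stated assumptions, not the platform's code: the 50-variant minimum is applied to the combined count of homozygous and heterozygous X-chromosome SNVs, a ratio of exactly 2 is treated as undetermined because the text does not define that boundary, and the function names are hypothetical.

```python
def estimate_sex(hom_count, het_count, min_variants=50):
    """Estimate sex from X-chromosome genotype counts, or return None."""
    if hom_count + het_count < min_variants:
        return None              # too few variants for a reliable result
    if het_count == 0:
        return "Male"            # ratio is unbounded, so above 2
    ratio = hom_count / het_count
    if ratio < 2:
        return "Female"
    if ratio > 2:
        return "Male"
    return None                  # exactly 2: boundary not specified


def sex_validation(reported_sex, hom_count, het_count):
    """Return Pass/Fail, the predicted sex for Unknown, or None."""
    estimated = estimate_sex(hom_count, het_count)
    if estimated is None:
        return None              # no result is displayed
    if reported_sex == "Unknown":
        return estimated         # show the prediction instead of a status
    return "Pass" if reported_sex == estimated else "Fail"


print(sex_validation("Female", 30, 40))
print(sex_validation("Female", 90, 10))
print(sex_validation("Unknown", 90, 10))
```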

Ploidy

The Ploidy column provides results from the DRAGEN Ploidy Estimator, which is designed to detect aneuploidies and determine the sex karyotype in whole genome cases.

Ploidy estimation results:

Pass
All autosomes fall within the expected ploidy range.
Fail
At least one autosome shows a median score outside the expected thresholds (below 0.9 or above 1.1).
- When you hover over a failed result, the system displays which chromosomes are problematic.
N/A
Case type is not Whole genome or QC file not available; validation could not be performed.

The ploidy calculation uses values from the *.ploidy_estimation_metrics.csv file.
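
As a rough illustration of the Pass/Fail rule, the sketch below takes per-autosome median scores that have already been read from the metrics file and reports which chromosomes fall outside the 0.9 to 1.1 range. The function name and the input dictionary are hypothetical, and parsing of the CSV file itself is deliberately left out.

```python
LOW, HIGH = 0.9, 1.1  # thresholds stated in the text


def ploidy_status(autosome_scores: dict[str, float]) -> tuple[str, list[str]]:
    """Return the status and the list of autosomes outside the expected range."""
    if not autosome_scores:
        return ("N/A", [])  # no QC data available
    failed = [chrom for chrom, score in autosome_scores.items()
              if score < LOW or score > HIGH]
    return ("Fail" if failed else "Pass", failed)
```

The list of failed chromosomes corresponds to what the hover tooltip shows for a failed result.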

Tips:

Use ploidy checks early in case review to spot potential large-scale chromosomal abnormalities.
Always confirm whether the sex karyotype inferred from ploidy matches the sex validation results to rule out sample swaps.

Failed results do not confirm clinical abnormalities. They only indicate a deviation in copy number estimation and should be reviewed in context of other QC metrics and visualization.

Contamination

The Contamination column reports whether a sample shows signs of DNA contamination, helping ensure data reliability before interpretation.

Be mindful that when contamination is suspected in sequencing data, it could stem from various sources, including true contamination, sample mix-up, library preparation issues, or technical artifacts.

Always confirm the issue with other quality checks.

Contamination is detected using Peddy calculations. The result is based on the idr_baf score, which is used to estimate the proportion of reads that do not match the expected genotype.

Contamination check results:

N/A No data is available (older cases or when idr_baf = 0.000).
No No contamination detected (idr_baf < 0.200).
Unlikely Possible contamination, but evidence is weak (0.200 ≤ idr_baf < 0.241).
Likely Contamination suspected (0.241 ≤ idr_baf < 0.300).
Yes Contamination confirmed (idr_baf ≥ 0.300).
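
The thresholds above map to result labels as in the following sketch. The function name is hypothetical, and a missing score is represented here by None, which is an assumption about how absent data would be passed in.

```python
from typing import Optional


def contamination_status(idr_baf: Optional[float]) -> str:
    """Map an idr_baf score to the contamination label listed above."""
    if idr_baf is None or idr_baf == 0.0:
        return "N/A"       # no data available
    if idr_baf < 0.200:
        return "No"
    if idr_baf < 0.241:
        return "Unlikely"
    if idr_baf < 0.300:
        return "Likely"
    return "Yes"
```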

Tips:

Always review contamination results before starting interpretation to rule out technical issues that could explain unexpected variant calls.
Cross-check contamination results with other QC metrics (e.g., depth, ploidy, sex validation) for a more complete picture of sample quality.
For family cases, check that no contamination is flagged before relying on inheritance-based filters.

Warnings:

Panels may be less reliable: For targeted panels, contamination estimates may be inaccurate due to the limited number of variants available for calculation. Use caution and cross-check with other QC metrics when interpreting these results.
Do not use in isolation: A "Likely" or "Yes" result should not immediately be considered diagnostic — review case setup, sequencing quality, and sample handling first.

Coverage

The Sample quality section includes the following coverage metrics for a target region defined by a QC BED file (or RefSeq coding regions if no kit is provided):

Average coverage Average depth of coverage for a target region
% Bases with coverage >10x Percentage of a target region that is covered at a minimum depth of 10x
% Bases with coverage >20x Percentage of a target region that is covered at a minimum depth of 20x
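
To make the three definitions concrete, the sketch below computes them from a list of per-base depths over the target region. The function and key names are hypothetical. Following the "minimum depth" wording, a base with a depth of exactly 10 or 20 is counted as covered; this is an assumption, since the metric labels use ">".

```python
def coverage_metrics(depths: list[int]) -> dict[str, float]:
    """Compute average depth and the share of bases at >=10x and >=20x."""
    n = len(depths)
    if n == 0:
        raise ValueError("target region has no bases")
    return {
        "average_coverage": sum(depths) / n,
        "pct_bases_10x": 100.0 * sum(d >= 10 for d in depths) / n,
        "pct_bases_20x": 100.0 * sum(d >= 20 for d in depths) / n,
    }
```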

Blue bars represent each of these parameters per sample, while a vertical line represents a general metric across all the samples of the same case type in the account.

Percentage of mapped reads

Percentage of reads mapped to the reference sequence.

Blue bars represent this metric per sample, while a vertical line represents a general metric across all the samples of the same case type in the account.

Sequencing error rate

Sequencing error rate refers to the frequency at which incorrect base calls are made during the sequencing process.

Blue bars represent this metric per sample, while a vertical line represents a general metric across all the samples of the same case type in the account.

Array sample quality metrics

Array sample quality

The Quality status provides a quick assessment of array data reliability for each sample:

High Call rate ≥ 0.99 and Log R dev ≤ 0.2
Low Either condition is not met
N/A QC file is not available

Use the Quality status to quickly screen whether a sample meets minimal QC thresholds before starting detailed interpretation.
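
The Quality status rule can be expressed as a small function. This is a sketch with a hypothetical function name; a missing QC file is represented by passing None for either value.

```python
from typing import Optional


def array_quality_status(call_rate: Optional[float],
                         log_r_dev: Optional[float]) -> str:
    """Return High, Low, or N/A according to the thresholds above."""
    if call_rate is None or log_r_dev is None:
        return "N/A"  # QC file not available
    return "High" if call_rate >= 0.99 and log_r_dev <= 0.2 else "Low"
```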

Array sex validation

Sex validation results:

Pass Reported sex matches the estimated sex.
Fail A mismatch was detected between reported and estimated sex.
N/A QC file not available; validation could not be performed.

If the sex was marked as unknown during case creation, the system will display the predicted sex instead of a validation status.

CNV overall ploidy

The CNV overall ploidy field displays the ploidy value extracted from the CNV VCF header. If no CNV VCF file is provided, "N/A" is displayed.

Displayed to three decimal places.

The value is shown as is. The system does not validate or flag abnormal ploidy values. Interpret ploidy in context.

Autosomal call rate

The Autosomal call rate field displays the percentage of autosomal loci on the array for which a genotype call was successfully made.

A high call rate indicates a high-quality sample and successful genotyping. Low call rates can signify problems with the DNA sample (poor quality or quantity) or issues during the array processing.

Displayed to three decimal places.

Call rate

The Call rate field displays the percentage of loci on the array for which a genotype call was successfully made.

Call rate is one of the key metrics used to determine array sample quality, alongside Log R deviation.

A high call rate indicates a high-quality sample and successful genotyping. Low call rates can signify problems with the DNA sample (poor quality or quantity) or issues during the array processing.

Displayed to three decimal places.

Log R deviation

The Log R Deviation (or Log R Ratio standard deviation) quantifies the variability of the signal intensity for each SNP marker on an array, i.e., the noise level.

Log R deviation is one of the key metrics used to determine array sample quality, alongside call rate.

Lower values indicate more consistent signal intensities. A high Log R Deviation can indicate a poor-quality sample or potential issues with CNV calling.

Displayed to three decimal places.

DRAGEN QC report

The DRAGEN sample-level quality control (QC) report is generated by the Illumina DRAGEN Bio-IT Platform and covers the entire analysis workflow—from raw sequencing reads to variant calls.

DRAGEN QC report formats

Interactive HTML summary A visual summary that includes interactive plots of key quality metrics. This report can be accessed from the Sample quality section of the Lab tab.

CSV metric files A set of detailed CSV files containing sample-level quality metrics. These files are downloadable and support in-depth review and documentation.

Prerequisites for accessing the DRAGEN QC report

To view the DRAGEN QC report within an Emedgene case, one of the following workflows must be used:

A. FASTQ case

Run a FASTQ case in Emedgene. Since the DRAGEN analysis is integrated into the Emedgene secondary analysis pipeline, QC reports are automatically generated.

B. Bring your own DRAGEN (BYOD)

Run the DRAGEN analysis externally. Then upload a VCF case to Emedgene, along with the QC report metrics files. This workflow is supported for batch case upload via UI and CLI, and for API-based case creation. Files must be prepared as described here.

Review interactive DRAGEN report

When available, a link appears below the sample name in the Sample quality section of the Lab tab. Clicking the link opens the detailed quality control metrics report in a new browser tab. This integration allows users to quickly assess sequencing quality and confidently interpret results—without leaving the Emedgene interface.

Download DRAGEN QC metrics files

Sample-level DRAGEN QC metric files for all samples in a case can be downloaded by clicking the download icon next to the Sample quality section title.

For NGS cases, the report includes coverage and mapping statistics.

For array cases, metrics include array QC values such as call rate, autosomal call rate, and Log R dev.

Pedigree section

The Pedigree section displays relatedness metrics and the results of relationship validation for each pair of samples in the family tree.

Included metrics

Relatedness coefficient (observed)

Shows the observed coefficient of relatedness between sample pairs. The expected coefficient is available via hover tooltip for quick comparison.

IBS0

Identity by state 0 is the number of genomic loci where two individuals share zero alleles. This occurs when the two individuals are opposite homozygotes for a biallelic SNP.

This metric is calculated across a set of biallelic SNPs and is inversely related to the degree of genetic similarity between the individuals. A low IBS0 count suggests a higher degree of overall genetic similarity, but it is an indirect and limited measure of genetic relatedness that requires interpretation alongside other metrics.

Relationship validation result

Summarizes the outcome of the relationship validation, confirming whether the observed data aligns with the expected pedigree structure.

Relationship validation calculation

Relationship validation is done by Peddy based on:

Relatedness coefficient (𝑟): a measure of how much two individuals share alleles from a common ancestor, indicating the probability that alleles at the same genome location are identical by descent
IBS0 (Identity by state 0): the number of genomic loci where two individuals share zero alleles, i.e., they are opposite homozygotes
IBS2 (Identity by state 2): the number of genomic loci where two individuals share two alleles, meaning they have the exact same genotype
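
The IBS0 and IBS2 definitions can be illustrated with a short counting routine. This sketch follows the definitions above rather than Peddy's actual implementation. It assumes genotypes at the same biallelic SNPs are encoded as the number of alternate alleles (0, 1, or 2), with None for a missing call; the function name is hypothetical.

```python
from typing import Optional, Sequence


def ibs_counts(genotypes_a: Sequence[Optional[int]],
               genotypes_b: Sequence[Optional[int]]) -> tuple[int, int]:
    """Count IBS0 and IBS2 sites for two samples genotyped at the same SNPs."""
    if len(genotypes_a) != len(genotypes_b):
        raise ValueError("both samples must cover the same sites")
    ibs0 = ibs2 = 0
    for a, b in zip(genotypes_a, genotypes_b):
        if a is None or b is None:
            continue  # skip sites with a missing call
        if {a, b} == {0, 2}:
            ibs0 += 1  # opposite homozygotes: zero shared alleles
        elif a == b:
            ibs2 += 1  # identical genotype: two shared alleles
    return ibs0, ibs2
```

A parent and child share at least one allele at every site, so their IBS0 count is expected to be near zero apart from genotyping errors, whereas siblings can be opposite homozygotes at some sites.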

Peddy takes the inferred relationships from the genetic data and cross-references them against the declared relationships. For every pair of individuals in a cohort, Peddy calculates a coefficient of relatedness from the genotypes observed at the sampled sites.

For each possible pair of samples in a pedigree, the expected relatedness coefficient based on the declared family relationship is compared with the observed relatedness coefficient (𝑟). The IBS0 value helps to differentiate between sibling and parent–child relationships, both of which are expected to have a relatedness coefficient of ~50% (see table).

Table columns: Inferred relationship; Expected IBS0 number; Expected 𝑟 (%); Observed 𝑟 (%) and interpretation

Analysis tools tab

Variant table

Variant table row formatting

The formatting of variant table rows provides visual cues about the variant status for the current user within a specific case.

Variant viewing status

Font weight indicates whether the variant has been viewed by the current user:

Viewed: Displayed in regular font weight. A variant is marked as viewed if the user opened the variant page before the case was finalized
Not viewed: Displayed in bold font weight

Note: After a case , all variants appear as not viewed.

Variant tagging status

Font color indicates whether the variant has been tagged, either by any user or by the AI Shortlist:

Not tagged: Displayed in black font
Tagged by the AI Shortlist: Displayed in green font
Tagged by any user: Displayed in blue font
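
The two formatting rules combine independently, as the following sketch shows. The function name, the dictionary keys, and the tagged_by values are hypothetical, and the case of a variant tagged by both a user and the AI Shortlist is not covered because the text does not describe it.

```python
from typing import Optional


def row_style(viewed: bool, tagged_by: Optional[str]) -> dict[str, str]:
    """Return the font weight and color for a variant table row."""
    colors = {None: "black", "ai_shortlist": "green", "user": "blue"}
    if tagged_by not in colors:
        raise ValueError(f"unknown tag source: {tagged_by}")
    return {
        "font_weight": "regular" if viewed else "bold",
        "font_color": colors[tagged_by],
    }
```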

Table columns: Formatting of the variant table row; Viewed by the current user?; Tagged?

Variant table view customization

The table view is customizable:

Columns can be reordered by drag-and-drop
Any column can be shown or hidden by selecting the columns in the Show/Hide columns menu in the top right corner of the page
You can choose between comfort and compact view by clicking the corresponding button

All modifications are automatically saved for each individual user and retained until new changes are made.

Manually add variants to a delivered case

In some cases, you may need to record variants that were not included in the uploaded VCF or not called from FASTQ data. This is particularly useful when:

You are complementing NGS results with findings from other genetic tests (e.g., long-read sequencing, optical mapping, CGH, SNP array, karyotyping/FISH, repeat-primed PCR, MLPA, Southern blot).
You wish to report on adjacent CNV calls as a single CNV event.
You wish to report a set of adjacent variants together as a single multi-nucleotide variant (MNV).

Manually adding variants ensures that all relevant findings are visible in the case review and can be considered during interpretation.

Currently supported variant types: SNV, CNV, UPD, ROH, and STR.

Note: SV support is planned for future releases.

To manually add a variant:

Open the Analysis Tools tab.
Click the plus (+) button in the top-right corner.

Note: If you do not see the option, please contact support to verify your user role permissions.

In the Manually Add Variant window, select the variant type: SNV, CNV, UPD, ROH, or STR.
Fill in the details based on the selected type:

SNV: Chromosome, Position, REF, ALT, Zygosity

CNV, UPD, or ROH: Chromosome, Position Start, Position End, REF, ALT, Type, Zygosity
- CNV types: DEL, DUP
- UPD types: IUPDMAT (maternal isodisomy), IUPDPAT (paternal isodisomy), HUPDPAT (paternal heterodisomy), HUPDMAT (maternal heterodisomy)

STR: Chromosome, Position, REF Repeats Number, ALT Repeats Number, Repeats Unit, Zygosity

Click on Create Variant to add it to the case.
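
If you prepare manual variant entries in advance, a quick completeness check like the one below can help catch missing details before you fill in the form. This is only a sketch: the function name is hypothetical, the assignment of field groups to SNV, CNV, UPD, and STR is inferred from the field lists above, and ROH is left out because no Type values are listed for it.

```python
POINT_FIELDS = ["Chromosome", "Position", "REF", "ALT", "Zygosity"]
RANGE_FIELDS = ["Chromosome", "Position Start", "Position End",
                "REF", "ALT", "Type", "Zygosity"]
STR_FIELDS = ["Chromosome", "Position", "REF Repeats Number",
              "ALT Repeats Number", "Repeats Unit", "Zygosity"]

# Assumed mapping of variant type to form fields (inferred, not confirmed).
REQUIRED_FIELDS = {"SNV": POINT_FIELDS, "CNV": RANGE_FIELDS,
                   "UPD": RANGE_FIELDS, "STR": STR_FIELDS}

# Type values listed in the text for CNV and UPD.
ALLOWED_TYPES = {
    "CNV": {"DEL", "DUP"},
    "UPD": {"IUPDMAT", "IUPDPAT", "HUPDPAT", "HUPDMAT"},
}


def validate_manual_variant(variant_type: str, values: dict) -> list[str]:
    """Return a list of problems; an empty list means the entry is complete."""
    if variant_type not in REQUIRED_FIELDS:
        raise ValueError(f"unsupported variant type: {variant_type}")
    problems = [f"missing {name}" for name in REQUIRED_FIELDS[variant_type]
                if values.get(name) in (None, "")]
    allowed = ALLOWED_TYPES.get(variant_type)
    type_value = values.get("Type")
    if allowed and type_value not in (None, "") and type_value not in allowed:
        problems.append("Type must be one of " + ", ".join(sorted(allowed)))
    return problems
```

For instance, a CNV entry whose Type is set to a value other than DEL or DUP is reported as a problem, while an SNV entry with all five fields filled in returns an empty list.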

Viewing manually added variants

Manually added variants have a blue frame and are clearly marked with the label “Manually added variant” on the Variant page.
These variants differ from pipeline-called variants:
- Quality and Visualization sections are not available.
- Population Statistics are not shown.
- Automatic ACMG scoring is not applied, but you can still manually add ACMG tags, interpretation notes, and classifications.

Sorting and formatting notes

STR variants that are added manually do not fully align with the pipeline-called STR format. For example, variant length is not displayed.
Manually added variants cannot be sorted by columns that do not apply, such as AI Rank.

Tip: When reviewing manually added STRs, rely on the repeat numbers and unit provided, since length formatting is not included.

Filtering manually added variants

To view only the variants you have added manually:

In the Evidence & Tags Filters section, select Manually added variants.

This helps you focus specifically on variants that were entered outside of the automated pipeline.
