New in Emedgene V32.0 (June 8th, 2023)
New Features (June 8th, 2023)
New AI Shortlist Features and Enhancements
In this version, we continued to focus on AI innovation, with the goal of streamlining workflows and further reducing time per case.
AI Shortlist now shortlists mtDNA variants
For customers analyzing mtDNA variants in Emedgene, the AI Shortlist will now shortlist SNV/CNV mtDNA variants (homoplasmy or heteroplasmy) that are likely to solve the case.
AI Shortlist now shortlists STR variants
For customers analyzing STR variants in Emedgene, the AI Shortlist will now shortlist STR variants with expansion size within the pathogenic range according to gnomAD STR and that are likely to solve the case, with evidence.
AI Shortlist now shortlists SV insertions
For customers analyzing SV insertions in Emedgene, the AI Shortlist will now shortlist SV insertions that are likely to solve the case, with evidence. Due to variant calling accuracy limitations, insertions, as well as other CNVs, are considered by the AI Shortlist in addition to the regular SNVs.
Shortlisting with evidence for Carrier Screening
Support for sequential (single parent) carrier status.
This new setting will prioritize variants for carrier status, rather than affected, in order to reduce time per case for expanded carrier screening applications.
AI Shortlist can be configured on the organization level to consider only known P/LP variants or both known and high severity variants.
Analysis type ‘carrier’ should be selected during case accessioning.
Output of the AI Shortlist will be variants tagged as ‘carrier’ rather than the typical ‘most likely’ and ‘candidates’.
A customer-generated gene list is required for the carrier analysis workflow.
Improved recall for AI Shortlist Focused Mode to 96%
Our AI Shortlist model can be run in two modes: Discovery Mode, which includes genes of unknown significance (GUS), and Focused Mode, which excludes GUS. In this version, we improved the recall of Focused Mode to 96%, nearing the 97% recall of Discovery Mode. Focused Mode will display up to 7 SNVs, indels, mtDNA and STR variants, and up to an additional 3 CNV variants.
This new Focused Mode was validated on 3 datasets totaling 330 cases solved by SNV, indel, and CNV variants, and demonstrated 96% recall.
Performance Enhancement
AI Shortlist code was refactored to improve performance, scalability, and clean up errors. A bit-exact validation was performed on 3 datasets totaling 330 cases solved by SNV, indel, and CNV variants. A comparison of the precise ranking of most-likely and candidate variants was performed to ensure recall was not affected for cases solved by SNV, indels or CNV variants.
Full Featured Genome: New! Support for SMN, ROH, Ploidy & BAF visualization
Support for the DRAGEN SMN caller
DRAGEN SMN calls SMN1 and SMN2 copy numbers, enabling the analysis of 95% of SMA cases by determining the absence of the functional SMN1 allele in any copy of SMN. This targeted caller overcomes the challenge in producing complete variant caller results with standard WGS due to a high-similarity duplication of SMN1 and the paralog SMN2. Learn more about the SMN Caller.
SMN1 copy number variants can be shortlisted by the AI Shortlist and interpreted just like other CNVs.
SMN2 copy number can also be reviewed and interpreted as a prognostic biomarker of SMA clinical severity.
Carrier status can be assessed by interpreting SMN1 copy number.
Support for the DRAGEN ROH (Regions of Homozygosity) caller
DRAGEN ROH detects and outputs the runs of homozygosity from whole genome calls on autosomal human chromosomes. Sex chromosomes are ignored. ROH output enables screening and prediction of consanguinity between the parents of the proband subject (see DRAGEN™ Bio-IT Platform documentation for more details about the algorithm).
In Emedgene, the ROH BED will be displayed in the IGV viewer. On hover, ROH score, # of Hom SNPs and # of Het SNPs in the region as well as Location (start to end positions) will be displayed.
Support for the DRAGEN Ploidy estimator
The DRAGEN Ploidy estimator is used to assess whether there are any copy number variants of complete chromosomes (trisomy, monosomy, etc.). The results of the Ploidy estimator will be displayed on the lab tab. If ploidy estimation is marked as failed, the lab tab will show a warning in the page header. Results will be available to download from the sample quality section (see DRAGEN™ Bio-IT Platform documentation for more details about the algorithm).
Support for B Allele Frequency visualization
The BAF bigwig file is available to view in the additional tracks, both in simple and advanced modes. It will be available for WES and WGS samples called with the Emedgene DRAGEN pipeline.
To avoid confusion, the TNF track, previously labeled as BigWig, is renamed TNF.
Research genome test type
A new Research Genome test type enables full analysis of a genome with no BED intersect. Most of the newly available variants will receive limited or no annotations. The Research Genome significantly increases time per case and impacts performance, and is only recommended for research labs that need to interpret non-coding variants. The Emedgene Whole Genome test type covers 5 Kbp upstream/downstream of each gene and known variants from the non-coding regions, all of which will be annotated.
Support for Illumina Complete Long Reads Interpretation
Support for the recommended Illumina Complete Long Reads interpretation flow.
Ingests the combined long-read/short-read callers for both SVs and SNVs, along with the long read BAM.
Short read callers supported by Emedgene from VCF can be added as well (CNV read-depth, STR (starting from DRAGEN 4.2)).
Once ingested, ICLR variants will be shortlisted by the AI Shortlist and automated ACMG classification will be applied where relevant.
All ICLR calling will be performed via the BSSH application and Emedgene will ingest VCFs and BAM via the existing BSSH integration.
Limitations
Emedgene does not yet support the interpretation of complex SV variants such as inversions and translocations.
ICLR cases must start from VCF; Emedgene doesn’t support case accessioning from both VCF and FASTQ.
Enhanced private sharing of curated data
Set sharing levels, opt-in/opt-out, improved interpretation workflow
Private sharing of curated data between collaborating organizations in Analyze now includes granular sharing permissions per network and opt-in/opt-out per case.
Networks are managed from [Settings | Network]. Here the dedicated network manager can:
Create networks; Labs can belong to multiple networks.
Set data sharing policy for each of them.
Leave or delete networks.
Currently, to invite other organizations to their network(s), the network manager must reach out to tech support. (Network invitation flow will be enabled in a future version).
Level of sharing of a particular case with a particular network of collaborators is defined by:
Sharing mode set for each data field for a particular network. There are 4 levels of sharing that can be applied to each data field:
Mandatory: always shared by default.
Not shared: never shared.
Restricted: shared in cases with or without the patient’s consent to Extended sharing.
Extended: shared only in cases with the patient’s consent to Extended sharing.
Sharing patterns are defined by the network manager in [Settings | Network]. Data fields that could be shared via Analyze Network include:
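| Data field | Applicable level of sharing |
| --- | --- |
| Case ID | Mandatory |
| Collaborator | Mandatory |
| Overlap (CNV) | Mandatory |
| Pathogenicity | Mandatory |
| Variant Details (CNV) | Mandatory |
| ACMG Tags | Not shared / Restricted / Extended |
| Age | Not shared / Restricted / Extended |
| Alt Repeats (STR) | Not shared / Restricted / Extended |
| Case Type | Not shared / Restricted / Extended |
| Date | Not shared / Restricted / Extended |
| Ethnicity | Not shared / Restricted / Extended |
| Phenotypes | Not shared / Restricted / Extended |
| Proband ID | Not shared / Restricted / Extended |
| Selected Disease | Not shared / Restricted / Extended |
| Sex | Not shared / Restricted / Extended |
| Tag | Not shared / Restricted / Extended |
| Variant Interpretation | Not shared / Restricted / Extended |
| Zygosity | Not shared / Restricted / Extended |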
Case subject consent for Extended data sharing.
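The interaction between the per-field sharing mode and the case subject's consent can be illustrated with a minimal Python sketch. This is not Emedgene code: the `SharingMode` enum, the `is_field_shared` and `shared_fields` functions, and the example policy are hypothetical names chosen for illustration, and the per-case opt-in/opt-out is not modeled.

```python
from enum import Enum


class SharingMode(Enum):
    """The four sharing levels described in the release notes."""
    MANDATORY = "mandatory"
    NOT_SHARED = "not_shared"
    RESTRICTED = "restricted"
    EXTENDED = "extended"


def is_field_shared(mode: SharingMode, consent_to_extended: bool) -> bool:
    """Return True if a field with this mode is shared for a given case."""
    if mode in (SharingMode.MANDATORY, SharingMode.RESTRICTED):
        # Shared with or without consent to Extended sharing.
        return True
    if mode is SharingMode.EXTENDED:
        # Shared only when the case subject consented to Extended sharing.
        return consent_to_extended
    return False  # NOT_SHARED


def shared_fields(policy: dict, consent_to_extended: bool) -> list:
    """List the field names shared under a network policy, sorted by name."""
    return sorted(
        name for name, mode in policy.items()
        if is_field_shared(mode, consent_to_extended)
    )


# Hypothetical network policy for illustration only.
policy = {
    "Case ID": SharingMode.MANDATORY,
    "Phenotypes": SharingMode.RESTRICTED,
    "Zygosity": SharingMode.EXTENDED,
    "Ethnicity": SharingMode.NOT_SHARED,
}

print(shared_fields(policy, consent_to_extended=False))
print(shared_fields(policy, consent_to_extended=True))
```

In this sketch, only the Extended fields change between a case with consent and a case without it; Mandatory and Restricted fields are shared either way.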
Review details of variants tagged by collaborators within organization network(s) in the [Variant page | Related cases section].
View modes for [Variant page | Related cases section]: Simple (check/uncheck Network data from all collaborators) and Advanced (select collaborators).
For each variant, extended sharing details are available at a click, as is the ability to contact a collaborator.
Analysis tools improvements: Exclude variants based on pathogenicity, view Presets & sort columns
Exclude variants based on pathogenicity
Exclude variants from your Preset filters based on their pathogenicity in ClinVar or Curate
Save time by excluding benign variants from your Preset filters. Benign variants can be identified from multiple sources (ClinVar, Curate) and can be excluded, as defined by the customer. For ClinVar, the classification can also be matched with a star review classification for more granularity in what you exclude. This filter may be applied to any individual preset bin. Contact techsupport@illumina.com to request an update to your Preset filters.
View Presets
Added the ability to view the filters underlying each Preset. For variant analysts working on a preset filter, this will enable a quick refresher on the Preset composition, within the analysis flow.
Actions:
Click on the name of the Preset to review the Preset variants.
Click on the dropdown arrow to expand the Preset definition.
Click on the checkbox to indicate that you have completed your review of this Preset.
Sort columns
Sort any column in the analysis table, with the exception of Phenomatch score (which will be available in v33) and Tags.
Primary and secondary sorting are enabled.
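As an illustration of what primary and secondary sorting means for a table, here is a minimal Python sketch. The row fields (`gene`, `depth`) and the `sort_rows` helper are hypothetical and do not reflect the analysis table's actual columns or implementation.

```python
from operator import itemgetter


def sort_rows(rows, primary, secondary,
              primary_desc=False, secondary_desc=False):
    """Sort by the primary column, breaking ties with the secondary column.

    Python's sort is stable, so sorting by the secondary key first and
    then by the primary key yields the combined ordering.
    """
    by_secondary = sorted(rows, key=itemgetter(secondary),
                          reverse=secondary_desc)
    return sorted(by_secondary, key=itemgetter(primary),
                  reverse=primary_desc)


# Hypothetical rows for illustration only.
rows = [
    {"gene": "HFE", "depth": 30},
    {"gene": "BTD", "depth": 45},
    {"gene": "HFE", "depth": 60},
]

# Primary: gene ascending. Secondary: depth descending.
for row in sort_rows(rows, "gene", "depth", secondary_desc=True):
    print(row["gene"], row["depth"])
```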
Search alternative allele
In the search bar, we now enable searching by a specific SNV/indel variant following the : > format.
Spaces are supported.
This search will return an exact match.
Add New Case: Single click batch upload of cases
Case accessioning now enables batch upload of cases from the UI! This single-click upload of a case manifest file allows customers without API integrations to easily upload batches of up to 50 cases at once.
Download the Batch Upload csv template, which is compatible with Sample Sheet v2, adding fields needed for tertiary analysis.
Every field available in manual case creation and the API is included in the template, including Family Id, Case Type, Files Names, Visualization Files, Execute Now, BioSample Name, Relation, Gender, Phenotypes, Phenotypes Id, Boost Genes, Gene List Id, Kit Id, Selected Preset, Due Date, Label Id, Clinical Notes, Opt In, Storage Provider Id, Date Of Birth, and Default Project, including Additional Fields.
Cases can be created with or without samples, just like in UI and API.
Data from unknown columns will be saved in the 'additional data' fields.
The uploaded file will be automatically validated. A CSV with reported issues can be downloaded, fixed and reuploaded.
Once the full batch has been uploaded and no issues are identified, the cases will be created and analysis will begin.
Up to 50 cases can be created at once.
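Before uploading, a manifest can be checked locally against the 50-case limit. The Python sketch below is not part of Emedgene. It assumes, for illustration only, that each distinct value in the `Family Id` column corresponds to one case. Confirm this against the Batch Upload template before relying on it; the platform's own validation remains authoritative.

```python
import csv
import io

MAX_CASES_PER_BATCH = 50  # limit stated in the release notes


def count_cases(csv_text: str, case_column: str = "Family Id") -> int:
    """Count distinct non-empty values in the assumed case-identifier column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if not reader.fieldnames or case_column not in reader.fieldnames:
        raise ValueError(f"missing column: {case_column!r}")
    case_ids = {(row.get(case_column) or "").strip() for row in reader}
    case_ids.discard("")  # ignore rows with no case identifier
    return len(case_ids)


def within_batch_limit(csv_text: str, case_column: str = "Family Id") -> bool:
    """Return True if the manifest holds no more cases than the batch limit."""
    return count_cases(csv_text, case_column) <= MAX_CASES_PER_BATCH


# Hypothetical manifest: three samples across two families.
manifest = "Family Id,BioSample Name\nFAM1,S1\nFAM1,S2\nFAM2,S3\n"
print(count_cases(manifest), within_batch_limit(manifest))
```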
Workbench | Workflow Improvements: IGV desktop integration, finalize case & more
Improved IGV desktop integration
The Emedgene variant page contains an IGV web viewer, which has limited functionality compared to IGV desktop. Some customers prefer to work in a dual-screen setup, with IGV desktop open on a second screen.
Emedgene has an existing IGV desktop API integration for customers who start from VCF. In this version, the integration was expanded to customers who start from FASTQ or store files in an AWS S3 storage bucket.
By clicking the Load to IGV button, all available BAM/CRAM files will be loaded to the desktop IGV (only if the application is already running), and customers can further enhance and customize their viewer.
Case interpretation flow update
Users must select a case interpretation status (confidently solved, likely solved, uncertain, negative) before moving the case status to ‘Finalized’.
Interpretation notes, Gene interpretation and Recommendations are saved even if interpretation status is not selected.
Importantly, finalize case via API remains possible.
Add new case | Create existing gene lists
Increase flexibility to support customer workflows by enabling the creation of new gene lists that are identical to existing gene lists from API and UI. This will enable customers with a complex and large panel test menu to more accurately manage case creation.
Support for gene lists with up to 10,000 genes
Increase flexibility to support customer workflows by enabling the creation of large gene lists. Previous limitation was 900 genes.
Gene lists can be created via API and UI; Very large gene lists (>6700 genes) may return a false error message during creation, despite being successfully created.
New ACMG SNV Scoring!
To ease the burden of interpreting Variants of Unknown Significance, we have added a score based on the point system described in Tavtigian et al., "Fitting a naturally scaled point system to the ACMG/AMP variant classification guidelines," Hum Mutat. 2020 Oct;41(10):1734-1737 (PMCID: PMC8011844).
Each tag can be edited, and editing will modify the score.
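The point system in the cited paper can be sketched in a few lines of Python. The point values (Supporting 1, Moderate 2, Strong 4, Very Strong 8, negative for benign evidence) and the classification thresholds follow Tavtigian et al. 2020. The function names and the evidence representation are hypothetical, and the sketch does not claim to reproduce Emedgene's implementation. The stand-alone BA1 criterion is not modeled.

```python
# Points per evidence strength, per Tavtigian et al. 2020.
POINTS = {"supporting": 1, "moderate": 2, "strong": 4, "very_strong": 8}


def acmg_points(evidence) -> int:
    """Sum points over (direction, strength) pairs.

    direction is "pathogenic" (adds points) or "benign" (subtracts points).
    """
    total = 0
    for direction, strength in evidence:
        points = POINTS[strength]
        if direction == "pathogenic":
            total += points
        elif direction == "benign":
            total -= points
        else:
            raise ValueError(f"unknown direction: {direction!r}")
    return total


def classify(points: int) -> str:
    """Map a point total to a classification using the paper's thresholds."""
    if points >= 10:
        return "Pathogenic"
    if points >= 6:
        return "Likely pathogenic"
    if points >= 0:
        return "Uncertain significance"
    if points >= -6:
        return "Likely benign"
    return "Benign"


# Example: one very strong and one moderate pathogenic criterion.
tags = [("pathogenic", "very_strong"), ("pathogenic", "moderate")]
print(acmg_points(tags), classify(acmg_points(tags)))

# Editing a tag's strength changes the score, and possibly the class.
edited = [("pathogenic", "strong"), ("pathogenic", "moderate")]
print(acmg_points(edited), classify(acmg_points(edited)))
```

Because the score is a plain sum, it also shows how far a Variant of Unknown Significance sits from the Likely pathogenic or Likely benign thresholds.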
Additional:
API | New query returns a list of genes and NCBI IDs associated to the case phenotypes using the Phenomeld algorithm. This query will help laboratories meet compliance requirements.
Variant Interpretation Paragraph - Now supports Catalan, Spanish, Portuguese, and Hebrew symbols.
New Settings
Manage your own S3 credentials
Users (with the roles Manager and Manage S3 Credentials) can now manage access to Emedgene AWS S3 buckets for Upload & Outputs, under Settings -> Management.
Users can create up to two access keys.
When creating a new key, there is only one opportunity to save/copy the access key and settings (warning appears).
Users can deactivate keys.
Following deactivation, users can delete keys.
A help center article details the S3 buckets and how to access them.
If assistance from Tech Support is needed, a new access key can be created and deleted after retrieval of the files needed for troubleshooting.
Gene Lists | View NCBI ID associated with a gene symbol; Download gene lists with NCBI IDs
The Emedgene pipeline and platform associates NCBI IDs to gene symbols, and preferentially utilizes NCBI IDs throughout the software to alleviate gene list changes due to gene symbol changes or user errors.
In the Management Page, users can view the NCBI IDs associated to the gene symbols within a list and download gene lists with NCBI IDs for review.
Limitations
The accuracy of the AI Shortlist for the new variants (INS, mtDNA, STR, CNV/SV) depends on the variant calling accuracy. The user should be careful when interpreting these challenging-to-call variants.
The new callers and visualizations (SMN, ROH, Ploidy, BAF) are only available for customers running WGS from FASTQ on Emedgene, with the exception of BAF, which is also available for customers running WES from FASTQ on Emedgene.
To enable the SMN caller on your account please contact Support.
Load to IGV will only work if IGV desktop is running.
Load to IGV will upload all available case tracks to the desktop on every click. If a user accidentally clicks multiple times, the tracks will be uploaded multiple times.
Emedgene Curate does not yet support CNV variants larger than 10 Mbp or intergenic variants.
Fixed Issues
The AI Shortlist typically ignores commonly occurring variants. There is an exception list of common variants known to cause disease. This list was expanded, and now includes:
| Gene | Variant Position (hg19) | Max Allele Freq. (Population) |
| --- | --- | --- |
| RBM8A | chr1-145507765-G-C | 2.3% (Europeans) |
| BTD | chr3-15686693-G-C | 5.5% (Europeans) |
| F11 | chr4-187201412-T-C | 2.4% (Ashkenazi Jewish) |
| HFE | chr6-26091179-C-G | 14.4% (Europeans) |
| HFE | chr6-26093141-G-A | 5.7% (Europeans) |
| CYP21A2 | chr6-32007887-G-T | 2.4% (Ashkenazi Jewish) |
| HBB | chr11-5248232-T-A | 4.5% (African/African Americans) |
| GJB2 | chr13-20763612-C-T | 8.3% (East Asians) |
| MEFV | chr16-3293310-A-G | 3.9% (Ashkenazi Jewish) |
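The idea of an exception list layered on top of a population-frequency filter can be sketched in Python as follows. The variant keys are the hg19 positions listed above. The 1% cutoff and the function name are assumptions made for illustration, since the actual thresholds and logic of the AI Shortlist are not described in these notes.

```python
# hg19 variant keys from the exception list in these release notes.
COMMON_PATHOGENIC_EXCEPTIONS = {
    "chr1-145507765-G-C",   # RBM8A
    "chr3-15686693-G-C",    # BTD
    "chr4-187201412-T-C",   # F11
    "chr6-26091179-C-G",    # HFE
    "chr6-26093141-G-A",    # HFE
    "chr6-32007887-G-T",    # CYP21A2
    "chr11-5248232-T-A",    # HBB
    "chr13-20763612-C-T",   # GJB2
    "chr16-3293310-A-G",    # MEFV
}

ASSUMED_MAX_AF_CUTOFF = 0.01  # hypothetical cutoff, not from the source


def passes_frequency_filter(variant_key: str, max_allele_freq: float,
                            cutoff: float = ASSUMED_MAX_AF_CUTOFF) -> bool:
    """Keep rare variants, plus common variants on the exception list."""
    if variant_key in COMMON_PATHOGENIC_EXCEPTIONS:
        return True  # known disease-causing despite high frequency
    return max_allele_freq <= cutoff


# The HFE variant at 14.4% is kept; an unlisted variant at 14.4% is not.
print(passes_frequency_filter("chr6-26091179-C-G", 0.144))
print(passes_frequency_filter("chr1-1000-A-T", 0.144))
```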
SV calling from WES | Updated DRAGEN SV command line for exomes, per DRAGEN recommendations.
Disabled STR calling from WES | The DRAGEN ExpansionHunter method is only available for WGS.
Update of the Emedgene UI to use the most updated version of the API, in alignment with the batch uploader and new Add New Case features.
Error message time on screen was extended to 20 seconds following multiple customer requests.
When customers tag a variant, an evidence graph is created automatically along with the variant summary and ACMG calculation. Previously, users had to enter the evidence page in order to generate these features.
Lab Tab | Insufficient regions – fixed a bug that sometimes displayed % > 100.
Gene Lists | Fixed a bug where gene names with a hyphen were split into two genes.
Evidence Text | Fixed a bug with de novo on ChrX which generated a wrong inheritance in the interpretation text.
Lab Tab | Fixed a bug where users couldn’t navigate directly to the Lab Tab.
Illumina clouds | Fixed a bug where customers moving between several organizations were not redirected to the proper URL.
Variant Page | Fixed a bug where large SNV indels were not displaying correctly in multiple page elements.
Variant Page | Visualization | Fixed a bug where TNF bigwig tracks selected from advanced mode were misaligned to labels.
Variant Page | Main effect field will always write ‘and’ instead of ‘&’ to avoid use of special characters unsupported by some reporting templates.
Case Interpretation | Fixed a bug in activity tracking that incorrectly mentioned reanalysis.
Known Issues
CNV variants larger than 20 Mbp are not annotated with gene names and effects. These variants are shortlisted by default by the AI Shortlist, but ACMG automation is not applied to them. (Fix planned for v33).
GRCh37<-->GRCh38 Liftover is not available for older components of the Network infrastructure. As a result, [Variant Page | Clinical Significance | Networks Classified] may remain erroneously empty while [Variant page | Related cases section] shows relevant information. The same gap applies to manually classified variants.
Zygosity, even when set in extended sharing, may remain blank for older cases. Once you click on a case missing zygosity it will be saved for all future views.
Sort by Phenomatch score is not available.
Sort by Tags is not available.
Analysis Type field is not available with this version of Batch Upload, and it cannot be used to initiate the new Carrier workflow.
Analyze | Manually added variants | STRs - A manually added STR variant cannot be tagged or reported.
API | Creation of large gene lists may return an error (due to timeout) despite the creation of the gene list.
Cases Page | Contact support link for failed cases does not work. Please use techsupport@illumina.com.
ILMN Clouds | Help Center | Some links may not work. Workaround: Paste the title into the help center search.
Variant Page | Visualization | Chromosome visualization is missing for mtDNA variants in VCF cases run on GRCh37.
Variant Page | Visualization | Simple/Advanced mode does not work for locally uploaded BAM files.
Analysis Tools | Het count filters are not precise: they include AC (total counts) and display more data than expected.
Analysis Tools | ‘Last’ button on pagination does not work.
API | Assign users to case fails with no error if faulty emails are used.
Add New Case | Creating a case from the case creation summary does not work. Please click the top Add New Case button on the cases page instead.
Lab Tab | % mapped reads is not populating data, can be viewed by downloading DRAGEN metrics. (Fix planned for V33)
Candidates Page | Compound het SNV-CNV variants will not display the automated CNV classification. Workaround – view variants from analysis table.
Candidates | When clicking on See all candidates link, variant filters are inactivated. Workaround: Reset filters to default.
Organization Settings | Set mandatory fields - does not work from the UI. Please contact support if you’d like to configure these fields for your account.
Gene Lists | Very large gene lists (>6700 genes) may return a false error message during creation, despite being successfully created.