Percentage of reads mapped to the reference sequence.
Blue bars represent each of these parameters per sample, while a vertical line represents a general metric across all the samples of the same case type in the account.
NGS sex validation
The Sex validation column indicates whether the biological sex inferred from genomic data matches the sex information provided during case creation.
This helps identify potential sample mix-ups or metadata errors before interpretation begins.
Sex validation results:
Pass
Reported sex matches the estimated sex
Fail
A mismatch was detected between reported and estimated sex.
N/A
QC file not available; validation could not be performed.
Sex validation is performed by comparing the observed homozygous/heterozygous genotype ratio on the X chromosome with the expected ratios:
<2 for females
>2 for males
Prerequisites:
Only high-quality SNVs from targeted regions—either kit-specific or RefSeq coding regions—are used for sex validation
A minimum of 50 variants is required to generate a reliable result. If this threshold is not met, sex validation cannot be performed, and no result is displayed
If the sex was marked as unknown during case creation, the system will display the predicted sex instead of a validation status.
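The ratio check described above can be sketched as follows. This is a minimal illustration, not the platform's implementation: the function names, the handling of a ratio of exactly 2, and the handling of zero heterozygous calls are assumptions made for the example.

```python
from typing import Optional

MIN_VARIANTS = 50    # minimum number of variants stated in the prerequisites
RATIO_CUTOFF = 2.0   # hom/het ratio on chrX: <2 expected for females, >2 for males


def estimate_sex(hom_count: int, het_count: int) -> Optional[str]:
    """Return 'female', 'male', or None when no result can be produced."""
    if hom_count + het_count < MIN_VARIANTS:
        return None  # too few variants: no result is displayed
    if het_count == 0:
        return "male"  # assumption: no heterozygous calls is treated as an infinite ratio
    ratio = hom_count / het_count
    if ratio < RATIO_CUTOFF:
        return "female"
    if ratio > RATIO_CUTOFF:
        return "male"
    return None  # assumption: a ratio of exactly 2 is not covered by the documented ranges


def sex_validation(reported: str, hom_count: int, het_count: int) -> Optional[str]:
    """Return 'Pass', 'Fail', the predicted sex (reported sex unknown), or None."""
    estimated = estimate_sex(hom_count, het_count)
    if estimated is None:
        return None
    if reported == "unknown":
        return estimated  # the predicted sex is shown instead of a status
    return "Pass" if reported == estimated else "Fail"
```

For example, 40 homozygous and 60 heterozygous chrX SNVs give a ratio below 2, so a sample reported as female would pass.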
Call rate
The Call rate field displays the percentage of loci on the array for which a genotype call was successfully made.
Call rate is one of the key metrics used to determine array sample quality, alongside log R deviation.
A high call rate indicates a high-quality sample and successful genotyping. Low call rates can signify problems with the DNA sample (poor quality or quantity) or issues during the array processing.
Displayed to three decimal places.
Sequencing lab information section
The Sequencing lab information section reports the sequencing run details specified during case creation:
Lab
Instrument
Reagents
Kit type
Expected coverage
Protocol
Array sample quality metrics
Ploidy
The Ploidy column provides results from the DRAGEN Ploidy Estimator, which is designed to detect aneuploidies and determine the sex karyotype in whole genome cases.
Ploidy estimation results:
Pass
All autosomes fall within the expected ploidy range.
Fail
At least one autosome shows a median score outside the expected thresholds (below 0.9 or above 1.1).
When you hover over a failed result, the system displays which chromosomes are problematic.
N/A
Case type is not Whole genome
or
QC file not available; validation could not be performed.
The ploidy calculation uses values from the *.ploidy_estimation_metrics.csv file.
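The Pass/Fail rule above can be expressed as a short check. This sketch assumes the per-autosome median scores have already been read from the metrics file into a dictionary; the column layout of the CSV is not described here, so parsing is left out, and the function name is hypothetical.

```python
from typing import Dict, List, Tuple

LOWER, UPPER = 0.9, 1.1  # thresholds quoted in the Fail description


def ploidy_status(autosome_medians: Dict[str, float]) -> Tuple[str, List[str]]:
    """Return the status and the list of autosomes outside the expected range."""
    failed = [
        chrom
        for chrom, score in autosome_medians.items()
        if score < LOWER or score > UPPER
    ]
    return ("Fail" if failed else "Pass", failed)
```

The returned list of chromosomes corresponds to what the hover tooltip shows for a failed result.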
Tips:
Use ploidy checks early in case review to spot potential large-scale chromosomal abnormalities.
Always confirm that the sex karyotype inferred from ploidy matches the sex validation results to rule out sample swaps.
Failed results do not confirm clinical abnormalities. They only indicate a deviation in copy number estimation and should be reviewed in the context of other QC metrics and visualization.
Download DRAGEN QC metrics files
Sample-level DRAGEN QC metric files for all samples in a case can be downloaded by clicking the download icon next to the Sample quality section title.
For NGS cases, the report includes coverage and mapping statistics.
For array cases, metrics include array QC values such as call rate, autosomal call rate, and Log R dev.
Contamination
The Contamination column reports whether a sample shows signs of DNA contamination, helping ensure data reliability before interpretation.
Be mindful that when contamination is suspected in sequencing data, it could stem from various sources, including true contamination, sample mix-up, library preparation issues, or technical artifacts.
Always confirm the issue with other quality checks.
Contamination is detected using a calculation that estimates the proportion of reads that do not match the expected genotype. This estimate is based on the idr_baf value.
Case quality section
The Case quality section summarizes the data quality of the case and highlights the results of validation checks:
Chromosome validation
Confirms that each chromosome with at least 100 SNVs in defined enrichment kit or coding regions includes at least one high-quality variant
gnomAD validation
Verifies that each chromosome with at least 100 SNVs in defined enrichment kit or coding regions includes at least one variant annotated with gnomAD
ClinVar validation
Ensures that each chromosome with at least 100 SNVs in defined enrichment kit or coding regions includes at least one variant annotated with ClinVar
AI Shortlist validation
Checks that at least one variant is tagged by the AI Shortlist.
This validation is not applicable if the gene list contains fewer than 50 genes
If your workgroup uses a higher threshold, it is reflected in the Gene list threshold field
mtDNA reference validation
Confirms that the rCRS reference is used for mitochondrial DNA
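The chromosome, gnomAD, and ClinVar validations share the same shape: every chromosome with at least 100 SNVs must contain at least one variant with a given property. The sketch below illustrates that rule only. The variant dictionaries, the `chrom` key, and the predicate are hypothetical stand-ins, not the system's data model.

```python
from collections import defaultdict
from typing import Callable, Iterable, List

MIN_SNVS = 100  # chromosomes with fewer SNVs are not evaluated


def chromosomes_failing(
    variants: Iterable[dict], has_property: Callable[[dict], bool]
) -> List[str]:
    """Return chromosomes with >= MIN_SNVS variants and none satisfying the property."""
    total = defaultdict(int)
    hits = defaultdict(int)
    for variant in variants:
        total[variant["chrom"]] += 1
        if has_property(variant):
            hits[variant["chrom"]] += 1
    return sorted(c for c, n in total.items() if n >= MIN_SNVS and hits[c] == 0)
```

An empty result means the validation passes. For the gnomAD check, the predicate could be something like `lambda v: v.get("gnomad") is not None`, assuming such a field exists in your data.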
idr_baf stands for the interdecile range of the B-allele frequency—calculated as the difference between the 90th and 10th percentiles of the distribution of alt / (ref + alt) ratios across all variant sites.
A larger idr_baf value indicates greater variability in allele balance, which may suggest sample contamination, particularly from another human DNA sample.
Contamination check results:
N/A
No data is available (older cases or when idr_baf = 0.000).
No
No contamination detected (idr_baf < 0.200).
Unlikely
Possible contamination, but evidence is weak (0.200 ≤ idr_baf < 0.241).
Hover over the value to display a tooltip showing the HET ratio (proportion of sites that are heterozygous) and the HET count (number of heterozygote calls in sampled sites).
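To make the idr_baf definition and the result bands concrete, here is a small sketch. The percentile interpolation method is an assumption (the exact method used by the platform is not stated), so values may differ slightly from what is reported. Only the bands listed above are encoded; the cutoffs separating "Likely" from "Yes" are not given in this section, so they are not guessed.

```python
from statistics import quantiles
from typing import List, Optional, Tuple


def idr_baf(sites: List[Tuple[int, int]]) -> Optional[float]:
    """Interdecile range of alt / (ref + alt) over (ref, alt) read-count pairs."""
    bafs = [alt / (ref + alt) for ref, alt in sites if ref + alt > 0]
    if len(bafs) < 2:
        return None  # not enough sites to compute percentiles
    # Nine cut points: index 0 is the 10th percentile, index 8 the 90th.
    deciles = quantiles(bafs, n=10, method="inclusive")
    return deciles[8] - deciles[0]


def contamination_label(value: Optional[float]) -> str:
    """Map an idr_baf value to the documented result bands."""
    if value is None or value == 0:
        return "N/A"
    if value < 0.200:
        return "No"
    if value < 0.241:
        return "Unlikely"
    # Higher bands exist, but their cutoffs are not documented in this section.
    return "Likely or Yes"
```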
Tips:
Always review contamination results before starting interpretation to rule out technical issues that could explain unexpected variant calls.
Cross-check contamination results with other QC metrics (e.g., depth, ploidy, sex validation) for a more complete picture of sample quality.
For family cases, check that no contamination is flagged before relying on inheritance-based filters.
Warnings:
Panels may be less reliable: For targeted panels, contamination estimates may be inaccurate due to the limited number of variants available for calculation. Use caution and cross-check with other QC metrics when interpreting these results.
Do not use in isolation: A "Likely" or "Yes" result should not immediately be considered diagnostic — review case setup, sequencing quality, and sample handling first.
The Quality status provides a quick assessment of array data reliability for each sample:
High
Call rate ≥ 0.99 and Log R dev ≤ 0.2
Low
If either condition is not met
N/A
If the QC file is not available
Use the Quality status to quickly screen whether a sample meets minimal QC thresholds before starting detailed interpretation.
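The Quality status rule can be written as a single comparison. This is a sketch under the thresholds listed above; the function name is hypothetical, and `None` is used here to represent a missing QC file.

```python
from typing import Optional


def array_quality_status(call_rate: Optional[float], log_r_dev: Optional[float]) -> str:
    """Return 'High', 'Low', or 'N/A' following the documented thresholds."""
    if call_rate is None or log_r_dev is None:
        return "N/A"  # QC file not available
    if call_rate >= 0.99 and log_r_dev <= 0.2:
        return "High"
    return "Low"  # either condition is not met
```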
Autosomal call rate
The Autosomal call rate field displays the percentage of autosomal loci on the array for which a genotype call was successfully made.
A high call rate indicates a high-quality sample and successful genotyping. Low call rates can signify problems with the DNA sample (poor quality or quantity) or issues during the array processing.
Displayed to three decimal places.
NGS sample quality metrics
CNV overall ploidy
The CNV overall ploidy field displays the ploidy value extracted from the CNV VCF header. If no CNV VCF file is provided, "N/A" is displayed.
Displayed to three decimal places.
The value is shown as is. The system does not validate or flag abnormal ploidy values. Interpret ploidy in context.
Summary dashboard
The Summary dashboard provides a quick overview of key quality indicators at both the case and sample levels.
Specifies the QC BED kit used to evaluate coverage depth and breadth. If no kit is specified at analysis launch, NCBI RefSeqGene is used as the default reference
Custom gene coverage
Indicates whether the coverage of genes in the selected panel meets the expected threshold, as defined by the QC BED
Pedigree status
Displays the results of relationship validation, confirming whether the submitted pedigree aligns with genetic data
The Lab tab shows sample and case-level quality metrics so you can check data reliability before starting interpretation.
The Lab tab includes:
Summary dashboard—highlights the key quality indicators, with more details provided in the subsequent sections
Sequencing lab information section—reports sequencing run details
Case quality section—summarizes the data quality of the case
Sample quality section—highlights quality metrics for each sample
—displays the results of the relationship validation for each pair of samples in a family tree
Genes coverage section—highlights regions that may not have been adequately sequenced
DRAGEN QC report
The DRAGEN QC report is generated by the Illumina DRAGEN Bio-IT Platform and covers the entire analysis workflow, from raw sequencing reads to variant calls.
DRAGEN QC report formats
Interactive HTML summary
A visual summary that includes interactive plots of key quality metrics. This report can be opened from the Sample quality section of the Lab tab.
CSV metric files
A set of detailed CSV files containing sample-level quality metrics. These files are downloadable and support in-depth review and documentation.
The Log R Deviation (or Log R Ratio standard deviation) quantifies the variability of the signal intensity for each SNP marker on an array, i.e., the noise level.
Log R deviation is one of the key metrics used to determine array sample quality, alongside call rate.
Lower values indicate more consistent signal intensities. A high Log R Deviation can indicate a poor-quality sample or potential issues with CNV calling.
Displayed to three decimal places.
Sequencing error rate
Sequencing error rate refers to the frequency at which incorrect base calls are made during the sequencing process.
Blue bars represent each of these parameters per sample, while a vertical line represents a general metric across all the samples of the same case type in the account.
Array sex validation
The Sex validation column indicates whether the biological sex inferred from genomic data matches the sex information provided during case creation.
This helps identify potential sample mix-ups or metadata errors before interpretation begins.
Sex validation results:
Pass
Reported sex matches the estimated sex
Fail
A mismatch was detected between reported and estimated sex.
N/A
QC file not available; validation could not be performed.
If the sex was marked as unknown during case creation, the system will display the predicted sex instead of a validation status.
NGS sample quality
The overall sample quality indicator provides a quick assessment of sequencing reliability for each sample.
Sample quality is evaluated using the following metrics:
Average depth of coverage
Mean coverage across the target regions
% bases covered >20x
Percentage of bases in the target regions covered at a depth greater than 20×, indicating reliable coverage
Error rate
Sequencing error rate. Reflects general sequencing accuracy
% mapped reads
Proportion of reads successfully mapped to the reference genome
Contamination check
Detects mixed or low-quality samples that may affect interpretation
These metrics give an overall confidence level for whether the sequencing data can support accurate variant interpretation.
Coverage
The Sample quality section includes the following coverage metrics for a target region defined by a QC BED file (or RefSeq coding regions if no kit is provided):
Average coverage
Average depth of coverage for a target region
% Bases with coverage >10x
Percentage of a target region that is covered at a minimum depth of 10x
% Bases with coverage >20x
Percentage of a target region that is covered at a minimum depth of 20x
Blue bars represent each of these parameters per sample, while a vertical line represents a general metric across all the samples of the same case type in the account.
Review interactive DRAGEN report
When available, a DRAGEN report link appears below the sample name in the Sample quality section of the Lab tab. Clicking the link opens the detailed quality control metrics report in a new browser tab. This integration allows users to quickly assess sequencing quality and confidently interpret results—without leaving the Emedgene interface.
Prepare DRAGEN QC metrics files to be included in an NGS VCF case
When creating NGS cases that start from VCF, you can create a browsable DRAGEN report from the DRAGEN metrics files. Due to security restrictions, CSV files are not directly ingested, but they can be included when packaged in a TAR file.
Navigate to the local directory containing the metrics files for a specific sample.
Define the sample name as a variable, for example samplename="NA12878".
Sample quality section
Sample quality metrics
The Sample quality section in the Lab tab gives you a quick view of the reliability of sequencing or array data used in your case. The metrics displayed in the Sample quality section and their underlying calculation vary depending on the case type:
NGS case
Prerequisites for accessing the DRAGEN QC report
NGS case
Option 1: FASTQ case
1
Run a FASTQ case in Emedgene.
2
Since DRAGEN analysis is integrated into the Emedgene secondary analysis pipeline, QC reports are automatically generated in the system.
Option 2: VCF case—Bring your own DRAGEN (BYOD)
1
Run DRAGEN analysis externally.
2
Prepare a TAR archive containing DRAGEN QC metrics files.
3
Upload the TAR archive and the sample VCF file to Emedgene.
Combine the find and tar commands to package the files into a tar.gz file with the extension *.metrics.tar.gz.
Command to find files matching the required patterns:
Upload the metrics.tar.gz file to the storage location used for creating cases.
Add metrics.tar.gz to the case creation API JSON payload using the corresponding storage ID.
If the filename does not contain the extension (e.g. files from BaseSpace), ensure that "sample_type": "dragen-metrics" is set in the JSON payload.
Case creation API JSON
{"test_data":{
The DRAGEN report link is available once your case has been delivered.
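If you prefer a script to the find and tar commands, the packaging step can be sketched in Python with the standard library. This is an illustration only: the file pattern `*metrics.csv` is a placeholder assumption, since the exact patterns required are not listed here, and `package_metrics` is a hypothetical helper name. Adjust the pattern to match the files your DRAGEN run produced.

```python
import tarfile
from pathlib import Path


def package_metrics(sample_dir: str, samplename: str, pattern: str = "*metrics.csv") -> Path:
    """Bundle matching metrics files into <samplename>.metrics.tar.gz."""
    directory = Path(sample_dir)
    archive = directory / f"{samplename}.metrics.tar.gz"
    # Placeholder pattern: replace with the patterns required for your files.
    files = sorted(p for p in directory.glob(pattern) if p.is_file())
    with tarfile.open(archive, "w:gz") as tar:
        for path in files:
            # Store files flat, without the local directory structure.
            tar.add(path, arcname=path.name)
    return archive
```

For example, `package_metrics(".", "NA12878")` writes NA12878.metrics.tar.gz into the current directory, which you can then upload to your storage location.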
Identity by state 0—the number of genomic loci where two individuals share zero alleles. This occurs when the two individuals are opposite homozygotes for a biallelic SNP.
This metric is calculated across a set of biallelic SNPs and is inversely related to the degree of genetic similarity between the individuals. A low IBS0 count suggests a higher degree of overall genetic similarity, but it is an indirect and limited measure of genetic relatedness that requires interpretation alongside other metrics.
Relationship validation result
Summarizes the outcome of the relationship validation, confirming whether the observed data aligns with the expected pedigree structure.
Relationship validation calculation
Relationship validation is done by Peddy based on:
Relatedness coefficient (𝑟)—a measure of how much two individuals share alleles from a common ancestor, indicating the probability that alleles at the same genome location are identical by descent
IBS0 (Identity by state 0)—the number of genomic loci where two individuals share zero alleles, i.e., they are opposite homozygotes
IBS2 (Identity by state 2)—the number of genomic loci where two individuals share two alleles, meaning they have the exact same genotype
Peddy takes the inferred relationships from the genetic data and cross-references them against the declared relationships. For every pair of individuals in a cohort, Peddy calculates a coefficient of relatedness from the genotypes observed at the sampled sites.
For each possible pair of samples in a pedigree, the expected relatedness coefficient based on the declared family relation is compared with the observed relatedness coefficient (𝑟). The IBS0 value helps to differentiate between sibling and parent–child relationships, both of which are expected to have a relatedness coefficient of ~50% (see table).
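The quantities used in relationship validation can be illustrated with a short sketch. It is not Peddy's code or API: genotypes are encoded as alternate-allele counts (0, 1, 2, or `None` for missing), and the tolerance and IBS0 cutoff are hypothetical values chosen for the example. The IBS0 rule relies on the fact that a parent and child always share at least one allele, so their IBS0 count is close to zero, while siblings can be opposite homozygotes at some sites.

```python
from typing import List, Optional, Tuple


def ibs_counts(gt_a: List[Optional[int]], gt_b: List[Optional[int]]) -> Tuple[int, int]:
    """Count IBS0 and IBS2 sites for two samples over the same biallelic SNPs."""
    ibs0 = ibs2 = 0
    for a, b in zip(gt_a, gt_b):
        if a is None or b is None:
            continue  # skip sites with a missing genotype
        if abs(a - b) == 2:
            ibs0 += 1  # opposite homozygotes: zero shared alleles
        elif a == b:
            ibs2 += 1  # identical genotype: two shared alleles
    return ibs0, ibs2


def relatedness_consistent(expected_r: float, observed_r: float, tolerance: float = 0.1) -> bool:
    """Compare expected and observed relatedness (tolerance is hypothetical)."""
    return abs(expected_r - observed_r) <= tolerance


def first_degree_kind(ibs0: int, n_sites: int, max_ibs0_fraction: float = 0.005) -> Optional[str]:
    """Separate parent-child from sibling pairs by IBS0 (cutoff is hypothetical)."""
    if n_sites == 0:
        return None
    return "parent-child" if ibs0 / n_sites <= max_ibs0_fraction else "sibling"
```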
The Genes coverage section helps you quickly identify parts of genes that may not have been adequately sequenced in your case. This insight is particularly important when assessing sequencing quality, interpreting uncertain findings, or deciding if further validation is needed.
While variant callers provide base-by-base coverage, Emedgene simplifies the view by showing average coverage per region. This makes it easier for you to spot undercovered genes at a glance, even when individual positions may appear sufficiently covered. By smoothing out local fluctuations, average coverage helps you prioritize regions that might require further review and complements DRAGEN's fine-grained metrics with a broader, more interpretable view.
Coverage metrics are generated differently depending on the type of input data used for your case:
1. FASTQ / BAM or gVCF-based cases
If your case was started from FASTQ/BAM or gVCF, coverage is inferred from gVCF reference blocks (also called GVCFBlocks).
These blocks are segments of the genome where genotype quality (GQ) is consistent.
A new block is created whenever there's a significant change in GQ, which results in a highly segmented and detailed representation of local sequencing quality.
Coverage for a region is based on the median coverage of each gVCF block. If a region spans multiple blocks, the reported value is the average of those medians.
Within a region (like an exon), you’ll often see multiple blocks. Emedgene aggregates them to show you:
Average depth
Minimum depth
Occasionally, some blocks may be unusually large and may miss internal variation—for example, in genes like XIAP, one block could span an entire region despite having uneven coverage inside.
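The block-based aggregation can be sketched as below. The `Block` type and function name are hypothetical, coordinates are assumed to be half-open, and the average is taken as a plain (unweighted) mean of the block medians, which is a literal reading of the description above rather than a confirmed implementation detail.

```python
from statistics import mean
from typing import Dict, List, NamedTuple, Optional


class Block(NamedTuple):
    start: int           # inclusive
    end: int             # exclusive
    median_depth: float  # median coverage reported for the gVCF block


def region_depth_from_blocks(
    region_start: int, region_end: int, blocks: List[Block]
) -> Optional[Dict[str, float]]:
    """Aggregate the medians of all blocks overlapping the region."""
    medians = [
        b.median_depth
        for b in blocks
        if b.start < region_end and b.end > region_start
    ]
    if not medians:
        return None  # no block overlaps the region
    return {"average_depth": mean(medians), "min_depth": min(medians)}
```

Because each block contributes a single median, a very large block hides any uneven coverage inside it, which is the limitation noted above.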
2. VCF + BAM or VCF + BED-based cases
If your case includes VCF and BAM or VCF and BED, coverage is calculated directly from the aligned reads or from predefined BED intervals.
Coverage is calculated as the true base-by-base average across the entire region.
This method avoids the variability of gVCF segmentation and gives a precise coverage profile for each region.
Tip: Before comparing coverage values across cases, check whether the case was processed from FASTQ/gVCF or VCF with alignment files (BAM/CRAM). The calculation method differs, so values may not be directly comparable.
Limitation: Coverage estimation is not supported for VCF + CRAM cases. If a CRAM file is used with a VCF file, as opposed to a BAM or a BED file, the Genes Coverage table will remain empty for virtual panel cases.
Regions evaluated for coverage
Coverage is compared against expected regions defined in:
Emedgene's reference BED file, or
Your test’s custom KIT BED file
Each region is defined by:
Chromosome
Start & end positions
Name and strand (Optional)
Coverage assessment
Emedgene uses bedtools intersect to compare each expected region against the actual read coverage. The system captures:
How much of the region overlaps with sequenced data
Depth of coverage per segment
Coverage statistics
Each region includes these metrics:
Metric
Description
Min Depth
Lowest depth in the region (for gVCF-based cases: lowest avg depth in a block)
Max Depth
Highest depth observed
Average Coverage
Mean read depth across the region
% ≥3×
Percent of base pairs with at least 3x coverage
% ≥20×
Percent of base pairs with at least 20x coverage
Length
Region length in base pairs
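For cases where coverage is computed base by base, the metrics in this table reduce to simple summaries of the per-base depths in a region. The sketch below assumes the depths are already available as a list of integers, one per base pair; the function and key names are hypothetical.

```python
from typing import Dict, List


def region_stats(depths: List[int]) -> Dict[str, float]:
    """Summarize per-base depths for one region."""
    length = len(depths)
    if length == 0:
        raise ValueError("region has no bases")
    return {
        "min_depth": min(depths),
        "max_depth": max(depths),
        "average_coverage": sum(depths) / length,
        # Share of base pairs at or above each depth threshold, in percent.
        "pct_ge_3x": 100.0 * sum(d >= 3 for d in depths) / length,
        "pct_ge_20x": 100.0 * sum(d >= 20 for d in depths) / length,
        "length": length,
    }
```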
Note:
Genes with insufficient coverage is available only for FASTQ-based cases.
Warning:
For FASTQ, BAM, or gVCF-based cases, Min Depth does not represent the true per-base minimum depth but the lowest average depth within a gVCF block.
How to use the coverage tool
You can interactively explore gene-level coverage details using the Genes with Insufficient Coverage tab. This tool is currently available only for FASTQ-based cases.
Here’s what you can do:
Search for a specific gene or a list of genes.
Filter results based on coverage thresholds:
≤0x
≤5x
≤10x
≤20x
or All
Download tables or genomic coordinates for regions with poor coverage.
Click More details to open a pop-up with exact genomic coordinates of low-coverage blocks.
To check coverage for a gene:
Enter the gene symbol in the search box and select it.
Choose your desired coverage filter from the dropdown.
Review the results in the table or download the data.
Click More details to inspect the specific coordinates of undercovered regions.
To look up the coverage for multiple genes that are saved as a Gene list:
Click the Add Gene List button and select any of your pre-loaded gene lists.
To further filter regions:
By maximum depth of coverage
Select Coverage, then choose the highest allowable coverage value from the dropdown list.
By percentage of bases covered >20×
Select % of Bases Gt20, then choose the highest allowable percentage from the dropdown list.
Visual review in IGV allows manual variant confirmation by inspecting aligned reads at specific genomic regions.
To inspect poorly covered regions of a gene in the desktop IGV browser:
Click on More details in the row corresponding to the gene of interest. This opens a pop-up with coverage details for the selected gene.
In the pop-up, select View on IGV to open the region in the IGV desktop application.
To download data
Click the Download button to export the full list of low-coverage regions as a *_insufficient_regions.tsv file. Each row includes region coordinates and all metrics.