Joint calling in Emedgene

Classic joint calling consists of calling variants "simultaneously across all sample BAMs, generating a single call set for the entire cohort." (GATK.broadInstitute.org)

When a case is run from BAM or FASTQ samples, Emedgene does not perform classic joint calling. It uses a BAM look-up methodology instead.

This methodology retrieves coverage information from the BAM files during the VCF merging process. If a variant does not exist in a parental sample, the algorithm checks the coverage at that position using data from that sample's BAM file. The position is treated as the "REF" allele if it is covered (depth > 3), and as "No coverage" or "N/A" (./. in the VCF FORMAT/GT field) if it does not meet that threshold or has no coverage.
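The decision rule above can be sketched in a few lines of Python. This is a minimal illustration, not Emedgene's implementation: the function name `lookup_genotype`, the `MIN_DEPTH` constant, and the use of `0/0` to represent the REF call are assumptions made for this example. Only the depth > 3 threshold and the `./.` value come from the description above.

```python
from typing import Optional

# Threshold taken from the text: a position counts as covered when depth > 3.
MIN_DEPTH = 3


def lookup_genotype(called_gt: Optional[str], depth: Optional[int]) -> str:
    """Return the genotype to record for one sample at one variant position.

    called_gt: genotype from the sample's own VCF, or None if the variant
               was not called in this sample.
    depth:     read depth at the position from the coverage data, or None
               if no coverage information is available.
    """
    if called_gt is not None:
        # The sample has its own call, so no look-up is needed.
        return called_gt
    if depth is not None and depth > MIN_DEPTH:
        # Covered but not called: treat the position as REF.
        # (Writing REF as "0/0" is an assumption of this sketch.)
        return "0/0"
    # Depth does not meet the threshold, or there is no coverage data.
    return "./."
```

For example, `lookup_genotype(None, 25)` yields `0/0`, `lookup_genotype(None, 2)` yields `./.`, and a sample with its own call, such as `lookup_genotype("0/1", 30)`, keeps `0/1`.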

This process involves creating a “genome coverage” file as a separate preliminary step. Alternatively, the coverage information can be provided as a BED or gVCF file.
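As a rough sketch of how a BED file could answer the question "is this position covered?", the Python below loads BED intervals and checks a VCF position against them. It assumes the BED file lists covered regions, which is an assumption of this example and not a statement about how Emedgene interprets the file. The helper names `load_bed` and `is_covered` are hypothetical. The one detail that is standard is the coordinate conversion: BED intervals are 0-based and half-open, while VCF positions are 1-based.

```python
from collections import defaultdict


def load_bed(lines):
    """Parse BED lines into a dict of chromosome -> list of (start, end)."""
    regions = defaultdict(list)
    for line in lines:
        # Skip blank lines and BED header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions[chrom].append((int(start), int(end)))
    return regions


def is_covered(regions, chrom, vcf_pos):
    """Return True if the 1-based VCF position lies inside any BED interval."""
    pos0 = vcf_pos - 1  # convert 1-based VCF position to 0-based BED coordinate
    return any(start <= pos0 < end for start, end in regions.get(chrom, []))


# Hypothetical coverage regions, for illustration only.
bed_lines = ["chr1\t100\t200", "chr2\t0\t50"]
regions = load_bed(bed_lines)
```

A linear scan keeps the sketch short and correct for overlapping intervals. A real coverage file with many regions would call for an indexed structure instead.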

The BAM look-up approach differs slightly from the classic joint calling performed by the joint calling option in DRAGEN and other variant callers, and therefore does not produce identical results.

However, the Emedgene platform also supports joint-called VCF files.

Remark: If no coverage file (i.e., BED, BAM, or gVCF) is provided, the presence of the REF allele in empty positions cannot be estimated. As a consequence, the "No_coverage" value is assigned to those variants, which can affect the inheritance mode filters.

Limitation: The current data pipeline has a limitation that stems from the way it merges variants from different samples into the same case (e.g., in a trio). Because merging is based on bcftools, variants are identified by chromosome, start position, reference allele, and alternate allele. The size of the variant itself is not taken into account. As a result, CNV-type variants that differ in size may sometimes be merged inaccurately. This limitation is not present when joint calling is used.
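To make the limitation concrete, the sketch below merges per-sample records keyed only on chromosome, start position, reference allele, and alternate allele. It is a hypothetical illustration in plain Python, not the pipeline's code and not a bcftools invocation: the record layout, the `merge_by_key` helper, and the two example deletions are invented for this example.

```python
from collections import defaultdict


def merge_by_key(samples):
    """Group records from several samples by (chrom, pos, ref, alt).

    samples: dict mapping sample name -> list of record dicts.
    The end coordinate is deliberately left out of the key, mirroring
    the limitation described above.
    """
    merged = defaultdict(dict)
    for sample, records in samples.items():
        for rec in records:
            key = (rec["chrom"], rec["pos"], rec["ref"], rec["alt"])
            merged[key][sample] = rec
    return dict(merged)


# Two deletions that start at the same position but differ in size.
samples = {
    "proband": [{"chrom": "chr1", "pos": 10000, "ref": "N", "alt": "<DEL>", "end": 15000}],
    "mother": [{"chrom": "chr1", "pos": 10000, "ref": "N", "alt": "<DEL>", "end": 90000}],
}

merged = merge_by_key(samples)
for key, per_sample in merged.items():
    ends = {sample: rec["end"] for sample, rec in per_sample.items()}
    print(key, ends)
```

Both deletions end up under a single key, even though one spans about 5 kb and the other about 80 kb, so they would be presented as the same variant in the merged case.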

