Tandem Repeats

Short tandem repeats (STRs) are regions of the genome consisting of repetitions of short DNA segments called repeat units. STRs can expand to lengths beyond the normal range; these mutations are called repeat expansions. For more information, refer to the DRAGEN user guide.

DRAGEN-STR ships with two STR catalogs. The default catalog contains a restricted number of well-studied loci whose expansion is linked to various diseases. The expanded catalog contains ~174,000 highly polymorphic STR loci located in and around genes, and is more suitable for whole-genome explorations.

STR size genotyping accuracy in the expanded catalog

Accuracy on the expanded catalog loci was evaluated against the GIAB tandem repeat benchmark (v1.0) (1), using Truvari (2) for variant comparison.

DRAGEN: DRAGEN 4.5.4 | Truthset: HG002 GIABTR v1.0 | Reference: GRCh38

Standard WGS STR table

| Subtype | Recall | Precision | F1-score | FN   | FP  |
|---------|--------|-----------|----------|------|-----|
| STR     | 0.9617 | 0.9871    | 0.9742   | 2007 | 664 |

STR for Standard WGS
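The F1-score reported for each benchmark is the harmonic mean of precision and recall. The minimal sketch below recomputes it from the precision and recall values reported in this section; the function name `f1_score` is an illustrative helper, not part of DRAGEN or Truvari.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the expanded catalog (HG002, GIABTR v1.0).
expanded_catalog_f1 = f1_score(precision=0.9871, recall=0.9617)

# Precision and recall as reported for the Coriell classification experiment.
classification_f1 = f1_score(precision=1.0000, recall=0.9824)

print(f"{expanded_catalog_f1:.4f}")  # 0.9742
print(f"{classification_f1:.4f}")    # 0.9911
```

Both results agree with the reported F1-scores to four decimal places.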

Classification of samples with known pathogenic STR expansions

We sequenced 40 Coriell cell lines with known STR expansions and compared DRAGEN's STR classification against orthogonal validation methods. The swimplot shows long-allele size distributions in repeat-count units. Dots are colored by STRipy database range classification (normal/intermediate/pathogenic), according to each locus's thresholds and the orthogonally validated size prediction. Shaded regions and classification thresholds are taken from Ibañez K. et al., Lancet Neurology 21, 234–245 (2022).
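As a rough illustration of range-based classification, the sketch below assigns a long-allele repeat count to the normal, intermediate, or pathogenic range and converts it to base pairs by scaling with the motif length. The `LocusThresholds` class, the function names, and the threshold values are all hypothetical placeholders chosen for the example; they are not taken from the STRipy database or from DRAGEN, and real thresholds differ for each locus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocusThresholds:
    """Per-locus ranges, expressed in repeat-count units (placeholder values)."""
    motif_length: int    # repeat unit length in bp
    normal_max: int      # largest repeat count still classified as normal
    pathogenic_min: int  # smallest repeat count classified as pathogenic


def classify(repeat_count: int, thresholds: LocusThresholds) -> str:
    """Assign a long-allele repeat count to one of three ranges."""
    if repeat_count >= thresholds.pathogenic_min:
        return "pathogenic"
    if repeat_count > thresholds.normal_max:
        return "intermediate"
    return "normal"


def to_base_pairs(repeat_count: int, thresholds: LocusThresholds) -> int:
    """Scale a repeat count by the motif length to get the allele size in bp."""
    return repeat_count * thresholds.motif_length


# Hypothetical trinucleotide locus; these numbers are illustrative only.
example_locus = LocusThresholds(motif_length=3, normal_max=26, pathogenic_min=36)

for count in (20, 30, 45):
    label = classify(count, example_locus)
    size_bp = to_base_pairs(count, example_locus)
    print(count, label, size_bp)
```

Keeping thresholds in repeat-count units and converting to base pairs only for display lets the same classification logic serve loci with different motif lengths.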

STR classification metrics table

| Subtype                             | Recall | Precision | F1-score |
|-------------------------------------|--------|-----------|----------|
| STR Classification Accuracy Metrics | 0.9824 | 1.0000    | 0.9911   |

STR length distribution across 20 loci for 40 Coriell samples with pathogenic STRs. Shaded regions show the normal, intermediate, and pathogenic ranges (STRipy database, scaled by motif length). Dots are colored by classification.

STR size distribution, 20 loci, bp, filled intermediate

References

  1. https://www.nature.com/articles/s41587-024-02225-z

  2. https://link.springer.com/article/10.1186/s13059-022-02840-6
