Tandem Repeats
Short tandem repeats (STRs) are regions of the genome that consist of consecutive copies of a short DNA segment called the repeat unit. An STR can expand beyond its normal length range; such mutations are called repeat expansions. For more information, refer to the DRAGEN user guide.
DRAGEN-STR ships with two STR catalogs. The default catalog contains a restricted set of well-studied loci whose expansions are linked to various diseases. The expanded catalog contains ~174,000 highly polymorphic STR loci located in and around genes, and is better suited to whole-genome exploration.
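To make the notion of a repeat unit concrete, the sketch below counts the longest run of back-to-back copies of a unit in a sequence. It is an illustration only, not how DRAGEN-STR genotypes a locus; the function name `longest_tandem_run` and the example sequence are hypothetical.

```python
def longest_tandem_run(sequence: str, repeat_unit: str) -> int:
    """Return the largest number of consecutive copies of repeat_unit in sequence.

    Illustrative helper only (hypothetical name); it requires exact matches
    and does not model interruptions or sequencing errors.
    """
    if not repeat_unit:
        raise ValueError("repeat_unit must be non-empty")
    seq = sequence.upper()
    unit = repeat_unit.upper()
    k = len(unit)
    best = 0
    # Try every start position so that runs in any phase are found.
    for start in range(len(seq) - k + 1):
        copies = 0
        pos = start
        while seq.startswith(unit, pos):
            copies += 1
            pos += k
        best = max(best, copies)
    return best


# Made-up sequence: four copies of CAG flanked by other bases.
example = "TTCAGCAGCAGCAGAA"
run_length = longest_tandem_run(example, "CAG")
```

Here `run_length` is the allele size in repeat-count units, the same unit used for the allele sizes reported later on this page.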
STR size genotyping accuracy in the expanded catalog
Accuracy on the expanded catalog loci was evaluated against the GIAB tandem repeat benchmark (v1.0) (1), using Truvari (2) for variant comparison.
DRAGEN: DRAGEN 4.5.4 | Truthset: HG002 GIABTR v1.0 | Reference: GRCh38
Standard WGS STR table
STR | 0.9617 | 0.9871 | 0.9742 | 2007 | 664
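As a sanity check on the row above, the sketch below recomputes F1 as the harmonic mean of precision and recall. It assumes that the first two fractional values are precision and recall (in either order, since the harmonic mean is symmetric) and that the third is F1; that column assignment is an assumption, not something stated on this page. The function names are hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from true-positive, false-positive and
    false-negative counts, using the standard definitions."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_score(precision, recall)


# Assumption: 0.9617 and 0.9871 are precision and recall in some order.
f1 = round(f1_score(0.9617, 0.9871), 4)
```

Under that assumption, `f1` agrees with the third value in the row (0.9742) to four decimal places.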

Classification of samples with known pathogenic STR expansions
We sequenced 40 Coriell cell lines with known STR expansions and compared DRAGEN's STR classification against orthogonal validation methods. The swimplot shows long-allele size distributions in repeat-count units. Dots are colored by the STRipy database range classification (normal, intermediate, or pathogenic), based on each locus's thresholds and the orthogonally validated size prediction. Shaded regions and classification thresholds are taken from Ibañez K. et al., Lancet Neurology 21, 234–245 (2022).
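The classification described above can be sketched as a per-locus threshold lookup applied to the longer of the two alleles. This is a minimal illustration, not DRAGEN's or STRipy's implementation; the class and function names are hypothetical, and the threshold values are placeholders rather than values from either source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocusThresholds:
    """Per-locus cut-offs in repeat-count units (hypothetical structure)."""
    normal_max: int      # largest repeat count still classified as normal
    pathogenic_min: int  # smallest repeat count classified as pathogenic


def classify_allele(repeat_count: int, thresholds: LocusThresholds) -> str:
    """Classify one allele size as normal, intermediate, or pathogenic."""
    if repeat_count <= thresholds.normal_max:
        return "normal"
    if repeat_count >= thresholds.pathogenic_min:
        return "pathogenic"
    return "intermediate"


def classify_genotype(allele_sizes: tuple[int, int], thresholds: LocusThresholds) -> str:
    """Classify a diploid genotype by its long allele."""
    return classify_allele(max(allele_sizes), thresholds)


# Placeholder thresholds for an imaginary locus; real values are locus-specific.
EXAMPLE_LOCUS = LocusThresholds(normal_max=26, pathogenic_min=36)

label = classify_genotype((17, 42), EXAMPLE_LOCUS)
```

Classifying on the long allele mirrors the plot, which shows the long-allele size for each sample.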
STR classification metrics table
STR Classification Accuracy Metrics | 0.9824 | 1.0000 | 0.9911
STR length distribution across 20 loci for 40 Coriell samples with pathogenic STRs. Shaded regions show the normal, intermediate, and pathogenic ranges (STRipy database, motif-length scaled). Dots are colored by classification.

References
1. https://www.nature.com/articles/s41587-024-02225-z
2. https://link.springer.com/article/10.1186/s13059-022-02840-6
