F2 Validation
Background
DRAGEN supports various FPGA types for accelerated analyses of on-premises, on-instrument and cloud pipelines.
The software automatically detects the platform and selects the appropriate method to interact with the FPGA type. Subsequently, it loads the appropriate FPGA images and configures the system for the analysis.
The FPGA images for different platforms are unique, but they are built from the same logic and, by design, produce the same results on every platform.
The software releases are tested and validated across all supported platforms, and concordance between platforms is validated.
AWS F2 Instance Type Support
AWS has launched a new FPGA instance type (F2) and has announced obsolescence of the existing FPGA instance type (F1) used by DRAGEN on BSSH and ICA platforms.
To support customers on existing DRAGEN versions that lack built-in support for the new FPGA type on the F2 instance, Illumina is adding F2 instance support to older DRAGEN versions.
As a result, Illumina is publishing updated BSSH and ICA apps based on the same versions of workflows, pipelines and bioinformatics features - but with added F2 instance type support.
The updated DRAGEN versions are designed to produce the exact same analysis outputs on F2 instances as on F1 instances.
This includes identical outputs from the mapper (BAM/CRAM), all variant callers, and QC metrics.
The updated BSSH and ICA apps replicate the existing F1 versions; the only change is the added F2 instance type support and the associated update to the DRAGEN software.
They produce exactly the same outputs as the F1 versions.
Development Methods
DRAGEN R&D
Support for the AWS F2 instance type is developed in partnership with AWS.
Extensive test campaigns are executed to ensure that stability, robustness and exact concordance are achieved.
Validation of the software on AWS F2 is performed according to the existing software life cycle and QA processes at Illumina.
DRAGEN Releases
The source code for a specific version is branched.
Support for the AWS F2 instance type is added.
The software version is tested and validated for robustness, run time, and exact concordance of outputs on F2 vs. F1, over a set of test cases.
No changes are made to any bioinformatics components.
BSSH/ICA applications
The DRAGEN software is added as a new version.
F2 instance type support is added to the app configuration.
The app version is tested and validated for functionality and concordant outputs, over a set of samples.
Validation Methodology
Test Samples: Whole Genome Sequencing (WGS) samples from HG002, HG003, and HG004 at 35x coverage.
Versions Tested: DRAGEN versions 4.2.4, 4.3.6, and 4.4.4.
Platforms: Each sample was analyzed with F1 and F2 instances using identical command lines and configurations.
Concordance Metrics
BAM Files: Verified using md5sum checksums, excluding metadata headers (e.g., timestamps, version strings).
VCF Files: Variant caller outputs were also validated using md5sum comparisons.
Output Metrics: All summary metrics were compared to confirm that results match.
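The header-excluding checksum described above can be sketched as follows. This is a minimal illustration, not the procedure Illumina used: it assumes the BAM has already been decoded to SAM text (for example with `samtools view -h`), that header lines can be recognized by a leading `@` (SAM) or `##` (VCF meta lines), and the function name `md5_without_headers` is hypothetical.

```python
import hashlib
from typing import Iterable


def md5_without_headers(lines: Iterable[str], header_prefix: str) -> str:
    """Return the MD5 of all lines that do not start with header_prefix.

    Assumption: headers are identified purely by their prefix
    ("@" for SAM text, "##" for VCF meta lines).
    """
    digest = hashlib.md5()
    for line in lines:
        if line.startswith(header_prefix):
            continue  # skip metadata such as timestamps and version strings
        # Normalize the line ending so only record content is hashed.
        digest.update(line.rstrip("\n").encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


# Made-up records: the two inputs differ only in the @PG header line.
f1_sam = [
    "@HD\tVN:1.6\tSO:coordinate\n",
    "@PG\tID:dragen\tCL:run-on-f1\n",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tFFFF\n",
]
f2_sam = [
    "@HD\tVN:1.6\tSO:coordinate\n",
    "@PG\tID:dragen\tCL:run-on-f2\n",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tFFFF\n",
]

same = md5_without_headers(f1_sam, "@") == md5_without_headers(f2_sam, "@")
print(same)  # True: only the headers differ
```

Passing an open text file handle instead of a list works the same way, since the function only iterates over lines.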
Results
Bit-exact: All tested DRAGEN versions produced identical outputs on F1 and F2 instances.
Known Exceptions:
repeats.bam files differ in sorting order, but alignment outputs are identical.
Differences in pcr-model-0.log file content are expected due to multi-threading and have no impact on the results.
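For files that differ only in sorting order, such as repeats.bam, an order-insensitive comparison can confirm that the alignment records themselves are identical. The sketch below is one possible approach and is not taken from the validation itself: it assumes the records have been decoded to text lines with headers removed, and the function name `same_records_any_order` is hypothetical.

```python
from collections import Counter
from typing import Iterable


def same_records_any_order(records_a: Iterable[str], records_b: Iterable[str]) -> bool:
    """Compare two record collections as multisets.

    Counter keeps duplicate counts, so a record that appears twice in
    one input must also appear twice in the other.
    """
    return Counter(records_a) == Counter(records_b)


# Made-up alignment records in two different orders.
f1_records = [
    "read1\t0\tchr1\t100\t60\t4M",
    "read2\t0\tchr1\t250\t60\t4M",
    "read3\t16\tchr2\t90\t60\t4M",
]
f2_records = [f1_records[2], f1_records[0], f1_records[1]]

print(same_records_any_order(f1_records, f2_records))  # True
```

A multiset is used instead of a plain set so that duplicated records are not silently collapsed. For very large files, sorting both inputs on disk and comparing checksums of the sorted output would use less memory.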
Input and Output Locations
Data is available here:
Input File Location: https://basespace.illumina.com/s/qSSsl5ViSZwj
Output Concordance Data: https://ilmn-sso.basespace.illumina.com/s/NgwKhpHlhbmB
versions/HG002.novaseq.pcr-free.35x/f1/
versions/HG002.novaseq.pcr-free.35x/f2/
versions/HG003.novaseq.pcr-free.35x/f1/
versions/HG003.novaseq.pcr-free.35x/f2/
versions/HG004.novaseq.pcr-free.35x/f1/
versions/HG004.novaseq.pcr-free.35x/f2/
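With the outputs laid out in paired f1/ and f2/ folders as listed above, a first-pass comparison could walk both folders and checksum files that share a relative path. The following is a hedged sketch, assuming the data has been downloaded locally; the helper names `file_md5` and `compare_trees` are hypothetical. It hashes whole files, so BAM and VCF files whose headers differ would be reported as "differ" here and would still need the header-excluding comparison described under Concordance Metrics.

```python
import hashlib
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """MD5 of a whole file, read in chunks to keep memory use flat."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(f1_dir: Path, f2_dir: Path) -> dict[str, str]:
    """Map each relative file path to 'match', 'differ', or 'missing on ...'."""
    f1_files = {p.relative_to(f1_dir).as_posix() for p in f1_dir.rglob("*") if p.is_file()}
    f2_files = {p.relative_to(f2_dir).as_posix() for p in f2_dir.rglob("*") if p.is_file()}
    report: dict[str, str] = {}
    for rel in sorted(f1_files | f2_files):
        if rel not in f2_files:
            report[rel] = "missing on f2"
        elif rel not in f1_files:
            report[rel] = "missing on f1"
        elif file_md5(f1_dir / rel) == file_md5(f2_dir / rel):
            report[rel] = "match"
        else:
            report[rel] = "differ"
    return report
```

For example, `compare_trees(Path("HG002.novaseq.pcr-free.35x/f1"), Path("HG002.novaseq.pcr-free.35x/f2"))` would return one status per output file; the local paths in this call are placeholders for wherever the folders were downloaded.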
Conclusion
Illumina has confirmed that DRAGEN Secondary Analysis produces bit-exact outputs across AWS F1 and F2 FPGA instances when using the same software versions, sample data, and commands, aside from the known exceptions listed under Results. The validation covered whole genome sequencing samples from HG002, HG003, and HG004 at 35x coverage with DRAGEN versions 4.2.4, 4.3.6, and 4.4.4, and the results are shared at the locations listed above. BAM and VCF file concordance has been validated through checksum comparisons and metric analysis. Customers can transition confidently to faster F2 instances without compromising results.