DRAGEN Microbial Enrichment Plus App Documentation

Overview

Summary:

DRAGEN Microbial Enrichment Plus offers a dedicated informatics solution with flexible analysis options for Illumina Infectious Disease and Microbiology target-capture enrichment panel kits. The app delivers easy-to-use, powerful secondary analysis of Illumina sequencing data, with workflows for sample QC, viral WGS (whole-genome sequencing), pathogen detection and quantification, and antimicrobial resistance (AMR) marker profiling. It also supports user-defined microorganism reporting thresholds and custom reference sequence analysis.

Supported hybrid-capture enrichment panels:

Product Page
Panel Summary

Input files:

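  • FASTQ files

  • User-defined microorganism reporting file in TSV or XLSX format (optional)

  • Custom reference FASTA file (if applicable)

  • Custom reference BED file (if applicable)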

Demo Data:

The DRAGEN Microbial Enrichment Plus Demo Project includes external control, contrived, and environmental samples prepared using the RPIP, UPIP, RVOP, VSP, and VSP V2 target-capture enrichment kits. Example custom reference sequence FASTA and BED files are also included.

Analysis Pipeline:

(all panels except where noted, (*) indicates applicable to custom reference sequence analysis)

  1. Read QC* (optional)

  2. Dehosting* (human read removal)

  3. Sample QC (sample composition and enrichment factor calculations. Internal control required to calculate the enrichment factor) – RPIP, UPIP, VSP V2

Output files:

  • Analysis-level outputs: XLSX, HTML, ZIP

  • Sample-level outputs: JSON, HTML, FASTA (consensus sequences), VCF (viral variants)

Important Notes:

DRAGEN Microbial Enrichment Plus is a secondary analysis tool for research use only. Further interpretation, statistical analysis, and downstream analysis of results may be necessary.

For Research Use Only. Not for use in diagnostic procedures.

How to set up and run an analysis

  1. Launch the DRAGEN Microbial Enrichment Plus BaseSpace app, which can be found in the "Dragen" and "Infectious Disease + Microbiology" app collections.

  2. Enter a name for the Analysis.

  3. Choose either “Biosample” or “Project” as input type. When a Project is selected, the app will attempt to find all FASTQ files in that Project and run analyses on them. There is no FASTQ file limitation when reading Biosamples from a Project. However, 99 associated FASTQ files is the maximum allowed per analysis when providing Biosample input from a list.

  • Microorganism classification (configurable sensitivity) – RVOP, VSP, VSP V2

  • Microorganism detection (alignment, consensus generation, variant calling)

  • Microorganism quantification (quantitative internal control required) – RPIP, UPIP, VSP V2

  • Microorganism reporting thresholds (proprietary algorithms or user-defined reporting logic)

  • Bacterial AMR marker analysis (nucleotide and protein alignment, consensus generation, variant calling and annotation) – RPIP, UPIP

  • Viral AMR marker analysis (variant calling and annotation) – RPIP, RVOP, VSP, VSP V2

  • Viral clade and lineage prediction (Pangolin, Nextclade) – RPIP, RVOP, VSP, VSP V2

  • Result filters (user-specified filters applied)

  • Reporting*

  • Illumina Respiratory Pathogen ID/AMR Enrichment Panel Kit (RPIP)

    Illumina Urinary Pathogen ID/AMR Enrichment Panel Kit (UPIP)

    Illumina Respiratory Virus Oligo Panel / Respiratory Virus Enrichment Kit (RVOP / RVEK)

    Illumina Viral Surveillance Panel (VSP)

    Illumina Viral Surveillance Panel V2 (VSP V2)

    User-defined microorganism reporting file in TSV or XLSX format
    Custom reference FASTA file
    DRAGEN Microbial Enrichment Plus Demo Project
    RPIP_6-5-1_Panel_Summary.xlsx
    UPIP_8-6-0_Panel_Summary.xlsx
    RVOP_2-7-0_Panel_Summary.xlsx
    VSP_2-7-0_Panel_Summary.xlsx
    VSPV2_2-7-0_Panel_Summary.xlsx
    Custom reference BED file
  • Select a target-capture Enrichment Panel for the appropriate analysis options and default settings to populate. Only one enrichment panel can be selected per analysis. If Custom Panel is selected, the "Custom panel specification" section is enabled to allow entry of a reference FASTA file and (optionally) a reference BED file. See Custom reference FASTA and BED files for further details.

  • Under "Enrichment Panel Microorganism Reporting List", select from the available list to report All microorganisms (default), specify a Pre-defined subset of microorganisms (RPIP, UPIP only), or specify a User-defined microorganism reporting list and reporting thresholds.

    • If Pre-defined is selected, the Pre-defined specification section is enabled to allow specification of a pre-defined subset of microorganisms for the selected Enrichment Panel. This option is only available for RPIP and UPIP.

    • If User-defined is selected, the User-defined specification section is enabled to allow entry of a microorganism reporting list and reporting thresholds file in TSV or XLSX format. See Microorganism Reporting File format for further details.

  • Analysis Options:

    • Perform read QC (Quality Control)

      • If checked, reads are pre-processed using quality metrics before analysis.

      • If unchecked, read quality metrics are calculated, but reads are not trimmed or filtered before analysis.

    • Report bacterial AMR markers only

      • If checked, only bacterial AMR markers but no microorganisms are reported

      • This option is disabled if RVOP/RVEK, VSP, VSP V2 or Custom Panel is selected

      • This option is disabled if the "Report bacterial AMR markers only when an associated microorganism is reported" option is enabled

    • Report bacterial AMR markers only when an associated microorganism is reported

      • If checked, detected bacterial AMR markers are reported if the bacterial AMR marker passes a minimum reporting threshold and one or more associated microorganisms are also detected and reported

      • If unchecked, detected bacterial AMR markers are reported if the bacterial AMR marker passes a minimum reporting threshold

    • Report microorganisms and/or AMR markers that are below threshold

      • If checked, microorganisms and/or AMR markers below reporting thresholds are included in reports

      • If unchecked, only microorganisms and/or AMR markers above reporting thresholds are included in reports
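
The interaction between a bacterial AMR marker's threshold result and the "Report bacterial AMR markers only when an associated microorganism is reported" option can be restated as a small sketch. The function and parameter names are hypothetical and are not part of the app; the sketch ignores the separate "below threshold" reporting option and only mirrors the option descriptions above.

```python
def report_bacterial_amr_marker(
    marker_passes_threshold,
    associated_microorganism_reported,
    require_associated_microorganism,
):
    """Illustrative restatement of the option text; not the app's implementation."""
    if not marker_passes_threshold:
        # A marker must pass its minimum reporting threshold in both modes.
        return False
    if require_associated_microorganism:
        # Option checked: at least one associated microorganism must also be reported.
        return associated_microorganism_reported
    # Option unchecked: passing the threshold is sufficient.
    return True
```

For example, a marker that passes its threshold but has no reported associated microorganism would be reported only when the option is unchecked.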

    1. Specify "Read classification sensitivity". This setting is used as a pre-alignment filtering step for RVOP/RVEK, VSP, and VSP V2 only. The default setting of 5 means that if fewer than 5 reads classify to the set of reference sequences belonging to a given virus, that virus will not be reported. If 5 or more reads classify to the set of reference sequences belonging to a given virus, read alignment will proceed and alignment-based thresholds will be used to determine whether that virus is reported. The read classification sensitivity can be set as low as 1 or as high as 1000. Lowering the read classification sensitivity threshold below 5 may significantly increase computational run time and is not recommended for most use cases.

    2. Pangolin is currently enabled for all enrichment panels besides UPIP. For Custom Panel analyses, Pangolin will run on custom reference sequences with at least 3% coverage that meet these naming conventions:

      • If only a FASTA file is provided, Pangolin will run on sequences that have a header containing either SARS-CoV-2 or NC_045512

      • If both a FASTA and BED file are provided, Pangolin will run on sequences where the first column (chrom) contains NC_045512 or the fourth column (genomeName) contains SARS-CoV-2

    3. Optionally, enable Nextclade to run when one of the following microorganisms is detected (RPIP, RVOP/RVEK, VSP, VSP V2 only):

      • Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)

      • Influenza A virus (H1N1)

    4. Select a quantitative Internal Control (IC) from the available list (RPIP, UPIP, VSP V2 only). If the quantitative IC is set to NONE, which is the default, the IC concentration value is ignored. For VSP V2, the only valid quantitative IC selections are:

      • NONE

      • Enterobacteria phage T7

    5. Enter Internal Control (IC) concentration as an integer in the following scientific notation format: "#.## x 10^#". Note: an incorrect quantitative IC or incorrect IC concentration will result in inaccurate microorganism absolute quantification results.

    6. Select the Project where the Analysis Output should be saved.
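
The "Read classification sensitivity" pre-alignment filter described in the steps above can be pictured with the following sketch. It assumes per-virus classified read counts are already available as a dictionary; the function name and the input structure are hypothetical, and the real app applies further alignment-based thresholds afterwards.

```python
def viruses_passing_classification(classified_read_counts, sensitivity=5):
    """Return the viruses whose classified read count reaches the sensitivity setting.

    classified_read_counts: dict mapping virus name -> number of classified reads
    sensitivity: minimum read count (the documented range is 1 to 1000, default 5)
    """
    if not 1 <= sensitivity <= 1000:
        raise ValueError("read classification sensitivity must be between 1 and 1000")
    # Viruses below the setting are dropped before alignment; the rest proceed.
    return [
        virus
        for virus, count in classified_read_counts.items()
        if count >= sensitivity
    ]
```

With the default of 5, a virus with 4 classified reads would be dropped, while one with exactly 5 reads would proceed to alignment.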

    This option is disabled if RVOP/RVEK, VSP, VSP V2 or Custom Panel is selected
    This option is disabled if Custom Panel is selected
  • This option is disabled if the "Report bacterial AMR markers only when an associated microorganism is reported" option is enabled

  • Influenza A virus (H3N2)
  • Influenza A virus (H5N1)

  • Influenza A virus (H5N6)

  • Influenza A virus (H5N8)

  • Influenza B virus (B/Victoria/2/87-like)

  • Influenza B virus (B/Yamagata/16/88-like)

  • Human immunodeficiency virus 1 (HIV-1)

  • Human respiratory syncytial virus A (HRSV-A)

  • Human respiratory syncytial virus B (HRSV-B)

  • Monkeypox virus (MPV)

  • Measles virus (MV)

  • Dengue virus (DENV), Dengue virus type 1 (DENV-1), Dengue virus type 2 (DENV-2), Dengue virus type 3 (DENV-3), Dengue virus type 4 (DENV-4)

  • Escherichia virus T4
  • Escherichia Virus MS2

  • Armored RNA Quant Internal Process Control
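
The IC concentration format "#.## x 10^#" mentioned in the setup steps can be produced and checked with a short helper before launching an analysis. This is a sketch under assumptions: it reads each "#" in the mantissa as a single digit and allows one or more digits in the exponent, and both function names are hypothetical.

```python
import re

# Assumed reading of "#.## x 10^#": one digit, a period, two digits, " x 10^", exponent digits.
IC_PATTERN = re.compile(r"^(\d\.\d{2}) x 10\^(\d+)$")


def format_ic_concentration(value):
    """Format a concentration of at least 1 as '#.## x 10^#'."""
    if value < 1:
        raise ValueError("this sketch only handles concentrations of 1 or more")
    mantissa, exponent = f"{value:.2e}".split("e")
    return f"{mantissa} x 10^{int(exponent)}"


def parse_ic_concentration(text):
    """Parse a '#.## x 10^#' string back into a float, or raise ValueError."""
    match = IC_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not in '#.## x 10^#' format: {text!r}")
    return float(f"{match.group(1)}e{match.group(2)}")
```

Round-tripping a value through both helpers is a quick way to catch a mistyped exponent before it affects absolute quantification.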

  • Custom reference FASTA and BED files

    Custom reference FASTA file:

    A custom reference FASTA file containing one or more reference sequences is required to run the custom reference sequence analysis. In the FASTA file, sequence names must be unique and should not contain any spaces. If a FASTA header contains a space, the part before the first space is taken as the sequence name. It is recommended to use only the following characters in sequence names: letters, numbers, underscore (_), hyphen (-), parentheses (( and )), and period (.). Otherwise, the sequence names may appear different in the output. An example custom reference FASTA file is provided in the link below.

    custom_reference_snippet.fasta (25KB): Example FASTA file formatting

    To upload a custom reference FASTA file, go to the "Projects" tab and click on the folded paper icon (representing File) to reveal a dropdown menu. Click on "Upload" and select "Files". Within the upload page, select "Other" format for FASTA files, and upload the file as a Biosample. Within the DRAGEN Microbial Enrichment Plus app, under "Custom panel specification" use the "Custom reference FASTA for consensus generation" control to select the uploaded FASTA file.
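
The sequence-name rules for custom reference FASTA files described above (unique names, no spaces, a recommended character set) can be checked locally before upload. The following is a minimal sketch, not part of the app; the function name is hypothetical and it inspects header lines only.

```python
import re

# Recommended characters: letters, numbers, underscore, hyphen, parentheses, period.
RECOMMENDED_NAME = re.compile(r"^[A-Za-z0-9_().-]+$")


def check_fasta_names(fasta_text):
    """Return a list of problem descriptions; an empty list means none were found."""
    problems = []
    seen = set()
    for line_number, line in enumerate(fasta_text.splitlines(), start=1):
        if not line.startswith(">"):
            continue  # sequence lines are not inspected in this sketch
        header = line[1:]
        # The part before the first space is treated as the sequence name.
        name = header.split(" ", 1)[0]
        if " " in header:
            problems.append(
                f"line {line_number}: header contains a space; name is read as '{name}'"
            )
        if not RECOMMENDED_NAME.match(name):
            problems.append(
                f"line {line_number}: name '{name}' uses characters outside the recommended set"
            )
        if name in seen:
            problems.append(f"line {line_number}: duplicate sequence name '{name}'")
        seen.add(name)
    return problems
```

Calling `check_fasta_names(open("my_reference.fasta").read())` on a local file (the file name here is only an example) returns an empty list when no naming issues are found.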

    Custom reference BED file (optional):

    Optionally, a custom reference BED file may also be provided. Sequence names must match between the FASTA file and BED file, and the same set of sequences must appear in both files. If there are multiple viruses, their names should be unique. For example, if there are multiple Influenza genomes, they should not be labeled with the same virus name in the 4th column.

    The BED file controls how sequences are grouped and labeled in the output. If the custom reference FASTA file includes sequences from multiple segments of a viral genome, it is recommended to provide a BED file so that the segments are included under the results of that microorganism.

    The BED file must be tab-delimited with at least 4 columns:

    1. chrom: the sequence name as it appears in the FASTA

    2. chromStart: start position (always set to 0)

    3. chromEnd: end position (sequence length)

    Example custom reference BED file:

    To upload a custom reference BED file, go to the "Projects" tab and click on the folded paper icon (representing File) to reveal a dropdown menu. Click on "Upload" and select "Files". Within the upload page, select "Other" format for BED files, and upload the file as a Biosample. Within the DRAGEN Microbial Enrichment Plus App, under "Custom panel specification" use the "Custom reference BED (optional)" dropdown to select the uploaded BED file.

    Pangolin custom analysis behavior:

    For Custom Panel analyses, Pangolin is enabled and will run on custom reference sequences with at least 3% coverage that meet these naming conventions:

    • If only a FASTA file is provided, Pangolin will run on sequences that have a header containing either SARS-CoV-2 or NC_045512

    • If both a FASTA and BED file are provided, Pangolin will run on sequences where the first column (chrom) contains NC_045512 or the fourth column (genomeName) contains SARS-CoV-2
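
The naming conventions above can be expressed as a small predicate that indicates whether a custom reference sequence would be eligible for Pangolin. This is a sketch under assumptions: coverage is given as a fraction (0.03 for 3%), matching is case-sensitive, and the function name and argument layout are hypothetical.

```python
def pangolin_eligible(fasta_header, coverage, bed_row=None, min_coverage=0.03):
    """Return True if the sequence meets the documented Pangolin conditions.

    fasta_header: FASTA header text without the leading '>'
    coverage: fraction of the reference covered (assumed scale 0.0 to 1.0)
    bed_row: optional sequence of BED columns (chrom, chromStart, chromEnd, genomeName, ...)
    """
    if coverage < min_coverage:
        return False
    if bed_row is None:
        # FASTA only: the header must contain SARS-CoV-2 or NC_045512.
        return "SARS-CoV-2" in fasta_header or "NC_045512" in fasta_header
    # FASTA plus BED: check column 1 (chrom) and column 4 (genomeName).
    chrom, genome_name = bed_row[0], bed_row[3]
    return "NC_045512" in chrom or "SARS-CoV-2" in genome_name
```

Note that when a BED file is supplied, the sketch consults only the BED columns, as the second bullet describes.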

    Nextclade custom analysis behavior:

    For Custom Panel analyses, Nextclade is disabled and will not be run. Do not enable Nextclade.

    Custom Panel

    Abbreviation
    Definition

    NGS

    next-generation sequencing

    pangolin

    phylogenetic assignment of named global outbreak lineages

    Category
    Test information

    RUO

    For Research Use Only. Not for use in diagnostic procedures.

    Test information

    Enrichment panel

    Understanding the BaseSpace HTML reports

    Summary results

    Sample Composition

    The Sample Composition bargraphs show the proportion of reads classified to six broad categories for all samples in the analysis run: Targeted Microbial, Untargeted, Ambiguous, Unclassified, Low Complexity, and Targeted Internal Control (RPIP, UPIP, VSP V2 only).

    Summary Statistics

    The Summary Statistics table summarizes sample QC metrics for all samples in the analysis run. Further details on each metric can be found by hovering over each column header.

    Per sample results

    Individual sample results can be further explored by clicking on "Report" under each sample name in the panel on the left. There are four tabs in the Sample Report: Sample Quality Control, Microorganisms, Antimicrobial Resistance Markers, and User Options.

    1. Sample Quality Control

    • Version Information is a table with the application version, test type, and test version that were run. Running the latest version of the application is recommended.

    • Sample Composition is a bargraph showing the proportion of post-quality reads classified to six broad categories for the sample (RPIP, UPIP, VSP V2 only).

    • Read Classification is a dynamic plot that can be configured to show the following (RPIP, UPIP, VSP V2 only):

      • Targeted Microbial Reads - Relative (default): Bargraph of post-quality targeted microbial reads belonging to Viral, Bacterial, Fungal, Parasite and AMR categories, relative to post-quality targeted microbial reads only. Percentages are expected to sum to 100%. Hover over an individual bar to display the values.

      • Targeted Microbial Reads - Absolute: Bargraph of post-quality targeted microbial reads belonging to Viral, Bacterial, Fungal, Parasite and AMR categories for all post-quality reads in the sample overall. Hover over an individual bar to display the values.

      • Untargeted Reads - Relative: Bargraph of post-quality untargeted reads belonging to untargeted categories, relative to post-quality untargeted reads only. Percentages are expected to sum to 100%. Hover over an individual bar to display the values.

      • Untargeted Reads - Absolute: Bargraph of post-quality untargeted reads belonging to untargeted categories for all post-quality reads in the sample overall. Hover over an individual bar to display the values.

        Note that accurate sample composition and read classification results rely on selecting the correct enrichment panel. If you run an analysis that is not specific to the enrichment panel (e.g., VSP V2 analysis with VSP-enriched samples), reads from high background viruses that are not targeted by VSP probes (e.g., Measles virus) but that are targeted by VSP V2 probes will be reported as targeted viral reads.

    • Internal Controls is a table containing supported Internal Control options along with observed RPKM values (RPIP, UPIP, VSP V2 only).

    • QC Metrics is a table containing sample QC metrics. Dehosting refers to human reads only.

    2. Microorganisms

    • Microorganism results are summarized in tables, separated by type (Viruses, Bacteria, Fungi, Parasites). Each table includes whether the microorganism is predicted present in the sample, as well as various alignment metrics. Further details on each metric can be found by hovering over each column header. The best-match Reference Accession(s) are provided for all RPIP, RVOP/RVEK, VSP, and VSP V2 viruses in the Viruses table. To see all best-match Reference Accession(s), click on the three dots (...) in the table and scroll down the page.

    • Reference Coverage is a dynamic plot showing the coverage depth across the viral genome for detected RPIP, RVOP/RVEK, VSP, and VSP V2 viruses. Select a virus from the dropdown list to view the coverage plot. Segments are concatenated for segmented viruses, and the targeted regions of the viral genome are indicated for RPIP viruses.

    3. Antimicrobial Resistance Markers

    • Viral AMR (Variants) is a table with viral AMR variant results for Influenza A/B viruses (RPIP, RVOP/RVEK, VSP, and VSP V2 only)

    • Bacterial AMR (Genes) is a table with bacterial AMR gene results (RPIP, UPIP only)

    • Bacterial AMR (Variants) is a table with bacterial AMR variant results (RPIP, UPIP only)

    4. User Options

    The User Options table summarizes user options selected during launch of the analysis.

    URL

    See https://www.illumina.com/ for additional information.

    Pango lineage

    The most likely Pango (phylogenetic assignment of named global outbreak) lineage is assigned to the majority consensus SARS-CoV-2 genome sequence using pangolin 4.3.1 (Áine O'Toole & Emily Scher et al. 2021 Virus Evolution DOI:10.1093/ve/veab064).

    Limitations

    Custom panel data analysis using DRAGEN Microbial Enrichment Plus aligns human-dehosted next-generation sequencing (NGS) reads to reference sequences. Contamination with microorganisms is possible during specimen collection, transport, and processing. Reads from closely related microorganisms may align to reference sequences based on sequence homology. Alignment of reads to a microorganism does not confirm that the microorganism is causing symptoms, is viable, or is infectious. Results should be interpreted in the context of all available information. Other sources of data may be required for confirmation.

    Illumina Respiratory Pathogen ID/AMR Enrichment Panel Kit (RPIP)
    Illumina Urinary Pathogen ID/AMR Enrichment Panel Kit (UPIP)
    Illumina Respiratory Virus Oligo Panel / Respiratory Virus Enrichment Kit (RVOP / RVEK)
    Illumina Viral Surveillance Panel (VSP)
    Illumina Viral Surveillance Panel V2 (VSP V2)
    Illumina Custom Panel
    4. genomeName: name of the genome, target, or microorganism the sequence belongs to (e.g. Monkeypox virus clade II)

    5. segmentName (optional): the name of the segment or gene (e.g. Segment 4 (HA)). Set to 'Full' if the sequence is the full genome

    NC_012532.1	0	10794	Zika	Full
    KJ609203.1	0	2292	Influenza A virus (A/Perth/16/2009(H3N2))	Segment 1 (PB2)
    KJ609204.1	0	2304	Influenza A virus (A/Perth/16/2009(H3N2))	Segment 2 (PB1)
    KJ609205.1	0	2168	Influenza A virus (A/Perth/16/2009(H3N2))	Segment 3 (PA+PA-X)
    KJ609206.1	0	1727	Influenza A virus (A/Perth/16/2009(H3N2))	Segment 4 (HA)
    KJ609207.1	0	1530	Influenza A virus (A/Perth/16/2009(H3N2))	Segment 5 (NP)
    KJ609208.1	0	1441	Influenza A virus (A/Perth/16/2009(H3N2))	Segment 6 (NA)
    KJ609209.1	0	1001	Influenza A virus (A/Perth/16/2009(H3N2))	Segment 7 (M1+M2)
    KJ609210.1	0	866	Influenza A virus (A/Perth/16/2009(H3N2))	Segment 8 (NS1+NEP)
    MK239128.1	0	2316	Influenza A virus (A/Iowa/38/2017(H3N2v))	Segment 1 (PB2)
    MK239126.1	0	2316	Influenza A virus (A/Iowa/38/2017(H3N2v))	Segment 2 (PB1)
    MK239124.1	0	2208	Influenza A virus (A/Iowa/38/2017(H3N2v))	Segment 3 (PA+PA-X)
    MK239073.1	0	1737	Influenza A virus (A/Iowa/38/2017(H3N2v))	Segment 4 (HA)
    MK239074.1	0	1540	Influenza A virus (A/Iowa/38/2017(H3N2v))	Segment 5 (NP)
    MK239123.1	0	1441	Influenza A virus (A/Iowa/38/2017(H3N2v))	Segment 6 (NA)
    MK239125.1	0	1002	Influenza A virus (A/Iowa/38/2017(H3N2v))	Segment 7 (M1+M2)
    MK239127.1	0	865	Influenza A virus (A/Iowa/38/2017(H3N2v))	Segment 8 (NS1+NEP)
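
The BED requirements stated in this document (tab-delimited, at least 4 columns, chromStart set to 0, and the same set of sequence names as the FASTA file) can be checked with a short script before upload. This is a sketch, not the app's own validation; the function name is hypothetical and `fasta_names` is assumed to be a list of sequence names already extracted from the FASTA file.

```python
def check_bed(bed_text, fasta_names):
    """Return a list of problem descriptions; an empty list means none were found."""
    problems = []
    bed_names = []
    for line_number, line in enumerate(bed_text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 4:
            problems.append(
                f"line {line_number}: expected at least 4 tab-delimited columns, found {len(fields)}"
            )
            continue
        chrom, chrom_start, chrom_end = fields[0], fields[1], fields[2]
        bed_names.append(chrom)
        if chrom_start != "0":
            problems.append(f"line {line_number}: chromStart should be 0, found '{chrom_start}'")
        if not chrom_end.isdigit() or int(chrom_end) <= 0:
            problems.append(f"line {line_number}: chromEnd should be the sequence length")
    # The same set of sequences must appear in both files.
    missing = sorted(set(fasta_names) - set(bed_names))
    extra = sorted(set(bed_names) - set(fasta_names))
    if missing:
        problems.append(f"in FASTA but not in BED: {', '.join(missing)}")
    if extra:
        problems.append(f"in BED but not in FASTA: {', '.join(extra)}")
    return problems
```

The sketch does not verify that chromEnd equals the actual sequence length or that genome names are unique per virus; those checks would need the FASTA sequences themselves.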

    Output files


    Note: Some files may not be generated depending on the selected analysis options and analysis results

    Sample-level output files

    Filename
    Type
    Description

    Analysis-level output files

    Filename
    Type
    Description

    Microorganism Reporting File format

    Enrichment panel
    Template file

    RVOP/RVEK

    Abbreviation
    Definition

    Bacterial AMR gene nucleotide consensus sequence(s) for all bacterial AMR markers reported in the sample (RPIP, UPIP only)

    Samplename.Panelname.bacterial_amr_protein_consensus.fa

    fasta

    Bacterial AMR gene protein consensus sequence(s) for all bacterial AMR markers reported in the sample (RPIP, UPIP only)

    viral_consensus_genomes

    Dataset

    Directory containing viral genome (or segment) nucleotide consensus sequence(s) per virus reported in the sample (RPIP, RVOP/RVEK, VSP, VSP V2 only)

    Samplename.Panelname.report.json

    json

    Comprehensive report file. See Report JSON format for further details

    Samplename.Panelname.report.html

    html

    Visual report file. See Understanding the BaseSpace HTML reports for further details

    Samplename.Panelname.viral_variants.vcf

    vcf

    Viral variant call file describing variant calls between viral consensus genome (or segment) sequences and best-match reference sequences (all RVOP/RVEK, VSP, and VSP V2 viruses, RPIP: SARS-CoV-2 & FluA/B/C only)

    Samplename.Panelname.viral_genomes_consensus.fa

    fasta

    Viral genome (or segment) nucleotide consensus sequence(s) for all viruses reported in the sample (RPIP, RVOP/RVEK, VSP, VSP V2 only)

    Samplename.Panelname.viral_targets_consensus.fa

    fasta

    Viral targeted region nucleotide consensus sequence(s) for all viruses reported in the sample (RPIP only)

    Samplename.Panelname.bacterial_amr_nucleotide_consensus.fa

    AnalysisIDnumber.Panelname.results.zip

    zip

    Compressed file containing all output files for single-click download

    AnalysisIDnumber.Panelname.report.xlsx

    xlsx

    Aggregate Excel report file that summarizes results for all samples across 4 tabs: Samples, Microorganisms, AMR, and Variants. See below for further details

    report.html

    html

    Visual report file. See Understanding the BaseSpace HTML reports for further details

    DMEplus_aggregate_report_descriptions.xlsx (26KB): description of aggregate Excel report file fields for RPIP, UPIP, RVOP/RVEK, VSP, and VSP V2
    DMEplus_custom_aggregate_report_descriptions.xlsx (20KB): description of aggregate Excel report file fields for Custom Panel

    fasta

    How to edit the template file
    1. First, we recommend saving the provided template file with a new name

    2. Do not add any new columns and do not delete any columns from the template file

    3. Do not change or remove any text from the header row. Note: the "kmer_read_count" metric is only valid with the UPIP enrichment panel.

    4. Rows with microorganism names that are not of interest can be deleted. However, the entire tiered reporting group for certain viruses must be included to preserve tiered reporting logic (if desired). Membership in a tiered reporting group means that a hierarchical relationship is pre-built into the database and the most granular tier level passing reporting thresholds is reported. For example, if Influenza B virus (B/Victoria/2/87-like) or Influenza B virus (B/Yamagata/16/88-like) are reported in a sample then the less granular Influenza B virus reporting name will NOT be reported. See the "Has Tiered Reporting" and "Reporting Tier" columns of the "Microorganisms" table in the for RPIP, RVOP/RVEK, VSP, and VSP V2 to select and see which viruses are reported as part of a tiered reporting group.

    5. Upload the microorganism reporting file to a BaseSpace Project. Note: it is only necessary to upload the file once.

    6. Select the file by clicking on the "Dataset File(s)" option under the "User-defined specification" section.
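
The tiered reporting behavior described in step 4 (the most granular tier that passes reporting thresholds is reported, and its less granular parent is not) can be illustrated as follows. This is a sketch only: the parent mapping is a hypothetical stand-in for the hierarchy built into the app's database, and it is assumed to contain no cycles.

```python
def apply_tiered_reporting(passing, parent_of):
    """Return the reporting names that remain after tiered suppression.

    passing: set of reporting names that pass their reporting thresholds
    parent_of: dict mapping a more granular name to its less granular parent
    """
    suppressed = set()
    for name in passing:
        # Walk up the hierarchy and suppress every ancestor of a passing name.
        parent = parent_of.get(name)
        while parent is not None:
            suppressed.add(parent)
            parent = parent_of.get(parent)
    return sorted(passing - suppressed)
```

This also shows why deleting part of a tiered group from the reporting file matters: without the granular rows, only the less granular name can pass and be reported.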

    Example user-defined microorganism reporting file

    See example below for 6 RPIP microorganism reporting names. Prediction logic can be specified on a microorganism-by-microorganism basis using multiple parameters and combinatorial logical expressions.

    reporting_name
    prediction_logic
    coverage
    median_depth
    ani
    aligned_read_count
    rpkm
    kmer_read_count

    Acinetobacter baumannii

    default

    Illumina Respiratory Pathogen ID/AMR Enrichment Panel Kit (RPIP)

    Illumina Urinary Pathogen ID/AMR Enrichment Panel Kit (UPIP)

    Illumina Respiratory Virus Oligo Panel / Respiratory Virus Enrichment Kit (RVOP / RVEK)

    Illumina Viral Surveillance Panel (VSP)

    Illumina Viral Surveillance Panel V2 (VSP V2)

    RPIP_6-5-1_Microorganism_Reporting_Template.xlsx
    UPIP_8-6-0_Microorganism_Reporting_Template.xlsx
    RVOP_2-7-0_Microorganism_Reporting_Template.xlsx
    VSP_2-7-0_Microorganism_Reporting_Template.xlsx
    VSPV2_2-7-0_Microorganism_Reporting_Template.xlsx

    phylogenetic assignment of named global outbreak lineages

    RPKM

    targeted Reads mapped Per Kilobase of targeted sequence per Million quality-filtered reads

    RVEK

    Respiratory Virus Enrichment Kit

    RVOP

    Respiratory Virus Oligo Panel

    Category
    Test information

    RUO

    For Research Use Only. Not for use in diagnostic procedures.

    URL

    See https://www.illumina.com/ for additional information.

    Quantification

    RVOP data analysis using DRAGEN Microbial Enrichment Plus detects 24 viruses and 238 AMR markers based on target enriched next-generation sequencing (NGS) of viral DNA and cDNA sequences. Sequencing data are interpreted by the DRAGEN software platform and viruses that pass detection thresholds are reported. Relative abundance is expressed as proportion of RPKM values.

    AMR

    This test detects 238 antimicrobial resistance (AMR) markers associated with resistance to Influenza virus neuraminidase inhibitor (NAI) and polymerase acidic protein inhibitor (PAI) in Influenza A virus (H1N1pdm09), Influenza A virus (H1N1), Influenza A virus (H5N1), Influenza A virus (H3N2), Influenza A virus (H3N2; swine-like), Influenza A virus (H7N9), and Influenza B virus. AMR markers and drug associations are based on the World Health Organization (WHO) Influenza virus NAI and PAI Reduced Susceptibility Marker Tables (07 March 2023 version). Detection of an AMR marker is reported if the marker passes a minimum detection threshold and if the Influenza virus associated with the marker is also detected. Reported AMR markers have been associated with antimicrobial resistance but may not always indicate phenotypic resistance. Failure to detect AMR variants does not always indicate phenotypic susceptibility. Results should be interpreted in the context of all available information.

    AMR

    Mutations connected with a '+' form an epistatic group. Epistatic groups are two or more mutations that need to be present concurrently to confer the associated resistance.

    Pango lineage

    The most likely Pango (phylogenetic assignment of named global outbreak) lineage is assigned to the majority consensus SARS-CoV-2 genome sequence using pangolin 4.3.1 (Áine O'Toole & Emily Scher et al. 2021 Virus Evolution DOI:10.1093/ve/veab064).

    AMR

    antimicrobial resistance

    mL

    milliliter

    NAI

    neuraminidase inhibitor

    NGS

    next-generation sequencing

    PAI

    polymerase acidic endonuclease inhibitor

    pangolin

    VSP

    Abbreviation
    Definition

    AMR

    antimicrobial resistance

    mL

    milliliter

    NAI

    neuraminidase inhibitor

    NGS

    next-generation sequencing

    PAI

    polymerase acidic endonuclease inhibitor

    Category
    Test information

    VSP V2

    Abbreviation
    Definition

    Limitations

    Non-detected results do not rule out the presence of viruses and AMR markers. Contamination is possible during specimen collection, transport, and processing. Closely related viruses may be misidentified based on sequence homology to viruses present in the database. The identification of cDNA or DNA sequences from a virus does not confirm that the identified virus is causing symptoms, is viable, or is infectious. Recombinant viral strains may not be reported or may be reported as one or more individual viruses. Should one or more individual viruses be reported for a recombinant viral strain, antiviral resistance results may be inaccurate. In viral strains containing insertion-deletion mutations (indels), there is a risk of false positive or false negative results for other resistance mutations within a region of 100 nucleotides around the indel.

    Limitations

    Information provided by DRAGEN Microbial Enrichment Plus is based on scientific knowledge and has been curated; however, scientific knowledge evolves and reported information may not always be complete and/or correct. Results should be interpreted in the context of all available information. Other sources of data may be required for confirmation.

    Limitations

    Non-detected results do not rule out the presence of viruses and AMR markers. Contamination is possible during specimen collection, transport, and processing. Closely related viruses may be misidentified based on sequence homology to viruses present in the database. The identification of cDNA or DNA sequences from a virus does not confirm that the identified virus is causing symptoms, is viable, or is infectious. Recombinant viral strains may not be reported or may be reported as one or more individual viruses. Should one or more individual viruses be reported for a recombinant viral strain, antiviral resistance results may be inaccurate. In viral strains containing insertion-deletion mutations (indels), there is a risk of false positive or false negative results for other resistance mutations within a region of 100 nucleotides around the indel.

    Limitations

    Information provided by DRAGEN Microbial Enrichment Plus is based on scientific knowledge and has been curated; however, scientific knowledge evolves and reported information may not always be complete and/or correct. Results should be interpreted in the context of all available information. Other sources of data may be required for confirmation.

    pangolin

    phylogenetic assignment of named global outbreak lineages

    RPKM

    targeted Reads mapped Per Kilobase of targeted sequence per Million quality-filtered reads

    VSP

    Viral Surveillance Panel

    RUO

    For Research Use Only. Not for use in diagnostic procedures.

    URL

    See https://www.illumina.com/ for additional information.

    Quantification

    VSP data analysis using DRAGEN Microbial Enrichment Plus detects 149 viruses and 238 AMR markers based on target enriched next-generation sequencing (NGS) of viral DNA and cDNA sequences. Sequencing data are interpreted by the DRAGEN software platform and viruses that pass detection thresholds are reported. Relative abundance is expressed as proportion of RPKM values.

    AMR

    This test detects 238 antimicrobial resistance (AMR) markers associated with resistance to Influenza virus neuraminidase inhibitor (NAI) and polymerase acidic protein inhibitor (PAI) in Influenza A virus (H1N1pdm09), Influenza A virus (H1N1), Influenza A virus (H5N1), Influenza A virus (H3N2), Influenza A virus (H3N2; swine-like), Influenza A virus (H7N9), and Influenza B virus. AMR markers and drug associations are based on the World Health Organization (WHO) Influenza virus NAI and PAI Reduced Susceptibility Marker Tables (07 March 2023 version). Detection of an AMR marker is reported if the marker passes a minimum detection threshold and if the Influenza virus associated with the marker is also detected. Reported AMR markers have been associated with antimicrobial resistance but may not always indicate phenotypic resistance. Failure to detect AMR variants does not always indicate phenotypic susceptibility. Results should be interpreted in the context of all available information.

    AMR

    Mutations connected with a '+' form an epistatic group. Epistatic groups are two or more mutations that need to be present concurrently to confer the associated resistance.

    Pango lineage

    The most likely Pango (phylogenetic assignment of named global outbreak) lineage is assigned to the majority consensus SARS-CoV-2 genome sequence using pangolin 4.3.1 (Áine O'Toole & Emily Scher et al. 2021 Virus Evolution DOI:10.1093/ve/veab064).

    Cryptococcus neoformans

    coverage

    0.3

    Escherichia coli

    aligned_read_count

    200

    Human adenovirus E

    (coverage AND median_depth) OR (aligned_read_count AND ani)

    0.1

    1

    0.95

    100

    Human bocavirus 1 (HBoV1)

    rpkm OR (ani AND coverage) OR median_depth

    0.2

    5

    0.9

    5

    Klebsiella pneumoniae

    default AND coverage

    0.5
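The rows above pair each microorganism with a prediction logic expression and per-metric thresholds. As a rough illustration only, the sketch below shows one way such an expression could be evaluated. It is not the app's implementation. The function names `evaluate_logic` and `metrics_passed` are hypothetical, the rule that a metric passes when the observed value is greater than or equal to its threshold is an assumption, and the outcome of the `default` keyword is assumed to be supplied by the caller.

```python
import re


def evaluate_logic(expression, passed):
    """Evaluate an expression such as
    '(coverage AND median_depth) OR (aligned_read_count AND ani)'.

    `passed` maps each name used in the expression to True or False.
    Precedence: parentheses first, then AND, then OR.
    """
    tokens = re.findall(r"\(|\)|[A-Za-z_]+", expression)
    pos = 0

    def parse_or():
        nonlocal pos
        value = parse_and()
        while pos < len(tokens) and tokens[pos] == "OR":
            pos += 1
            rhs = parse_and()  # always parse the right side to consume tokens
            value = value or rhs
        return value

    def parse_and():
        nonlocal pos
        value = parse_atom()
        while pos < len(tokens) and tokens[pos] == "AND":
            pos += 1
            rhs = parse_atom()
            value = value and rhs
        return value

    def parse_atom():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return value
        if token in (")", "AND", "OR"):
            raise ValueError(f"unexpected token {token!r}")
        return bool(passed[token])

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("trailing tokens in expression")
    return result


def metrics_passed(observed, thresholds):
    """ASSUMPTION: a metric passes when observed >= threshold."""
    return {name: observed[name] >= limit for name, limit in thresholds.items()}


# Illustrative values only; the threshold-to-metric assignment is assumed.
thresholds = {"coverage": 0.1, "median_depth": 1, "ani": 0.95, "aligned_read_count": 100}
observed = {"coverage": 0.05, "median_depth": 3, "ani": 0.97, "aligned_read_count": 250}
logic = "(coverage AND median_depth) OR (aligned_read_count AND ani)"
reported = evaluate_logic(logic, metrics_passed(observed, thresholds))
```

A small recursive-descent parser is used here instead of `eval` so that only metric names, `AND`, `OR`, and parentheses are ever interpreted.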

    Panel Summary

    phylogenetic assignment of named global outbreak lineages

    RPKM

    targeted Reads mapped Per Kilobase of targeted sequence per Million quality-filtered reads

    VSP

    Viral Surveillance Panel

    Category
    Test information

    RUO

    For Research Use Only. Not for use in diagnostic procedures.

    URL

    See https://www.illumina.com/ for additional information.

    Quantification - when a quantitative Internal Control {ic_name} and concentration {ic_concentration} is specified

    VSP (generation 2) data analysis using DRAGEN Microbial Enrichment Plus detects 200 viruses and 238 AMR markers based on target enriched next-generation sequencing (NGS) of viral DNA and cDNA sequences. Sequencing data are interpreted by the DRAGEN software platform and viruses that pass detection thresholds are reported. Absolute quantification assumes use of {ic_name} as an Internal Control spiked at {ic_concentration} copies/mL of sample. Relative abundance is calculated based on absolute quantities and is expressed as proportion of absolute quantities. If RPKM for the Internal Control is zero, no absolute quantification is provided, and relative abundance is expressed as proportion of RPKM values.

    Quantification - when a quantitative Internal Control is NOT specified

    VSP (generation 2) data analysis using DRAGEN Microbial Enrichment Plus detects 200 viruses and 238 AMR markers based on target enriched next-generation sequencing (NGS) of viral DNA and cDNA sequences. Sequencing data are interpreted by the DRAGEN software platform and viruses that pass detection thresholds are reported. Relative abundance is expressed as proportion of RPKM values. Internal Control not specified; no absolute quantification provided.

    AMR

    This test detects 238 antimicrobial resistance (AMR) markers associated with resistance to Influenza virus neuraminidase inhibitor (NAI) and polymerase acidic protein inhibitor (PAI) in Influenza A virus (H1N1pdm09), Influenza A virus (H1N1), Influenza A virus (H5N1), Influenza A virus (H3N2), Influenza A virus (H3N2; swine-like), Influenza A virus (H7N9), and Influenza B virus. AMR markers and drug associations are based on the World Health Organization (WHO) Influenza virus NAI and PAI Reduced Susceptibility Marker Tables (07 March 2023 version). Detection of an AMR marker is reported if the marker passes a minimum detection threshold and if the Influenza virus associated with the marker is also detected. Reported AMR markers have been associated with antimicrobial resistance but may not always indicate phenotypic resistance. Failure to detect AMR variants does not always indicate phenotypic susceptibility. Results should be interpreted in the context of all available information.

    AMR

    Mutations connected with a '+' form an epistatic group. Epistatic groups are two or more mutations that need to be present concurrently to confer the associated resistance.

    AMR

    antimicrobial resistance

    mL

    milliliter

    NAI

    neuraminidase inhibitor

    NGS

    next-generation sequencing

    PAI

    polymerase acidic endonuclease inhibitor

    pangolin

    RPIP

    Abbreviation
    Definition

    AMR

    antimicrobial resistance

    CLSI

    Clinical and Laboratory Standards Institute

    ESBL

    extended spectrum beta-lactamase

    EUCAST

    European Committee on Antimicrobial Susceptibility Testing

    mL

    milliliter

    Category
    Test information

    Pipeline logic

    Pipeline steps

    Step
    Description
    Notes
    RPIP
    UPIP
    RVOP/RVEK
    VSP
    VSP V2
    Custom Panel

    Read classification

    This test differentiates sequencing reads classified to microorganism and Internal Control regions that are targeted by capture probes (“Targeted Microbial” and “Targeted Internal Control”) from those that are not targeted (“Untargeted”), are low complexity (“Low Complexity”), cannot be unambiguously assigned to one category (“Ambiguous”), or cannot be classified with confidence (“Unclassified”).

    Pango lineage

    The most likely Pango (phylogenetic assignment of named global outbreak) lineage is assigned to the majority consensus SARS-CoV-2 genome sequence using pangolin 4.3.1 (Áine O'Toole & Emily Scher et al. 2021 Virus Evolution DOI:10.1093/ve/veab064).

    Limitations

    Non-detected results do not rule out the presence of viruses and AMR markers. Contamination is possible during specimen collection, transport, and processing. Closely related viruses may be misidentified based on sequence homology to viruses present in the database. The identification of cDNA or DNA sequences from a virus does not confirm that the identified virus is causing symptoms, is viable, or is infectious. Recombinant viral strains may not be reported or may be reported as one or more individual viruses. Should one or more individual viruses be reported for a recombinant viral strain, antiviral resistance results may be inaccurate. In viral strains containing insertion-deletion mutations (indels), there is a risk of false positive or false negative results for other resistance mutations within a region of 100 nucleotides around the indel.

    Limitations

    Information provided by DRAGEN Microbial Enrichment Plus is based on scientific knowledge and has been curated; however, scientific knowledge evolves and reported information may not always be complete and/or correct. Results should be interpreted in the context of all available information. Other sources of data may be required for confirmation.

    AMR

    Linkage between bacterial AMR marker, antimicrobial, and drug class is based on the Comprehensive Antibiotic Resistance Database (CARD, version 3.2.8) from McMaster University, ResFinder (version 2.2.1), NCBI Reference Gene Catalog (version 2023-09-26.1), EUCAST expert rules on indicator agents (2019-2023), and CLSI Performance Standards for Antimicrobial Susceptibility Testing (M100 34th Edition). Linkage between viral AMR marker, antimicrobial, and drug class is based on the publications provided in the JSON report (see the PubMed IDs (pmids) field). Not all antimicrobials and drug classes that are listed may be relevant. Detected AMR markers may also confer resistance to antimicrobials and drug classes that are not listed.

    AMR

    A representative list of associated microorganisms known to harbor the detected or similar bacterial AMR markers, based on the Comprehensive Antibiotic Resistance Database Prevalence Data (CARD Prevalence, version 4.0.1) from McMaster University, can be found in the Associated Microorganisms field.

    AMR

    Mutations connected with a '+' form an epistatic group. Epistatic groups are two or more mutations that need to be present concurrently to confer the associated resistance.

    AMR

    All intrinsic resistance described in CLSI Performance Standards for Antimicrobial Susceptibility Testing, M100 34th Edition, Appendix B for detected microorganism(s) is reported. Additional comments regarding CLSI intrinsic resistance definitions may be reported in footnotes specific to the detected microorganism(s). Some intrinsic resistance is described with reference to drug classes rather than specific antimicrobials. Users may reference CLSI Glossary I (Part 1 and Part 2): Class and Subclass Designations and Generic Names for information on how CLSI categorizes antimicrobials and drug classes.

    AMR

    Confidence of bacterial AMR marker detection is shown as High, Medium, or Low and is based on the available sequencing data. High confidence indicates that a bacterial AMR marker has 100% protein sequence coverage and 100% protein sequence percent identity (PID). Medium confidence indicates that a bacterial AMR marker has ≥90% protein sequence coverage and ≥90% protein sequence percent identity (PID). Low confidence indicates that a bacterial AMR marker has ≥60% protein sequence coverage and ≥80% protein sequence percent identity (PID).

    Phenotypic group

    Targeted microorganisms are classified into three Phenotypic Groups based on general association with normal flora, colonization, or contamination from the environment or other sources, as well as based on general association with disease. Phenotypic grouping DOES NOT INDICATE PATHOGENICITY IN A GIVEN CASE and results need to be interpreted in the context of all available information. Phenotypic Group 1: Microorganisms that are frequently considered part of the normal flora, colonizers, or contaminants but may be associated with disease in certain settings. Phenotypic Group 2: Microorganisms that may represent normal flora, colonizers, or contaminants but that are frequently associated with disease. Phenotypic Group 3: Microorganisms that are not generally considered part of the normal flora, colonizers, or contaminants and are generally considered to be associated with disease.

    Pango lineage

    The most likely Pango (phylogenetic assignment of named global outbreak) lineage is assigned to the majority consensus SARS-CoV-2 genome sequence using pangolin 4.3.1 (Áine O'Toole & Emily Scher et al. 2021 Virus Evolution DOI:10.1093/ve/veab064).

    Read classification

    This test differentiates sequencing reads classified to microorganism and Internal Control regions that are targeted by capture probes (“Targeted Microbial” and “Targeted Internal Control”) from those that are not targeted (“Untargeted”), are low complexity (“Low Complexity”), cannot be unambiguously assigned to one category (“Ambiguous”), or cannot be classified with confidence (“Unclassified”).

    Limitations

    Non-detected results do not rule out the presence of viruses, bacteria, fungi, and AMR markers. Contamination with microorganisms is possible during specimen collection, transport, and processing. Closely related microorganisms may be misidentified based on sequence homology to species present in the database. The identification of cDNA or DNA sequences from a microorganism does not confirm that the identified microorganism is causing symptoms, is viable, or is infectious. Recombinant viral strains may not be reported or may be reported as one or more individual viruses. Should one or more individual viruses be reported for a recombinant viral strain, antiviral resistance results may be inaccurate.

    Limitations

    The best matching allele is reported for each detected bacterial AMR gene family. If two or more alleles within the same bacterial AMR gene family are detected, only the allele with the higher confidence will be reported as the best match unless multiple alleles have a High confidence interpretation (100% coverage and PID). In strains containing insertion-deletion mutations (indels), there is a risk of false positive or false negative results for other resistance mutations within a region of 100 nucleotides around the indel.

    Limitations

    Information provided by DRAGEN Microbial Enrichment Plus is based on scientific knowledge and has been curated; however, scientific knowledge evolves and information about associated microorganism and associated resistance may not always be complete and/or correct. Results should be interpreted in the context of all available information. Other sources of data may be required for confirmation.
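The bacterial AMR confidence tiers and the best-matching-allele rule described in the rows above can be sketched as follows. This is a minimal illustration under stated assumptions, not the app's implementation. The function names are hypothetical, inputs are assumed to be percentages, and the choice between two alleles with the same non-High tier is not specified in this documentation, so the sketch simply keeps the first one.

```python
def amr_confidence(coverage_pct, identity_pct):
    """Map protein sequence coverage and percent identity (PID) to a tier.

    Returns "High", "Medium", "Low", or None when no tier is met.
    """
    if coverage_pct == 100 and identity_pct == 100:
        return "High"
    if coverage_pct >= 90 and identity_pct >= 90:
        return "Medium"
    if coverage_pct >= 60 and identity_pct >= 80:
        return "Low"
    return None


TIER_RANK = {"High": 3, "Medium": 2, "Low": 1}


def best_alleles(alleles):
    """Pick the allele(s) to report for one bacterial AMR gene family.

    `alleles` is a list of (name, coverage_pct, identity_pct) tuples.
    All High-confidence alleles are kept; otherwise only the single
    allele with the highest tier is kept (ASSUMPTION: first one wins ties).
    """
    tiered = [(name, amr_confidence(cov, pid)) for name, cov, pid in alleles]
    tiered = [(name, tier) for name, tier in tiered if tier is not None]
    if not tiered:
        return []
    high = [name for name, tier in tiered if tier == "High"]
    if high:
        return high
    best = max(tiered, key=lambda item: TIER_RANK[item[1]])
    return [best[0]]
```

For example, a hypothetical family with two alleles at 100% coverage and 100% PID would report both, while a family with one Medium and one Low allele would report only the Medium one.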

    NAI

    neuraminidase inhibitor

    NGS

    next-generation sequencing

    PAI

    polymerase acidic endonuclease inhibitor

    pangolin

    phylogenetic assignment of named global outbreak lineages

    RPIP

    Respiratory Pathogen ID/AMR Panel

    RPKM

    targeted Reads mapped Per Kilobase of targeted sequence per Million quality-filtered reads

    RUO

    For Research Use Only. Not for use in diagnostic procedures.

    URL

    See https://www.illumina.com/ for additional information.

    Quantification - when a quantitative Internal Control {ic_name} and concentration {ic_concentration} is specified

    RPIP data analysis using DRAGEN Microbial Enrichment Plus detects 41 viruses, 187 bacteria, 53 fungi, and 4,079 AMR markers, unless filtered reporting options are selected, based on target enriched next-generation sequencing (NGS) of microorganism DNA and cDNA sequences. Sequencing data are interpreted by the DRAGEN software platform and microorganisms that pass detection thresholds are reported. Absolute quantification assumes use of {ic_name} as an Internal Control spiked at {ic_concentration} copies/mL of sample. Relative abundance is calculated based on absolute quantities and is expressed as proportion of absolute quantities within each pathogen class (i.e., bacteria, viruses, fungi). If RPKM for the Internal Control is zero, no absolute quantification is provided, and relative abundance is expressed as proportion of microorganism RPKM values within each pathogen class.

    Quantification - when a quantitative Internal Control is NOT specified

    RPIP data analysis using DRAGEN Microbial Enrichment Plus detects 41 viruses, 187 bacteria, 53 fungi, and 4,079 AMR markers, unless filtered reporting options are selected, based on target enriched next-generation sequencing (NGS) of microorganism DNA and cDNA sequences. Sequencing data are interpreted by the DRAGEN software platform and microorganisms that pass detection thresholds are reported. Relative abundance is expressed as proportion of microorganism RPKM values within each pathogen class (i.e., bacteria, viruses, fungi). Internal Control not specified; no absolute quantification provided.

    AMR - when "Report bacterial AMR markers only when an associated microorganism is reported" is selected

    This test detects 4,079 antimicrobial resistance (AMR) markers and reports associations for 99 microorganisms, 181 antimicrobials, and 35 drug classes, unless filtered reporting options are selected. Bacterial AMR markers are based on the Comprehensive Antibiotic Resistance Database (CARD, version 3.2.8) and viral AMR markers are based on World Health Organization (WHO) Influenza virus neuraminidase inhibitor (NAI) and polymerase acidic protein inhibitor (PAI) Reduced Susceptibility Marker Tables (07 March 2023 version). Detection of an AMR marker is reported if the AMR marker passes a minimum detection threshold and if one or more of the microorganisms associated with the AMR marker is also detected, in alignment with guidance provided by the College of American Pathologists (CAP) MIC.21855. However, reported AMR markers may originate from microorganisms that did not meet detection thresholds or microorganisms not targeted by the test. Association between microorganisms and bacterial AMR marker is based on scientific literature and the Comprehensive Antibiotic Resistance Database Prevalence Data (CARD Prevalence, version 4.0.1) from McMaster University. Reported AMR markers have been associated with antimicrobial resistance but may not always indicate phenotypic resistance. Failure to detect AMR markers does not always indicate phenotypic susceptibility. Results should be interpreted in the context of all available information.

    AMR - when "Report bacterial AMR markers only when an associated microorganism is reported" is NOT selected

    This test detects 4,079 antimicrobial resistance (AMR) markers and reports associations for 99 microorganisms, 181 antimicrobials, and 35 drug classes. Bacterial AMR markers are based on the Comprehensive Antibiotic Resistance Database (CARD, version 3.2.8) and viral AMR markers are based on World Health Organization (WHO) Influenza virus neuraminidase inhibitor (NAI) and polymerase acidic protein inhibitor (PAI) Reduced Susceptibility Marker Tables (07 March 2023 version). Association between microorganisms and bacterial AMR marker is based on scientific literature and the Comprehensive Antibiotic Resistance Database Prevalence Data (CARD Prevalence, version 4.0.1) from McMaster University. Detection of a bacterial AMR marker is reported if the marker passes a minimum detection threshold, regardless of associated microorganism detection. Reported AMR markers may originate from microorganisms that did not meet detection thresholds or microorganisms not targeted by the test. Reported AMR markers have been associated with antimicrobial resistance but may not always indicate phenotypic resistance. Failure to detect AMR markers does not always indicate phenotypic susceptibility. Results should be interpreted in the context of all available information.
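The quantification rows above define RPKM as targeted reads mapped per kilobase of targeted sequence per million quality-filtered reads, and describe relative abundance as a proportion within each pathogen class. The sketch below works through that arithmetic. Function and parameter names are hypothetical. The absolute quantity formula (microorganism RPKM scaled by the Internal Control RPKM and its spiked concentration) is an assumption for illustration; this documentation does not state the exact calculation.

```python
def rpkm(targeted_reads, targeted_bases, quality_filtered_reads):
    """Targeted reads per kilobase of targeted sequence per million
    quality-filtered reads."""
    kilobases = targeted_bases / 1_000
    millions = quality_filtered_reads / 1_000_000
    return targeted_reads / kilobases / millions


def absolute_copies_per_ml(organism_rpkm, ic_rpkm, ic_concentration):
    """ASSUMED proportional scaling against the Internal Control.

    Returns None when the Internal Control RPKM is zero, in which case
    no absolute quantification is provided.
    """
    if ic_rpkm == 0:
        return None
    return organism_rpkm / ic_rpkm * ic_concentration


def relative_abundance(value_by_organism, class_by_organism):
    """Express each value as a proportion of the total for its pathogen
    class (for example bacteria, viruses, fungi)."""
    totals = {}
    for organism, value in value_by_organism.items():
        pathogen_class = class_by_organism[organism]
        totals[pathogen_class] = totals.get(pathogen_class, 0.0) + value
    result = {}
    for organism, value in value_by_organism.items():
        total = totals[class_by_organism[organism]]
        result[organism] = value / total if total > 0 else 0.0
    return result
```

The same `relative_abundance` helper can take either absolute quantities or RPKM values, which mirrors the fallback to RPKM proportions when the Internal Control RPKM is zero or no Internal Control is specified.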

    Sample QC

    Sample composition and enrichment factor calculations

    Internal control required to calculate the enrichment factor

    X

    X

    X

    Microorganism classification

    Pre-alignment filtering step

    Configurable sensitivity

    X

    X

    X

    Microorganism detection

    Reference alignment, consensus sequence generation, variant calling

    X

    X

    X

    X

    X

    X

    Microorganism quantification

    Absolute copies/mL calculation

    Quantitative internal control and concentration required

    X

    X

    X

    Microorganism reporting thresholds

    Proprietary algorithms or user-defined reporting logic

    X

    X

    X

    X

    X

    Bacterial AMR marker analysis

    Nucleotide and protein alignment, consensus sequence generation, variant calling and annotation

    X

    X

    Viral AMR marker analysis

    Variant calling and annotation

    X

    X

    X

    X

    Viral clade and lineage prediction

    Pangolin, Nextclade

    X

    X

    X

    X

    Result filters

    User-specified filters applied

    X

    X

    X

    X

    X

    Reporting - Analysis level

    XLSX, HTML, ZIP

    X

    X

    X

    X

    X

    X

    Reporting - Sample level

    JSON, HTML, FASTA (consensus sequences), VCF (viral variants)

    X

    X

    X

    X

    X

    X

    Read QC

    Low-quality bases are trimmed from the ends of each read. After trimming, the read is discarded if fewer than 50% of its bases have a quality score greater than or equal to Q20, if the read is shorter than 32 bp, or if the read has 5 or more ambiguous bases. Appropriate adapter trimming is assumed to have already been performed.

    Optional

    X

    X

    X

    X

    X

    X

    Dehosting

    Human read removal using the DRAGEN Kmer Classifier

    X

    X

    X

    X

    X

    X
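The Read QC discard criteria listed in the pipeline steps above can be expressed as a simple per-read filter. The sketch below is illustrative only. It assumes the read has already been end-trimmed, that quality scores are supplied as integer Phred values, and that ambiguous bases are encoded as `N`. The function name `passes_read_qc` is hypothetical and is not part of the DRAGEN software.

```python
def passes_read_qc(sequence, qualities,
                   min_q=20, min_fraction=0.5,
                   min_length=32, ambiguous_limit=5):
    """Return True if a trimmed read would be kept.

    A read is discarded when it is shorter than `min_length`, has
    `ambiguous_limit` or more ambiguous bases (ASSUMED to be 'N'), or
    has fewer than `min_fraction` of its bases at or above `min_q`.
    """
    if len(sequence) != len(qualities):
        raise ValueError("sequence and qualities must have equal length")
    if len(sequence) < min_length:
        return False
    if sequence.upper().count("N") >= ambiguous_limit:
        return False
    high_quality = sum(1 for q in qualities if q >= min_q)
    return high_quality / len(qualities) >= min_fraction
```

A 40 bp read with exactly half of its bases at Q20 or above is kept under this reading of the rule, because only reads with fewer than 50% such bases are discarded.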

    Release notes

    DRAGEN Microbial Enrichment Plus app version 1.1.0

    Component versions

    • Test type, version:

      • RPIP 6.5.1

      • UPIP 8.6.0

      • RVOP 2.7.0

      • VSP 2.7.0

      • VSPv2 2.7.0

      • Custom 1.0.0

    • Analysis Pipeline version: 6.3.12

    • DRAGEN version: 4.3.11

    Third-party versions

    • Pangolin 4.3.1 (Pangolin database PUSHER version 1.27)

    • Nextclade 3.5.0

    • SnpEff 5.1

    • ResFinder (version 2.2.1)

    Key updates

    • Various bug fixes (see below)

    • Tiered reporting added for Norovirus (GI, GII, GIV, GVIII, GIX) and Dengue virus (type 1, type 2, type 3, type 4)

    • Tiered reporting below subtype resolution suppressed for Influenza A virus subtypes H1N1 and H3N2

    Known issues

    • When reading Biosamples from a Project, Fastq files for Biosamples sharing the same sample name prefix before the first underscore are merged. For example, Fastq files for Biosamples PREFIX_001 and PREFIX_002 will be merged and reported as a single PREFIX sample. To avoid this error, ensure that sample names are unique before the first underscore, replace underscores with a hyphen, or provide Biosample input from a list

    • Coverage results for SARS-CoV-2 are slightly (<1%) over-estimated, which may result in coverage >100% due to an error accounting for masked polyA-tail bases

    • Viral genome consensus sequence bases without aligned read support are indicated by "X" bases rather than "N" bases for RPIP viruses except SARS-CoV-2 and Influenza viruses
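The Biosample merging issue listed above can be checked before launching an analysis. The sketch below groups Biosample names by the text before the first underscore and flags any prefix shared by more than one name. The helper names are hypothetical, and the renaming step simply applies the hyphen workaround suggested in the known issue.

```python
from collections import defaultdict


def find_prefix_collisions(biosample_names):
    """Return {prefix: [names]} for prefixes shared by several Biosamples.

    The prefix is the part of the name before the first underscore.
    """
    groups = defaultdict(list)
    for name in biosample_names:
        prefix = name.split("_", 1)[0]
        groups[prefix].append(name)
    return {prefix: names for prefix, names in groups.items() if len(names) > 1}


def hyphenate(name):
    """Replace underscores with hyphens so the whole name stays unique."""
    return name.replace("_", "-")


names = ["PREFIX_001", "PREFIX_002", "OTHER_001"]
collisions = find_prefix_collisions(names)
renamed = [hyphenate(name) for name in names]
```

Here `collisions` flags `PREFIX_001` and `PREFIX_002`, and running the same check on `renamed` finds no shared prefixes.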

    Known limitations

    • When providing Biosample input from a list, 99 associated Fastq files is the maximum allowed per analysis. There is no Fastq file limitation when reading Biosamples from a Project

    • In strains containing insertion-deletion mutations (indels), there is a risk of false positive or false negative results for other resistance mutations within a region of 100 nucleotides around the indel

    • In strains containing long insertion-deletion mutations (indels), there is a risk of false negative results for two large vAMR-associated deletion mutations (RPIP, VSPv2) and one large bAMR-associated insertion mutation (RPIP). Long indels may be incorrectly reported as other variant types, such as frameshift mutations

    Bug fixes

    • Nextclade parsing errors for some samples

    • Custom reference sequence analysis not functional in non-US regions

    • User-defined microorganism reporting feature not reporting microorganisms that belong to a tiered reporting group when “prediction_logic” column set to “default”

    DRAGEN Microbial Enrichment Plus app version 1.0.0

    Initial release.

    Component versions

    • Test type, version:

      • RPIP 6.3.0

      • UPIP 8.4.0

    Third-party versions

    • Pangolin 4.3.1 (Pangolin data 1.27)

    • Nextclade 3.5.0

    • SnpEff 5.1

    • ResFinder (version 2.2.1)

    Key updates

    • Updated and expanded microorganism and bacterial AMR marker databases

    • Updated and expanded Influenza virus typing capability and antiviral resistance (AVR) reporting

    • User-defined microorganism reporting list and reporting thresholds

    Known issues

    • When reading Biosamples from a Project, Fastq files for Biosamples sharing the same sample name prefix before the first underscore are merged. For example, Fastq files for Biosamples PREFIX_001 and PREFIX_002 will be merged and reported as a single PREFIX sample. To avoid this error, ensure that sample names are unique before the first underscore, replace underscores with a hyphen, or provide Biosample input from a list

    • Reads are duplicated for samples with a single FASTQ file

    • Empty FASTQ files will abort analysis

    Known limitations

    • When providing Biosample input from a list, 99 associated Fastq files is the maximum allowed per analysis. There is no Fastq file limitation when reading Biosamples from a Project

    • Small differences in results may be observed between repeat analyses

    • In strains containing insertion-deletion mutations (indels), there is a risk of false positive or false negative results for other resistance mutations within a region of 100 nucleotides around the indel

  • NCBI Reference Gene Catalog (version 2023-09-26.1)

  • EUCAST expert rules on indicator agents (2019-2023)

  • CLSI Performance Standards for Antimicrobial Susceptibility Testing (M100 34th Edition)

  • Comprehensive Antibiotic Resistance Database (CARD, version 3.2.8)

  • Comprehensive Antibiotic Resistance Database Prevalence Data (CARD Prevalence, version 4.0.1)

  • World Health Organization (WHO) Influenza virus neuraminidase inhibitor (NAI) and polymerase acidic protein inhibitor (PAI) Reduced Susceptibility Marker Tables (07 March 2023 version)

  • Nextclade datasets added for Measles virus (MV) and Dengue virus (DENV) clade assignment
  • Reference genomes added for Monkeypox virus (MPV) Clade 1b

  • Additional database curation

  • Variant annotation information for Influenza A and B viruses, including antiviral resistance prediction results, may not be populated when below threshold reporting is enabled and/or a user-defined microorganism reporting file is specified that does not include all members of the Influenza A and B virus tiered reporting groups. If viral variant annotation is of interest for Influenza A and B viruses, default microorganism reporting options are recommended

  • Small differences in SARS-CoV-2 and Influenza virus results may be observed between repeat analyses

  • RPKM and absolute quantity metrics inaccurate when read QC disabled
  • SHV beta-lactamase AMR markers incorrectly reported as carbapenemases based on a known curation error in CARD version 3.2.8

  • Reads duplicated for samples with a single FASTQ file

  • Empty FASTQ files abort analysis

  • Pangolin not run on all samples with SARS-CoV-2 detected

  • Viral genome coverage plots not rendered for segmented viruses when all segments not detected

  • Description information missing for some viral genome accessions

  • RVOP 2.3.0
  • VSP 2.3.0

  • VSPv2 2.3.0

  • Custom 1.0.0

  • Analysis Pipeline version: 6.3.12

  • DRAGEN version: 4.3.6

  • NCBI Reference Gene Catalog (version 2023-09-26.1)

  • EUCAST expert rules on indicator agents (2019-2023)

  • CLSI Performance Standards for Antimicrobial Susceptibility Testing (M100 34th Edition)

  • Comprehensive Antibiotic Resistance Database (CARD, version 3.2.8)

  • Comprehensive Antibiotic Resistance Database Prevalence Data (CARD Prevalence, version 4.0.1)

  • World Health Organization (WHO) Influenza virus neuraminidase inhibitor (NAI) and polymerase acidic protein inhibitor (PAI) Reduced Susceptibility Marker Tables (07 March 2023 version)

  • Below threshold reporting for microorganisms and/or AMR markers
  • Custom reference sequence analysis

  • Nextclade may encounter a parsing error for some samples. If an analysis fails, try re-running the analysis with Nextclade disabled

  • Pangolin may not be run on all samples with SARS-CoV-2 detected

  • Custom reference sequence analysis is not functional in non-US regions

  • The user-defined microorganism reporting feature does not report microorganisms that belong to a tiered reporting group when the “prediction_logic” column is set to “default”. See the User Guide for further information about microorganism tiered reporting

  • RPKM and absolute quantity metrics are inaccurate when read QC is disabled

  • SHV beta-lactamase AMR markers are incorrectly reported as carbapenemases based on a known curation error in CARD version 3.2.8

  • Coverage results for SARS-CoV-2 are slightly (<1%) over-estimated, which may result in coverage >100% due to an error accounting for masked polyA-tail bases

  • Viral genome consensus sequence bases without aligned read support are indicated by "X" bases rather than "N" bases for RPIP viruses except SARS-CoV-2 and Influenza viruses

  • Variant annotation information for Influenza A and B viruses, including antiviral resistance prediction results, may not be populated when below threshold reporting is enabled and/or a user-defined microorganism reporting file is specified that does not include all members of the Influenza A and B virus tiered reporting groups. If viral variant annotation is of interest for Influenza A and B viruses, default microorganism reporting options are recommended

  • In strains containing long insertion-deletion mutations (indels), there is a risk of false negative results for two large vAMR-associated deletion mutations (RPIP, VSPv2) and one large bAMR-associated insertion mutation (RPIP). Long indels may be incorrectly reported as other variant types, such as frameshift mutations

  • The RPIP, VSPv2, VSPv1, and RVOP Data Analysis solutions can report Influenza A virus subtypes H1N1 and H3N2 to a below-subtype resolution. Multiple results for H1N1 and/or H3N2 may be reported concurrently, particularly in samples that contain a mixture of Influenza A virus subtypes

  • Viral genome coverage plots are not rendered for segmented viruses when all segments are not detected

  • Description information is missing for some viral genome accessions

    Scientific evidence

    Illumina Respiratory Pathogen ID/AMR Enrichment Panel Kit (RPIP)

    Application note: Analytical performance of the Respiratory Pathogen ID/AMR Panel

    Application note: Rapid detection of respiratory pathogens using the MiniSeq™ System

    Technical note: Evaluating reference materials for use with the Illumina Respiratory Pathogen ID/AMR Enrichment Panel Kit: Ensure optimal performance with external controls from commercial vendors

    Illumina Urinary Pathogen ID/AMR Enrichment Panel Kit (UPIP)

    Genomics Research Hub (GRH) article: Wastewater AMR surveillance with a broad probe-capture precision metagenomics (PMG) panel

    Data sheet: Urinary Pathogen ID/AMR Panel: Highly sensitive, culture-free identification and quantification of common and underrecognized uropathogens

    UPIP ID Week Scientific Poster: Theoretical Antimicrobial Selection Based on Precision Metagenomics Compared with Standard Urine Culture/Susceptibility: A Reliability and Inter-Rater Agreement Feasibility Analysis

    Illumina Respiratory Virus Oligo Panel / Respiratory Virus Enrichment Kit (RVOP / RVEK)

    Application note: Detection and characterization of respiratory viruses, including SARS-CoV-2, using Illumina RNA Prep with Enrichment

    Application note: Faster detection of respiratory viruses using the MiniSeq™ Rapid Reagent Kit and Illumina RNA Prep with Enrichment

    Application note: Surveillance of infectious disease through wastewater sequencing: Detect SARS-CoV-2 variants and other respiratory viruses in the community

    Illumina Viral Surveillance Panel (VSP)

    Data sheet: Viral Surveillance Panel: Streamlined whole-genome sequencing of high-impact viruses using hybrid-capture enrichment

    Illumina Viral Surveillance Panel V2 (VSP V2)

    Data sheet: Viral Surveillance Panel v2: Streamlined whole-genome sequencing for high-risk viral surveillance and research

    Frequently Asked Questions (FAQ)

    General

    Q: Which Illumina Infectious Disease and Microbiology target-capture enrichment panel kits are compatible with the DRAGEN Microbial Enrichment Plus app?

    A: RPIP, UPIP, RVOP/RVEK, VSP, VSP V2, and Custom infectious disease and microbiology enrichment panels. To analyze the Pan-Coronavirus (Pan-CoV) panel, a custom coronavirus reference sequence database may be specified. The DME+ app is not intended for use with non-infectious disease enrichment panels (such as human exome).

    Q: Can I analyze the Pan-Coronavirus (Pan-CoV) panel here?

    A: The only infectious disease and microbiology enrichment panel without a pre-set DME+ database is the Pan-CoV panel. To analyze Pan-CoV enriched data with the DME+ app, select "Custom Panel" under the "Enrichment Panel" drop-down list and specify a custom coronavirus reference sequence database. Alternatively, we recommend using the DRAGEN Targeted Microbial app.

    Q: What does it cost to analyze samples using the DRAGEN Microbial Enrichment Plus app?

    A: A Basic BaseSpace Sequence Hub (BSSH) user account is required to access the DME+ app. However, there is no subscription cost for a Basic BSSH account and no compute cost to run the DME+ app. A Basic BSSH account provides 1 TB of free storage. Additional storage may require iCredits.

    Q: Where do I upload my custom reference FASTA and/or BED file?

    A: Upload these files to a BSSH project before launching the DME+ app. The files can then be selected in the "Select Dataset File(s)" browser in the app. Please see the general guidelines for how to upload data to BaseSpace and reach out to [email protected] with any unresolved upload issues.

    Panel Content & Design

    Q: Is my viral subtype of interest captured by the VSP V2 panel?

    A: See the "Virus Types Captured" column of the "Microorganisms" table in the VSP V2 Panel Summary.

    Q: Was VSP V2 designed using contemporary viral genomes or against traditional reference strains only?

    A: The VSP V2 viral genome sourcing approach aimed to be as inclusive and comprehensive as possible for the 200 targeted human viruses. All viral genomes that passed quality filters and were available as of June 2023 were included in the design, including recombinant and vaccine strains.

    Q: How much of the genome is targeted by the RPIP, UPIP, RVOP/RVEK, VSP, and VSP V2 panels?

    A: The full viral genome is targeted for all RVOP/RVEK, VSP, and VSP V2 viruses. For RPIP viruses, see the "Percent Genome Targeted" column of the "Microorganisms" table in the RPIP Panel Summary. No more than ~1% of bacterial, fungal, and parasitic genomes are targeted by RPIP or UPIP.

    Analysis Options & Settings

    Q: I am using the "Custom panel specification" option and my custom analysis aborted or shows an error. Why?

    A: While there are many possible reasons, one of the most common causes is that the custom database was not formatted correctly. The requirements for the custom reference FASTA and (optional) BED file are:

    • Do not exceed the file size limitation: 10 million bases

    • Do not include duplicate entries

    • Do not use spaces in the file name; instead use an underscore "_"

    See Custom reference FASTA and BED files for further details.
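The requirements above can be checked before upload with a short script. The following is a minimal sketch, not the app's own validator: the function name `check_custom_fasta` is hypothetical, "duplicate entries" is interpreted as repeated sequence names (an assumption), and the extension check reflects the .fasta/.fa requirement stated elsewhere in this FAQ.

```python
def check_custom_fasta(file_name, fasta_text, max_bases=10_000_000):
    """Return a list of problems found; an empty list means no issue was detected."""
    problems = []
    if " " in file_name:
        problems.append("file name contains spaces")
    if not file_name.endswith((".fasta", ".fa")):
        problems.append("file extension is not .fasta or .fa")

    names = []
    total_bases = 0
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Sequence name = text after '>' and before the first whitespace.
            parts = line[1:].split()
            names.append(parts[0] if parts else "")
        else:
            total_bases += len(line)

    if total_bases > max_bases:
        problems.append("more than 10 million bases")
    # Assumption: a duplicate entry is a repeated sequence name.
    if len(set(names)) != len(names):
        problems.append("duplicate sequence names")
    return problems
```

For example, `check_custom_fasta("my ref.fasta", ">a\nACGT\n")` reports the space in the file name, while a well-formed file returns an empty list.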

    Q: I am using the "User-defined specification" option. I am not seeing the microorganisms I expect to be there AND/OR I am seeing microorganisms that I do not want to see.

    A: Ensure that the correct microorganism reporting file was uploaded and used. We recommend saving the updated microorganism reporting file with a new name. Rows with microorganism names that are not of interest can be deleted, but do not add or delete any columns in the provided template. Similarly, do not change or remove any text from the header row. Also, please note that the "kmer_read_count" metric is only valid with the UPIP panel. See Microorganism Reporting File format for further details.

    Q: What read QC (Quality Control) is performed by the DRAGEN Microbial Enrichment Plus app?

    A: If enabled, low-quality bases are trimmed from the ends of each read. After trimming, the read is discarded if fewer than 50% of its bases have a quality score greater than or equal to Q20, the read is shorter than 32 bp, or the read has 5 or more ambiguous bases. It is assumed that appropriate adapter trimming has already been performed.
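The three discard rules can be expressed compactly. This is an illustrative sketch only, applied to a read that has already been end-trimmed; the function name `passes_read_qc` is hypothetical, and treating only "N" as an ambiguous base is an assumption.

```python
def passes_read_qc(sequence, qualities, min_length=32, min_q=20,
                   min_fraction=0.5, max_ambiguous=4):
    """Return True if a trimmed read would be kept under the stated rules.

    sequence  -- base calls after end trimming
    qualities -- Phred quality scores, one per base
    """
    if len(sequence) < min_length:
        return False  # shorter than 32 bp
    # Assumption: only 'N' counts as an ambiguous base.
    if sequence.upper().count("N") > max_ambiguous:
        return False  # 5 or more ambiguous bases
    high_quality = sum(1 for q in qualities if q >= min_q)
    # Keep the read only if at least 50% of bases are Q20 or better.
    return high_quality / len(qualities) >= min_fraction
```

A 40 bp read with 20 bases at Q30 and 20 bases at Q10 is kept (exactly 50% at or above Q20), whereas the same read with only 19 high-quality bases is discarded.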

    Q: What does "Read classification sensitivity" mean in the settings for RVOP/RVEK, VSP, and VSP V2?

    A: This setting is used as a pre-alignment filtering step for all viral whole-genome sequencing (WGS) panels. The default setting of 5 means that if fewer than 5 reads classify to the set of reference sequences belonging to a given virus, that virus will not be reported. Conversely, if 5 or more reads classify to the set of reference sequences belonging to a given virus, read alignment will proceed and alignment-based thresholds will be used to determine whether that virus is reported. The read classification sensitivity can be set as low as 1 or as high as 1000. Lowering the read classification sensitivity threshold below 5 may significantly increase computational run time and is not recommended for most use cases.
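As a rough illustration of this pre-alignment filter, the sketch below keeps only viruses whose classified read count meets the threshold. The function name and the dictionary input are hypothetical; the actual classifier and its data structures are not described in this document.

```python
def viruses_passing_classification(classified_read_counts, sensitivity=5):
    """Return the viruses that proceed to alignment, sorted by name.

    classified_read_counts -- mapping of virus name to number of classified reads
    sensitivity            -- read classification sensitivity (1 to 1000)
    """
    if not 1 <= sensitivity <= 1000:
        raise ValueError("sensitivity must be between 1 and 1000")
    return sorted(
        virus
        for virus, count in classified_read_counts.items()
        if count >= sensitivity
    )
```

With the default of 5, a virus with 4 classified reads is dropped before alignment, while one with exactly 5 proceeds to alignment-based thresholds.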

    Q: When is a Pangolin analysis run?

    A: Pangolin is currently enabled for all enrichment panels except UPIP. For Custom Panel analyses, Pangolin is enabled and will run on custom reference sequences with at least 3% coverage that meet these naming conventions:

    • If only a FASTA file is provided, Pangolin will run on sequences that have a header containing either SARS-CoV-2 or NC_045512

    • If both a FASTA and BED file are provided, Pangolin will run on sequences where the first column (chrom) contains NC_045512 or the fourth column (genomeName) contains SARS-CoV-2
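The naming conventions above can be sketched as a simple eligibility check. This is an assumption-laden illustration rather than the app's code: the function name `pangolin_eligible` is hypothetical, coverage is expressed as a fraction, and matching is assumed to be case-sensitive substring matching.

```python
def pangolin_eligible(coverage_fraction, fasta_header, bed_fields=None):
    """Return True if a custom reference sequence would be passed to Pangolin.

    coverage_fraction -- horizontal coverage as a fraction (0.03 = 3%)
    fasta_header      -- FASTA header text without the leading '>'
    bed_fields        -- list of BED columns for the sequence, or None if no BED file
    """
    if coverage_fraction < 0.03:
        return False
    if bed_fields is None:
        # FASTA only: the header must mention SARS-CoV-2 or NC_045512.
        return "SARS-CoV-2" in fasta_header or "NC_045512" in fasta_header
    chrom = bed_fields[0]
    genome_name = bed_fields[3] if len(bed_fields) > 3 else ""
    # FASTA + BED: check column 1 (chrom) and column 4 (genomeName).
    return "NC_045512" in chrom or "SARS-CoV-2" in genome_name
```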

    Q: When is a Nextclade analysis run?

    A: When enabled, a Nextclade analysis using the specified dataset(s) is run for the following microorganisms, as applicable:

    Microorganism
    Nextclade Dataset
    Type of Nextclade Dataset

    Q: What Internal Control (IC) options are supported and what additional information does using an IC provide?

    A: The RPIP, UPIP, and VSP V2 enrichment panels contain probes targeting commercially available Internal Controls. See the table below for Internal Control options compatible with RPIP, UPIP, and VSP V2. It is recommended to spike each sample prior to extraction with Enterobacteria phage T7 at 1.21 x 10^7 copies/mL of sample.

    Internal Control
    RPIP
    UPIP
    VSP V2
    Process control
    Enrichment factor calculation
    Microorganism absolute quantification*
    Notes

    *Quantitative Internal Control concentration must be provided

    Q: What are the DRAGEN Microbial Enrichment Plus app settings related to consensus sequence generation and variant calling?

    A: See the table below. Consensus sequence bases without aligned read support are indicated by "N" bases.

    Setting
    Value

    Reporting

    Q: I don't see the microbe I'm interested in listed in the reported microorganism summary. Does that mean my microbe of interest is not present?

    A: Not necessarily. The microbe of interest may be present in the sample, but the DME+ app may not have reported it because the detection metrics fell below the default reporting thresholds. If this is suspected, select the "Report microorganisms and/or AMR markers that are below threshold" option. If a given use case requires more sensitive reporting, a user-defined microorganism reporting file can also be used to set thresholds on a microorganism-by-microorganism basis using multiple parameters. See Microorganism Reporting File format for further details.

    Q: What is the default reporting threshold for a microorganism to be "predicted present" and make it into reports?

    A: Multiple parameters are used to determine whether the sequencing data for a given microorganism is sufficient for a positive call. These may include the horizontal coverage, median read depth, normalized read count, average nucleotide identity, etc., of the microorganism and/or other genetically related microorganisms. The default reporting thresholds differ between microorganisms, as microorganisms with close genetic neighbors generally require more stringent reporting thresholds than genetically distinct microorganisms. As with most tests and prediction algorithms, the default reporting thresholds are intended to balance the trade-off between analytical sensitivity and specificity. Should a given use case require more sensitive or specific reporting, a user-defined microorganism reporting file can be specified on a microorganism-by-microorganism basis using multiple parameters. See Microorganism Reporting File format for further details. Additionally, the "Report microorganisms and/or AMR markers that are below threshold" option can be enabled.

    Q: Are low coverage, median depth 0 microorganisms actually in the sample or are they artifacts?

    A: Mathematically, any result with a horizontal coverage of <50% will have a median depth of 0 (50% or more of the nucleotide positions have a depth of 0). Low coverage results could represent true low positives (the most likely reason) or non-specific results, contamination, etc. If maximum confidence is required for a given use case, stricter microorganism reporting thresholds can be specified on a microorganism-by-microorganism basis using multiple parameters. See Microorganism Reporting File format for further details.
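The relationship between horizontal coverage and median depth can be checked with a few lines of Python. The depth values below are made up for illustration, and the helper name `coverage_and_median` is hypothetical.

```python
from statistics import median


def coverage_and_median(depths):
    """Return (horizontal coverage as a fraction, median depth) for per-base depths."""
    covered = sum(1 for depth in depths if depth > 0)
    return covered / len(depths), median(depths)


# 40% of positions covered at 3x: more than half the positions have depth 0,
# so the median depth is 0 regardless of how deep the covered positions are.
low_coverage = [0] * 60 + [3] * 40

# 60% of positions covered at 3x: the median now falls on a covered position.
higher_coverage = [0] * 40 + [3] * 60
```

Here `coverage_and_median(low_coverage)` gives a coverage of 0.4 with a median depth of 0, while `coverage_and_median(higher_coverage)` gives 0.6 with a median depth of 3.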

    Q: What is tiered reporting logic, which viruses are reported as part of a tiered reporting group, and why should I care?

    A: To see which viruses are reported as part of a tiered reporting group, see the "Has Tiered Reporting" and "Reporting Tier" columns of the "Microorganisms" table in the Panel Summary for RPIP, RVOP/RVEK, VSP, and VSP V2. Membership in a tiered reporting group means that a hierarchical relationship is pre-built into the database and the most granular tier level passing reporting thresholds is reported. For example, if Influenza B virus (B/Victoria/2/87-like) or Influenza B virus (B/Yamagata/16/88-like) is reported in a sample, then the less granular Influenza B virus reporting name will NOT be reported. Tiered reporting group membership is especially relevant when specifying a user-defined microorganism reporting file, because the entire tiered reporting group must be included to preserve tiered reporting logic.
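One way to picture tiered reporting is as a filter that suppresses any name with a more granular descendant that also passes thresholds. The sketch below is illustrative only; the function name and the `parent_of` mapping are hypothetical stand-ins for the hierarchy built into the database, which is assumed to contain no cycles.

```python
def apply_tiered_reporting(passing, parent_of):
    """Return the names to report, keeping only the most granular passing tiers.

    passing   -- set of reporting names that pass their reporting thresholds
    parent_of -- mapping of a reporting name to its less granular parent name
    """
    suppressed = set()
    for name in passing:
        # Walk up the hierarchy and suppress every ancestor of a passing name.
        ancestor = parent_of.get(name)
        while ancestor is not None:
            suppressed.add(ancestor)
            ancestor = parent_of.get(ancestor)
    return sorted(name for name in passing if name not in suppressed)


parent_of = {
    "Influenza B virus (B/Victoria/2/87-like)": "Influenza B virus",
    "Influenza B virus (B/Yamagata/16/88-like)": "Influenza B virus",
}
```

If both "Influenza B virus" and "Influenza B virus (B/Victoria/2/87-like)" pass, only the Victoria-like name is returned; if only "Influenza B virus" passes, it is reported as is.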

    Q: How can I evaluate DRAGEN Microbial Enrichment Plus microorganism absolute quantification results?

    A: To evaluate microorganism absolute quantification results, it is recommended to perform experiments using the relevant sample type and full sequencing workflow (including extraction) and to compare results obtained from the DME+ app with those from droplet digital PCR (ddPCR) and/or quantitative PCR (qPCR) assays. A per-microorganism absolute quantification correction factor can be applied to DME+ results as needed.

    Q: I noticed some antimicrobials listed that are not usually used in clinical environments. Is this expected?

    A: Yes. Not all antimicrobials and drug classes that are listed may be relevant. Detected AMR markers may also confer resistance to antimicrobials and drug classes that are not listed. Linkage between bacterial AMR marker, antimicrobial, and drug class is based on the Comprehensive Antibiotic Resistance Database (CARD, version 3.2.8) from McMaster University, ResFinder (version 2.2.1), NCBI Reference Gene Catalog (version 2023-09-26.1), EUCAST expert rules on indicator agents (2019-2023), and CLSI Performance Standards for Antimicrobial Susceptibility Testing (M100 34th Edition). Linkage between viral AMR marker, antimicrobial, and drug class is based on the publications provided in the JSON report; see the PubMed IDs (pmids) field.

    Q: Some of the reported bacterial AMR markers in my sample have an “ESBL” flag, a “Carbapenemase” flag, or both. How are these flags determined?

    A: Extended-spectrum beta-lactamase (ESBL) and Carbapenemase flags are assigned based on the antimicrobials and drug classes associated with each bacterial AMR marker. An ESBL flag is reported if a 3rd, 4th, or 5th generation cephalosporin OR a beta-lactam + beta-lactamase inhibitor combination is contained in the list of associated antimicrobials or drug classes. A Carbapenemase flag is reported if a carbapenem is contained in the list of associated antimicrobials or drug classes. The logic for each of these flags is decoupled, such that a marker can be reported with both flags if the associated antimicrobial or drug class metadata indicates both ESBL and carbapenemase activity.
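The decoupled flag logic can be sketched as two independent membership tests. The label strings below are hypothetical placeholders: this document does not specify the exact vocabulary used in the AMR marker metadata, so treat both sets and the function name `amr_flags` as assumptions.

```python
# Hypothetical labels; the real metadata vocabulary is not given in this document.
ESBL_LABELS = {
    "3rd generation cephalosporin",
    "4th generation cephalosporin",
    "5th generation cephalosporin",
    "beta-lactam + beta-lactamase inhibitor combination",
}
CARBAPENEM_LABELS = {"carbapenem"}


def amr_flags(associated_labels):
    """Return the flags for one marker given its associated antimicrobials/drug classes."""
    labels = {label.lower() for label in associated_labels}
    flags = []
    # The two checks are independent, so a marker can receive both flags.
    if labels & ESBL_LABELS:
        flags.append("ESBL")
    if labels & CARBAPENEM_LABELS:
        flags.append("Carbapenemase")
    return flags
```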

    Results & Output Files

    Q: Most of my reads are untargeted reads. Is enrichment working?

    A: Enrichment is expected to yield 100-1000X more targeted reads, and correspondingly higher sensitivity, than a shotgun (pre-enrichment) library. Even so, for complex samples or samples in which most of the nucleic acid is host or otherwise untargeted, targeted reads will typically still represent a minority of the overall sequencing reads. Notably, RPIP, UPIP, and VSP V2 support various Internal Control options that can be spiked into samples prior to extraction to enable automated calculation of an enrichment factor sample QC metric.

    Q: Is any typing information included for my virus of interest?

    A: See the "Has Tiered Reporting" and "Lineage/Clade Prediction" columns of the "Microorganisms" table in the Panel Summary for RPIP, RVOP/RVEK, VSP, and VSP V2. The consensus sequence and best match reference accession are also provided for RPIP, RVOP/RVEK, VSP, and VSP V2 viruses. It may be possible to infer subtype information from the consensus sequence (e.g., by BLAST) or from the best match reference accession (if annotated in NCBI). The consensus sequence can also be used as input to downstream viral typing tools.

    Q: The % Targeted Microbial Reads is not exactly equal to the sum of microorganism Aligned Read Count values. Why?

    A: The % Targeted Microbial Reads is calculated using a kmer-based classification approach that is intended to give a quick, high-level overview of sample composition. The Aligned Read Count values for microorganisms are calculated in a separate pipeline step using microorganism-specific reference sequence alignment as opposed to broad, categorical, kmer-based classification. Reads that were unclassified or that were classified as low-complexity or ambiguous may actually align to reference sequences. It is also possible for a read to align to a reference sequence of more than one microorganism, for example in a conserved region.

    Q: How can I verify or compare results of the DRAGEN Microbial Enrichment Plus app to previously used apps (such as DRAGEN Targeted Microbial)?

    A: FASTQ files previously run through other apps can be re-analyzed using the DME+ app. Results from other apps may not be identical to results from the DME+ app, most notably because of the expanded databases used in DME+.

    Q: The Reference Coverage section of the HTML report only shows coverage plots for viral genomes. Why doesn't it show the plots for bacterial genomes and/or for viral targeted regions?

    A: Viral genomes are orders of magnitude smaller and thus computationally much "cheaper" to align to than bacterial, fungal, and parasitic genomes. In the case of RVOP/RVEK, VSP, and VSP V2, the full viral genome is targeted for all viruses. For RPIP viruses, see the "Percent Genome Targeted" column of the "Microorganisms" table in the RPIP Panel Summary. While not visualized in the HTML report at this time, the DME+ Report JSON does contain coverage depth vector information for all microorganism targeted regions (viruses, bacteria, fungi, and parasites). See: .targetReport.microorganisms[].condensedDepthVector[], which is the read depth across the targeted microorganism reference sequences, condensed (if needed) into 256 bins.
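To give a feel for what a 256-bin condensed depth vector represents, the sketch below condenses a per-base depth list into at most 256 bins. How the app summarizes each bin is not stated in this document; averaging within each bin is an assumption made here for illustration, and the function name `condense_depths` is hypothetical.

```python
def condense_depths(depths, bins=256):
    """Condense per-base depths into at most `bins` values.

    Vectors that already fit are returned unchanged ("condensed if needed").
    Assumption: each bin is summarized by its mean depth.
    """
    n = len(depths)
    if n <= bins:
        return list(depths)
    condensed = []
    for i in range(bins):
        # Integer arithmetic spreads positions as evenly as possible across bins.
        start = i * n // bins
        end = (i + 1) * n // bins
        chunk = depths[start:end]
        condensed.append(sum(chunk) / len(chunk))
    return condensed
```

A vector produced this way can be plotted directly to approximate a coverage plot for bacterial, fungal, or parasitic targets that are not visualized in the HTML report.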

    • File extension must be .fasta or .fa for custom reference FASTA file and .bed for custom reference BED file

    • If providing a custom reference BED file, the names in the first column of the BED file (chrom) must match the names that appear in the FASTA file (text after > and before the first whitespace character).

    | Microorganism | Nextclade Dataset | Type of Nextclade Dataset |
    | --- | --- | --- |
    | Influenza B virus (B/Yamagata/16/88-like) | Influenza B Yamagata HA (relative to B/Wisconsin/01/2010) | Official |
    | Human respiratory syncytial virus A (HRSV-A) | RSV-A | Official |
    | Human respiratory syncytial virus B (HRSV-B) | RSV-B | Official |
    | Monkeypox virus (MPV) | Mpox virus (All Clades) | Official |
    | Measles virus (MV) | Measles virus N450 (WHO-2012) | Official |
    | Dengue virus (DENV), Dengue virus type 1 (DENV-1), Dengue virus type 2 (DENV-2), Dengue virus type 3 (DENV-3), Dengue virus type 4 (DENV-4) | Dengue virus (All Serotypes) | Official |
    | Human immunodeficiency virus 1 (HIV-1) | HIV-1 (relative to HXB2) | Community |
    | Influenza A virus (H5N1) | Influenza A H5Nx HA (relative to A/Goose/Guangdong/1/96) | Community |
    | Influenza A virus (H5N6) | Influenza A H5Nx HA (relative to A/Goose/Guangdong/1/96) | Community |
    | Influenza A virus (H5N8) | Influenza A H5Nx HA (relative to A/Goose/Guangdong/1/96) | Community |

    Armored RNA Quant Internal Process Control

    X

    X

    X

    X

    X

    Enterobacteria phage T7

    X

    X

    X

    X

    X

    X

    Recommended IC concentration = 1.21 x 10^7 copies/mL of sample

    Escherichia virus MS2

    X

    X

    X

    X

    X

    X

    Escherichia virus Qbeta

    X

    X

    X

    X

    X

    Escherichia virus T4

    X

    X

    X

    X

    X

    X

    Imtechella halotolerans

    X

    X

    X

    X

    X

    Phocid alphaherpesvirus

    X

    X

    X

    X

    X

    Phocine morbillivirus

    X

    X

    X

    X

    X

    Truepera radiovictrix

    X

    X

    X

    X

    X

    | Microorganism | Nextclade Dataset | Type of Nextclade Dataset |
    | --- | --- | --- |
    | Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) | SARS-CoV-2 (relative to Wuhan-Hu-1/2019) | Official |
    | Influenza A virus (H1N1) | Influenza A H1N1pdm HA (relative to A/Wisconsin/588/2019) & Influenza A H1N1pdm NA (relative to A/Wisconsin/588/2019) | Official |
    | Influenza A virus (H3N2) | Influenza A H3N2 HA (relative to A/Darwin/6/2021) & Influenza A H3N2 NA (relative to A/Darwin/6/2021) | Official |
    | Influenza B virus (B/Victoria/2/87-like) | Influenza B Victoria HA (relative to B/Brisbane/60/2008) | Official |

    Allobacillus halotolerans

    X

    X

    X

    X

    X

    | Setting | Value |
    | --- | --- |
    | Read de-duplication | Not performed |
    | Depth threshold for consensus sequence generation | 1x |
    | Depth threshold for variant calling | 5x |
    | Minimum minor allele frequency | 20% |



    Report JSON format

    The DRAGEN Microbial Enrichment Plus app outputs a comprehensive sample-level report.json file containing general metadata, version information, sample QC, microorganism, and AMR marker results, as well as detailed test information. The additional convenience file formats generated by the DRAGEN Microbial Enrichment Plus app do not contain novel content.

    (*) indicates results generated by the application layer as opposed to the DRAGEN secondary analysis pipeline

    Top-Level Node

    The top-level section of the report JSON contains general metadata and version information.

    Field
    Description

    .qcReport.sampleQc Node

    This section contains information about sample quality control (QC). The fields are relative to .qcReport.sampleQc

    Field
    Description

    .qcReport.enrichmentFactor Node

    This section contains information about the enrichment factor calculation and is relevant to RPIP, UPIP, and VSP V2 only. Detection of an appropriate Internal Control is required. The fields are relative to .qcReport.enrichmentFactor

    Field
    Description

    .qcReport.sampleComposition Node

    This section contains information about the composition of the sample and is provided for RPIP, UPIP, and VSP V2 only. The fields are relative to .qcReport.sampleComposition

    Field
    Description

    .qcReport.internalControls Node

    This section contains information about internal control detection and is relevant to RPIP, UPIP, and VSP V2 only. The value of the .qcReport.internalControls field is an array of objects containing name and RPKM information for each Internal Control. See the code block below for an example:
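As a minimal sketch of how this array might be read from a parsed report, consider the following. The key names `name` and `rpkm` and the values shown are placeholders chosen for illustration; they are assumptions based on the description above, not a verbatim copy of the app's output.

```python
import json

# Made-up example content; keys and values are illustrative assumptions.
example_report_text = """
{
  "qcReport": {
    "internalControls": [
      {"name": "Enterobacteria phage T7", "rpkm": 1234.5},
      {"name": "Escherichia virus MS2", "rpkm": 0.0}
    ]
  }
}
"""


def internal_control_rpkms(report):
    """Return a mapping of Internal Control name to RPKM from a parsed report."""
    controls = report.get("qcReport", {}).get("internalControls", [])
    return {control["name"]: control["rpkm"] for control in controls}


example_report = json.loads(example_report_text)
rpkm_by_control = internal_control_rpkms(example_report)
```

Using `.get()` with defaults keeps the helper from failing on panels where the internal control section is absent.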

    .userOptions Node

    This section gives information about analysis options specified by the user. The fields are relative to .userOptions

    Field
    Description

    .targetReport.microorganisms[] Node

    The value of the .targetReport.microorganisms[] field is an array of objects containing information about detected microorganisms. The following table describes one .targetReport.microorganisms[] object. The fields are relative to .targetReport.microorganisms[]

    Field
    Description

    .targetReport.microorganisms[].predictionInformation[].relatedMicroorganisms[] Node

    The value of the .targetReport.microorganisms[].predictionInformation[].relatedMicroorganisms[] field is an array of objects containing information about genetically related microorganisms. The following table describes one .targetReport.microorganisms[].predictionInformation[].relatedMicroorganisms[] object. The fields are relative to .targetReport.microorganisms[].predictionInformation[].relatedMicroorganisms[]

    Field
    Description

    .targetReport.microorganisms[].variants[] Node

    The value of the .targetReport.microorganisms[].variants[] field is an array of objects containing information about viral variants. Variants are reported for all RVOP/RVEK, VSP, and VSP V2 viruses; for RPIP, they are reported for SARS-CoV-2 and influenza A/B/C viruses only. The following table describes one .targetReport.microorganisms[].variants[] object. The fields are relative to .targetReport.microorganisms[].variants[]

    Field
    Description

    .targetReport.microorganisms[].pangoLineage[] Node

    The value of the .targetReport.microorganisms[].pangoLineage[] field is an array of objects containing information about SARS-CoV-2 Pango lineage prediction results. The following table describes one .targetReport.microorganisms[].pangoLineage[] object. The fields are relative to .targetReport.microorganisms[].pangoLineage[].

    .targetReport.microorganisms[].nextclade[] Node

    The value of the .targetReport.microorganisms[].nextclade[] field is an array of objects containing information about viral clade assignment results for applicable viruses. The following table describes one .targetReport.microorganisms[].nextclade[] object. The fields are relative to .targetReport.microorganisms[].nextclade[].

    .targetReport.amrMarkers[] Node

    The value of the .targetReport.amrMarkers[] field is an array of objects containing information about detected bacterial AMR markers. The following table describes one .targetReport.amrMarkers[] object. The fields are relative to .targetReport.amrMarkers[]

    Field
    Description

    .targetReport.amrMarkers[].variants[] Node

    The value of the .targetReport.amrMarkers[].variants[] field is an array of objects containing information about variants for bacterial AMR markers with "protein variant" or "rRNA variant" model types. The following table describes one .targetReport.amrMarkers[].variants[] object. The fields are relative to .targetReport.amrMarkers[].variants[]

    Field
    Description

    .targetReport.customReferences[] Node

    This section contains information about custom reference detection results and is only present for custom database analyses. When only a custom reference FASTA file is provided (no BED file), each .targetReport.customReferences[] object contains information for a single reference sequence. When both a FASTA and BED file are provided, each .targetReport.customReferences[] object contains information for a single genome/microorganism, which can be a collection of one or more reference sequences. The fields are relative to .targetReport.customReferences[]

    Field
    Description

    .targetReport.customReferences[].consensusSequences[] Node

    The value of the .targetReport.customReferences[].consensusSequences[] field is an array of objects containing majority consensus sequence information for a single custom reference sequence. When only a FASTA file is provided (no BED file), there will be only one object in the array. When both a FASTA and BED file are provided, there may be more than one object in the array. The fields are relative to .targetReport.customReferences[].consensusSequences[]

    Field
    Description

    .targetReport.customReferences[].variants[] Node

    The value of the .targetReport.customReferences[].variants[] field is an array of objects containing information about a single detected variant. The fields are relative to .targetReport.customReferences[].variants[]

    Field
    Description

    .targetReport.customReferences[].pangoLineage[] Node

    The value of the .targetReport.customReferences[].pangoLineage[] field is an array of objects containing information about SARS-CoV-2 Pango lineage prediction results. The following table describes one .targetReport.customReferences[].pangoLineage[] object. The fields are relative to .targetReport.customReferences[].pangoLineage[]

    .additionalInformation[] Node

    The value of the .additionalInformation[] field is an array of objects containing additional information about the test and data analysis solution. The fields are relative to .additionalInformation[]

    Field
    Description

    | Field | Description |
    | --- | --- |
    | .postQualityReads | Number of reads in sample after read QC processing, inclusive of any duplicate reads |
    | .postQualityReadsProportion | Proportion of post-quality reads in sample relative to total raw reads |
    | .removedInDehostingReads | Number of host reads in sample removed during dehosting (host = human) |
    | .removedInDehostingReadsProportion | Proportion of host reads in sample removed relative to total raw reads (host = human) |
    | .entropy | Shannon entropy of the counts of 5-mers in the reads after read QC processing, which is a measure of randomness |
    | .gContent | Proportion of guanine (G) base calls in reads after read QC processing |
    | .libraryQScore | Quality score of the library after read QC processing |
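To make the .entropy metric more concrete, the sketch below computes the Shannon entropy of 5-mer counts across a set of reads. It illustrates the general technique only; the logarithm base (bits are used here), the handling of overlapping k-mers, and the function name `kmer_entropy` are assumptions, since the app's exact calculation is not described.

```python
from collections import Counter
from math import log2


def kmer_entropy(reads, k=5):
    """Return the Shannon entropy (in bits) of k-mer counts across all reads."""
    counts = Counter(
        read[i:i + k]
        for read in reads
        for i in range(len(read) - k + 1)  # reads shorter than k contribute nothing
    )
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in counts.values())
```

A homopolymer read such as "AAAAAAAA" contains a single distinct 5-mer and has an entropy of 0, while more diverse reads give higher values.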

    | Field | Description |
    | --- | --- |
    | .readClassification.lowComplexity | Low complexity |
    | .targetedMicrobial | Proportion of post-quality targeted microbial reads classified to the following sub-categories: |
    | .targetedMicrobial.viral | Viral targeted |
    | .targetedMicrobial.bacterial | Bacterial targeted |
    | .targetedMicrobial.fungal | Fungal targeted |
    | .targetedMicrobial.parasitic | Parasitic targeted |
    | .targetedMicrobial.bacterialAmr | Bacterial AMR targeted |
    | .untargeted | Proportion of post-quality untargeted reads classified to the following sub-categories: |
    | .untargeted.viral | Viral untargeted |
    | .untargeted.bacterial | Bacterial untargeted |
    | .untargeted.fungal | Fungal untargeted |
    | .untargeted.parasitic | Parasitic untargeted |
    | .untargeted.bacterialAmr | Bacterial AMR untargeted |
    | .untargeted.internalControl | Internal Control untargeted |
    | .untargeted.human | Human untargeted |
    | .viral | Proportion of post-quality viral reads classified to the following categories: |
    | .viral.targeted | Viral targeted |
    | .viral.untargeted | Viral untargeted |
    | .viral.untargetedSubcategories | Proportion of post-quality viral untargeted reads classified to the following sub-categories: |
    | .viral.untargetedSubcategories.panel | Viral panel members |
    | .viral.untargetedSubcategories.phage | Viral phage |
    | .viral.untargetedSubcategories.other | Viral other (not a panel member or phage) |
    | .bacterial | Proportion of post-quality bacterial reads classified to the following categories: |
    | .bacterial.targeted | Bacterial targeted |
    | .bacterial.untargeted | Bacterial untargeted |
    | .bacterial.untargetedSubcategories | Proportion of post-quality bacterial untargeted reads classified to the following sub-categories: |
    | .bacterial.untargetedSubcategories.panel | Bacterial panel members |
    | .bacterial.untargetedSubcategories.ribosomalDna | Bacterial ribosomal DNA (16S) |
    | .bacterial.untargetedSubcategories.plasmid | Bacterial plasmids |
    | .bacterial.untargetedSubcategories.other | Bacterial other (not a panel member, ribosomal DNA, or plasmid) |
    | .fungal | Proportion of post-quality fungal reads classified to the following categories: |
    | .fungal.targeted | Fungal targeted |
    | .fungal.untargeted | Fungal untargeted |
    | .fungal.untargetedSubcategories | Proportion of post-quality fungal untargeted reads classified to the following sub-categories: |
    | .fungal.untargetedSubcategories.panel | Fungal panel members |
    | .fungal.untargetedSubcategories.ribosomalDna | Fungal ribosomal DNA (18S) |
    | .fungal.untargetedSubcategories.other | Fungal other (not a panel member or ribosomal DNA) |
    | .parasitic | Proportion of post-quality parasitic reads classified to the following categories: |
    | .parasitic.targeted | Parasitic targeted |
    | .parasitic.untargeted | Parasitic untargeted |
    | .parasitic.untargetedSubcategories | Proportion of post-quality parasitic untargeted reads classified to the following sub-categories: |
    | .parasitic.untargetedSubcategories.panel | Parasitic panel members |
    | .parasitic.untargetedSubcategories.ribosomalDna | Parasitic ribosomal DNA (18S) |
    | .parasitic.untargetedSubcategories.other | Parasitic other (not a panel member or ribosomal DNA) |
    | .human | Proportion of post-quality human reads classified to the following categories: |
    | .human.untargeted | Human untargeted |
    | .human.untargetedSubcategories | Proportion of post-quality human untargeted reads classified to the following sub-categories: |
    | .human.untargetedSubcategories.ribosomalDna | Human ribosomal DNA |
    | .human.untargetedSubcategories.codingSequence | Human coding sequence |
    | .human.untargetedSubcategories.other | Human other (not ribosomal DNA or coding sequence) |
    | .internalControl | Proportion of post-quality Internal Control reads classified to the following categories: |
    | .internalControl.targeted | Internal Control targeted |
    | .internalControl.untargeted | Internal Control untargeted |
    | .microbialAndInternalControl | Proportion of post-quality Microbial and Internal Control reads classified to the following categories: |
    | .microbialAndInternalControl.targeted | Microbial and Internal Control targeted |
    | .microbialAndInternalControl.untargeted | Microbial and Internal Control untargeted |
    | .bacterialAmr | Proportion of post-quality bacterial AMR reads classified to the following categories: |
    | .bacterialAmr.targeted | Bacterial AMR targeted |
    | .bacterialAmr.untargeted | Bacterial AMR untargeted |

    | Field | Description |
    | --- | --- |
    | .belowThresholdEnabled* | Boolean indicating if microorganisms and/or AMR markers below detection thresholds are reported |
    | .bacterialAmrMarkersOnly* | (RPIP, UPIP only) Boolean indicating if only bacterial AMR markers are reported |
    | .bacterialAmrMarkerMicroorganismRequired* | (RPIP, UPIP only) Boolean indicating if bacterial AMR markers are reported only when an associated microorganism is reported |
    | .preDefinedMicroorganismReportingList* | (RPIP, UPIP only) Pre-defined microorganism reporting list, if specified |
    | .userDefinedMicroorganismReportingListUsed* | Boolean indicating if a user-defined microorganism reporting file is specified |
    | .userDefinedMicroorganismReportingListFile* | Name of the user-defined microorganism reporting file, if specified |
    | .providedAnalysisName* | User-provided analysis name |

    .rpkm

    Normalized representation of the number of sample sequencing reads aligned to targeted microorganism reference sequences (targeted Reads mapped Per Kilobase of targeted sequence per Million quality-filtered reads)

    .alignedReadCount

    Number of sample sequencing reads that aligned to targeted microorganism reference sequences

    .kmerReadCount

    (UPIP only) Number of sample sequencing reads classified to targeted microorganism reference sequences

    .absoluteQuantityRatio

    Numerical absolute quantification value. Quantitative internal control required for calculation

    .absoluteQuantityRatioFormatted

    Formatted absolute quantification value with units. Quantitative internal control required for calculation

    .phenotypicGroup

    (RPIP, UPIP only) Grouping indicating general association with normal flora, colonization, or contamination from the environment or other sources, as well as general association with disease

    .associatedAmrMarkers

    (Bacteria only) Information about the bacterial AMR markers associated with the microorganism

    .associatedAmrMarkers.applicable

    Boolean indicating whether one or more bacterial AMR markers are associated with the microorganism

    .associatedAmrMarkers.detected

    List of detected bacterial AMR markers associated with the microorganism

    .associatedAmrMarkers.predicted

    List of predicted bacterial AMR markers associated with the microorganism

    .consensusGenomeSequences

    (RPIP, RVOP/RVEK, VSP, VSP V2 viruses only) Information about the majority consensus genome (or segment) sequence

    .consensusGenomeSequences.sequence

    Consensus genome (or segment) sequence bases

    .consensusGenomeSequences.referenceAccession

    Accession of the reference genome (or segment) sequence

    .consensusGenomeSequences.referenceDescription

    Description of the reference genome (or segment) sequence

    .consensusGenomeSequences.referenceLength

    Length of the reference genome (or segment) sequence

    .consensusGenomeSequences.maximumAlignmentLength

    Longest contiguous alignment between consensus sequence and reference genome (or segment) sequence

    .consensusGenomeSequences.maximumGapLength

    Longest contiguous alignment gap (insertion or deletion) between consensus sequence and reference genome (or segment) sequence

    .consensusGenomeSequences.maximumUnalignedLength

    Longest section of the reference genome (or segment) sequence not aligned to by consensus sequence

    .consensusGenomeSequences.coverage

    Proportion of reference genome (or segment) sequence bases that appear in sample sequencing reads

    .consensusGenomeSequences.ani

    Average nucleotide identity of consensus sequence to reference genome (or segment) sequence

    .consensusGenomeSequences.alignedReadCount

    Number of sample sequencing reads that aligned to reference genome (or segment) sequence

    .consensusGenomeSequences.medianDepth

    Median depth of sample sequencing reads aligned to reference genome (or segment) sequence, indicating the median number of times each reference genome (or segment) sequence base appears in sample sequencing reads

    .consensusGenomeSequences.targetAnnotation

    List of targeted region annotations for the reference genome (or segment) sequence. Each annotation is a JSON object with the following fields: start (int), end (int), strand (string: "+", "-"), target_name (string), type (string)

    .consensusGenomeSequences.condensedDepthVector

    Read depth across the reference genome (or segment) sequence, condensed to 256 bins

    .consensusTargetSequences

    (RPIP viruses only) Information about the majority targeted region consensus sequences

    .consensusTargetSequences.sequence

    Consensus targeted region sequence bases

    .consensusTargetSequences.name

    Name of the targeted region

    .consensusTargetSequences.referenceAccession

    Accession of the targeted region reference sequence

    .consensusTargetSequences.depthVector

    Read depth across the targeted region reference sequence, not condensed

    .consensusTargetSequences.scaledDepthVector*

    Read depth across the targeted region reference sequence, condensed and scaled such that the longest targeted region for the microorganism has a maximum length of 256 bins

    .predictionInformation

    Information about microorganism prediction results

    .predictionInformation.predictedPresent

    Boolean indicating whether the microorganism passed its reporting logic algorithm

    .predictionInformation.notes

    List of notes about the prediction result

    .predictionInformation.subpanels

    List of pre-defined subpanels that the microorganism belongs to

    .predictionInformation.relatedMicroorganisms

    Array of objects with information about genetically related microorganisms. See below for details

    .predictionInformation.userDefined*

    User-defined reporting prediction logic for microorganism, if specified

    .variants

    (all RVOP/RVEK, VSP, and VSP V2 viruses, RPIP: SARS-CoV-2 & FluA/B/C only) Information about viral variants. See below for details

    .comments*

    List of additional information regarding the microorganism

    .abundance*

    Relative abundance of the microorganism within the microorganism class

    .pangoLineage*

    (SARS-CoV-2 only) Information about SARS-CoV-2 Pango lineage prediction results. See below for details

    .nextclade*

    (applicable viruses only) Information about viral clade assignment results. See below for details

    .potentialAmrDetected*

    (Bacteria only) Potential AMR detection flag for microorganism. Can be "Yes", "Not Detected", or "n/a"

    .potentialAmrPredicted*

    (Bacteria only) Potential AMR prediction flag for microorganism. Can be "Yes", "Not Predicted", or "n/a"

    .flags*

    (Bacteria only) Flag for potential resistance to an important drug class ("Potential ESBL", "Potential Carbapenemase")

    .intrinsicResistance*

    (Bacteria only) List of antimicrobials to which the reported bacterium is intrinsically resistant, based on CLSI Performance Standards for Antimicrobial Susceptibility Testing, M100 34th Edition, Appendix B

    .intrinsicResistanceDrugClasses*

    (Bacteria only) List of drug classes to which the reported bacterium is intrinsically resistant, based on CLSI Performance Standards for Antimicrobial Susceptibility Testing, M100 34th Edition, Appendix B

    .depth

    Variant depth, indicating the number of times variant position appears in sample sequencing reads

    .alleleFrequency

    Frequency of variant allele in sample sequencing reads

    .category*

    Variant category ("Viral Variant; Known AMR", "Viral Variant")

    .comments*

    List of additional information regarding the variant

    .gene*

    (SARS-CoV-2, Flu A/B/C only) Gene name

    .product*

    Protein product of gene

    .annotation*

    Type of change (e.g., "Nonsynonymous Variant")

    .aachange*

    Amino acid change associated with variant

    .epistaticGroups*

    List of epistatic groups variant is associated with

    .standardNomenclatureEpistaticGroups*

    (Flu A/B only) List of epistatic groups variant is associated with using standard nomenclature coordinates

    .standardNomenclatureAaChange*

    (Flu A/B only) Amino acid change associated with variant using standard nomenclature coordinates

    .standardNomenclatureAccession*

    (Flu A/B only) NCBI accession of the reference sequence used to establish standard nomenclature coordinates

    .drugClasses*

    List of drug classes variant is predicted to confer resistance to

    .representativeAntimicrobials*

    List of representative antimicrobials variant is predicted to confer resistance to

    .inhibitionLevel*

    (Flu A/B only) Level of inhibition per cited publications (see pmids)

    .pmids*

    PubMed IDs of publications associated with variant

    Total number of detected nucleotide substitutions

    .totalNonACGTNs*

    Total number of detected ambiguous nucleotides (nucleotide characters that are not A, C, G, T, N)

    .totalMissing*

    Total number of detected missing nucleotides (nucleotide character N)

    .coverage*

    Proportion of consensus genome (or segment) sequence bases that aligned to reference accession

    .totalInsertions*

    Total number of inserted nucleotide bases

    .totalFrameShifts*

    Total number of detected frame shifts

    .stopCodons*

    Total number of detected stop codons

    .version*

    Version of the Nextclade software

    .referenceAccession

    Accession of the bacterial AMR marker reference sequence

    .coverage

    Proportion of bacterial AMR marker reference sequence residues that appear in sample sequencing reads (protein alignment for "homolog" and "protein variant" model types; nucleotide alignment for "rRNA variant" model type)

    .pid

    Percent identity of consensus sequence aligned to bacterial AMR marker reference sequence (protein alignment for "homolog" and "protein variant" model types; nucleotide alignment for "rRNA variant" model type)

    .medianDepth

    Median depth of sample sequencing reads aligned to bacterial AMR marker reference sequence, indicating the median number of times each bacterial AMR marker sequence residue appears in sample sequencing reads (protein alignment for "homolog" and "protein variant" model types; nucleotide alignment for "rRNA variant" model type)

    .rpkm

    Normalized representation of the number of sample sequencing reads aligned to bacterial AMR reference sequence (protein alignment for "homolog" and "protein variant" model types; nucleotide alignment for "rRNA variant" model type)

    .alignedReadCount

    Number of sample sequencing reads that aligned to bacterial AMR reference sequence (protein alignment for "homolog" and "protein variant" model types; nucleotide alignment for "rRNA variant" model type)

    .nucleotideConsensusSequence

    Nucleotide consensus sequence bases

    .proteinConsensusSequence

    Protein consensus sequence residues

    .nucleotideDepthVector

    Read depth across the bacterial AMR marker nucleotide reference sequence, not condensed

    .proteinDepthVector

    Read depth across the bacterial AMR marker protein reference sequence, not condensed

    .associatedMicroorganisms

    Information about the microorganisms associated with the bacterial AMR marker

    .associatedMicroorganisms.all

    List of all microorganisms associated with the bacterial AMR marker

    .associatedMicroorganisms.detected

    List of detected microorganisms associated with the bacterial AMR marker

    .associatedMicroorganisms.predicted

    List of predicted microorganisms associated with the bacterial AMR marker

    .predictionInformation

    Information about bacterial AMR marker prediction results

    .predictionInformation.predictedPresent

    Boolean indicating whether the bacterial AMR marker passed its reporting logic algorithm

    .predictionInformation.confidence

    Confidence level of bacterial AMR marker prediction ("high", "medium", "low")

    .predictionInformation.notes

    List of notes about the prediction result

    .flags*

    Flag indicating AMR marker is an extended-spectrum beta-lactamase (ESBL) or carbapenemase (Carbapenemase)

    .representativeAntimicrobials*

    List of representative antimicrobials the AMR marker is predicted to confer resistance to

    .drugClasses*

    List of drug classes the AMR marker is predicted to confer resistance to
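The `.predictionInformation.confidence` levels can be illustrated with a minimal sketch based on the thresholds given in the UPIP interpretive data later in this document (High: 100% protein coverage and 100% PID; Medium: ≥90% and ≥90%; Low: ≥60% coverage and ≥80% PID). The function name `amr_confidence` and the choice to pass both values as percentages are assumptions made for illustration. The app's actual implementation is not described in the source, and it is also an assumption that the same thresholds apply to every panel.

```python
from typing import Optional


def amr_confidence(coverage_pct: float, pid_pct: float) -> Optional[str]:
    """Map protein coverage and percent identity (both in percent) to a confidence level.

    Hypothetical helper: thresholds are copied from the interpretive data text,
    checked from strictest to most permissive.
    """
    if coverage_pct >= 100 and pid_pct >= 100:
        return "high"
    if coverage_pct >= 90 and pid_pct >= 90:
        return "medium"
    if coverage_pct >= 60 and pid_pct >= 80:
        return "low"
    # Below every documented threshold: no confidence level applies
    return None


print(amr_confidence(100, 100))
print(amr_confidence(95, 99))
print(amr_confidence(70, 85))
```

Checking the strictest rule first keeps the levels mutually exclusive, so a marker with full coverage but only 85% PID falls through to "low" rather than "high".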

    .referenceAllele

    Reference allele at variant position

    .variantAllele

    Variant allele

    .depth

    Variant depth, indicating the number of times variant position appears in sample sequencing reads

    .alleleFrequency

    Frequency of variant allele in sample sequencing reads

    .annotation

    Type of change (e.g. "Nonsynonymous Variant")

    .aaChange

    Amino acid change associated with variant

    .epistaticGroups

    List of epistatic groups variant is associated with

    .representativeAntimicrobials*

    List of representative antimicrobials variant is predicted to confer resistance to

    .drugClasses*

    List of drug classes variant is predicted to confer resistance to

    .confidenceLevel*

    (MTB only) Confidence level is given for Mycobacterium tuberculosis variants if provided by the WHO Catalogue of mutations in Mycobacterium tuberculosis (Final Grading Confidence; for rpoB only), or the Comprehensive Antibiotic Resistance Database (CARD), as part of a confidence model for AMR developed by the Relational Sequencing Tuberculosis Data Platform (ReSeqTB)

    .pmids*

    PubMed IDs of publications associated with variant

    .alignedReadCount

    Number of sample sequencing reads that aligned to custom reference sequence or, if specified, collection of one or more custom reference sequences

    .consensusSequences

    Array of objects with information about each consensus sequence. See below for details

    .variants

    Array of objects with information about variants detected in custom reference sequence or, if specified, collection of one or more custom reference sequences. See below for details

    .pangoLineage*

    Array of objects with information about SARS-CoV-2 Pango lineage prediction results. See below for details

    .medianDepth

    Median depth of sample sequencing reads aligned to custom reference sequence, indicating the median number of times each custom reference sequence base appears in sample sequencing reads

    .depthVector

    Read depth across custom reference sequence, not condensed

    .alignedReadCount

    Number of sample sequencing reads that aligned to custom reference sequence

    .maximumAlignmentLength

    Longest contiguous alignment between consensus sequence and custom reference sequence

    .maximumGapLength

    Longest contiguous alignment gap (insertion or deletion) between consensus sequence and custom reference sequence

    .maximumUnalignedLength

    Longest section of custom reference sequence not aligned to by consensus sequence

    .alleleFrequency

    Frequency of variant allele in sample sequencing reads

    .accession

    Identifier used for the sample

    .deploymentEnvironment

    Environment in which the results were produced

    .batchId

    Identifier used for the batch of samples processed together

    .analysisId

    Identifier used for the analysis

    .runId

    Identifier used for the sequencing run

    .controlFlag

    Indicates whether the sample is a control. It is set to “POS” if the substring “PosCon” is found in the sample name, “NEG” if the substring “NegCon” is found, or “BLANK” if the substring “controlBlk” is found. Otherwise, it is set to “-”

    .dragenVersion

    DRAGEN release version

    .analysisPipelineVersion

    Analysis Pipeline release version

    .testType

    Type of test panel ("RPIP", "UPIP", "RVOP", "VSPv1", "VSPv2", "Custom")

    .testVersion

    Test panel release version

    .testName

    Full name of test panel

    .testUse

    Test use. "For Research Use Only. Not for use in diagnostic procedures"

    .reportTime

    Date and time the report was generated

    .warnings

    List of warnings encountered during the analysis

    .errors

    List of errors encountered during the analysis

    .results*

    High-level result: “One or more potential pathogens predicted” or “No potential pathogens predicted”

    .appVersion*

    DRAGEN Microbial Enrichment Plus application release version
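The `.controlFlag` rule described above can be sketched as a simple substring check on the sample name. This is an illustrative sketch only: the function name `control_flag` is hypothetical, and both the case-sensitive matching and the order in which the substrings are tested are assumptions, since the source does not say which rule wins when a name contains more than one substring.

```python
def control_flag(sample_name: str) -> str:
    """Derive a control flag from a sample name using the documented substrings."""
    # Assumption: matching is case-sensitive and checked in this order
    if "PosCon" in sample_name:
        return "POS"
    if "NegCon" in sample_name:
        return "NEG"
    if "controlBlk" in sample_name:
        return "BLANK"
    # Not a control sample
    return "-"


for name in ["Run1_PosCon_A", "Run1_NegCon_B", "controlBlk_01", "Patient_42"]:
    print(name, control_flag(name))
```

Because the flag depends only on the sample name, naming control samples consistently before the run is what makes them identifiable in the report.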

    .totalRawBases

    Number of base pairs in sample before read QC processing

    .totalRawReads

    Number of reads in sample before read QC processing

    .uniqueReads

    Number of distinct reads in sample before read QC processing

    .uniqueReadsProportion

    Proportion of distinct reads in sample before read QC processing

    .preQualityMeanReadLength

    Average read length before read QC processing

    .postQualityMeanReadLength

    Average read length after read QC processing

    .value

    Enrichment factor value reflecting how well targeted regions were enriched

    .category

    Enrichment factor category: "poor", "fair", "good", or "not calculated"

    .readClassification

    Proportion of post-quality reads classified to the following categories:

    .readClassification.targetedMicrobial

    Targeted microbial

    .readClassification.targetedInternalControl

    Targeted Internal Control

    .readClassification.untargeted

    Untargeted

    .readClassification.ambiguous

    More than one category

    .readClassification.unclassified

    No category

    .quantitativeInternalControlName

    Quantitative Internal Control used for microorganism absolute quantification (recommendation: Enterobacteria phage T7)

    .quantitativeInternalControlConcentration

    Quantitative Internal Control concentration (recommendation: 1.21 x 10^7 copies/mL of sample)

    .readQcEnabled

    Boolean indicating if read QC (trimming and filtering based on quality and read length) is enabled

    .readClassificationSensitivity

    (RVOP/RVEK, VSP, VSP V2 only) Sensitivity threshold for classifying reads. Determines whether alignment should proceed for a microorganism and/or reference sequence. Value is an integer with a valid range of 1 to 1000, inclusive

    .customPanelFastaFile

    (Custom Panel only) Name of the custom reference FASTA file

    .customPanelBedFile

    (Custom Panel only) Name of the custom reference BED file

    .class

    Microorganism class ("viral", "bacterial", "fungal", "parasite")

    .name

    Name of microorganism

    .coverage

    Proportion of targeted microorganism reference sequence bases that appear in sample sequencing reads

    .ani

    Average nucleotide identity of consensus sequence to targeted microorganism reference sequences

    .medianDepth

    Median depth of sample sequencing reads aligned to targeted microorganism reference sequences, indicating the median number of times each targeted microorganism reference sequence base appears in sample sequencing reads

    .condensedDepthVector

    Read depth across the targeted microorganism reference sequences, condensed to 256 bins

    .name

    Name of related microorganism

    .onPanel

    Boolean indicating whether the related microorganism is a panel member

    .kmerReadCount

    (UPIP only) Number of sample sequencing reads classified to related microorganism reference sequences

    .coverage

    Proportion of related microorganism reference sequence bases that appear in sample sequencing reads

    .ani

    Average nucleotide identity of consensus sequence to related microorganism reference sequences

    .alignedReadCount

    Number of sample sequencing reads that aligned to related microorganism reference sequences

    .referenceAccession

    Accession of reference genome (or segment) sequence used for variant calling

    .segment

    (Segmented viruses only) Segment number of reference segment sequence

    .ntChange

    Nucleotide change associated with variant

    .referencePosition

    Variant position in viral reference genome (or segment) sequence

    .referenceAllele

    Reference allele at variant position

    .variantAllele

    Variant allele

    Field

    Description [Source]

    .lineage*

    From Pangolin: "The most likely lineage assigned to a given sequence based on the inference engine used and the SARS-CoV-2 diversity designated. This assignment may be sensitive to missing data at key sites"

    .conflict*

    From Pangolin: "In the pangoLEARN model, a given sequence gets assigned to the most likely category based on known diversity. If a sequence can fit into more than one category, the conflict score will be greater than 0 and reflect the number of categories the sequence could fit into. If the conflict score is 0, this means that within the current decision tree there is only one category that the sequence could be assigned to"

    .ambiguityScore*

    From Pangolin: "This score is a function of the quantity of missing data in a sequence. It represents the proportion of relevant sites in a sequence which were imputed to the reference values. A score of 1 indicates that no sites were imputed, while a score of 0 indicates that more sites were imputed than were not imputed. This score only includes sites which are used by the decision tree to classify a sequence"

    .version*

    Version of the PUSHER database

    .pangolinVersion*

    Version of the Pangolin software

    Field

    Description [Source]

    .sequenceName*

    Name of the sequence

    .referenceAccession*

    Reference accession

    .overallStatus*

    Overall quality control status

    .clade*

    Assigned clade

    .pangoLineage*

    Pango lineage assigned by Nextclade

    .cladeWho*

    World Health Organization (WHO) nomenclature

    .class

    Microorganism class ("bacterial")

    .cardModelType

    Bacterial AMR marker model type in the Comprehensive Antibiotic Resistance Database (CARD) ("homolog", "protein variant", "rRNA variant")

    .cardGeneFamily

    Bacterial AMR marker gene family in the Comprehensive Antibiotic Resistance Database (CARD)

    .name

    Bacterial AMR marker name

    .cardName

    Bacterial AMR marker name in the Comprehensive Antibiotic Resistance Database (CARD)

    .ncbiName

    Bacterial AMR marker name in the National Center for Biotechnology Information (NCBI) Reference Gene Catalog

    .category

    Variant category ("Bacterial Variant; Known AMR")

    .referenceSourceMicroorganism

    Microorganism that reference sequence is associated with in NCBI

    .comments

    List of additional information regarding the variant

    .product

    Protein product of gene

    .ntChange

    Nucleotide change associated with variant

    .referencePosition

    Variant position in reference sequence

    .name

    Provided name of custom reference sequence, accession, genome, or microorganism

    .coverage

    Proportion of custom reference sequence bases that appear in sample sequencing reads

    .ani

    Average nucleotide identity of consensus sequence to custom reference sequence or, if specified, collection of one or more custom reference sequences

    .medianDepth

    Median depth of sample sequencing reads aligned to custom reference sequence or, if specified, collection of one or more custom reference sequences, indicating the median number of times each custom reference sequence base appears in sample sequencing reads

    .condensedDepthVector

    Read depth across custom reference sequence or, if specified, collection of one or more custom reference sequences, condensed to 256 bins

    .rpkm

    Normalized number of sample sequencing reads aligned to custom reference sequence or, if specified, collection of one or more custom reference sequences (targeted Reads mapped Per Kilobase of targeted sequence per Million quality-filtered reads)
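The RPKM definition above (targeted Reads mapped Per Kilobase of targeted sequence per Million quality-filtered reads) can be written out as a short calculation. This is a sketch derived from the wording of the definition, not from the app's code. The function name `rpkm` and its parameter names are hypothetical, and returning 0.0 for a zero length or zero read count is an assumption.

```python
def rpkm(aligned_reads: int, targeted_length_bp: int, quality_filtered_reads: int) -> float:
    """Reads per kilobase of targeted sequence per million quality-filtered reads."""
    # Assumption: undefined inputs yield 0.0 instead of raising an error
    if targeted_length_bp <= 0 or quality_filtered_reads <= 0:
        return 0.0
    kilobases = targeted_length_bp / 1_000
    millions = quality_filtered_reads / 1_000_000
    return aligned_reads / kilobases / millions


# 500 reads aligned to 2,000 bp of targeted sequence out of 1,000,000 quality-filtered reads
print(rpkm(500, 2_000, 1_000_000))
```

Normalizing by both target length and sequencing depth makes values comparable between references of different sizes and between samples sequenced to different depths.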

    .sequence

    Majority consensus sequence bases

    .referenceAccession

    Accession of custom reference sequence

    .referenceDescription

    Description of custom reference sequence

    .referenceLength

    Length of custom reference sequence

    .coverage

    Proportion of custom reference sequence bases that appear in sample sequencing reads

    .ani

    Average nucleotide identity of consensus sequence to custom reference sequence

    .ntChange

    Nucleotide change associated with variant

    .referenceAccession

    Accession of custom reference sequence used for variant calling

    .referencePosition

    Variant position in custom reference sequence

    .referenceAllele

    Reference allele at variant position

    .variantAllele

    Variant allele

    .depth

    Variant depth, indicating the number of times variant position appears in sample sequencing reads

    Field

    Description [Source]

    .lineage*

    The most likely lineage assigned to a given sequence based on the inference engine used and the SARS-CoV-2 diversity designated. This assignment may be sensitive to missing data at key sites

    .conflict*

    In the pangoLEARN model, a given sequence gets assigned to the most likely category based on known diversity. If a sequence can fit into more than one category, the conflict score will be greater than 0 and reflect the number of categories the sequence could fit into. If the conflict score is 0, this means that within the current decision tree there is only one category that the sequence could be assigned to

    .ambiguityScore*

    This score is a function of the quantity of missing data in a sequence. It represents the proportion of relevant sites in a sequence which were imputed to the reference values. A score of 1 indicates that no sites were imputed, while a score of 0 indicates that more sites were imputed than were not imputed. This score only includes sites which are used by the decision tree to classify a sequence

    .version*

    Version of the PUSHER database

    .pangolinVersion*

    Version of the Pangolin software

    .abbreviations*

    Information about abbreviations relevant to test

    .abbreviations.abbreviation*

    Abbreviation

    .abbreviations.definition*

    Abbreviation definition

    .interpretiveData*

    Information about test

    .interpretiveData.header*

    Test information category

    .interpretiveData.paragraph*

    Test information text

    .substitutions*

    [
        {
            "name": "Allobacillus halotolerans",
            "rpkm": 0
        },
        {
            "name": "Armored RNA Quant Internal Process Control",
            "rpkm": 0
        },
        {
            "name": "Enterobacteria phage T7",
            "rpkm": 180323
        },
        {
            "name": "Escherichia virus MS2",
            "rpkm": 0
        },
        {
            "name": "Escherichia virus Qbeta",
            "rpkm": 0
        },
        {
            "name": "Escherichia virus T4",
            "rpkm": 0
        },
        {
            "name": "Imtechella halotolerans",
            "rpkm": 0
        },
        {
            "name": "Phocid alphaherpesvirus 1",
            "rpkm": 0
        },
        {
            "name": "Phocine morbillivirus",
            "rpkm": 0
        },
        {
            "name": "Truepera radiovictrix",
            "rpkm": 0
        }
    ]
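A list shaped like the example above (objects with `name` and `rpkm` keys) can be scanned for entries with a nonzero RPKM. The sketch below works on a shortened copy of that example. The helper name `nonzero_rpkm` is hypothetical, and where this list sits within the report JSON is not stated in the source, so the code takes the list itself as input.

```python
import json

# Shortened copy of the example list shown above
example = """
[
    {"name": "Armored RNA Quant Internal Process Control", "rpkm": 0},
    {"name": "Enterobacteria phage T7", "rpkm": 180323},
    {"name": "Escherichia virus MS2", "rpkm": 0}
]
"""


def nonzero_rpkm(entries):
    """Return a name -> rpkm mapping for entries whose rpkm is greater than zero."""
    return {e["name"]: e["rpkm"] for e in entries if e.get("rpkm", 0) > 0}


detected = nonzero_rpkm(json.loads(example))
print(detected)
```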

    UPIP

    Abbreviation
    Definition

    AMR

    antimicrobial resistance

    CLSI

    Clinical and Laboratory Standards Institute

    ESBL

    extended spectrum beta-lactamase

    EUCAST

    European Committee on Antimicrobial Susceptibility Testing

    mL

    milliliter

    Category
    Test information

    AMR

    Linkage between AMR marker, antimicrobial, and drug class is based on the Comprehensive Antibiotic Resistance Database (CARD, version 3.2.8) from McMaster University, ResFinder (version 2.2.1), NCBI Reference Gene Catalog (version 2023-09-26.1), EUCAST expert rules on indicator agents (2019-2023), and CLSI Performance Standards for Antimicrobial Susceptibility Testing (M100 34th Edition). Not all antimicrobials and drug classes that are listed may be relevant. Detected AMR markers may also confer resistance to antimicrobials and drug classes that are not listed.

    AMR

    A representative list of associated microorganisms known to harbor the detected or similar AMR markers, based on the Comprehensive Antibiotic Resistance Database Prevalence Data (CARD Prevalence, version 4.0.1) from McMaster University, can be found in the Associated Microorganisms field.

    AMR

    Mutations connected with a '+' form an epistatic group. Epistatic groups are two or more mutations that need to be present concurrently to confer the associated resistance.

    AMR

    All intrinsic resistance described in CLSI Performance Standards for Antimicrobial Susceptibility Testing, M100 34th Edition, Appendix B for detected microorganism(s) is reported. Additional comments regarding CLSI intrinsic resistance definitions may be reported in footnotes specific to the detected microorganism(s). Some intrinsic resistance is described with reference to drug classes rather than specific antimicrobials. Users may reference CLSI Glossary I (Part 1 and Part 2): Class and Subclass Designations and Generic Names for information on how CLSI categorizes antimicrobials and drug classes.

    AMR

    Confidence of AMR marker detection is shown as High, Medium, or Low and is based on the available sequencing data. High confidence indicates that an AMR marker has 100% protein sequence coverage and 100% protein sequence percent identity (PID). Medium confidence indicates that an AMR marker has ≥90% protein sequence coverage and ≥90% protein sequence percent identity (PID). Low confidence indicates that an AMR marker has ≥60% protein sequence coverage and ≥80% protein sequence percent identity (PID).

    Phenotypic group

    Targeted microorganisms are classified into three Phenotypic Groups based on general association with urinary tract infections, normal flora, colonization, or contamination from the environment or other sources. Phenotypic grouping DOES NOT INDICATE PATHOGENICITY IN A GIVEN CASE and results need to be interpreted in the context of all available information. Phenotypic Group 1: Microorganisms that are rarely associated with urinary tract infections and may frequently represent normal flora, colonizers, or contaminants. Phenotypic Group 2: Microorganisms that are infrequently associated with urinary tract infections and may frequently represent part of the normal flora, colonizers, or contaminants. Phenotypic Group 3: Microorganisms that are commonly associated with urinary tract infections but may also represent part of the normal flora, colonizers, or contaminants.

    Read classification

    This test differentiates sequencing reads classified to microorganism and Internal Control regions that are targeted by capture probes (“Targeted Microbial” and “Targeted Internal Control”) from those that are not targeted (“Untargeted”), are low complexity (“Low Complexity”), cannot be unambiguously assigned to one category (“Ambiguous”), or cannot be classified with confidence (“Unclassified”).

    Limitations

    Non-detected results do not rule out the presence of viruses, bacteria, fungi, parasites, and AMR markers. Contamination with microorganisms is possible during specimen collection, transport, and processing. Closely related microorganisms may be misidentified based on sequence homology to species present in the database. The identification of DNA sequences from a microorganism does not confirm that the identified microorganism is causing symptoms, is viable, or is infectious. Recombinant viral strains may not be reported or may be reported as one or more individual viruses. The Enterobacter cloacae complex may not be reported if targeted species members (Enterobacter cloacae, Enterobacter hormaechei, and Enterobacter cancerogenus) are not present.

    Limitations

    The best matching allele is reported for each detected AMR gene family. If two or more alleles within the same AMR gene family are detected, only the allele with the highest confidence is reported as the best match, unless multiple alleles have a High confidence interpretation (100% protein sequence coverage and PID). In bacterial strains containing insertion-deletion mutations (indels), there is a risk of false positive or false negative results for other resistance mutations within a region of 100 nucleotides around the indel.

    Limitations

    Information provided by DRAGEN Microbial Enrichment Plus is based on scientific knowledge and has been curated; however, scientific knowledge evolves, and information about associated microorganisms and associated resistance may not always be complete or correct. Results should be interpreted in the context of all available information. Other sources of data may be required for confirmation.

    NGS

    next-generation sequencing

    RPKM

    targeted Reads mapped Per Kilobase of targeted sequence per Million quality-filtered reads

    UPIP

    Urinary Pathogen ID/AMR Panel

    RUO

    For Research Use Only. Not for use in diagnostic procedures.

    URL

    See https://www.illumina.com/ for additional information.

    Quantification - when a quantitative Internal Control {ic_name} and concentration {ic_concentration} are specified

    UPIP data analysis using DRAGEN Microbial Enrichment Plus detects 35 viruses, 121 bacteria, 14 fungi, 4 parasites, and 4,371 AMR markers, unless filtered reporting options are selected, based on target-enriched next-generation sequencing (NGS) of microorganism DNA sequences. Sequencing data are interpreted by the DRAGEN software platform, and microorganisms that pass reporting thresholds are reported. Absolute quantification assumes use of {ic_name} as an Internal Control spiked at {ic_concentration} copies/mL of sample. Relative abundance is calculated from absolute quantities and is expressed as a proportion of the absolute quantities within each pathogen class (i.e., bacteria, viruses, fungi, parasites). If RPKM for the Internal Control is zero, no absolute quantification is provided, and relative abundance is expressed as a proportion of microorganism RPKM values within each pathogen class.

    Quantification - when a quantitative Internal Control is NOT specified

    UPIP data analysis using DRAGEN Microbial Enrichment Plus detects 35 viruses, 121 bacteria, 14 fungi, 4 parasites, and 4,371 AMR markers, unless filtered reporting options are selected, based on target-enriched next-generation sequencing (NGS) of microorganism DNA sequences. Sequencing data are interpreted by the DRAGEN software platform, and microorganisms that pass reporting thresholds are reported. Relative abundance is expressed as a proportion of microorganism RPKM values within each pathogen class (i.e., bacteria, viruses, fungi, parasites). Internal Control not specified; no absolute quantification provided.
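The relative abundance rule described in the two quantification texts above can be sketched in a few lines of Python. This is a minimal illustration, not the DRAGEN implementation: the function names, the record keys (`name`, `pathogen_class`, `rpkm`, `absolute`), and the way absolute quantities are supplied are all assumptions made for the example. The source does not state how absolute quantities are derived from the Internal Control, so the sketch takes them as given inputs.

```python
from collections import defaultdict


def rpkm(targeted_reads, targeted_kb, quality_filtered_reads):
    """Targeted reads per kilobase of targeted sequence per million
    quality-filtered reads, following the glossary definition."""
    if targeted_kb <= 0 or quality_filtered_reads <= 0:
        return 0.0
    return targeted_reads / targeted_kb / (quality_filtered_reads / 1_000_000)


def relative_abundance(records, ic_rpkm=None):
    """Return {name: proportion within its pathogen class}.

    records: list of dicts with hypothetical keys 'name', 'pathogen_class',
    'rpkm', and 'absolute' (absolute quantity, may be None).
    ic_rpkm: RPKM of the Internal Control, or None if no control is specified.
    """
    # Absolute quantities are used only when an Internal Control is specified
    # and its RPKM is greater than zero; otherwise fall back to RPKM values.
    use_absolute = ic_rpkm is not None and ic_rpkm > 0
    key = "absolute" if use_absolute else "rpkm"

    # Sum the chosen quantity per pathogen class (bacteria, viruses, ...).
    totals = defaultdict(float)
    for rec in records:
        totals[rec["pathogen_class"]] += rec[key]

    result = {}
    for rec in records:
        total = totals[rec["pathogen_class"]]
        result[rec["name"]] = rec[key] / total if total > 0 else 0.0
    return result
```

Because proportions are computed per pathogen class, a class containing a single reported microorganism always has a relative abundance of 1.0 for that microorganism, whichever quantity is used.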

    AMR - when "Report bacterial AMR markers only when an associated microorganism is reported" is selected

    This test detects 4,371 antimicrobial resistance (AMR) markers and reports associations for 72 microorganisms, 185 antimicrobials, and 33 drug classes, unless filtered reporting options are selected. AMR markers are based on the Comprehensive Antibiotic Resistance Database (CARD, version 3.2.8). Detection of an AMR marker is reported if the AMR marker passes a minimum detection threshold and if one or more of the microorganisms associated with the AMR marker is also detected, in alignment with guidance provided by the College of American Pathologists (CAP) MIC.21855. However, reported AMR markers may originate from microorganisms that did not meet detection thresholds or microorganisms not targeted by the test. Associations between microorganisms and AMR markers are based on scientific literature and the Comprehensive Antibiotic Resistance Database Prevalence Data (CARD Prevalence, version 4.0.1) from McMaster University. 3,968 out of 4,371 AMR markers are associated with a microorganism targeted by UPIP. Reported AMR markers have been associated with antimicrobial resistance but may not always indicate phenotypic resistance. Failure to detect AMR markers does not always indicate phenotypic susceptibility. Results should be interpreted in the context of all available information.

    AMR - when "Report bacterial AMR markers only when an associated microorganism is reported" is NOT selected

    This test detects 4,371 antimicrobial resistance (AMR) markers and reports associations for 72 microorganisms, 185 antimicrobials, and 33 drug classes. AMR markers are based on the Comprehensive Antibiotic Resistance Database (CARD, version 3.2.8). Associations between microorganisms and AMR markers are based on scientific literature and the Comprehensive Antibiotic Resistance Database Prevalence Data (CARD Prevalence, version 4.0.1) from McMaster University. Detection of an AMR marker is reported if the AMR marker passes a minimum detection threshold, regardless of associated microorganism detection. Reported AMR markers may originate from microorganisms that did not meet detection thresholds or microorganisms not targeted by the test. Reported AMR markers have been associated with antimicrobial resistance but may not always indicate phenotypic resistance. Failure to detect AMR markers does not always indicate phenotypic susceptibility. Results should be interpreted in the context of all available information.
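The difference between the two AMR reporting modes above can be illustrated with a short Python sketch. It is not the app's code: the function name, the `require_associated` flag (standing in for the "Report bacterial AMR markers only when an associated microorganism is reported" option), and the marker record keys are hypothetical, and the minimum detection threshold is represented by a precomputed boolean because the source does not define it here.

```python
def reportable_amr_markers(markers, detected_microorganisms, require_associated=True):
    """Return the names of AMR markers that would be reported.

    markers: list of dicts with hypothetical keys 'name', 'passes_threshold'
    (bool), and 'associated' (iterable of associated microorganism names).
    detected_microorganisms: names of microorganisms that passed reporting
    thresholds.
    require_associated: True mimics the option being selected; False mimics
    it being not selected.
    """
    detected = set(detected_microorganisms)
    reported = []
    for marker in markers:
        # A marker must always pass the minimum detection threshold.
        if not marker["passes_threshold"]:
            continue
        # With the option selected, at least one associated microorganism
        # must also be detected for the marker to be reported.
        if require_associated and not (set(marker["associated"]) & detected):
            continue
        reported.append(marker["name"])
    return reported
```

With the option selected, a marker whose associated microorganisms are all undetected is withheld even if it passes the threshold; with the option not selected, the same marker is reported.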