# Counting Algorithm

Illumina miRNA sequencing FASTQ files are uploaded to either BaseSpace Sequence Hub (BSSH) or Illumina Connected Analytics (ICA), where the DRAGEN miRNA app processes the reads through the following steps:

1. **Calibration of miRBase entries**\
   For miRNA entries with identical or nearly identical sequences in the miRBase mature database, manual calibration is performed. A combined entry is generated for each overlapping miRNA set. For instance, the sequence of *hsa-miR-151b* is entirely contained within *hsa-miR-151a-5p*; therefore, the resulting entry is reported as *hsa-miR-151b/151a-5p*.
2. **Adapter and quality trimming**\
   Reads are processed with *cutadapt* ([documentation](https://cutadapt.readthedocs.io/en/stable/guide.html)) to remove 3′ adapters (AACTGTAGGCACCATCAAT) and low-quality bases. Reads lacking adapter sequences are separately tallied as *no\_adapter\_reads*.
3. **Insert and UMI identification**\
   After trimming, insert and UMI sequences are extracted. Reads with inserts shorter than 16 bp (*too\_short\_reads*) or UMIs shorter than 10 bp (*UMI\_defective\_reads*) are discarded.
4. **Insert sequence alignment**\
   A unique sequence set is generated across all samples within a submitted job. Insert sequences are annotated using a sequential alignment strategy with *bowtie* ([bowtie-bio.sourceforge.net](http://bowtie-bio.sourceforge.net)). Alignments proceed in the following order:

   * Perfect match to miRBase mature
   * miRBase hairpin
   * Noncoding RNA, mRNA, otherRNA
   * Secondary alignment to miRBase mature (allowing up to two mismatches)

   At each step, only unmapped sequences are passed forward. Read counts are reported per RNA category (e.g., *miRNA\_Reads, hairpin\_Reads, piRNA\_Reads, tRNA\_Reads, rRNA\_Reads, mRNA\_Reads*). miRBase is used for miRNAs (v21 or v22), while piRNABank is referenced for piRNAs.

   For human, mouse, and rat, a species-specific miRBase mature database is used, followed by genome alignment of remaining sequences to identify potential novel miRNAs (human: GRCh38, mouse: GRCm38, rat: Rnor\_6.0). For all other species, a comprehensive miRBase mature database is applied.
5. **Counting reads and unique molecules**\
   For each sample, all reads assigned to a given miRNA or piRNA ID are tallied, and UMIs are aggregated to calculate unique molecule counts. Results are reported as follows:
   * *miRNA\_piRNA* sheet: read counts and UMI counts for miRNAs and piRNAs
   * *tRNA* and *otherRNA* sheets: results for tRNAs and other RNAs
   * *notCharacterized\_mappable* sheet: reads and clustered UMIs aligned to the genome in the final step (human, mouse, rat only)
   * *notCharacterized\_notMappable*: tally of all remaining unmapped reads
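
The length filters in step 3 can be illustrated with a minimal Python sketch. It is not the app's implementation: the function names are hypothetical, and the order of the two checks (insert first, then UMI) is an assumption, since the text above does not state it. Only the thresholds (16 bp and 10 bp) and the category names come from the description.

```python
from collections import Counter

MIN_INSERT_LEN = 16  # inserts shorter than this count as too_short_reads
MIN_UMI_LEN = 10     # UMIs shorter than this count as UMI_defective_reads


def classify_read(insert: str, umi: str) -> str:
    """Return the tally category for one trimmed read.

    Assumption: the insert check runs before the UMI check.
    """
    if len(insert) < MIN_INSERT_LEN:
        return "too_short_reads"
    if len(umi) < MIN_UMI_LEN:
        return "UMI_defective_reads"
    return "kept"  # hypothetical label for reads that pass both filters


def tally_reads(reads):
    """Count reads per category; `reads` yields (insert, umi) pairs."""
    return Counter(classify_read(insert, umi) for insert, umi in reads)


# Toy reads (made-up sequences): one passes, one short insert, one short UMI.
example = [("A" * 22, "C" * 12), ("A" * 15, "C" * 12), ("A" * 22, "C" * 9)]
tally = tally_reads(example)
```

Here `tally` holds one read in each of the three categories.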

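The sequential strategy in step 4, where each stage sees only the sequences left unmapped by the previous one, can be sketched as follows. This is an illustration under stated assumptions, not the app's code: the real alignments are done with *bowtie*, whereas the matchers here are hypothetical dictionary lookups, and the stage labels and toy sequences are invented for the example.

```python
def exact_matcher(reference):
    """Hypothetical stand-in for a perfect-match alignment (sequence -> ID)."""
    return lambda seq: reference.get(seq)


def mismatch_matcher(reference, max_mismatches=2):
    """Hypothetical stand-in for an alignment allowing a few mismatches."""
    def match(seq):
        for ref_seq, ref_id in reference.items():
            if len(ref_seq) != len(seq):
                continue
            if sum(a != b for a, b in zip(seq, ref_seq)) <= max_mismatches:
                return ref_id
        return None
    return match


def annotate_sequences(sequences, stages):
    """Assign each unique sequence to the first stage that maps it."""
    annotations = {}
    unmapped = list(dict.fromkeys(sequences))  # de-duplicate, keep order
    for label, matcher in stages:
        still_unmapped = []
        for seq in unmapped:
            hit = matcher(seq)
            if hit is None:
                still_unmapped.append(seq)  # passed forward to the next stage
            else:
                annotations[seq] = (label, hit)
        unmapped = still_unmapped
    return annotations, unmapped


# Toy references and sequences (all made up for illustration).
mature = {"ACGTACGTACGTACGTACGT": "toy-miR-1"}
other = {"TTTTGGGGCCCCAAAATTTT": "toy-tRNA-1"}
stages = [
    ("mature_perfect", exact_matcher(mature)),
    ("otherRNA", exact_matcher(other)),
    ("mature_mismatch", mismatch_matcher(mature, max_mismatches=2)),
]
seqs = [
    "ACGTACGTACGTACGTACGT",  # perfect match to the toy mature entry
    "TTTTGGGGCCCCAAAATTTT",  # matches the toy otherRNA entry
    "ACGTACGTACGTACGTACAA",  # two mismatches to the toy mature entry
    "GGGGGGGGGGGGGGGGGGGG",  # maps nowhere
]
annotations, unmapped = annotate_sequences(seqs, stages)
```

Because mapped sequences are removed before the next stage runs, a sequence is never annotated twice, and whatever remains in `unmapped` corresponds to the sequences that would go on to genome alignment or the final unmapped tally.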

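The per-sample tallying in step 5 can be sketched in a few lines of Python. This is a simplified illustration, not the app's implementation: the function name is hypothetical, the UMI sequences are made up, and UMIs are treated as distinct only when they differ exactly. The app may cluster similar UMIs (the *notCharacterized\_mappable* sheet mentions clustered UMIs), which this sketch does not attempt.

```python
from collections import defaultdict


def count_reads_and_umis(assigned_reads):
    """Return {rna_id: (read_count, unique_umi_count)} for one sample.

    `assigned_reads` yields (rna_id, umi) pairs, one per annotated read.
    Assumption: two UMIs are the same molecule only if they are identical.
    """
    read_counts = defaultdict(int)
    umis_seen = defaultdict(set)
    for rna_id, umi in assigned_reads:
        read_counts[rna_id] += 1
        umis_seen[rna_id].add(umi)
    return {
        rna_id: (read_counts[rna_id], len(umis_seen[rna_id]))
        for rna_id in read_counts
    }


# Toy input: three reads for one ID (two share a UMI), one read for another.
reads = [
    ("hsa-miR-151b/151a-5p", "AAAAAAAAAAAA"),
    ("hsa-miR-151b/151a-5p", "AAAAAAAAAAAA"),
    ("hsa-miR-151b/151a-5p", "CCCCCCCCCCCC"),
    ("hsa-miR-21-5p", "GGGGGGGGGGGG"),
]
counts = count_reads_and_umis(reads)
# counts["hsa-miR-151b/151a-5p"] is (3, 2): three reads, two unique UMIs
```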