# Bulk RNA

## Getting started

### [Logging into Connected Multiomics](https://help.multiomics.illumina.com/icm/introduction/readme-1#log-in-to-connected-multiomics)

### [Creating a study and adding projects from BioInsight Platform Core](https://help.multiomics.illumina.com/icm/studies/create-study)

### [Viewing results and navigating in Connected Multiomics](https://help.multiomics.illumina.com/icm/analyses/enter-analysis)

### Data / Task nodes and Performing tasks in Connected Multiomics

* Within a study, the *Analyses tab* contains two elements: task nodes (rectangles) and data nodes (circles) connected by lines and arrows. Collectively, they represent a data analysis pipeline.
* Clicking a data node brings up a context-sensitive menu on the right. This menu changes depending on the type of data node and only presents tasks that can be performed on that specific data type. Hover over a task to see additional information about it.
* Select the task you wish to perform from the menu. When configuring task options, additional information regarding each option is available. Click **Finish** to perform the task.
* Depending on the task, a new data node may automatically be created and connected to the original data node. This contains the data resulting from the task. Tasks that do not produce new data types will not produce an additional data node.
* To view the results of a task, click the data node and choose the **Task report** option on the menu.

### Viewing and saving data

* All data contained in data nodes can be downloaded to the local machine: select the node, scroll to the bottom of the toolbox, and choose **Download data**.
* The [Data Viewer](/icm/analyses/analysis-functionality/data-viewer.md) can be used to plot, modify, and save data. In this walkthrough, the PCA data node and the Hierarchical clustering / heatmap node open in the Data Viewer when you double-click the data node or open the Task report from the toolbox.
* To save an individual image from the Data Viewer to your machine, use the plot-specific tools: click **Plot**, then **Export image**, select the format, size, and resolution, and click **Save**.
* All visualizations within a sheet in the Data Viewer can be exported as one image (e.g. a single image with all plots for a poster). To do this, use the **Export** drop-down at the top of the Data Viewer and select **Export image**.

## Input: secondary outputs from the DRAGEN analysis

You will notice that there are two .sf file options to choose from in the [secondary outputs of the DRAGEN analysis](https://help.dragen.illumina.com/product-guide/dragen-v4.4/dragen-rna-pipeline/gene-expression-quantification).

* `<outputPrefix>.quant.genes.sf` - Contains quantification results at the gene level. The results are produced by summing together all transcripts with the same geneID in the annotation file (GTF).
* `<outputPrefix>.quant.sf` - Contains quantification results at the transcript level.
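
The relationship between the two files can be illustrated with a minimal sketch. It assumes a Salmon-style column layout (`Name`, `Length`, `EffectiveLength`, `TPM`, `NumReads`) and a transcript-to-gene mapping derived from the GTF; the transcript IDs, gene IDs, values, and the function name `sum_to_gene_level` are hypothetical, and this is not the DRAGEN implementation.

```python
import csv
import io
from collections import defaultdict

# Hypothetical transcript-level table; the column layout is an assumption
# (Salmon-style) and all values are made up for illustration.
QUANT_SF = (
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "TX1\t1500\t1300.0\t10.0\t100.0\n"
    "TX2\t900\t700.0\t5.0\t30.0\n"
    "TX3\t2000\t1800.0\t2.5\t45.0\n"
)

# Hypothetical transcript-to-gene mapping, as it might be read from the GTF.
TX_TO_GENE = {"TX1": "GENE_A", "TX2": "GENE_A", "TX3": "GENE_B"}


def sum_to_gene_level(quant_text, tx_to_gene):
    """Sum TPM and NumReads over all transcripts that share a gene ID."""
    totals = defaultdict(lambda: {"TPM": 0.0, "NumReads": 0.0})
    reader = csv.DictReader(io.StringIO(quant_text), delimiter="\t")
    for row in reader:
        gene = tx_to_gene[row["Name"]]
        totals[gene]["TPM"] += float(row["TPM"])
        totals[gene]["NumReads"] += float(row["NumReads"])
    return dict(totals)


gene_totals = sum_to_gene_level(QUANT_SF, TX_TO_GENE)
for gene, values in sorted(gene_totals.items()):
    print(gene, values["TPM"], values["NumReads"])
```

The sketch only sums abundance columns; how length columns are combined at the gene level is not covered here.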

## Import Data

Import data that has been processed through the DRAGEN RNA analysis pipeline in [BaseSpace](https://ilmn.basespace.illumina.com/apps/17802785/DRAGEN%20RNA), BioInsight Platform Core, or the command line.

* Use the .sf file from the secondary outputs.

{% hint style="warning" %}
Remember the [genome reference and annotation file used during DRAGEN analysis](https://support.illumina.com/sequencing/sequencing_software/dragen-bio-it-platform/product_files.html). You will need them for feature annotation. If you used a built-in genome reference, the following table shows the [default GTF](https://support.illumina.com/sequencing/sequencing_software/dragen-bio-it-platform/product_files.html) used with each reference.
{% endhint %}

| Annotation file (GTF) | Genome reference file                                                                                                                                                |
| --------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| GENCODE v19           | <p>Homo sapiens \[UCSC] hg19 v5</p><p>Homo sapiens \[UCSC] hg19 v5 Pangenome</p><p>Homo sapiens \[NCBI] hs37d5 v5</p><p>Homo sapiens \[NCBI] hs37d5 v5 Pangenome</p> |
| GENCODE v44           | <p>Homo sapiens \[1000 Genomes] hg38 v5</p><p>Homo sapiens \[1000 Genomes] hg38 v5 Pangenome</p>                                                                     |
| GENCODE vM23          | Mus musculus \[UCSC] mm10                                                                                                                                            |
| ENSEMBL 98            | Rattus norvegicus \[UCSC] rn6                                                                                                                                        |

* After creating a study and adding data to it, click **+ New Analysis**.
* Give the analysis a name, select **Analysis Type > Custom: RNA**, select the sample groups to add to the analysis, and click **Run Analysis**.

<figure><img src="/files/dcshrsQFHEJ0ihy2MVTE" alt=""><figcaption></figcaption></figure>

* The Status shows as Complete when the analysis is ready.

{% hint style="warning" %}
Click the Refresh button to see the Status change in real time.
{% endhint %}

* Click the completed Analysis to open and customize the analysis pipeline.

<figure><img src="/files/a2Y5XAAMdb0NftnvA4nz" alt=""><figcaption></figcaption></figure>

* The Quantification node is created when the analysis completes.

{% hint style="warning" %}
Hover over a node to see details about it. Below, the number of samples, number of features, and data size are shown for the Quantification node.
{% endhint %}

<figure><img src="/files/TwElJtqwcXMZmhe2aFfa" alt=""><figcaption><p>The Quantification node is the starting node for analysis</p></figcaption></figure>

## Annotate Features

Add gene-level annotations to the quantified data.

* Single-click the *Quantification* node.
* Select **Annotate features** under the *Pre-analysis tools* section in the toolbox on the right.
* Choose the **genome** and **annotation** files that match those used in DRAGEN then click **Finish**.
* Outcome:
  * Task node: *Annotate features*
  * Result node: *Annotated counts*

<figure><img src="/files/ymBvPyCEyk8Z3zw3iooS" alt=""><figcaption><p>Annotate features task node &#x26; Annotated counts data node</p></figcaption></figure>

## [Normalization and Scaling](/icm/analyses/analysis-functionality/task-menu/normalization-and-scaling/normalization.md)

Normalize the data to prepare for downstream analysis.

* Single-click the Annotated Counts node, then select the **Normalization** task from the *Normalization and Scaling* section.
* Click the **"Use Recommended"** button or select an alternative method. We recommend the widely used *Median ratio (DESeq2 only)* method.

<figure><img src="/files/mV58uhuG6dB6xr315xY6" alt=""><figcaption><p>Median ratio (DESeq2 only) is the recommended normalization method for bulk transcriptomic data</p></figcaption></figure>

* Outcome:
  * Task node: *Normalize counts*
  * Result node: *Normalized counts*

<figure><img src="/files/4ShRRqbeZolaYrmKAtPG" alt=""><figcaption><p>Normalize counts task node &#x26; Normalized counts data node</p></figcaption></figure>
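
For intuition about what a median ratio normalization does, the following is a minimal sketch of the median-of-ratios idea associated with DESeq2. It is an illustration under simplifying assumptions, not the Connected Multiomics implementation; the function names and the toy counts are hypothetical.

```python
import math
from statistics import median


def median_ratio_size_factors(counts):
    """counts: dict mapping gene -> list of raw counts, one per sample.

    Returns one size factor per sample.
    """
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for gene_counts in counts.values():
        # A gene with a zero count has no usable geometric mean, so skip it.
        if any(c <= 0 for c in gene_counts):
            continue
        log_geo_mean = sum(math.log(c) for c in gene_counts) / n_samples
        for j, c in enumerate(gene_counts):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # The size factor is the median ratio of a sample to the per-gene reference.
    return [math.exp(median(r)) for r in log_ratios]


def normalize(counts, size_factors):
    """Divide every count by the size factor of its sample."""
    return {
        gene: [c / s for c, s in zip(row, size_factors)]
        for gene, row in counts.items()
    }


# Toy data: sample 2 was sequenced about twice as deeply as sample 1.
COUNTS = {
    "GENE_A": [100, 200],
    "GENE_B": [50, 100],
    "GENE_C": [10, 20],
    "GENE_D": [0, 5],  # excluded from the size-factor estimate
}

size_factors = median_ratio_size_factors(COUNTS)
normalized = normalize(COUNTS, size_factors)
print(size_factors)
print(normalized["GENE_A"])
```

In the toy data the second sample's size factor is twice the first, so after division the two samples have matching values for genes A to C.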

## [Dimension Reduction (PCA)](/icm/analyses/analysis-functionality/task-menu/exploratory-analysis/pca.md)

Visualize sample clustering and variance.

* From the *Normalized Counts* node, select **PCA** under *Exploratory Analysis*.
* Outcome:
  * Task node: *PCA*
  * Result node: *PCA*

<figure><img src="/files/RjknWXOLfnwvJ8YdA563" alt=""><figcaption><p>PCA task node &#x26; PCA data node</p></figcaption></figure>
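
As a rough illustration of what the PCA plot conveys, the sketch below computes the first principal component of a tiny, made-up expression matrix using power iteration on the covariance matrix. The function name, starting vector, and sample values are hypothetical, and the product may compute PCA differently (for example with different scaling options).

```python
import math


def first_principal_component(samples, iterations=200):
    """samples: list of equal-length feature vectors, one per sample.

    Returns (unit-length loading vector, per-sample scores on PC1).
    """
    n, p = len(samples), len(samples[0])
    means = [sum(s[k] for s in samples) / n for k in range(p)]
    centered = [[s[k] - means[k] for k in range(p)] for s in samples]
    # Sample covariance matrix of the features (p x p).
    cov = [
        [sum(row[a] * row[b] for row in centered) / (n - 1) for b in range(p)]
        for a in range(p)
    ]
    # Power iteration converges to the eigenvector with the largest
    # eigenvalue; the starting vector is arbitrary.
    vec = [float(k + 1) for k in range(p)]
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    scores = [sum(row[k] * vec[k] for k in range(p)) for row in centered]
    return vec, scores


# Hypothetical normalized expression: rows are samples, columns are genes.
# The first two samples resemble each other, as do the last two.
SAMPLES = [
    [1.0, 5.0, 3.0],
    [1.2, 5.1, 3.0],
    [4.0, 2.0, 3.1],
    [4.1, 1.9, 2.9],
]

loadings, pc1_scores = first_principal_component(SAMPLES)
print(loadings)
print(pc1_scores)
```

Samples from the same group land on the same side of PC1, which is the kind of separation to look for in the PCA data node.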

## [Differential Analysis](/icm/analyses/analysis-functionality/task-menu/statistics/differential-analysis.md)

Compare gene expression across experimental groups.

* From the *Normalized counts* node, select **Differential Analysis** from the *Statistics* section.
* Choose your preferred model and set up the comparison. In this walkthrough we chose the [DESeq2 method](/icm/analyses/analysis-functionality/task-menu/statistics/differential-analysis/deseq2-r-vs-deseq2.md), having applied the corresponding normalization (*Median ratio (DESeq2 only)*) in the previous step.
* Outcome:
  * Task node: *Differential analysis* (labeled as model used)
  * Result node: *Differential results* (labeled as comparison made)

<figure><img src="/files/iQUXDNE9mvbSKgEBWGoP" alt=""><figcaption><p>Differential analysis task node &#x26; Differential analysis data node labeled as model &#x26; comparison made</p></figcaption></figure>

## Filter Feature List

Refine the list of genes/features based on criteria.

* Open the *Differential results* node (double-click it, or single-click it and select *Task report* from the toolbox).
* Use the **filter menu** to apply criteria relevant to your study.
* Click **Generate filtered node** once satisfied.
* Outcome:
  * Task node: *Filter list*
  * Result node: *Filtered feature list*

<figure><img src="/files/JeCFiW1FTXLXXFXrXtm4" alt=""><figcaption><p>Filter list task node &#x26; Filtered feature list data node</p></figcaption></figure>
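
The kind of criteria typically applied at this step can be sketched as a simple filter on significance and effect size. The field names (`gene`, `log2fc`, `fdr`), the thresholds, and the rows below are hypothetical and do not reflect the column names or defaults in the filter menu.

```python
def filter_features(results, max_fdr=0.05, min_abs_log2fc=1.0):
    """Keep rows that pass both the FDR and the absolute log2 fold change cutoffs."""
    return [
        row
        for row in results
        if row["fdr"] <= max_fdr and abs(row["log2fc"]) >= min_abs_log2fc
    ]


# Made-up differential analysis rows for illustration only.
RESULTS = [
    {"gene": "GENE_A", "log2fc": 2.3, "fdr": 0.001},   # passes both cutoffs
    {"gene": "GENE_B", "log2fc": -1.4, "fdr": 0.02},   # passes (down-regulated)
    {"gene": "GENE_C", "log2fc": 0.4, "fdr": 0.0005},  # effect size too small
    {"gene": "GENE_D", "log2fc": 3.1, "fdr": 0.20},    # not significant
]

filtered = filter_features(RESULTS)
print([row["gene"] for row in filtered])
```

Using the absolute fold change keeps both up- and down-regulated features; filter on the sign as well if only one direction is of interest.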

## [Gene Set Enrichment](/icm/analyses/analysis-functionality/task-menu/biological-interpretation/gene-set-enrichment.md)

Identify enriched biological pathways or gene sets.

* Select **Gene Set Enrichment** from the *Biological Interpretation* section.
* Choose either **KEGG Pathway Enrichment** or **Gene Set Ontology**.

{% hint style="warning" %}
The latest version of KEGG can be added in **Settings > Library file management**.
{% endhint %}

* Outcome:
  * Task node: *Gene set enrichment*
  * Result node: *Pathway enrichment*

<figure><img src="/files/f9KdBUqRoQmnQguESBuU" alt=""><figcaption><p>Gene set enrichment task node &#x26; Pathway enrichment data node</p></figcaption></figure>
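
One common way to score whether a pathway is over-represented in a filtered gene list is a hypergeometric (one-sided Fisher's exact) test. The sketch below shows that calculation as an assumption for illustration; this page does not state which statistic the task uses, and the function name and numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_in_set, n_selected, n_overlap):
    """Probability of seeing at least n_overlap gene-set members when
    n_selected genes are drawn without replacement from n_background genes,
    of which n_in_set belong to the gene set."""
    total = comb(n_background, n_selected)
    upper = min(n_in_set, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_background - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 10 background genes, 4 in the pathway, 3 genes in the
# filtered list, 2 of which are pathway members.
p_value = hypergeom_enrichment_p(10, 4, 3, 2)
print(p_value)
```

Smaller values indicate that the overlap is less likely to arise by chance; in practice such p-values are corrected for testing many gene sets.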

## [Hierarchical clustering / Heatmap](/icm/analyses/analysis-functionality/task-menu/exploratory-analysis/hierarchical-clustering.md)

Visualize features in an informative way.

* Select **Hierarchical clustering / Heatmap** from the *Exploratory analysis* section.
* This task can be used for either a heatmap or a bubble map. Choose the task options that best suit your needs.
* Double-click on the output node to visualize the results in the Data viewer.
* Outcome:
  * Node: *Hierarchical clustering / heatmap*

<figure><img src="/files/vaYSIWsfr88LEEBd9P4N" alt=""><figcaption><p>Hierarchical clustering / heatmap result</p></figcaption></figure>

