# Gene set ANOVA

Gene set ANOVA lets you perform a one-way ANOVA to compare different groups at the gene set level. The method takes a normalized gene expression count matrix as input. A gene set is a group of genes defined by the specified database, such as a GO term or a KEGG pathway.

The model is set up as in [ANOVA](https://help.connected.illumina.com/icm/analyses/analysis-functionality/task-menu/statistics/differential-analysis/anova-limma-trend-limma-voom) for gene expression analysis, except that only one factor can be added to the model. In addition, the task automatically adds the following terms to the model:

* **Gene ID** - Since not all genes in a functional group are expressed at the same level, gene ID is added to the model to account for gene-to-gene differences.
* **Factor \* Gene ID** - The interaction of gene ID with the factor is added to detect changes in the expression pattern within a gene set across the levels of the factor, referred to as disruption. For instance, if some genes in a gene set are up-regulated in the treatment group while other genes in the same set are down-regulated in the treatment group, this is called gene set disruption.
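The terms listed above can be written out as a single model. This is a sketch in standard two-way ANOVA notation; the symbols are introduced here for illustration and are not taken from the software.

$$
y_{ijk} = \mu + F_i + G_j + (FG)_{ij} + \varepsilon_{ijk}
$$

Here $y_{ijk}$ is the normalized expression of gene $j$ in sample $k$ at level $i$ of the factor, $\mu$ is the overall mean, $F_i$ is the effect of the factor, $G_j$ is the gene ID effect, $(FG)_{ij}$ is the factor by gene ID interaction (disruption), and $\varepsilon_{ijk}$ is the error term.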

## Running Gene set ANOVA

Select the data node with normalized data and then go to **Biological interpretation > Gene set ANOVA**.

Use the first dialog to specify the gene set database. You can run gene set ANOVA on pathways (currently based on Kyoto Encyclopedia of Genes and Genomes ([KEGG](https://www.genome.jp/kegg/)) pathways) or on other gene set databases. The *Gene set size* option allows you to restrict your analysis to gene sets of a certain size (i.e. number of genes). Make sure the feature identifier in the data contains the gene symbol/gene name, which is used to map the data to the database.
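To make the effect of the *Gene set size* option and the gene symbol mapping concrete, here is a minimal Python sketch. It is not the task's implementation: the function name `filter_gene_sets`, the `min_size`/`max_size` parameters, and the example gene sets are all hypothetical, and it assumes size is counted from the gene set file.

```python
# Hypothetical illustration only: names, bounds, and gene sets are made up.

def filter_gene_sets(gene_sets, data_features, min_size=2, max_size=500):
    """Keep gene sets whose size lies within [min_size, max_size].

    gene_sets: dict mapping a gene set ID to a list of gene symbols.
    data_features: iterable of feature identifiers (gene symbols) in the data.
    Returns a dict mapping each retained gene set ID to the sorted list of
    its genes that are also present in the data.
    """
    features = set(data_features)
    kept = {}
    for set_id, genes in gene_sets.items():
        unique_genes = set(genes)
        # Assumption: size is counted from the gene set file, not the data.
        if min_size <= len(unique_genes) <= max_size:
            kept[set_id] = sorted(unique_genes & features)
    return kept


gene_sets = {
    "set_A": ["TP53", "MDM2", "CDKN1A"],
    "set_B": ["GAPDH"],
    "set_C": ["BRCA1", "BRCA2", "RAD51", "ATM"],
}
data_features = ["TP53", "MDM2", "BRCA1", "RAD51", "ATM", "ACTB"]

filtered = filter_gene_sets(gene_sets, data_features, min_size=2, max_size=3)
print(filtered)
```

In this example only `set_A` falls inside the size range, and only the genes whose symbols match a feature identifier in the data are carried forward.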

<div align="left"><figure><img src="https://580316046-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWMxqQAMFOJtu98OBk9KN%2Fuploads%2Fgit-blob-b41d764f854a30f3595a5dc7c94ec60ee14ec8df%2Fimage%20(191)%20(1).png?alt=media" alt=""><figcaption></figcaption></figure></div>

Once your choices are made, click **Next** to proceed.

In the second part of the setup, pick the experimental factor. Only one factor can be selected.

<div align="left"><figure><img src="https://580316046-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWMxqQAMFOJtu98OBk9KN%2Fuploads%2Fgit-blob-a9b050c5c062d7afc6335330ddb29a90dd77aebc%2Fimage%20(192)%20(1).png?alt=media" alt="" width="563"><figcaption></figcaption></figure></div>

Click **Next** to set up comparisons:

<div align="left"><figure><img src="https://580316046-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWMxqQAMFOJtu98OBk9KN%2Fuploads%2Fgit-blob-893a1c957374852546a904258ef2a9bae284ab8d%2FgseANOVA_comparison.png?alt=media" alt="" width="563"><figcaption></figcaption></figure></div>

The box on the left side displays the categories of the selected factor (shown as *Factor*). Use the arrow buttons (**>**) to move one of the categories to the *Denominator* box (that category is interpreted as the reference category) and another category to the *Numerator* box. Confirm your selection by clicking the **Add comparison** button, and the comparison will be added to the *Comparisons* table.

Click **Finish** to run. Each comparison will be performed individually and generate its own section in the report.

Click on the **Configure** icon to access the advanced options.

## Gene Set ANOVA Results

When the task completes, double-click on the **Gene Set ANOVA** node to view the report.

Like the [ANOVA](https://help.connected.illumina.com/icm/analyses/analysis-functionality/task-menu/statistics/differential-analysis/anova-limma-trend-limma-voom) report, the report consists of two parts: the result table on the right and the filter panel on the left.

<div align="left"><figure><img src="https://580316046-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWMxqQAMFOJtu98OBk9KN%2Fuploads%2Fgit-blob-7ad35277db70e5bfef42330995308aef5c5055a9%2FgseANOVA_report.png?alt=media" alt=""><figcaption></figcaption></figure></div>

The comparison (i.e. Denominator vs. Numerator) is given at the top of the table. Each row of the table corresponds to one gene set (pathway) and the gene sets are ranked by the first comparison's p-value in ascending order.

* *View*. The icons in the *View* column open the dot plot (![](https://580316046-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWMxqQAMFOJtu98OBk9KN%2Fuploads%2Fgit-blob-2c3a596488bc7e8a01edcac0cd03d629ab446c87%2Fimage%20\(91\)%20\(1\).png?alt=media)) or the extra details report (<img src="https://580316046-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWMxqQAMFOJtu98OBk9KN%2Fuploads%2Fgit-blob-97596bba2781585ced3f7689a5a0461eaf0d1bf0%2Fimage%20(248).png?alt=media" alt="" data-size="line">) (explanations below).
* *Gene set ID*. The gene set IDs are based on the gene set file that was selected during setup. Each ID is a link to the details of the selected set.
* *Gene set size*. Number of genes in the set (as specified in the gene set file). Click on the number to download the list of genes.
* For each comparison, the p-value, FDR, ratio, fold change, and LSmean of each comparison group are reported.
* *Disruption*. The factor and gene ID interaction term. A p-value and FDR are reported for this term as well.
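The relationship between the per-comparison columns listed above can be sketched in Python. This page does not state how the columns are computed, so the following rests on assumptions: that the ratio is the numerator LSmean divided by the denominator LSmean, that fold change is a signed version of the ratio, and that FDR is a Benjamini-Hochberg step-up adjustment. All function names are hypothetical.

```python
# Hypothetical sketch; the formulas below are assumptions, not confirmed
# behavior of the Gene set ANOVA task.

def ratio_to_fold_change(ratio):
    """Convert a positive ratio to a signed fold change.

    Assumed convention: ratios >= 1 are kept as-is, ratios < 1 become
    the negative reciprocal (e.g. 0.5 -> -2.0).
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ratio if ratio >= 1 else -1.0 / ratio


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg step-up adjusted p-values in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up LSmeans for one gene set and made-up p-values for four gene sets.
lsmean_numerator, lsmean_denominator = 30.0, 20.0
ratio = lsmean_numerator / lsmean_denominator
fold_change = ratio_to_fold_change(ratio)
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print(ratio, fold_change, fdr)
```

The signed fold change makes down-regulation easier to read than a ratio below 1, and the step-up adjustment keeps the adjusted values in the same order as the raw p-values.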

Click on the dot plot icon to open the viewer.

<div align="left"><figure><img src="https://580316046-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWMxqQAMFOJtu98OBk9KN%2Fuploads%2Fgit-blob-96f3eabfed9e815b35296c8375fdfb598731dcd6%2Fimage%20(135)%20(1).png?alt=media" alt="" width="563"><figcaption></figcaption></figure></div>

The plot displays the genes of the selected gene set. The X-axis represents the genes within the gene set, the Y-axis represents the mean gene expression value, and each dot represents one of the groups in the comparison.

Click on the **View extra details** icon (<img src="https://580316046-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWMxqQAMFOJtu98OBk9KN%2Fuploads%2Fgit-blob-97596bba2781585ced3f7689a5a0461eaf0d1bf0%2Fimage%20(248).png?alt=media" alt="" data-size="line">) to open a gene set-specific report page. The model used for the computation is included in this report.
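The values behind such a plot can be illustrated with a short Python sketch that computes one mean per gene and group. It assumes a simple arithmetic mean of the normalized values; the function name `gene_group_means`, the gene names, and the numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical illustration: one dot per (gene, group) pair, placed at the
# mean expression of that gene in that group.

def gene_group_means(records):
    """records: iterable of (gene, group, expression) tuples."""
    values = defaultdict(list)
    for gene, group, expression in records:
        values[(gene, group)].append(expression)
    return {key: mean(vals) for key, vals in values.items()}


records = [
    ("GENE1", "control", 4.0), ("GENE1", "control", 6.0),
    ("GENE1", "treated", 9.0), ("GENE1", "treated", 11.0),
    ("GENE2", "control", 8.0), ("GENE2", "control", 10.0),
    ("GENE2", "treated", 3.0), ("GENE2", "treated", 5.0),
]

means = gene_group_means(records)
for (gene, group), value in sorted(means.items()):
    print(gene, group, value)
```

In this made-up data, GENE1 is higher in the treated group while GENE2 is lower, which is the kind of opposing pattern the disruption term is meant to detect.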
