Biological Interpretation

What is the difference between GSEA and Gene Set Enrichment?

In Partek Flow, GSEA should be performed on a sample/cell and feature matrix data node (e.g. normalized count data). GSEA is used to detect a gene set or pathway that is significantly different between two groups. Gene set enrichment should be performed on a filtered gene list; it identifies overrepresented gene sets or pathways in that list using Fisher's exact test. Its input data is a filtered list that uses gene names.
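To illustrate how GSEA differs from a test on a filtered list, here is a minimal sketch of the running-sum statistic described in the original Broad Institute GSEA publication. It is not Partek Flow's implementation: the function name `gsea_running_sum`, its parameters, and the toy data are hypothetical, and the permutation step used to assess significance is omitted.

```python
def gsea_running_sum(ranked_genes, scores, gene_set, weight=1.0):
    """Maximum deviation from zero of a GSEA-style running sum.

    ranked_genes: gene names sorted by a ranking metric (e.g. a group difference).
    scores: the ranking metric for each gene, in the same order.
    gene_set: collection of gene names that belong to the pathway.
    weight: exponent applied to |score| for genes in the set (assumed default 1.0).
    """
    members = set(gene_set)
    in_set = [gene in members for gene in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    hit_total = sum(abs(s) ** weight for s, hit in zip(scores, in_set) if hit)
    if hit_total == 0:
        raise ValueError("genes in the set all have a zero ranking metric")

    running = 0.0
    best = 0.0
    for s, hit in zip(scores, in_set):
        if hit:
            # Step up, proportionally to the gene's ranking metric.
            running += abs(s) ** weight / hit_total
        else:
            # Step down by a constant amount for genes outside the set.
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy data (hypothetical): two genes ranked at the top, two at the bottom.
genes = ["A", "B", "C", "D"]
metric = [4.0, 3.0, -1.0, -2.0]
top_set_stat = gsea_running_sum(genes, metric, {"A", "B"})
bottom_set_stat = gsea_running_sum(genes, metric, {"C", "D"})
```

The key point is that every gene in the matrix contributes to the statistic through its rank, so no filtering step is needed beforehand, whereas Gene set enrichment only sees the genes that survived filtering.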

What is the enrichment score shown in the Gene Set Enrichment report?

The enrichment score shown in the enrichment report is the negative natural log of the enrichment p-value derived from Fisher's exact test. The higher the enrichment score, the more overrepresented the genes in our list are in the gene set of a GO/pathway category.
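As a rough sketch of that calculation, the snippet below computes a one-sided Fisher's exact p-value for the overlap between a filtered gene list and a gene set, then takes its negative natural log. The function names, the choice of a one-sided (overrepresentation) test, and the way the background is counted are assumptions for illustration; Partek Flow's exact contingency table may differ.

```python
import math


def enrichment_p_value(n_background, n_in_set, n_in_list, n_overlap):
    """One-sided Fisher's exact p-value: P(overlap >= n_overlap).

    n_background: total number of genes considered (assumed background).
    n_in_set: genes in the background that belong to the gene set.
    n_in_list: genes in the filtered list.
    n_overlap: genes that are in both the list and the gene set.
    """
    total = math.comb(n_background, n_in_list)
    upper = min(n_in_set, n_in_list)
    # Sum the hypergeometric probabilities of overlaps at least this large.
    tail = sum(
        math.comb(n_in_set, i) * math.comb(n_background - n_in_set, n_in_list - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total


def enrichment_score(p_value):
    """Negative natural log of the p-value."""
    return -math.log(p_value)


# Toy example: 10 background genes, 5 in the gene set,
# a filtered list of 5 genes that all fall in the gene set.
p = enrichment_p_value(10, 5, 5, 5)   # 1 / 252
score = enrichment_score(p)           # ln(252), about 5.53
```

Because the score is -ln(p), a p-value of 0.05 corresponds to a score of about 3, and smaller p-values give larger scores.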

In the KEGG pathway, genes can be colored by fold change, p-value, etc. How are the gene statistics calculated?

For Gene set enrichment analysis, only genes from the input data node (filtered gene list) will be colored in the KEGG pathway gene network, using the statistics in the data node.

During GSEA (or Gene set ANOVA) computation, we also perform ANOVA on each gene based on the selected attribute, independently of the GSEA computation (which is at the gene set level). The results of this ANOVA are only used to color the genes in the KEGG gene network. If GSEA is computed using another database, e.g. GO, we don't compute ANOVA on each gene since the GO database doesn't have gene network information.

When should I use GSEA or Gene set ANOVA?

Both methods should be performed on a normalized matrix data node, and both require gene symbols in the feature annotation. Both methods detect differentially expressed gene sets (pathways) rather than individual genes, but the algorithms are different. GSEA is a popular method from the Broad Institute. Gene set ANOVA is based on a generalized linear model; here are the details.
