Gene Ontology (GO) enrichment analysis has been incorporated into the gene expression, microRNA expression, exon, copy number, tiling, ChIP-Seq, RNA-Seq, miRNA-Seq and methylation workflows. The Gene Ontology Consortium website provides an excellent overview for new and experienced users of GO analysis. In brief, a common nomenclature for genes and gene products is used to group genes into a functional hierarchy. This enables analyses to be compared across all types of genomic data, even data from different species. Grouping genes of interest by biological process, cellular component and molecular function gives a broader understanding of experimental results. With the GO enrichment tool in Partek® Genomics Suite® you can take a list of genes (e.g. significantly differentially expressed genes) and see how they group in the functional hierarchy. This is analogous to stepping back from individual trees (genes) to see how the whole forest (gene ontology) is organized.
This tutorial illustrates how to:
Note: the workflow described below is enabled in Partek Genomics Suite version 7.0 software. Please fill out the form on our support page to request this version, or use the Help > Check for Updates command to check whether you have the latest released version. The screenshots shown in this tutorial may vary across platforms and across different versions of Partek Genomics Suite.
This tutorial will provide a step-by-step guide to performing GO enrichment analysis. The data set used is based on 51 subjects run on the Illumina Human Ref-8 BeadChip platform. Twenty-six of the subjects were categorized as "Young" with an age range of 18 to 28. The other 25 subjects were categorized as "Old" with an age range of 65 to 84. Skeletal muscle, a type of striated muscle tissue, was obtained via biopsy from each subject. The total RNA was extracted from the skeletal cells, prepared and run on the BeadChips producing the data that is used for this tutorial.
The paper this data is based on can be found at PLOS.
Data and associated files for this tutorial can be downloaded by going to Help > On-line Tutorials on the main menu toolbar within the Partek Genomics Suite software. Download the zipped file and store it on your local disk drive. There is no need to manually unzip the directory.
If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.
One of the main functions of GO enrichment is to find the overrepresentation of functional categories in a gene list. With the Gene_List.txt spreadsheet selected:
From the Gene Expression workflow, choose Biological Interpretation followed by Gene Set Analysis
Select the GO Enrichment radio button in the Gene Set Analysis dialog (Figure 1) followed by Next
In the next dialog, make sure the Gene_List.txt spreadsheet is chosen from the drop-down list and click Next
Figure 1. Gene set analysis dialog. Choose GO Enrichment and select Next
You can choose between the Fisher's Exact test and the Chi-Square test. Both tests compare the proportion of a gene list in a functional group to the proportion of genes in the background for that group. Both are acceptable, and you can always try both by re-running the analysis. You can also restrict the analysis to functional groups with more than or fewer than a specified number of genes. Restricting the analysis to GO groups with fewer than 150-200 genes will speed up the analysis and exclude large groups, which may not be very informative. If analysis time is not a concern, you can just use the default settings.
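To make the comparison of proportions concrete, here is a minimal Python sketch of both tests on a single functional group. It is an illustration only, not Partek's implementation: the function names and the four count parameters are hypothetical, the Fisher's Exact test is assumed to be one-sided (over-representation), and the Chi-Square test is assumed to be the plain Pearson statistic on a 2x2 table without continuity correction.

```python
from math import comb, erfc, sqrt


def fisher_exact_enrichment(in_list_in_group, list_size, group_size, background_size):
    """One-sided Fisher's exact p-value (hypergeometric upper tail).

    Probability of seeing at least `in_list_in_group` group members in a
    gene list of `list_size` genes drawn from `background_size` genes,
    of which `group_size` belong to the functional group.
    """
    max_overlap = min(list_size, group_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(group_size, k) * comb(background_size - group_size, list_size - k)
        for k in range(in_list_in_group, max_overlap + 1)
    )
    return tail / total


def chi_square_enrichment(in_list_in_group, list_size, group_size, background_size):
    """Pearson chi-square statistic and p-value (1 degree of freedom)."""
    a = in_list_in_group                 # in list, in group
    b = list_size - a                    # in list, not in group
    c = group_size - a                   # not in list, in group
    d = background_size - list_size - c  # not in list, not in group
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("every row and column of the 2x2 table must be non-empty")
    chi2 = background_size * (a * d - b * c) ** 2 / margins
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: 12 of 200 listed genes fall in a group of 80 genes,
# against a background of 20000 genes.
p_fisher = fisher_exact_enrichment(12, 200, 80, 20000)
chi2, p_chi2 = chi_square_enrichment(12, 200, 80, 20000)
print(p_fisher, chi2, p_chi2)
```

The exact test enumerates the hypergeometric tail, so it remains valid for small groups, while the chi-square approximation is cheaper but less reliable when expected counts are low.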
Select the Use Fisher's Exact test radio button (Figure 2)
Make sure the Invoke gene ontology browser on the result check box is selected
Leave all other settings as default and click Next
Figure 2. Configure the parameters of the GO enrichment test
Select the Default mapping file radio button and click Next
Figure 3. For an explanation of the different kinds of mapping file supported, click the help icon next to each one
A new spreadsheet (Figure 4) and the gene ontology browser (Figure 5) will appear.
Figure 4. GO enrichment output spreadsheet. Right-click a row header to perform additional tasks on a chosen GO term
The new spreadsheet (GO-Enrichment.txt) is a child spreadsheet of the gene list. The first column contains the GO functional groups, each of which falls into a broader category (biological process, cellular component or molecular function), shown in column 2. The GO functional groups are sorted by descending enrichment score, which is shown in column 3. The enrichment score is the negative natural logarithm of the enrichment p-value, which is shown in column 4. The higher the enrichment score, the more overrepresented a functional group is in the gene list. If a functional group has an enrichment score of over 1, it is overrepresented. As a rule of thumb, an enrichment score of about 3 corresponds to a p-value of 0.05, a common threshold for significant overrepresentation. For your data, you may wish to add a multiple test correction (e.g. FDR) by going to Stat > Multiple Test correction. We will not perform the multiple test correction for this tutorial.
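The relationship between the enrichment score and the p-value can be checked with a few lines of Python. The sketch below also includes a Benjamini-Hochberg adjustment as one common FDR procedure. Whether this matches the correction Partek Genomics Suite applies is an assumption, and all function names are hypothetical.

```python
import math


def enrichment_score(p_value):
    """Enrichment score as described above: negative natural log of the p-value."""
    return -math.log(p_value)


def p_value_from_score(score):
    """Inverse of enrichment_score."""
    return math.exp(-score)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


print(enrichment_score(0.05))       # close to 3, the rule-of-thumb threshold
print(p_value_from_score(29.9484))  # the score quoted later for one GO term
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Because the score is a logarithm, each additional unit of score divides the p-value by e (about 2.7), so a score near 30 corresponds to a p-value on the order of 1e-13.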
Additional columns help describe the enrichment score for each group, including the percentage of genes in the group that are present in the gene list, the number of genes in the group that are present in the list, and the total number of genes in the group. Because the original gene list was derived from statistical analysis, extra columns will appear for all p-values in the ANOVA model. For example, the Young/Old score and Gender score columns contain the negative natural logarithm of the geometric mean of the p-values for each marker/gene present in both the list and the group. These scores represent the level of differential expression of the genes in the functional group. The larger the score, the more differentially expressed the genes in the group are. A score of 3 or greater corresponds to an average p-value of roughly 0.05 or less. For example, the Young/Old score indicates how differentially expressed the genes present in both the list and a given group are between the "Young" and "Old" categories.
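As a worked illustration of the per-factor score, the negative natural logarithm of a geometric mean of p-values equals the arithmetic mean of the individual negative log p-values. The short Python sketch below uses that identity. The function name and the example p-values are hypothetical, and the formula follows the description above rather than any published Partek source code.

```python
import math


def group_score(p_values):
    """Negative natural log of the geometric mean of the given p-values.

    Computed as the mean of -ln(p), which is mathematically equivalent and
    avoids underflow when many small p-values are multiplied together.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    return -sum(math.log(p) for p in p_values) / len(p_values)


# Hypothetical Young/Old p-values for the listed genes in one functional group.
young_old_p = [0.001, 0.02, 0.04, 0.0005]
print(group_score(young_old_p))
```

A group whose genes all have p-values of 0.05 would score about 3, which is where the threshold mentioned above comes from.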
On the GO-Enrichment.txt spreadsheet, right-click on a row header of a functional group, such as hydrogen ion transmembrane transporter activity, which has an enrichment score of 29.9484, and choose Browse to GO term from the menu
Figure 5. Viewing a functional group on the gene ontology browser
The Gene Ontology Browser opens in a separate tab (Figure 5). A functional group viewed in the browser will show the hierarchical relationship to the other GO terms. The selected functional group will be highlighted on the left. In Figure 5, you can see the hydrogen ion transmembrane transporter activity is found in the tree molecular function > transporter activity > substrate-specific transmembrane transporter activity > cation-transporting ATPase activity > inorganic cation transmembrane transporter activity > monovalent inorganic cation transmembrane activity. On the right, a bar chart shows the sub-groups of the selected group and their respective enrichment scores.
On the GO-Enrichment.txt spreadsheet, right-click on a row header of a functional group and choose Term Details from the menu
A web browser will be opened and you will be re-directed to the AmiGO website, where you will find more details about the chosen GO term (internet connection required).
On the GO-Enrichment.txt spreadsheet, right-click on a row header of a functional group and choose Create Gene List from the menu
Figure 6. New gene list spreadsheet containing all the genes in the original list that belong to the chosen functional group
Partek Genomics Suite supports different types of mapping files (Figure 3). These are library files that define how genes are organized into functional groups. For an explanation of each type of mapping file, click on the help icon next to each one.
A new spreadsheet (gene-list) that contains the genes in the original list that belong to the chosen functional group will be created (Figure 6). Note that this spreadsheet is a Partek temporary (ptmp) file. To save it, click the Save icon.
The zipped project file contains several prepared files used in this analysis as well as the annotation information for the BeadChip. The zipped file also contains a Partek project file (.ppj).
After downloading the file, go to File > Import > Zipped project... and browse to GO_Enrichment.zip on your local drive
Partek Genomics Suite will automatically unzip the file, read the .ppj file, open and annotate all spreadsheets (Figure 1). The parent spreadsheet (GSE8479-AVGSignal) contains the original intensity data. The first child spreadsheet (ANOVAResults) contains the results of differential gene expression analysis from a 3-way ANOVA. The second child spreadsheet (Gene_List.txt) is a list of significantly differentially expressed genes. When working with your own data, you will need to detect differentially expressed genes and create a gene list yourself.
Figure 1. Viewing the Gene List spreadsheet