This tutorial outlines how to analyze miRNA expression data in Partek Genomics Suite and how to integrate it with mRNA expression data from gene expression microarrays.
This tutorial illustrates how to:
Note: the workflow described below is enabled in Partek Genomics Suite version 7.0 software. Please fill out the form on our support page to request this version or use the Help > Check for Updates command to check whether you have the latest released version. The screenshots shown within this tutorial may vary across platforms and across different versions of Partek Genomics Suite.
The data set for this tutorial includes miRNA from 3 human brain samples and 3 heart samples quantified using the Affymetrix GeneChip miRNA 1.0 array. The same sample set was also processed on GeneChip Human Gene 1.0 ST arrays for mRNA expression.
For this tutorial, the gene expression and miRNA expression studies have been analyzed and stored in Partek Genomics Suite project (ppj) format as miRNAmRNA integration. The project contains two Partek format files: Affy_miR_BrainHeart_intensities.fmt with the miRNA data and Affy_HuGeneST_BrainHeart_GeneIntensities.fmt with the analyzed mRNA data. There is also an ANOVA results spreadsheet open as a child spreadsheet of Affy_HuGeneST_BrainHeart_GeneIntensities.fmt.
Download the miRNA Expression and Integration with Gene Expression data set and save it in an easily accessible location on your computer
We can now open the project in Partek Genomics Suite.
Select File
Select Import
Select Zipped Project...
Select the miRNA_tutorial_data.zip zipped folder
The project files will open in the Analysis tab (Figure 1).
Figure 1. The miRNA tutorial data set
If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.
Typically, you would begin a miRNA expression analysis with the same steps outlined in the Importing Affymetrix CEL files section of the Gene Expression tutorial. Here, the data has already been imported and attributes added.
To begin our analysis, we will open the miRNA Expression workflow.
Select the miRNA Expression workflow from the Workflows drop-down menu
The miRNA Expression workflow provides a series of steps for analyzing miRNA expression data and integrating it with gene expression data (Figure 1).
Figure 1. The miRNA Expression workflow
Select the Affy_miR_BrainHeart_intensities spreadsheet
This is the probe intensities spreadsheet for the miRNA expression data (Figure 2). Each row is a sample; columns 7 to 9 give attribute information about each sample, including tissue, replicate number, and scan date, while columns 10 onward give probe intensity values.
Figure 2. Viewing the miRNA probe intensities spreadsheet
Select PCA Scatter Plot from the QA/QC section of the workflow
A new tab will open showing a PCA scatter plot (Figure 3).
Figure 3. PCA scatter plot. Samples are spheres. Samples with more similar miRNA expression are close together while dissimilar samples are further apart.
In this PCA scatter plot, each point represents a sample in the spreadsheet. Points that are close together in the plot are more similar, while points that are far apart in the plot are more dissimilar.
To better view the data, we can rotate the plot.
Click and drag to rotate the plot
Rotating the plot allows us to look for outliers in the data on each of the three principal components (PC1-3). The percentage of the total variation explained by each PC is listed by its axis label. The chart label shows the sum percentage of the total variation explained by the displayed PCs.
Here, we can see that the brain and heart samples are well separated across PC1, which is expected.
For more information about customizing the plot, please see Exploring the data set with PCA from the Gene Expression with Batch Effect tutorial.
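As a rough illustration of how the percentages on the axis labels relate to the total shown in the chart label, the sketch below converts a set of principal component variances (eigenvalues) into the percentage of total variation per PC and sums the first three. The eigenvalues are made-up illustration values, not results from this data set, and the function name `percent_variation` is hypothetical.

```python
def percent_variation(eigenvalues):
    """Return the percentage of total variation explained by each PC."""
    total = sum(eigenvalues)
    return [100.0 * value / total for value in eigenvalues]


# Hypothetical eigenvalues (variance captured by each PC), largest first.
example_eigenvalues = [5.0, 3.0, 1.5, 0.5]
per_pc = percent_variation(example_eigenvalues)

# The chart label corresponds to the sum over the displayed PCs (PC1-3).
displayed_total = sum(per_pc[:3])

for index, pct in enumerate(per_pc[:3], start=1):
    print(f"PC{index}: {pct:.1f}%")
print(f"PC1-3 combined: {displayed_total:.1f}%")
```

If most of the variation sits on PC1 and the tissues separate along that axis, tissue type is likely the dominant source of variation in the data.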
Next, we will identify miRNAs that are differentially expressed between brain and heart tissues.
Select the Analysis tab
Select the Affy_miR_BrainHeart_intensities spreadsheet
Select Detect Differentially Expressed miRNAs from the Analysis section of the workflow
The ANOVA dialog (Figure 4) allows us to configure the comparisons we want to make between samples and groups within the data set.
Figure 4. ANOVA dialog
Select Tissue from the Experimental Factor(s) panel
Select Add Factor > to move Tissue to the ANOVA Factor(s) panel
The Contrasts... button will now be available to select.
Select Contrasts...
The Configure ANOVA dialog (Figure 5) is used to set up contrasts. Contrasts are the comparisons between groups and are where experimental questions can be asked. In this study, we are asking what miRNAs are differentially expressed between heart and brain tissue.
Figure 5. ANOVA configuration dialog
Select Yes for Data is already log transformed?
Select Fold change for Report comparisons as
Select 7. Tissue from the Select Factor/Interaction drop-down menu
Select brain from the left panel
Select Add Contrast Level > to move brain to the upper group - initially Group 1
Select heart from the left panel
Select Add Contrast Level > to move heart to the lower group - initially Group 2
This contrast (Figure 6) will compare expression of miRNAs in brain samples to expression in heart samples with brain as the numerator and heart as the denominator for fold-change calculations.
Figure 6. Configuring a contrast between brain and heart tissue in the ANOVA dialog
Select Add Contrast
Select OK
The Contrasts... button should now read Contrasts Included.
Select OK to run the ANOVA as configured
An ANOVA Results sheet, ANOVAResults, will be created as a child spreadsheet of Affy_miR_BrainHeart_intensities (Figure 7). In this spreadsheet, each row represents a probe set and the columns represent the computation results for that probe set. Although not synonymous, probe set and gene will be treated as synonyms in this tutorial for convenience. By default, the genes are sorted in ascending order by the p-value of the first categorical factor, which, in this case, is Tissue. This means the most significant differentially expressed miRNAs between the brain and heart (up-regulated and down-regulated) are at the top of the spreadsheet.
Figure 7. Viewing the ANOVA results spreadsheet
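To make the numerator/denominator wording of the contrast concrete, here is a minimal sketch of a fold-change calculation on log-transformed intensities. It assumes the intensities are log base 2 and uses a common signed convention in which ratios below 1 are reported as the negative reciprocal; both points are assumptions for illustration, not a description of the exact computation inside Partek Genomics Suite. The function name and intensity values are hypothetical.

```python
def signed_fold_change(numerator_log2, denominator_log2):
    """Signed fold change between two groups of log2 intensities.

    Assumption: values are log2-transformed, so the ratio of group means
    on the linear scale is 2 ** (difference of log2 means).
    """
    mean_num = sum(numerator_log2) / len(numerator_log2)
    mean_den = sum(denominator_log2) / len(denominator_log2)
    ratio = 2.0 ** (mean_num - mean_den)
    # Report down-regulation as a negative number, e.g. 0.25 becomes -4.
    return ratio if ratio >= 1.0 else -1.0 / ratio


# Hypothetical log2 intensities for one probe set.
brain = [10.0, 10.2, 9.8]
heart = [8.0, 8.1, 7.9]

brain_vs_heart = signed_fold_change(brain, heart)   # brain is the numerator
heart_vs_brain = signed_fold_change(heart, brain)   # groups swapped
print(round(brain_vs_heart, 2), round(heart_vs_brain, 2))
```

Swapping the two groups flips the sign but not the magnitude, which is why the order of the groups in the contrast matters when reading the fold-change column.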
You may explore what is known about any listed miRNA using external databases TargetScan, miRBase, microRNA.org, or miR2Disease, by right-clicking a row header, selecting Find miRNA in... and choosing one of the external databases. This will open a web page in your default web browser and requires your computer be connected to the internet.
For more information about ANOVA in Partek Genomics Suite, see Identifying differentially expressed genes using ANOVA.
The ANOVA results spreadsheet includes every miRNA on the array for a total of 7815 miRNAs. However, many of these miRNAs are not significantly differentially expressed between brain and heart and, thus, are not of interest. Next, we will create a filtered list of significantly differentially expressed miRNAs.
Select the ANOVAResults spreadsheet
Select Create List from the Analysis section of the workflow
The List Manager dialog will open (Figure 8).
Select brain vs. heart under Contrast: find genes that change between two categories
By default, the fold-change and significance thresholds are set to > 2, < -2 and p-value with FDR < 0.05. These defaults are appropriate for this tutorial so we will leave them in place.
Select Create to create a new list, brain vs. heart, containing only the 1404 miRNAs that pass the criteria
Figure 8. Creating a list of significantly differentially expressed miRNAs
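The default criteria amount to a two-sided fold-change cutoff combined with an FDR-adjusted p-value cutoff. A minimal sketch of that logic follows; the row layout, probe set names, and values are hypothetical and only illustrate how the two conditions combine.

```python
def passes_thresholds(fold_change, fdr_p_value, min_fold=2.0, max_fdr=0.05):
    """True if the signed fold change is > 2 or < -2 and the FDR p-value is < 0.05."""
    changed = fold_change > min_fold or fold_change < -min_fold
    return changed and fdr_p_value < max_fdr


# Hypothetical ANOVA result rows (not taken from the tutorial data).
anova_rows = [
    {"probeset": "miR-A", "fold_change": 12.5, "fdr_p": 0.001},
    {"probeset": "miR-B", "fold_change": -3.2, "fdr_p": 0.020},
    {"probeset": "miR-C", "fold_change": 1.4, "fdr_p": 0.001},   # change too small
    {"probeset": "miR-D", "fold_change": -6.0, "fdr_p": 0.300},  # not significant
]

significant = [
    row for row in anova_rows
    if passes_thresholds(row["fold_change"], row["fdr_p"])
]
print([row["probeset"] for row in significant])
```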
A new spreadsheet, brain vs. heart, will be created as a child spreadsheet of Affy_miR_BrainHeart (Figure 9).
Figure 9. Viewing brain vs. heart spreadsheet
To view the miRNAs with the largest difference between tissues, we can sort by fold-change.
Right-click the 6. Fold-Change(brain vs. heart) column header
Select Sort Descending by Absolute Value from the pop-up menu
The top 33 miRNAs we see (Figure 10) are all miR-124 from different species. The miRNA miR-124 is the most abundant miRNA in neuronal cells so this finding is expected. The multiple species versions of miR-124 are present because Affymetrix GeneChip miRNA arrays provide comprehensive coverage of miRNAs from multiple organisms including human, mouse, rat, canine, monkey, and many more on a single chip. The miRNAs from these different species are highly homologous so probes targeting miRNAs from other species will hybridize with human miRNAs. Therefore, we need to filter the list of miRNAs to include only human miRNAs.
Figure 10. miR-124 is highly differentially expressed in brain vs. heart
To do this, we need to add a new annotation column containing species information for each probe.
Right-click on the 2. Probeset ID column header
Select Insert Annotation from the pop-up menu
Select Add as categorical
Check Species Scientific Name (Figure 11)
Select OK to add the annotation column
Figure 11. Inserting species annotation column
The table now includes a column 3. Species Scientific Name with the species name of each miRNA. We can now filter to include only human miRNAs.
Right-click the 3. Species Scientific Name column header
Select Find / Replace / Select... from the pop-up menu
Type Homo sapiens for Find What
Select Only in column for Search
Select 3. Species Scientific Name from the drop-down menu next to the Only in column option
Select Select All (Figure 12)
Figure 12. Configuring the Find / Replace / Select... dialog
The search should find and select 251 miRNAs.
Select Close
Right-click any of the row headers that are selected
Select Filter Include from the pop-up menu
The spreadsheet will now include only the 251 miRNAs from human (Figure 13). The first row is still miR-124 with a fold change of 4087.94. The black and gold bar on the right-hand side of the spreadsheet indicates the fraction of rows that have been filtered. To retain this filtered list, we can create a new spreadsheet.
Figure 13. Viewing differentially expressed human miRNAs
Right-click the brain_vs_heart spreadsheet in the spreadsheet tree
Select Clone... from the pop-up menu
Cloning a spreadsheet while a filter is applied copies only the included rows/columns.
Name the spreadsheet brain_vs_heart_human
Select Affy_miR_BrainHeart_intensities from the drop-down menu Create new spreadsheet as a child of spreadsheet
Name the new file brain vs. heart human
Select Save
The new spreadsheet includes only the 251 human miRNAs that are significantly differentially expressed between brain and heart tissue (Figure 14).
Figure 14. Viewing the filtered human miRNAs spreadsheet
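Outside the spreadsheet interface, the same two operations (sorting by absolute fold change and keeping only human probes) can be sketched in a few lines. The rows below are hypothetical placeholders rather than values from the tutorial data, and the column names are assumptions chosen for readability.

```python
# Hypothetical rows from a list of differentially expressed miRNAs.
example_rows = [
    {"probeset": "hsa-miR-X_st", "species": "Homo sapiens", "fold_change": -8.0},
    {"probeset": "mmu-miR-X_st", "species": "Mus musculus", "fold_change": 40.0},
    {"probeset": "hsa-miR-Y_st", "species": "Homo sapiens", "fold_change": 25.0},
]

# Equivalent of Filter Include on rows whose species is Homo sapiens.
human_rows = [row for row in example_rows if row["species"] == "Homo sapiens"]

# Equivalent of Sort Descending by Absolute Value on the fold-change column.
human_rows.sort(key=lambda row: abs(row["fold_change"]), reverse=True)

for row in human_rows:
    print(row["probeset"], row["fold_change"])
```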
The next step in our analysis will be integrating miRNA and gene expression data.
Principal Components Analysis (PCA) is an excellent method to visualize similarities and differences between the samples in a data set. PCA can be invoked through a workflow, by selecting from the main command bar, or by selecting Scatter Plot from the View section of the main toolbar. We will use a workflow.
Select () to activate Rotate Mode
miRNAs regulate gene expression at the post-transcriptional level by base-pairing with the three prime untranslated region (3’ UTR) of the target gene, causing cleavage/degradation of the cognate mRNA or preventing translation initiation. Integration of miRNA expression with gene expression data to study the overall network of gene regulation is vital to understanding miRNA function in a given sample. Partek Genomics Suite provides a platform that can analyze miRNA and gene expression data independently, yet allows data to be integrated for downstream analysis.
This integrative analysis can be accomplished at several different levels. If you only have miRNA data, then Partek Genomics Suite can search the predicted gene targets in a miRNA-mRNA database like TargetScan to provide a list of genes that might be regulated by the differentially expressed miRNAs. Alternatively, if you have only gene expression data, Partek Genomics Suite can use the same database to identify the miRNAs that putatively regulate those differentially expressed genes in a statistically significant manner. If you have gene expression data and miRNA data from comparable tissue/species, Partek Genomics Suite can combine the results of these separate experiments into one spreadsheet. Lastly, if miRNA and mRNA from the same source were analyzed (as in this tutorial), then you may statistically correlate the results of the miRNA and gene expression assays.
This application is useful in the case where you have miRNA expression data, but not gene expression data. Using a database like TargetScan, microCosm, or a custom database, you can identify the list of genes that are predicted to be regulated by these differentially expressed miRNAs and then perform Biological Interpretation tasks on the list of genes.
Select Combine miRNAs with their mRNA targets from the miRNA Integration section of the miRNA Expression workflow
Select the Get All Targets tab
Select TargetScan7.1 for Database Name
Select brain vs. heart human for Spreadsheet Name
Set Column with microRNA labels to 2. Probeset ID
Name the Result file PutativeGenes
Select OK (Figure 1)
Figure 1. Identifying all predicted gene targets of differentially expressed miRNAs
This will create a new spreadsheet PutativeGenes that contains a miRNA and a putative gene target in each row. Because each miRNA can regulate multiple genes, the list will be much longer than the input miRNA list. Each row contains a gene so this spreadsheet can be analyzed using GO Enrichment and Pathway Enrichment tasks from the Biological Interpretation section of the workflow.
Another useful way to analyze this list is to determine which genes could be targeted by multiple miRNAs in the list. To do this:
Right-click on the column 13. Gene Symbol header
Select Create List With Occurrence Counts from the pop-up menu (Figure 2)
Figure 2. Creating an occurrence counts list from the list of putative miRNA target genes
The new spreadsheet is a temporary spreadsheet listing each gene in alphabetical order and giving the occurrence count of each. Sorting the count column in descending order will list the gene with the most occurrences first (Figure 3).
Figure 3. Occurrence list of putative miRNA target genes
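Conceptually, the two steps above expand each miRNA into one row per predicted target and then count how often each gene appears. The sketch below shows that idea with a hypothetical, hand-written target mapping; it does not read TargetScan, and the miRNA and gene names are placeholders.

```python
from collections import Counter

# Hypothetical target database: miRNA name -> predicted target gene symbols.
target_db = {
    "miR-A": ["GENE1", "GENE2", "GENE3"],
    "miR-B": ["GENE2", "GENE3"],
    "miR-C": ["GENE3"],
}
mirna_list = ["miR-A", "miR-B", "miR-C"]

# One (miRNA, gene) pair per row, so the result is longer than the input list.
pairs = [(mirna, gene) for mirna in mirna_list for gene in target_db.get(mirna, [])]

# Count how many listed miRNAs are predicted to target each gene.
occurrence = Counter(gene for _, gene in pairs)

# Most frequently targeted genes first.
ranked = occurrence.most_common()
print(ranked)
```

Genes near the top of the ranked list are predicted targets of several differentially expressed miRNAs at once, which can make them candidates for closer inspection.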
This application is useful when you only have gene expression results or a gene list of interest and want to identify which miRNAs might regulate the genes. Using a database like TargetScan, you can create a list of miRNAs that are statistically predicted to regulate those genes. miRNAs of particular interest could then be explored using a lower-throughput technique like RT-qPCR.
Using the gene list as input, a Fisher's Exact right-tailed p-value is calculated to show the overrepresentation of genes of interest for each miRNA in the database. The smaller the p-value, the more overrepresented that miRNA's predicted targets are in the gene list. Target associations are taken from a database, TargetScan in this example. If the input list is a filtered list of genes from an ANOVA calculation, the parent spreadsheet is used to identify the background list of genes from the array. Genes in the array but not in the significant gene list will be treated as background in the calculations.
To begin, we need to create a list of significant genes using the ANOVAResults gene spreadsheet.
Select the ANOVAResults gene spreadsheet in the spreadsheet tree
Select Create List from the workflow
Select Brain vs. Heart
Set the Save list as to brain vs. heart genes
Leave other fields at their default values (Figure 4)
Select Create
Figure 4. Creating a list of significantly differentially expressed genes
Select Close to exit the List Manager dialog
We will now use this list to identify overrepresented miRNA target sets.
Select Find overrepresented miRNA target sets from the miRNA Integration section of the workflow
Select TargetScan 7.1 from the Target Database drop-down menu
Select brain vs. heart genes from the mRNA Spreadsheet drop-down menu
Select 4. Gene Symbol from the Column with gene symbols drop-down menu (Figure 5)
Select OK
Figure 5. Finding enriched miRNA target sets
A new spreadsheet, enrichedAssociations, will be created with miRNAs from the database on rows (Figure 6). Column 1 contains the miRNA name and column 2 shows its p-value. The smaller the p-value, the more significant it is. Column 3 contains the number of genes from the (input) significant gene list that are targeted by this microRNA and Column 7 shows the number of significant genes from the input list that are not targeted by this microRNA. Columns 4 and 5 contain the number of significantly up- and down-regulated genes from the input significant gene list targeted by the miRNA. Column 6 shows the number of background genes (genes on the array but not in the input significant gene list) that are targeted by the miRNA and Column 8 shows the number of background genes on the array that are not targeted by the miRNA. The numbers in columns 3, 6, 7 and 8 will be used to calculate the Fisher’s Exact (right-tailed) p-value, a measure of the overrepresentation of the predicted miRNAs within a gene set.
Figure 6. Output of the Find Overrepresented miRNA Target Sets tool
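The four counts in columns 3, 7, 6, and 8 form a 2x2 table, and a right-tailed Fisher's Exact p-value is the hypergeometric probability of seeing at least as many targeted genes in the significant list as were observed. The following is a minimal standard-library sketch of that calculation; the function name and the example counts are hypothetical, and it is not a reproduction of the software's internal implementation.

```python
from math import comb


def fisher_right_tail(sig_targeted, sig_not_targeted, bg_targeted, bg_not_targeted):
    """Right-tailed Fisher's Exact p-value for a 2x2 table.

    Rows: significant genes vs. background genes.
    Columns: targeted by the miRNA vs. not targeted.
    """
    n_sig = sig_targeted + sig_not_targeted
    n_targeted = sig_targeted + bg_targeted
    n_total = n_sig + bg_targeted + bg_not_targeted

    # Number of ways to choose the significant list from all genes.
    denominator = comb(n_total, n_sig)

    # Sum the probabilities of the observed overlap and every larger overlap.
    upper = min(n_sig, n_targeted)
    tail = sum(
        comb(n_targeted, k) * comb(n_total - n_targeted, n_sig - k)
        for k in range(sig_targeted, upper + 1)
    )
    return tail / denominator


# Hypothetical counts: 3 of 4 significant genes are targets,
# 1 of 4 background genes is a target.
p_value = fisher_right_tail(3, 1, 1, 3)
print(round(p_value, 4))
```

A small p-value means that the miRNA's predicted targets occur in the significant gene list more often than would be expected if the list were drawn at random from the array.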
As the enrichment p-values have not been corrected for running multiple statistical tests, we can use the multiple test correction feature of Partek Genomics Suite to adjust the p-values.
Select the enrichedAssociations spreadsheet
Select Stat from the main menu toolbar
Select Multiple Test Correction
Select all the multiple test correction options
Transfer Enrichment p-value to the Selected Column(s) panel from the Candidate Column(s) panel (Figure 7)
Figure 7. Configuring the Multiple Test Correction dialog
Columns for each of the test correction methods will be added to the enrichedAssociations spreadsheet and can be used to filter the list of miRNAs.
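For readers who want to see what such corrections do to a list of p-values, the sketch below implements two widely used methods, Bonferroni and the Benjamini-Hochberg step-up FDR procedure. Whether these correspond exactly to the options offered in the dialog is an assumption; the p-values are made-up illustration values.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg step-up adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is
    # the minimum of p * m / rank over this rank and all larger ranks.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical enrichment p-values for four miRNAs.
raw_p = [0.01, 0.04, 0.03, 0.005]
bonf = bonferroni(raw_p)
fdr = benjamini_hochberg(raw_p)
print([round(p, 4) for p in bonf])
print([round(p, 4) for p in fdr])
```

Bonferroni is the more conservative of the two, so fewer miRNAs will typically pass a given cutoff when filtering on its column.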
This option is useful if you have miRNA and gene expression experiments you want to compare. The samples should be comparable, but do not have to originate from the same specimens.
Select Combine miRNAs with their mRNA targets from the miRNA Integration section of the workflow
Select the Get Targets from Spreadsheet tab
Select TargetScan 7.1 from the Target Database drop-down menu
Select brain vs. heart human from the microRNA Spreadsheet drop-down menu
Select 2. Probeset ID for Column with microRNA labels
Select ANOVAResults gene from the mRNA Spreadsheet drop-down menu
Select 4. Gene Symbol for Column with gene symbols (Figure 8)
Select OK
Figure 8. Combining miRNAs with their mRNA targets
In the new spreadsheet, each row represents a specific miRNA associated with one of its target genes; a single miRNA can have multiple targets. For example, hsa-miR-133b_st has 659 rows, one for each target (Figure 9).
Figure 9. Viewing the combined spreadsheet with miRNAs and mRNA targets
Columns 1-12 are taken from the miRNA expression source spreadsheet while columns 13-26 are taken from the gene expression source spreadsheet.
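The combined spreadsheet can be thought of as a join of the miRNA results and the gene results through the target database. The sketch below illustrates the idea with hypothetical names and values; in particular, dropping predicted targets that are absent from the gene spreadsheet is an assumption made for this example, not documented behavior.

```python
# Hypothetical miRNA results keyed by miRNA label.
mirna_results = {
    "miR-A": {"mirna_fold_change": 6.0},
    "miR-B": {"mirna_fold_change": -4.5},
}

# Hypothetical gene expression results keyed by gene symbol.
gene_results = {
    "GENE1": {"gene_fold_change": -3.0},
    "GENE2": {"gene_fold_change": 2.5},
}

# Hypothetical target database: miRNA label -> predicted target gene symbols.
target_pairs = {
    "miR-A": ["GENE1", "GENE2"],
    "miR-B": ["GENE2", "GENE9"],  # GENE9 is not in the gene results
}

combined = []
for mirna, mirna_columns in mirna_results.items():
    for gene in target_pairs.get(mirna, []):
        if gene in gene_results:  # assumption: unmatched targets are skipped
            combined.append(
                {"mirna": mirna, **mirna_columns, "gene": gene, **gene_results[gene]}
            )

for row in combined:
    print(row)
```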
This application is useful when you have miRNA and mRNA expression data from the same samples and want to correlate the findings to determine whether up- or down-regulated miRNAs result in gene expression changes in their cognate genes. Pearson and Spearman correlation coefficients and their p-values are calculated.
Select Correlate miRNA and mRNA data from the miRNA Integration section of the workflow
Select TargetScan7.1 from the Target Database drop-down menu
Select Affy_miR_BrainHeart_intensities for the microRNA spreadsheet using the drop-down menu
Select Affy_HuGeneST_BrainHeart_GeneIntensities as the mRNA spreadsheet using the drop-down menu (Figure 10)
Select OK
Figure 10. Configuring the Correlate miRNA-mRNA dialog
Next, select the SampleID column from each spreadsheet. These must match.
Select 6. SampleID for Affy_miR_BrainHeart_intensities
Select 6. SampleID for Affy_HuGeneST_BrainHeart_GeneIntensities
Select OK (Figure 11)
Figure 11. Choosing matching Sample ID columns
The results are written to a new spreadsheet, correlation.txt (Figure 12). Each row contains one miRNA correlated with one of its target genes. The first column contains the miRNA probeset ID from the miRNA intensities spreadsheet. The second column contains the mRNA probeset ID from the gene expression intensities spreadsheet. The third column lists the gene symbol and the fourth the miRNA name. The fifth and sixth columns are the Pearson correlation coefficient and its p-value for the gene-miRNA pair. The seventh and eighth columns are the Spearman's rank correlation coefficient and its p-value for the gene-miRNA pair. Negative correlation indicates that a high level of the miRNA is associated with a low expression level of its target gene. Positive correlation indicates that a high level of the miRNA is associated with a high level of its target gene.
Figure 12. Viewing the correlation spreadsheet
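As a minimal sketch of the two coefficients reported for each miRNA-gene pair, the code below computes Pearson and Spearman correlations across matched samples using only the standard library. The intensity values are hypothetical, the helper names are illustrative, and the p-values are omitted because they require a t-distribution that the standard library does not provide.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical intensities for one miRNA and one target gene in six matched samples.
mirna_intensity = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_intensity = [12.0, 10.0, 9.0, 7.0, 4.0, 1.0]

r_pearson = pearson(mirna_intensity, gene_intensity)
r_spearman = spearman(mirna_intensity, gene_intensity)
print(round(r_pearson, 3), round(r_spearman, 3))
```

With only six samples, as in this tutorial, correlation coefficients are estimated from very few points, so they are best treated as a way to prioritize pairs rather than as strong evidence of regulation.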
We can visualize the correlation between any miRNA and target gene.
Right-click a row header
Select Scatter Plot (Orig. Data) from the pop-up menu
The correlation plot shows miRNA intensity on the x-axis and gene expression on the y-axis (Figure 13). Here, we see a negative correlation between expression of xtr-miR-148a_st and its target gene, RAB14, in brain and heart tissues. Drawing the scatter plot will create a temporary file with miRNA and gene expression probe intensities for all samples that is used to draw the plot.
Figure 13. Viewing the scatter plot showing correlated miRNA and target gene expression
Please note that the correlation function is only useful for identifying miRNAs that affect mRNA stability, not translation.
We will not be using this temporary spreadsheet moving forward. You can close the spreadsheet by selecting