Analyze differentially expressed miRNAs

Typically, you would begin a miRNA expression analysis with the same steps outlined in the Importing Affymetrix CEL files section of the Gene Expression tutorial. Here, the data has already been imported and attributes added.

To begin our analysis, we will open the miRNA Expression workflow.

  • Select the miRNA Expression workflow from the Workflows drop-down menu

The miRNA Expression workflow provides a series of steps for analyzing miRNA expression data and integrating it with gene expression data (Figure 1).

Figure 1. The miRNA Expression workflow

Exploratory data analysis

Principal Components Analysis (PCA) is an excellent method to visualize similarities and differences between the samples in a data set. PCA can be invoked through a workflow, by selecting from the main command bar, or by selecting Scatter Plot from the View section of the main toolbar. We will use a workflow.

  • Select the Affy_miR_BrainHeart_intensities spreadsheet

This is the probe intensities spreadsheet for the miRNA expression data (Figure 2). Each row is a sample; columns 7 to 9 give attribute information about each sample, including tissue, replicate number, and scan date, while columns 10 onward give probe intensity values.

Figure 2. Viewing the miRNA probe intensities spreadsheet

  • Select PCA Scatter Plot from the QA/QC section of the workflow

A new tab will open showing a PCA scatter plot (Figure 3).

Figure 3. PCA scatter plot. Samples are spheres. Samples with more similar miRNA expression are close together while dissimilar samples are further apart.

In this PCA scatter plot, each point represents a sample in the spreadsheet. Points that are close together in the plot are more similar, while points that are far apart in the plot are more dissimilar.

To better view the data, we can rotate the plot.

  • Select () to activate Rotate Mode

  • Click and drag to rotate the plot

Rotating the plot allows us to look for outliers in the data on each of the three principal components (PC1-3). The percentage of the total variation explained by each PC is listed by its axis label. The chart label shows the sum percentage of the total variation explained by the displayed PCs.

Here, we can see that the brain and heart samples are well separated across PC1, which is expected.
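As a rough illustration of the percentages shown on the axis and chart labels, the sketch below converts a set of eigenvalues into percent variation explained. The eigenvalues are hypothetical and are not taken from the tutorial data; how Partek Genomics Suite computes them internally is not described here.

```python
# Hypothetical eigenvalues of the sample covariance matrix, largest first.
# These are illustrative numbers, not values from the tutorial data set.
eigenvalues = [42.0, 9.0, 5.0, 3.0, 1.0]

total = sum(eigenvalues)
percent_explained = [100.0 * ev / total for ev in eigenvalues]

# The axis labels correspond to the per-PC values; the chart label
# corresponds to the sum over the three displayed PCs.
pc1, pc2, pc3 = percent_explained[:3]
displayed_total = pc1 + pc2 + pc3

print(f"PC1 {pc1:.1f}%, PC2 {pc2:.1f}%, PC3 {pc3:.1f}%")
print(f"Displayed PCs explain {displayed_total:.1f}% of total variation")
```

A large share on PC1 combined with clear tissue separation along that axis would suggest that tissue type is the dominant source of variation.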

For more information about customizing the plot, please see Exploring the data set with PCA from the Gene Expression with Batch Effect tutorial.

Detecting differentially expressed miRNAs

Next, we will identify miRNAs that are differentially expressed between brain and heart tissues.

  • Select the Analysis tab

  • Select the Affy_miR_BrainHeart_intensities spreadsheet

  • Select Detect Differentially Expressed miRNAs from the Analysis section of the workflow

The ANOVA dialog (Figure 4) allows us to configure the comparisons we want to make between samples and groups within the data set.

Figure 4. ANOVA dialog

  • Select Tissue from the Experimental Factor(s) panel

  • Select Add Factor > to move Tissue to the ANOVA Factor(s) panel

The Contrasts... button will now be available to select.

  • Select Contrasts...

The Configure ANOVA dialog (Figure 5) is used to set up contrasts. Contrasts are the comparisons between groups and are where experimental questions can be asked. In this study, we are asking what miRNAs are differentially expressed between heart and brain tissue.

Figure 5. ANOVA configuration dialog

  • Select Yes for Data is already log transformed?

  • Select Fold change for Report comparisons as

  • Select 7. Tissue from the Select Factor/Interaction drop-down menu

This contrast (Figure 6) will compare expression of miRNAs in brain samples to expression in heart samples with brain as the numerator and heart as the denominator for fold-change calculations.

Figure 6. Configuring a contrast between brain and heart tissue in the ANOVA dialog
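To make the numerator and denominator roles concrete, here is a minimal sketch of a signed fold change computed from log2-transformed intensities. It assumes the common convention that ratios below 1 are reported as negative reciprocals, which is consistent with thresholds such as > 2 and < -2; the function name and intensity values are hypothetical, and the ANOVA in Partek Genomics Suite may derive its group means differently.

```python
from statistics import mean


def signed_fold_change(numerator_log2, denominator_log2):
    """Signed fold change from two groups of log2 intensities.

    Assumed convention: ratios below 1 are reported as -1/ratio, so a
    two-fold decrease appears as -2 rather than 0.5.
    """
    log2_ratio = mean(numerator_log2) - mean(denominator_log2)
    ratio = 2.0 ** log2_ratio
    return ratio if ratio >= 1.0 else -1.0 / ratio


# Hypothetical log2 intensities for one probe set (3 samples per tissue).
brain = [12.0, 12.5, 11.5]
heart = [2.0, 1.5, 2.5]

print(signed_fold_change(brain, heart))  # brain as numerator
print(signed_fold_change(heart, brain))  # swapping the groups flips the sign
```

With brain as the numerator, a positive value means higher expression in brain and a negative value means higher expression in heart.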

  • Select Add Contrast

  • Select OK

The Contrasts... button should now read Contrasts Included.

  • Select OK to run the ANOVA as configured

An ANOVA Results sheet, ANOVAResults, will be created as a child spreadsheet of Affy_miR_BrainHeart_intensities (Figure 7). In this spreadsheet, each row represents a probe set and the columns represent the computation results for that probe set. Although not synonymous, probe set and gene will be treated as synonyms in this tutorial for convenience. By default, the genes are sorted in ascending order by the p-value of the first categorical factor, which, in this case, is Tissue. This means the most significantly differentially expressed miRNAs between the brain and heart (both up-regulated and down-regulated) are at the top of the spreadsheet.

Figure 7. Viewing the ANOVA results spreadsheet

You may explore what is known about any listed miRNA using external databases TargetScan, miRBase, microRNA.org, or miR2Disease, by right-clicking a row header, selecting Find miRNA in... and choosing one of the external databases. This will open a web page in your default web browser and requires your computer be connected to the internet.

For more information about ANOVA in Partek Genomics Suite, see Identifying differentially expressed genes using ANOVA.

Creating a list of miRNAs of interest

The ANOVA results spreadsheet includes every miRNA on the array for a total of 7815 miRNAs. However, many of these miRNAs are not significantly differentially expressed between brain and heart and, thus, are not of interest. Next, we will create a filtered list of significantly differentially expressed miRNAs.

  • Select the ANOVAResults spreadsheet

  • Select Create List from the Analysis section of the workflow

The List Manager dialog will open (Figure 8).

  • Select brain vs. heart under Contrast: find genes that change between two categories

By default, the fold-change and significance thresholds are set to > 2, < -2 and p-value with FDR < 0.05. These defaults are appropriate for this tutorial so we will leave them in place.

  • Select Create to create a new list, brain vs. heart containing only the 1404 miRNAs that pass the criteria

Figure 8. Creating a list of significantly differentially expressed miRNAs
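The default criteria can be read as a simple row filter. The sketch below applies them to a few hypothetical result rows; the probe names and values are made up for illustration, and the p-values are assumed to be already FDR-adjusted.

```python
# Hypothetical ANOVA result rows:
# (probe set ID, FDR-adjusted p-value, signed fold change).
rows = [
    ("probe_A", 0.001, 35.2),
    ("probe_B", 0.020, -4.1),
    ("probe_C", 0.030, 1.4),   # significant, but the change is small
    ("probe_D", 0.400, 8.0),   # large change, but not significant
]

FOLD_CHANGE_CUTOFF = 2.0
FDR_CUTOFF = 0.05


def passes(adj_p, fold_change):
    """Keep rows with fold change > 2 or < -2 and adjusted p-value < 0.05."""
    return abs(fold_change) > FOLD_CHANGE_CUTOFF and adj_p < FDR_CUTOFF


selected = [row for row in rows if passes(row[1], row[2])]
print([probe for probe, _, _ in selected])
```

Both conditions must hold, so a probe set with a large fold change is still excluded if its adjusted p-value is above the cutoff.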

A new spreadsheet, brain vs. heart, will be created as a child spreadsheet of Affy_miR_BrainHeart (Figure 9).

Figure 9. Viewing brain vs. heart spreadsheet

To view the miRNAs with the largest difference between tissues, we can sort by fold-change.

  • Right-click the 6. Fold-Change(brain vs. heart) column header

  • Select Sort Descending by Absolute Value from the pop-up menu

The top 33 miRNAs we see (Figure 10) are all miR-124 from different species. The miRNA miR-124 is the most abundant miRNA in neuronal cells so this finding is expected. The multiple species versions of miR-124 are present because Affymetrix GeneChip miRNA arrays provide comprehensive coverage of miRNAs from multiple organisms including human, mouse, rat, canine, monkey, and many more on a single chip. The miRNAs from these different species are highly homologous so probes targeting miRNAs from other species will hybridize with human miRNAs. Therefore, we need to filter the list of miRNAs to include only human miRNAs.

Figure 10. miR-124 is highly differentially expressed in brain vs. heart

To do this, we need to add a new annotation column containing species information for each probe.

  • Right-click on the 2. Probeset ID column header

  • Select Insert Annotation from the pop-up menu

  • Select Add as categorical

Figure 11. Inserting species annotation column

The table now includes a column 3. Species Scientific Name with the species name of each miRNA. We can now filter to include only human miRNAs.

  • Right-click the 3. Species Scientific Name column header

  • Select Find / Replace / Select... from the pop-up menu

  • Type Homo sapiens for Find What

Figure 12. Configuring the Find / Replace / Select... dialog

The search should find and select 251 miRNAs.

  • Select Close

  • Right-click any of the row headers that are selected

  • Select Filter Include from the pop-up menu

The spreadsheet will now include only the 251 miRNAs from human (Figure 13). The first row is still miR-124 with a fold change of 4087.94. The black and gold bar on the right-hand side of the spreadsheet indicates the fraction of rows that have been filtered. To retain this filtered list, we can create a new spreadsheet.

Figure 13. Viewing differentially expressed human miRNAs

  • Right-click the brain_vs_heart spreadsheet in the spreadsheet tree

  • Select Clone... from the pop-up menu

Cloning a spreadsheet while a filter is applied copies only the included rows/columns.

  • Name the spreadsheet brain_vs_heart_human

  • Select Affy_miR_BrainHeart_intensities from the drop-down menu Create new spreadsheet as a child of spreadsheet

  • Select

The new spreadsheet includes only the 251 human miRNAs that are significantly differentially expressed between brain and heart tissue (Figure 14).

Figure 14. Viewing the filtered human miRNAs spreadsheet
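The sort and species filter used in this section amount to two small list operations. The following sketch shows them on hypothetical rows; the probe names, species assignments, and fold-change values are illustrative only.

```python
# Hypothetical rows: (probe set ID, species, signed fold change brain vs. heart).
rows = [
    ("probe_mouse_1", "Mus musculus", 310.0),
    ("probe_human_1", "Homo sapiens", 300.0),
    ("probe_rat_1", "Rattus norvegicus", -90.0),
    ("probe_human_2", "Homo sapiens", -250.0),
    ("probe_human_3", "Homo sapiens", 12.0),
]

# Sort descending by absolute value so strong changes in either
# direction come first.
by_abs_fold_change = sorted(rows, key=lambda r: abs(r[2]), reverse=True)

# Keep only rows annotated as human, preserving the sorted order.
human_only = [r for r in by_abs_fold_change if r[1] == "Homo sapiens"]

for probe, species, fold_change in human_only:
    print(probe, species, fold_change)
```

Filtering on the species annotation rather than on the probe name avoids relying on naming conventions for the probe set IDs.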

The next step in our analysis will be integrating miRNA and gene expression data.

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

Select brain from the left panel

  • Select Add Contrast Level > to move brain to the upper group - initially Group 1

  • Select heart from the left panel

  • Select Add Contrast Level > to move heart to the lower group - initially Group 2

  • Check Species Scientific Name (Figure 11)
  • Select OK to add the annotation column

  • Select Only in column for Search
  • Select 3. Species Scientific Name from the drop-down menu next to the Only in column option

  • Select Select All (Figure 12)

  • Name the new file brain vs. heart human
  • Select Save


    Integrate miRNA and Gene Expression data

    miRNAs regulate gene expression at the post-transcriptional level by base-pairing with the three prime untranslated region (3’ UTR) of the target gene, causing cleavage/degradation of the cognate mRNA or preventing translation initiation. Integrating miRNA expression with gene expression data to study the overall network of gene regulation is vital to understanding miRNA function in a given sample. Partek Genomics Suite provides a platform that can analyze miRNA and gene expression data independently, yet allows the data to be integrated for downstream analysis.

    This integrative analysis can be accomplished at several different levels. If you have only miRNA data, Partek Genomics Suite can search the predicted gene targets in a miRNA-mRNA database like TargetScan to provide a list of genes that might be regulated by the differentially expressed miRNAs. Alternatively, if you have only gene expression data, Partek Genomics Suite can use the same database to identify the miRNAs that putatively regulate those differentially expressed genes in a statistically significant manner. If you have gene expression data and miRNA data from comparable tissue/species, Partek Genomics Suite can combine the results of these separate experiments into one spreadsheet. Lastly, if miRNA and mRNA from the same source were analyzed (as in this tutorial), you may statistically correlate the results of the miRNA and gene expression assays.

    Finding putative genes regulated by miRNAs

    This application is useful in the case where you have miRNA expression data, but not gene expression data. Using a database like TargetScan, microCosm, or a custom database, you can identify the list of genes that are predicted to be regulated by these differentially expressed miRNAs and then perform Biological Interpretation tasks on the list of genes.

    • Select Combine miRNAs with their mRNA targets from the miRNA Integration section of the miRNA Expression workflow

    • Select the Get All Targets tab

    • Select TargetScan7.1 for Database Name

    Figure 1. Identifying all predicted gene targets of differentially expressed miRNAs

    This will create a new spreadsheet PutativeGenes that contains a miRNA and a putative gene target in each row. Because each miRNA can regulate multiple genes, the list will be much longer than the input miRNA list. Each row contains a gene so this spreadsheet can be analyzed using GO Enrichment and Pathway Enrichment tasks from the Biological Interpretation section of the workflow.

    Another useful way to analyze this list is to determine which genes could be targeted by multiple miRNAs in the list. To do this:

    • Right-click on the column 13. Gene Symbol header

    • Select Create List With Occurrence Counts from the pop-up menu (Figure 2)

    Figure 2. Creating an occurrence counts list from the list of putative miRNA target genes

    The new spreadsheet is a temporary spreadsheet listing each gene in alphabetical order and giving the occurrence count of each. Sorting the count column in descending order will list the gene with the most occurrences first (Figure 3).

    Figure 3. Occurrence list of putative miRNA target genes
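The occurrence count is simply the number of rows in which each gene symbol appears. A minimal sketch with hypothetical miRNA-gene pairs follows; the names are placeholders, not entries from TargetScan.

```python
from collections import Counter

# Hypothetical (miRNA, putative target gene) rows, one pair per row.
pairs = [
    ("miR-a", "GENE1"),
    ("miR-a", "GENE2"),
    ("miR-b", "GENE1"),
    ("miR-c", "GENE1"),
    ("miR-c", "GENE3"),
]

# Count how many rows (and therefore how many miRNAs) list each gene.
counts = Counter(gene for _, gene in pairs)

# most_common() returns genes sorted by descending count.
ranked = counts.most_common()
for gene, n in ranked:
    print(gene, n)
```

Genes at the top of the ranked list are predicted targets of several miRNAs in the input list, assuming each miRNA-gene pair appears only once.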

    We will not be using this temporary spreadsheet moving forward. You can close the spreadsheet by selecting

    Finding overrepresented miRNA targets sets from gene expression data

    This application is useful when you have only gene expression results or a gene list of interest and want to identify which miRNAs might regulate the genes. Using a database like TargetScan, you can create a list of miRNAs that are statistically predicted to regulate those genes. miRNAs of particular interest could then be explored using a lower-throughput technique like RT-qPCR.

    Using the gene list as input, a Fisher's Exact right-tailed p-value is calculated to show the overrepresentation of genes of interest for each miRNA in the database. The smaller the p-value, the more overrepresented the miRNAs are for the dataset. Target associations are taken from a database, TargetScan in this example. If the input list is a filtered list of genes from an ANOVA calculation, the parent spreadsheet is used to identify the background list of genes from the array. Genes in the array but not in the significant gene list will be treated as background in the calculations.

    To begin, we need to create a list of significant genes using the ANOVAResults gene spreadsheet.

    • Select the ANOVAResults gene spreadsheet in the spreadsheet tree

    • Select Create List from the workflow

    • Select Brain vs. Heart

    Figure 4. Creating a list of significantly differentially expressed genes

    • Select Close to exit the List Manager dialog

    We will now use this list to identify overrepresented miRNA target sets.

    • Select Find overrepresented miRNA target sets from the miRNA Integration section of the workflow

    • Select TargetScan 7.1 from the Target Database drop-down menu

    • Select brain vs. heart genes from the mRNA Spreadsheet drop-down menu

    Figure 5. Finding enriched miRNA target sets

    A new spreadsheet, enrichedAssociations, will be created with miRNAs from the database on rows (Figure 6). Column 1 contains the miRNA name and column 2 shows its p-value. The smaller the p-value, the more significant it is. Column 3 contains the number of genes from the (input) significant gene list that are targeted by this microRNA and Column 7 shows the number of significant genes from the input list that are not targeted by this microRNA. Columns 4 and 5 contain the number of significantly up- and down-regulated genes from the input significant gene list targeted by the miRNA. Column 6 shows the number of background genes (genes on the array but not in the input significant gene list) that are targeted by the miRNA and Column 8 shows the number of background genes on the array that are not targeted by the miRNA. The numbers in columns 3, 6, 7 and 8 will be used to calculate the Fisher’s Exact (right-tailed) p-value, a measure of the overrepresentation of the predicted miRNAs within a gene set.

    Figure 6. Output of the Find Overrepresented miRNA Target Sets tool
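The four counts in columns 3, 6, 7 and 8 form a 2x2 table, and the right-tailed Fisher's Exact p-value is the upper tail of a hypergeometric distribution. The sketch below is one way to compute it with the Python standard library; the function name and the example counts are hypothetical, and this is not presented as the exact routine used by Partek Genomics Suite.

```python
from math import comb


def fisher_right_tailed(sig_targeted, bg_targeted,
                        sig_not_targeted, bg_not_targeted):
    """Right-tailed Fisher's Exact p-value for a 2x2 table.

    Returns the probability of seeing at least `sig_targeted` targeted
    genes in the significant list if targets were spread at random
    over all genes on the array.
    """
    targeted = sig_targeted + bg_targeted
    significant = sig_targeted + sig_not_targeted
    total = targeted + sig_not_targeted + bg_not_targeted

    upper = min(targeted, significant)
    tail = sum(
        comb(targeted, k) * comb(total - targeted, significant - k)
        for k in range(sig_targeted, upper + 1)
    )
    return tail / comb(total, significant)


# Hypothetical counts for one miRNA (columns 3, 6, 7 and 8 in that order).
p = fisher_right_tailed(30, 170, 70, 730)
print(f"{p:.3g}")
```

Only the right tail is summed because the question is whether the miRNA's targets are overrepresented, not underrepresented, in the significant gene list.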

    As the enrichment p-values have not been corrected for running multiple statistical tests, we can use the multiple test correction feature of Partek Genomics Suite to adjust the p-values.

    • Select the enrichedAssociations spreadsheet

    • Select Stat from the main menu toolbar

    • Select Multiple Test Correction

    Figure 7. Configuring the Multiple Test Correction dialog

    Columns for each of the test correction methods will be added to the enrichedAssociations spreadsheet and can be used to filter the list of miRNAs.
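For readers who want to see what such a correction does to a list of p-values, here is a minimal sketch of two widely used methods, Bonferroni and the Benjamini-Hochberg step-up procedure. These two are chosen as an assumption for illustration; the tutorial does not list which methods the dialog offers, and the p-values below are hypothetical.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg step-up adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over its own rank and all larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical enrichment p-values for four miRNAs.
p_values = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(p_values))
print(benjamini_hochberg(p_values))
```

Bonferroni controls the chance of any false positive and is the stricter of the two; Benjamini-Hochberg controls the expected fraction of false positives among the results called significant.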

    Combine miRNAs with mRNA target genes

    This option is useful if you have miRNA and gene expression experiments you want to compare. The samples should be comparable, but do not have to originate from the same specimens.

    • Select Combine miRNAs with their mRNA targets from the miRNA Integration section of the workflow

    • Select the Get Targets from Spreadsheet tab

    • Select TargetScan 7.1 from the Target Database drop-down menu

    Figure 8. Combining miRNAs with their mRNA targets

    In the new spreadsheet, each row represents a specific miRNA associated with one of its target genes; a single miRNA can have multiple targets. For example, hsa-miR-133b_st has 659 rows, one for each target (Figure 9).

    Figure 9. Viewing the combined spreadsheet with miRNAs and mRNA targets

    Columns 1-12 are taken from the miRNA expression source spreadsheet while columns 13-26 are taken from the gene expression source spreadsheet.
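Conceptually, the combined spreadsheet is a join of the miRNA results and the gene results through the target database. The sketch below shows that join on hypothetical data; the miRNA names, gene symbols, fold changes, and the dictionary layout are all illustrative assumptions.

```python
# Hypothetical target database: miRNA name -> predicted target gene symbols.
targets = {
    "miR-a": ["GENE1", "GENE2"],
    "miR-b": ["GENE2", "GENE9"],
}

# Hypothetical results from the two separate analyses (fold changes).
mirna_results = {"miR-a": 12.0, "miR-b": -3.5}
gene_results = {"GENE1": -2.4, "GENE2": 1.1}

# One output row per (miRNA, target gene) pair found in both result sets.
combined = [
    (mirna, mirna_fc, gene, gene_results[gene])
    for mirna, mirna_fc in mirna_results.items()
    for gene in targets.get(mirna, [])
    if gene in gene_results
]

for row in combined:
    print(row)
```

Because a miRNA is repeated once per target, the combined table has many more rows than the miRNA list it was built from.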

    Correlating miRNA and gene expression data

    This application is useful when you have miRNA and mRNA expression data from the same samples and want to correlate the findings to determine whether up- or down-regulated miRNAs result in gene expression changes in their cognate genes. Pearson and Spearman correlation coefficients and their p-values are calculated.

    • Select Correlate miRNA and mRNA data from the miRNA Integration section of the workflow

    • Select TargetScan7.1 from the Target Database drop-down menu

    • Select Affy_miR_BrainHeart_intensities for the microRNA spreadsheet using the drop-down menu

    Figure 10. Configuring the Correlate miRNA-mRNA dialog

    Next, select the SampleID column from each spreadsheet. These must match.

    • Select 6. SampleID for Affy_miR_BrainHeart_intensities

    • Select 6. SampleID for Affy_HuGeneST_BrainHeart_GeneIntensities

    • Select OK (Figure 11)

    Figure 11. Choosing matching Sample ID columns

    A new spreadsheet, correlation.txt, will be created (Figure 12). Each row contains one miRNA correlated with one of its target genes. The first column contains the miRNA probeset ID from the miRNA intensities spreadsheet. The second column contains the mRNA probeset ID from the gene expression intensities spreadsheet. The third column lists the gene symbol and the fourth the miRNA name. The fifth and sixth columns are the Pearson correlation coefficient and its p-value for the gene-miRNA pair. The seventh and eighth columns are the Spearman's rank correlation coefficient and its p-value for the gene-miRNA pair. Negative correlation indicates that a high level of the miRNA is associated with a low expression level of its target gene. Positive correlation indicates that a high level of the miRNA is associated with a high level of its target gene.

    Figure 12. Viewing the correlation spreadsheet
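The two coefficients can be sketched in a few lines of standard-library Python. The example below computes Pearson and Spearman coefficients for one hypothetical miRNA-gene pair across six samples; the intensities are invented, the helper names are not from Partek Genomics Suite, and the p-values reported in the spreadsheet are not computed here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """Rank values starting at 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        average_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = average_rank
        start = end + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical log2 intensities for one miRNA and one target gene,
# listed in the same sample order (matched by sample ID).
mirna = [6.1, 6.4, 6.0, 11.2, 11.5, 11.0]
gene = [9.8, 9.5, 9.9, 5.1, 4.8, 5.3]

print(round(pearson(mirna, gene), 3))
print(round(spearman(mirna, gene), 3))
```

The values must be paired by sample, which is why the SampleID columns of the two spreadsheets have to match before the correlation is run.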

    We can visualize the correlation between any miRNA and target gene.

    • Right-click a row header

    • Select Scatter Plot (Orig. Data) from the pop-up menu

    The correlation plot shows miRNA intensity on the x-axis and gene expression on the y-axis (Figure 13). Here, we see a negative correlation between expression of xtr-miR-148a_st and its target gene, RAB14, in brain and heart tissues. Drawing the scatter plot creates a temporary file containing the miRNA and gene expression probe intensities for all samples; this file is used to draw the plot.

    Figure 13. Viewing the scatter plot showing correlated miRNA and target gene expression

    Please note that the correlation function is only useful for identifying miRNAs that affect mRNA stability, not translation.


    miRNA Expression and Integration with Gene Expression

    This tutorial describes how to analyze miRNA expression data in Partek Genomics Suite and how to integrate miRNA expression data with mRNA expression data from gene expression microarrays.

    This tutorial illustrates how to:

    • Analyze differentially expressed miRNAs

    • Integrate miRNA and Gene Expression data

    Note: the workflow described below is enabled in Partek Genomics Suite version 7.0 software. Please fill out the form on our support page to request this version or use the Help > Check for Updates command to check whether you have the latest released version. The screenshots shown within this tutorial may vary across platforms and across different versions of Partek Genomics Suite.

    Description of the data set

    The data set for this tutorial includes miRNA from 3 human brain samples and 3 heart samples quantified using the Affymetrix GeneChip miRNA 1.0 array. The same sample set was also processed on GeneChip Human Gene 1.0 ST arrays for mRNA expression.

    For this tutorial, the gene expression and miRNA expression studies have been analyzed and stored in Partek Genomics Suite project (ppj) format as miRNAmRNA integration. The project contains two Partek format files: Affy_miR_BrainHeart_intensities.fmt with the miRNA data and Affy_HuGeneST_BrainHeart_GeneIntensities.fmt with the analyzed mRNA data. There is also an ANOVA results spreadsheet open as a child spreadsheet of Affy_HuGeneST_BrainHeart_GeneIntensities.fmt.

    • Download the miRNA Expression and Integration with Gene Expression data set and save it in an easily accessible location on your computer

    We can now open the project in Partek Genomics Suite.

    • Select File

    • Select Import

    • Select Zipped Project...

    The project files will open in the Analysis tab (Figure 1).

    Figure 1. The miRNA tutorial data set


    Select brain vs. heart human for Spreadsheet Name

  • Set Column with microRNA labels to 2. Probeset ID

  • Name the Result file PutativeGenes

  • Select OK (Figure 1)

  • Set the Save list as to brain vs. heart genes
  • Leave other fields at their default values (Figure 4)

  • Select Create

  • Select 4. Gene Symbol from the Column with gene symbols drop-down menu (Figure 5)

  • Select OK

  • Select all the multiple test correction options
  • Transfer Enrichment p-value to the Selected Column(s) panel from the Candidate Column(s) panel (Figure 7)

  • Select brain vs. heart human from the microRNA Spreadsheet drop-down menu

  • Select 2. Probeset ID for Column with microRNA labels

  • Select ANOVAResults gene from the mRNA Spreadsheet drop-down menu

  • Select 4. Gene Symbol for Column with gene symbols (Figure 8)

  • Select OK

  • Select Affy_HuGeneST_BrainHeart_GeneIntensities as the mRNA spreadsheet using the drop-down menu (Figure 10)

  • Select OK

  • Select the miRNA_tutorial_data.zip zipped folder