This tutorial will illustrate:
Note: the workflow described below is enabled in Partek Genomics Suite version 7.0 software. Please fill out the form on our support page to request this version, or use the Help > Check for Updates command to check whether you have the latest released version. The screenshots shown within this tutorial may vary across platforms and across different versions of Partek Genomics Suite.
Down syndrome is caused by an extra copy of all or part of chromosome 21; it is the most common non-lethal trisomy in humans. At the time of the study used in this tutorial, conflicting reports had thrown into doubt whether individuals with Down syndrome have dysregulation of gene expression throughout the genome or primarily in genes from chromosome 21. To address this question, Affymetrix GeneChip™ Human U133A arrays were used to assay 25 samples taken from 10 human subjects, with or without Down syndrome, and 4 different tissues. The data revealed a significant upregulation of chromosome 21 genes at the gene expression level in individuals with Down syndrome; this dysregulation was largely specific to chromosome 21 and not a genome-wide phenomenon.
The raw data is available as experiment number GSE1397 in the Gene Expression Omnibus.
Data and associated files for this tutorial can be downloaded using this link - Gene Expression Analysis tutorial data (right-click the link and choose "Save Link As" to download the tutorial data).
If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.
Twenty-five CEL files (samples) have been imported into Partek Genomics Suite. Sample information must be added to define the grouping and the goals of the experiment.
Select Add Sample Attributes in the Import section of the Gene Expression workflow panel
Choose the option Add Attributes from an Existing Column
Select OK to open the Sample Information Creation dialog
In this tutorial, the file name (e.g., Down Syndrome-Astrocyte-748-Male-1-U133A.CEL) contains the information about a sample and is separated by hyphens (-). Choosing to split the file name by delimiters will separate the categories into different columns
In the Sample Information panel, specify the column labels (Labels 1-4) as Type, Tissue, Subject, and Gender, set each as categorical, and set the other columns as skip (Figure 1). Select OK
Figure 1. Configuring the Sample Information Creation dialog
A dialog window asking if you would like to save the spreadsheet with the new sample attribute will appear. Select Yes
Make column 5. (Subject) random by right-clicking on the column header and selecting Properties from the pop-up menu (Figure 2).
Figure 2. Changing column properties
Select the Random Effect check box from the Properties dialog (Figure 3) then select OK.
Figure 3. Setting column to Random Effect
The column 5. (Subject) will now be colored red, indicating that it is a random effect.
Note: More details on Random vs. Fixed Effects can be found later in this tutorial under the section Identifying differentially expressed genes using ANOVA.
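The file-name splitting described above can be sketched outside the software as follows. This is only an illustration of the idea, not how Partek Genomics Suite implements it; the function name `parse_sample_name` and the `LABELS` tuple are hypothetical, and the example file name is the one quoted in this tutorial.

```python
from typing import Dict

# Column labels follow the tutorial: the first four hyphen-separated fields
# are Type, Tissue, Subject, and Gender; the remaining fields are skipped.
LABELS = ("Type", "Tissue", "Subject", "Gender")


def parse_sample_name(file_name: str) -> Dict[str, str]:
    """Split a CEL file name on hyphens and keep the first four fields."""
    stem = file_name.rsplit(".", 1)[0]  # drop the .CEL extension
    fields = stem.split("-")
    if len(fields) < len(LABELS):
        raise ValueError(f"expected at least {len(LABELS)} fields in {file_name!r}")
    # zip stops after the four labels, so trailing fields are ignored
    return dict(zip(LABELS, fields))


attributes = parse_sample_name("Down Syndrome-Astrocyte-748-Male-1-U133A.CEL")
print(attributes)
```

Each key of the resulting dictionary corresponds to one categorical column added to the spreadsheet.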
Analysis of variance (ANOVA) is a very powerful technique for identifying differentially expressed genes in a multi-factor experiment such as this one. In this data set, ANOVA will be used to generate a list of genes that are significantly different between Down syndrome and normal samples with an absolute difference bigger than 1.3 fold.
The ANOVA model should include Type because it is the primary factor of interest. From the exploratory analysis using the PCA plot, we observed that tissue is a large source of variation; therefore, Tissue should be included in the model. In the experiment, multiple samples were taken from the same subject, so Subject must be included in the model. If Subject were excluded from the model, the ANOVA assumption that samples within groups are independent would be violated. Additionally, the PCA scatter plot showed that the Down syndrome and normal samples separated within tissue type, so the Type*Tissue interaction should be included in the model.
To invoke the ANOVA dialog, select Detect Differentially Expressed Genes in the Analysis section of the Gene Expression workflow
In the Experimental Factor(s) panel, select Type, Tissue and Subject by pressing and left clicking each factor
Use the Add Factor > button to move the selections to the ANOVA Factor(s) panel
Select both Type and Tissue by holding on your keyboard and left clicking each factor
Select the Add Interaction > button to add a Type * Tissue interaction to the ANOVA Factor(s) panel (Figure 1)
Do NOT select OK or Apply. We will be adding contrasts to this ANOVA model in an upcoming section of the tutorial.
Figure 1. Configuring ANOVA factors and interactions
Most factors in ANOVA are fixed effects, whose levels in a data set represent all the levels of interest. In this study, Type and Tissue are fixed effects. If the levels of a factor in a data set only represent a random sample of all the levels of interest (for example, Subject), the factor is a random effect. The ten subjects in this study represent only a random sample of the global population about which inferences are being made. Random effects are colored red on the spreadsheet and in the ANOVA dialog. When the ANOVA model includes both random and fixed factors, it is a mixed-model ANOVA.
Another way to determine if a factor is random or fixed is to imagine repeating the experiment. Would the same levels of each factor be used again?
Type – Yes, the same types would be used again - a fixed effect
Tissue – Yes, the same tissues would be used again - a fixed effect
Subject – No, the samples would be taken from other subjects - a random effect
You can specify which factors are random and which are fixed when you import your data or after importing by right-clicking on the column corresponding to a categorical variable, selecting Properties, and checking Random Effect. By doing that, the ANOVA will automatically know which factors to treat as random and which factors to treat as fixed.
The subject factor in the ANOVA model is listed as “5. Subject (3. Type)”, which means that Subject is nested in Type. Partek Genomics Suite can automatically detect this sort of hierarchical design and will adjust the ANOVA calculation accordingly.
By default, an ANOVA only outputs a p-value for each factor/interaction. To get the fold change and ratio between Down syndrome and normal samples, a contrast must be set up.
Select Contrasts… to invoke the Configure dialog
Choose 3. Type from the Select Factor/Interaction drop-down list. The levels in this factor are listed on the Candidate Level(s) panel on the left side of the dialog
Left click to select Down Syndrome from the Candidate Level(s) panel and move it to the Group 1 panel (renamed Down Syndrome) by selecting Add Contrast Level > in the top half of the dialog.
Label 1 will be changed to the subgroup name automatically, but you can also manually specify the label name.
Select Normal from the Candidate Level(s) panel and move it to the Group 2 panel (renamed Normal)
The Add Contrast button can now be selected (Figure 2)
Figure 2. Adding a contrast of Down Syndrome and Normal samples
Because the data are log2 transformed, Partek Genomics Suite detects this and automatically selects Yes for Data is already log transformed? in the top right-hand corner of the dialog. Partek Genomics Suite will use the geometric mean of the samples in each group to calculate the fold change and mean ratio for the contrast between the Down syndrome and normal samples.
Select Add Contrast to add the Down Syndrome vs. Normal contrast
Select OK to apply the configuration
If successfully added, the Contrasts… button will now read Contrasts Included (Figure 3)
Figure 3. ANOVA configuration with contrasts included
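The fold change and ratio for a contrast on log2 data can be sketched as below. Averaging log2 values and then exponentiating is equivalent to taking the ratio of geometric means, which is the calculation described for this contrast. The signed fold-change convention (a ratio below 1 is reported as a negative reciprocal) is an assumption inferred from the -1.3 and 1.3 cutoffs used later in this tutorial; the function name and the intensity values are made up for illustration.

```python
from statistics import mean
from typing import Sequence, Tuple


def contrast(log2_group1: Sequence[float], log2_group2: Sequence[float]) -> Tuple[float, float]:
    """Return (ratio, fold_change) for group 1 vs. group 2 on log2 data."""
    # The difference of log2 means is the log2 of the ratio of geometric means.
    log2_ratio = mean(log2_group1) - mean(log2_group2)
    ratio = 2.0 ** log2_ratio
    # Assumed convention: report down-regulation as a negative fold change.
    fold_change = ratio if ratio >= 1.0 else -1.0 / ratio
    return ratio, fold_change


# Made-up log2 intensities for one probe set
down_syndrome = [7.0, 7.5, 6.5]
normal = [6.0, 6.5, 5.5]

ratio, fold_change = contrast(down_syndrome, normal)
print(ratio, fold_change)
```

Swapping the two groups yields a ratio of 0.5 and, under the assumed convention, a fold change of -2.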
By default, Specify Output File is checked and gives a name to the output file. If you are trying to determine which factors should be included in the model and you do not wish to save the output file, simply uncheck this box
Select OK in the ANOVA dialog to compute the 3-way mixed-model ANOVA
Several progress messages will display in the lower left-hand side of the ANOVA dialog while the results are being calculated.
The result will be displayed in a child spreadsheet, ANOVA-3way (ANOVAResults). In this spreadsheet, each row represents a probe set and the columns represent the computation results for that probe set (Figure 4). Although not synonymous, probe set and gene will be treated as synonyms in this tutorial for convenience. By default, the genes are sorted in ascending order by the p-value of the first categorical factor. In this tutorial, Type is the first categorical factor, which means the most significantly differentially expressed gene between Down syndrome and normal samples is at the top of the spreadsheet in row 1.
Figure 4. ANOVA spreadsheet
For additional information about ANOVA in Partek Genomics Suite, see Chapter 11 Inferential Statistics in the User’s Manual (Help > User’s Manual).
Deciding which factors to include in the ANOVA may be an iterative process while you decide which factors and interactions are relevant as not all factors have to be included in the model. For example, in this tutorial, Gender and Scan date were not included. The Sources of Variation plot is a way to quantify the relative contribution of each factor in the model towards explaining the variability of the data.
Select View Sources of Variation from the Analysis section of the Gene Expression workflow with the ANOVA result spreadsheet active
A Sources of Variation tab will appear (Figure 5) with a bar chart showing the signal-to-noise ratio for each factor across the whole genome. Sources of variation can also be viewed as a pie chart showing sum of squares by selecting the Pie Chart (Sum of Squares) tab in the upper left-hand side of the Sources of Variation tab.
Figure 5. Sources of Variation tab showing a bar chart
This plot presents the mean signal-to-noise ratio of all the genes on the microarray. All the non-random factors in the ANOVA model are listed on the X-axis (including error). The Y-axis represents the mean of the ratios of mean square of all the genes to the mean square error of all the genes. Mean square is ANOVA’s measure of variance. Compare the bar for each signal to the bar for error; if a factor's bar is higher than error's bar, that factor contributed significant variation to the data across all the variables. Notice that this plot is very consistent with the results in the PCA scatter plot. In this data, on average, Tissue is the largest source of variation.
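To make the mean square ratio concrete, the following sketch computes it for a single gene and a single fixed factor. It is a deliberately simplified one-way calculation, not the 3-way mixed model that Partek Genomics Suite fits; the function name and the intensity values are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def signal_to_noise(groups: Dict[str, List[float]]) -> float:
    """Ratio of the factor mean square to the error mean square (one-way case)."""
    values = [v for group in groups.values() for v in group]
    grand_mean = mean(values)
    k = len(groups)   # number of factor levels
    n = len(values)   # total number of samples

    # Variation explained by the factor (between-group sum of squares)
    ss_factor = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Unexplained variation (within-group sum of squares)
    ss_error = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)

    ms_factor = ss_factor / (k - 1)
    ms_error = ss_error / (n - k)
    return ms_factor / ms_error


# Made-up log2 intensities for one gene in two tissues
gene = {"TissueA": [1.0, 2.0, 3.0], "TissueB": [4.0, 5.0, 6.0]}
print(signal_to_noise(gene))
```

A ratio near 1 would mean the factor explains no more variation than the error term; the plot summarizes such ratios averaged over all genes.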
To view the source of variation for each individual gene, right click on a row header in the ANOVA-3way (ANOVAResults) spreadsheet and select Sources of Variation from the pop-up menu. This generates a Sources of Variation tab for the individual gene. View a few Sources of Variation plots from rows at the top of the ANOVA table and a few from the bottom of the table.
Another useful graph is an ANOVA Interaction Plot.
Right-click on a row header in the ANOVA spreadsheet (Figure 6)
Select ANOVA Interaction Plot to generate an Interaction Plot tab for that individual gene
Figure 6. Calling an ANOVA Interaction Plot for a gene
Generate these plots for rows 3 (DSCR3) and 8 (CSTB). If the lines in the interaction plot are not parallel, there may be an interaction between Tissue and Type. Error bars show the standard error of the least-squares mean. DSCR3 is a good example of this (Figure 7). We can look at the p-values in column 9, p-value(Type * Tissue), to check whether this apparent interaction is statistically significant.
Figure 7. Interaction Plot for DSCR3
We can view the expression levels of a gene for each sample using a dot plot.
Right click on the gene row header and select Dot Plot (Orig. Data) from the pop-up menu. This generates a Dot Plot tab for the selected gene (Figure 8)
Figure 8. Dot Plot showing DSCR3 expression levels for each sample
In the plot, each dot is a sample of the original data. The Y-axis represents the log2 normalized intensity of the gene and the X-axis represents the different types of samples. The median expression of each group is different from each other in this example. The median of the Down syndrome samples is ~6.3, but the median of the normal samples is ~6.0. The line inside the Box & Whiskers represents the median of the samples in a group. Placing the mouse cursor over a Box & Whiskers plot will show its median and range.
Download the data from the Partek site to your local disk. The zip file contains both data and annotation files.
Unzip the files to C:\Partek Training Data\Down_Syndrome-GE or to a directory of your choosing. Be sure to create a directory or folder to hold the contents of the zip file
Copy or move the annotation files (HG-U133A.cdf, HG-U133A.na36.annot, HG-U133A.na36.annot.idx) to C:\Microarray Libraries.
Copying the annotation files to the default library location is done because newer annotation files that are released after the publication of this tutorial may cause the results to be different than what is shown in the published tutorial. If, however, you prefer to download the latest version, you may omit copying the HG-U133A files to C:\Microarray Libraries.
Start Partek® Genomics Suite® and select Gene Expression from the Workflows panel on the right side of the tool bar in the main window (Figure 1)
Figure 1. Selecting the gene expression workflow
Select Import Samples under the Import section of the workflow
Select Import from Affymetrix CEL Files and then select OK
Select the Browse button and select the C:\Partek Training Data\Down_Syndrome-GE folder. By default, all the files with a .CEL extension are selected (Figure 2)
Figure 2. Selecting the folder and CEL files for the experiment
Select the Add File(s) > button to move all the .CEL files to the right panel. Twenty-five CEL files will be processed
Select the Next > button to open the Import Affymetrix CEL Files dialog (Figure 3)
Figure 3. Configuring import files window
Select Customize… to open the Advanced Import Options dialog (Figure 4)
Figure 4. Configuring the Advanced Import Options dialog
Select Library Files… to open the Specify File Locations dialog (Figure 5). This dialog is used to specify the location of the library folder and the annotation files
Figure 5. Specifying Microarray Library files or changing the default library directory
Partek Genomics Suite will automatically assign the annotation files according to the chip type stored in the .CEL files. If the annotation files are not available in the library directory, Partek Genomics Suite will automatically download and store them in the Default Library File Folder.
The default library location can be modified by selecting Change... in the Default Library File Folder panel. By default, the library directory is at C:\Microarray Libraries. This directory is used to store all the external libraries and annotation files needed for analysis and visualization. The library directory can also be modified from Tools > File Manager in the main Partek Genomics Suite menu
Select OK (Figure 5) to close the Specify File Locations dialog
Select the Outputs tab from the Advanced Import Options dialog (Figure 6)
Figure 6. Specifying Advanced Import Options to create chip images of and extract the scan date from the CEL files
In the Extract Time Stamp and Date from CEL File panel, make sure the Date button is selected to extract the chip scan date. This information can help you detect if there are batch effects caused by the process time
In the Quality Assess of Gene Expression panel, leave the QC report button unselected. A user guide for the microarray data quality assessment and quality control features is available in the User’s Manual
Select OK to exit the Advanced Import Options dialog
Select Import. The progress bar on the lower left of the Import Affymetrix CEL files dialog will update as .CEL files are imported. Once all files have been imported, the Import Affymetrix CEL Files dialog will close
After the .CEL files have been imported, the result file will open in Partek Genomics Suite as a spreadsheet named 1 (Down_Syndrome-GE). The spreadsheet should contain 25 rows representing the microarray chips (samples) and over 22,000 columns representing the probe sets (genes) (Figure 7).
Figure 7. Viewing the main or top-level spreadsheet
For additional information on importing data into Partek Genomics Suite, see Chapter 4 Importing and Exporting Data in the Partek User’s Manual. The User’s Manual is available from the Partek Genomics Suite software main menu Help > User’s Manual. The FAQ (Help > On-line Tutorials > FAQ) may also be helpful. As this tutorial only addresses some topics, you may need to consult the User’s Manual for additional information about other useful features.
It is recommended that you are familiar with Chapter 6 The Pattern Visualization System of the User’s Manual before going through the next section of the tutorial.
During data importation, the GeneChip annotation file was linked to the imported data. This linked annotation information can be added as new columns to the ANOVA or gene list spreadsheets. For example, we can add additional annotation to the gene list we created from the ANOVA results as follows:
In the Down_Syndrome_vs_Normal (A) spreadsheet, right click on the second column header 2. ProbesetID and select Insert Annotation from the pop-up menu (Figure 1)
Figure 1. Inserting an annotation
Select Chromosomal Location under the Column Configuration panel (Figure 2). Leave everything else as default
Select OK
Figure 2. Adding Chromosomal Location annotation
Interestingly, of the 23 genes of the Down_Syndrome_vs_Normal (A) spreadsheet, 20 genes are located on chromosome 21. This suggests that the gene expression changes associated with Down syndrome observed in this study are primarily located on chromosome 21, not distributed throughout the genome, an important finding of this study.
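If the gene list is exported as text, the per-chromosome tally can be reproduced with a few lines of code. The sketch below assumes the Chromosomal Location column holds cytoband-style strings such as `21q22.3`; that format, the function name, and the example values are assumptions for illustration and are not taken from the tutorial data.

```python
import re
from collections import Counter
from typing import Optional

# Leading chromosome name, with an optional "chr" prefix (assumed format)
_CHROM = re.compile(r"^(?:chr)?(\d{1,2}|[XY])", re.IGNORECASE)


def chromosome_of(location: str) -> Optional[str]:
    """Extract the chromosome from a cytoband-style location, if present."""
    match = _CHROM.match(location.strip())
    return match.group(1).upper() if match else None


# Made-up annotation values, one per gene in a list
locations = ["21q22.3", "21q22.11", "1p36.13", "Xq28", "---"]

counts = Counter(chromosome_of(loc) for loc in locations)
print(counts["21"], "of", len(locations), "genes are on chromosome 21")
```

Entries that cannot be parsed are counted under `None`, so missing annotations stay visible rather than being silently dropped.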
At this point in analysis, you should explore the data preliminarily. Do the genes you expected to be differentially regulated appear to have larger or smaller intensity values? Do similar samples resemble each other?
The latter question can be explored using Principal Components Analysis (PCA), an excellent method for reducing and visualizing high-dimensional data.
Select PCA Scatter Plot from the QA/QC section of the Gene Expression workflow
A Scatter Plot tab containing your PCA plot will open (Figure 1).
Figure 1. PCA Scatter Plot tab
In the scatter plot, each point represents a chip (sample) and corresponds to a row on the top-level spreadsheet. The color of the dot represents the Type of the sample; red represents a normal sample and blue represents a Down syndrome sample. Points that are close together in the plot have similar intensity values across the probe sets on the whole chip, while points that are far apart in the plot are dissimilar
Left-clicking on any point in the scatter plot selects that point. A dash with an identifying row number will appear on the selected PCA plot point. The spreadsheet in the Analysis tab will also jump to the corresponding row.
As you can see from rotating the plot, there is no clear separation between Down syndrome and normal samples in this data since the red and blue samples are not separated in space. However, there are other factors that may separate the data.
Color the points by column 4. Tissue and Size the points by column 3. Type
Select OK
Figure 2. Configuring the PCA scatter plot: Color by Tissue, size by Type
Notice now that the data are clustered by different tissues (Figure 3).
Figure 3. PCA scatter plot configured with color by Tissue, size by Type
Another way to see the cluster pattern is to put an ellipse around the Tissue groups.
Open the Plot Rendering Properties dialog and select the Ellipsoids tab
Select Add Ellipse/Ellipsoid
Select Ellipse in the Add Ellipse/Ellipsoid... dialog
Double click on Tissue in the Categorical Variable(s) panel to move it to the Grouping Variable(s) panel (Figure 4)
Select OK to close the Add Ellipse/Ellipsoid... dialog and select OK again to exit the Plot Rendering Properties dialog
Figure 4. Adding Ellipses to PCA Scatter Plot
By rotating this PCA plot, you can see that the data is separated by tissues, and within some of the tissues, the Down syndrome samples and normal samples are separated. For example, in the Astrocyte and Heart tissues, the Down syndrome samples (small dots) are on the left, and the normal samples (large dots) are on the right (Figure 5).
Figure 5. PCA scatter plot with ellipses, rotated to show separation by Type
PCA is an example of exploratory data analysis and is useful for identifying outliers and major effects in the data. From the scatter plot, you can see that the tissue is the biggest source of variation. There are many genes that express differently between the tissues, but not as many genes that express differently between type (Down syndrome and normal) across the whole chip.
The next step is to draw a histogram to examine the samples.
Select Sample Histogram in the QA/QC section of the Gene Expression workflow to generate the Histogram tab (Figure 6)
Figure 6. Histogram tab
The histogram plots one line for each of the samples with the intensity of the probes graphed on the X-axis and the frequency of the probe intensity on the Y-axis. This allows you to view the distribution of the intensities to identify any outliers. In this dataset, all the samples follow the same distribution pattern indicating that there are no obvious outliers in the data. As demonstrated with the PCA plot, if you click on any of the lines in the histogram, the corresponding row will be highlighted in the spreadsheet 1 (Down_Syndrome-GE). You can also change the way the histogram displays the data by clicking on the Plot Properties button. Feel free to explore these options on your own.
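The per-sample frequency curve described above amounts to binning each sample's probe intensities and counting the probes per bin. A minimal sketch follows, with a hypothetical function name, made-up intensities, and an arbitrary bin count; the binning used by Partek Genomics Suite may differ.

```python
from typing import List, Sequence


def intensity_histogram(values: Sequence[float], low: float, high: float, bins: int) -> List[int]:
    """Count how many intensities fall into each of `bins` equal-width bins."""
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        if low <= v <= high:
            # Values equal to `high` are placed in the last bin
            index = min(int((v - low) / width), bins - 1)
            counts[index] += 1
    return counts


# Made-up log2 probe intensities for one sample; 15.0 lies outside the range
sample = [2.0, 3.5, 4.0, 7.9, 8.0, 15.0]
print(intensity_histogram(sample, low=2.0, high=8.0, bins=3))
```

Plotting these counts for every sample on shared axes gives one line per sample, so a sample whose curve departs from the rest stands out as a possible outlier.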
The decision to discard any samples would be based on information from the PCA plot, sample histogram plot, and QC metrics. To discard a sample and renormalize the data (without the effects of the outlier), start over with importing samples and omit the outlier sample(s) during the .CEL file import.
Now that you have obtained statistical results from the microarray experiment, you can create new spreadsheets containing just those genes that pass certain criteria. This will streamline data management by focusing on just those genes with the most significant differential expression or substantial fold change. The List Manager can be used to specify numerous conditions for selecting genes of interest. In this tutorial, we are going to create a list of genes with a fold change greater than 1.3 or less than -1.3 and an unadjusted p-value of < 0.0005.
Invoke the List Manager dialog by selecting Create Gene List in the Analysis section of the Gene Expression workflow
Ensure that the 1/ANOVA-3way (ANOVAResults) spreadsheet is selected as this is the spreadsheet we will be using to create our new gene list as shown (Figure 1)
Select the ANOVA Streamlined tab.
In the Contrast: find genes that change between two categories panel, choose Down Syndrome vs. Normal and select Have Any Change from the Setting drop-down menu
This will find genes with different expression levels in the different types of samples.
In the Configuration for “Down Syndrome vs Normal” panel, check that Include size of the change is selected and enter 1.3 into Change > and -1.3 in OR Change <
Select Include significance of the change, choose unadjusted p-value from the dropdown menu, and < 0.0005 for the cutoff
The number of genes that pass your cutoff criteria will be shown next to the # Pass field. In this example, 23 genes pass the criteria.
Set Save the list as A
Select Create to generate the new list A
Select Close to view the new gene list spreadsheet
Figure 1. Creating a gene list from ANOVA results
The spreadsheet Down_Syndrome_vs_Normal (A) will be created as a child spreadsheet under the Down_Syndrome-GE spreadsheet.
This gene list spreadsheet can now be used for further analysis such as hierarchical clustering, gene ontology, integration of copy number data, or be exported into other data analysis tools such as pathway analysis.
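The selection logic behind this gene list can be expressed in a few lines. The sketch below uses the fold-change cutoffs of 1.3 and -1.3 and the unadjusted p-value cutoff of 0.0005 stated at the start of this section; the `GeneResult` class, the probe set identifiers, and the numeric values are hypothetical and do not come from the ANOVA results.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GeneResult:
    probeset_id: str
    p_value: float       # unadjusted p-value of the contrast
    fold_change: float   # signed fold change of the contrast


def passes(gene: GeneResult, fc_cutoff: float = 1.3, p_cutoff: float = 0.0005) -> bool:
    """Keep genes that change in either direction and are significant."""
    changed = gene.fold_change > fc_cutoff or gene.fold_change < -fc_cutoff
    return changed and gene.p_value < p_cutoff


# Made-up rows standing in for the ANOVA result spreadsheet
results: List[GeneResult] = [
    GeneResult("probe_a", 0.0001, 1.8),    # up-regulated and significant
    GeneResult("probe_b", 0.0002, -1.5),   # down-regulated and significant
    GeneResult("probe_c", 0.0001, 1.1),    # significant, but change too small
    GeneResult("probe_d", 0.0200, 2.4),    # large change, but not significant
]

gene_list = [g.probeset_id for g in results if passes(g)]
print(gene_list)
```

Both conditions must hold at once, which is why a large fold change alone does not place a gene on the list.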
Next, we will generate a list of genes that passed a p-value threshold of 0.05 and fold-changes greater than 1.3 using a volcano plot.
Select the 1/ANOVA-3way (ANOVAResults) spreadsheet in the Analysis tab. This is the spreadsheet our gene list will be drawn from
Select View > Volcano Plot from the Partek Genomics Suite main menu (Figure 2)
Figure 2. Generating a Volcano Plot from ANOVA results
Set X Axis (Fold-Change) to 12. Fold-Change(Down Syndrome vs. Normal), and the Y axis (p-value) to be 10. p-value(Down Syndrome vs. Normal)
Select OK to generate a Volcano Plot tab for genes in the ANOVA spreadsheet (Figure 3)
Figure 3. Volcano plot generated from ANOVA spreadsheet
In the plot, each dot represents a gene. The X-axis represents the fold change of the contrast (Down syndrome vs. Normal), and the Y-axis represents the range of p-values. The genes with increased expression in Down syndrome samples are on the right side of the N/C (no change) line; genes with reduced expression in Down syndrome samples are on the left. The genes become more statistically significant with increasing Y-axis position. The genes that have larger and more significant changes between the Down syndrome and normal groups are on the upper right and upper left corner.
In order to select the genes by fold-change and p-value, we will draw a horizontal line to represent the p-value 0.05 and two vertical lines indicating the –1.3 and 1.3-fold changes (cutoff lines).
Choose the Axes tab
Check Select all points in a section to allow Partek Genomics Suite to automatically select all the points in any given section
Select the Set Cutoff Lines button and configure the Set Cutoff Lines dialog as shown (Figure 4)
Figure 4. Setting cutoff lines for -1.3 to 1.3 fold changes and a p-value of 0.05
Select OK to draw the cutoff lines
Select OK in the Plot Rendering Properties dialog to close the dialog and view the plot
The plot will be divided into six sections. By clicking on the upper-right section, all genes in that section will be selected.
Right-click on the selected region in the plot and choose Create List to create a list including the genes from the section selected (Figure 5). Note that these p-values are uncorrected
Figure 5. Creating a gene list from a volcano plot
Note: If no column is selected in the parent (ANOVA) spreadsheet, all of the columns will be included in the gene list; if some columns are selected, only the selected columns will be included in the list.
Specify a name for the gene list (example: volcano plot list) and write a brief description about the list.
The description is shown when you right-click on the spreadsheet > Info > Comments. Here, I have named the list "volcano plot list" and described it as "Genes with >1.3 fold change and p-value <0.05" (Figure 6). The list can be saved as a text file (File > Save As Text File) for use in reports or by downstream analysis software.
Figure 6. Saving a list created from a volcano plot
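The six sections created by the cutoff lines can be described programmatically: two vertical lines split the fold-change axis into left, middle, and right, and one horizontal line splits the p-value axis into upper and lower. The sketch below is an illustration with hypothetical function names; the -log10 helper reflects how volcano plots are commonly scaled and is an assumption about the display, not a statement about Partek Genomics Suite internals.

```python
import math


def volcano_section(fold_change: float, p_value: float,
                    fc_cutoff: float = 1.3, p_cutoff: float = 0.05) -> str:
    """Name the section of the volcano plot that a gene falls into."""
    if fold_change > fc_cutoff:
        horizontal = "right"    # increased expression in Down syndrome
    elif fold_change < -fc_cutoff:
        horizontal = "left"     # reduced expression in Down syndrome
    else:
        horizontal = "middle"   # change smaller than the cutoff
    vertical = "upper" if p_value < p_cutoff else "lower"
    return f"{vertical}-{horizontal}"


def neg_log10(p_value: float) -> float:
    """Common volcano-plot scaling: smaller p-values plot higher."""
    return -math.log10(p_value)


print(volcano_section(1.8, 0.001))   # large, significant increase
print(volcano_section(-1.5, 0.2))    # decrease that is not significant
print(neg_log10(0.05))
```

Clicking the upper-right section in the plot corresponds to collecting every gene for which this function returns `upper-right`.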
To save changes to the spreadsheet, select the Save Active Spreadsheet icon. Spreadsheets with unsaved changes have an asterisk next to their name in the spreadsheet tree.
While pressing the mouse wheel down, drag the mouse to rotate the plot, or select the Rotate Mode icon on the left side of the Scatter Plot tab. With Rotate Mode selected, press the left mouse button and drag to rotate the plot. Rotating the plot allows you to examine the grouping pattern or outliers of the data on the first three principal components (PCs).
Scrolling the mouse wheel up or down while the cursor is on the PCA plot will zoom in and out. Alternatively, select the Zoom Mode icon on the left side of the Scatter Plot tab.
Selecting the Reset icon on the left side of the Scatter Plot tab will return the PCA plot to its original orientation and zoom.
In the Scatter Plot tab, select the Rendering Properties icon and configure the plot as shown (Figure 2)
You can practice creating new gene list criteria of your own to become familiar with the List Manager tool. For more information, you can always click on the () buttons.
Select Rendering Properties
The gene list in spreadsheet Down_Syndrome_vs_Normal (A) can be used for hierarchical clustering to visualize patterns in the data.
Under the Visualization section in the Gene Expression workflow, select Cluster Based on Significant Genes
The Cluster Significant Genes dialog asks you to specify the type of clustering you want to perform.
Choose Hierarchical Clustering and select OK
Choose the Down_Syndrome_vs_Normal (A) spreadsheet under the Spreadsheet with differentially expressed genes
Choose the Standardize – shift genes to mean of zero and scale to standard deviation of one under the Expression normalization panel (Figure 1)
This option will adjust all the gene intensities such that the mean is zero and the standard deviation is 1.
Figure 1. Configuring Hierarchical Clustering
Select OK to generate a Hierarchical Clustering tab (Figure 2)
Figure 2. Hierarchical Clustering of Down_Syndrome_vs_Normal (A)
The graph (Figure 2) illustrates the standardized gene expression level of each gene in each sample. Each gene is represented in one column, and each sample is represented in one row. Genes with no difference in expression have a value of zero and are colored black. Genes with increased expression in Down syndrome samples have positive values and are colored red. Genes with reduced expression in Down syndrome samples have negative values and are colored green. Down syndrome samples are colored red and normal samples are colored orange. On the left-hand side of the graph, we can see that the Down syndrome samples cluster together.
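The standardization applied before clustering can be illustrated as follows: each gene's intensities are shifted to a mean of zero and scaled to a standard deviation of one. The sketch uses the sample standard deviation, which is an assumption; the function name and the intensity values are hypothetical.

```python
from statistics import mean, stdev
from typing import List, Sequence


def standardize(values: Sequence[float]) -> List[float]:
    """Shift to mean zero and scale to standard deviation one."""
    center = mean(values)
    spread = stdev(values)  # sample standard deviation (assumed choice)
    if spread == 0:
        # A gene with identical values in every sample carries no pattern
        return [0.0 for _ in values]
    return [(v - center) / spread for v in values]


# Made-up log2 intensities of one gene across three samples
print(standardize([6.0, 7.0, 8.0]))
```

After this step, genes with very different absolute intensities become comparable, so the heat map colors reflect relative expression within each gene rather than raw signal.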
For more information on the methods used for clustering, you can refer to Chapter 8: Hierarchical & Partitioning Clustering in Help > User’s Manual. For a tutorial on configuring the clustering plot, please refer to Hierarchical Clustering Analysis