Hierarchical Clustering Analysis


Last updated 7 months ago


What is Hierarchical Clustering?

Hierarchical clustering groups similar objects into clusters. To start, each row and/or column is considered a cluster. The two most similar clusters are then combined and this process is iterated until all objects are in the same cluster. Hierarchical clustering displays the resulting hierarchy of the clusters in a tree called a dendrogram. Hierarchical clustering is useful for exploratory analysis because it shows how samples group together based on similarity of features.
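The merge loop described above can be sketched in a few lines of Python. This is only an illustration of the general bottom-up idea, not the implementation used by Partek Genomics Suite: the function name `agglomerate`, the Euclidean distance, and the average-linkage rule are all assumptions chosen for the example.

```python
from itertools import combinations
from math import dist


def agglomerate(points):
    """Return the merge history as (cluster_a, cluster_b, distance) tuples."""
    # Start with every object in its own cluster (clusters hold row indices).
    clusters = [(i,) for i in range(len(points))]
    merges = []

    def average_linkage(a, b):
        # Mean Euclidean distance over all cross-cluster pairs (assumed choice).
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the two most similar (closest) clusters.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_linkage(*pair))
        merges.append((a, b, average_linkage(a, b)))
        # Replace the pair with their union and repeat until one cluster is left.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Four hypothetical samples measured on two features.
samples = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (8.0, 9.0)]
merges = agglomerate(samples)
for a, b, d in merges:
    print(a, b, round(d, 3))
```

Read from first to last, the merge history is the dendrogram: early merges are the short branches joining similar objects, and the final merge is the root that joins everything.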

Hierarchical clustering is an unsupervised clustering method. Unsupervised clustering methods do not take the identity or attributes of samples into account when clustering. This means that experimental variables such as treatment, phenotype, tissue, number of expected groups, etc. do not guide or bias cluster building. Supervised clustering methods do consider experimental variables when building clusters.

Visualizing Hierarchical Clustering

To illustrate the capabilities and customization options of hierarchical clustering in Partek Genomics Suite, we will explore an example of hierarchical clustering drawn from the Gene Expression Analysis tutorial. The data set in this tutorial includes gene expression data from patients with or without Down syndrome. Using this data set, 23 genes that are highly differentially expressed between Down syndrome and normal patient tissues were identified. These 23 genes were then used to perform hierarchical clustering of the samples. Follow the steps outlined in Performing hierarchical clustering to perform hierarchical clustering and launch the Hierarchical Clustering tab (Figure 1).

Figure 1. Heatmap showing results of hierarchical clustering

The right-hand section of the Hierarchical Clustering tab is a heat map showing relative expression of the genes in the list used to perform clustering. In this example, the lowest expression value is colored green, the highest expression value is colored red, and the midpoint between the minimum and maximum is colored black. The dendrograms on the left-hand side and top of the heat map show the clustering of samples (rows) and features (columns; probes/genes in this example). Columns are labeled with the gene symbol if there is enough space to annotate every gene. Rows are colored according to the groups of the first categorical sample attribute in the source spreadsheet. The sample legend below the heat map indicates which color corresponds to which attribute group. In this example, Down syndrome patient samples are red and normal patient samples are orange.
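As a rough illustration of a three-point color scale like the one described above, the sketch below maps a value to an RGB color by interpolating from a low color to a midpoint color to a high color. The function `heat_color` and the linear RGB interpolation are assumptions made for the example; how Partek Genomics Suite interpolates colors internally is not stated here. The sketch also assumes `vmin` is strictly less than `vmax`.

```python
def heat_color(value, vmin, vmax,
               low=(0, 255, 0), mid=(0, 0, 0), high=(255, 0, 0)):
    """Map value to an RGB tuple: low at vmin, mid halfway, high at vmax."""
    midpoint = (vmin + vmax) / 2
    # Clamp so values outside the range take the end colors.
    value = max(vmin, min(vmax, value))
    if value <= midpoint:
        start, end = low, mid
        t = (value - vmin) / (midpoint - vmin)
    else:
        start, end = mid, high
        t = (value - midpoint) / (vmax - midpoint)
    # Interpolate each RGB channel linearly between the two anchor colors.
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))


print(heat_color(-2, -2, 2))  # lowest value: green
print(heat_color(0, -2, 2))   # midpoint: black
print(heat_color(2, -2, 2))   # highest value: red
```

Changing the `low`, `mid`, and `high` arguments corresponds to picking different minimum, midpoint, and maximum colors, and changing `vmin` and `vmax` corresponds to narrowing or widening the data range.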

The heat map can be configured using the properties panel on the left-hand side of the Hierarchical clustering tab.

Configuring the Hierarchical Clustering Plot

Labeling Sample Groups in the Heat Map

  • Select the Rows tab

  • Verify that Type appears in the annotation box

  • Set Width (in pixels) to 25

This will increase the width of the color box indicating sample Type.

  • Select Show Label

  • Set Text size to 12

  • Set Text angle to 90

This angle is relative to the x-axis. When set to 90, the text will run along the y-axis.

  • Select Apply

The sample attributes are now labeled with group titles (Figure 2).

Figure 2. Labeling heat map with sample attribute groups

Adding a Sample Attribute to the Heat Map

  • Select the Rows tab

  • Select Tissue from the New Annotation drop-down menu

  • Select Apply

Color blocks indicating the tissue of each sample have been added to the row labels and sample legend (Figure 3).

Figure 3. Sample attributes can be added to the heat map as sample labels

Changing the Orientation of the Rows and Columns

By default, Partek Genomics Suite displays samples on rows and features on columns. We can transpose the heat map using the Heat Map tab in the plot properties panel.

  • Select the Heat Map tab

  • Select Transpose rows and columns in the Orientation section

  • Select Apply

The plot has been transposed, with samples on columns and features on rows. The label for the sample groups is now in the vertical orientation because the settings we applied to Rows have been carried over to Columns.

  • Select the Columns tab

  • Select the Type track

  • Set Text angle to 0

  • Select Apply

The sample group label for Type is now visible (Figure 4).

Figure 4. Heat map columns and rows can be transposed

Flipping Columns or Rows

Every cluster node above the bottom level of the dendrogram has two sub-cluster branches (legs). The order of the two legs is arbitrary, so the positions of the two sub-clusters can be swapped within their parent cluster. This does not change the clustering, only the position of the clusters on the plot.

  • Clicking on a line that represents a sub-cluster branch (dendrogram leg), or drawing a bounding box on it using the left mouse button, will swap the selected leg with the other leg in the same parent cluster. In this example, clicking on the bottom line will move it to the top of the heat map (Figure 5).

Figure 5. Rows and columns can be flipped by using Flip Mode to select dendrogram legs
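To see why flipping changes only the drawing order and not the clustering, the sketch below models a dendrogram as nested pairs. The tree, the leaf labels, and the helper names `leaves`, `flip`, and `clusters` are hypothetical and serve only to illustrate the idea.

```python
def leaves(node):
    """Return leaf labels in left-to-right plotting order."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return leaves(left) + leaves(right)


def flip(node):
    """Swap the two legs of a cluster node (this node only)."""
    left, right = node
    return (right, left)


def clusters(node):
    """Return every cluster in the tree as a frozenset of leaf labels."""
    if isinstance(node, str):
        return {frozenset([node])}
    left, right = node
    return clusters(left) | clusters(right) | {frozenset(leaves(node))}


# Hypothetical dendrogram: each tuple is a cluster node with two legs.
tree = (("A", "B"), ("C", ("D", "E")))
flipped = flip(tree)

print(leaves(tree))     # ['A', 'B', 'C', 'D', 'E']
print(leaves(flipped))  # ['C', 'D', 'E', 'A', 'B']
print(clusters(tree) == clusters(flipped))  # True
```

The plotting order of the leaves changes, but the set of clusters is identical before and after the flip.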

Changing Heat Map Colors

The minimum, maximum, and midpoint colors of the heat map intensity plot can be customized.

  • Select the Heat Map tab

  • Select Apply

The heat map and plot intensity legend now show maximum values in yellow and minimum values in light blue with a black midpoint (Figure 6). The data range can also be customized by changing the values of Min and Max.

Figure 6. Heat map colors for minimum, maximum, and midpoint intensity can be customized

Zooming to Selected Rows/Columns

We can use the hierarchical clustering heat map to examine groups of genes that exhibit similar expression patterns, for example, genes that are up-regulated in Down syndrome samples and down-regulated in normal samples.

  • Select the middle cluster of the rows dendrogram as shown (Figure 7) by clicking on the line or drawing a bounding box around the line

The lines within the selected cluster will be bold and the corresponding columns (or rows) on the spreadsheet in the analysis tab will be highlighted.

Figure 7. Selecting a dendrogram cluster using Selection Mode

  • Right-click anywhere in the viewer

  • Select Zoom to Fit Selected Rows

The same steps can be used to zoom into columns or rows. Here, we have zoomed in on rows, but not columns to show the expression levels of the selected genes for all samples (Figure 8).

Figure 8. Viewing only selected genes for all samples

  • Left-click anywhere in the hierarchical clustering plot to deselect the dendrogram

Exporting a List of Genes From a Selected Cluster

Partek Genomics Suite can export a list of genes from any selected cluster, allowing large gene sets to be filtered based on the results of hierarchical clustering.

  • Select the bottom cluster of the rows dendrogram

  • Right-click to open the pop-up menu

  • Select Create Row List... (Figure 9)

Figure 9. Creating gene list from selected cluster

  • Name the gene set "down in normal"

  • Select OK

  • Save the list as "down in normal"

In the Analysis tab, there is now a spreadsheet row_list (down in normal.txt) containing the 6 genes that were in the selected cluster. The same steps can be used to create a list of samples from the hierarchical clustering by selecting clusters on the sample dendrogram.

Saving Plot Properties

Once you have created a customized plot, you can save the plot properties as a template for future hierarchical clustering analyses.

  • Select the Save/Load tab

  • Select Save current...

  • Name the current plot properties template; we selected Transposed Blue and Yellow

The new template now appears as an option in the Save/Load panel. To load a template, select it in the Save/Load panel and select Load selected. Note that loading a template applies all of its properties, including Min and Max values and sample groups (which are based on the column number of the attribute in the source spreadsheet), and these may not be appropriate for a different data set.

Exporting the Hierarchical Clustering Plot Image

The hierarchical clustering plot can be exported as a publication-quality image.

  • Select the Hierarchical Clustering tab

  • Select File from the main toolbar

  • Select Save Image As... from the drop-down menu

  • Select a destination and name for the file

  • Select PNG or your preferred image type from the pull-down menu

  • Select Save

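The sections above also include the following icon-based steps:

  • Flipping Columns or Rows: select the Flip Mode icon from the Mouse Mode icon set to activate Flip Mode

  • Changing Heat Map Colors: set Min color to light blue using the color picker tool

  • Changing Heat Map Colors: set Max color to yellow using the color picker tool

  • Zooming to Selected Rows/Columns: select the Selection Mode icon from the Mouse Mode icon set to activate Selection Mode

  • Zooming to Selected Rows/Columns: to reset the zoom, select the icon on the y-axis to show all rows and the icon on the x-axis to show all columns

  • Exporting a List of Genes From a Selected Cluster: select the icon on the y-axis to show all rows

  • Exporting a List of Genes From a Selected Cluster: select the Selection Mode icon from the Mouse Mode icon set to activate Selection Mode

Additional Assistance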

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.