# Compare expression between cell types with multiple samples

* [Filter cells](#filter-cells)
* [Identify differentially expressed genes](#identify-differentially-expressed-genes)
* [Exploring differentially expressed genes](#exploring-differentially-expressed-genes)

Differential expression analysis can be used to compare cell types. Here, we will compare glioma and oligodendrocyte cells to identify genes differentially regulated in glioma cells from the oligodendroglioma subtype. Glioma cells in oligodendroglioma are thought to originate from oligodendrocytes, so directly comparing the two cell types will identify genes that distinguish them.

## Filter cells

To analyze only the oligodendroglioma subtype, we can filter the samples.

* Click the **Filtered counts** data node
* Expand **Filtering** in the task menu
* Click **Filter cells** (Figure 1)

![Figure 1. Invoking the sample filter](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-24f5b28d66fdb0cec0af2ece039a49374438d9b7%2FScreenshot%202023-01-10%20at%2014.44.48.png?alt=media)

The filter lets us include or exclude samples based on sample ID and attribute.

* Set the filter to **Include** samples where **Subtype is Oligodendroglioma**
* Click **AND**
* Set the second filter to **exclude Cell type (multi-sample) is Microglia**
* Click **Finish** to apply the filter (Figure 2)

![Figure 2. Configuring the group filter](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-5f7f4d86cd96b735cb9d8f7f735f6f2741f97d68%2FScreenshot%202023-01-10%20at%2014.46.37.png?alt=media)

A *Filtered counts* data node will be created containing only cells from oligodendroglioma samples, with microglia excluded (Figure 3).

![Figure 3. Filtering groups generates a Filtered counts data node](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-d54c38eaa049cf1679e164c913336fe397e7e5a4%2FScreenshot%202023-02-08%20at%2015.53.40.png?alt=media)
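The two filter rules combine with a logical AND. As a rough sketch of the same logic outside the interface, the snippet below applies it to a handful of made-up cells; the attribute names follow the tutorial, while the cell IDs and the non-oligodendroglioma subtype value are hypothetical.

```python
# Hypothetical per-cell annotations; attribute names mirror the tutorial,
# the cell IDs and values are invented for illustration only.
cells = [
    {"cell_id": "c1", "Subtype": "Oligodendroglioma", "Cell type (multi-sample)": "Glioma"},
    {"cell_id": "c2", "Subtype": "Oligodendroglioma", "Cell type (multi-sample)": "Microglia"},
    {"cell_id": "c3", "Subtype": "Astrocytoma", "Cell type (multi-sample)": "Glioma"},
    {"cell_id": "c4", "Subtype": "Oligodendroglioma", "Cell type (multi-sample)": "Oligodendrocytes"},
]


def keep_cell(cell):
    """Include Subtype is Oligodendroglioma AND exclude Cell type is Microglia."""
    return (
        cell["Subtype"] == "Oligodendroglioma"
        and cell["Cell type (multi-sample)"] != "Microglia"
    )


filtered = [cell for cell in cells if keep_cell(cell)]
kept_ids = [cell["cell_id"] for cell in filtered]
print(kept_ids)  # ['c1', 'c4']
```

Only cells that satisfy both rules are kept, which is why the microglia cell and the cell from the other subtype are dropped.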

## Identify differentially expressed genes

* Click the new **Filtered counts** data node
* Click **Statistics** > **Differential analysis** in the task menu
* Click **GSA**

The configuration options (Figure 4) include sample and cell-level attributes. Here, we want to compare different cell types, so we will include *Cell type (multi-sample)*.

* Click **Cell type (multi-sample)**
* Click **Next**

![Figure 4. Choosing attributes to include in the statistical test](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-e3bde4ada53623138aca737482f49279f448c221%2FScreenshot%202023-01-10%20at%2015.05.03.png?alt=media)

Next, we will set up a comparison between glioma and oligodendrocyte cells.

* Click **Glioma** in the top panel
* Click **Oligodendrocytes** in the bottom panel
* Click **Add comparison** (Figure 5)

This will set up fold change calculations with glioma as the numerator and oligodendrocytes as the denominator.

![Figure 5. Defining the comparison between Glioma and Oligodendrocytes](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-346737611c82e3dafc0ee8e5aaf721d07c3fb240%2FScreenshot%202023-01-10%20at%2015.06.07.png?alt=media)
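To make the numerator and denominator roles concrete, here is a minimal sketch of a signed, linear fold change. It assumes a convention in which ratios below 1 are reported as negative reciprocals, which is consistent with the symmetric -10 and 10 thresholds used later in this tutorial; the function name and the mean values are hypothetical, and the GSA task may compute its estimates differently.

```python
def signed_fold_change(numerator_mean, denominator_mean):
    """Return a linear, signed fold change (assumed convention).

    Ratios >= 1 are returned as-is; ratios below 1 are returned as the
    negative reciprocal, so a 16-fold decrease becomes -16.
    """
    if numerator_mean <= 0 or denominator_mean <= 0:
        raise ValueError("means must be positive")
    ratio = numerator_mean / denominator_mean
    return ratio if ratio >= 1 else -1.0 / ratio


glioma_mean = 64.0           # hypothetical mean expression in glioma cells
oligodendrocyte_mean = 4.0   # hypothetical mean expression in oligodendrocytes

# Glioma is the numerator, oligodendrocytes the denominator.
higher_in_glioma = signed_fold_change(glioma_mean, oligodendrocyte_mean)
# Swapping the roles flips the sign of the result.
lower_in_glioma = signed_fold_change(oligodendrocyte_mean, glioma_mean)
print(higher_in_glioma, lower_in_glioma)
```

Under this convention, a positive value means higher expression in glioma cells and a negative value means higher expression in oligodendrocytes.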

* Click **Finish** to run the GSA

A green *GSA* data node will be generated containing the results of the GSA.

* Double-click the **green** **GSA** data node to open the GSA report

Because of the large number of cells and large differences between cell types, the p-values and FDR step up values are very low for highly significant genes. We can use the volcano plot to preview the effect of applying different significance thresholds.

* Click ![image2018-2-16 13\_8\_45](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-578fb9619b29547466f026e1a653cbd219cede23%2Fvolcano_icon_gray.png?alt=media) to view the **Volcano plot**
* Open the **Style** icon on the left and, under *Size*, change the **point size** to **6**
* Open the **Axes** icon on the left and change the Y-axis to **FDR step up (Glioma vs Oligodendrocytes)**
* Open the **Statistics** icon and, under *Significance*, set the **X threshold to -10 and 10** and the **Y threshold to 0.001**
* Open the **Select & Filter** icon, set the **Fold change thresholds to -10 and 10**
* In **Select & Filter**, click ![Remove\_icon](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-016f275f46ce989a8b4b26fbbd32763488970ca3%2Fgrey-x-icon-2.png?alt=media) to remove the **P-value (Glioma vs Oligodendrocytes)** selection rule. From the drop-down list, add **FDR step up (Glioma vs Oligodendrocytes)** as a selection rule and set the maximum to 0.001

Note these changes in the icon settings and volcano plot below (Figure 6).

![Figure 6. Previewing a filter by adjusting the size of the points, changing the Y-axis, adjusting the X & Y significance thresholds and changing the selection criteria](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-033dcca69b9c35d00c5432411f2ff3b1d7d0e2b9%2FScreenshot%202023-02-08%20at%2015.58.11.png?alt=media)
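For readers unfamiliar with the *FDR step up* column, the sketch below shows how step-up adjusted p-values can be computed, assuming the column corresponds to the Benjamini-Hochberg procedure. That mapping, the function name, and the example p-values are assumptions made for illustration, not a description of the software's internals.

```python
def fdr_step_up(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


p_values = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw p-values
adjusted = fdr_step_up(p_values)
print(adjusted)
```

Each adjusted value is at least as large as its raw p-value, so a cutoff of 0.001 on the adjusted values is stricter than the same cutoff on raw p-values.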

We can now recreate these conditions in the GSA report filter.

* Click the **GSA report** tab in your web browser to return to the GSA report
* Click **FDR step up**
* Set the **FDR step up** filter to **Less than or equal to** **0.001**
* Press **Enter**
* Click **Fold change**
* Set the **Fold change** filter to **From -10 to 10**
* Press **Enter**

The filter should include 291 genes.

* Click ![image2018-2-16 13\_19\_34](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-e0b5c6c7941a43927a7f1322c4c2f1248ca3fdd0%2Fimage2018-2-16%2013_19_34.png?alt=media) to apply the filter and generate a *Filtered feature list* node
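The two filter rules can also be written as a simple predicate. This is only a sketch: the gene names and values are invented, and it assumes the fold change rule keeps genes whose fold change falls outside the -10 to 10 range, matching the thresholds previewed on the volcano plot. How the report treats values exactly at the boundaries is also an assumption.

```python
# Hypothetical GSA results; gene names and numbers are invented.
results = [
    {"gene": "GENE_A", "fdr_step_up": 1e-8, "fold_change": 25.0},
    {"gene": "GENE_B", "fdr_step_up": 1e-6, "fold_change": -14.2},
    {"gene": "GENE_C", "fdr_step_up": 1e-9, "fold_change": 3.1},
    {"gene": "GENE_D", "fdr_step_up": 0.02, "fold_change": 40.0},
]

FDR_MAX = 0.001             # FDR step up: less than or equal to 0.001
FOLD_CHANGE_CUTOFF = 10.0   # fold change outside the -10 to 10 range


def passes_filter(row):
    """Keep genes that are significant AND change at least 10-fold either way."""
    significant = row["fdr_step_up"] <= FDR_MAX
    large_change = abs(row["fold_change"]) >= FOLD_CHANGE_CUTOFF
    return significant and large_change


selected = [row["gene"] for row in results if passes_filter(row)]
print(selected)  # ['GENE_A', 'GENE_B']
```

GENE_C is dropped because its change is too small and GENE_D because it is not significant at the chosen cutoff.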

## Exploring differentially expressed genes

To visualize the results, we can generate a hierarchical clustering heatmap.

* Click the **Filtered feature list** produced by the *Differential analysis filter* task
* Click **Exploratory analysis** in the task menu
* Click **Hierarchical clustering/heatmap**

Using the hierarchical clustering options, we can choose to include only cells from certain samples. We can also choose the order of cells on the heatmap instead of clustering them. Here, we will include only glioma cells and order them by sample name (Figure 7).

* Make sure **Cluster** is unchecked for *Cell order*
* Click **Filter cells** under *Filtering* and set the filter to **include Cell type (multi-sample) is Glioma**
* Choose **Sample name** from the *Cell order* drop-down menu in the *Assign order* section
* Click **Finish**

![Figure 7. Configuring hierarchical clustering](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-94ff3b17c9cf3af3fea79aba39d2f8f373d43c87%2FHeatmap%20dialouge.png?alt=media)

* Double-click the green **Hierarchical clustering** node to open the heatmap

The heatmap differences may be hard to distinguish at first; the range from red to blue with a white midpoint is set very wide because of a few outlier cells. We can adjust the range to make more subtle differences visible. We can also adjust the color.

* Set the **Range** toggle **Min** to **-1.5**
* Set the **Range** toggle **Max** to **1.5**

The heatmap now shows clear patterns of red and blue.
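Narrowing the range helps because values beyond the limits are drawn at the extreme colors, leaving the full color gradient for the values in between. The sketch below illustrates that idea with a simple blue-white-red mapping. The function names are hypothetical, and which end of the scale is red is an assumption made for the example.

```python
def clamp(value, low=-1.5, high=1.5):
    """Limit a value to the [low, high] display range."""
    return max(low, min(high, value))


def to_rgb(value, low=-1.5, high=1.5):
    """Map a value to a blue-white-red color with white at the midpoint."""
    mid = (low + high) / 2
    v = clamp(value, low, high)
    if v >= mid:
        t = (v - mid) / (high - mid)   # 0 at white, 1 at full red
        fade = round(255 * (1 - t))
        return (255, fade, fade)
    t = (mid - v) / (mid - low)        # 0 at white, 1 at full blue
    fade = round(255 * (1 - t))
    return (fade, fade, 255)


print(to_rgb(0.0))    # (255, 255, 255): the midpoint is white
print(to_rgb(8.0))    # (255, 0, 0): an outlier is drawn at the maximum color
print(to_rgb(-1.5))   # (0, 0, 255): the minimum of the range
```

With a very wide range, most values sit close to the midpoint and appear nearly white; with a range of -1.5 to 1.5, the same values spread across the whole gradient.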

* Click **Axis titles** and deselect the **Row labels** and **Column labels** of the panel to hide sample and feature names, respectively.
* Select **Sample name** from the *Annotations* drop-down menu

Cells are now labeled with their sample name. Interestingly, samples show characteristic patterns of expression for these genes (Figure 8).

![Figure 8. Hierarchical clustering heatmap with cells on rows (ordered by sample name) and genes on columns (clustered)](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-7cee4f560dada0fd8ff6fa38153112ae36ce4536%2Fimage2022-8-31_14-24-24.png?alt=media)

* Click **Glioma (multi-sample)** to return to the *Analyses* tab.

We can use gene set enrichment to further characterize the differences between glioma and oligodendrocyte cells.

* Click the **Filtered feature list** node
* Click **Biological interpretation** in the task menu
* Click **Gene set enrichment**
* Change *Database* to **Gene set database** and click **Finish** to continue with the most recent gene set (Figure 9)

![Figure 9. Gene set enrichment dialogue](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-46ebafb064ffb96d4c21bb2a400e153117082b19%2Fimage2022-8-31_14-29-18.png?alt=media)

A *Gene set enrichment* node will be added to the pipeline.

* Double-click the **Gene set enrichment** task node to open the task report

Top GO terms in the enrichment report include "ensheathment of neurons" and "axon ensheathment" (Figure 10), which correspond well with the role of oligodendrocytes in creating the myelin sheath that supports and protects axons in the central nervous system.

![Figure 10. GO enrichment task report](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-7495e56a240cbc8ccce72efcff9851faa10557bf%2Fimage2018-3-21%2014_47_38.png?alt=media)
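One common way to score whether a gene set is over-represented in a gene list is a hypergeometric upper-tail test (equivalent to a one-sided Fisher's exact test). The sketch below illustrates that calculation. Whether the enrichment task uses exactly this test is an assumption, and apart from the 291-gene list size taken from the filter above, all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(population, set_size, list_size, overlap):
    """P(X >= overlap) for a hypergeometric draw.

    population: number of background genes
    set_size:   number of background genes annotated to the gene set
    list_size:  number of genes in the filtered list
    overlap:    number of list genes annotated to the gene set
    """
    max_overlap = min(set_size, list_size)
    total = comb(population, list_size)
    tail = sum(
        comb(set_size, k) * comb(population - set_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, a 100-gene set,
# and 12 of the 291 filtered genes annotated to that set.
p = enrichment_p_value(population=20000, set_size=100, list_size=291, overlap=12)
print(p)
```

By chance we would expect only one or two of the 291 genes to fall in a set of this size, so an overlap of 12 gives a very small p-value.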

## Additional Assistance

If you need additional assistance, please visit [our support page](http://www.partek.com/support) to submit a help ticket or find phone numbers for regional support.
