Five data files are needed for CosMx data import: polygons.csv, exprMat_file.csv, fov_positions_file.csv, metadata_file.csv and tx_file.csv. The last file (tx_file.csv) contains the transcript information and is not required when importing protein data. You will also need an image folder, called either CellComposite or CellOverlay (please ensure the images in this folder are not blank before uploading them to Flow). This image folder contains one image per FOV and is used to visualise the spatial data.
Create one folder per sample; each folder will contain the 5 files and the image folder.
Transfer the data to your Flow server before moving on to the import step.
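Before transferring the data, it can help to check each sample folder against the file list above. The following is a minimal sketch and is not part of Flow: the function name `check_sample_folder` is hypothetical, it assumes the files are named exactly as listed (real exports may carry a run-specific prefix), and it only flags empty folders and zero-byte images, which is a weaker test than confirming the images are not blank.

```python
from pathlib import Path

# File names taken from the list above; tx_file.csv is optional for protein data.
REQUIRED_FILES = [
    "polygons.csv",
    "exprMat_file.csv",
    "fov_positions_file.csv",
    "metadata_file.csv",
]
TRANSCRIPT_FILE = "tx_file.csv"
IMAGE_FOLDERS = ["CellComposite", "CellOverlay"]


def check_sample_folder(folder, protein_data=False):
    """Return a list of problems found in one sample folder (empty if none)."""
    folder = Path(folder)
    problems = []

    expected = list(REQUIRED_FILES)
    if not protein_data:
        expected.append(TRANSCRIPT_FILE)
    for name in expected:
        if not (folder / name).is_file():
            problems.append(f"missing file: {name}")

    image_dirs = [folder / d for d in IMAGE_FOLDERS if (folder / d).is_dir()]
    if not image_dirs:
        problems.append("missing image folder: CellComposite or CellOverlay")
    for image_dir in image_dirs:
        images = sorted(p for p in image_dir.iterdir() if p.is_file())
        if not images:
            problems.append(f"image folder is empty: {image_dir.name}")
        for image in images:
            # A zero-byte file is certainly unusable; visually blank images
            # still need to be checked by eye.
            if image.stat().st_size == 0:
                problems.append(f"zero-byte image: {image_dir.name}/{image.name}")
    return problems
```

Calling `check_sample_folder("path/to/sample")` returns an empty list when nothing is missing; pass `protein_data=True` to skip the tx_file.csv check.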
Create a new project and click the 'Add data' button in the Analyses tab
Select Single cell > Spatial > NanoString CosMx and click Next
Click Add sample, name it and select the sample folder
The importer will automatically select the image folder. If you have uploaded more than one image folder, select the one you want to use for the project.
Select the appropriate annotation files (in this case we'll use hg38 - Ensembl Transcripts release 109)
Leave the rest of the settings as default
Click Finish
If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.
Now that we have clustered our data we can start identifying cell populations present in our tissue. Using the results of our dimension reduction analysis, we will look at the clustering results on the UMAP and the spatial map:
Double click the Spatial report node. This will automatically open a new Data viewer session and plot the spatial map with the high resolution image in the background.
Click on New plot > 3D Scatter plot and select the UMAP node.
Add the biomarkers table from the graph-based clustering task.
Click on New plot > Table and select the Biomarkers table generated after the Graph-based clustering task
Change the color style to the graph-based clusters.
Click anywhere on the Spatial plot, then click Style, click on the node selector next to the Color by drop-down, select the Graph-based clustering node
Select 'Graph based' from the drop-down options
Copy these settings on the UMAP.
Click on the legend title, drag and drop it on the color option on the UMAP plot:
Now that we have plotted the clustering results we can start exploring them. We have identified markers for 13 clusters, with apparent spatial separation. The sample is human frontal cortex, so we can expect our clusters to represent some of the major cell types found in the tissue (e.g. astrocytes, microglia, oligodendrocytes). Looking at the markers for Cluster 4, we can see a few known astrocyte markers: AQP4, AGT, FGFR3.
Drag and drop each of the 3 genes from the biomarkers table onto one of the 'Green, Red, Blue' features of the spatial plot to visualize the in-situ expression:
Zoom in to better observe the cell-level expression patterns.
Use the mouse wheel control to zoom in and out on the image
Let's classify the cells from cluster 4 as astrocytes:
Click Select & Filter > Criteria, drag the Graph-based attribute on the Add criteria box and select only Cluster 4.
Click Classify > Classify selection. Type 'Astrocytes' and Save.
Click Apply classifications, type 'Cell type' in the box and click Run.
Astrocyte populations are often found in two activation states: the neurotoxic or pro-inflammatory phenotype (A1) and the neuroprotective or anti-inflammatory phenotype (A2). Now that we have identified our astrocyte population we can move on to the sub-classification of A1/A2 astrocytes. For this we are going to use GFAP, a commonly used marker for A1 astrocytes.
Double click the Spatial report node in the Analyses tab of your project to open a new data viewer session.
Click Select & Filter, select Criteria, and add the 'Cell type' attribute in the criteria box.
Select only the Astrocytes and click to include only the selected points
Now add GFAP to the criteria box and toggle the Pin histogram option
The histogram shows the presence of two distinct populations that segregate based on GFAP expression
Set the upper threshold as 5
Classify the population
Classify > Classify selection, name the cells 'A2' and Save
Now slide the upper threshold to the max and set the lower threshold to 5, then Classify > Classify selection, name the cells 'A1' and Save
We are now ready to Apply classifications, name the attribute 'Astrocyte sub-population' and Run
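The thresholding steps above amount to a simple rule on GFAP expression within the astrocyte population. Here is a minimal sketch of that rule outside Flow. The function name `classify_astrocytes` is hypothetical, and the handling of cells sitting exactly on the threshold (assigned to A1 here) is an assumption, since the tutorial does not say which side the boundary falls on.

```python
def classify_astrocytes(cell_types, gfap, threshold=5.0):
    """Label astrocytes as 'A1' (GFAP high) or 'A2' (GFAP low).

    cell_types: one 'Cell type' label per cell.
    gfap: one GFAP expression value per cell, in the same order.
    Cells that are not astrocytes are left unclassified (None).
    """
    labels = []
    for cell_type, value in zip(cell_types, gfap):
        if cell_type != "Astrocytes":
            labels.append(None)
        elif value >= threshold:
            # Assumption: values equal to the threshold count as A1.
            labels.append("A1")
        else:
            labels.append("A2")
    return labels


# Toy values for illustration only, not taken from the tutorial dataset.
labels = classify_astrocytes(
    ["Astrocytes", "Astrocytes", "Neurons"],
    [7.2, 1.0, 9.0],
)
```

As with the thresholds in the QA/QC step, the cut-off of 5 is read from the histogram for this dataset and should be chosen again for other data.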
Having classified our sub-populations, we can now use that information to identify genes and biological processes that are differentially activated between the two.
Click on the Normalized counts node, select Statistics > Differential analysis > Hurdle model, click Next
Select the Astrocyte sub-population > Add factors, then click Next
Drag A1 to the Numerator and A2 to the Denominator box, Add comparison, then click Finish
Once the differential expression analysis task has completed, you can explore the report and the subsequent analysis steps following this tutorial: Compare expression between cell types with multiple samples
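The choice of Numerator and Denominator sets the direction of the reported change: with A1 as numerator and A2 as denominator, a positive log fold change means higher expression in A1. The sketch below illustrates only that direction using a plain ratio of means. It is not the Hurdle model that Flow fits, and the function name `log2_fold_change` is hypothetical.

```python
import math


def log2_fold_change(numerator_values, denominator_values):
    """log2 of mean(numerator) / mean(denominator) for a single gene.

    Illustrates the sign convention only; both means must be positive.
    """
    mean_num = sum(numerator_values) / len(numerator_values)
    mean_den = sum(denominator_values) / len(denominator_values)
    if mean_num <= 0 or mean_den <= 0:
        raise ValueError("both group means must be positive")
    return math.log2(mean_num / mean_den)


# Toy expression values for one gene (illustration only).
a1 = [4.0, 6.0]  # numerator group
a2 = [1.0, 4.0]  # denominator group
up_in_a1 = log2_fold_change(a1, a2)    # positive: higher in A1
down_in_a2 = log2_fold_change(a2, a1)  # swapping the groups flips the sign
```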
Once the data has been imported into the project, we can start pre-processing it:
We will first remove all non-expression features in the data (e.g. NegProbes).
Click on Filtering > Filter features from the menu on the right
Select Metadata and set the task settings as follows
Click Finish
In the Analyses tab
Click on the resulting filtered counts node
Select QA/QC > Single cell QA/QC from the toolbox. Once the task has completed, we can open the report by double-clicking the node:
We will remove the cells with low counts and number of detected features.
Click on Select & Filter and set lower threshold to 50 for both (remember that this is data-dependent and will change based on your dataset)
Click Apply observation filter to the filtered counts node:
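The observation filter above keeps a cell only if it passes both thresholds. As a minimal sketch of that logic outside Flow, assuming a raw count matrix with one row per cell and one column per feature: the function name `qc_filter` is hypothetical, and treating a value equal to the threshold as passing is an assumption.

```python
def qc_filter(count_matrix, min_counts=50, min_features=50):
    """Return the indices of cells that pass both QC thresholds.

    count_matrix: list of rows, one row of feature counts per cell.
    min_counts: lower threshold on the total counts per cell.
    min_features: lower threshold on the number of detected features
                  (features with a count above zero).
    """
    kept = []
    for index, row in enumerate(count_matrix):
        total_counts = sum(row)
        detected_features = sum(1 for value in row if value > 0)
        # Assumption: cells exactly at a threshold are kept.
        if total_counts >= min_counts and detected_features >= min_features:
            kept.append(index)
    return kept


# Toy matrix with 60 features (illustration only).
cells = [
    [1] * 60,             # 60 counts, 60 detected features: kept
    [100] + [0] * 59,     # 100 counts, 1 detected feature: removed
    [1] * 49 + [0] * 11,  # 49 counts, 49 detected features: removed
]
kept_cells = qc_filter(cells)
```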
Click on the node generated by the filtering task in the Analyses tab.
Click Filtering > Filter features. Apply a noise reduction filter:
We can now normalize our filtered data.
Now that we have filtered out low-quality cells and normalized our data, we can start clustering to identify cell populations.
Click on the normalized data node
From the menu on the right select Exploratory analysis > PCA. We are going to use the top 2000 features by variance and calculate the first 50 principal components (PCs):
Once the PCA has run, click on the PCA result node in the Analyses tab.
Select Exploratory analysis > UMAP from the toolbox. Set the UMAP parameters as follows:
Top 20 PCs
Local neighborhood size 60
Minimal distance 0.20
While the UMAP is running we can also queue a clustering task. Click on the PCA result node in the Analyses tab, select Exploratory analysis > Graph-based clustering.
We are going to use the Leiden algorithm to cluster our data (make sure to select the radio button for it)
Set the number of PCs to 10
In the advanced settings, set the resolution parameter to 8e-5 and click Apply:
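The PCA step above uses only the top 2000 features by variance. To make that selection concrete, here is a minimal sketch of ranking features by variance, assuming a normalized matrix with one row per cell and one column per feature. The function name `top_variable_features` is hypothetical, the sketch covers only the feature selection and not the PCA itself, and Flow's own implementation may differ.

```python
from statistics import pvariance


def top_variable_features(matrix, feature_names, n_top=2000):
    """Return the names of the n_top features with the highest variance.

    matrix: list of rows, one row of normalized values per cell.
    feature_names: one name per column of the matrix.
    """
    columns = list(zip(*matrix))  # transpose: one tuple per feature
    variances = [pvariance(column) for column in columns]
    # Sort feature indices from highest to lowest variance.
    order = sorted(
        range(len(feature_names)),
        key=lambda j: variances[j],
        reverse=True,
    )
    return [feature_names[j] for j in order[:n_top]]


# Toy matrix: 3 cells x 3 features with placeholder names (illustration only).
values = [
    [1.0, 0.0, 5.0],
    [1.0, 10.0, 5.5],
    [1.0, 20.0, 6.0],
]
selected = top_variable_features(values, ["A", "B", "C"], n_top=2)
```

Feature A is constant across cells, so it carries no information for separating them and is the first to be dropped.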
Click Filter include
Click Normalization and scaling > Normalization. Use the recommended settings by clicking :