# QA/QC, data processing, and dimension reduction

### QA/QC & Data Processing <a href="#qa-qc-dataprocessing-anddimensionreduction-qa-qc-and-dataprocessing" id="qa-qc-dataprocessing-anddimensionreduction-qa-qc-and-dataprocessing"></a>

<figure><img src="https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-63e6e300d6f9e2d0c938f46de2332d94d5e3dbe8%2Fcosmx%204.png?alt=media" alt=""><figcaption></figcaption></figure>

Once the data has been imported into the project, we can start pre-processing it.

We will first remove all non-expression features in the data (e.g. NegProbes).

* Click on *Filtering > Filter features* from the menu on the right
* Select *Metadata* and set the task settings as shown in the figure below
* Click **Finish**

<figure><img src="https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-12997ed4d21cd395dce3ca51f7c5300b28822613%2Fcosmx%205.png?alt=media" alt=""><figcaption></figcaption></figure>
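For readers who want to see the logic of this step outside the GUI, the following is a minimal Python sketch of a metadata-based feature filter. It is not Partek Flow's implementation. The feature names, the `feature_type` mapping, and the `"expression"` and `"negative_probe"` labels are hypothetical placeholders for whatever annotation your dataset carries.

```python
def filter_features_by_metadata(counts, feature_names, feature_type,
                                keep_type="expression"):
    """Keep only the columns whose metadata label equals keep_type.

    counts: list of rows (one per cell), each a list of counts per feature.
    feature_type: dict mapping feature name -> metadata label (hypothetical).
    """
    keep = [i for i, name in enumerate(feature_names)
            if feature_type.get(name) == keep_type]
    kept_names = [feature_names[i] for i in keep]
    kept_counts = [[row[i] for i in keep] for row in counts]
    return kept_counts, kept_names


# Toy data: two cells, four features, two of which are control probes.
feature_names = ["CD3E", "NegPrb1", "MS4A1", "NegPrb2"]
feature_type = {
    "CD3E": "expression",
    "NegPrb1": "negative_probe",
    "MS4A1": "expression",
    "NegPrb2": "negative_probe",
}
counts = [
    [5, 0, 2, 1],
    [0, 1, 7, 0],
]

filtered_counts, filtered_names = filter_features_by_metadata(
    counts, feature_names, feature_type
)
print(filtered_names)
```

Filtering on a metadata label rather than on the feature name keeps the rule independent of how control probes happen to be named in a given export.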

In the Analyses tab:

* Click on the resulting filtered counts node
* Select *QA/QC > Single cell QA/QC* from the toolbox. Once the task has completed, open the report by double-clicking the node:

<figure><img src="https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-a3b17677310640d721bab5e8e1e832526f6bae31%2Fcosmx%206.png?alt=media" alt=""><figcaption></figcaption></figure>

We will remove cells with low total counts or a low number of detected features.

* Click on *Select & Filter* and set the lower threshold to 50 for both metrics (this value is data-dependent and should be adjusted for your dataset)
* Click Filter ![](https://documentation.partek.com/download/thumbnails/98206053/Screenshot%202024-07-04%20at%2017.18.31.png?version=1\&modificationDate=1720109917705\&api=v2) and select *include*
* Click *Apply observation filter* and apply it to the filtered counts node:

<figure><img src="https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-4325e86d3f2b55123bd5a9c027d03661bbd931a9%2Fcosmx%207.png?alt=media" alt=""><figcaption></figcaption></figure>
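The two QC metrics used above can be sketched in a few lines of Python. This is an illustration of the idea, not Partek Flow's code. The function names are hypothetical, and treating the threshold as inclusive (`>=`) is an assumption. The toy example uses small thresholds so that the effect is visible on a three-feature matrix; the tutorial's value of 50 is the default here.

```python
def cell_qc_metrics(row):
    """Return (total counts, number of detected features) for one cell."""
    total = sum(row)
    detected = sum(1 for c in row if c > 0)
    return total, detected


def filter_cells(counts, cell_ids, min_counts=50, min_features=50):
    """Keep cells that meet both lower thresholds (assumed inclusive)."""
    kept_counts, kept_ids = [], []
    for cell_id, row in zip(cell_ids, counts):
        total, detected = cell_qc_metrics(row)
        if total >= min_counts and detected >= min_features:
            kept_counts.append(row)
            kept_ids.append(cell_id)
    return kept_counts, kept_ids


# Toy data: three cells, three features.
cell_ids = ["c1", "c2", "c3"]
counts = [
    [3, 2, 1],   # total 6, 3 detected features
    [10, 0, 0],  # total 10, 1 detected feature
    [0, 1, 0],   # total 1, 1 detected feature
]

kept_counts, kept_ids = filter_cells(counts, cell_ids,
                                     min_counts=5, min_features=2)
print(kept_ids)
```

Cell `c2` has enough counts but only one detected feature, so it is dropped along with `c3`. A cell must pass both thresholds to be kept.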

Click on the node generated by the filtering task in the Analyses tab.

* Click *Filtering > Filter features* and apply a noise reduction filter:

<figure><img src="https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-8ea027457e44f5634f146c30018af29ac5e2e7e7%2Fcosmx%208.png?alt=media" alt=""><figcaption></figcaption></figure>
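One common form of noise reduction filter excludes features that are at or below a given value in most cells. The sketch below illustrates that idea in Python. The rule itself, the function name, and both parameters (`max_value`, `min_fraction`) are assumptions made for illustration; use the settings shown in the screenshot for the actual task.

```python
def noise_reduction_filter(counts, feature_names, max_value=0,
                           min_fraction=0.99):
    """Drop features whose value is <= max_value in at least
    min_fraction of cells (hypothetical rule for illustration)."""
    n_cells = len(counts)
    if n_cells == 0:
        return [], list(feature_names)
    keep = []
    for j in range(len(feature_names)):
        n_low = sum(1 for row in counts if row[j] <= max_value)
        if n_low / n_cells < min_fraction:
            keep.append(j)
    kept_names = [feature_names[j] for j in keep]
    kept_counts = [[row[j] for j in keep] for row in counts]
    return kept_counts, kept_names


# Toy data: four cells, three features.
feature_names = ["A", "B", "C"]
counts = [
    [0, 1, 0],
    [0, 0, 0],
    [0, 2, 0],
    [0, 3, 5],
]

# With min_fraction=0.75, A (zero in 4/4 cells) and C (zero in 3/4 cells)
# are excluded, while B (zero in 1/4 cells) is kept.
denoised_counts, denoised_names = noise_reduction_filter(
    counts, feature_names, max_value=0, min_fraction=0.75
)
print(denoised_names)
```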

We can now normalize our filtered data.

* Click *Normalization and scaling > Normalization.* Use the recommended settings by clicking ![](https://documentation.partek.com/download/thumbnails/98206053/Screenshot%202024-07-04%20at%2017.29.01.png?version=1\&modificationDate=1720110545187\&api=v2):

<figure><img src="https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-3f104298182c8c4e19983a38b9ad46b195c3d5a5%2Fcosmx%209.png?alt=media" alt=""><figcaption></figcaption></figure>
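To give a sense of what a normalization step does to the counts, here is a minimal Python sketch of one widely used recipe for count data: scale each cell to counts per million, add 1, and take log2. Whether this matches the recommended settings in the screenshot is an assumption; the function name and the `scale` parameter are hypothetical.

```python
import math


def normalize_cell(row, scale=1_000_000):
    """Scale one cell's counts to `scale` total, then apply log2(x + 1)."""
    total = sum(row)
    if total == 0:
        # An empty cell has nothing to scale; return zeros.
        return [0.0 for _ in row]
    return [math.log2(c / total * scale + 1) for c in row]


# Toy data: two cells with the same proportions but different depth.
shallow = normalize_cell([1, 3])
deep = normalize_cell([10, 30])
print(shallow)
print(deep)
```

Because each cell is scaled by its own total, the two toy cells end up with the same normalized values even though one has ten times the counts of the other.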

### Data Exploration <a href="#qa-qc-dataprocessing-anddimensionreduction-dataexploration" id="qa-qc-dataprocessing-anddimensionreduction-dataexploration"></a>

Now that we have filtered out low-quality cells and normalized our data, we can start clustering to identify cell populations.

* Click on the normalized data node
* From the menu on the right select *Exploratory analysis > PCA.* We are going to use the top 2000 features by variance and calculate the first 50 principal components (PCs):

<figure><img src="https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-8c2396e897d99c04f067f143548af3ad3906e8cc%2Fcosmx%2010.png?alt=media" alt=""><figcaption></figcaption></figure>
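The feature selection that precedes the PCA can be sketched as follows. This covers only the "top features by variance" step, not the PCA itself, and it is an illustration rather than Partek Flow's implementation. The function name is hypothetical, and the toy example uses `n_top=2` instead of 2000.

```python
from statistics import pvariance


def top_variance_features(matrix, feature_names, n_top=2000):
    """Keep the n_top features with the highest variance across cells.

    matrix: list of rows (one per cell) of normalized values.
    Columns are returned in their original order.
    """
    n_features = len(feature_names)
    variances = [pvariance([row[j] for row in matrix])
                 for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: variances[j],
                    reverse=True)
    keep = sorted(ranked[:n_top])
    kept_names = [feature_names[j] for j in keep]
    kept_matrix = [[row[j] for j in keep] for row in matrix]
    return kept_matrix, kept_names


# Toy data: three cells, three features. "A" is constant, so it carries
# no variance and is the first to be dropped.
feature_names = ["A", "B", "C"]
matrix = [
    [1, 0, 5],
    [1, 10, 6],
    [1, 20, 7],
]

selected_matrix, selected_names = top_variance_features(
    matrix, feature_names, n_top=2
)
print(selected_names)
```

Restricting the PCA to high-variance features reduces computation and keeps features that do not vary between cells from contributing noise to the components.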

Once the PCA has run, click on the PCA result node in the Analyses tab.

* Select *Exploratory analysis > UMAP* from the toolbox. Set the UMAP parameters as follows:
  * Top **20** PCs
  * Local neighborhood size **60**
  * Minimal distance **0.20**

<figure><img src="https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-bb9ff52b12234832762a717a7b8ffa6bea91ebe1%2Fcosmx%2011.png?alt=media" alt=""><figcaption></figcaption></figure>

While the UMAP is running, we can also queue a clustering task. Click on the PCA result node in the Analyses tab and select *Exploratory analysis > Graph-based clustering.*

* We are going to use the Leiden algorithm to cluster our data (make sure to select the radio button for it)
* Set the number of PCs to **10**
* In the advanced settings, set the resolution parameter to **8e-5** and click **Apply**:

<figure><img src="https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-b48bc71a9d8003b98593a8b85e63db2e10e98c5a%2Fcosmx%2012.png?alt=media" alt=""><figcaption></figcaption></figure>


### Additional Assistance <a href="#qa-qc-dataprocessing-anddimensionreduction-additionalassistance" id="qa-qc-dataprocessing-anddimensionreduction-additionalassistance"></a>

If you need additional assistance, please visit [our support page](http://www.partek.com/support) to submit a help ticket or find phone numbers for regional support.

