# Perturb-seq

## Illumina Connected Multiomics

Illumina Connected Multiomics (ICM) is available for further tertiary analysis of Illumina Single Cell Transcriptomics Perturb-seq data and other multiomic data.

### Getting Started <a href="#getting-started" id="getting-started"></a>

To get started with ICM, refer to the following pages in the ICM user guide:

* [Registration and Login](https://help.connected.illumina.com/icm/introduction/icm)
* [Data Inputs](https://help.connected.illumina.com/icm/introduction/data-inputs)
* [Creating a Study from an ICA Project](https://help.connected.illumina.com/icm/studies/create-study)
* [Viewing Results and Navigating in ICM](https://help.connected.illumina.com/icm/studies/enter-study)

### Demo Data

Demo data for following along with this walkthrough is available in the Connected Multiomics Demo Data repository. The dataset will be provided shortly at /Multiomics-Demo-Data/Perturb-seq. Each sample requires the following input files (shown here for the sample ID sample1):

* sample1.scRNA.filtered.matrix.mtx.gz
* sample1.scRNA.filtered.barcodes.tsv.gz
* sample1.scRNA.filtered.features.tsv.gz
* sample1.scRNA.feature\_barcode\_reference.csv
* sample1.scRNA.positive\_cell\_guide\_assignments.csv
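
Before uploading, it can help to confirm that all five files exist for a sample. The following is a minimal sketch, not part of ICM: the helper names `expected_input_files` and `missing_input_files` and the local directory `data` are hypothetical, and the file name pattern is assumed to be the sample ID followed by the suffixes listed above.

```python
from pathlib import Path

# Suffixes copied from the file list above; the sample ID is prepended to each.
REQUIRED_SUFFIXES = [
    "scRNA.filtered.matrix.mtx.gz",
    "scRNA.filtered.barcodes.tsv.gz",
    "scRNA.filtered.features.tsv.gz",
    "scRNA.feature_barcode_reference.csv",
    "scRNA.positive_cell_guide_assignments.csv",
]


def expected_input_files(sample_id):
    """Return the five file names expected for one sample ID."""
    return [f"{sample_id}.{suffix}" for suffix in REQUIRED_SUFFIXES]


def missing_input_files(directory, sample_id):
    """Return the expected file names that are not present in `directory`."""
    directory = Path(directory)
    return [
        name
        for name in expected_input_files(sample_id)
        if not (directory / name).is_file()
    ]


if __name__ == "__main__":
    # "data" is a hypothetical local folder holding the downloaded demo files.
    for name in missing_input_files("data", "sample1"):
        print(f"Missing input file: {name}")
```

An empty result from `missing_input_files` means that every expected file for that sample ID was found in the directory.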

### Default Single Cell Perturb-seq Analysis

The default Perturb-seq analysis runs a pre-defined pipeline on each sample and presents the results as visualizations in a *Data viewer*.

### Creating a Default Analysis

After adding data to a study, follow these steps to create a Default analysis.

* Click on *+ New Analysis*
* In the pop-up window, provide a name for the analysis
* Select *Default: Illumina Single Cell Transcriptomics Perturb-seq* from the dropdown as the *Analysis Type*
* Choose a sample group to be included in the analysis (the all samples option is selected by default)
* Click on the *Run Analysis* button

<figure><img src="https://580316046-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWMxqQAMFOJtu98OBk9KN%2Fuploads%2Fgit-blob-a90a9be9edfabbfdbf36776ced957d18111c85ce%2FptbDefault-1.png?alt=media" alt=""><figcaption></figcaption></figure>

### View Default Analysis Results

When the analysis status is *Complete*, click on the analysis tile to open the results. The analysis opens in a *Data viewer* that consists of a UMAP colored by cluster ID, a biomarker table, a cell composition pie chart, and the distributions of *Total count* and *Expressed genes* within the different clusters.

<figure><img src="https://580316046-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWMxqQAMFOJtu98OBk9KN%2Fuploads%2Fgit-blob-63d14b6f2b5e7f6475df2627ba63d15852a90c2b%2FptbDefault-2.png?alt=media" alt=""><figcaption></figcaption></figure>

To configure a plot, select the *Configure* option from the toolbox on the left side of that plot. The example below converts the above UMAP into a feature plot by recoloring the single cells by a ‘feature’, in this case one of the *gRNAs* (*PDCD10\_4*). In the feature plot, cells shown in red carry the perturbation of this *gRNA*, while cells without the perturbation for this specific guide are shown in grey.

<figure><img src="https://580316046-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWMxqQAMFOJtu98OBk9KN%2Fuploads%2Fgit-blob-b944bf057ef86cf3166833607cf9648d2f66b968%2FptbDefault-2.1.png?alt=media" alt=""><figcaption></figcaption></figure>

### View Default Analysis Pipeline

To view the default analysis pipeline, click on the *Analysis* name in the breadcrumb at the top of the *Data viewer* to go to the *Analyses* page. The default analysis pipeline looks like this:

<figure><img src="https://580316046-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FWMxqQAMFOJtu98OBk9KN%2Fuploads%2Fgit-blob-ffb43f44b6393d171e01b0683788e9010fde7e52%2FptbDefault-3.png?alt=media" alt=""><figcaption></figcaption></figure>

The first task (*Split by feature type*) that runs on the imported data splits it by feature type: *CRISPR Direct Capture (gRNAs)* and *Gene Expression*. This split is needed because including *gRNAs* in standard analyses such as clustering and differential expression can lead to unexpected clustering results. The rest of the pipeline is the same as for regular *scRNA-seq*; for more details about the tasks, see the [default single cell analysis documentation](https://app.gitbook.com/s/qVEYIKB8JFfdScsTocFN/tertiary-analysis/illumina-connected-multiomics#default-single-cell-analysis).
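
The idea behind splitting by feature type can be illustrated outside of ICM with a short sketch. This is not how ICM implements the task. It assumes each feature is described by an (ID, name, feature type) row, with the feature type in the third column, and the function name `split_by_feature_type` and the example rows are hypothetical.

```python
from collections import defaultdict


def split_by_feature_type(feature_rows):
    """Group feature row indices by feature type (assumed to be column 3)."""
    groups = defaultdict(list)
    for index, row in enumerate(feature_rows):
        feature_type = row[2]
        groups[feature_type].append(index)
    return dict(groups)


# Hypothetical feature rows: two genes and one guide RNA.
rows = [
    ("gene_id_1", "GENE_A", "Gene Expression"),
    ("PDCD10_4", "PDCD10_4", "CRISPR Direct Capture"),
    ("gene_id_2", "GENE_B", "Gene Expression"),
]

groups = split_by_feature_type(rows)
gene_rows = groups["Gene Expression"]         # rows used for clustering
guide_rows = groups["CRISPR Direct Capture"]  # rows kept separate as gRNAs
```

The resulting index lists could then be used to select the matching rows of the count matrix, so that clustering and differential expression see only the gene expression features.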
