# Exploring the data with PCA

Principal component analysis (PCA) is a way to explore the overall similarity between samples, visualize possible groupings within the data set, and detect outliers.

* Select **PCA Scatter Plot** from the *QA/QC* section

![](/files/EkuDox9yEPbCltRDijCj)

Figure 1. Principal component analysis showing total allele intensities of normal (blue) and cancer (red) samples. Each dot represents a single sample.

Each dot on the plot corresponds to a single sample and can be thought of as a summary of all normalized marker intensities for the sample. The first categorical column is used to color the plot; here, tumor samples are shown in red and normal samples are shown in blue.
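As a rough illustration of how a sample's many marker intensities collapse into a single plotted position, the sketch below projects a small samples-by-markers table onto its first principal component using power iteration. This is not necessarily how Partek Genomics Suite computes its PCA; the function name `first_principal_component` and the toy table are hypothetical and exist only for this example.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (unit direction, per-sample scores) for PC1.

    rows is a samples-by-markers table (hypothetical input format).
    """
    n, m = len(rows), len(rows[0])
    # Center each marker (column) on its mean across samples.
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]

    # Power iteration: repeatedly apply X^T X to a starting vector.
    # Assumption: the starting vector is not orthogonal to PC1.
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
        w = [sum(centered[i][j] * scores[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # no variation in the data
        v = [x / norm for x in w]

    # Each sample's score is its coordinate along PC1.
    scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
    return v, scores

# Toy table: three samples, two markers, all lying on one line.
toy_table = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
direction, scores = first_principal_component(toy_table)
print(direction, scores)
```

Each entry of `scores` plays the role of one dot's coordinate on a single axis; a three-dimensional plot uses three such coordinates per sample.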

To better view the data, we can rotate the plot.

* Select ![](/files/L3y1GsarXk1FqCqvfiwl) to activate *Rotate Mode*
* Click and drag to rotate the plot

Rotating the plot lets us look for outliers along each of the three principal components (PC1-3). The percentage of the total variation explained by each PC is listed by its axis label, and the chart label shows the combined percentage of the total variation explained by the displayed PCs.
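The arithmetic behind these percentages can be sketched as follows, assuming the variance captured by each principal component is already known. The function name `percent_explained` and the variance values are made up for illustration and do not come from the tutorial data set.

```python
def percent_explained(variances, shown=3):
    """Return (per-PC percentages for the shown PCs, their combined percentage)."""
    total = sum(variances)
    per_pc = [100.0 * v / total for v in variances[:shown]]
    return per_pc, sum(per_pc)

# Hypothetical variances for four components; only the first three are plotted.
example_variances = [5.0, 3.0, 1.0, 1.0]
axis_percentages, combined_percentage = percent_explained(example_variances)
print(axis_percentages)     # one value per displayed axis
print(combined_percentage)  # the combined value for the displayed PCs
```

A combined value well below 100% means the three displayed axes leave a substantial share of the variation unshown, so apparent outliers or clusters deserve a second look.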

We can see that the peripheral blood samples (normal) cluster together whereas the cancer tissue samples (tumor) are more dispersed and show considerable variability. This corresponds well with the known genomic variability of cancer cells.

To view the similarity of paired normal and tumor samples from the same patient, we can connect dots by Subject ID.

* Select **4. SubjectID** from the *Connect by* drop-down menu in the upper right-hand corner of the plot tab

Paired tumor and normal samples are now connected by lines, illustrating the range of differences between normal and tumor copy number in the data set (Figure 2).

![](/files/UDGlx9XsuHM3YXlPLiSd)

Figure 2. Lines connect paired tumor and normal samples.
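The length of each connecting line can be read as a rough measure of how far a patient's tumor sample sits from the matching normal sample in PC space. The sketch below computes that length, assuming the PC coordinates have been exported as tuples; the function name `paired_distances`, the record layout, and the coordinates are all hypothetical.

```python
import math
from collections import defaultdict

def paired_distances(samples):
    """Map each subject ID to the distance between its normal and tumor samples.

    samples: iterable of (subject_id, tissue, (pc1, pc2, pc3)) records.
    Subjects lacking either a normal or a tumor sample are skipped.
    """
    by_subject = defaultdict(dict)
    for subject_id, tissue, coords in samples:
        by_subject[subject_id][tissue] = coords
    return {
        subject_id: math.dist(pair["normal"], pair["tumor"])
        for subject_id, pair in by_subject.items()
        if "normal" in pair and "tumor" in pair
    }

# Made-up coordinates for illustration only.
example_samples = [
    ("P1", "normal", (0.0, 0.0, 0.0)),
    ("P1", "tumor", (3.0, 4.0, 0.0)),
    ("P2", "normal", (1.0, 1.0, 1.0)),
    ("P2", "tumor", (1.0, 1.0, 3.0)),
    ("P3", "tumor", (9.0, 9.0, 9.0)),  # no matching normal sample
]
distances = paired_distances(example_samples)
print(distances)  # {'P1': 5.0, 'P2': 2.0}
```

Because the distance uses only the displayed components, it understates the full difference between samples whenever those components explain only part of the total variation.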

## Additional Assistance

If you need additional assistance, please visit [our support page](http://www.partek.com/support) to submit a help ticket or find phone numbers for regional support.

