# Detect differentially methylated CpG islands

The approach described in previous sections uses ANOVA to detect differentially methylated CpG sites and takes individual sites as the starting point for interpretation. Because ANOVA compares M values at each site independently, every comparison involves a single probe and therefore a single probe type, so this strategy is robust to type I/type II probe bias.

An alternative is to first summarize all the probes belonging to a CpG island region (i.e. island, N-shore, N-shelf, S-shore, S-shelf) and then use ANOVA to compare regions across the groups. Because the summary would combine type I and type II probes, you may want to split the analysis into two branches and analyze type I and type II probes independently. To do this, we need to annotate each probe as type I or type II.

* Select the **mvalue** spreadsheet
* Select **Transform** from the main toolbar
* Select **Create Transposed Spreadsheet...** from the *Transform* drop-down menu (Figure 1)

![](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-39412a66395fc9a3c77d783d50ff2c791890beda%2Fimage2017-12-26%2013_37_7.png?alt=media)

Figure 1. Creating a transposed spreadsheet

* Select **Sample ID** for *Column:* and **numeric** for *Data Type:*
* Select **OK**

A new temporary spreadsheet will be created with a row for each probe and columns for each sample.
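For readers who want to see what the transposition does to the data, here is a minimal sketch in plain Python. It is not how Partek Genomics Suite implements the step; the sample IDs, probe IDs, and M values are made up for illustration.

```python
# Toy M-value table: one row per sample, one column per probe.
# All identifiers and values below are hypothetical.
sample_ids = ["S1", "S2"]
probe_ids = ["cg01", "cg02", "cg03"]
mvalues = [
    [1.2, -0.5, 0.3],  # S1
    [0.8, -0.1, 0.7],  # S2
]

# zip(*mvalues) walks the table column by column, so each probe
# ends up with one value per sample, in the order of sample_ids.
by_probe = {
    probe: list(column)
    for probe, column in zip(probe_ids, zip(*mvalues))
}

for probe, values in by_probe.items():
    print(probe, values)
```

After the transposition each probe is a row, which is the orientation needed to attach per-probe annotations.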

* Right-click on column *1. ID* to bring up the pop-up menu
* Select **Insert Annotation**
* Select **Add as categorical**
* Select **Infinium\_Design\_Type** and **UCSC\_CpG\_Islands\_Name** from the *Column Configuration* options (Figure 2)

![](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-b94d790d1a3959aa2a9c82471e918c93ead82a9a%2Fimage2017-12-26%2013_38_12.png?alt=media)

Figure 2. Adding Infinium design type and CpG island annotations

* Select **OK** to add the Infinium design type and UCSC CpG island name as categorical columns on the spreadsheet
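Conceptually, the annotation step is a lookup that attaches two fields to every probe row. The sketch below illustrates this in plain Python under stated assumptions: the `manifest` dictionary is a hypothetical stand-in for the array annotation file, and the probe IDs, island names, and M values are invented. Only the two column names are taken from the steps above.

```python
# Hypothetical annotation excerpt: probe ID -> (design type, CpG island name).
# An empty island name marks a probe outside any UCSC CpG island.
manifest = {
    "cg01": ("I", "chr1:100-500"),
    "cg02": ("II", "chr1:100-500"),
    "cg03": ("II", ""),
}

# M values with one row per probe and one value per sample (made-up numbers).
by_probe = {
    "cg01": [1.2, 0.8],
    "cg02": [-0.5, -0.1],
    "cg03": [0.3, 0.7],
}

# Attach both annotation fields to every probe row.
annotated = [
    {
        "probe": probe,
        "Infinium_Design_Type": manifest[probe][0],
        "UCSC_CpG_Islands_Name": manifest[probe][1],
        "mvalues": values,
    }
    for probe, values in by_probe.items()
]
```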

Now, we can use the interactive filter to create separate spreadsheets for type I and type II probes.

* Select (![](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-7f7de8359902ff1937fa80d15fc65413d93f1081%2Finteractive%20filter%20icon.png?alt=media)) to launch the interactive filter
* Select **2. Infinium\_Design\_Type** from the drop-down menu if not selected by default
* Left-click the **type I** column to exclude it
* Right-click the temporary spreadsheet in the spreadsheet tree to bring up the pop-up dialog
* Select **Clone...** (Figure 3)

![](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-d0a13bc87e9218cec46e148104c34c1d82e2e50a%2Fimage2017-12-26%2013_40_13.png?alt=media)

Figure 3. Creating a probe list with only Infinium type II probes

* Name the new spreadsheet **female\_only\_typeII\_probes**
* Select **OK**
* Save the created spreadsheet; we chose the file name *female\_only\_typeII\_probes*
* Repeat the process to create a spreadsheet for type I probes
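The filter-and-clone steps above amount to partitioning the probe rows by design type. A minimal sketch in plain Python, using invented probe rows and assuming the design type is stored as the strings `"I"` and `"II"`:

```python
# Hypothetical annotated probe rows (values are made up).
annotated = [
    {"probe": "cg01", "Infinium_Design_Type": "I", "mvalues": [1.2, 0.8]},
    {"probe": "cg02", "Infinium_Design_Type": "II", "mvalues": [-0.5, -0.1]},
    {"probe": "cg03", "Infinium_Design_Type": "II", "mvalues": [0.3, 0.7]},
]

# One list per design type, mirroring the two cloned spreadsheets.
type_i = [row for row in annotated if row["Infinium_Design_Type"] == "I"]
type_ii = [row for row in annotated if row["Infinium_Design_Type"] == "II"]

print(len(type_i), "type I probes;", len(type_ii), "type II probes")
```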

The temporary spreadsheet is no longer needed so we can close it.

* Close the temporary spreadsheet by selecting it in the file tree and selecting (![](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-50243ff924901a858d037a8db93ba03a8e7dde29%2Fclose%20spreadsheet.png?alt=media))

We can use these spreadsheets to generate lists of M values at CpG island regions.

* Select spreadsheet **female\_only\_typeII\_probes**
* Select **Stat** from the main toolbar
* Select **Column Statistics...** under *Descriptive* (Figure 4)

![](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-802386a28f427435050628b5cdab4866f806d26f%2Fimage2017-12-26%2013_43_12.png?alt=media)

Figure 4. Selecting column statistics

* Add **Mean** to the *Selected Measure(s)* panel
* Select **Group By** and set it to **3. UCSC\_CpG\_Islands\_Name** (Figure 5)

![](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-c0f3e612be0155559b3d42de33498d3e017b7d6e%2Fimage2017-12-26%2013_43_39.png?alt=media)

Figure 5. Configuring column statistics

* Select **OK**

The new temporary spreadsheet has one CpG island region per row (Figure 6), samples on columns, and the values in the cells represent the mean of M values of all the CpG probes in the region.

![](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-950d5d79c2566377e17225a413e51f8515f298e7%2Fimage2017-12-26%2013_44_24.png?alt=media)

Figure 6. New spreadsheet with average M values for probes at each CpG island; probes not at CpG islands are collected into the first row "- Mean"

Note the first row, with label “*– Mean*”. It corresponds to all the probes that map outside of UCSC CpG islands. As it is not needed for the downstream analysis, we will remove it.

* Right-click on the row header for the *– Mean* row
* Select **Delete** to remove the row
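The grouping and clean-up steps above can be sketched in plain Python as follows. This is an illustration of the arithmetic only, not Partek's implementation: the probe rows are invented, and an empty island name is assumed to mark probes that map outside UCSC CpG islands.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical type II probe rows; each has one M value per sample (S1, S2).
rows = [
    {"probe": "cg02", "UCSC_CpG_Islands_Name": "chr1:100-500", "mvalues": [-0.5, -0.25]},
    {"probe": "cg04", "UCSC_CpG_Islands_Name": "chr1:100-500", "mvalues": [0.5, 0.75]},
    {"probe": "cg03", "UCSC_CpG_Islands_Name": "", "mvalues": [0.3, 0.7]},
]

# Collect the per-sample M values of all probes that share an island name.
grouped = defaultdict(list)
for row in rows:
    grouped[row["UCSC_CpG_Islands_Name"]].append(row["mvalues"])

# For each island, average across probes separately for every sample.
island_means = {
    island: [mean(sample_values) for sample_values in zip(*probe_values)]
    for island, probe_values in grouped.items()
}

# Drop the group of probes that do not belong to any CpG island.
island_means.pop("", None)

print(island_means)
```

Averaging is done per sample, so each island keeps one value for every sample and the result can still be compared across groups.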

The final step is to transpose the data back to its original orientation.

* Select **Transform** from the main toolbar
* Select **Create Transposed Spreadsheet...** from the *Transform* drop-down menu
* Select **2. Level** for *Column:* and **numeric** for *Data Type:*
* Select **OK**

The layout of the new transposed spreadsheet is as follows: one sample per row with CpG island regions on columns; cell entries correspond to the mean M value of the region (Figure 7). If you did not delete the *– Mean* row in the previous step, it appears here as a column with a blank column header; it holds the average of all probes not associated with CpG island regions, and you can delete it.

![](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-480562488871b2b831cbaac9b77f955cdaa041f0%2Fimage2017-12-26%2013_46_27.png?alt=media)

Figure 7. Spreadsheet with average M values of probes in each CpG island for each sample

* Right-click the transposed spreadsheet, *2\_transpose*
* Select **Save as...** from the pop-up menu
* Name it **mvalues\_typeII\_probes\_CpG\_islands**
* Close the source temporary spreadsheet by selecting it in the spreadsheet tree and selecting (![](https://1384254481-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FJVEESmJAPppJ3ijFq5aR%2Fuploads%2Fgit-blob-50243ff924901a858d037a8db93ba03a8e7dde29%2Fclose%20spreadsheet.png?alt=media))

The *mvalues\_typeII\_probes\_CpG\_islands* spreadsheet can be used as a starting point for ANOVA and other analyses. You can also repeat the steps above to create an equivalent spreadsheet for type I probes.
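To make the region-level comparison concrete, the sketch below computes a one-way ANOVA F statistic for a single CpG island in plain Python. It is a simplified illustration, not the ANOVA model used by Partek Genomics Suite, which may include additional factors. The group names `A` and `B` and all values are hypothetical, and the sketch stops at the F statistic without deriving a p-value.

```python
from statistics import mean


def one_way_f(groups):
    """Return the one-way ANOVA F statistic for a list of value groups."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of samples

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the samples around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Mean M values of one hypothetical CpG island, split by sample group.
island_by_group = {
    "A": [1.0, 2.0, 3.0],
    "B": [2.0, 3.0, 4.0],
}

f_stat = one_way_f(list(island_by_group.values()))
print(f"F = {f_stat:.3f}")
```

In practice the same calculation would be repeated for every CpG island column, followed by a multiple-testing correction.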

## Additional Assistance

If you need additional assistance, please visit [our support page](http://www.partek.com/support) to submit a help ticket or find phone numbers for regional support.
