General linear model

This method is based on a general linear model. Much like ANOVA in reverse, it calculates the variation attributed to the factor(s) being removed and then adjusts the original values to remove that variation.
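The idea can be illustrated with a minimal NumPy sketch. It is not the tool's actual implementation: the function name `remove_batch_effect`, the reference-level coding, and the choice to shift every batch onto the first batch's level are assumptions made for illustration. The sketch fits a linear model with a treatment term and a batch term to each feature, then subtracts only the fitted batch contribution.

```python
import numpy as np

def remove_batch_effect(values, treatment, batch):
    """Illustrative only: fit values ~ intercept + treatment + batch per feature,
    then subtract the fitted batch term.

    values    : (n_samples, n_features) array, ideally on a log scale
    treatment : length n_samples labels for the factor to keep
    batch     : length n_samples labels for the factor to remove
    """
    values = np.asarray(values, dtype=float)
    treatment = np.asarray(treatment)
    batch = np.asarray(batch)

    def dummy_columns(labels):
        # One indicator column per level; the first level is the reference.
        levels = np.unique(labels)
        return (labels[:, None] == levels[None, 1:]).astype(float)

    intercept = np.ones((values.shape[0], 1))
    t_cols = dummy_columns(treatment)
    b_cols = dummy_columns(batch)
    design = np.hstack([intercept, t_cols, b_cols])

    # Least-squares fit of every feature (column of values) at once.
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)

    # Rows of coef that belong to the batch columns of the design matrix.
    batch_coef = coef[1 + t_cols.shape[1]:]

    # Remove only the variation attributed to batch; treatment is untouched.
    return values - b_cols @ batch_coef

# Hypothetical data: 8 samples, 2 features, treatment effect plus batch shift.
treatment = np.array([0, 0, 1, 1, 0, 0, 1, 1])
batch = np.array([0, 0, 0, 0, 1, 1, 1, 1])
values = np.column_stack([10 + 2 * treatment + 5 * batch,
                          3 + 1 * treatment - 4 * batch]).astype(float)
corrected = remove_batch_effect(values, treatment, batch)
```

In this noise-free example the corrected values keep the treatment difference while the shift between the two batches disappears.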

By including batch in the differential analysis model, the variability due to the batch effect is accounted for when calculating p-values. In this sense, batch effects are best handled as part of the differential analysis model. However, clustering data or visualizing biological effects can be very difficult if batch effects are present in the original data. This tool transforms the original values to remove the batch effect.

We recommend normalizing your data (for example, with a log transformation) before removing batch effects with this method, but the task will run on any counts data node.
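As a rough illustration of a log transformation, the snippet below applies log2 with an offset of 1 to a small, made-up count matrix. The base and the offset are assumptions chosen for the example, not settings taken from the tool.

```python
import numpy as np

# Hypothetical raw counts: 2 samples x 3 features.
counts = np.array([[0, 10, 1000],
                   [5, 20, 800]], dtype=float)

# Add 1 before taking the log so that zero counts stay finite (they map to 0).
log_values = np.log2(counts + 1.0)
```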

Click the counts data node
Click the Batch removal section of the toolbox
Click General linear model

The batch effect removal dialog is similar to the ANOVA dialog. To set up the model, choose the factors and factor interactions that reflect your experimental design, in addition to the attribute(s) representing the batch you would like to remove.

For example, suppose your samples come from different cell types and treatments and were processed in different batches. Because batch may affect different cell types and/or treatments differently, you would need to include treatment, cell type, and batch in the model, and possibly the interaction between treatment and cell type.
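A model of this shape can be sketched in NumPy as follows. Everything here is hypothetical and for illustration only: the sample annotations, the helper names `indicators` and `interaction`, and the simulated effect sizes are not taken from the tool. The design matrix holds treatment, cell type, their interaction, and batch, and only the fitted batch term is subtracted.

```python
import numpy as np

def indicators(labels):
    """Indicator columns for every level except the first (reference) level."""
    labels = np.asarray(labels)
    levels = np.unique(labels)
    return (labels[:, None] == levels[None, 1:]).astype(float)

def interaction(a, b):
    """All pairwise products of the columns of a and the columns of b."""
    return (a[:, :, None] * b[:, None, :]).reshape(a.shape[0], -1)

# Hypothetical annotations: 2 treatments x 2 cell types x 2 batches.
treatment = np.array(["ctrl", "drug"] * 4)
cell_type = np.array(["B", "B", "T", "T"] * 2)
batch = np.array(["b1"] * 4 + ["b2"] * 4)

t, c, b = indicators(treatment), indicators(cell_type), indicators(batch)
terms = {
    "intercept": np.ones((len(batch), 1)),
    "treatment": t,
    "cell_type": c,
    "treatment:cell_type": interaction(t, c),
    "batch": b,  # kept last so its coefficients are the final rows of the fit
}
design = np.hstack(list(terms.values()))

# Simulated values for one feature: biological effects plus a batch shift of 4.
drug, is_t, b2 = t[:, 0], c[:, 0], b[:, 0]
values = 5 + 1 * drug + 2 * is_t + 3 * drug * is_t + 4 * b2

coef, *_ = np.linalg.lstsq(design, values, rcond=None)
batch_coef = coef[-b.shape[1]:]

# Subtract only the batch term; treatment, cell type, and interaction remain.
corrected = values - b @ batch_coef
```

Only the batch columns are used in the subtraction, so the variation explained by the other terms stays in the corrected values.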

Select the Remove checkbox for batch
Click Finish to run

The output is a new data node that contains the batch-effect-corrected values; the variation due to treatment and cell type is retained. It can be used as the input for downstream tasks.
