Descriptive statistics
The Descriptive statistics task can be invoked on any matrix data node, e.g. a Gene counts or Normalized counts data node in a bulk RNA-seq analysis pipeline, or a Single cell counts data node. It calculates measures of central tendency and variability on the observations or features of the matrix data.
Running Descriptive statistics
Click on a matrix data node
Choose Descriptive Statistics in the Statistics section of the toolbox

This opens the configuration dialog; use it to specify which calculation(s) will be performed on observations or features.

When the calculation is set to observations (samples or cells), a drop-down option lets you use either all the features in the input data node or a list of features. If you use a saved feature list, use the check box to select whether matching the saved list to your data is case sensitive.
When the calculation is set to features, the Group by drop-down list lets you compute the statistics in each group separately.

Click on the button to add more than one attribute; the statistics are then computed on the groups formed by the interaction terms (combinations of values) of the selected attributes.
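The grouping described above can be sketched in plain Python. This is only an illustration, not the task's implementation: the attribute names (`treatment`, `batch`), the sample records, and the helper `group_means` are all hypothetical. Each distinct combination of the selected attribute values forms one group, and the statistic (here the mean) is computed within each group.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical observations: one feature value per sample plus two attributes.
samples = [
    {"treatment": "ctrl", "batch": "A", "value": 2.0},
    {"treatment": "ctrl", "batch": "A", "value": 4.0},
    {"treatment": "ctrl", "batch": "B", "value": 6.0},
    {"treatment": "drug", "batch": "A", "value": 10.0},
    {"treatment": "drug", "batch": "B", "value": 2.0},
    {"treatment": "drug", "batch": "B", "value": 3.0},
]

def group_means(records, attributes):
    """Mean of 'value' within each combination of the given attributes."""
    groups = defaultdict(list)
    for rec in records:
        # The group key is the tuple of attribute values (the interaction term).
        key = tuple(rec[a] for a in attributes)
        groups[key].append(rec["value"])
    return {key: mean(vals) for key, vals in groups.items()}

by_treatment = group_means(samples, ["treatment"])             # one attribute: 2 groups
by_interaction = group_means(samples, ["treatment", "batch"])  # two attributes: 4 groups

for key, value in sorted(by_interaction.items()):
    print(key, value)
```

With a single attribute there is one group per attribute value; adding a second attribute splits each of those groups further, so the number of groups is at most the product of the numbers of levels.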
The available statistics are listed in the left panel. Suppose $x_1, x_2, \ldots, x_n$ represent an array of numbers.
Coefficient of variation (CV): $CV = s / \bar{x}$, where $s$ represents the standard deviation and $\bar{x}$ the mean
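Geometric mean: $\left(\prod_{i=1}^{n} x_i\right)^{1/n}$
Max: $\max(x_1, x_2, \ldots, x_n)$
Mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$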
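Median: with the values sorted in ascending order as $x_{(1)} \le x_{(2)} \le \ldots \le x_{(n)}$, when $n$ is odd, median is $x_{((n+1)/2)}$; when $n$ is even, median is $\frac{1}{2}\left(x_{(n/2)} + x_{(n/2+1)}\right)$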
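Median absolute deviation: $\operatorname{median}\left(|x_i - \tilde{x}|\right)$, where $\tilde{x}$ is the median of $x_1, x_2, \ldots, x_n$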
Min: $\min(x_1, x_2, \ldots, x_n)$
Number of cells: Available when Calculate for is set to Features. Reports the number of cells whose value satisfies the selected comparison [<, <=, =, !=, >, >=] (select one from the drop-down list) against the cutoff value entered in the text box. The cutoff is applied to the values present in the input data node, i.e. if the task is invoked on a non-normalized data node, the values are raw counts. For instance, use this option to find the number of cells in which each feature was detected; possible filter: Number of cells whose value > 0.0
Percent of cells: Available when Calculate for is set to Features. Reports the percentage of cells whose value satisfies the selected comparison [<, <=, =, !=, >, >=] (select one from the drop-down list) against the cutoff value entered in the text box.
Number of features: Available when Calculate for is set to Cells. Reports the number of features whose value satisfies the selected comparison [<, <=, =, !=, >, >=] (select one from the drop-down list) against the cutoff value entered in the text box. The cutoff is applied to the values present in the input data node, i.e. if the task is invoked on a non-normalized data node, the values are raw counts. For example, use this option to find the number of detected genes per cell; filter: Number of features whose value > 0.0
Percent of features: Available when Calculate for is set to Cells. Reports the percentage of features whose value satisfies the selected comparison [<, <=, =, !=, >, >=] (select one from the drop-down list) against the cutoff value entered in the text box.
Q1: 25th percentile
Q3: 75th percentile
Range: $x_{\max} - x_{\min}$
Standard deviation: $s = \sqrt{s^2}$, where $s^2$ is the variance
Sum: $\sum_{i=1}^{n} x_i$
Variance:
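To make the statistics listed above concrete, here is a minimal sketch using only the Python standard library. It is not the task's implementation. The input values and the helper `count_passing` are hypothetical. The sketch assumes the sample (n - 1) denominator for variance and standard deviation and a 0 to 100 scale for percentages; the page does not state which conventions the task uses.

```python
import math
import operator
import statistics

x = [1.0, 2.0, 4.0, 8.0, 16.0]  # hypothetical values x_1..x_n

mean = statistics.mean(x)
median = statistics.median(x)
geometric_mean = statistics.geometric_mean(x)  # needs strictly positive values
variance = statistics.variance(x)              # assumption: n - 1 denominator
std_dev = statistics.stdev(x)                  # square root of the variance above
cv = std_dev / mean                            # coefficient of variation as a ratio
mad = statistics.median([abs(v - median) for v in x])  # unscaled median absolute deviation
value_range = max(x) - min(x)
total = sum(x)

# The comparison choices offered in the drop-down list, mapped to Python operators.
COMPARATORS = {
    "<": operator.lt, "<=": operator.le, "=": operator.eq,
    "!=": operator.ne, ">": operator.gt, ">=": operator.ge,
}

def count_passing(values, op, cutoff):
    """Count the values that satisfy `value <op> cutoff` (hypothetical helper)."""
    compare = COMPARATORS[op]
    return sum(1 for v in values if compare(v, cutoff))

counts = [0, 3, 0, 5, 1]  # hypothetical raw counts of one gene in five cells
n_detected = count_passing(counts, ">", 0.0)      # cells in which the gene was detected
pct_detected = 100.0 * n_detected / len(counts)   # assumption: reported on a 0-100 scale

print(mean, median, geometric_mean, variance, std_dev, cv, mad, value_range, total)
print(n_detected, pct_detected)
```

The same `count_passing` helper applied to one cell's values across all features would give the Number of features statistic, e.g. the number of detected genes per cell with the filter `> 0.0`.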
To choose a statistic, left-click it and drag it to the right panel (one at a time), or mouse over it and click the + button to move it to the right panel. When all the statistics you want are in the right panel, click Finish.
The output data node can be downloaded or visualized in the Data Viewer.
