Filter samples/cells

Depending on the input data, you can filter samples or cells so that downstream analysis is performed on a subset of the data.

Click a count matrix or single cell counts data node, open the Filtering section of the pop-up menu, and choose Filter samples (bulk data) or Filter cells (single cell data).

The dialog lets you build a series of filters based on sample or cell attributes.

Click Finish to apply the filter. If no samples or cells pass the filter criteria, a warning message appears and the task does not run.

Filter by metadata

The first drop-down menu allows you to choose to include or exclude based on the specified criteria.

The second drop-down menu allows you to choose any categorical or numeric attribute to use for the filter criteria.

If the attribute is categorical, the third drop-down menu includes in and not in as options. A fourth drop-down menu allows you to search and choose from the levels of the selected attribute.

If the attribute is numeric, the third drop-down menu includes:

  • <: less than

  • <=: less than or equal to

  • ==: equal to

  • >: greater than

  • >=: greater than or equal to

The threshold is set using the text box. The input must be a number; it can be an integer or decimal, positive or negative.
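The way a single criterion is evaluated can be sketched in a few lines of Python. This is only an illustration of the logic described above, not the software's implementation: the `matches` function, the `NUMERIC_OPS` table, and the sample attributes (`tissue`, `age`) are all hypothetical names chosen for the example.

```python
import operator

# Hypothetical mapping from the operator labels in the dialog to Python functions.
NUMERIC_OPS = {
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    ">=": operator.ge,
}


def matches(sample, attribute, op, value):
    """Return True if one sample satisfies a single criterion."""
    actual = sample[attribute]
    if op == "in":  # categorical attribute: value is a set of levels
        return actual in value
    if op == "not in":
        return actual not in value
    # Numeric attribute: the threshold may be an integer or a decimal,
    # positive or negative, so it is converted with float().
    return NUMERIC_OPS[op](actual, float(value))


# Made-up sample metadata, for illustration only.
samples = [
    {"name": "S1", "tissue": "liver", "age": 34},
    {"name": "S2", "tissue": "brain", "age": 61},
    {"name": "S3", "tissue": "liver", "age": 58},
]

liver = [s["name"] for s in samples if matches(s, "tissue", "in", {"liver"})]
older = [s["name"] for s in samples if matches(s, "age", ">=", 58)]
print(liver)  # ['S1', 'S3']
print(older)  # ['S2', 'S3']
```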

Using the OR and AND options, you can combine multiple filters.

When combining multiple filters all set to Include:

With AND, all statements must be true for the sample to meet the filter criteria.

With OR, if any statement is true, the sample will meet the filter criteria.
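For filters set to Include, the AND and OR behavior corresponds to Python's built-in `all` and `any`. The sketch below is a hypothetical illustration with made-up sample metadata; the `combine` and `include` helpers are not part of the software.

```python
def combine(results, mode):
    """Combine the per-criterion results (booleans) for one sample."""
    if mode == "AND":
        return all(results)  # every statement must be true
    if mode == "OR":
        return any(results)  # at least one statement must be true
    raise ValueError(f"unknown mode: {mode}")


def include(samples, criteria, mode):
    """Return the names of the samples that meet the combined Include criteria."""
    return [
        s["name"]
        for s in samples
        if combine([criterion(s) for criterion in criteria], mode)
    ]


# Made-up sample metadata, for illustration only.
samples = [
    {"name": "S1", "tissue": "liver", "age": 34},
    {"name": "S2", "tissue": "brain", "age": 61},
    {"name": "S3", "tissue": "liver", "age": 58},
]

# Two Include criteria: tissue in {liver}, age >= 50.
criteria = [
    lambda s: s["tissue"] == "liver",
    lambda s: s["age"] >= 50,
]

kept_and = include(samples, criteria, "AND")
kept_or = include(samples, criteria, "OR")
print(kept_and)  # ['S3']
print(kept_or)   # ['S1', 'S2', 'S3']
```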

When combining multiple filters all set to Exclude:

With AND, if any statement is true, the sample will meet the filter criteria.

With OR, all statements must be true for the sample to meet the filter criteria.

Filter by features

You can use the expression values of a feature to generate a subset of the data. For instance, you can include all the samples/cells whose GAPDH gene expression value is greater than or equal to 5.

You can search for a feature by typing in the search box in the second drop-down menu.

The output of the task will contain a data node with the same features as the input data but only the observations that meet the filter criteria.
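As a rough illustration of the GAPDH example, the following Python sketch keeps only the observations whose expression value for one feature is at or above a threshold, while leaving the set of features unchanged. The `counts` values and the `filter_by_feature` helper are hypothetical and do not reflect how the software stores or processes data.

```python
# Made-up expression values: one entry per observation (cell or sample),
# each holding the same set of features.
counts = {
    "cell_1": {"GAPDH": 7.2, "ACTB": 3.1, "CD3E": 0.0},
    "cell_2": {"GAPDH": 4.9, "ACTB": 6.4, "CD3E": 2.2},
    "cell_3": {"GAPDH": 5.0, "ACTB": 1.8, "CD3E": 0.5},
}


def filter_by_feature(counts, feature, threshold):
    """Keep observations whose value for `feature` is >= `threshold`.

    Every kept observation retains all of its features.
    """
    return {
        obs: values
        for obs, values in counts.items()
        if values[feature] >= threshold
    }


filtered = filter_by_feature(counts, "GAPDH", 5)
print(sorted(filtered))            # ['cell_1', 'cell_3']
print(sorted(filtered["cell_1"]))  # ['ACTB', 'CD3E', 'GAPDH']
```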
