HTSeq
Last updated
Last updated
HTSeq is set of tools for processing high-throughput sequencing data1. In Partek Flow, we have implemented the htseq-count script from HTSeq for quantifying aligned reads to an annotation model.
The input for HTSeq is an Aligned reads data node and a Gene/Feature annotation file.
To run HTSeq:
Click an Aligned reads data node
Click the Quantification section of the toolbox
Click HTSeq
Please note that HTSeq has not been optimized for performance and can take a very long time to run compared with Quantify to annotation model (Partek E/M) on the same data.
HTSeq includes basic options (Figure 1) and advanced options accessible by clicking Configure (Figure 2).
The annotation file contains the features the aligned reads will be quantified to. For more information about adding an annotation model, please see Adding an Annotation Model.
Depending on the library preparation method, information about the strand of the original transcript may be faithfully preserved or lost. This setting controls whether HTSeq considers strand during quantification. Consult your library preparation method user manual if you are unsure about if and how the method preserved strand information.
If set to no, a read is considered to be overlapping a feature regardless of whether it maps to the same or opposite strand as the feature.
If set to yes, the read has to be matched to the same strand as the feature if single-end reads and matched to the same strand for the first read and the opposite strand for the second read if paired-end reads.
If set to reverse, the read has to be matched to the opposite strand as the feature if single-end reads and matched to the opposite strand for the first read and the same strand for the second read if paired-end reads.
By default, only features (e.g., genes) with one or more aligned read will be included in the output. If this option is selected, all features from the annotation model, including those without any matching aligned reads, will be included.
HTSeq will skip reads with an alignment quality lower than the value specified here. The default is 10.
This option determines how HTSeq handles reads that partially overlap features. The default is union.
If set to union, any read that is partially overlapped by a feature will be assigned to that feature. Assignment is non-exclusive if multiple feature overlap.
If set to intersection-strict, only reads that are fully overlapped by a feature will be assigned to that feature. Assignment is non-exclusive if multiple feature fully overlap.
If set to intersection-nonempty, reads are assigned to the feature that has the greatest overlap. Assignment is non-exclusive if multiple feature overlap the same amount.
This option determines how HTSeq counts reads that are assigned to more than one feature. The default is none.
If set to none, reads that are assigned to more than one feature are not counted for any feature.
If set to all, reads are counted for all features they are assigned to.
HTSeq outputs a Gene counts data node (Figure 3). There is no task report.
Simon Anders, Paul Theodor Pyl, Wolfgang Huber. HTSeq — A Python framework to work with high-throughput sequencing data. Bioinformatics (2014), in print, online at doi:10.1093/bioinformatics/btu638
If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.