Annotate Regions

ATAC-seq and 5-base DNA methylation analyses generate genomic regions. To better understand the roles of these regions in regulating gene expression, use the Annotate regions task to add information about overlapping or nearby genomic features to each region.

Running Annotate regions

The input for the Annotate regions task is a regions data node.

  • Click to select a regions data node.

  • Select Region analysis from the context-sensitive menu, then select Annotate regions.

  • Select an Assembly and Annotation model.

  • Select an option for Genomics overlaps:

    • Report one gene region per genomic feature (precedence applies): chooses one gene section for each region using the precedence order when more than one gene section overlaps a region. The order of precedence is TSS, TTS, CDS Exon, 5' UTR Exon, 3' UTR Exon, Intron, Intergenic.

    • Report all gene regions per features: creates a row in the task report for each gene section that overlaps a region.

  • Define the transcription start site (TSS) and transcription termination site (TTS) limits in bp (the default is 1000 bp for each).

  • Click Finish to run.
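The difference between the two Genomics overlaps options can be illustrated with a minimal Python sketch. This is not the software's implementation: the names `pick_gene_section` and `report_rows` are hypothetical, and only the precedence order itself is taken from the option description above.

```python
# Precedence order as listed in the option description (highest priority first).
PRECEDENCE = ["TSS", "TTS", "CDS Exon", "5' UTR Exon", "3' UTR Exon", "Intron", "Intergenic"]
RANK = {name: i for i, name in enumerate(PRECEDENCE)}


def pick_gene_section(overlapping):
    """Return the highest-precedence gene section among those overlapping a region."""
    return min(overlapping, key=lambda section: RANK[section])


def report_rows(region_id, overlapping, one_per_region=True):
    """Build report rows for one region (hypothetical helper).

    one_per_region=True  mimics "Report one gene region per genomic feature".
    one_per_region=False mimics "Report all gene regions per features".
    """
    if one_per_region:
        return [(region_id, pick_gene_section(overlapping))]
    return [(region_id, section) for section in overlapping]


# A region that overlaps an intron, a coding exon, and a TTS window.
rows_one = report_rows("region_1", ["Intron", "CDS Exon", "TTS"])
rows_all = report_rows("region_1", ["Intron", "CDS Exon", "TTS"], one_per_region=False)
```

In this example the first option would keep only the TTS row, because TTS outranks CDS Exon and Intron, while the second option would keep all three rows.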

Annotate regions output

When the task completes, an Annotate regions task node and an Annotated regions data node are generated. Double-click the Annotated regions data node to open the task report. The report consists of a pie chart and a table. The pie chart shows the breakdown of gene sections among the regions. The table shows the annotation for each region. If the task was run with the option to report all gene sections per region, each region has one row for each gene section it overlaps. If the task was run with the option to report one gene section per region, each region has a single row showing the overlapping gene section chosen by the order of precedence. The table can be sorted by any of its columns.

Click Optional columns in the upper-left corner of the table to add more information about each region.
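The pie chart breakdown can be thought of as a tally of gene sections over the rows of the report table. The sketch below assumes the table has been exported as (region, gene section) pairs; the function name `gene_section_breakdown` and the example rows are hypothetical and not part of the software.

```python
from collections import Counter


def gene_section_breakdown(rows):
    """Return the fraction of rows assigned to each gene section."""
    counts = Counter(section for _region, section in rows)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {section: n / total for section, n in counts.items()}


# Hypothetical rows from a run that reports one gene section per region.
rows = [
    ("region_1", "TSS"),
    ("region_2", "Intron"),
    ("region_3", "Intron"),
    ("region_4", "Intergenic"),
]
breakdown = gene_section_breakdown(rows)
```

Note that when all gene sections are reported per region, a region contributes one row per overlapping section, so a tally over rows no longer equals a tally over regions.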

Gene sections

TSS

The region falls within -1000 bp to +1000 bp (default setting) of the transcription start site (TSS) of a transcript.

TTS

The region falls within -1000 bp to +1000 bp (default setting) of the transcription termination site (TTS) of a transcript.

CDS Exon

The region overlaps a coding sequence (CDS) exon in a transcript.

5' UTR Exon

The region overlaps an exon in the 5' untranslated region (UTR) of a transcript.

3' UTR Exon

The region overlaps an exon in the 3' untranslated region (UTR) of a transcript.

Intron

The region overlaps an intron in a transcript.

Intergenic

The region is not located within 1000 bp of a transcript.
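The TSS and TTS definitions amount to an interval test against a window around the site. The following is a minimal sketch under stated assumptions: coordinates are inclusive, the window is symmetric around the site, and any overlap of at least one base counts. The documentation does not confirm these details, and the function names `site_window` and `overlaps_site_window` are hypothetical.

```python
def site_window(site, limit=1000):
    """Return the (start, end) window around a TSS or TTS position.

    limit corresponds to the TSS/TTS limit in bp (1000 by default).
    """
    return site - limit, site + limit


def overlaps_site_window(region_start, region_end, site, limit=1000):
    """Check whether a region overlaps the window around a site.

    Assumes inclusive coordinates and that a one-base overlap is enough.
    """
    window_start, window_end = site_window(site, limit)
    return region_start <= window_end and region_end >= window_start


# Hypothetical TSS at position 6000 with the default 1000 bp limit.
near_tss = overlaps_site_window(5000, 5200, site=6000)
far_from_tss = overlaps_site_window(7001, 7500, site=6000)
```

Raising the limit widens the window, so the second region in the example would be counted as overlapping the TSS window if the limit were set to 2000 bp.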
