Fusion Gene Detection


Last updated 7 months ago


A fusion gene is a hybrid gene that combines parts of two or more original genes. Fusion genes can form as a result of chromosomal rearrangements (such as translocation, interstitial deletion, or chromosomal inversion) or abnormal transcription, and have been shown to act as drivers of malignant transformation and/or progression in various neoplasms (1). The discovery and characterization of fusion genes have been greatly facilitated by the use of next-generation sequencing (NGS) (2), and several computational algorithms have been developed to detect them.

This chapter illustrates how to detect fusion genes using:

  • STAR Algorithm

  • TopHat-Fusion Algorithm

STAR Algorithm

General Overview

The STAR aligner also has the ability to detect fusion genes (referred to as “chimeric alignments”) (5,6). During the first phase of alignment, STAR searches for maximal mappable prefixes (seeds) of sequencing reads. In the second phase, all the seeds that align within user-defined genomic windows are stitched together. If an alignment within one genomic window does not cover the entire read sequence, STAR tries to find two or more windows that together cover the entire read. This essentially results in the detection of fusion events, with different parts of a read aligning to distal genomic locations, different chromosomes, or different strands.
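The three outcomes named above (distal locations, different chromosomes, different strands) can be illustrated with a toy classifier. This is only a sketch of the idea, not STAR's implementation: the `Segment` record and the `max_window` threshold are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One aligned piece of a read (hypothetical, simplified record)."""
    chrom: str
    start: int   # leftmost genomic position of the aligned piece
    end: int     # rightmost genomic position (inclusive)
    strand: str  # "+" or "-"


def is_chimeric(a: Segment, b: Segment, max_window: int = 1_000_000) -> bool:
    """Return True if two pieces of one read cannot share a genomic window.

    max_window is an illustrative threshold, not a STAR default.
    """
    if a.chrom != b.chrom:
        return True   # different chromosomes
    if a.strand != b.strand:
        return True   # different strands
    # Same chromosome and strand: chimeric only if the pieces are far apart.
    gap = max(a.start, b.start) - min(a.end, b.end)
    return gap > max_window


left = Segment("chr9", 100, 150, "+")
print(is_chimeric(left, Segment("chr22", 500, 550, "+")))             # different chromosomes
print(is_chimeric(left, Segment("chr9", 2_000, 2_050, "+")))          # nearby, same strand
print(is_chimeric(left, Segment("chr9", 5_000_000, 5_000_050, "+")))  # distal location
```

In the real aligner the decision is driven by alignment scoring and the chimeric options discussed below; the sketch only shows why a read split across windows is reported as a candidate fusion.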

STAR fusion detection is performed in two steps: chimeric alignment of reads with the STAR aligner, followed by fusion detection with STAR-Fusion. Performing fusion detection in two steps is equivalent to running the analysis in "Kickstart" mode, as described by the authors of STAR-Fusion. We recommend using STAR version 2.7.8a (see Task management to check which version you are running).
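For readers who want to see what the two steps look like outside Partek Flow, the sketch below assembles the two command lines as Python argument lists. It is an assumption-laden illustration, not the commands Partek Flow runs: all paths are hypothetical, and the three chimeric options shown are only a small subset of the settings suggested in the STAR-Fusion documentation, whose current values should be checked there.

```python
def build_two_step_commands(genome_dir, ctat_lib_dir, fastq_1, fastq_2, out_prefix):
    """Return (star_cmd, fusion_cmd) as argument lists; nothing is executed here."""
    # Step 1: STAR alignment with chimeric detection switched on.
    star_cmd = [
        "STAR",
        "--genomeDir", genome_dir,
        "--readFilesIn", fastq_1, fastq_2,
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--outFileNamePrefix", out_prefix,
        "--chimSegmentMin", "12",            # a non-zero value enables chimeric output
        "--chimJunctionOverhangMin", "8",
        "--chimOutJunctionFormat", "1",      # junction file format expected by STAR-Fusion
    ]
    # Step 2: STAR-Fusion in "Kickstart" mode, starting from the junction file.
    fusion_cmd = [
        "STAR-Fusion",
        "--genome_lib_dir", ctat_lib_dir,
        "-J", out_prefix + "Chimeric.out.junction",
        "--output_dir", out_prefix + "star_fusion",
    ]
    return star_cmd, fusion_cmd


# Hypothetical paths, for illustration only.
star_cmd, fusion_cmd = build_two_step_commands(
    "/refs/star_index", "/refs/ctat_genome_lib",
    "sample_R1.fastq", "sample_R2.fastq", "sample_",
)
print(" ".join(star_cmd))
print(" ".join(fusion_cmd))
```

Either list could be passed to `subprocess.run` on a machine where the tools are installed; building the lists separately keeps the alignment and fusion steps independent, mirroring the two task nodes in the pipeline.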

To save time, you can import the pre-built STAR-Fusion pipeline from our hosted pipeline page. This pipeline includes the two steps outlined below, with the advanced options for the STAR 2.7.8a alignment optimized for fusion detection according to the STAR-Fusion authors' recommendations. See Importing a Pipeline for more information.

Running STAR Chimeric Alignment within Partek Flow

When performing an alignment with STAR, chimeric alignment can be activated by selecting the Chimeric alignment checkbox in the Advanced options of the aligner (the Advanced options dialog is reached via the Configure link in the setup dialog). When the Chimeric alignment checkbox is selected, additional options specific to the fusion search algorithm are shown (Figure 1). For a discussion of the details of the options, see the STAR documentation.

The output is associated with the Chimeric junctions data node (Figure 2), which is part of the STAR results, in addition to the Aligned reads node and, optionally, the Unaligned reads node.

Running STAR-Fusion on Chimeric results

STAR-Fusion v1.10 is wrapped into Partek Flow. STAR-Fusion will process the chimeric output generated by the STAR aligner to map junction reads and spanning reads to a reference annotation set. To run fusion detection, select the Chimeric junctions data node and choose STAR-Fusion from the Variant analysis menu in the toolbox (Figure 5).

To change any of the advanced options, click the Configure link (Figure 7). To run the task, click Finish.

The resulting Fusion predictions data node (Figure 8) can be downloaded to your local machine by selecting the data node and clicking Download data in the toolbox. There will be one tab-separated (.tsv) file per sample. To view the full table, double-click the new data node to open the task report (Figure 9). Each row of the table is a fusion event and the columns contain information about each detected fusion.

  • FusionName: the name of the fusion event, given as LeftGene--RightGene. Multiple fusion events can be detected across the same pair of genes, so the FusionName of an event is not necessarily unique;

  • JunctionReadCount: indicates the number of RNA-Seq fragments containing a read that aligns as a split read at the site of the putative fusion junction;

  • SpanningFragCount: indicates the number of RNA-Seq fragments that encompass the fusion junction such that one read of the pair aligns to a different gene than the other paired-end read of that fragment;

  • est_J: estimated junction read counts corrected for multiple mappings;

  • est_S: estimated spanning fragment counts corrected for multiple mappings;

  • SpliceType: indicates whether the proposed breakpoint occurs at reference exon junctions as provided by the reference transcript structure annotations (Gencode);

  • LeftGene: name of the first (left) gene;

  • LeftBreakpoint: genome coordinates for the breakpoint in left gene;

  • RightGene: name of the second (right) gene;

  • RightBreakpoint: genome coordinates for the breakpoint in right gene;

  • JunctionReads: sequence identifiers for all junction reads;

  • SpanningFrags: sequence identifiers for all spanning fragments;

  • LargeAnchorSupport: indicates whether there are split reads that provide 'long' (set to 25bp) alignments on both sides of the putative breakpoint;

  • FFPM: fusion fragments per million reads;

  • LeftBreakDinuc: dinucleotide base pairs at the left breakpoint;

  • LeftBreakEntropy: the Shannon entropy of the 15 exonic bases flanking the left breakpoint;

  • RightBreakDinuc: dinucleotide base pairs at the right breakpoint;

  • RightBreakEntropy: the Shannon entropy of the 15 exonic bases flanking the right breakpoint;

  • annots: provides a simplified annotation for the fusion transcript.
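Once a per-sample .tsv file has been downloaded, the columns listed above can be used to shortlist fusion events. The following is a minimal sketch, assuming the file has a header row with the column names given above (possibly prefixed with `#`, as in STAR-Fusion's own output); the thresholds and the gene names in the example data are hypothetical.

```python
import csv
import io


def load_fusions(handle):
    """Read a tab-separated fusion predictions table into a list of dicts."""
    reader = csv.DictReader(handle, delimiter="\t")
    # Strip a leading '#' from header names such as '#FusionName'.
    return [{key.lstrip("#"): value for key, value in row.items()} for row in reader]


def filter_fusions(rows, min_junction=1, min_ffpm=0.1):
    """Keep events with enough junction reads and a high enough FFPM.

    Both thresholds are illustrative, not recommended values.
    """
    return [
        row for row in rows
        if int(row["JunctionReadCount"]) >= min_junction
        and float(row["FFPM"]) >= min_ffpm
    ]


# Made-up example data standing in for a downloaded file.
example = (
    "#FusionName\tJunctionReadCount\tSpanningFragCount\tFFPM\n"
    "GENEA--GENEB\t12\t5\t0.85\n"
    "GENEC--GENED\t0\t3\t0.05\n"
)
rows = load_fusions(io.StringIO(example))
confident = filter_fusions(rows)
print([row["FusionName"] for row in confident])  # ['GENEA--GENEB']
```

For a real file, replace the `io.StringIO` object with `open(path, newline="")`. Because FusionName is not necessarily unique, the rows are kept as a list rather than keyed by name.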

TopHat-Fusion Algorithm

General Overview

The reads are first aligned to the genome. The unaligned reads resulting from this initial alignment are split into multiple 25 bp sequences which are, in turn, aligned to the genome by Bowtie. The TopHat-Fusion algorithm identifies the cases where the first and the last 25 bp segments are aligned to either two different chromosomes or two locations on the same chromosome (spacing is defined by the user). The whole read is used to identify a fusion point. After the initial fusion candidates are defined, all the segments from the initially unaligned reads are realigned against the fusion points (as well as intron boundaries and indels). The resulting alignments are combined with the full read alignments.
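The segment-based test described above can be sketched in a few lines. This is a simplified illustration rather than TopHat-Fusion's code: the alignment itself is not performed, hits are given as hypothetical `(chromosome, position)` tuples, and `min_distance` stands in for the user-defined spacing with an arbitrary value.

```python
def split_into_segments(read, segment_length=25):
    """Split a read into consecutive segments of segment_length bases."""
    return [read[i:i + segment_length] for i in range(0, len(read), segment_length)]


def is_fusion_candidate(first_hit, last_hit, min_distance=100_000):
    """Decide whether the first and last segment hits suggest a fusion.

    Each hit is a (chromosome, position) tuple; min_distance is illustrative.
    """
    first_chrom, first_pos = first_hit
    last_chrom, last_pos = last_hit
    if first_chrom != last_chrom:
        return True                                     # two different chromosomes
    return abs(last_pos - first_pos) >= min_distance    # distant loci, same chromosome


read = "ACGTA" * 15                      # a 75-base read
segments = split_into_segments(read)
print(len(segments), [len(s) for s in segments])
print(is_fusion_candidate(("chr1", 1_000), ("chr5", 42_000)))
print(is_fusion_candidate(("chr1", 1_000), ("chr1", 1_050)))
```

Only the outer segments are compared here; in the algorithm described above, the whole read is then used to pinpoint the fusion point.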

The most up-to-date TopHat-Fusion version implemented in Partek® Flow® when this manual was written (2.1.0) focuses on fusions due to chromosomal rearrangements; fusions resulting from read-through transcription or trans-splicing are not supported. For details, as well as a discussion of TopHat-Fusion options, see the TopHat-Fusion home page (4).

Running TopHat-Fusion within Partek Flow

TopHat-Fusion is integrated in the TopHat 2 task and is invoked by using the Fusion search check box in the Alignment options dialog (Figure 10).

The output is generated as a new data node, Fusion results (Figure 11), as part of the TopHat 2 align reads task (in addition to the Aligned reads node and, optionally, the Unaligned reads node).

Clicking Download data downloads a *.fusion file to the local computer. The file is human-readable and can be opened in a text editor (example in Figure 13). For details, refer to the TopHat-Fusion documentation.

A list of annotated fusion genes, in the form of a Fusion report, can be obtained by first selecting the Fusion report task node and then the Task report link from the task menu. Since the task provides an annotated report, an annotation file needs to be specified first (Figure 14).

The resulting Fusion report task node (Figure 15) can be double-clicked to reveal the full table (Figure 16).

Each row of the table in Figure 16 is a potential fusion event, with the columns providing the following information.

  • Sample ID: sample in which the fusion event was identified

  • Chromosome 1: chromosome hosting the first (left) segment of the fusion transcript

  • Stop 1: end of the first (left) segment of the fusion transcript

  • Chromosome 2: chromosome hosting the second (right) part of the fusion transcript

  • Start 2: beginning of the second (right) segment of the fusion transcript

  • Gene1: gene on the left side of the fusion

  • Gene2: gene on the right side of the fusion

  • Spanning reads: number of reads which were unaligned during the initial phase of TopHat and where only one mate is used as evidence of the fusion event

  • Mate Pairs: number of reads which were unaligned during the initial phase of TopHat and where both mates are used as evidence of the fusion event

  • Spanning mate pairs: number of reads where both mates were aligned during the initial phase of TopHat, but their pairing is discordant (e.g. different chromosomes, different orientation etc.)

  • Contradicting reads: number of reads which do not support the fusion

  • Left bases: number of bases on the left side of the fusion

  • Right bases: number of bases on the right side of the fusion

All the columns can be sorted by using the arrow buttons in the column headers, while the type-in boxes can be used for searching. TopHat-Fusion does not report exact start and stop positions for each side of the fusion event. It reports a single location for the end of the upstream segment (Stop 1) and the beginning of the downstream segment (Start 2). Therefore, the columns Start 1 and Stop 2 are added for (internal) consistency with other Partek Flow tools.

The checkboxes Disrupted Genes and Gene/Gene fusions are filter tools. When selected, Disrupted Genes removes all the rows (fusion events) which have no genes assigned to them, i.e. those that merge two intergenic regions. However, a fusion between a gene and an intergenic region is kept in the table. The Gene/Gene fusions filter retains only those fusion events which have an annotated gene on both sides of the breakpoint. In other words, only gene-to-gene fusions are kept in the table.
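The difference between the two filters comes down to "at least one annotated gene" versus "an annotated gene on both sides". A minimal sketch of that logic, assuming each fusion event is a dictionary with `Gene1` and `Gene2` keys and that an intergenic side is stored as an empty string (both are assumptions about the data layout, not the product's internals):

```python
def disrupted_genes(events):
    """Drop events that join two intergenic regions (no gene on either side)."""
    return [event for event in events if event["Gene1"] or event["Gene2"]]


def gene_gene_fusions(events):
    """Keep only events with an annotated gene on both sides of the breakpoint."""
    return [event for event in events if event["Gene1"] and event["Gene2"]]


# Made-up events: gene/gene, gene/intergenic, intergenic/intergenic.
events = [
    {"Gene1": "GENEA", "Gene2": "GENEB"},
    {"Gene1": "GENEC", "Gene2": ""},
    {"Gene1": "", "Gene2": ""},
]
print(len(disrupted_genes(events)))    # 2
print(len(gene_gene_fusions(events)))  # 1
```

Every event that passes Gene/Gene fusions also passes Disrupted Genes, so the second filter is strictly the more stringent of the two.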

A new data node, Fusion attribute report, is generated in the Analyses tab (Figure 18), and it provides access to the Task report link in the task menu.

The output, a fusion report table, resembles the basic TopHat-Fusion output; each row of the table is a single fusion event, while the information on the merged segments is in the columns.

  • Chromosome 1: chromosome hosting the first (left) segment of the fusion transcript;

  • Start 1: beginning of the first (left) segment of the fusion transcript;

  • Stop 1: end of the first (left) segment of the fusion transcript;

  • Chromosome 2: chromosome hosting the second (right) segment of the fusion transcript;

  • Start 2: beginning of the second (right) segment of the fusion transcript;

  • Stop 2: end of the second (right) segment of the fusion transcript;

  • Gene1: gene on the left side of the fusion;

  • Gene2: gene on the right side of the fusion;

  • % in (category name): fraction of samples within the category with the fusion event.
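The "% in (category name)" column can be understood as a per-category proportion. The sketch below shows one way such a value could be computed for a single fusion event, expressed as a percentage; the sample names are made up, the category labels are borrowed from the example in Figure 19, and the function is an illustration rather than the report's actual implementation.

```python
from collections import defaultdict


def percent_with_fusion(sample_categories, samples_with_fusion):
    """Return {category: percent of its samples that carry the fusion event}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for sample, category in sample_categories.items():
        totals[category] += 1
        if sample in samples_with_fusion:
            hits[category] += 1
    return {category: 100.0 * hits[category] / totals[category] for category in totals}


# Hypothetical samples assigned to two categories of one attribute.
sample_categories = {"s1": "AI", "s2": "AI", "s3": "SCNT", "s4": "SCNT"}
samples_with_fusion = {"s1", "s3", "s4"}
print(percent_with_fusion(sample_categories, samples_with_fusion))
# {'AI': 50.0, 'SCNT': 100.0}
```

This is why the attribute must have at least two categories: with a single category there is nothing to compare the proportion against.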

As in the Fusion report, the checkboxes Disrupted Genes and Gene/Gene fusions are filter tools. When selected, Disrupted Genes removes all the rows (fusion events) which have no genes assigned to them, i.e. those that merge two intergenic regions. However, a fusion between a gene and an intergenic region is kept in the table. The Gene/Gene fusions filter retains only those fusion events which have an annotated gene on both sides of the breakpoint. In other words, only gene-to-gene fusions are kept in the table.

To obtain a .fusion file that summarizes the STAR chimeric reads across samples, double-click the Chimeric junctions data node to open the report, click the View output files link, select the chimeric_result.fusion file, and click the download icon (Figure 3). The file is human-readable and can be opened in a text editor (example in Figure 4). For details, refer to STAR's documentation.

When setting up the STAR-Fusion task, choose the STAR-Fusion annotation from the drop-down list. We provide automatic downloads of the plug-n-play libraries distributed by Trinity Cancer Transcriptome Analysis Toolkit (CTAT) for Human hg38 (Gencode v22 and v37) and hg19 (Gencode v19) assemblies (Figure 6). If you wish to add your own STAR-Fusion library, you can either import a pre-built CTAT library or gather the appropriate files and build it in Partek Flow. See here for more details on the files you need.

TopHat-Fusion is a version of TopHat with the ability to align reads across fusion points and detect fusion genes resulting from breakage and re-joining of two different chromosomes or from rearrangements within a chromosome (3). It is independent of gene annotation and can discover fusion products from known genes, unannotated splice variants of known genes, or completely unknown genes.

Selecting the Fusion results data node opens the task menu, with four options (Figure 12): Data summary report, Fusion report, Fusion attribute report, and Download data.

Another table which can be generated based on a Fusion results node is the Fusion attribute report. When the option is selected, it brings up the dialog shown in Figure 17. First, you need to specify one or more categorical attributes (Select attribute(s) to test), each of which has at least two categories (see Data tab). Second, you need to specify an annotation file, using the Assembly and Gene/feature annotation drop-down lists.

References

  1. Annala MJ, Parker BC, Zhang W, Nykter M. Fusion genes and their discovery using high throughput sequencing. Cancer Lett. 2013;340:192-200.

  2. Costa V, Aprile M, Esposito R, Ciccodicola A. RNA-Seq and human complex diseases: recent accomplishments and future perspectives. Eur J Hum Genet. 2013;21:134-142.

  3. Kim D, Salzberg SL. TopHat-Fusion: an algorithm for discovery of novel fusion transcripts. Genome Biol. 2011;12:R72.

  4. TopHat-Fusion. An algorithm for discovery of novel fusion transcripts. http://tophat.cbcb.umd.edu/fusion_index.html. Accessed on April 25, 2014.

  5. Dobin A, Davis CA, Schlesinger F et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15-21.

  6. Haas BJ, Dobin A, Li B et al. Accuracy assessment of fusion transcript detection via read-mapping and de novo fusion transcript assembly-based methods. Genome Biol. 2019;20:213.

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

Figure 1. Controls of the STAR fusion gene detection algorithm (aligner defaults are shown)
Figure 2. Chimeric results node as a result of STAR’s chimeric alignment algorithm
Figure 3. STAR task details page; click View output files to download the chimeric_result.fusion file
Figure 4. STAR's .fusion file opened in a text editor (example)
Figure 5. Choose STAR-Fusion from the menu
Figure 6. STAR-Fusion task set up
Figure 7. STAR-Fusion advanced options
Figure 8. Fusion predictions data node
Figure 9. STAR-Fusion fusion prediction table
Figure 10. Activating TopHat-Fusion algorithm for detection of fusion genes (bovine genome shown as an example)
Figure 11. Fusion results node as a result of TopHat-Fusion algorithm
Figure 12. TopHat-Fusion results section of the toolbox, invokable on TopHat-Fusion's results (data size is an example)
Figure 13. TopHat-Fusion's .fusion file opened in a text editor (example)
Figure 14. Selecting an annotation file to annotate TopHat-Fusion results (an example)
Figure 15. Fusion report task node as a result of annotating Fusion results generated by TopHat-Fusion algorithm
Figure 16. Fusion report of TopHat-Fusion fusion gene detection algorithm. Each row represents a fusion gene candidate (an example is shown) (table truncated)
Figure 17. Selecting attributes to be tested for association with fusion events (the attribute Conception and the annotation files are an example)
Figure 18. Fusion attribute report node as a result of annotating Fusion results generated by TopHat-Fusion algorithm
Figure 19. Fusion attribute report of TopHat-Fusion fusion gene detection algorithm. Each row represents a fusion gene candidate (the example shows comparison of number of fusion events detected in the AI group vs. the SCNT group)