The Chromosome view in Partek Flow is a visualization tool for next-generation sequencing (NGS) and microarray data. The viewer can display different types of information, including aligned reads, genomic databases (e.g. genes, transcripts, or variants), isoform proportions, and reference sequence.
This chapter will illustrate how to:
If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.
Partek Flow plots genomic information on a canvas that is organized into horizontal sections called tracks. The exact number, type, and presentation of tracks depend on several factors, such as the underlying pipeline, available annotation, and the level of zoom. Tracks are added, removed, or customized via the Select tracks dialog (Figure 1).
The content of the Select tracks dialog depends on the data nodes present on the Analysis tab of the current project (an example is shown in Figure 2). The current pipeline is depicted in the center of the window, and data nodes that can be visualised are highlighted by the colour of their layer. Tracks can be turned on or off by selecting the check boxes in the list of possible tracks (and data nodes) on the right. To uncheck all, use the Clear selection button.
For ease of use, the pipeline and the list of tracks are linked: hovering over an entry in the track list highlights the matching data node in the pipeline, and selecting a data node in the pipeline panel highlights the respective entry in the track list (Figure 3). Once you have decided which tracks should be plotted, push Display tracks to depict them on the canvas.
You can browse through the results by using the tools in the navigation bar at the top of the view (Figure 1). The Select tracks tool is the topic of a separate section, while the remaining tools are described below.
You can use the Search box to zoom to genomic features that are available in the annotation track. Start typing a search term and Partek Flow will show you the first 10 suggestions (Figure 2). To select one, use the arrow keys or mouse, or type the full feature name and hit enter.
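The suggestion behaviour can be pictured as a lookup that is capped at ten hits. The following Python snippet is only an illustrative sketch, not Partek Flow's implementation: the function name `suggest`, the case-insensitive prefix matching, the alphabetical ordering, and the sample gene symbols are all assumptions.

```python
def suggest(term, feature_names, limit=10):
    """Return up to `limit` feature names that start with `term` (case-insensitive)."""
    term = term.strip().lower()
    if not term:
        return []
    # Assumption: suggestions are prefix matches, listed alphabetically.
    matches = [name for name in sorted(feature_names) if name.lower().startswith(term)]
    return matches[:limit]


# Hypothetical annotation content, used only for illustration.
features = ["BRCA1", "BRCA2", "BRAF", "TP53", "TP63", "TP73"]
print(suggest("br", features))  # ['BRAF', 'BRCA1', 'BRCA2']
```

Capping the list keeps the drop-down short even when a short search term matches thousands of features.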
Next, the mode selector (Figure 3) helps you to quickly navigate through the results.
When pointer mode is activated, the appearance of the cursor will change to an arrow (Figure 4). Pointer mode provides details on any item (e.g. short sequencing read, variant, microarray probe, annotation feature) selected on the canvas. The selected item is highlighted by a green box (Figure 4).
When zoom mode is activated, the appearance of the cursor will change to a plus sign ( + ). With the zoom mode on, you can magnify a specific region by positioning the cursor ( + ) to the left of the area of interest and then <left-click> & drag the mouse to the right of the area of interest (Figure 5). When the viewer refreshes, it will come "closer" to the region that was selected (by halving the number of bases displayed on the screen).
Alternatively, <left-click> on the canvas and Partek Flow will zoom in one level, halving the number of bases visible on the screen. To zoom out one level, use Ctrl & <left-click>; as a result, the number of visible bases will be roughly doubled.
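The effect of one zoom level on the visible window can be sketched with a little arithmetic. This is a hedged illustration only: the function `zoom`, the 1-based inclusive coordinates, and the choice to keep the window centred on its midpoint are assumptions, not a description of how the viewer computes its window.

```python
def zoom(start, end, zoom_in=True):
    """Return a new (start, end) window after one zoom level.

    Zooming in halves the number of visible bases; zooming out doubles it.
    Coordinates are 1-based and inclusive; the window never starts below 1.
    """
    width = end - start + 1
    new_width = max(1, width // 2) if zoom_in else width * 2
    centre = (start + end) // 2  # assumption: keep the same midpoint
    new_start = max(1, centre - new_width // 2)
    return new_start, new_start + new_width - 1


window = (1, 100000)
closer = zoom(*window)                  # 50,000 bases visible
wider = zoom(*closer, zoom_in=False)    # back to 100,000 bases
print(closer, wider)
```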
When panning mode is activated, the appearance of the cursor will change to four arrows (Figure 6). <Left-click> and drag the canvas to the left or to the right to move upstream or downstream in the genome, respectively.
Zooming out and in can also be achieved with the zoom tool (Figure 7) by moving the golden slider left or right, respectively, or by selecting the magnifying glass icons (– and +).
The Chromosome view can be invoked from some data nodes on the Analysis tab, giving a global overview of the results; or from certain Task report or result pages, providing a focused view, i.e. pointing to a specific feature of interest.
On the Analysis tab, selecting a data node that contains aligned reads, variants, or feature lists shows Chromosome view in the Exploratory analysis section of the context-sensitive menu (Figure 1).
If Partek Flow has no information on the genome build, you will need to provide the species and genome build in a subsequent dialog (not shown). Otherwise, the Chromosome view will come up directly.
A new Chromosome view task node will be added to the canvas (Figure 2). To invoke the viewer, <double-click> on the node (you can also select it and then go to Task report in the menu). When invoked in this way, the Chromosome view shows the first 100,000 bases of the first chromosome by default.
Depending on the task report used to invoke the Chromosome view, some tracks may be pre-selected and customized. For example, when invoked from a variant table, the reads histogram track will be colored by bases (rather than the default of coloring by sample).
The Data tracks section of the Select tracks dialog enables you to specify the tracks for visualization on the canvas. An overview of the available track types is provided in Figure 1. Note that not all tracks are visible at all times; their presence depends on whether specific data types are present in the project as well as on the zoom level. The tracks can be customized and their appearance changed by using the control panel on the left. Each track also has its own configuration settings, which allow you to move, pin, restyle, or hide the track using the settings options on the left of each track.
Alignments track
Isoform proportion track
Variants track
Amino acids track
Reads pileup track
Probe intensities track
Peaks track
Figure 1. Data tracks in Chromosome view (examples)
The Alignments track displays the alignments present in .bam files as a stacked histogram (similar to Partek Genomics Suite). The y-axis shows the number of (raw) base calls per position. By default, reads are coloured by sample. Variations on the track are displayed below. The track can be configured in the following ways:
The histogram can be colored by sample, by attribute group, or by base call. By default, it is colored by sample, except when the Chromosome view is invoked from a variant data table.
When colouring reads by sample, the reads are stacked on top of each other, i.e. in the example below there are more reads in the red sample than in the blue sample. This is an example of the Sum histogram type. The track can also be configured to display overlays or averages.
Reads can be split into two tracks corresponding to the strand that they map to. This can be invoked by clicking the Split read histogram by strand checkbox.
The y-axis can be scaled so that all samples share the same maximum, or so that each sample has its own maximum.
For more information on configuring tracks see our page on Customizing the chromosome viewer
Reads coloured by sample
Reads coloured by base calls
Reads split by strand
Figure 2. Various configurations of the Alignments track
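The quantity plotted on the y-axis of the Alignments track, the number of base calls per position, can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: reads are reduced to hypothetical (start, end) pairs with 1-based inclusive coordinates and no gaps, and the function name `coverage` is invented for this example. Reading real .bam files requires a dedicated library and is not shown.

```python
def coverage(reads, region_start, region_end):
    """Count how many reads cover each position of a region.

    `reads` is an iterable of (start, end) pairs, 1-based and inclusive.
    Returns one count per position from region_start to region_end.
    """
    depth = [0] * (region_end - region_start + 1)
    for start, end in reads:
        lo = max(start, region_start)   # clip the read to the region
        hi = min(end, region_end)
        for pos in range(lo, hi + 1):
            depth[pos - region_start] += 1
    return depth


reads = [(1, 4), (3, 6), (5, 8)]
print(coverage(reads, 1, 8))  # [1, 1, 2, 2, 2, 2, 1, 1]
```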
The Isoform proportion track displays the reads mapped to transcripts and helps to visualize differential expression and alternative splicing. This track is only available on the feature list data node generated by differential expression analysis. It uses standard symbols for exons (boxes) and introns (lines connecting the boxes). The height and color of each transcript are proportional to its LS mean value. Figure 3 shows a gene with two transcripts in the RefSeq database; the top transcript is more abundant than the bottom transcript and is preferentially expressed in the "blue" condition (labeled as 0 uM). The bottom transcript, on the other hand, seems to be expressed at the same level across all three conditions (i.e. 0 uM, 5 uM, 10 uM). The number and structure of transcripts on the plot depend on the transcript model that was used for mapping.
For more information on configuring tracks see our page on Customizing the chromosome viewer
Variant tracks show single nucleotide variants (SNVs) and indels, and appear in the Select tracks dialog if the Detect variants task has been performed. The presentation of variants depends on the level of zoom. At low power magnification, SNVs are seen as purple columns and indels as bars (insertions: green bars; deletions: red bars) (Figure 4).
Figure 4. Variants track at low power magnification: SNVs are symbolized by purple columns and an insertion is presented as a green bar (an example is shown). A deletion is presented as a red bar (none is visible on the figure)
Upon zoom-in, SNVs are drawn as pie charts, representing the proportion of each base call at that locus (Figure 5).
Figure 5. Variants track at high power magnification: each SNV is presented as a pie chart and each slice symbolises the relative frequency of each base call (an example is shown). Base call colour codes are given by the track name
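The slices of such a pie chart correspond to the relative frequency of each base call at one locus. The sketch below shows that calculation in Python; the function name `base_proportions` and the representation of base calls as a plain string are assumptions made for illustration.

```python
from collections import Counter


def base_proportions(base_calls):
    """Return the relative frequency of each base call observed at one locus."""
    counts = Counter(base_calls)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {base: n / total for base, n in counts.items()}


# Four reads cover the locus: three call A, one calls G.
print(base_proportions("AAAG"))  # {'A': 0.75, 'G': 0.25}
```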
At higher magnification, insertions are seen as green boxes, with individual inserted bases presented using a pie chart, while deletions are seen as red boxes, with the affected bases also presented as a pie chart (Figure 6).
Figure 6. Variants track at high power magnification: insertion is presented as a green box, deletion is presented as a red box. An example is shown.
The Amino acids track becomes available in the Select tracks dialog after completing the Annotate variants task. The actual appearance of the track depends on the zoom level. At low power magnification, you will see the message: View not available at this zoom level, Please zoom in to view amino acids.
When you zoom closer to the genome, all the amino acids become visible as colored boxes (Figure 7), labeled using the single-letter amino acid code. Alternative amino acids are depicted as an additional box on top of the consensus sequence.
Figure 7. Amino acids track at high power magnification: consensus amino acid sequence is at the bottom of the track, while a variant is shown on the top (change from Threonine to Proline is shown)
If an amino acid spans two exons, its box will be truncated and the line connecting the exons will be dashed. An example is in Figure 8.
Figure 8. Amino acids track: exon-spanning amino acids indicated by truncated boxes (i.e. Alanine on the left) (an example is shown)
An empty gray box on the top of consensus sequence is used to indicate a STOP codon, which is a consequence of a mutation (Figure 9).
Figure 9. Amino acids track: A variant which is in fact a STOP codon is represented by an empty box, as seen on the top of the G (an example is shown)
Untranslated bases, such as the ones downstream of a STOP codon, are depicted by lighter shades. Figure 10 shows two transcripts in an amino acid track; the direction is from left to right, so amino acids downstream of a STOP codon (P > G > L) are lightly shaded.
Figure 10. Amino acids track: amino acids downstream of a STOP codon are depicted by lighter shades. STOP codon is represented by "." in the middle, direction is from right to left (an example is shown)
Reads pileup track plots the short sequencing reads, as present in the .bam file. The track is not on by default (go to Select tracks to turn it on) and its appearance depends on the magnification; if you are zoomed out a message - Zoom in to view individual reads - will be displayed.
Forward strand reads are in sky blue, while reverse strand reads are in parakeet green. If paired-end chemistry was used, the paired reads will be depicted as half reads within a gray rectangle encompassing the pair (Figure 11). Singletons will be depicted as thicker reads.
Figure 11. Reads pileup track: short sequencing reads are represented as bars. Paired-end reads are located within a gray box encompassing both pairs. Singletons, such as that on the top right, are depicted as thicker reads (an example is shown)
If you used a junction-aware aligner (such as TopHat or STAR), the junction reads will be depicted using dashed lines, which connect exon-spanning parts of the same read (Figure 12).
Figure 12. Reads pileup track: junction reads are depicted using dashed lines. A RefSeq track is added at the top, to visualise the exons (an example is shown)
Deleted bases can also be seen on a Reads pileup track, as fat black lines (Figure 13).
Figure 13. Reads pileup track: deleted bases depicted using fat black lines (an example is shown)
Microarray probes are visualised by the Probe intensities track. The probes are shown as bars and their colour depends on the probe intensity, ranging from white (low) to admiral blue (high) (Figure 14).
Figure 14. Probe intensities track: probes are depicted as bars and their colour reflects the intensity (an example is shown)
As with the Reads pileup track, probes may not be visible with low power magnification and you will see a message - Zoom in to view individual microarray probes.
The Peaks track displays the results of peak caller tasks. It displays a bar that spans the genomic location of each peak call. If summits are identified by the peak caller (for example, by the MACS2 algorithm), the genomic location of each summit is marked by a vertical line. The color marks either the pair being compared by the peak caller or, if present, the sample attribute associated with the sample.
Figure 15. Peaks track shown for two different pairs
By default, the Chromosome view shows a cytoband track at the top of the canvas. If a cytoband file for your genome has not been added to Partek Flow, a warning will appear (Figure 1). In that case, go to the Library File Management page and download or create a cytoband file.
Figure 1. Warning message indicating that Chromosome view can not be launched because of missing cytoband file
The red box (Figure 2) indicates the part of the chromosome that is currently depicted on the canvas.
Figure 2. Cytoband track: highlighted part is currently depicted on the canvas (an example is shown)
Reference Genome
The sequence of the reference genome is added to the Chromosome view by default, as long as it has been added to the respective genome on the Library File Management page. However, its presence (or absence) in the viewer depends on the current magnification. At low power, the track is hidden and you will see the message - Track hidden (zoom to view). At high power, on the other hand, the Reference genome track becomes visible (Figure 3) and is supplemented by the genomic coordinates (below the sequence). A vertical guide helps you to align the bases between Aligned reads and Reference genome tracks. Depending on the reference genome file, some bases may be shown in lowercase letters, symbolizing repetitive sequences, or other sequences masked by a tool such as RepeatMasker.
Figure 3. Reference genome track. Numbers beneath the sequence are coordinates
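Lowercase (soft-masked) stretches can also be located programmatically in a reference sequence. The following is a minimal sketch, assuming the sequence is available as a plain Python string; the function name `soft_masked_intervals` is hypothetical.

```python
import re


def soft_masked_intervals(sequence):
    """Return 0-based, half-open (start, end) intervals of lowercase bases."""
    return [match.span() for match in re.finditer(r"[acgtn]+", sequence)]


# Uppercase bases are unmasked; lowercase runs mark masked sequence.
print(soft_masked_intervals("ACGTacgtnnACGTaa"))  # [(4, 10), (14, 16)]
```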
Variant Database
If a variant database file (such as dbSNP) for your genome is present on the Library File Management page, you will be able to include a variant annotation track in your visualization (to add a variant database to the viewer, use the control panel on the right).
The variants will be shown adjacent to the Reference genome track (Figure 3). If the database contains no frequency information on alternative alleles, the alleles will be drawn as bars (an example is the SNP on the left in Figure 4). If the frequency information is available, the relative frequency of each variant will be represented by a column (the SNP on the right in Figure 4).
Figure 4. Reference genome track with added variant annotation: single nucleotide variants present in the chosen database are depicted as bars (if no frequency information is available) or columns (columns reflect relative frequency of each alternative allele as stored in the database)
Note that the frequency information for each allele will be parsed out from the chosen database. That information can be retrieved by selecting a variant using the selection mode and will be shown in the Selection details section of the control panel. Using the example shown in Figure 4, the details of the left database variant can be seen in Figure 5. The most frequent allele at that locus is G (hence, the yellow column is plotted above the Reference genome track), which matches the base call of the reference genome.
Figure 5. Selection details section of the control panel showing details of a SNV, as present in the selected database
If your variant database stores indels, they will be depicted using green (insertion) or red (deletion) symbols (Figure 6) pointing to deleted bases.
Figure 6. Reference genome track with added variant annotation: insertions are shown in green, deletions in red. In this example, an insertion of a single base has been described in the database, between G and T. An adjacent deletion of T and C bases has also been seen before
Additional annotation tracks can be added to the viewer with the help of the Select tracks dialog as long as they have been associated with the genome you are working on in the Library File Management:
A common choice of an additional track is a transcript database, such as RefSeq (Figure 7). All the database entries are displayed, using a common depiction of exons as boxes and introns as lines connecting them. Untranslated regions (UTRs) are seen as narrow boxes. The arrows indicate directionality.
Figure 7. Transcript database track: a gene with two transcripts is shown as an example. Exons are plotted as boxes and introns as lines connecting them. Untranslated regions (UTRs) are seen as narrow boxes. The arrows indicate directionality
Partek Flow has a List generator function that allows you to investigate unique and common features between different lists within a project. These can be visualized in a Venn diagram consisting of up to five sets.
Feature lists can be generated by RNA-Seq tasks such as GSA and ANOVA. Filter your feature lists as needed. In the example below, three feature lists were generated by filtering the same multivariate GSA result on three different contrasts. Each filtered list appears as its own layer on the Analyses tab.
Figure 1. Project with multiple gene lists
Note that you must be a collaborator in the project to see the purple List Generator button.
Figure 2. List selector window
If the current project does not contain any feature lists, a message will appear.
Figure 3. Error message that appears when there are no feature lists within a project
The List name field can be customized to provide specific information about the selected list and will also serve as the set's name on the Venn diagram. Hovering your mouse over a specific feature list highlights its node location on the task graph.
Figure 4. Selecting feature list highlights data node
Conversely, clicking a feature list data node highlights the feature list on the right panel.
Figure 5. Selecting the data node
Figure 6. The Select identifier dialog box
For your convenience, example entries for each possible identifier are displayed. Select the radio button of the identifier you would like to use for comparison. Note that the List generator uses only identifiers to determine common features between lists. For feature lists generated by automated tasks such as Cufflinks' novel transcript discovery function, keep in mind that identical identifiers may not refer to the same feature.
Once you have selected your lists, click the Display selection button to invoke the Venn diagram.
Figure 7. Venn diagram window
To save the currently displayed diagram, click the Save image button. Images can be saved in either SVG or PNG format, and you can define the size and resolution of the image. To maintain the fidelity of the colors in the Venn diagram, we recommend downloading the image in PNG format. If a vector-based file format is required, you can export the diagram in SVG format.
The Venn diagram is also interactive. Hovering over a specific section of the diagram highlights the feature lists represented by that section. A tooltip also appears, giving the number of elements in that section.
Figure 8. Highlighted Venn diagram
To download a list of identifiers belonging to a section, click on the section (a striped pattern will appear) and click the Download list button. Multiple sections can be selected to download a merged list of all corresponding identifiers. The Total features selected display updates the number of features selected as you select more sections.
Figure 9. Downloading list of sections selected within a Venn diagram
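The sections of a Venn diagram and the merged list obtained by selecting several of them can be reproduced with ordinary set operations. The sketch below is illustrative only: the function `venn_sections`, the list names, and the identifiers are hypothetical, and identifiers are compared as exact strings.

```python
def venn_sections(named_lists):
    """Map each combination of list names to the identifiers found in exactly those lists."""
    sets = {name: set(ids) for name, ids in named_lists.items()}
    sections = {}
    for identifier in set().union(*sets.values()):
        members = frozenset(name for name, ids in sets.items() if identifier in ids)
        sections.setdefault(members, set()).add(identifier)
    return sections


# Hypothetical feature lists, e.g. three filtered contrasts of one result.
lists = {
    "A": ["g1", "g2", "g3"],
    "B": ["g2", "g3", "g4"],
    "C": ["g3", "g5"],
}
sections = venn_sections(lists)

# Size of each section, as a tooltip would report it.
for members, ids in sections.items():
    print(sorted(members), len(ids))

# Selecting two sections yields one merged list of identifiers.
merged = sections[frozenset({"A"})] | sections[frozenset({"A", "B"})]
print(sorted(merged))  # ['g1', 'g2']
```

Because each identifier is assigned to exactly one section, the section sizes add up to the number of distinct identifiers across all lists.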
To change the lists displayed, click the Select lists button. This will bring up the List selector again.
A Sankey diagram is a variation of a flow diagram, in which the width of the connectors (links) is proportional to the quantity of flow between two connected items (nodes).
In Partek Flow, Sankey plots are used to visualize different classes of nucleotide variants (single nucleotide variants or SNVs, as well as indels) and to filter variants based on those classes. A Sankey plot is automatically created for each Annotated Variants data node and depicts the classification of variants for the selected sample (one at a time) with respect to the particular annotation model (Figure 1).
Figure 1. Sankey plot showing different classes of variants in the current sample (upper left). The nodes of the plot depend on the annotation database. The width of the links reflects the number of variants. Numbers in parentheses correspond to the number of variants in each class. Clicking on a node performs filtering
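The link widths of a Sankey plot come down to counting how many variants share each pair of classes in neighbouring annotation columns. The Python sketch below illustrates that counting step only; the function `sankey_links`, the dictionary keys "type" and "region", and the example variants are hypothetical and do not reflect the actual annotation fields.

```python
from collections import Counter


def sankey_links(variants, annotations):
    """Count variants flowing between consecutive annotation columns.

    `annotations` lists the columns from left to right; each key of the
    returned Counter is a (left class, right class) pair.
    """
    links = Counter()
    for variant in variants:
        for left, right in zip(annotations, annotations[1:]):
            links[(variant[left], variant[right])] += 1
    return links


# Hypothetical annotated variants for one sample.
variants = [
    {"type": "SNV", "region": "exon"},
    {"type": "SNV", "region": "intron"},
    {"type": "indel", "region": "exon"},
    {"type": "SNV", "region": "exon"},
]
links = sankey_links(variants, ["type", "region"])
print(links[("SNV", "exon")])  # 2
```

Passing the annotation columns in a different order changes which class appears on the left of each link but not the counts themselves.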
Plot controls are on the left. The first one is the Variant file drop-down list, which is used to display another sample within the same project. The Selected annotations panel shows the annotations that are visible on the plot (as data nodes); to remove one from the plot, select the red minus icon on the right. To add an annotation use the - Add annotation - drop-down list. The sequence of the annotations / nodes from left to right can be changed using the Selected annotation panel, by dragging and dropping the annotation classes. Figure 2 shows the same plot as in Figure 1, after moving the Region class to the left.
Figure 2. The same Sankey plot as in Figure 1, but with a different order of the nodes
The plot is interactive and acts as a filter tool. Selecting one of the variant classes on the plot (left-click; Control & left-click to select multiple) and then pushing the Apply selection as filter button turns on the filter and applies it to the table of variants below the plot. To turn off the filter, click outside the plot. The filter shown in Figure 3 keeps only SNVs in the variant table (i.e. filters out the indels).
Figure 3. Sankey plot used as a filter tool. Selecting one (or more) variant classes on the plot applies the filter on the variant table. Selected class is highlighted in green. To apply the filter, click on the Apply selection as filter button
Controls
Chromosome view can be customized by using the control panel on the left (Figure 1). The Attribute and Order By controls show options depending on the current project, while the content of the Annotate amino acids control depends on the annotation files associated with the current genome build in the Library File Management. In order for any change to take place, push the Apply button.
Figure 1. Control panel (an example is shown)
The first option, Group data by, specifies the number of Alignments tracks (Figure 2). All will result in only one track, with all the samples on it. Sample creates one track per sample, while Attribute produces one Alignments track per level of the Attribute (i.e. one track per group).
All
Sample
Attribute
Figure 2. Group data by: All creates one Alignments track for the entire project, Sample creates one Alignments track for each sample, Attribute creates one Alignments track for each group (an example is shown)
Annotate amino acids by controls the appearance of the Amino acids track and allows you to pick the transcript database that will be used to plot codons (Figure 3). The drop down list shows the databases currently available for the selected genome (additional databases can be added via Library File Management).
Figure 3. Annotate amino acids by: transcript models currently associated with the chosen genome are displayed in the drop-down list and can be used to plot Amino acids track (an example is shown)
Color by option affects the colouring of the Alignments track and Isoform proportion track. When Sample is selected from the drop-down list, individual samples will be shown on the aforementioned tracks, each sample being given a different colour. If attributes were assigned to samples, they will also be visible in the Color by drop-down (Figure 4) and you will be able to highlight levels of the selected attribute (Figure 5).
Figure 4. Color by: the options control colouring of Alignments and Isoform proportion tracks. Sample, Base, and Match options are present by default. If attributes have been assigned to samples, they will appear in the drop-down list. In this example, that is the "Tissue" attribute
Color by Sample
Color by
Figure 5. Difference between Color by Sample and Color by . Color by Sample uses different colours to depict individual samples; Color by uses different colours to depict levels of the selected sample attribute (as present in the Data tab). Alignments and Isoform proportion tracks are shown (an example)
The effect of the option to Color by Base can be seen with high power magnification (Figure 6). Individual base calls are highlighted by different colours. When that option is chosen at low power magnification, all the bases are shown in grey.
Figure 6. Color by Base highlights the base calls by colours. Different colours are visible with high power magnification; otherwise all the bases are shown in gray (an example)
Finally, Color by Match can be used to quickly identify mismatches against the reference genome. Matching bases are coloured in blue, while mismatched bases are shown in yellow.
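The match or mismatch decision is a position-by-position comparison of a read with the reference. A minimal sketch follows, assuming an ungapped alignment in which the read and the reference segment have equal length, and assuming a case-insensitive comparison so that lowercase reference bases still match; the function name `match_mask` is hypothetical.

```python
def match_mask(read_bases, reference_bases):
    """Return True for each aligned position where the read agrees with the reference."""
    if len(read_bases) != len(reference_bases):
        raise ValueError("read and reference segments must have the same length")
    return [r.upper() == g.upper() for r, g in zip(read_bases, reference_bases)]


# The third base of the read differs from the reference.
print(match_mask("ACGT", "ACCT"))  # [True, True, False, True]
```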
The maximum of the y-axis of Alignments tracks is set by the Read histogram Y axis scales option (Figure 7). When using Project max, all tracks share the same y-axis maximum, which is the maximum across all the samples. Track max, on the other hand, sets the y-axis of each track individually, based on the maximum within that track.
Project max
Track max
Figure 7. Read histogram Y axis scales. When set to Linked, all the tracks have the same Y axis maximum, which depends on the sample with the highest coverage. Using Independent sets Y axis maximum independently for each sample.
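The two scaling behaviours, one shared maximum versus one maximum per track, can be sketched as follows. The function `y_axis_maxima`, its `shared` flag, and the example coverage values are assumptions made for illustration and are not the names used in the control panel.

```python
def y_axis_maxima(track_coverage, shared=True):
    """Return the y-axis maximum to use for each track.

    `track_coverage` maps a track name to its per-position coverage values.
    With shared=True every track gets the overall maximum; otherwise each
    track gets its own maximum.
    """
    per_track = {name: max(depth, default=0) for name, depth in track_coverage.items()}
    if shared:
        overall = max(per_track.values(), default=0)
        return {name: overall for name in per_track}
    return per_track


tracks = {"sample 1": [1, 5, 3], "sample 2": [10, 2, 4]}
print(y_axis_maxima(tracks, shared=True))   # {'sample 1': 10, 'sample 2': 10}
print(y_axis_maxima(tracks, shared=False))  # {'sample 1': 5, 'sample 2': 10}
```

A shared maximum makes coverage directly comparable between tracks, while independent maxima make the shape of a low-coverage track easier to see.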
Read histogram type changes the presentation of the Alignments track and should be used in conjunction with the Group data by and Color by options to get the desired visualisation.
When set to Sum, the Read histogram type shows the sum of base calls at each position, i.e. total coverage per position. Figure 8 shows an Alignments track with three samples. With the Sum option, the number of reads at each base in each sample is added and displayed. The contribution of individual samples is not visible since the track is colored by group (although coloring by sample would also make sense in this example).
Figure 8. Alignments track: total coverage per locus is shown by using "Read histogram type" set to "Sum" and "Group data by" set to
To show the average coverage per locus, switch Read histogram type to Average and leave Color by as is (i.e. by group) (Figure 9). With this setting, Chromosome view will calculate the average by dividing the total coverage per locus by the number of samples. Note that using Color by Sample would not make sense here. Although Figure 9 looks quite like Figure 8, the y-axis range is different.
Figure 9. Alignments track: average coverage per locus is shown by using "Read histogram type" set to "Average", "Group data by" set to "Attribute", and "Color by" set to
Finally, the option Overlay is useful if you want to directly compare base counts over several samples (or groups) as each will be represented by a line (i.e. no stacking). The example in Figure 10 is based on microarray data, showing three groups on the same Alignments track. The red group has the highest base counts, while the counts in the blue group are much lower.
Figure 10. Alignments track: coverage per locus is shown by using "Read histogram type" set to "Overlay". Each plot is a single experimental condition ("Group data by" set to "Attribute", "Color by" set to ). Lines are rectangular since microarray data is used (an example)
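The difference between the Sum, Average, and Overlay histogram types can be expressed in a few lines of Python. This is a sketch under assumptions: the function `combine_coverage`, its lowercase mode names, and the example values are hypothetical, and every sample is assumed to report coverage for the same positions.

```python
def combine_coverage(sample_coverage, mode="sum"):
    """Combine per-position coverage from several samples.

    "sum":     total coverage per position
    "average": total coverage divided by the number of samples
    "overlay": no combination; one unchanged series per sample
    """
    if mode == "overlay":
        return dict(sample_coverage)
    series = list(sample_coverage.values())
    totals = [sum(values) for values in zip(*series)]
    if mode == "sum":
        return totals
    if mode == "average":
        return [total / len(series) for total in totals]
    raise ValueError(f"unknown mode: {mode}")


samples = {"s1": [2, 4, 6], "s2": [4, 4, 0], "s3": [0, 1, 3]}
print(combine_coverage(samples, "sum"))      # [6, 9, 9]
print(combine_coverage(samples, "average"))  # [2.0, 3.0, 3.0]
```

The summed and averaged profiles have the same shape and differ only in scale, which is why the two histogram types look alike apart from the y-axis range.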
To view the reads histogram grouped by the specific strand that they've mapped to, click the Split read histogram by strand checkbox. This displays the forward reads at the top half of the track and the reverse reads on the bottom half of the track (Figure 11). This track can be helpful in studies such as ChIP-Seq, where strand-specific read distributions can display hallmarks of DNA-protein interactions.
Figure 11. Viewing reads by strand along the reads histogram
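In SAM/BAM records, the strand of a mapped read is encoded in the flag field: bit 0x10 is set when the read maps to the reverse strand. The sketch below uses that bit to separate reads before any per-strand histogram is drawn. The function `split_by_strand` and the simplified (flag, start, end) records are assumptions for illustration; parsing real .bam files is not shown.

```python
REVERSE_STRAND_FLAG = 0x10  # SAM flag bit: read is mapped to the reverse strand


def split_by_strand(reads):
    """Split (flag, start, end) read records into forward and reverse lists."""
    forward, reverse = [], []
    for flag, start, end in reads:
        target = reverse if flag & REVERSE_STRAND_FLAG else forward
        target.append((start, end))
    return forward, reverse


# Hypothetical reads: flags 0 and 99 are forward, 16 and 147 are reverse.
reads = [(0, 1, 50), (16, 20, 70), (99, 100, 150), (147, 200, 250)]
forward, reverse = split_by_strand(reads)
print(forward)  # [(1, 50), (100, 150)]
print(reverse)  # [(20, 70), (200, 250)]
```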
You can use the Transcript label selector to specify labels on the reference transcript track and Isoform proportion track (Figure 12).
Transcript label: Gene
Transcript label: Transcript
Figure 12. Transcript label: setting the control to Gene shows only the gene label, while Transcript shows transcript labels. Both the transcript database and Isoform proportion tracks are affected
Reads pileup and probe color
Short sequencing reads can be coloured by strand (Reads pileup color: Strand) or by base (Reads pileup color: Base).
Reads pileup color: Strand
Reads pileup color: Base
Figure 13. Reads pileup color: colouring of the short sequencing reads by Strand or by Base
The Probe color control customizes the appearance of the Probe intensities track (Figure 14). When set to Intensity, the color of a probe reflects its intensity, using a color gradient from white (low) to admiral blue (high). Alternatively, when Strand is turned on, probes on the reverse strand are in parakeet green, while probes on the forward strand are in sky blue.
Probe color: Intensity
Probe color: Strand
Figure 14. Probe color: "intensities" colors probes proportionally to their intensity, "strand" uses colors to indicate probe positioning (an example is shown)
To change any of the colors on the canvas, use the Customize track colors tool. A resulting dialog will help you to pick another color (drop-down button opens the color-picker) (Figure 15).
Figure 15. Customize colors dialog: selecting a drop-down arrow opens the color-picker tool
A track can be hidden (meaning it will not be visible) by selecting the red minus, or unhidden by selecting the green plus icon.
The tracks can be reordered by drag and drop.
Figure 16. Track order tool: To change the position of a track, drag and drop it to the new position. To pin a track to the top / bottom of the canvas, use the up and down arrows. To unpin a track, select the pin icon. A track can be hidden by clicking on the red minus symbol and unhidden by selecting the green plus. A coloured dot by a track name indicates the layer to which the track belongs (an example is shown)
At the bottom of the control panel you will find the Selection details section (Figure 17). It is used to display information on the element selected on the canvas (using the Pointer mode).
Figure 17. Selection details showing information on the element selected on the canvas. The example shows details of a microarray probe. Note the two link-outs ("Browse on UCSC" and "BLAST this sequence")
Visualizations convey complicated information in a powerful but intuitive way. They can help you decide how best to analyze your data and investigate specific findings in greater detail. When you're ready to share the results of your work, you can export any visualization in Partek Flow as a publication-quality image.
The Sources of variation plot presents the relative contribution of each factor included in the ANOVA model towards explaining the variability of the data for a feature analyzed by the ANOVA.
ANOVA partitions the variation for a feature among all the observations into different components. One component is the variation among the different groups of the factors in the model, which represents signal; another component is the variation within groups, which represents noise. The variance within groups is also called error, because it is the variability that is not explained by any factor. The F-statistic is the ratio of the two components, that is, a signal-to-noise ratio:
F = between-group variance / within-group variance
A large F value for a factor indicates that between group variation is greater than within group variation, which results in a small p-value for that factor. Conversely, a small F value for a factor indicates the between group variation is smaller than within group variation, which results in a big p-value for that factor.
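The ratio above can be sketched in a few lines of Python for the simplest case, a one-way ANOVA with a single factor. This is a minimal illustration of the signal-to-noise idea, not the model fitting that Partek Flow performs; the function name `one_way_f` and the example values are made up for this sketch.

```python
from statistics import mean


def one_way_f(groups):
    """Return the F value for a one-way ANOVA.

    groups: list of lists, one list of values per group of the factor.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)       # number of groups
    n = len(all_values)   # total number of observations

    # Between-group sum of squares (signal)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares (noise, i.e. error)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    ms_between = ss_between / (k - 1)   # between-group variance
    ms_within = ss_within / (n - k)     # within-group variance
    return ms_between / ms_within


# Hypothetical expression values for one feature in two groups
control = [1.0, 2.0, 3.0]
treated = [4.0, 5.0, 6.0]
f_value = one_way_f([control, treated])
print(f_value)
```

In this toy example the groups are well separated relative to their spread, so the between-group variance is much larger than the within-group variance and the F value is well above 1.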
In Partek Flow, the Sources of variation plot uses a bar chart to show the F value for each factor in the ANOVA model relative to the error. The plot is available for each gene in an ANOVA result. To open the Sources of variation plot for a gene, click on the icon next to the gene symbol in the ANOVA results table. The plot helps you understand the contribution of each factor in the ANOVA model to the overall variance in the data set, relative to the within-group variation (Figure 1).
Figure 1. Source of variation plot
The chart title is the feature ID (e.g. gene or transcript). The Y-axis is the F value; the X-axis categories are the factors specified in the ANOVA model and the Error term. The Error term will always have an F value of 1 because error, i.e., within-group variance, is the denominator in the F value calculation.
The Correlation plot is used to visualize the relationship between feature and numeric attribute values for each sample or cell. The y-axis is the feature and the x-axis is the numeric attribute. Points on the plot are cells or samples.
The correlation plot is available on and (if a numeric attribute is included) task reports.
Click the View correlation plot button in the View column of the task report (Figure 1)
Figure 1. Opening a correlation plot
The correlation plot can be configured using the control panel to the left of the plot (Figure 2).
Figure 2. Correlation plot
The points on the plot are the cells or samples.
The feature ID is shown in the plot title.
The x-axis is the selected numeric attribute.
The y-axis is the normalized counts for the feature.
The points can be colored, sized, connected, and labeled using the drop-down menus in the control panel. By default, the points are colored by the first numeric attribute.
To switch the numeric attribute on the x-axis, use the X-axis drop-down menu.
The linear regression line is shown on the plot as a grey dashed line.
Click < Previous and Next > to move up and down the table.
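The regression line drawn on the correlation plot can be illustrated with an ordinary least-squares fit. The sketch below assumes a simple linear fit of the feature value (y) on the numeric attribute (x); the source does not state how Partek Flow computes the line, and the function name `fit_line` and the example values are hypothetical.

```python
def fit_line(x, y):
    """Least-squares fit of y = slope * x + intercept, plus Pearson r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5   # Pearson correlation coefficient
    return slope, intercept, r


# Hypothetical numeric attribute (x) and normalized counts (y) per sample
attribute = [1.0, 2.0, 3.0, 4.0]
counts = [3.0, 5.0, 7.0, 9.0]
slope, intercept, r = fit_line(attribute, counts)
print(slope, intercept, r)
```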
The volcano plot is a special 2-D scatter plot used to visualize significance and the magnitude of changes in features (e.g. genes or transcripts) within a given comparison. By convention, the X-axis represents the fold change between the two groups and is on a log2 scale. On the other hand, the Y-axis shows negative log10 of the p-values from the statistical test of the comparison.
As a prerequisite for invoking the volcano plot, you must run a Differential gene expression analysis. When setting-up the analysis, include all relevant contrasts you would like to view the volcano plots for.
Once the analysis is complete, open the resulting Feature list data node. Navigate to the Gene list section of the page and click on the volcano plot icon, located on the left of the table header (Figure 1).
Figure 1. Invoking a volcano plot
Each point on the plot represents the statistical result for a single feature (e.g. gene or transcript). The black vertical and horizontal lines represent the fold change and p-value thresholds, respectively. By default, the two vertical lines mark fold changes of -2 and +2, and the horizontal line marks a p-value of 0.05. Features that are up- or down-regulated by at least 2-fold and have a p-value less than 0.05 fall in the upper-right and upper-left corners of the plot, respectively, and are highlighted in different colors. By default, significantly up-regulated features are in red, significantly down-regulated features are in blue, and the remaining features are in grey or black (Figure 2). The plot header is derived from the name of the contrast.
Figure 2. Volcano plot: each dot on the plot is a single gene/transcript/feature. Horizontal axis: fold change; vertical axis: p-value (on a negative log10 scale). Colour coding is based on the fold change. Thick vertical lines highlight fold changes of -2 and +2, while a thick horizontal line represents a p-value of 0.05
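The default thresholds described above can be sketched as a small classification function. This is an illustration only: it assumes a signed linear fold change (+2 means two-fold up, -2 means two-fold down), and the function name `volcano_point` and its parameters are made up for this example rather than taken from Partek Flow.

```python
import math


def volcano_point(fold_change, p_value, fc_cutoff=2.0, p_cutoff=0.05):
    """Return (x, y, status) for one feature on a volcano plot.

    fold_change: signed linear fold change (assumed convention, |fc| >= 1).
    x is the fold change on a log2 scale, y is -log10 of the p-value.
    """
    x = math.copysign(math.log2(abs(fold_change)), fold_change)
    y = -math.log10(p_value)

    if p_value < p_cutoff and fold_change >= fc_cutoff:
        status = "up"                # upper-right corner
    elif p_value < p_cutoff and fold_change <= -fc_cutoff:
        status = "down"              # upper-left corner
    else:
        status = "not significant"
    return x, y, status


up = volcano_point(4.0, 0.001)
down = volcano_point(-2.0, 0.01)
unchanged = volcano_point(1.5, 0.5)
print(up, down, unchanged)
```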
The Axes and Statistics menus allow you to customize which data sources are used for the axes, as well as the annotations, limits, and significance thresholds for the plot (Figure 3).
Figure 3. The plot can be configured using the menus on the left.
The colors can also be customised using the Style menu (Figure 4).
Figure 4. Volcano plot with a customized color scheme.
You can also highlight and label any of the genes in the plot (Figure 5).
Figure 5. Easily highlight any gene by clicking on it directly.
Click the Export image button on the left panel to save a PNG, SVG, or PDF image to your computer.
Click the To notebook button on the left panel to send the image to a page in the Notebook.
The Interaction plot is used to visualize values of a feature for groups considered by a statistical test. An interaction tests for whether the effect of one factor is dependent on another factor. For example, in an experiment where drug-resistant or drug-sensitive cell lines are treated with either vehicle or drug, we would expect the effect of the drug to depend on whether the cell line is drug-resistant or drug-sensitive; to account for this, we would include an interaction between cell line and treatment in the statistical test.
The x-axis of the interaction plot is one of the categorical attributes or factors included in the interaction. The y-axis is the LSMean of the normalized counts for a feature. The points represent the groups formed by combination of the two factors included in the interaction, e.g., drug-sensitive vehicle treated, drug-sensitive drug treated, drug-resistant vehicle treated, and drug-resistant drug treated.
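To make the idea of an interaction concrete, the sketch below computes the mean of each factor combination for the drug example. For a model that contains both factors and their interaction, the LSMeans of the combinations coincide with these cell means; this is a simplified illustration with invented values, not the statistical test run by Partek Flow, and `cell_means` is a hypothetical helper.

```python
from collections import defaultdict


def cell_means(records):
    """Mean value for each combination of two factors.

    records: iterable of (factor_a_level, factor_b_level, value) tuples.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for a, b, value in records:
        sums[(a, b)] += value
        counts[(a, b)] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Invented normalized counts for one feature
records = [
    ("sensitive", "vehicle", 10.0), ("sensitive", "vehicle", 12.0),
    ("sensitive", "drug", 2.0), ("sensitive", "drug", 4.0),
    ("resistant", "vehicle", 10.0), ("resistant", "vehicle", 12.0),
    ("resistant", "drug", 9.0), ("resistant", "drug", 11.0),
]
means = cell_means(records)

# Effect of the drug within each cell line
effect_sensitive = means[("sensitive", "drug")] - means[("sensitive", "vehicle")]
effect_resistant = means[("resistant", "drug")] - means[("resistant", "vehicle")]
print(effect_sensitive, effect_resistant)
```

Here the drug effect differs strongly between the two cell lines, which is the pattern an interaction term is meant to capture; on the plot, the connecting lines would not be parallel.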
The Interaction plot is available in Feature list data nodes generated by , , and Hurdle model differential analysis tasks when two categorical attributes are included as factors in the statistical test. For ANOVA and Hurdle model, an interaction between the two factors must also be included.
Click the View interaction plot button in the View column of the task report (Figure 1)
Figure 1. Opening an interaction plot
The interaction plot can be configured using the control panel to the left of the plot (Figure 2).
Figure 2. Interaction plot options
Points on the plot are the groups formed by the combination of the two factors in the selected interaction.
The feature ID is shown in the plot title.
The y-axis is the LSmean of the normalized counts for the feature.
The x-axis is used to group points by the levels of one of the two factors.
The points on the plot are colored, shaped, and connected by the factor in the selected interaction that is not being used as the x-axis.
To switch between interactions included in the statistical test, use the Interaction drop-down menu.
To switch the factors being used to group and color/shape the points, use the X-axis drop-down menu to change the x-axis grouping factor.
Click < Previous and Next > to move up and down the table.
A histogram summarizes the frequency distribution of a set of data, with the variable of interest on one axis and its frequency on the other. In Partek Flow, a histogram can be invoked on a continuous or categorical variable.
From a data viewer session, click on New plot > Bar chart (Figure 1).
Figure 1. Select the Bar chart option from the New plot menu.
Upon clicking on the Bar chart menu, a dialog opens with the different data sources that can be displayed on the histogram. Select your data node of interest and the content data (Figure 2).
Figure 2. Select the appropriate data node and content data to display on the histogram plot
The first row in the data will be displayed by default in the histogram and in this case, it is the histogram of the expression values for the gene A1BG (Figure 3).
Figure 3. Histogram showing the distribution of A1BG expression (red)
Change the data displayed on the histogram by using the Configure > Axes menu and selecting the desired variable to display. Here the data displayed was switched to “Expressed genes” which is a continuous variable (Figure 4).
Figure 4. Histogram showing the distribution of expressed genes variable (red rectangle)
Use the "Sort by" function to sort the plot. By default, the bars are sorted by Value on the x-axis, in ascending order. To change this, switch the sort option from Default to Value or Frequency (Figure 5).
Figure 5. Sort by function can be by Value or Frequency (red)
Users can color the histograms by a categorical attribute using the Color by function (in red below). The bars were colored by the graph-based classifications in the example below (Figure 6).
Figure 6. Histogram annotated by automatic classifications
The bars in the histogram above were stacked. They can be unstacked using the Style menu as seen below in red (Figure 7).
Figure 7. Unstacked bars in the histogram plot
Users also have the option to bin by either Count or Size. When binned by Count, the user specifies the number of bins for the data and the distribution is fit into the specified number of bins. Data below is binned by Count (Figure 8).
Figure 8. Histogram of expressed genes with number of bins specified as 10
When binned by Size, the user specifies the number of items in the bin (size of a bin). This is used to calculate the number of bins required for the data. Data below is binned by Size (Figure 9).
Figure 9. Histogram of expressed genes with size of bin specified as 75
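The two binning options can be contrasted with a short sketch. Binning by Count fixes the number of bins, while binning by Size fixes the size of each bin and derives the number of bins from it. The code below assumes that the bin size is a width along the value axis, which is one reading of the description above and not confirmed behavior of Partek Flow; both function names and the example values are hypothetical.

```python
import math


def bin_by_count(values, n_bins):
    """Split the value range into n_bins equal-width bins and count items."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0   # avoid zero width if all values are equal
    counts = [0] * n_bins
    for v in values:
        # The maximum value falls into the last bin
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


def bins_needed_for_size(values, bin_size):
    """Number of bins required when each bin spans bin_size (assumed to be a width)."""
    lo, hi = min(values), max(values)
    return max(1, math.ceil((hi - lo) / bin_size))


values = [0, 10, 20, 30, 40, 50, 60, 70, 80, 100]
counts_per_bin = bin_by_count(values, 5)
n_bins = bins_needed_for_size(values, 25)
print(counts_per_bin, n_bins)
```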
Heatmaps can be used to explore gene expression trends across samples. They are a useful tool in visualising the results of the hierarchical clustering of a gene list. They can be employed to visually distinguish between different samples and different treatments across a dataset. After having performed a hierarchical clustering task, double-clicking the Hierarchical clustering / heatmap task node will automatically launch the heatmap in a new data viewer session (Figure 1).
Figure 1. A heatmap showing normalized gene (columns) expression levels across samples (rows).
Samples are shown on rows and genes on columns. Clustering for samples and genes is shown through the dendrogram trees. More similar samples/genes are separated by fewer branch points of the dendrogram tree.
The heatmap displays standardized expression values with a mean of zero and standard deviation of one.
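Standardization to a mean of zero and a standard deviation of one can be sketched as follows. The example assumes that each gene is standardized across samples using the sample standard deviation; the source does not specify either detail, and `standardize` is a hypothetical helper.

```python
from statistics import mean, stdev


def standardize(values):
    """Shift and scale values to mean 0 and standard deviation 1."""
    m = mean(values)
    s = stdev(values)   # sample standard deviation (an assumption)
    if s == 0:
        # A constant gene carries no variation to scale
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]


# Invented expression values of one gene across three samples
z_scores = standardize([1.0, 2.0, 3.0])
print(z_scores)
```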
The heatmap can be customized to improve data visualization using the menus on the Configuration panel on the left. You can select Annotations to choose the attribute used to annotate the rows, and color the dendrogram using the Dendrograms menu. The color palette used to visualize gene expression can be modified by opening the Heatmap menu on the left and selecting the colors from the sliding bar (Figure 2).
Figure 2. Annotations menus used to customize the appearance of the heatmap.
Once selected, the custom annotations will appear on the figure (Figure 3).
Figure 3. Annotated heatmap.
A Pie chart displays data in a circular graph. It gives you a snapshot of how a group is broken down into smaller pieces, because the pieces of a Pie chart are proportional to the fraction of the whole in each category. To make a Pie chart, you need a categorical variable (descriptions of your categories, like ‘cell type’) as well as a numeric variable (e.g., cell numbers). In Partek Flow, the default numeric variable is the number of cells. Therefore, the Pie chart indicates the fraction of the total number of cells in each category.
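The quantity behind each slice is simply the share of cells in a category. A minimal sketch, using invented cell-type labels and a hypothetical helper `category_fractions`:

```python
from collections import Counter


def category_fractions(labels):
    """Return {category: (cell count, percentage of total)}."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cat: (n, 100.0 * n / total) for cat, n in counts.items()}


# Invented categorical attribute: one label per cell
cell_types = ["NK"] * 2 + ["T cell"] * 5 + ["B cell"] * 3
slices = category_fractions(cell_types)
print(slices)
```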
To make a Pie chart, open a new Data Viewer session in Flow (Figure 1).
Figure 1. New Data Viewer session in Flow.
Select New plot > Pie chart from the menu on the left. Then select any data node to create a new empty Pie chart (Figure 2).
Figure 2. Create a new empty Pie chart in Data Viewer.
Categorical attributes that can be used for the Pie chart are displayed once a data node has been selected from the Data card on the left side of the Data Viewer.
Simply select the attribute, in this case cell type, to generate a new Pie chart (Figure 3).
Figure 3. Add attribute to Pie chart.
A Pie chart that shows the fraction of the total number of cells in each cell type has been created (Figure 4).
Figure 4. Example Pie chart with cell types.
The number of cells in a category and its percentage of the total appear when the cursor is moved over a slice. For instance, NK cells include 16,434 cells, which account for 28.66% of the total cells in the study (Figure 5).
Figure 5. The mouseover example Pie chart.
Using the Configure > Axes menu, you can split the plot by any available categorical attribute; in this case, the Pie chart was split by age (Figure 6).
Figure 6. Configuration and split example of Pie chart.
Figure 7. Different modes of Pie Chart.
Once you are pleased with the appearance of the Pie chart, push the Export image button to save it to the local machine, or click the Save button to save the Data Viewer session. The resulting dialog (Figure 8) controls the Format, Size and Resolution of the image file. The image will be saved in the format of your choice (.svg, .png or .pdf).
Figure 8. Save image dialog (default settings)
The Position box enables the user to visualize a region in the genome. Coordinates are accepted in the following format: chromosome:start – end (zero-based). To show an entire chromosome, it is sufficient to enter just the chromosome number. The U-turn icon on the right takes you back to the original view, i.e. resets the zoom level to the view that was shown when the viewer was first opened.
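The coordinate format can be illustrated with a small parser. This is a sketch of how such a string could be interpreted, not the validation that Partek Flow performs; the function name `parse_position`, the accepted dash characters, and the example coordinates are assumptions made for illustration.

```python
import re

# chromosome, optionally followed by ":start - end" (hyphen or en dash)
_POSITION = re.compile(r"^\s*([\w.]+)\s*(?::\s*(\d+)\s*[-\u2013]\s*(\d+))?\s*$")


def parse_position(text):
    """Return (chromosome, start, end); start and end are None for a whole chromosome."""
    match = _POSITION.match(text)
    if match is None:
        raise ValueError(f"Unrecognized position: {text!r}")
    chrom, start, end = match.groups()
    if start is None:
        return chrom, None, None
    start, end = int(start), int(end)
    if start > end:
        raise ValueError("start must not be greater than end")
    return chrom, start, end


region = parse_position("15:1000 \u2013 2500")   # made-up coordinates
whole = parse_position("7")
print(region, whole)
```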
The next time you want to go directly to the same location, select the name of the bookmark (the example in Figure 8 lists B2M - exon #4 as the bookmark name) and Partek Flow will plot the region as defined in the Location column. To remove a bookmark, select the red cross icon.
Once the plot has been modified, you can save the current appearance of the canvas by using the save icon. The resulting dialog (shown in Figure 9) enables you to change the image Format (options include: .svg, .png, .pdf), Size, and Resolution. The image will be saved in your Downloads directory.
Another way to get the Chromosome view is through a Task report. You can launch the viewer by selecting the chromosome icon in the View column (Figure 3) of GSA or View Variants reports. In that case, the Chromosome view will browse directly to the selected genomic location (i.e. a transcript or a variant, depending on the pipeline).
Insertion
Deletion
To invoke the Venn diagram, click the button. This brings up the list selector window. Select the data nodes you would like to include on the diagram using the check boxes.
Click on the icon to select the identifier that will be used to compare between the sets. The Select identifier dialog box will then appear.
If a variant database is available for the current genome, the variants can be added to the track. To show the variants, point the Variant database control to the database of your choice.
The position of the tracks on the canvas can be controlled by using the Track order tool. If you want a track to be visible all the time, i.e. while scrolling up or down, pin it to the top or to the bottom. In the example shown, the Cytoband track is pinned to the top of the canvas and the Reference genome track is pinned to the bottom. To unpin a track, click on the pin icon. The track will be unpinned and the message No tracks are pinned to the top / bottom will appear. To pin a track, drag the track name to the No tracks… message. Alternatively, you can use the green arrows to pin a track. When you mouse over an arrow, the new position of the track will be highlighted on the canvas; click on the arrow to accept.
Plot controls are in the panel on the left. The <Previous and Next> buttons enable you to switch between features as they appear in the ANOVA results table. The bar chart colors are controlled by the Customize colors option. If you want to customize the colors, click the icon to configure the color palette.
Click the Save image button to save a PNG or SVG image to your computer.
Click the Send to notebook button to send the image to a page in the Notebook.
If more than one categorical attribute has been added to the data, the Pie chart has two modes: Pointer mode and Zoom mode (Figure 7). Pointer mode is the default; Zoom mode is accessed by clicking on the zoom icon and then clicking on any slice of interest. In this case, Figure 7 shows the distribution of NK cells between male and female patients in the supercentenarian panel.
Stacked violin plots can be used to summarize the expression of multiple features in one image.
Start by adding a 2D scatter plot to the Data viewer (Figure 1).
Select the data node from the pipeline to use (e.g. normalized counts) then choose the attribute of interest for the X-axis (e.g. cell type or sample name) and select Numeric list for the Y-axis (Figure 2).
Navigate to the left panel and open the Axes Configure option. Add the features (e.g. genes) to include on the Y-axis by typing the name and selecting the feature (Figure 3). These can be ordered by drag and drop within the box which is reflected on the visualization.
Use the Show advanced options to modify additional axes parameters on the visualization.
Navigate to the left panel and open the Style Configure option. Use the Summary options to toggle on and off the Violin and other visual preferences (Figure 4).
Use the Show advanced options to toggle the points (dots) on and off.
When you are ready to export this image to your machine, navigate to the right corner and choose Export image with the necessary format, size, and resolution requirements. Click Save.
A dot plot is a special 2D scatter plot in which the X-axis is a categorical variable and the Y-axis is a numerical variable. It is typically used to visualize gene expression (RNA-seq or scRNA-seq), with dots representing observations (samples or cells), the X-axis showing treatment groups, and the Y-axis showing expression level on an appropriate scale.
In a differential analysis report data node, select the dot plot icon in the View column to invoke the plot in the data viewer. The plot will be displayed in a new browser tab (an example plot is shown in Figure 1).
Figure 1. Dot plot (an example). Each dot is a sample. The plot title (ADRBK2) is the gene symbol of the selected gene. Expression levels are on the vertical axis
The chart title is based on the feature (e.g. gene or transcript) that the plot was invoked on. The y-axis is scaled automatically, based on the range of the data, and the units correspond to the input node, i.e. if the data were normalized using transcripts per million (TPM), the y-axis will be in TPM-normalised counts. Dots represent observations (e.g. samples or cells). Hovering the cursor over a dot invokes a popup balloon that shows the sample ID and the respective expression value.
By default, the order of the groups on the X-axis follows the order on the attribute management page in the Data tab. However, you can click on the name of a group on the X-axis and drag and drop it to change the order (Figure 2).
Figure 2. Drag and drop to change the order of the X-axis groups
When you have a lot of observations to display, you can add box-whiskers and/or violin plots to the graph (Figure 3) by turning the respective plot types on or off in the Style card of the configuration panel.
Figure 3. Add Box-Whisker and Violin plot
You can have multiple categorical attributes on an axis, e.g. the X-axis can represent both cell type and age group (Figure 4). Click on an attribute and drag it to change the order of the attributes.
Figure 4. X-axis representing multiple categorical attributes
Click the Export image button to save a PNG, SVG, or PDF image to your computer.
Click the To notebook button to send the image to a page in the Notebook.
The transcription start site plot (TSS) displays the number of reads that map around the transcription start site of genes within defined regions, such as peaks called during ChIP-seq experiments. This visualization can help identify patterns of chromatin, transcription factor and other protein binding sites along this stretch of the DNA.
You can invoke the creation of the plot on the annotated peak data node. After the peaks are annotated with transcripts, the coverage of the reads across the genes that overlap the regions is calculated. This information is presented as a profile plot (Figure 1) and a heatmap (Figure 4).
Select the Annotated regions results node then under Exploratory analysis in the task menu, select TSS plot.
Figure 1. TSS profile plot
In the profile plot, the Y-axis is the total read count and the X-axis is the window defined in the annotation task. The midpoint is the transcription start site (TSS) of all the genes overlapping the peak regions, and the start and end points of the X-axis are defined by the promoter upstream/downstream limits specified in the Annotate peaks task (Figure 2). Each line represents the total read count of a selected sample at each position; all selected samples are shown on the plot for comparison.
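The profile for one sample can be sketched as a count of reads at each offset from the TSS, summed over all genes. The code below is a deliberately simplified illustration: it treats each read as a single position, ignores strand and chromosome, and uses invented coordinates; `tss_profile` is a hypothetical function and not the calculation implemented in Partek Flow.

```python
def tss_profile(read_positions, tss_positions, upstream, downstream):
    """Total read count at each offset from -upstream to +downstream around the TSSs."""
    offsets = list(range(-upstream, downstream + 1))
    counts = [0] * len(offsets)
    for tss in tss_positions:
        for pos in read_positions:
            offset = pos - tss   # 0 means the read sits exactly on the TSS
            if -upstream <= offset <= downstream:
                counts[offset + upstream] += 1
    return offsets, counts


# Invented read positions and TSS coordinates for one sample
reads = [98, 100, 100, 103, 200]
tss_sites = [100, 201]
offsets, counts = tss_profile(reads, tss_sites, upstream=2, downstream=3)
print(list(zip(offsets, counts)))
```

Plotting `counts` against `offsets` for each sample would give one line per sample, with offset 0 at the midpoint of the X-axis.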
Figure 2. Annotate peaks task dialog
By default, all samples are selected for display. To remove a sample track, click on the button next to the sample name on the sample control panel (Figure 3). To add a sample to the plot, select the sample name from the Add sample drop-down list; a new line will be added to the plot.
Figure 3. Sample selection control panel
The heatmap is another way to view the overlap between the gene body and the detected peaks (Figure 4). The X-axis represents the same information as in the profile plot, while the maximum value of the Y-axis varies among samples. In the heatmap, each row represents a gene and the color encodes the total read count at each position; the genes are sorted by total read count in descending order.
Figure 4. TSS heatmap
The color and color scale on the heatmap can be customized using the control panel on the left (Figure 5). You can change the number of heatmaps per row, with a higher number decreasing the heatmap size. For the heat map colors, changing the Low and High values will change the color scale of the heatmap. You can also change the color of the profile line by selecting Customize sample colors.
Figure 5. TSS heatmap control panel
After performing exploratory analyses such as PCA, UMAP and t-SNE, it is helpful to visualize the results on a scatter plot. This can help you visually assess the sources of variation affecting the results of an experiment, classify cells, and select samples for downstream analysis. Here we have a PCA scatter plot generated from the analysis of 12 samples from a scRNA sequencing study. The three most informative PCs are plotted by default, and the percentage of variation explained is stated next to each of them.
Figure 1. Example of a 3D PCA scatterplot.
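The percentage shown next to each PC is its share of the total variance. A minimal sketch, assuming the variance captured by each component (its eigenvalue) is already known; the values and the function name `percent_variance_explained` are invented for illustration.

```python
def percent_variance_explained(component_variances):
    """Convert per-component variances into percentages of the total variance."""
    total = sum(component_variances)
    return [100.0 * v / total for v in component_variances]


# Invented variances for PC1, PC2 and PC3
percentages = percent_variance_explained([6.0, 3.0, 1.0])
print(percentages)
```

Note that in a real analysis the total includes all components, not only the three that are plotted, so the three displayed percentages usually sum to less than 100.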
The Configure > Style menu on the left can then be used to color the features in the scatter plot based on an attribute (Figure 2). In this case, Figure 3 shows the cells being colored based on their cell-type.
Figure 2. Customization menu.
Figure 3. PCA scatterplot colored by cell-type.
Additionally, you can adjust the opacity of the points to better assess the density across the groups (Figure 4). It is also possible to split the plot based on the same attribute in the Configure > Grouping menu (Figure 5).
Figure 4. Adjusted opacity shows point density more accurately.
Figure 5. Splitting by an attribute can help better visualize their effect on the data.
Click the Save image button to save a PNG or SVG image to your computer.
Click the Send to notebook button to send the image to a page in the Notebook.