Variations in nucleotide sequence, in the form of single nucleotide variants (SNVs) and insertion and deletion events (INDELs), can exist within the germline or can be acquired through somatic alterations. Partek Flow provides pipeline creation tools to identify both SNVs and INDELs using aligned reads generated from targeted, whole exome, or whole genome DNA-Seq (or RNA-Seq) data. Variants can be detected by comparison against the reference sequence utilized for alignment or between paired samples in a project. Variant detection tools can be run on either Aligned reads or Filtered reads data nodes (Figure 1), and the Detect variants task node will produce a Variants data node. The Variants data node will contain a Variant Call Format (VCF) file for each sample in the project. Three detection tools, each employing a different algorithm to identify variants in aligned sequence data, are available under the Variant callers section of the context sensitive menu: FreeBayes, LoFreq, and SAMtools.
Figure 1. Showing Variant callers from an aligned reads node
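For readers who want to inspect the VCF output outside of Partek Flow, the following is a minimal sketch of how records could be separated into SNVs and INDELs by comparing the lengths of the REF and ALT alleles. The function names and the example records are hypothetical and are not part of Partek Flow; the sketch assumes plain tab-delimited VCF text and does not handle symbolic alleles such as <DEL> or *.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Label one REF/ALT allele pair as SNV, INDEL, or OTHER.

    Assumption: alleles are literal base strings (no symbolic alleles).
    """
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(ref) != len(alt):
        return "INDEL"
    return "OTHER"  # e.g. multi-nucleotide substitutions of equal length


def parse_vcf_lines(lines):
    """Yield (chrom, pos, ref, alt, kind) for every ALT allele in VCF text lines."""
    for line in lines:
        # Skip blank lines, meta-information (##) and the column header (#CHROM).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alts = fields[:5]
        # A single record may list several comma-separated ALT alleles.
        for alt in alts.split(","):
            yield chrom, int(pos), ref, alt, classify_variant(ref, alt)


# Hypothetical records used only for illustration.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tG\t50\tPASS\t.",
    "chr1\t2000\t.\tAT\tA\t40\tPASS\t.",
    "chr2\t3000\t.\tC\tCGG,T\t30\tPASS\t.",
]
records = list(parse_vcf_lines(example))
for record in records:
    print(record)
```

Each ALT allele is classified separately because a single VCF line can carry both an SNV and an INDEL at the same position.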
If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.
A very popular variant detection approach that performs well in many situations, FreeBayes (version 1.0.1) employs a Bayesian statistical framework to determine the most likely combination of genotypes in the sample(s) at each position in a reference genome for any number of individuals from a population. It is haplotype-based, calling variants based on the literal sequences of reads aligned to a particular target and not their precise alignment. This method can identify both single nucleotide variants and insertion/deletion events. Information on the model underlying the variant detection is detailed by Garrison et al.1
Selecting FreeBayes from the context sensitive menu will bring up the FreeBayes task dialog (Figure 1), which contains two sections: Select Reference sequence and Advanced options.
Select Reference sequence will specify the reference assembly to utilize for variant detection. If the alignment was generated in Partek Flow, the Assembly will be displayed as text in the section, and you do not have the option to change the reference. In the event that alignment was performed outside of Partek Flow, you will need to select the appropriate Assembly utilized for alignment in the drop-down list. Assemblies previously added to library files (see Library File Management) will be available for selection or New assembly… can be utilized to import the reference sequence to library files from within the task.
Advanced options provides a means to tune parameters in the variant detection for optimal performance. Upon invoking the task dialog, Option set is set to Default, and these parameters are provided by the FreeBayes developers. Clicking the Configure button will open a window to tune advanced options. FreeBayes has advanced options for Population model, Allele scope, Indel realignment, Input filters, Mappability priors, Genotype likelihoods, Algorithmic features, and Report options. Moving the mouse cursor over the info button will provide details for each parameter. Please refer to the FreeBayes documentation for further information on tuning these parameters.
Garrison E, Marth G. Haplotype-based variant detection from short-read sequencing. July 2012.
LoFreq (version 2.1.3a) is a very sensitive and fast variant caller that can be employed to robustly call low-frequency variants. By incorporating sources of sequencing error into the detection model, LoFreq can identify variants below the sequencing error rate. The significance of each variant is calculated to allow for control of false positives. This method can identify both single nucleotide variants and insertion/deletion events, although the current implementation does not produce discrete genotype calls. Information on the model underlying the variant detection is detailed by Wilm et al.1
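To illustrate how a quality-aware significance value can separate a low-frequency variant from sequencing error, the sketch below treats every base covering a position as an independent trial whose error probability is derived from its Phred quality, and computes the probability of observing at least the given number of non-reference bases by chance. This is a simplified illustration in the spirit of the model described by Wilm et al., not LoFreq's implementation: the function names are hypothetical, only base-call quality is considered, and no multiple-testing correction is applied.

```python
def phred_to_error_prob(quality: int) -> float:
    """Convert a Phred-scaled quality score into an error probability."""
    return 10 ** (-quality / 10)


def poisson_binomial_tail(error_probs, k):
    """Return P(X >= k), where X counts errors over independent bases.

    Each base i is a Bernoulli trial with its own error probability
    error_probs[i]; the distribution of X is built by dynamic programming.
    """
    dist = [1.0]  # dist[j] = probability of exactly j errors so far
    for p in error_probs:
        new = [0.0] * (len(dist) + 1)
        for j, mass in enumerate(dist):
            new[j] += mass * (1 - p)   # this base was read correctly
            new[j + 1] += mass * p     # this base is a sequencing error
        dist = new
    return sum(dist[k:])


# Hypothetical pileup: 1000 bases at Q30, 10 of which are non-reference.
probs = [phred_to_error_prob(30)] * 1000
p_value = poisson_binomial_tail(probs, 10)
print(f"P(>= 10 errors by chance) = {p_value:.3g}")
```

With an expected error count of about one base in this example, ten non-reference observations are very unlikely to arise from sequencing error alone, even though the variant frequency is only 1%.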
Selecting LoFreq from the context sensitive menu will bring up the LoFreq task dialog (Figure 1), which contains two sections: Select Reference sequence and Advanced options.
Select Reference sequence will specify the reference assembly to utilize for variant detection. If the alignment was generated in Partek Flow, the Assembly will be displayed as text in the section, and you do not have the option to change the reference. In the event that alignment was performed outside of Partek Flow, you will need to select the appropriate Assembly utilized for alignment in the drop-down list. Assemblies previously added to library files (see Library File Management) will be available for selection or New assembly… can be utilized to import the reference sequence to library files from within the task.
Wilm A, Aw PPK, Bertrand D, et al. LoFreq: a sequence-quality aware, ultra-sensitive variant caller for uncovering cell-population heterogeneity from high-throughput sequencing datasets. Nucleic Acids Res. 2012;40(22):11189-11201.
Advanced options provides a means to tune parameters in the variant detection for optimal performance. Upon invoking the task dialog, Option set is set to Default, and these parameters are provided by the LoFreq developers. Clicking Configure will open a window to tune advanced options. LoFreq has advanced options for Region control, Base-call quality, Base-alignment (BAQ) and indel-alignment (IDAQ) qualities, Mapping quality, Indels, Source quality, P-values, and Other. Moving the mouse cursor over the info button will provide details for each parameter. Please refer to the LoFreq documentation for further suggestions on tuning these parameters.
SAMtools1 (version 1.2) utilizes the mpileup command to examine the observed bases in the reads covering every genomic position represented in the aligned sequence data and to calculate the likelihood of every possible genotype at each locus. Subsequently, bcftools applies the prior probability and uses Bayesian inference to call actual genotypes, outputting variant information in Variant Call Format (VCF). This method can identify both single nucleotide variants and insertion/deletion events. General information about the underlying algorithm utilized by SAMtools is detailed by Li.2,3
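The two steps described above, genotype likelihoods followed by Bayesian genotype calling, can be sketched for a single biallelic diploid site as follows. This is a simplified illustration and not the SAMtools or bcftools implementation: the function name, the error model (a miscalled base is equally likely to be any of the three other bases), and the prior values are assumptions made for the example.

```python
import math


def genotype_posteriors(observations, ref, alt, priors):
    """Return posterior probabilities for hom_ref, het, and hom_alt.

    observations: list of (base, phred_quality) pairs covering the site.
    priors: dict of prior probabilities keyed by genotype name (assumed values).
    """
    genotypes = {"hom_ref": (ref, ref), "het": (ref, alt), "hom_alt": (alt, alt)}
    log_post = {}
    for name, (allele1, allele2) in genotypes.items():
        total = math.log(priors[name])
        for base, quality in observations:
            err = 10 ** (-quality / 10)
            # Probability of the observed base given each allele of the genotype.
            p1 = 1 - err if base == allele1 else err / 3
            p2 = 1 - err if base == allele2 else err / 3
            # A read is assumed to sample either allele with equal probability.
            total += math.log(0.5 * p1 + 0.5 * p2)
        log_post[name] = total
    # Normalise in log space to avoid underflow at high coverage.
    peak = max(log_post.values())
    scale = sum(math.exp(v - peak) for v in log_post.values())
    return {name: math.exp(v - peak) / scale for name, v in log_post.items()}


# Hypothetical priors and pileup: 10 reference (A) and 10 alternate (G) bases at Q30.
priors = {"hom_ref": 0.998, "het": 0.001, "hom_alt": 0.001}
pileup = [("A", 30)] * 10 + [("G", 30)] * 10
posteriors = genotype_posteriors(pileup, "A", "G", priors)
best_call = max(posteriors, key=posteriors.get)
print(best_call, posteriors)
```

Even with a prior that strongly favors the homozygous reference genotype, balanced support for both alleles makes the heterozygous genotype the most probable call in this example.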
Selecting SAMtools from the context sensitive menu will bring up the SAMtools task dialog, which contains three default sections: Variant detection method, Select Reference sequence, and Advanced options.
In the Variant detection method drop-down list, Against reference will compare base composition for each sample against the reference sequence assembly, independently (Figure 1).
In the event paired samples exist within the project, Paired samples detection can be utilized to identify loci with differing genotypes between the pair once each sample has been compared to the reference sequence assembly. In instances where there is limited information to accurately determine genotypes in one or both of the samples, the same genotype may be called for case and control if it differs from the reference. The Filter variants task can be used to exclude these spurious loci. To perform this analysis, sample attributes must be added in the Data tab of the project (Figure 2). Specifically, an attribute must be added for sample ID (shared between the paired samples) and an attribute must also be added for sample type that differentiates the paired samples.
Examples of the latter can include case and control or tumor and normal. If these attributes are present, a section for Analysis options will be displayed below the Variant detection method (Figure 3). To utilize this feature, select Paired analysis. Match ID must then be specified and should correspond to the attribute that references the sample ID shared between the pair. Selecting Case/control will allow for discriminating genotypes between paired samples in downstream tasks. Attribute should correspond to the attribute that defines type within sample pairs, and Control can be specified for whatever category relates to the reference sample.
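The role of the two attributes can be sketched as follows: the match ID groups samples into pairs, the type attribute separates case from control, and loci are then compared between the two members of each pair. This is only an illustration of the idea, not how Partek Flow performs paired analysis; the attribute names (Patient, Type), sample names, genotype strings, and the choice to treat a locus absent from one sample as homozygous reference (0/0) are all assumptions.

```python
def pair_samples(samples, match_attr, type_attr, control_label):
    """Group samples by match ID and return only complete case/control pairs."""
    pairs = {}
    for sample in samples:
        role = "control" if sample[type_attr] == control_label else "case"
        pairs.setdefault(sample[match_attr], {})[role] = sample["name"]
    return {mid: p for mid, p in pairs.items() if set(p) == {"case", "control"}}


def differing_loci(case_genotypes, control_genotypes, missing="0/0"):
    """Return sorted loci whose genotype differs between case and control.

    Assumption: a locus not reported for a sample is treated as `missing`.
    """
    loci = case_genotypes.keys() | control_genotypes.keys()
    return sorted(
        locus for locus in loci
        if case_genotypes.get(locus, missing) != control_genotypes.get(locus, missing)
    )


# Hypothetical sample attributes, as they might be entered in the Data tab.
samples = [
    {"name": "P1_T", "Patient": "P1", "Type": "tumor"},
    {"name": "P1_N", "Patient": "P1", "Type": "normal"},
    {"name": "P2_T", "Patient": "P2", "Type": "tumor"},  # no matching normal
]
pairs = pair_samples(samples, "Patient", "Type", "normal")

# Hypothetical genotype calls keyed by (chromosome, position).
case_gt = {("chr1", 1000): "0/1", ("chr1", 2000): "1/1"}
control_gt = {("chr1", 2000): "1/1", ("chr2", 500): "0/1"}
changed = differing_loci(case_gt, control_gt)
print(pairs, changed)
```

In this example the locus at chr1:2000 carries the same non-reference genotype in both samples, so it is not reported as differing between the pair.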
Select Reference sequence will specify the reference assembly to utilize for variant detection. If the alignment was generated in Partek® Flow®, the Assembly will be displayed as text in the section, and you do not have the option to change the reference. In the event that alignment was performed outside of Partek Flow, you will need to select the appropriate Assembly utilized for alignment in the drop-down list. Assemblies previously added to library files (see Library File Management) will be available for selection or New assembly… can be utilized to import the reference sequence to library files from within the task.
Li H, Handsaker B, Wysoker A, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25(16):2078-2079.
Li H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics. 2011;27(21):2987-2993.
Li H. Improving SNP discovery by base alignment quality. Bioinformatics. 2011;27(8):1157-1158.
Advanced options provides a means to tune parameters in the variant detection for optimal performance. Upon invoking the task dialog, Option set is set to Default, and these parameters are provided by the SAMtools developers. Clicking Configure will open a window to tune advanced options. Moving the mouse cursor over the info button will provide details for each parameter. Please refer to the SAMtools documentation for further details on any of these parameters.