
Single Node Amazon Web Services Deployment


Last updated 4 months ago


Creating a New Elastic Compute Cloud Instance for Partek Flow Software

Note: This guide assumes that none of the items needed for the Amazon Elastic Compute Cloud (EC2) instance exist yet, such as the Amazon Virtual Private Cloud (VPC), subnets, and security groups, so their creation is covered as well.

Log in to the Amazon Web Services (AWS) management console at https://console.aws.amazon.com

Click on EC2

Switch to the region intended to deploy Partek Flow software. This tutorial uses US East (N. Virginia) as an example.

On the left menu, click on Instances, then click the Launch Instance button. The Choose an Amazon Machine Image (AMI) page will appear.

Click the Select button next to Ubuntu Server 16.04 LTS (HVM), SSD Volume Type - ami-f4cc1de2. NOTE: Please use the latest Ubuntu AMI. It is likely that the AMI listed here will be out of date.

Choose an Instance Type. The selection depends on your budget and the size of the Partek Flow deployment. We recommend m4.large for testing or cluster front-end operation, m4.xlarge for standard deployments, and m4.2xlarge for alignment-heavy workloads with a large user base. See the section AWS instance type resources and costs for assistance with choosing the right instance. In most cases, the instance type and associated resources can be changed after deployment, so you are not locked into the choices made in this step.

NOTE: New instance types will become available. Please use the latest mX instance type provided as it will likely perform better and be more cost effective than older instance types.

On the Configure Instance Details page, make the following selections:

  • Set the number of instances to 1. An autoscaling group is not necessary for single-node deployments

  • Purchasing Option: Leave Request Spot Instances unchecked. This is relevant for cost-minimization of Partek Flow cluster deployments.

  • Network: If you do not have a virtual private cloud (VPC) already created for Partek Flow, click Create New VPC. This will open a new browser tab for VPC management.

    • Use the following settings for the VPC:

      • Name Tag: Flow-VPC

      • IPv4 CIDR block: 10.0.0.0/16

      • Select No IPv6 CIDR Block

      • Tenancy: Default

    • Click Yes, Create. You may be asked to select a DHCP Option set. If so, then make sure the dynamic host configuration protocol (DHCP) option set has the following properties:

      • Options: domain-name = ec2.internal;domain-name-servers = AmazonProvidedDNS;

      • DNS Resolution: leave the defaults set to yes

      • DNS Hostname: change this to yes as internal DNS resolution may be necessary depending on the Partek Flow deployment

    • Once created, the new Flow-VPC will appear in the list of available VPCs. The VPC needs additional configuration for external access. To continue, right click on Flow-VPC and select Edit DNS Resolution, select Yes, and then Save. Next, right click the Flow-VPC and select Edit DNS Hostnames, select Yes, then Save.

    • Make sure the DHCP option set is set to the one created above. If it is not, right-click on the row containing Flow-VPC and select Edit DHCP Option Sets.

    • Close the VPC Management tab and go back to the EC2 Management Console.

  • Click the refresh arrow next to Create New VPC and select Flow-VPC.

  • Click Create New Subnet and a new browser tab will open with a list of existing subnets. Click Create Subnet and set the following options:

    • Name Tag: Flow-Subnet

    • VPC: Flow-VPC

    • VPC CIDRs: This should be automatically populated with the information from Flow-VPC

    • Availability Zone: It is OK to let Amazon choose for you if you do not have a preference

    • IPv4 CIDR block: 10.0.1.0/24

  • Stay on the VPC Dashboard Tab and on the left navigation menu, click Internet Gateways, then click Create Internet Gateway and use the following options:

    • Name Tag: Flow-IGW

    • Click Yes, Create

  • The new gateway will be displayed as Detached. Right click on the Flow-IGW gateway and select Attach to VPC, then select Flow-VPC and click Yes, Attach.

  • Click on Route Tables on the left navigation menu.

  • If it exists, select the route table already associated with Flow-VPC. If not, make a new route table and associate it with Flow-VPC. Click on the route table, then click the Routes tab toward the bottom of the page. The route Destination = 10.0.0.0/16, Target = local should already be present. Click Edit, then click Add another route and set the following parameters:

    • Destination: 0.0.0.0/0

    • Target set to Flow-IGW (the internet gateway that was just created)

  • Click Save

  • Close the VPC Dashboard browser tab and go back to the EC2 Management Console tab. Note that you should still be on Step 3: Configure Instance Details.
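The address ranges used above (10.0.0.0/16 for Flow-VPC and 10.0.1.0/24 for Flow-Subnet) only work together if the subnet range lies inside the VPC range. The following is a minimal sketch of that check in POSIX shell. The function names ip_to_int and cidr_within are illustrative and are not part of any AWS or Partek tool.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address (e.g. 10.0.1.0) to a 32-bit integer.
ip_to_int() {
    # Split the address on dots into $1..$4.
    set -- $(printf '%s' "$1" | tr '.' ' ')
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Return 0 if CIDR block $1 lies entirely inside CIDR block $2.
cidr_within() {
    inner_ip=${1%/*}; inner_len=${1#*/}
    outer_ip=${2%/*}; outer_len=${2#*/}
    # The inner prefix must be at least as long as the outer prefix.
    [ "$inner_len" -ge "$outer_len" ] || return 1
    # Network mask corresponding to the outer prefix length.
    mask=$(( (0xFFFFFFFF << (32 - $outer_len)) & 0xFFFFFFFF ))
    inner_net=$(( $(ip_to_int "$inner_ip") & $mask ))
    outer_net=$(( $(ip_to_int "$outer_ip") & $mask ))
    [ "$inner_net" -eq "$outer_net" ]
}

if cidr_within 10.0.1.0/24 10.0.0.0/16; then
    echo "Flow-Subnet range fits inside Flow-VPC"
else
    echo "Flow-Subnet range is outside Flow-VPC"
fi
```

If you choose different ranges for your own deployment, the same check applies: the subnet prefix must be longer than or equal to the VPC prefix and share its network bits.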

Click the refresh arrow next to Create New Subnet and select Flow-Subnet.

Auto-assign Public IP: Use subnet setting (Disable)

Placement Group: No placement group

IAM role: None.

Note: For multi-node Partek Flow deployments or instances where you would like Partek to manage AWS resources on your behalf, please see Partek AWS support and set up an IAM role for your Partek Flow EC2 instance. In most cases a specialized IAM role is unnecessary and we only need instance ssh keys.

Shutdown Behaviour: Stop

Enable Termination Protection: select Protect against accidental termination

Monitoring: leave Enable CloudWatch Detailed Monitoring disabled

EBS-optimized Instance: Make sure Launch as EBS-optimized Instance is enabled. Given the recommended choice of an m4 instance type, EBS optimization should be enabled at no extra cost.

Tenancy: Shared - Run a shared hardware instance

Network Interfaces: leave as-is

Advanced Details: leave as-is

Click Next: Add Storage. You should be on Step 4: Add Storage

For the existing root volume, set the following options:

  • Size: 8 GB

  • Volume Type: Magnetic

  • Select Delete on Termination

    • Note: All Partek Flow data is stored on a non-root EBS volume. Since the root volume holds only the OS and the server is rebooted infrequently, a fast root volume is probably not worth the cost. For more information about EBS volumes and their performance, see the section EBS volumes.

Click Add New Volume and set the following options:

  • Volume Type: EBS

  • Device: /dev/sdb (take the default)

  • Do not define a snapshot

  • Size (GiB): 500

    • Note: This is the minimum for ST1 volumes, see: EBS volumes

  • Volume Type: Throughput optimized HDD (ST1)

  • Do not delete on terminate or encrypt

Click Next: Add Tags

  • You do not need to define any tags for this new EC2 instance, but you can if you would like.

Click Next: Configure Security Group

  • For Assign a Security Group select Create a New Security Group

  • Security Group Name: Flow-SG

  • Description: Security group for Partek Flow server

  • Add the following rules:

    • SSH set Source to My IP (or the address range of your company or institution)

    • Click Add Rule:

    • Set Type to Custom TCP Rule

    • Set Port Range to 8080

    • Set Source to anywhere (0.0.0.0/0, ::/0)

      • Note: It is recommended to restrict Source to just those that need access to Partek Flow.
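For readers who prefer to script this step, the security group rules above can be approximated with the AWS command line interface. This is a sketch, not the procedure the guide itself uses: it assumes the AWS CLI is installed and configured, the function name create_flow_sg is illustrative, and the VPC ID and address ranges in the example call are placeholders.

```shell
#!/bin/sh
# Create Flow-SG in a VPC and open SSH (22) and the Partek Flow port (8080).
# $1 = VPC ID
# $2 = CIDR range allowed to connect over SSH
# $3 = CIDR range allowed to reach Partek Flow on port 8080
# Prints the ID of the new security group.
create_flow_sg() {
    sg_id=$(aws ec2 create-security-group \
        --group-name Flow-SG \
        --description "Security group for Partek Flow server" \
        --vpc-id "$1" \
        --query GroupId --output text) || return 1

    # SSH rule, restricted to the caller-supplied range.
    aws ec2 authorize-security-group-ingress --group-id "$sg_id" \
        --protocol tcp --port 22 --cidr "$2" >/dev/null || return 1

    # Custom TCP rule for the Partek Flow web interface.
    aws ec2 authorize-security-group-ingress --group-id "$sg_id" \
        --protocol tcp --port 8080 --cidr "$3" >/dev/null || return 1

    echo "$sg_id"
}

# Example call (placeholder VPC ID and documentation-only address range):
# create_flow_sg vpc-0123456789abcdef0 203.0.113.0/24 203.0.113.0/24
```

Passing your institution's address range as the third argument, rather than 0.0.0.0/0, follows the recommendation above to restrict who can reach Partek Flow.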

Click Review and Launch

  • The AWS console will suggest that this server not be booted from a magnetic volume. Since there is little I/O on the root partition and reboots will be rare, choosing Continue with Magnetic will reduce costs. An SSD volume will not provide a substantial benefit, but it is OK to use one if you wish. See the EBS Volumes section for more information.

Click Launch

Create a new keypair:

  • Name the keypair Flow-Key

  • Download this keypair, then run chmod 600 Flow-Key.pem (the downloaded key) so it can be used.

  • Back up this key, as you may lose access to the Partek Flow instance without it.

The new instance will now boot. Use the left navigation bar and click on Instances. Click the pencil icon and assign the instance the name Partek Flow Server

Enabling External Access to the Partek Flow Elastic Compute Cloud Instance

The server should be assigned a fixed IP address. To do this, click on Elastic IPs on the left navigation menu from the EC2 Management Console.

  • Click Allocate New Address

  • Assign Scope to VPC

  • Click Allocate

On the table containing the newly allocated elastic IP, right click and select Associate Address

  • For Instance, select the instance named Partek Flow Server

  • For Private IP, select the one private IP available for the Partek Flow EC2 instance, then click Associate
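The Elastic IP steps above can also be scripted. The sketch below assumes a configured AWS CLI. The function name assign_elastic_ip is illustrative, and the instance ID in the example call is a placeholder.

```shell
#!/bin/sh
# Allocate an Elastic IP with VPC scope and associate it with an instance.
# $1 = EC2 instance ID. Prints the allocated public IP address.
assign_elastic_ip() {
    # Allocate the address and keep its allocation ID.
    alloc_id=$(aws ec2 allocate-address --domain vpc \
        --query AllocationId --output text) || return 1

    # Look up the public IP that belongs to this allocation.
    public_ip=$(aws ec2 describe-addresses --allocation-ids "$alloc_id" \
        --query 'Addresses[0].PublicIp' --output text) || return 1

    # Bind the address to the instance.
    aws ec2 associate-address --instance-id "$1" \
        --allocation-id "$alloc_id" >/dev/null || return 1

    echo "$public_ip"
}

# Example call (placeholder instance ID):
# assign_elastic_ip i-0123456789abcdef0
```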

Note: For the remaining steps, we refer to the Elastic IP as elastic.ip

SSH to the new Partek Flow Server instance:

$ chmod 600 Flow-Key.pem
$ ssh -i Flow-Key.pem ubuntu@elastic.ip

Attaching the Amazon Elastic Block Store Volume for Partek Flow Data Storage

Attach, format, and move the ubuntu home directory onto the large ST1 elastic block store (EBS) volume. All Partek Flow data will live in this volume. Consult the AWS EC2 documentation for further information about attaching EBS volumes to your instance.

$ sudo su
$ mkfs -t ext4 /dev/xvdb

Note: Under Volumes in the EC2 management console, inspect Attachment Information. It will likely list the large ST1 EBS volume as attached to /dev/sdb. Replace "s" with "xv" to find the device name to use for this mkfs command.
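The renaming rule in the note above can be written as a one-line substitution. This is only a sketch of that rule: the function name kernel_device_name is illustrative, and the actual device name on your instance should be confirmed (for example with lsblk) before formatting anything.

```shell
#!/bin/sh
# Translate the device name shown in the EC2 console (/dev/sdX)
# into the name described in the note above (/dev/xvdX).
kernel_device_name() {
    printf '%s\n' "$1" | sed 's|^/dev/sd|/dev/xvd|'
}

kernel_device_name /dev/sdb   # prints /dev/xvdb
```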

Make a note of the newly created UUID for this volume

Copy the ubuntu home directory onto the EBS volume using a temporary mount point:

$ mount -t ext4 /dev/xvdb /mnt/
$ rsync -avr /home/ /mnt/
$ umount /mnt/

Make the EBS volume mount at system boot:

Add the following to /etc/fstab: UUID=the-UUID-from-the-mkfs-command-above /home ext4 defaults,nofail 0 2

$ mount -a
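The /etc/fstab entry described above can be generated rather than typed by hand, which avoids copy errors in the UUID. This is a minimal sketch: fstab_entry is an illustrative helper name, and the commented blkid lines assume the data volume is /dev/xvdb as in the earlier steps.

```shell
#!/bin/sh
# Build the /etc/fstab entry that mounts the data volume on /home.
# $1 = filesystem UUID of the EBS volume.
fstab_entry() {
    printf 'UUID=%s /home ext4 defaults,nofail 0 2\n' "$1"
}

# On the instance, the UUID can be read back with blkid and the
# entry appended to /etc/fstab, for example:
#   uuid=$(sudo blkid -s UUID -o value /dev/xvdb)
#   fstab_entry "$uuid" | sudo tee -a /etc/fstab

fstab_entry "the-UUID-from-the-mkfs-command-above"
```

The nofail option lets the instance finish booting even if the volume is not attached, which keeps SSH access available for troubleshooting.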

Disconnect the ssh session, then log in again to make sure all is well

Installing Partek Flow on a New Elastic Compute Cloud Instance

Note: For additional information about Partek Flow installations, see our generic Installation Guide

Before beginning, send the media access control (MAC) address of the EC2 instance to licensing@partek.com. The output of ifconfig will suffice. Given this information, Partek employees will create a license for your AWS server. MAC addresses will remain the same after stopping and starting the Partek Flow EC2 instance. If the MAC address does change, let our licensing department know and we can add your license to our floating license server or suggest other workarounds.
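If ifconfig is not installed on the instance, the MAC address can also be read from the Linux sysfs tree. The sketch below assumes a Linux system that exposes /sys/class/net. The function name list_macs is illustrative.

```shell
#!/bin/sh
# Print "interface MAC" for every network interface except loopback.
# $1 = directory to scan (defaults to /sys/class/net on Linux).
list_macs() {
    net_dir=${1:-/sys/class/net}
    for dev in "$net_dir"/*; do
        name=${dev##*/}
        # Skip the loopback interface and anything without an address file.
        [ "$name" = "lo" ] && continue
        [ -r "$dev/address" ] || continue
        printf '%s %s\n' "$name" "$(cat "$dev/address")"
    done
    return 0
}

list_macs
```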

Install required packages for Partek Flow:

$ sudo apt-get update
$ sudo apt-get install software-properties-common
$ sudo add-apt-repository -y ppa:openjdk-r/ppa
$ sudo apt-get install openjdk-8-jdk python python-pip python-dev zlib1g-dev python-matplotlib r-base python-htseq libxml2-dev perl make gcc g++ zlib1g libbz2-1.0 libstdc++6 libgcc1 libncurses5 libsqlite3-0 libfreetype6 libpng12-0 zip unzip libgomp1 libxrender1 libxtst6 libxi6 debconf 
$ sudo pip install --upgrade pip && pip install --upgrade --upgrade-strategy eager --force-reinstall virtualenv numpy pysam cnvkit

Install Partek Flow:

Note: Make sure you are running as the ubuntu user.

$ cd  # we will install Partek Flow to ubuntu's home directory
$ wget --content-disposition packages.partek.com/linux/flow-release
$ unzip PartekFlow*.zip
$ ./partek_flow/start_flow.sh

Partek Flow has finished loading when you see INFO: Server startup in xxxxxxx ms in the partek_flow/logs/catalina.out log file. This takes ~30 seconds.
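Rather than watching the log by eye, a short polling loop can wait for the startup message. This is a sketch under the assumption that the message text and log path are exactly as stated above. The function name wait_for_flow and the default timeout of 120 seconds are illustrative choices.

```shell
#!/bin/sh
# Wait until the log reports that server startup has finished.
# $1 = path to the log file, $2 = timeout in seconds (default 120).
# Returns 0 once the message is found, 1 if the timeout expires.
wait_for_flow() {
    log=$1
    remaining=${2:-120}
    while [ "$remaining" -gt 0 ]; do
        if grep -q "Server startup in" "$log" 2>/dev/null; then
            return 0
        fi
        sleep 1
        remaining=$(( $remaining - 1 ))
    done
    return 1
}

# Example, run from the ubuntu home directory after start_flow.sh:
# wait_for_flow partek_flow/logs/catalina.out 120 && echo "Partek Flow is up"
```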

Enter license key

Set up the Partek Flow admin account

Leave the library file directory at its default location and check that the free space listed for this directory is consistent with what was allocated for the ST1 EBS volume.

Done! Partek Flow is ready to use.

Partek Amazon Web Services Support

After the EC2 instance is provisioned, we are happy to assist with setting up Partek Flow or address other issues you encounter with the usage of Partek Flow. The quickest way to receive help is to allow us remote access to your server by sending us Flow-Key.pem and amending the SSH rule for Flow-SG to include access from IP 97.84.41.194 (Partek HQ). We recommend sending us the Flow-Key.pem via secure means. The easiest way to do this is with the following command:

$ curl -F "file=@Flow-Key.pem" https://installfeedback.partek.com/fupload

We also provide live assistance via GoTo meeting or TeamViewer if you are uncomfortable with us accessing your EC2 instance directly. Before contacting us, please run $ ./partek_flow/flowstatus.sh to send us logs and other information that will assist with your support request.

General Recommendations

The network performance of the EC2 instance type becomes an important factor if the primary usage of Partek Flow is for alignment. For this use case, one will have to move copious amounts of data back (input fastq files) and forth (output bam files) between the Partek Flow server and the end users, thus it is important to have as what AWS refers to as high network performance which for most cases is around 1 Gb/s. If the focus is primarily on downstream analysis and visualization (e.g. the primary input files are ADAT) then network performance is less of a concern.

We recommend HVM virtualization, as we have not seen any performance impact from using it, and non-HVM instance types can come with significant deployment barriers.

Make sure your instance type is EBS-optimized by default, so that you are not charged a surcharge for EBS optimization.

T-class servers, although cheap, may slow responsiveness for the Partek Flow server and generally do not provide sufficient resources.

We do not recommend placing any data on instance store volumes since all data is lost on those volumes after an instance stops. This is too risky as there are cases where user tasks can take up unexpected amounts of memory forcing a server stop/reboot.

Amazon Web Services Instance Type Resources and Costs

| Instance Type | Memory | Cores | EBS throughput | Network Performance | Monthly cost |
| --- | --- | --- | --- | --- | --- |
| m4.large | 8.0 GB | 2 vCPUs | 56.25 MB/s | Medium | $78.84 |
| r4.large | 15.25 GB | 2 vCPUs | 50 MB/s | High (+10G interface) | $97.09 |
| m4.xlarge | 16.0 GB | 4 vCPUs | 93.75 MB/s | High | $156.95 |
| r4.xlarge | 30.5 GB | 4 vCPUs | 100 MB/s | High | $194.18 |
| m4.2xlarge | 32.0 GB | 8 vCPUs | 125 MB/s | High | $314.63 |
| r4.2xlarge | 61.0 GB | 8 vCPUs | 200 MB/s | High (+10G interface) | $388.36 |

Single server recommendation: m4.xlarge or m4.2xlarge

Elastic Block Store Volumes

Choice of a volume type and size:

This depends on the type of workload. For most users, the Partek Flow server tasks are alignment-heavy, so we recommend a throughput optimized HDD (ST1) EBS volume, since most aligner operations are sequential in nature. For workloads that focus primarily on downstream analysis, a general purpose SSD volume will suffice, but the costs are greater. For those who focus on alignment or host several users, the storage requirements can be high. ST1 EBS volumes have the following characteristics:

Max throughput 500 MiB/s

$0.045 per GB-month of provisioned storage ($22.50 per month for 500 GB of storage).
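
The monthly storage cost scales linearly with the provisioned size, so other volume sizes are easy to estimate. The sketch below assumes the $0.045 per GB-month ST1 price quoted above, which may have changed since; the helper name ebs_monthly_cost and the 2000 GB example are illustrative.

```shell
# Monthly cost in dollars = provisioned GB x price per GB-month.
ebs_monthly_cost() {
  # Second argument overrides the default price (the ST1 figure quoted above).
  awk -v gb="$1" -v price="${2:-0.045}" 'BEGIN { printf "%.2f\n", gb * price }'
}

ebs_monthly_cost 500    # -> 22.50
ebs_monthly_cost 2000   # -> 90.00
```

Pass a different price as the second argument to compare against another volume type.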

Note that EBS volumes can be grown or have their performance characteristics changed. To minimize costs, start with a smaller EBS volume allocation of 0.5 - 2 TB, as most mature Partek Flow installations generate roughly this amount of data. When necessary, the EBS volume and the underlying file system can be grown on-line (making ext4 a good choice). Shrinking is also possible but may require the Partek Flow server to be offline.

With newer EC2 instance types, it is possible to change the instance type of an already deployed Partek Flow EC2 server. We recommend doing several rounds of benchmarks with production-sized workloads and evaluating whether the resources allocated to your Partek Flow server are sufficient. You may find that reducing the resources allocated to the Partek Flow server comes with significant cost savings, but it can cause UI responsiveness and job run-times to reach unacceptable levels. Once you have found an instance type that works, you may wish to use reserved instance pricing, which is significantly cheaper than on-demand instance pricing. Reserved instances come with 1 or 3-year usage terms. Please see the EC2 Reserved Instance Marketplace to sell or purchase existing reserved instances at reduced rates.

The values in the Amazon Web Services Instance Type Resources and Costs table were updated in April 2017. The latest pricing and EC2 resource offerings can be found at http://www.ec2instances.info

Network performance values for US-EAST-1 correspond to: Low ~ 50 Mb/s, Medium ~ 300 Mb/s, High ~ 1 Gb/s.

Alternative: Install Flow with Docker. Our base packages are located here: https://hub.docker.com/r/partekinc/flow/tags

Open Partek Flow with a web browser: http://elastic.ip:8080/

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.