1 of 6

Server Management

The Partek Flow server can be maintained using the tools covered in this section.

Backing Up the Database
System Administrator Guide (Linux)
Diagnosing Issues
Moving Data
Partek Flow Worker Allocator

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

Backing Up the Database

Linux

Open a Linux terminal and enter the following commands.

Obtain root access:

Change to the user account that runs Partek® Flow® . Suppose it is user account flow, then_:_

Then the default home directory should be/home/flow. Run the following command to make the backup database in home/flow directory, the archived file name is flowdbbackup.tar.gz:

Log out of the user account that runs Flow: Ctrl+D

MacOS

Open a Terminal window on the home directory of the user that installed Partek Flow.

Run the following command to make the backup database in user's home directory

The archived file name is flowdbbackup.tar.gz.

Additional Assistance

System Administrator Guide (Linux)

This section provides additional tools that may be useful for system administrators who maintain the Partek Flow server.\

Verifying that Partek Flow is Running on the Server
Locating the log files
Converting a Zip Installation to Use the Package Manager
Transferring Partek Flow to a new machine

Verifying that Partek Flow is Running on the Server

At anytime, you wish to know the status of Partek Flow:

$ service partekflowd status

Possible outputs are RUNNING or STOPPED.

Locating the log files

The Partek Flow server log is contained in the file /opt/partek_flow/logs/catalina.out In some cases, it may be necessary to run $ ps aux | grep flow to determine the current Partek Flow installation path. That should lead us to the most recent catalina.out file.
To zip all the logs of Partek Flow (server and task logs)

zip /opt/partek_flow/logs/

zip /home/flow/.partekflow/logs

(or the whole .partekflow folder that has the database)

These zipped files contain the complete logs of Partek Flow and can be sent to Partek technical support for troubleshooting. Note that running the flowstatus.sh script and sending the error report would upload these files to Partek.

Changing the Temporary Folder Location

By default, temporary files resulting from genomic data uploads to Flow are stored in /opt/partek_flow/temp and are removed upon upload completion. If Flow is installed on a small root partition (<20GB), exceedingly large uploads may fill the root partition and necessitate moving this temporary directory to a larger partition. In order to select a new Flow temp folder, complete the following logged in as root:

Shutdown Flow

# service partekflowd stop

For this example we will use the new temporary folder location of /home/flow/partek_flow/temp. Adjust this path to meet your needs.
Open the configuration file /etc/partekflow.conf and append the following line to the end of the file:

CATALINA_TMPDIR=/home/flow/partek_flow/temp

Ensure the new temporary directory exists and is writeable by Flow. If you use a different Linux user account to run Flow, make sure this folder can be modified by this user.

# mkdir -p /home/flow/partek_flow/temp

# chown flow:flowuser /home/flow/partek_flow/temp

Start Flow

# service partekflowd start

Converting a Zip Installation to Use the Package Manager

If you have used a .zip file to install a previous build of Partek Flow and you wish to convert your installation to a package manager, we recommend that you contact the Partek Licensing Support for assistance in this process. Briefly, we describe the conversion steps below.

$ cd

$ ~/partek_flow/stop_flow.sh

Ensure Partek Flow is no longer running. If the output contains only "grep bin/flow" this requirement is met.

$ ps aux | grep bin/flow

If Partek Flow is running and repeating step 1 above does not shut down the Partek Flow server, then use the following command where PID is the process ID of Partek Flow. The PID is found from the output of step 2 above, column two

$ kill -9 PID

Backup the existing Partek Flow database and installation directories. Substitute the example paths below with those specific to the Partek Flow installation.

$ cp -r ~/.partekflow ~/.partekflow_backup

$ mv ~/partek_flow ~/partek_flow_backup

Follow the Installation steps relevant to the Linux distribution on the Partek Flow server.

For Debian/Ubuntu: Upon reaching Configure Partek Flow installation settings enter

$ sudo dpkg-reconfigure partekflow

For Redhat/Fedora/Centos: Edit the following file: /etc/partekflow.conf

These prompts set the existing Linux account name and home directory used to run the previous Partek Flow server installation.

Transferring Partek Flow to a new machine

Contact your Account Manager or email licensing@partek.com to request for transfer and to obtain a new license.dat file based on the Host ID of your new machine. Follow the steps below to move the Partek Flow license and database:

On OLD MACHINE

Shutdown existing Partek Flow installation: $ sudo service partekflowd stop
Backup Partek Flow database:

$ sudo su - flow

$ tar -czvf partekflowdb.bkup.tgz /home/flow/.partekflow

Copy partekflowdb.bkup.tgz to new machine
Remove existing Partek Flow installation: Debian/Ubuntu:

$ sudo apt-get remove partekflow partekflow-bin

RedHat/Fedora/CentOS

$ sudo yum remove partekflow partekflow-bin

On NEW MACHINE

Install Partek Flow as described earlier in this document. When prompted for license, paste the license generated for the new machine.
Shutdown Partek Flow to install previous database:

$ sudo service partekflowd stop

Unpack partekflowdb.bkup.tgz:

$ sudo su - flow

$ tar -xzvf partekflowdb.bkup.tgz

$ exit

Restart Partek Flow

$ sudo service partekflowd start

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

Diagnosing Issues

Partek Flow comes with a standalone diagnostic script that reports how Partek Flow is installed and detects common installation problems.

flowstatus.sh script

This script can be run independently of Partek Flow as installation issues or crashes can prevent Partek Flow from starting. This utility gathers Partek Flow log files and server information which, upon customer approval, will be sent to Partek so our support team has all requisite information to address the issue. Some examples of when this script should be run include:

Support team needs additional information and will request this script be run
Partek Flow crashes or is otherwise inaccessible
Partek Flow is unable to see input or output directories or projects are suddenly missing
Unexpected behavior after a Partek Flow or system update
Tasks fail to run due to missing files or directory permission issues

When a task fails, the first course of action is to enter its task's details page (Figure 1), then click on the button labeled Send logs to Partek. This creates a support ticket and you will be contacted. In some cases the task failure logs sent when clicking on this button do not contain adequate information. In this case, Partek Technical Support team will request that you run this script. Whenever possible, please run this script as the root user to ensure that system log information is collected.

If you are unable to install Partek Flow, this script will not be available. Please contact Partek Technical Support if you cannot install Partek Flow.

Running flowstatus.sh via the command line

Locate the Partek Flow installation directory. This is defined as the FLOWhome variable in the file /etc/partekflow.conf.

For this example, we assume the Partek Flow install directory is /opt/partek_flow. If it is not, replace it with the directory found in step 1.

Run the script

After the script is run, a report will appear on the screen, then you will be asked if you wish to upload this report to Partek. If this report is uploaded, you will be contacted by support team who will assist with your issue.

Running flowstatus.sh via the command line (alternative method)

If you get an error saying, "No such file or directory" or you are unable to find the flowstatus.sh script on your system, then run the following:

Linux

MacOS

Interpreting flowstatus.sh reports

When running the flowstatus.sh script, you will see a report similar to Figure 2.

The relevant details of the report are:

Script running as Linux user: The user account the flowstatus.sh script was run under

Flow status: Is the Partek Flow server running or not?

Flow is running as Linux user: The user account under which the Partek Flow server runs. This defaults to 'flow', however, this could have been changed to ameliorate permission issues by running Partek Flow under the same user that is the primary user of this server (i.e. the user that logs into and uses the desktop on this server).

Flow installation method: For all default installs, Partek Flow is installed with the package manager. If this is not your installation method, you are advised to contact Partek support in order to maintain your Partek Flow installation or assist with installation issues. The conversion steps are described in the next section.

Flow install directory: By default, this should be /opt/partek_flow. If this is not the case, the upgrade process for Partek Flow becomes more involved.

Flow database directory: This is a relatively small directory that stores all Partek Flow configuration and information about analysis and projects generated by Partek Flow. It is crucial that this directory be backed up regularly. If it is removed or corrupted, ALL projects in Partek Flow disappear. The actual raw input and output files for all projects are not lost, however.

After displaying Partek Flow configuration information, several installation checks are performed. This covers common issues that can break a Partek Flow installation such as full disks or running Partek Flow under the wrong user account.

Sending the error report to Partek

At the end of the report, you will be given an option to send the error to Partek (Figure 3).

Additional Assistance

Moving Data

If you wish to move a Partek Flow project data from one disk to another and continue using the project, please follow the steps below:

Use rsync to copy the project data from the old disk to the new disk. For example, if you wish to move Project A from an internal disk to an external drive mounted as hdd:

$ rsync -avr /home/user/FlowData/Project_A /mnt/HDD/FlowData2/Project_A

Using rsync -avr guarantees that the time stamps would not change.\

In Partek Flow, logon as a user with Admin privileges. Go to Settings>Directory permissions page
In the Directory Reassignment section, type the Former directory and select the Current directory (Figure 1)

For example:

Former directory: /home/user/FlowData/Project_A
Current directory: /mnt/HDD/FlowData2/Project_A

Select Reassign button
Open the project.

In the Analyses tab, check if the Project disk space on the lower right has been updated
In the Data tab, check the project output directory has changed to the new directory\

You can now remove the former directory

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

Partek Flow Worker Allocator

A worker is a program installed on a single computer within a computer cluster and receives job requests from the Partek Flow server. The program will determine if the computer has the needed resources to complete the requested job, and based on the answer, deny or accept the job and report when completed. Workers can be allocated using the tool below.

Note: This tool requires the Flow rest API license feature

Obtaining the Allocator

Configuring the Allocator

The configuration file flow.worker.allocator.json must exist in the home directory. If this file is not present, an example configuration file will be written by running the allocator like so:

The allocator will exit immediately so the example configuration file can be edited. This configuration file is documented in the Configuration File Format section below.

Note: The system’s temp folder (e.g. /tmp) must be writable and mounted with execute permissions in order for the allocator to run.

Authenticating the Allocator to Flow

After flow.worker.allocator.json has been configured, run the allocator:

When prompted, enter the Flow administrator username and password. Upon success, an authentication token is stored in ~/flow.auth.json which is used for all following authentication attempts. If this token is removed, the allocator will again prompt for a Flow username and password. This token only allows access to the Flow rest API and if compromised can not be used to derive Flow account passwords.

Starting the Allocator as a Background Process

The allocator takes no arguments. All configuration is stored in ~/flow.worker.allocator.json.

Stopping the Allocator

If the allocator was run as a foreground process, CONTROL-C or SIGTERM will cause the process to exit.

Logging

The allocator writes daily rotated log files to ~/flow.worker.allocator.log. For verbose output, set DebugMode to 1 in the configuration file.

Allocation Criteria

Once configured, the allocator will poll the Flow server every CheckIntervalSec seconds to ask if there is pending work that would be able to start immediately if another Flow worker was added. If true, WorkerCounterCmd is used to query the job scheduler to see how many Flow workers have been allocated. If this is below the WorkerResourceLimit : MaxWorkers limit, one worker is allocated using WorkerAllocatorCmd.

It is recommended that WorkerResourceLimit : IdleShutdownMin be relatively short so that allocations are elastic: large workloads are able to allocate all available resources and quickly return those resources to the job scheduler when Flow workers are idle.

Configuration File Format

FlowAPITimeoutSec : integer

The length of time in seconds the allocator will wait for a response from the Flow server.

CheckIntervalSec : integer

The length of time in seconds the allocator will ask Flow if a worker is needed.

InfrastructureWaitTillUpdateTimeSec : integer

The allocator will communicate with the internal resource allocation infrastructure to see how many workers are running. In most cases, this infrastructure is a job scheduler (e.g. torque, lsf, sge) where there can be a delay between the request of resources and the acknowledgement that the request has been made. This parameter tells the allocator to wait for the job scheduler to update before making any further allocation decisions. Note: InfrastructureWaitTillUpdateTimeSec should be less than CheckIntervalSec.

DebugMode : integer

If set to 1, the allocator will use verbose logging. This includes reporting on all allocation decisions.

FlowExternalServerURL : string

The URL used to log into Flow. For example: http://flow.server.url:8080

This URL must be network accessible from the server running the allocator. If the allocator is running on the same server as the Flow server, this URL is likely to be http://localhost:8080

FlowServerWorkerConnectionHost : string

The DNS name or IP of the Flow server from the worker node’s perspective. In most cases, workers that are launched by a job scheduler run on a private network. This means the name of the Flow server that the worker needs to connect to may be different than the one listed under FlowExternalServerURL.

FlowDaemonUser : string

The Linux user under which job allocation requests are made. This is used when communicating with a job scheduler to query the number of running or pending Flow workers.

WorkerResourceLimit : JSON data

Defines resource limits for every allocated worker. These values are used by RunWorkerCMD and WorkerAllocatorCmd to inform the worker and job scheduler about resource requirements. The following are the resource limit types:

MaxWorkers : string

The maximum number of workers that will be allocated regardless of Flow resource demands. This should not be more than the licensed number of workers or more than the number of jobs the queue will accept for FlowDaemonUser.

MaxCores : string

This defines the maximum number of CPU cores that a worker will use. This should be consistent with the job queue limits.

MaxMemoryMB : string

The maximum amount of memory a worker is allowed to consume. This should be consistent with the job queue limits.

RuntimeMin : string

Maximum lifetime of a worker. This should be less than or equal to the runtime limits imposed by the job queue.

IdleShutdownMin : string

Flow workers that are idle for this number of minutes will auto-shutdown and release their resources back to the job scheduler.

RunWorkerCMD : JSON data

This is used to build the shell command which starts a Flow worker. This command is submitted to the job scheduler. The parameters are as follows:

Type : string

The only supported method at this time is SHELL.

Binary : string

The full path to partekFlowRemoteWorker.sh

Options : JSON data

These options must be consistent with those defined in WorkerResourceLimit and FlowServerWorkerConnectionHost. Each option is appended to the job submission command string in the same order it is defined here. The keys 1 … n are merely placeholders as is arg1. Keys labeled as @self refer to fields (referenced above) in this configuration file where their value (encoded as a simple array) denote the json key hierarchy from where to lookup this value. In most cases changes are not necessary unless a new type of worker limit is being added or removed.

WorkerAllocatorCmd : JSON data

This is used to build the shell command that interacts with the job scheduler. These values require modification based on the job scheduler, queue limits, and submission options.

Type : string

Defines the type of job scheduler. This is just a label and has no functional impact. Examples include: SGE, TORQUE, LSF, SWARMKIT.

Binary : string

The executable used to submit jobs. This must be in your path. Examples: bsub, qsub

Options : JSON data

Keys define the command line options (for example: -x, -q, -M). The values can be strings, null, or @self to read configuration options from this configuration file. @self can contain the key append in order to append static strings to command line values.

WorkerCounterCmd : JSON data

This is used to build the command that asks the job scheduler how many workers have been queued or are running. The output from this command is parsed according to the OutputParser definition.

Type : string

Defines the type of job scheduler. This is just a label and has no functional impact. Examples include: SGE, TORQUE, LSF, SWARMKIT

Binary : string

The executable used to query submitted job information. This must be in the user's path. Examples include: qstat, jobstat

Options : JSON data

OutputParser: JSON data

Currently the only type is LineGrepCount which returns the number of lines output from WorkerCounterCmd that contain the strings defined by LineGrepCount.

Additional Assistance

System Administrator Guide (Linux)

This section provides additional tools that may be useful for system administrators who maintain the Partek Flow server.\

Verifying that Partek Flow is Running on the Server
Locating the log files
Converting a Zip Installation to Use the Package Manager
Transferring Partek Flow to a new machine

Verifying that Partek Flow is Running on the Server

At anytime, you wish to know the status of Partek Flow:

$ service partekflowd status

Possible outputs are RUNNING or STOPPED.

Locating the log files

The Partek Flow server log is contained in the file /opt/partek_flow/logs/catalina.out In some cases, it may be necessary to run $ ps aux | grep flow to determine the current Partek Flow installation path. That should lead us to the most recent catalina.out file.
To zip all the logs of Partek Flow (server and task logs)

zip /opt/partek_flow/logs/

zip /home/flow/.partekflow/logs

(or the whole .partekflow folder that has the database)

Changing the Temporary Folder Location

Shutdown Flow

# service partekflowd stop

For this example we will use the new temporary folder location of /home/flow/partek_flow/temp. Adjust this path to meet your needs.
Open the configuration file /etc/partekflow.conf and append the following line to the end of the file:

CATALINA_TMPDIR=/home/flow/partek_flow/temp

Ensure the new temporary directory exists and is writeable by Flow. If you use a different Linux user account to run Flow, make sure this folder can be modified by this user.

# mkdir -p /home/flow/partek_flow/temp

# chown flow:flowuser /home/flow/partek_flow/temp

Start Flow

# service partekflowd start

Converting a Zip Installation to Use the Package Manager

$ cd

$ ~/partek_flow/stop_flow.sh

Ensure Partek Flow is no longer running. If the output contains only "grep bin/flow" this requirement is met.

$ ps aux | grep bin/flow

If Partek Flow is running and repeating step 1 above does not shut down the Partek Flow server, then use the following command where PID is the process ID of Partek Flow. The PID is found from the output of step 2 above, column two

$ kill -9 PID

Backup the existing Partek Flow database and installation directories. Substitute the example paths below with those specific to the Partek Flow installation.

$ cp -r ~/.partekflow ~/.partekflow_backup

$ mv ~/partek_flow ~/partek_flow_backup

Follow the Installation steps relevant to the Linux distribution on the Partek Flow server.

For Debian/Ubuntu: Upon reaching Configure Partek Flow installation settings enter

$ sudo dpkg-reconfigure partekflow

For Redhat/Fedora/Centos: Edit the following file: /etc/partekflow.conf

These prompts set the existing Linux account name and home directory used to run the previous Partek Flow server installation.

Transferring Partek Flow to a new machine

On OLD MACHINE

Shutdown existing Partek Flow installation: $ sudo service partekflowd stop
Backup Partek Flow database:

$ sudo su - flow

$ tar -czvf partekflowdb.bkup.tgz /home/flow/.partekflow

Copy partekflowdb.bkup.tgz to new machine
Remove existing Partek Flow installation: Debian/Ubuntu:

$ sudo apt-get remove partekflow partekflow-bin

RedHat/Fedora/CentOS

$ sudo yum remove partekflow partekflow-bin

On NEW MACHINE

Install Partek Flow as described earlier in this document. When prompted for license, paste the license generated for the new machine.
Shutdown Partek Flow to install previous database:

$ sudo service partekflowd stop

Unpack partekflowdb.bkup.tgz:

$ sudo su - flow

$ tar -xzvf partekflowdb.bkup.tgz

$ exit

Restart Partek Flow

$ sudo service partekflowd start

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

Partek Flow Worker Allocator

Note: This tool requires the Flow rest API license feature

Obtaining the Allocator

Configuring the Allocator

The configuration file flow.worker.allocator.json must exist in the home directory. If this file is not present, an example configuration file will be written by running the allocator like so:

The allocator will exit immediately so the example configuration file can be edited. This configuration file is documented in the Configuration File Format section below.

Note: The system’s temp folder (e.g. /tmp) must be writable and mounted with execute permissions in order for the allocator to run.

Authenticating the Allocator to Flow

After flow.worker.allocator.json has been configured, run the allocator:

$ ./flow_worker_allocator

Starting the Allocator as a Background Process

$ nohup ./flow_worker_allocator 2>/dev/null &

The allocator takes no arguments. All configuration is stored in ~/flow.worker.allocator.json.

Stopping the Allocator

$ killall flow_worker_allocator

If the allocator was run as a foreground process, CONTROL-C or SIGTERM will cause the process to exit.

Logging

The allocator writes daily rotated log files to ~/flow.worker.allocator.log. For verbose output, set DebugMode to 1 in the configuration file.

Allocation Criteria

Configuration File Format

FlowAPITimeoutSec : integer

The length of time in seconds the allocator will wait for a response from the Flow server.

CheckIntervalSec : integer

The length of time in seconds the allocator will ask Flow if a worker is needed.

InfrastructureWaitTillUpdateTimeSec : integer

DebugMode : integer

If set to 1, the allocator will use verbose logging. This includes reporting on all allocation decisions.

FlowExternalServerURL : string

The URL used to log into Flow. For example: http://flow.server.url:8080

This URL must be network accessible from the server running the allocator. If the allocator is running on the same server as the Flow server, this URL is likely to be http://localhost:8080

FlowServerWorkerConnectionHost : string

FlowDaemonUser : string

The Linux user under which job allocation requests are made. This is used when communicating with a job scheduler to query the number of running or pending Flow workers.

WorkerResourceLimit : JSON data

MaxWorkers : string

MaxCores : string

This defines the maximum number of CPU cores that a worker will use. This should be consistent with the job queue limits.

MaxMemoryMB : string

The maximum amount of memory a worker is allowed to consume. This should be consistent with the job queue limits.

RuntimeMin : string

Maximum lifetime of a worker. This should be less than or equal to the runtime limits imposed by the job queue.

IdleShutdownMin : string

Flow workers that are idle for this number of minutes will auto-shutdown and release their resources back to the job scheduler.

RunWorkerCMD : JSON data

This is used to build the shell command which starts a Flow worker. This command is submitted to the job scheduler. The parameters are as follows:

Type : string

The only supported method at this time is SHELL.

Binary : string

The full path to partekFlowRemoteWorker.sh

Options : JSON data

WorkerAllocatorCmd : JSON data

This is used to build the shell command that interacts with the job scheduler. These values require modification based on the job scheduler, queue limits, and submission options.

Type : string

Defines the type of job scheduler. This is just a label and has no functional impact. Examples include: SGE, TORQUE, LSF, SWARMKIT.

Binary : string

The executable used to submit jobs. This must be in your path. Examples: bsub, qsub

Options : JSON data

WorkerCounterCmd : JSON data

This is used to build the command that asks the job scheduler how many workers have been queued or are running. The output from this command is parsed according to the OutputParser definition.

Type : string

Defines the type of job scheduler. This is just a label and has no functional impact. Examples include: SGE, TORQUE, LSF, SWARMKIT

Binary : string

The executable used to query submitted job information. This must be in the user's path. Examples include: qstat, jobstat

Options : JSON data

OutputParser: JSON data

Currently the only type is LineGrepCount which returns the number of lines output from WorkerCounterCmd that contain the strings defined by LineGrepCount.

Additional Assistance

If you need additional assistance, please visit to submit a help ticket or find phone numbers for regional support.

Diagnosing Issues

Partek Flow comes with a standalone diagnostic script that reports how Partek Flow is installed and detects common installation problems.

flowstatus.sh script

Support team needs additional information and will request this script be run
Partek Flow crashes or is otherwise inaccessible
Partek Flow is unable to see input or output directories or projects are suddenly missing
Unexpected behavior after a Partek Flow or system update
Tasks fail to run due to missing files or directory permission issues

If you are unable to install Partek Flow, this script will not be available. Please contact Partek Technical Support if you cannot install Partek Flow.

Running flowstatus.sh via the command line

Locate the Partek Flow installation directory. This is defined as the FLOWhome variable in the file /etc/partekflow.conf.

$ grep FLOWhome /etc/partekflow.conf

For this example, we assume the Partek Flow install directory is /opt/partek_flow. If it is not, replace it with the directory found in step 1.

Run the script

$ bash /opt/partek_flow/flowstatus.sh

Running flowstatus.sh via the command line (alternative method)

If you get an error saying, "No such file or directory" or you are unable to find the flowstatus.sh script on your system, then run the following:

Linux

$ wget https://customer.partek.com/flowstatus.sh

$ bash flowstatus.sh

MacOS

$ curl -O  https://customer.partek.com/flowstatus.sh

$ bash flowstatus.sh

Interpreting flowstatus.sh reports

When running the flowstatus.sh script, you will see a report similar to Figure 2.

The relevant details of the report are:

Script running as Linux user: The user account the flowstatus.sh script was run under

Flow status: Is the Partek Flow server running or not?

Flow HTTP port: To use Partek Flow, most users will access the URL . The number associated with this URL is the HTTP port which defaults to 8080. Sometimes this port will be changed to another value. For example, if the port was changed to 8081, you will need to access Partek Flow by visiting the URL .

Flow install directory: By default, this should be /opt/partek_flow. If this is not the case, the upgrade process for Partek Flow becomes more involved.

Sending the error report to Partek

At the end of the report, you will be given an option to send the error to Partek (Figure 3).

In some cases, the https connections may be blocked from the server and sending the error would fail. The logs can be zipped and sent manually using the method described .

Additional Assistance

If you need additional assistance, please visit to submit a help ticket or find phone numbers for regional support.