Importing Data

Partek Flow can import a wide variety of data types including raw counts, matrices, microarray, variant call files as well as unaligned and aligned NGS data.

  • Navigating the file browser to transfer files to the server

  • Associate fastq files for multi-omic data

  • SFTP File Transfer Instructions

  • Sample Table from a Text File

  • Import single cell data

  • Importing 10x Genomics Matrix Files

  • Importing and Demultiplexing Illumina BCL Files

  • Partek Flow Uploader for Ion Torrent

  • Importing 10x Genomics .bcl Files

  • Import a GEO / ENA project

The following file types are valid and will be recognized by the Partek Flow file browser.

  • bam

  • bcf

  • bcl

  • bgx

  • bpm

  • cbcl

  • CEL

  • csv

  • fa

  • fasta

  • fastq

  • fcs

  • fna

  • fq

  • gz

  • h5ad

  • h5 matrix

  • idat

  • loom

  • mtx

  • probe_tab

  • qual

  • raw

  • rds

  • sam

  • sff

  • sra

  • tar

  • tsv

  • txt

  • vcf

  • zip

In cases where paired end fastq data is present, files will also be automatically recognized and their paired relationship will be maintained throughout the analysis.

Paired-end files are matched by file name: every character in both file names must match, except for the section that identifies a file as the first or the second of the pair. For instance, if the first file name contains "_R1", "_1", "_F3", or "_F5", the second file name must contain a corresponding identifier along the lines of "_R2", "_2", "_F5", "_F5-P2", "_F5-BC", "_R3", "_R5", etc. The identifying section must be separated from the rest of the file name by underscores or dots. If two conflicting identifiers are present, the file is treated as single end. For example, s_1_1 matches s_1_2, as described above. However, s_2_1 does not mate with s_1_2, and the files will be treated as two single-end files.
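The matching rule can be sketched in a few lines of shell. This is only an illustration, not Partek Flow's actual implementation: the helper names expected_mate and is_pair are hypothetical, only the "_R1" and "_1" identifiers are handled, and when a name contains several candidates the sketch assumes the last one is the read identifier.

```shell
# Hypothetical helper: print the name a second-read mate would need to have.
# Assumption: only "_R1" and "_1" are handled, and when several candidates
# appear in a name, the last one is treated as the read identifier.
expected_mate() {
    printf '%s\n' "$1" | sed -E 's/(.*)_(R?)1([._]|$)/\1_\22\3/'
}

# Succeeds when the second name is exactly the expected mate of the first.
is_pair() {
    [ "$1" != "$2" ] && [ "$(expected_mate "$1")" = "$2" ]
}

is_pair s_1_1.fastq s_1_2.fastq && echo "paired"
is_pair s_2_1.fastq s_1_2.fastq || echo "treated as two single-end files"
```

Running a check like this on a folder of fastq files before upload can reveal naming mismatches that would otherwise lead to mates being imported as separate single-end samples.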

Apart from paired-end data, files with conventional filename suffixes that indicate that they belong to the same sample are consolidated. These suffixes include:

  • Adapter sequences

    • "bbbbbb" followed by "_" or at the end of the file name, where each "b" is "A", "C", "G", or "T"

  • Lane numbers

    • "L###" followed by "_" or at the end of the file name, where each "#" is a digit 0 to 9

  • Dates

    • in the form "####-##-##" preceded or followed by a period or underscore

  • Set number

    • of the form "_###" from the end
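As a rough illustration of consolidation, the sketch below removes a lane token so that files from different lanes of one sample reduce to the same name. It is not Partek Flow's own logic: the helper name strip_lane is hypothetical, and it assumes the token is written as "_L" plus three digits and is followed by an underscore, a dot, or the end of the name.

```shell
# Hypothetical helper: drop an Illumina-style lane token so that files from
# different lanes of one sample reduce to the same name.
# Assumption: the token looks like "_L001" and is followed by "_", "." or
# the end of the name.
strip_lane() {
    printf '%s\n' "$1" | sed -E 's/_L[0-9]{3}([._]|$)/\1/'
}

strip_lane sampleA_L001_R1.fastq.gz
strip_lane sampleA_L002_R1.fastq.gz
```

Both calls print the same name, which is the sense in which the two lane files belong to one sample.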

Navigating the file browser to transfer files to the server

The file browser is used to transfer files to the server so that these files can be added to a project for analysis. If you are importing a Bioproject from GEO/ENA or using URLs for data import, there is no need to transfer the files to the server.

To access the file browser and upload data to the server, use any of these options:

  • access Transfer files on the Partek® Flow® homepage

  • within a project, after selecting the file type to transfer, use the transfer files link available in all file import options

  • from the settings, go to Access management > Transfer files

Using the file browser to transfer files to the server:

  • Click Transfer files to access the file browser

  • Drag and drop or click My Device to add files from your machine

  • Click Browse to modify the Upload directory or create a new folder. The Upload directory should be specified, known, and distinguishable for project file management. You will return to this directory and access the files to import them into a project

  • To continue to add more files use + Add more in the top right corner. To cancel the process select Cancel in the top left corner

  • Click Upload to complete the file upload

  • Do not exit the browser tab or let the computer go to sleep or shut down until the transfer has completed

File sizes in the table are displayed in binary units, not decimal units (e.g. GB in the table means gibibyte, not gigabyte). 1 gibibyte is 1,073,741,824 bytes; 1 gigabyte is 1,000,000,000 bytes; 1 gibibyte is approximately 1.074 gigabytes.
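When comparing a size shown in the table with a size reported in decimal units elsewhere, the conversion is a single multiplication. A minimal sketch, with gib_to_gb as a hypothetical helper name:

```shell
# Convert a size shown in binary units (GiB) to decimal gigabytes (GB).
gib_to_gb() {
    LC_ALL=C awk -v gib="$1" 'BEGIN { printf "%.3f\n", gib * 1073741824 / 1000000000 }'
}

gib_to_gb 1     # prints 1.074
gib_to_gb 50    # prints 53.687
```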

Associate fastq files for multi-omic data

There are projects with more than one file type, such as single cell multi-omic assays that generate protein and RNA together. In these cases, files need to be associated with each other if starting in fastq format. If we start with processed data, there is no need to associate these files.

  • Define the type of data the file represents when importing files into the project.

Select data type

If this step is skipped, the data type can be changed after import by right clicking the data node.

Change data type
  • After importing both types of data, associate fastqs with the already imported data (e.g. associate RNA fastqs with ATAC fastqs already imported into Partek Flow).

Associate files with this sample

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

SFTP File Transfer Instructions

  • Introduction

  • SFTP with WinSCP

  • SFTP with FileZilla

  • SFTP command line usage

  • Points of Caution

Only use this method if you encounter issues using the Transfer files function on the Flow homepage. Contact support@partek.com to obtain a private key.

Introduction

The following instructions detail the use of SFTP (SSH File Transfer Protocol) to transfer data to and from your Partek Flow instance. SFTP offers significant performance and security enhancements over FTP for file transfers. It also enables the use of robust file syncing utilities, e.g. RSYNC, and is compatible with common file transfer programs such as FileZilla and WinSCP.

To transfer files with SFTP, you will need to have your Partek Flow:

  • Server Name. Example: myname.partek.com

  • Username. Example: flowloginname

  • Private authentication key

This information should have been e-mailed to you from the Partek licensing team. If you lose this information, contact Partek support and we will resend your authentication key to you.

SFTP with WinSCP

WinSCP is an open source, free SFTP client for Windows. Its main function is file transfer between a local and a remote computer.

Downloading WinSCP

To download WinSCP, visit WinSCP's official site: https://winscp.net/eng/download.php. On the WinSCP page, you may need to scroll down a bit to reach the green Download WinSCP button.

Figure 1. Download button on WinSCP's official web site. Note: the version may change from the time of writing of this document

Connecting to your Partek server with WinSCP

Download and install WinSCP on your local computer and then launch the program.

  • On the Login page click on the New Site icon.

Figure 2. Adding NewSite on WinSCP's Login page

  • Type in the Host Name, which is the same as the web address that you use to access your instance of Partek Flow

The web address for your instance of Partek Flow has been sent to you by Partek's Licensing team. In this example, the web address is ilukic5i.partek.com.

  • Type in the User name, which has also been sent to you (and is the same user name that you use to log on to Partek Flow). In this example, the user name is lukici.

Figure 3. Adding Host name and User name information. Use the host and user name that has been sent to you by Partek's licensing team

  • To proceed click on the Advanced... button

  • Then select the Authentication in the SSH section of the Advanced Site Settings dialog

Figure 4. Adding the id_rsa file to WinSCP. Use the Advanced Site Settings tab and select Authentication

Select the ... button (under Private key file) to browse for the id_rsa file.

  • The file has been sent to you by Partek's licensing team attached to the same email that gave you your URL and username.

If you do not see it in the Select private key file browser, switch to All Files (*.*)

Figure 5. Showing all files in the Select private key file dialog

  • Click Open

WinSCP will ask you to confirm file format conversion

  • Click OK.

Figure 6. Converting key file format

WinSCP will create a file in .ppk format.

  • Click Save to save the converted key file, id_rsa.ppk, to a secure location on your local computer.

  • Click OK again to confirm the change.

Your private key has been saved in .ppk format and added to WinSCP

Click OK to proceed

Figure 7. Private key in .ppk format added to WinSCP

  • Click Save to save the new WinSCP settings.

This will open the Save session as site dialog. You can accept the default name (in this example lukici@ilukic5i.partek.com) or add a custom name. The name that you specify here will appear in the left panel of the Login dialog.

  • Once you have made your edits, click OK.

Figure 8. Customising the name for the new site on the Login dialog. In this example, the name is lukici@ilukic5i.partek.com

  • On the Login page, select your newly created site (in this example: lukici@ilukic5i.partek.com) and click the Login button.

The first time you connect, a warning message will appear, asking you whether you want to connect to an unknown server.

  • Click Yes to proceed.

Figure 9. The first time you connect to your Partek server, WinSCP will present a warning message. Click Yes to connect to the server

The progress towards establishing a connection will be displayed in a dialog. This process is automatic and you do not need to do anything.

Figure 10. Progress of the connection to your Partek server will be displayed on the screen

The WinSCP interface is split into two panels. The panel on the left shows the directory structure of your local computer and the panel on the right shows the directory structure of your Partek Flow file server.

Figure 11. WinSCP screen after connection divides into two panels: the files on the left are on the local computer, while the files on the right are on the Partek server

To transfer a file, just drag and drop the file from one panel to the other. The progress of your transfer will be shown on the screen.

Figure 12. Progress of file transfer is shown on screen

SFTP with FileZilla

FileZilla is a graphical file transfer tool that runs on Windows, OSX, and Linux. It is well suited to bulk transfers because all transfers are added to a queue and processed in the background. You can browse your files on the Partek Flow server while transfers are active. It is also the best option when you are not on a computer with command line access or are uncomfortable with command line operations.

Downloading FileZilla

We recommend downloading the FileZilla install packages from us. They are also available from download aggregator sites (e.g. CNET, download.com, sourceforge) but these sites have been known to bundle adware and other unwanted software products into the downloads they provide, so avoid them.

  • Mac OSX: http://packages.partek.com/bin/filezilla/fz-osx.app.tar.bz2

  • Windows 32-bit: http://packages.partek.com/bin/filezilla/fz-win32.exe

  • Windows 64-bit: http://packages.partek.com/bin/filezilla/fz-win64.exe

  • Linux (Please use your distribution's package manager to install Filezilla):

  • Ubuntu:

$ sudo apt-get update
$ sudo apt-get install filezilla
  • RedHat, see the following guide: http://juventusitprofessional.blogspot.com/2013/09/linux-install-filezilla-on-centos-or.html

  • OpenSuse, see: https://software.opensuse.org/package/filezilla

Connecting to your Partek server with FileZilla

After starting FileZilla, click on the Site Manager icon located at the top left corner of the FileZilla window.

Figure 13

Click on the New Site button on the left of the popup dialog.

Figure 14

Type in a name for the connection. Example: “Partek SFTP”.

Figure 15

The connection details to the right need to be changed to reflect the information you received via email. The default settings will NOT work.

Figure 16

  • Set Host: to your partek server name

  • Leave Port: blank

  • Change Protocol: to SFTP - SSH File Transfer Protocol

  • Change User: to your Partek Flow login name

  • Change Logon Type: to Key File and select the key file received via email.

Figure 17

When selecting your key file, change the file selection from its default of PPK files to All files. Otherwise your key file will not be visible in the file browser.

After selecting your key file, click the Connect button.

Figure 18

Click the checkbox to always trust this host and click OK. Once connected, you can begin to browse and transfer files. The files and folders to the left are on your computer, the ones on the right are on the Flow server.

Figure 19

SFTP command line usage

Importing your private authentication key

You will receive a file called id_rsa via email. Download this file, note where you saved it, then use ssh-add to import the key. If you log out or reboot your computer, you will need to re-run the commands below. After the key is imported, you will not be asked for a password when transferring files to your Partek Flow server.

$ cd directory/with/key
$ chmod 600 id_rsa
$ eval $(ssh-agent)
$ ssh-add id_rsa
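Once the key is loaded, uploads can run without prompts, which makes it possible to script them with sftp's batch mode (the -b option). The sketch below is one possible approach, not an official Partek procedure: write_sftp_batch is a hypothetical helper, the folder names and the *.fastq.gz pattern are placeholders, and paths are assumed to contain no spaces.

```shell
# Hypothetical helper: write an sftp batch file that uploads every
# *.fastq.gz file from a local folder into one remote folder.
# Assumption: paths contain no spaces; folder names are placeholders.
write_sftp_batch() (
    local_dir=$1
    remote_dir=$2
    batch_file=$3
    {
        # A leading "-" tells sftp to continue if the folder already exists.
        printf '%s %s\n' '-mkdir' "$remote_dir"
        printf 'cd %s\n' "$remote_dir"
        for f in "$local_dir"/*.fastq.gz; do
            if [ -e "$f" ]; then
                printf 'put %s\n' "$f"
            fi
        done
    } > "$batch_file"
)

# Example (not run here): after ssh-add has loaded the key,
#   write_sftp_batch ./local_folder remote_folder upload.batch
#   sftp -b upload.batch flowloginname@myname.partek.com
```

Reviewing the generated batch file before running sftp gives a chance to confirm exactly which files will be sent.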

Copying files and folders between your Partek Flow server and local computer

RSYNC usage

RSYNC is useful when resuming a failed transfer. Instead of re-uploading or downloading what has already been transferred, RSYNC will copy only what it needs.

The command below will sync the folder "local_folder" with the "remote_folder" on Partek's servers. To transfer in the other direction, reverse the last two parameters.

$ rsync -avr --progress ./local_folder/ flowloginname@myname.partek.com:~/remote_folder/

With rsync, don't forget the trailing '/' on directory names.
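To see what a sync would transfer before committing to it, rsync's --dry-run flag lists the planned changes without copying anything. The sketch below wraps that in two hypothetical helpers (with_slash and preview_sync); the server and login names are the same placeholders used in the command above.

```shell
# Hypothetical helper: make sure a directory argument ends in a "/".
with_slash() {
    printf '%s/\n' "${1%/}"
}

# Preview a sync without copying anything (--dry-run). Remove that flag to
# perform the transfer. Server and login names are placeholders.
preview_sync() {
    rsync -av --dry-run --progress "$(with_slash "$1")" \
        "flowloginname@myname.partek.com:$(with_slash "$2")"
}

with_slash ./local_folder     # prints ./local_folder/
with_slash ./local_folder/    # prints ./local_folder/
```

A call such as preview_sync ./local_folder '~/remote_folder' would then list the files rsync intends to send.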

Before moving the files, we strongly advise you to use FileZilla to explore the directory structure of the Partek server and then create a new directory to transfer the files to.

Points of Caution

  • When you delete files from the Partek Flow server, they are gone and cannot be recovered.

  • Please use Partek Flow to delete projects and results. Manually removing data using SFTP could break your server.

  • Wait until ALL input data for a particular project has been transferred to the Partek Flow server before importing data via Partek Flow. If you try to import samples while the upload is occurring the import job will crash.

  • When uploading raw data to a Partek-hosted Flow server, we recommend creating a subfolder for each experiment, either at the same level as the "FlowData" folder or inside the "FlowData" folder.

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

Import single cell data

  • Import single cell data for different assay types and formats

  • Import single cell data from count matrix text file(s) using Full count matrix as the data format

Import single cell data for different assay types and formats

Select Single cell, choose the assay type (scRNA-Seq, Spatial transcriptomics, scATAC-Seq, V(D)J, Flow/Mass cytometry), and select the data format (Figure 1). Use the Next button to proceed with import.

Figure 1. Choose import single cell data option

Import single cell data from count matrix text file(s) using Full count matrix as the data format

Partek Flow supports single cell data analysis in count matrix text format using the Full count matrix data format (Figure 2). Each matrix text file is assumed to represent one sample, and each value in the matrix represents the expression value of a feature (e.g. a gene or a transcript) in a cell. The expression value can be a raw count or a normalized count. Each text file must meet the same format requirements as count matrix data.

Specify the text file location. Only one text file (in other words, one sample) can be imported at a time. A preview of the file will be displayed, and the file format is configured in the same way as in Import count matrix data. In addition, you need to specify the details about this file.

Figure 2. Choose Full count matrix as the data format

Click Finish to import the sample. On the Data tab, the number of cells in the sample will be displayed.

To import multiple samples, repeat the above steps by clicking Import data on the Metadata tab or within the task menu (toolbox) on the Analyses tab. Make the same previous selections using the cascading menu.

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

Importing 10x Genomics Matrix Files

  • Importing single cell data

    • Importing matrices into Partek Flow (this Market Exchange Format is popular for public repositories)

    • Importing matrices in h5 format (this Hierarchical Data Format is recommended for multiple samples)

  • Importing spatial data

    • Importing Xenium Output Bundle

Importing single cell data

Partek Flow supports the import of filtered gene-barcode matrices generated by 10x Genomics' Cell Ranger pipeline.

Below is a video summarizing the import of these files:

Importing matrices into Partek Flow (this Market Exchange Format is popular for public repositories)

To import the matrices into Partek Flow, create a new project and click Add data then select Import scRNA count feature-barcode-mtx under Single cell > scRNA-Seq.

Figure 1. Importing single cell data

Samples can be added using the Add sample button. Each sample should be given a name and three files should be uploaded per sample using the Browse button.

Figure 2. Import three feature-barcode matrix files for each sample

If you have not already, transfer the files to the server to be accessed when you click Browse. Follow the directions here to add files to the server. Make sure the files are decompressed before they are uploaded to the server.

By default, the Cell Ranger pipeline output will have a folder called filtered_gene_bc_matrices (Figure 3). It is helpful to rename and organize the files prior to transfer using the File browser.

There are folders nested within the matrix folder, typically representing the reference genome it was aligned to. Navigate to the lowest subfolder, which should contain three files:

  • barcodes.tsv

  • genes.tsv

  • matrix.mtx

Select all 3 files for import into Partek Flow
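Before transferring a matrix folder, it can help to confirm that all three files are present and uncompressed. A minimal sketch of such a check follows; check_10x_folder is a hypothetical helper name, and the folder path passed to it is whatever lowest subfolder you located above.

```shell
# Hypothetical pre-upload check for one Cell Ranger matrix folder.
# Prints each missing file and returns non-zero if any is absent.
check_10x_folder() (
    status=0
    for name in barcodes.tsv genes.tsv matrix.mtx; do
        if [ ! -f "$1/$name" ]; then
            echo "missing: $name"
            status=1
        fi
    done
    exit "$status"
)
```

Because the check looks for the uncompressed names, a folder that still holds only .gz copies is reported as missing its files, which is a reminder to decompress them first.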

Figure 3. Filtered matrix folder from Cell Ranger pipeline

Specify the annotation file used when running the pipeline for additional information such as mitochondrial counts (Figure 4). Other information can also be specified, such as the count value format. You can report all features or only features with non-zero values across all samples, and you can adjust the read count threshold to make the import more efficient.

Figure 4. Configuring matrix metadata

Click Finish when you have completed configuration. This will queue the import task.

Importing matrices in h5 format (this Hierarchical Data Format is recommended for multiple samples)

The Cell Ranger pipeline can also generate the same filtered gene barcode matrix in h5 format. This gives you the ability to select just one file per matrix and select multiple matrices to import in batch. To import an h5 matrix, select the Import scRNA full count matrix or h5 option (Figure 1). Browse for the files and modify any configuration options. Remember the files need to be transferred to the server.

Figure 5. Importing matrix in h5 format

This feature is also useful for importing multiple samples in batch. Simply put all h5 files from your experiment in a single folder, navigate to the folder, and select all the matrices you would like to import.

Configure all the relevant sample metadata, including sample name and the annotation that was used to generate the matrices, and click Finish when completed. Note that all matrices must have been generated using the same reference genome and annotation to be imported into the same project.

Importing spatial data

Importing Xenium Output Bundle

Raw output data generated by the 10x Genomics' Xenium Onboard Analysis pipeline consists of decoded transcript counts and morphology images. The raw output and other standard output files derived from them are compiled into a zipped file called Xenium Output Bundle.

To import the Xenium Output Bundle into Partek Flow, create a new project and click Add data, then select Import 10x Genomics Xenium under Single cell > Spatial and click Next.

Figure 6. Importing Xenium spatial data

Samples can be added using the Add sample button. Each sample should be given a name, and a folder containing the six required files should be uploaded per sample using the Browse button: cell_feature_matrix.h5, cells.csv.gz, cell_boundaries.csv.gz, nucleus_boundaries.csv.gz, transcripts.csv.gz, and morphology_focus.ome.tif. All six files are included in the Xenium Output Bundle folder.
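A quick way to confirm that an unzipped bundle folder holds the six files named above is to list whichever ones are absent. The helper name missing_xenium_files is hypothetical, and this is only a convenience sketch, not part of Partek Flow.

```shell
# Hypothetical check that an unzipped Xenium Output Bundle folder holds the
# six required files. Prints the missing names, one per line.
missing_xenium_files() {
    for name in cell_feature_matrix.h5 cells.csv.gz cell_boundaries.csv.gz \
                nucleus_boundaries.csv.gz transcripts.csv.gz morphology_focus.ome.tif; do
        [ -f "$1/$name" ] || echo "$name"
    done
}
```

An empty result means the folder is complete; any printed name identifies a file to locate before uploading.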

Figure 7. Import Xenium Output Bundle folder for each sample

If you have not already, transfer the files to the server so that they can be accessed when you click Browse. Follow the directions here to add files to the server. You will need to decompress the Xenium Output Bundle zip file before it is uploaded to the server. After decompression, you can drag and drop the entire folder into the Transfer files dialog. All individual files in the folder will then be listed in the dialog with no folder structure; the folder structure will be restored after the upload is completed.

Figure 8. Drag & drop unzipped Xenium Output Bundle folder into Transfer files dialog

Once you have uploaded the folder into the server, you can continue to select the folder for each sample from Browse. Once the folder is selected, the Cells and Features values will auto-populate. You can choose an annotation file that matches what was used to generate the feature count. Then, click Finish to start importing the data into your project.

Figure 9. Add Xenium Output Bundle and select annotation

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

Importing and Demultiplexing Illumina BCL Files

The primary sequencing output of an Illumina sequencer consists of per-cycle base call (bcl) files, which first need to be converted to fastq format so that the data can be passed to downstream applications. Partek Flow comes with a conversion tool that can be used to import data in the bcl file format. In addition to converting the files, this tool demultiplexes the bcl files in the same step and outputs demultiplexed fastq files.

We recommend you start by transferring the entire Illumina run folder to the Partek Flow server. To start a new project with bcl files, first select bcl under the Other import tab (Figure 1).

Figure 1. Bcl file import setup dialog. Required input includes: RunInfo.xml file, SampleSheet.csv file, and a directory hosting .bcl files

The resulting window shows the configuration dialog (Figure 2).

Figure 2. Bcl file import setup dialog. Required input includes: RunInfo.xml file, SampleSheet.csv file, and a directory hosting .bcl files

The bcl files hold the base calls and are in the Data directory within the whole Illumina run folder. Note that the Data directory file path needs to point to the directory, not to an individual bcl file.

The RunInfo.xml file is generated by the primary analysis software and contains information on the run, flow cell, instrument, time stamp, and the read structure (number of reads, number of cycles per read, whether a read is an index read). This file is typically stored at the top level in the Illumina run folder.

The SampleSheet.csv file provides the information on the relationship between the samples and indices specified during library creation. Although it has four sections, two sections (Settings and Data) are important for the data import and conversion. For more information on the files, consult Illumina documentation.
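To check which sections a given sample sheet actually contains before starting the import, the section header lines can be listed directly. This sketch assumes the common layout in which each section begins with a line such as "[Settings]" or "[Data]", possibly padded with trailing commas; list_samplesheet_sections is a hypothetical helper name.

```shell
# Print the section names found in a sample sheet, one per line.
# Section header lines start with "[Name]" and may be padded with commas.
list_samplesheet_sections() {
    sed -n 's/^\[\([^]]*\)].*/\1/p' "$1"
}

# Example (not run here):
#   list_samplesheet_sections SampleSheet.csv
```

If Settings or Data is absent from the output, the sample sheet is worth reviewing against the Illumina documentation before running the conversion.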

Selecting the Configure option under the Advanced options section enables a granular control of the import (Figure 3).

Figure 3. Advanced options of bcl importer

The Select tiles option (--tiles) enables the user to process only a subset of tiles available in the flow cell. The input for this option is a comma-separated list of regular expressions.

Min trimmed read length (--minimum-trimmed-read-length) specifies the minimum read length after adapter removal.

Mask short adapter reads (--mask-short-adapter-reads) applies when a read is trimmed below the length specified by Min trimmed read length. If the number of bases after adapter removal is less than Min trimmed read length, the read length is forced to equal Min trimmed read length by replacing the adapter bases that fall below the specified length with Ns. If the number of remaining bases falls below Mask short adapter reads, all bases in the read are replaced with Ns.

Adapter stringency (--adapter-stringency) specifies the minimum match rate that triggers the masking or trimming of adapters. The rate is calculated as MatchCount / (MatchCount + MismatchCount). Only the reads exceeding the specified rate of sequence identity with adapters are trimmed.
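The rate formula can be worked through with a small calculation. In the sketch below the helper names and the example counts are made up for illustration, and the threshold passed to exceeds_stringency is an arbitrary value rather than a documented default; at least one compared base is assumed so that the denominator is non-zero.

```shell
# Match rate as defined above: MatchCount / (MatchCount + MismatchCount).
adapter_match_rate() {
    LC_ALL=C awk -v m="$1" -v x="$2" 'BEGIN { printf "%.3f\n", m / (m + x) }'
}

# Succeeds when the rate exceeds the chosen stringency threshold.
exceeds_stringency() {
    LC_ALL=C awk -v m="$1" -v x="$2" -v t="$3" 'BEGIN { exit !(m / (m + x) > t) }'
}

adapter_match_rate 18 2    # prints 0.900
```

With 18 matches and 2 mismatches the rate is 0.9, so exceeds_stringency 18 2 0.85 succeeds, while 16 matches and 4 mismatches (rate 0.8) would not exceed the same threshold.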

Barcode mismatches (--barcode-mismatches) controls the number of allowed mismatches per index sequence.

Use bases mask (--use-bases-mask) defines a custom read structure that may be different to the structure specified in the RunInfo.xml file. The input for this option is a comma-separated list where Y and I are followed by a number indicating how many sequencing cycles to include in the fastq file. For example, if the option is set to Y26,I8,Y98, 26 cycles (26bp) will be used to generate the R1 sequence, 8 cycles (8bp) will be used for the sample index, and 98 cycles (98bp) will be used to generate the R2 sequence.
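To make the mask notation concrete, the sketch below splits a mask on commas and describes each segment. It is a reading aid only: describe_bases_mask is a hypothetical helper, and it understands just the Y and I symbols with explicit cycle counts described above, reporting anything else as unsupported.

```shell
# Describe each segment of a use-bases-mask value, one line per segment.
# Assumption: only "Y<n>" (sequence read) and "I<n>" (index read) are handled.
describe_bases_mask() {
    printf '%s\n' "$1" | tr ',' '\n' | {
        read_no=0
        while IFS= read -r seg; do
            kind=${seg%"${seg#?}"}    # first character of the segment
            cycles=${seg#?}           # everything after the first character
            case $kind in
                Y) read_no=$((read_no + 1))
                   printf 'R%s: %s cycles\n' "$read_no" "$cycles" ;;
                I) printf 'index: %s cycles\n' "$cycles" ;;
                *) printf 'unsupported segment: %s\n' "$seg" ;;
            esac
        done
    }
}

describe_bases_mask Y26,I8,Y98
```

For the example mask this prints one line each for R1 (26 cycles), the index (8 cycles), and R2 (98 cycles), matching the description above.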

Do not split files by lane (--no-lane-splitting) prevents splitting of fastq files by lane, i.e. the converter will merge multiple lanes and generate one fastq file per sample.

Create fastq for index reads (--create-fastq-for-index-reads) creates an extra fastq file for each sample containing the sample index sequence for each read. This will be imported as an extra sample into the project.

Ignore missing bcls (--ignore-missing-bcls) will interpret missing base call files as N.

Ignore missing filter (--ignore-missing-filter) will ignore missing filter files and assume all clusters pass the filter.

Ignore missing positions (--ignore-missing-positions) will write new, unique coordinates into the header line if the cluster location files are missing.

Ignore missing controls (--ignore-missing-controls) will interpret missing control files as not-set control bits.

Save undetermined fastq will take the reads that could not be assigned to a sample index and collect them into an Undetermined_S0.fastq file, which will be imported as a new sample.

The result of the import is an Unaligned reads data node, containing demultiplexed fastq files.

For more information about the BCL to FASTQ conversion tool, including information on the proper folder structure and instructions for formatting the SampleSheet.csv file, please consult the bcl2fastq2 Conversion Software Guide.

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

Partek Flow Uploader for Ion Torrent

The Partek Flow Uploader is a Torrent Browser plugin that lets users upload run results to Partek Flow for further analysis.

  • Quick Video Tutorial on running the Plugin

  • Downloading the Plugin

  • Adding the Plugin to your Run Plan

  • Running the Plugin from a Report

  • Sample table created by the Plugin

  • Conversion of UBAM to FASTQ files

Quick Video Tutorial on running the Plugin

The clip above (video only, no audio) shows the Partek Flow Uploader plugin in action.

Downloading the Plugin

Download the Partek® Flow® Uploader from the link below:

https://customer.partek.com/plugins/PFU/PartekFlowUploader-1.03.zip

This is a compressed zipped file. Do not unzip.

Installation of the Plugin

Installation only needs to be performed once per Torrent Browser. All users of the same instance of Torrent Browser will be able to use the plugin. For future versions of the plugin, the steps below can also be used for updating.

To install the plugin, first log into Torrent Browser (Figure 1).

Figure 1. Torrent Browser login page

Navigate to Plugins in the dropdown menu under the gear icon in the upper-right corner (Figure 2).

Figure 2. Accessing the Plugins

Click the Install or Upgrade Plugin button (Figure 3).

Figure 3. Installing a new plugin in the Torrent Browser

Click Select File (Figure 4) and use the file browser to select the zip file you downloaded from the download link. Click Upload and Install.

Figure 4. Uploading and installing the zip file of the plugin

Verify that the Partek Flow Uploader is listed and that the Enabled checkbox is selected (Figure 5).

Figure 5. Table showing the Partek Flow Uploader successfully installed

From the plugins table, click the (Manage) gear icon for the Partek Flow Uploader and select Configure in the drop-down menu (Figure 6).

Figure 6. Accessing the Global configuration of the Partek Flow Uploader

Global Partek Flow configuration settings can be entered into the plugin. When set, it will serve as the default for all users of the Torrent Browser. If multiple Partek Flow users are expected to run the plugin, it is recommended to leave the username and password fields blank so that individual users can enter them as needed.

In the configuration dialog (Figure 7), enter the Partek Flow URL, your username, and your password. Clicking Check configuration verifies your credentials and indicates whether a valid username and password have been entered. Click Save when done.

Figure 7. Global Partek Flow Uploader configuration settings

Click the Rescan Plugins for Changes button. Rescanning the plugins will finish the installation and save the configuration.

Adding the Plugin to your Run Plan

In the Torrent Browser, you can configure a Run Plan to include the Partek Flow Uploader. You can create a new Run Plan (from a Sample or a Template) or edit an existing Run Plan. In the example in Figure 8, the Partek Flow Uploader will be included in an existing Run Plan. From the Planned Runs page, click the gear icon in the last column, and choose Edit.

Figure 8. Selecting an existing Run Plan to Edit

In the Edit Plan page, go to the Plugins tab (Figure 9) and select the checkbox next to the PartekFlowUploader.

Figure 9. Editing the Plugins section of the Run Plan

Click the Configure hyperlink next to the PartekFlowUploader (Figure 10). If necessary, enter the Partek Flow URL, your username, and your password. These are the same credentials you use to access Partek Flow directly on a web browser. Note that some fields may already be pre-populated depending on the global plugin configuration; you can edit the entries as needed. All fields are required to successfully run the plugin.

The Project Name field will be used in Partek Flow to create a new project where the run results will be exported. However, if a project with that name already exists, the samples will be added to that existing project. This enables you to combine multiple runs into one project. Project Names are limited to 30 characters. If not specified, the plugin will use the Run Name as the Project Name. Click the Check configuration button to see if you typed a valid username and password. When ready, click Save Changes to proceed.
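The naming rules in this paragraph can be checked ahead of time with a short helper. This is a sketch under stated assumptions: resolve_project_name is a hypothetical name, and rejecting an over-long name is this sketch's own choice, since the text above does not say how the plugin itself handles names longer than 30 characters.

```shell
# Hypothetical helper mirroring the rules above: fall back to the run name
# when no project name is given, and reject names longer than 30 characters.
resolve_project_name() (
    name=${1:-$2}
    if [ "${#name}" -gt 30 ]; then
        echo "project name exceeds 30 characters: $name" >&2
        exit 1
    fi
    printf '%s\n' "$name"
)
```

The first argument is the intended Project Name (possibly empty) and the second is the Run Name used as the fallback.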

Figure 10. Configuring the Partek Flow Uploader as part of a Run Plan

Proceed with your Run Plan. The plugin will wait for the base calling to be finished before exporting the data to Partek Flow.

Once the Run Plan is executed, data will be automatically exported to the Partek Flow Server. In the Run Report, go to the Plugin Summary tab and the plugin status will be displayed. An example of a successful Plugin upload is shown in Figure 11.

Figure 11. Partek Flow Uploader showing successful transfer

To access the project, click on the Partek Flow hyperlink in the plugin results (Figure 11). You can also go directly to Partek Flow in a new browser window and access your account. In your Partek Flow homepage (Figure 12), you will now see the project created by the Partek Flow Uploader.

Figure 12. Partek Flow Homepage with the new Project created by Partek Flow Uploader

Running the Plugin from a Report

You can manually invoke the plugin from a completed run report. This allows you to export the data from the Torrent Server if you did not include the plugin in the original Run Plan. It also gives you the flexibility to export the same run results to different projects. Open the run report and scroll down to the bottom of the page (Figure 13). In the Plugin Summary tab, click the Select Plugins to Run button.

Figure 13. Running the Plugin from a completed run

From the plugin list (Figure 14), select the PartekFlowUploader plugin.

Figure 14. Selecting the Partek Flow Uploader Plugin

Configure the Partek Flow Uploader. Enter the Partek Flow URL, your username and your password (Figure 15). These are the same credentials you use to access Partek Flow directly on a web browser. Although some fields may already be pre-populated depending on the global plugin configuration, you can edit the entries as needed. All fields are required to successfully run the plugin.

Figure 15. Configuring the Partek Flow Uploader from a Report

The Project Name field will be used in Partek Flow to create a new project where the run results will be exported. However, if a project with that name already exists, the samples will be added to that existing project. This enables you to combine multiple runs into one project. Project Names are limited to 30 characters. The default project name is the Run Name.

When ready, click Export to Partek Flow to proceed. If you wish to cancel, click on the X on the lower right of the dialog box.

Note that configuring the Plugin from a report (Figure 15) is very similar to configuring it as part of a Run Plan (Figure 10) with two notable differences:

  1. The Check configuration button has been replaced by the Export to Partek Flow button, which, when clicked, immediately starts the export.

  2. The Save changes button has been removed, so changes to the configuration cannot be saved (unlike editing a Run Plan, where plugin settings are saved).

Once the plugin starts running, it will indicate that it is Queued in the upper right corner of the Plugin Summary (Figure 16). There will also be a blue Stop button to cancel the operation.

Figure 16. The plugin is Queued as indicated and a blue Stop button is available

Click the Refresh plugin status button to update. The plugin status will show Completed once the export is done and the data is available in Partek Flow (Figure 11).

Sample table created by the Plugin

The Partek Flow Uploader plugin sends the unaligned bam files to the Partek Flow server. For each file, a Sample of the same name will be created in the Data tab (Figure 17).

Figure 17. Expanded Sample Table showing Data files

Reads that had no detectable barcodes are combined in a sample with the prefix nomatch_rawlib.basecaller.bam (Row 17 in Figure 17). You can remove this sample from your analysis by clicking the gear icon next to the sample name and choosing Delete sample.

The data transferred by the Partek Flow Uploader is stored in a directory created for the Project within the user's default project output directory. For example, in Figure 17, the data for this project is stored in: /home/flow/FlowData/jsmith/Project_CHP Hotspot Panel.
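As a small illustration of the example path above, the following Python sketch builds the project data directory from a user's default project output directory and a project name. The `Project_` prefix is inferred from the single example in the text, so treat it as an assumption; the function name `project_data_dir` is hypothetical.

```python
import posixpath


def project_data_dir(user_output_dir, project_name):
    """Join the user's output directory with the per-project folder.

    The "Project_" prefix is inferred from the documented example
    path and is an assumption of this sketch.
    """
    return posixpath.join(user_output_dir, "Project_" + project_name)


print(project_data_dir("/home/flow/FlowData/jsmith", "CHP Hotspot Panel"))
```

Note that the directory name keeps any spaces in the project name, so quote the path when using it in a shell.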

Conversion of UBAM to FASTQ files

The plugin transfers the Unaligned BAM (UBAM) data from the Torrent Browser. The UBAM file format retains all the information from the Ion Torrent sequencer. In the Partek Flow project, the Analyses tab shows a circular data node named Unaligned bam. Click the data node and the context-sensitive task menu will appear on the right (Figure 18).

Unaligned BAM files are only compatible with the TMAP aligner, which can be selected in the Aligners section of the task menu. If you wish to use other aligners, you can convert the unaligned BAM files to FASTQ using the Convert to FASTQ task under Pre-alignment tools. Some information specific to Ion Torrent data (such as Flow Order) is not retained in the FASTQ format. However, that information is only relevant to tools developed for Ion Torrent (such as the Torrent Variant Caller) and is not used by other analysis tools.

Figure 18. Analyses tab of Partek Flow showing the task menu available for the selected Unaligned BAM file data node

Once converted, the reads can be aligned using a variety of aligners compatible with FASTQ input (Figure 19). You can also perform other tasks such as Pre-alignment QAQC or run an existing pipeline. Alternatively, include the Convert to FASTQ task in your pipeline so that the pipeline can be invoked directly from an Unaligned bam data node.

Figure 19. Unaligned reads in the FASTQ format are compatible with more tasks

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

Importing 10x Genomics .bcl Files

Partek Flow supports .bcl files based on 10x Genomics library preparation. The following document will guide you through the steps.

To start the import, create a new project and then select Import Data > Import bcl files. The Import bcl dialog will come up (Figure 1).

Figure 1. Import bcl dialog

Use the Data directory option to point to the location of the directory holding the data. It is located at the top level of the run directory and is typically labeled Data. Please see the tool tip for more info.

Use the Run info file option to point to the RunInfo.xml file. It is located at the top level of the run directory.

Use the Sample sheet file option to point to the sample sheet file, which is usually a .csv file. Partek Flow accepts both the 10X Genomics "simple" format and the Illumina Experiment Manager (IEM) sample sheet format, which use 10X Genomics' sample index set codes. Each index set code corresponds to a mixture of four sample index sequences per sample. Alternatively, Partek Flow will also accept a sample sheet file that has been correctly formatted using the sample sheet generator provided by 10X Genomics.
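For orientation, the Python sketch below writes a sample sheet in what is commonly described as the 10X Genomics "simple" layout: a three-column CSV with the header Lane,Sample,Index and one row per sample, where the Index column holds a sample index set code. The column layout is an assumption here and should be checked against the 10X Genomics documentation for your kit. The sample names and index set codes are placeholders, and `write_simple_sample_sheet` is a hypothetical helper name.

```python
import csv
import io


def write_simple_sample_sheet(rows):
    """Return CSV text with a Lane,Sample,Index header.

    rows is an iterable of (lane, sample_name, index_set_code).
    The three-column layout is an assumption of this sketch.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Lane", "Sample", "Index"])
    for lane, sample, index_set in rows:
        writer.writerow([lane, sample, index_set])
    return buffer.getvalue()


# Placeholder samples and index set codes, for illustration only.
sheet = write_simple_sample_sheet(
    [("1", "sample_A", "SI-GA-A1"), ("1", "sample_B", "SI-GA-B1")]
)
print(sheet)
```

Save the resulting text with a .csv extension so it can be selected in the Sample sheet file option.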

Then click the Configure link and make the following changes (Figure 2).

  • Min trimmed read length: 8

  • Mask short adapter reads: 8

  • Use bases mask: see below

  • Create fastq for index reads: OFF

  • Ignore missing bcls: ON

  • Ignore missing filter: ON

  • Ignore missing positions: ON

  • Ignore missing controls: ON

For the Use bases mask option, the read structure for the Chromium Single Cell 3' v2 prep kit is typically Y26,I8,Y98. The read structure for Chromium Single Cell 3' v3/v3.1 is typically Y28,I8,Y91. Please check the read structure detailed in the RunInfo.xml file and adjust the values to match your data.
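To check the read structure recorded in RunInfo.xml, a short Python sketch like the one below can list the reads and print them in bases-mask notation. It assumes the usual Illumina layout in which each Read element carries Number, NumCycles, and IsIndexedRead attributes; the embedded XML is a made-up example, and `bases_mask_from_run_info` is a hypothetical helper. The sketch only mirrors the cycle counts in the file. If a read was sequenced for more cycles than the chemistry requires, the mask still needs to be adjusted by hand.

```python
import xml.etree.ElementTree as ET

# Made-up RunInfo.xml content matching a Chromium v2 style run.
EXAMPLE_RUN_INFO = """<?xml version="1.0"?>
<RunInfo>
  <Run Id="example_run" Number="1">
    <Reads>
      <Read Number="1" NumCycles="26" IsIndexedRead="N" />
      <Read Number="2" NumCycles="8" IsIndexedRead="Y" />
      <Read Number="3" NumCycles="98" IsIndexedRead="N" />
    </Reads>
  </Run>
</RunInfo>
"""


def bases_mask_from_run_info(xml_text):
    """Build a mask such as Y26,I8,Y98 from the Read elements."""
    root = ET.fromstring(xml_text)
    reads = sorted(root.iter("Read"), key=lambda r: int(r.get("Number")))
    parts = []
    for read in reads:
        # Index reads are written with I, sequence reads with Y.
        prefix = "I" if read.get("IsIndexedRead") == "Y" else "Y"
        parts.append(f"{prefix}{read.get('NumCycles')}")
    return ",".join(parts)


print(bases_mask_from_run_info(EXAMPLE_RUN_INFO))  # prints Y26,I8,Y98
```

Compare the printed value with the typical masks listed above before entering it in the Use bases mask field.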

Figure 2. Setting the advanced options to import bcl files. The Use bases mask settings shown here are for Chromium v2 chemistry

Click Apply to accept and then Finish to import your files.

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

Import a GEO / ENA project

  • How to import a study from GEO / ENA

  • Common Issues

    • Error Message - The project did not yield any data. Double-check the project ID, or try importing the data manually

    • The project was imported, but the Analyses tab is empty and there are no FASTQ files

    • Something is missing or the import failed

  • FAQ

    • What are GEO and ENA?

    • How do I know if a GEO project is also in ENA?

How to import a study from GEO / ENA

If a project is publicly available in the Gene Expression Omnibus (GEO) and European Nucleotide Archive (ENA) databases, you can import associated FASTQ files and sample attributes automatically into Partek Flow.

  • On the Homepage click New Project to create a project and give the project a name

  • Click Add data

  • Choose Single cell or Bulk as the assay type, then select fastq as the file type

  • Click Next

  • Choose GEO / ENA

  • Enter the BioProject ID of the data set you would like to download. The format of a BioProject ID is PRJNA followed by one to six digits (e.g. PRJNA381606)

A GEO ID can also be used, in the format GSE followed by one to five digits (e.g. GSE71578).

  • Click Finish

It may take a while for the download to complete depending on the size of the data. FASTQ files are downloaded from the ENA BioProject page.

  • FASTQ files will be added as an Unaligned reads data node in the Analyses tab
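The ID formats given in the steps above can be checked before starting an import. The Python sketch below encodes the two patterns exactly as described (PRJNA plus one to six digits, GSE plus one to five digits). It is not Partek Flow's own validation, and `classify_project_id` is a hypothetical helper name.

```python
import re

# Patterns transcribed from the formats described in the steps above.
BIOPROJECT_ID = re.compile(r"PRJNA[0-9]{1,6}")
GEO_SERIES_ID = re.compile(r"GSE[0-9]{1,5}")


def classify_project_id(text):
    """Return "BioProject", "GEO", or None for an unrecognized ID."""
    candidate = text.strip()
    if BIOPROJECT_ID.fullmatch(candidate):
        return "BioProject"
    if GEO_SERIES_ID.fullmatch(candidate):
        return "GEO"
    return None


print(classify_project_id("PRJNA381606"))  # prints BioProject
print(classify_project_id("GSE71578"))     # prints GEO
print(classify_project_id("GSE"))          # prints None
```

A result of None usually points to a typo, such as a missing digit or stray characters pasted with the ID.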

Common Issues

Error Message - The project did not yield any data. Double-check the project ID, or try importing the data manually

If the study is not publicly available in both GEO and ENA, project import will not succeed.

The project was imported, but the Analyses tab is empty and there are no FASTQ files

If there is an ENA project, but the FASTQ files are not available through ENA, the project will be created, but data will not be imported.
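One way to check in advance whether ENA lists FASTQ files for a BioProject is to request a file report from ENA and look for runs with a non-empty FASTQ column. The Python sketch below builds such a request URL and parses a tab-separated report. The endpoint and parameter names reflect the ENA Portal API as commonly documented and should be verified against ENA's current documentation. This is independent of Partek Flow, and the sample report text and helper names are hypothetical.

```python
import csv
import io
from urllib.parse import urlencode

# ENA Portal API file report endpoint (assumption: verify with ENA docs).
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession):
    """Build a file report URL for a BioProject accession."""
    query = urlencode(
        {
            "accession": accession,
            "result": "read_run",
            "fields": "run_accession,fastq_ftp",
            "format": "tsv",
        }
    )
    return f"{ENA_FILEREPORT}?{query}"


def runs_with_fastq(tsv_text):
    """Return run accessions whose fastq_ftp column is not empty."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        row["run_accession"]
        for row in reader
        if (row.get("fastq_ftp") or "").strip()
    ]


# Hypothetical report text: RUN_B has no FASTQ files listed.
sample_report = (
    "run_accession\tfastq_ftp\n"
    "RUN_A\tftp.example.org/RUN_A.fastq.gz\n"
    "RUN_B\t\n"
)
print(filereport_url("PRJNA381606"))
print(runs_with_fastq(sample_report))
```

If no run in the report has FASTQ files listed, the project would likely be created without data, as described above.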

Something is missing or the import failed

A variety of other issues and irregularities can cause imports to fail or only partially succeed, including, but not limited to, a BioProject having multiple associated GSE IDs, incomplete information on the GEO or ENA page, and either the GEO or ENA project not being publicly available.

FAQ

What are GEO and ENA?

The Gene Expression Omnibus (GEO) and the European Nucleotide Archive (ENA) are web-accessible public repositories for genomic data and experiments. Access and learn more about their resources at their respective websites:

GEO - https://www.ncbi.nlm.nih.gov/geo/

ENA - https://www.ebi.ac.uk/ena

How do I know if a GEO project is also in ENA?

  • You can search ENA using the GEO ID (e.g., GSE71578) to check if there is a matching ENA project.

  • Open the Study result to view the BioProject ID (e.g., PRJNA381606) and a table with information about the samples and files included in the project

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.