Importing Data

Partek Flow can import a wide variety of data types, including raw counts, matrices, microarray data, and variant call files, as well as unaligned and aligned NGS data.

Navigating the file browser to transfer files to the server
Associate fastq files for multi-omic data
SFTP File Transfer Instructions
Sample Table from a Text File
Import single cell data
Importing 10x Genomics Matrix Files
Importing and Demultiplexing Illumina BCL Files
Partek Flow Uploader for Ion Torrent
Importing 10x Genomics .bcl Files
Import a GEO / ENA project

The following file types are valid and will be recognized by the Partek Flow file browser.

bam
bcf
bcl
bgx
bpm
cbcl
CEL
csv
fa
fasta
fastq
fcs
fna
fq
gz
h5ad
h5 matrix
idat
loom
mtx
probe_tab
qual
raw
rds
sam
sff
sra
tar
tsv
txt
vcf
zip

In cases where paired end fastq data is present, files will also be automatically recognized and their paired relationship will be maintained throughout the analysis.

Paired-end files are matched by file name: every character in both file names must match, except for the section that identifies a file as the first or the second of the pair. For instance, if the first file contains "_R1", "_1", "_F3", or "_F5" in its name, the second file must contain a corresponding identifier such as "_R2", "_2", "_F5", "_F5-P2", "_F5-BC", "_R3", or "_R5". The identifying section must be separated from the rest of the file name by underscores or dots. If two conflicting identifiers are present, the files are treated as single end. For example, s_1_1 pairs with s_1_2, but s_2_1 does not pair with s_1_2, so those two files are treated as two single-end files.
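The matching rule can be illustrated with a short Python sketch. This is a simplified illustration only, not Partek Flow's implementation: the function names are hypothetical, and only the R1/R2 and 1/2 identifiers are handled, not the full list above.

```python
import re

# Hypothetical, simplified identifier table: (family, read number).
# Only the R1/R2 and 1/2 styles are covered in this sketch.
MATE_TOKENS = {"R1": ("R", 1), "R2": ("R", 2), "1": ("N", 1), "2": ("N", 2)}


def split_tokens(filename):
    """Split a file name on underscores and dots, keeping the separators."""
    return re.split(r"([_.])", filename)


def are_mates(first, second):
    """Return True if the two names differ in exactly one token, and that
    token is a first-read identifier in `first` and the corresponding
    second-read identifier in `second`."""
    a, b = split_tokens(first), split_tokens(second)
    if len(a) != len(b):
        return False
    diffs = [(x, y) for x, y in zip(a, b) if x != y]
    if len(diffs) != 1:
        # Zero differences: same file. More than one: conflicting identifiers.
        return False
    x, y = diffs[0]
    if x not in MATE_TOKENS or y not in MATE_TOKENS:
        return False
    (family_x, read_x), (family_y, read_y) = MATE_TOKENS[x], MATE_TOKENS[y]
    return family_x == family_y and read_x == 1 and read_y == 2
```

With this sketch, are_mates("s_1_1", "s_1_2") is true, while are_mates("s_2_1", "s_1_2") is false because the names differ in two places, which mirrors the single-end fallback described above.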

Apart from paired-end data, files with conventional filename suffixes that indicate that they belong to the same sample are consolidated. These suffixes include:

Adapter sequences
- "_bbbbbb" followed by "_" or at the end of the file name, where each "b" is "A", "C", "G", or "T"
Lane numbers
- "_L###" followed by "_" or at the end of the file name, where each "#" is a digit 0 to 9
Dates
- in the form "####-##-##" preceded or followed by a period or underscore
Set number
- of the form "_###" from the end

Navigating the file browser to transfer files to the server

The file browser is used to transfer files to the server so that these files can be added to a project for analysis. If you are importing a Bioproject from GEO/ENA or using URLs for data import, there is no need to transfer the files to the server.

To access the file browser and upload data to the server, use any of these options:

On the Partek® Flow® homepage, click Transfer files
Within a project, after selecting the file type to transfer, use the Transfer files link available in all file import options
From the settings, go to Access management > Transfer files

Using the file browser to transfer files to the server:

Click Transfer files to access the file browser
Drag and drop or click My Device to add files from your machine
Click Browse to modify the Upload directory or create a new folder. The Upload directory should be specified, known, and distinguishable for project file management. You will return to this directory and access the files to import them into a project
To continue to add more files use + Add more in the top right corner. To cancel the process select Cancel in the top left corner
Click Upload to complete the file upload
Do not exit the browser tab or let the computer go to sleep or shut down until the transfer has completed

File sizes displayed in the table use binary units, not decimal units (e.g. GB in the table means gibibyte, not gigabyte. 1 gibibyte is 1,073,741,824 bytes. 1 gigabyte is 1,000,000,000 bytes. 1 gibibyte is about 1.074 gigabytes).
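The conversion between the two units is simple arithmetic. The Python sketch below uses hypothetical helper names and is only meant to make the relationship concrete.

```python
GIB = 1024 ** 3  # bytes in one gibibyte (binary unit): 1,073,741,824
GB = 1000 ** 3   # bytes in one gigabyte (decimal unit): 1,000,000,000


def gib_to_gb(size_gib):
    """Convert a size in gibibytes to decimal gigabytes."""
    return size_gib * GIB / GB


def gb_to_gib(size_gb):
    """Convert a size in decimal gigabytes to gibibytes."""
    return size_gb * GB / GIB
```

For example, a file that your operating system reports as 50 decimal gigabytes corresponds to gb_to_gib(50), which is slightly less than 50 in binary units.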

Associate fastq files for multi-omic data

There are projects with more than one file type, such as single cell multi-omic assays that generate protein and RNA together. In these cases, files need to be associated with each other if starting in fastq format. If we start with processed data, there is no need to associate these files.

Define the type of data the file represents when importing files into the project.

If this step is skipped, the data type can be changed after import by right clicking the data node.

After importing both types of data, associate fastqs with the already imported data (e.g. associate RNA fastqs with ATAC fastqs already imported into Partek Flow).

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

SFTP File Transfer Instructions

Introduction
SFTP with WinSCP
SFTP with FileZilla
SFTP command line usage
Points of Caution

Only use this method if you encounter issues using the Transfer files function on the Flow homepage. Contact support@partek.com to obtain your private key.

Introduction

The following instructions detail the use of SFTP (SSH File Transfer Protocol, also known as Secure File Transfer Protocol) to transfer data to and from your Partek Flow instance. SFTP offers significant performance and security enhancements over FTP for file transfers. It also enables the use of robust file syncing utilities, e.g. rsync, and is compatible with common file transfer programs such as FileZilla and WinSCP.

To transfer files with SFTP, you will need to have your Partek Flow:

Server Name. Example: myname.partek.com
Username. Example: flowloginname
Private authentication key

This information should have been e-mailed to you from the Partek licensing team. If you lose this information, contact Partek support and we will resend your authentication key to you.

SFTP with WinSCP

WinSCP is an open source, free SFTP client for Windows. Its main function is file transfer between a local and a remote computer.

Downloading WinSCP

To download WinSCP, visit WinSCP's official site (https://winscp.net/eng/download.php). On the WinSCP page, you may need to scroll down a little to reach the green Download WinSCP button.

Figure 1. Download button on WinSCP's official website. Note: the version may have changed since this document was written

Connecting to your Partek server with WinSCP

Download and install WinSCP on your local computer and then launch the program.

On the Login page click on the New Site icon.

Figure 2. Adding a New Site on WinSCP's Login page

Type in the Host Name, which is the same as the web address that you use to access your instance of Partek Flow

The web address for your instance of Partek Flow has been sent to you by Partek's Licensing team. In this example, the web address is ilukic5i.partek.com.

Type in the User name that has also been sent to you (it is the same user name that you use to log on to Partek Flow). In this example, the user name is lukici.

Figure 3. Adding Host name and User name information. Use the host and user name that have been sent to you by Partek's licensing team

To proceed click on the Advanced... button
Then select the Authentication in the SSH section of the Advanced Site Settings dialog

Figure 4. Adding the id_rsa file to WinSCP. Use the Advanced Site Settings tab and select Authentication

Select the ... button (under Private key file) to browse for the id_rsa file.

The file has been sent to you by Partek's licensing team attached to the same email that gave you your URL and username.

If you do not see it in the Select private key file browser, switch to All Files (*.*)

Figure 5. Showing all files in the Select private key file dialog

Click Open

WinSCP will ask you to confirm file format conversion

Click OK.

Figure 6. Converting key file format

WinSCP will create a file in .ppk format.

Click Save to save the converted key file, id_rsa.ppk, to a secure location on your local computer.
Click OK again to confirm the change.

Your private key has been saved in .ppk format and added to WinSCP

Click OK to proceed

Figure 7. Private key in .ppk format added to WinSCP

Click Save to save the new WinSCP settings.

This will open the Save session as site dialog. You can accept the default name (in this example lukici@ilukic5i.partek.com) or add a custom name. The name that you specify here will appear in the left panel of the Login dialog.

Once you have made your edits, click OK.

Figure 8. Customising the name for the new site on the Login dialog. In this example, the name is lukici@ilukic5i.partek.com

On the Login page, select your newly created site (in this example: lukici@ilukic5i.partek.com) and click the Login button.

The first time you connect, a warning message will appear, asking you whether you want to connect to an unknown server.

Click Yes to proceed.

Figure 9. The first time you connect to your Partek server, WinSCP will present a warning message. Click Yes to connect to the server

The progress towards establishing a connection will be displayed in a dialog. This process is automatic and you do not need to do anything.

Figure 10. Progress of the connection to your Partek server will be displayed on the screen

The WinSCP interface is split into two panels. The panel on the left shows the directory structure of your local computer and the panel on the right shows the directory structure of your Partek Flow file server.

Figure 11. After connection, the WinSCP screen is divided into two panels: the files on the left are on the local computer, while the files on the right are on the Partek server

To transfer a file, just drag and drop the file from one panel to the other. The progress of your transfer will be shown on the screen.

Figure 12. Progress of file transfer is shown on screen

SFTP with FileZilla

FileZilla is a graphical file transfer tool that runs on Windows, OS X, and Linux. It is well suited to bulk transfers because all transfers are added to a queue and processed in the background, and you can browse your files on the Partek Flow server while transfers are active. It is also the best option when you are not on a computer with command line access or are uncomfortable with command line operations.

Downloading FileZilla

We recommend downloading the FileZilla install packages from us. They are also available from download aggregator sites (e.g. CNET, download.com, sourceforge) but these sites have been known to bundle adware and other unwanted software products into the downloads they provide, so avoid them.

Mac OSX: http://packages.partek.com/bin/filezilla/fz-osx.app.tar.bz2
Windows 32-bit: http://packages.partek.com/bin/filezilla/fz-win32.exe
Windows 64-bit: http://packages.partek.com/bin/filezilla/fz-win64.exe
Linux (Please use your distribution's package manager to install Filezilla):
Ubuntu:

$ sudo apt-get update

$ sudo apt-get install filezilla

RedHat, see the following guide: http://juventusitprofessional.blogspot.com/2013/09/linux-install-filezilla-on-centos-or.html
OpenSuse, see: https://software.opensuse.org/package/filezilla

Connecting to your Partek server with FileZilla

After starting FileZilla, click on the Site Manager icon located at the top left corner of the FileZilla window.

Figure 13

Click on the New Site button on the left of the popup dialog.

Figure 14

Type in a name for the connection. Example: “Partek SFTP”.

Figure 15

The connection details to the right need to be changed to reflect the information you received via email. The default settings will NOT work.

Figure 16

Set Host: to your Partek server name
Leave Port: blank
Change Protocol: to SFTP - SSH File Transfer Protocol
Change User: to your Partek Flow login name
Change Logon Type: to Key File and select the key file received via email.

Figure 17

When selecting your key file, change the file selection from its default of PPK files to All files. Otherwise your key file will not be visible in the file browser.

After selecting your key file, click the Connect button.

Figure 18

Click the checkbox to always trust this host and click OK. Once connected, you can begin to browse and transfer files. The files and folders on the left are on your computer; the ones on the right are on the Flow server.

Figure 19

SFTP command line usage

Importing your private authentication key

You will receive a file called id_rsa via email. Download this file, note where you saved it, then use ssh-add to import the key. If you log out or reboot your computer, you will need to re-run the commands below. After the key is imported, you will not be asked for a password when transferring files to your Partek Flow server.

$ cd directory/with/key

$ chmod 600 id_rsa

$ eval $(ssh-agent)

$ ssh-add id_rsa

Copying files and folders between your Partek Flow server and local computer

RSYNC usage

RSYNC is useful when resuming a failed transfer. Instead of re-uploading or downloading what has already been transferred, RSYNC will copy only what it needs.

The command below will sync the folder "local_folder" with the "remote_folder" on Partek's servers. To transfer in the other direction, reverse the last two parameters.

$ rsync -avr --progress ./local_folder/ flowloginname@myname.partek.com:~/remote_folder/

With rsync, don't forget the trailing '/' on directory names.

Before moving the files, we strongly advise you to use FileZilla to explore the directory structure of the Partek server and then create a new directory to transfer the files to.

Points of Caution

When you delete files from the Partek Flow server they are gone and cannot be recovered.
Please use Partek Flow to delete projects and results. Manually removing data using SFTP could break your server.
Wait until ALL input data for a particular project has been transferred to the Partek Flow server before importing data via Partek Flow. If you try to import samples while the upload is in progress, the import job will crash.
When uploading raw data to a Partek-hosted Flow server, we recommend creating a subfolder for each experiment, either at the same level as the "FlowData" folder or inside the "FlowData" folder.

Import single cell data

Import single cell data for different assay types and formats
Import single cell data from count matrix text file(s) using Full count matrix as the data format

Import single cell data for different assay types and formats

Select Single cell, choose the assay type (scRNA-Seq, Spatial transcriptomics, scATAC-Seq, V(D)J, Flow/Mass cytometry), and select the data format (Figure 1). Use the Next button to proceed with import.

Import single cell data from count matrix text file(s) using Full count matrix as the data format

Partek Flow supports single cell data analysis in count matrix text format using the Full count matrix data format (Figure 2). Each matrix text file is assumed to represent one sample, and each value in the matrix represents the expression value of a feature (e.g. a gene or a transcript) in a cell. The expression value can be a raw count or a normalized count. The format requirements for each text file are the same as for count matrix data.

Specify the text file location. Only one text file (in other words, one sample) can be imported at a time. A preview of the file will be displayed, and the file format is configured in the same way as in Import count matrix data. In addition, you need to specify the details about this file.

Click Finish to import the sample. On the Data tab, the number of cells in the sample will be displayed.

To import multiple samples, repeat the above steps by clicking Import data on the Metadata tab or within the task menu (toolbox) on the Analyses tab. Make the same previous selections using the cascading menu.

Importing 10x Genomics Matrix Files

Importing single cell data
- Importing matrices into Partek Flow (this Market Exchange Format is popular for public repositories)
- Importing matrices in h5 format (this Hierarchical Data Format is recommended for multiple samples)
Importing spatial data
- Importing Xenium Output Bundle

Importing single cell data

Partek Flow supports the import of filtered gene-barcode matrices generated by 10x Genomics' Cell Ranger pipeline.

Importing matrices into Partek Flow (this Market Exchange Format is popular for public repositories)

To import the matrices into Partek Flow, create a new project and click Add data then select Import scRNA count feature-barcode-mtx under Single cell > scRNA-Seq.

Samples can be added using the Add sample button. Each sample should be given a name and three files should be uploaded per sample using the Browse button.

If you have not already, transfer the files to the server to be accessed when you click Browse. Follow the directions here to add files to the server. Make sure the files are decompressed before they are uploaded to the server.

By default, the Cell Ranger pipeline output will have a folder called filtered_gene_bc_matrices (Figure 3). It is helpful to rename and organize the files prior to transfer using the File browser.

There are folders nested within the matrix folder, typically representing the reference genome the data was aligned to. Navigate to the lowest subfolder, which should contain three files:

barcodes.tsv
genes.tsv
matrix.mtx

Select all 3 files for import into Partek Flow
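Before transferring the files, it can be useful to confirm that matrix.mtx is a decompressed, readable Market Exchange Format file. The Python sketch below is an optional check that is not part of Partek Flow, and the function name is hypothetical. It reads the size line of a coordinate-format file; in Cell Ranger output the rows are typically features and the columns are barcodes.

```python
def read_mtx_shape(path):
    """Return (rows, columns, nonzero_entries) from a Matrix Market
    coordinate file, reading only the header and the size line."""
    with open(path, "r", encoding="utf-8") as handle:
        header = handle.readline()
        if not header.startswith("%%MatrixMarket"):
            raise ValueError("missing %%MatrixMarket header line")
        for line in handle:
            stripped = line.strip()
            # Skip comment lines (starting with %) and blank lines.
            if not stripped or stripped.startswith("%"):
                continue
            rows, columns, entries = (int(value) for value in stripped.split())
            return rows, columns, entries
    raise ValueError("no size line found")
```

The row and column counts returned here can be compared against the line counts of genes.tsv and barcodes.tsv to catch a mismatched set of files before import.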

Specify the annotation file used when running the pipeline for additional information such as mitochondrial counts (Figure 4). Other information can also be specified, such as the count value format. You can report all features or only features with non-zero values across all samples, and you can modify the read count threshold to make the import more efficient.

Click Finish when you have completed configuration. This will queue the import task.

Importing matrices in h5 format (this Hierarchical Data Format is recommended for multiple samples)

The Cell Ranger pipeline can also generate the same filtered gene barcode matrix in h5 format. This gives you the ability to select just one file per matrix and select multiple matrices to import in batch. To import an h5 matrix, select the Import scRNA full count matrix or h5 option (Figure 1). Browse for the files and modify any configuration options. Remember the files need to be transferred to the server.

This feature is also useful for importing multiple samples in batch. Simply put all h5 files from your experiment in a single folder, navigate to the folder, and select all the matrices you would like to import.

Configure all the relevant sample metadata, including sample name and the annotation that was used to generate the matrices, and click Finish when completed. Note that all matrices must have been generated using the same reference genome and annotation to be imported into the same project.

Importing spatial data

Importing Xenium Output Bundle

Raw output data generated by the 10x Genomics' Xenium Onboard Analysis pipeline consists of decoded transcript counts and morphology images. The raw output and other standard output files derived from them are compiled into a zipped file called Xenium Output Bundle.

To import the Xenium Output Bundle into Partek Flow, create a new project and click Add data, then select Import 10x Genomics Xenium under Single cell > Spatial and click Next.

Samples can be added using the Add sample button. Give each sample a name and use the Browse button to upload one folder per sample containing the 6 required files: cell_feature_matrix.h5, cells.csv.gz, cell_boundaries.csv.gz, nucleus_boundaries.csv.gz, transcripts.csv.gz, and morphology_focus.ome.tif. All 6 required files are included in the Xenium Output Bundle folder.
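A decompressed bundle folder can be checked for the six required files before it is transferred. The Python sketch below is an optional local check, not part of Partek Flow; the function and constant names are hypothetical, and the file names are those listed above.

```python
from pathlib import Path

# The six files named in the text above.
XENIUM_REQUIRED_FILES = (
    "cell_feature_matrix.h5",
    "cells.csv.gz",
    "cell_boundaries.csv.gz",
    "nucleus_boundaries.csv.gz",
    "transcripts.csv.gz",
    "morphology_focus.ome.tif",
)


def missing_xenium_files(bundle_dir):
    """Return the required file names that are absent from bundle_dir."""
    bundle = Path(bundle_dir)
    return [name for name in XENIUM_REQUIRED_FILES
            if not (bundle / name).is_file()]
```

An empty list means the folder contains every file the import dialog expects.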

If you have not already, transfer the files to the server to be accessed when you click Browse. Follow the directions here to add files to the server. You will need to decompress the Xenium Output Bundle zip file before it is uploaded to the server. After decompression, you can drag and drop the entire folder into the Transfer files dialog. All individual files in the folder will then be listed in the dialog with no folder structure; the folder structure is restored after the upload completes.

Once you have uploaded the folder into the server, you can continue to select the folder for each sample from Browse. Once the folder is selected, the Cells and Features values will auto-populate. You can choose an annotation file that matches what was used to generate the feature count. Then, click Finish to start importing the data into your project.

Importing and Demultiplexing Illumina BCL Files

The primary sequencing output of an Illumina sequencer is a set of per-cycle base call (bcl) files, which first need to be converted to fastq format so that the data can be passed to downstream applications. Partek Flow comes with a conversion tool that can be used to import data in the bcl file format. In addition to converting the files, this tool demultiplexes the bcl files in the same step and outputs demultiplexed fastq files.

We recommend you start by transferring the entire Illumina run folder to the Partek Flow server. To start a new project with bcl files, first select bcl under the Other import tab (Figure 1).

The resulting window shows the configuration dialog (Figure 2).

The bcl files hold the base calls and are in the Data directory within the whole Illumina run folder. Note that the Data directory file path needs to point to the directory, not to an individual bcl file.

The RunInfo.xml file is generated by the primary analysis software and contains information on the run, flow cell, instrument, time stamp, and the read structure (number of reads, number of cycles per read, whether a read is an index read). This file is typically stored at the top level in the Illumina run folder.

The SampleSheet.csv file provides the information on the relationship between the samples and indices specified during library creation. Although it has four sections, two sections (Settings and Data) are important for the data import and conversion. For more information on the files, consult Illumina documentation.

Selecting the Configure option under the Advanced options section enables a granular control of the import (Figure 3).

The Select tiles option (--tiles) enables the user to process only a subset of tiles available in the flow cell. The input for this option is a comma-separated list of regular expressions.

Min trimmed read length (--minimum-trimmed-read-length) specifies the minimum read length after adapter removal.

Mask short adapter reads (--mask-short-adapter-reads) applies when a read is trimmed below the length specified by Min trimmed read length. If the number of bases remaining after adapter removal is less than Min trimmed read length, the read length is forced to equal Min trimmed read length by replacing the adapter bases that fall below the specified length with Ns. If the number of remaining bases then falls below the Mask short adapter reads value, all bases in the read are replaced with Ns.

Adapter stringency (--adapter-stringency) specifies the minimum match rate that triggers the masking or trimming of adapters. The rate is calculated as MatchCount / (MatchCount + MismatchCount). Only the reads exceeding the specified rate of sequence identity with adapters are trimmed.
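The match rate formula can be written out as a short Python sketch. The function names are hypothetical and the threshold value in the usage note is only an example; the sketch shows the arithmetic, not the converter's internals.

```python
def adapter_match_rate(match_count, mismatch_count):
    """MatchCount / (MatchCount + MismatchCount), as defined above."""
    total = match_count + mismatch_count
    if total == 0:
        raise ValueError("no bases were compared")
    return match_count / total


def exceeds_stringency(match_count, mismatch_count, stringency):
    """True when the match rate exceeds the given stringency value."""
    return adapter_match_rate(match_count, mismatch_count) > stringency
```

For instance, with an example stringency of 0.9, a candidate adapter with 19 matches and 1 mismatch has a rate of 0.95 and would be trimmed, while one with 8 matches and 2 mismatches has a rate of 0.8 and would not.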

Barcode mismatches (--barcode-mismatches) controls the number of allowed mismatches per index sequence.

Use bases mask (--use-bases-mask) defines a custom read structure that may be different to the structure specified in the RunInfo.xml file. The input for this option is a comma-separated list where Y and I are followed by a number indicating how many sequencing cycles to include in the fastq file. For example, if the option is set to Y26,I8,Y98, 26 cycles (26bp) will be used to generate the R1 sequence, 8 cycles (8bp) will be used for the sample index, and 98 cycles (98bp) will be used to generate the R2 sequence.
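To make the mask syntax concrete, the Python sketch below parses a mask string and reports how the cycles are assigned. It is an illustration with hypothetical function names, and it handles only the Y and I fields described above, not any other syntax the converter may accept.

```python
import re


def parse_bases_mask(mask):
    """Parse a mask such as 'Y26,I8,Y98' into (kind, cycles) pairs."""
    parts = []
    for field in mask.split(","):
        match = re.fullmatch(r"([YI])(\d+)", field.strip())
        if match is None:
            raise ValueError(f"unsupported mask field: {field!r}")
        parts.append((match.group(1), int(match.group(2))))
    return parts


def describe_bases_mask(mask):
    """Describe each field: Y fields become R1, R2, ...; I fields are index reads."""
    read_number = 0
    lines = []
    for kind, cycles in parse_bases_mask(mask):
        if kind == "Y":
            read_number += 1
            lines.append(f"R{read_number}: {cycles} cycles")
        else:
            lines.append(f"index: {cycles} cycles")
    return lines
```

Calling describe_bases_mask("Y26,I8,Y98") reproduces the example in the text: 26 cycles for R1, 8 cycles for the sample index, and 98 cycles for R2.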

Do not split files by lane (--no-lane-splitting) prevents splitting of fastq files by lane, i.e. the converter will merge multiple lanes and generate one fastq file per sample.

Create fastq for index reads (--create-fastq-for-index-reads) creates an extra fastq file for each sample containing the sample index sequence for each read. This will be imported as an extra sample into the project.

Ignore missing bcls (--ignore-missing-bcl) will interpret missing base call files as N.

Ignore missing filter (--ignore-missing-filter) will ignore missing filter files and assume all clusters pass the filter.

Ignore missing positions (--ignore-missing-positions) will write new, unique coordinates into the header line if the cluster location files are missing.

Ignore missing controls (--ignore-missing-control) will interpret missing control files as not-set control bits.

Save undetermined fastq will take the reads that could not be assigned to a sample index and collect them into an Undetermined_S0.fastq file, which will be imported as a new sample.

The result of the import is an Unaligned reads data node, containing demultiplexed fastq files.

For more information about the BCL to FASTQ conversion tool, including information on the proper folder structure and instructions for formatting the SampleSheet.csv file, please consult the bcl2fastq2 Conversion Software Guide.

Partek Flow Uploader for Ion Torrent

The Partek Flow Uploader is a Torrent Browser plugin that lets users upload run results to Partek Flow for further analysis.

Quick Video Tutorial on running the Plugin
Downloading the Plugin
Adding the Plugin to your Run Plan
Running the Plugin from a Report
Sample table created by the Plugin
Conversion of UBAM to FASTQ files

Quick Video Tutorial on running the Plugin

The clip above (video only, no audio) shows the Partek Flow Uploader plugin in action.

Downloading the Plugin

Download the Partek® Flow® Uploader from the link below:

https://customer.partek.com/plugins/PFU/PartekFlowUploader-1.03.zip

This is a compressed zipped file. Do not unzip.

Installation of the Plugin

Installation only needs to be performed once per Torrent Browser. All users of the same instance of Torrent Browser will be able to use the plugin. For future versions of the plugin, the steps below can also be used for updating.

To install the plugin, first log into Torrent Browser (Figure 1).

Navigate to Plugins in the dropdown menu under the gear icon in the upper-right corner (Figure 2).

Click the Install or Upgrade Plugin button (Figure 3).

Click Select File (Figure 4) and use the file browser to select the zip file you downloaded from the download link. Click Upload and Install.

Verify that the Partek Flow Uploader is listed and that the Enabled checkbox is selected (Figure 5).

From the plugins table, click the (Manage) gear icon for the Partek Flow Uploader and select Configure in the drop-down menu (Figure 6).

Global Partek Flow configuration settings can be entered into the plugin. When set, it will serve as the default for all users of the Torrent Browser. If multiple Partek Flow users are expected to run the plugin, it is recommended to leave the username and password fields blank so that individual users can enter them as needed.

In the configuration dialog (Figure 7), enter the Partek Flow URL, your username and your password. Clicking Check configuration will verify your credentials and indicate whether a valid username and password have been entered. Click Save when done.

Click the Rescan Plugins for Changes button. Rescanning the plugins will finish the installation and save the configuration.

Adding the Plugin to your Run Plan

In the Torrent Browser, you can configure a Run Plan to include the Partek Flow Uploader. You can create a new Run Plan (from a Sample or a Template) or edit an existing Run Plan. In the example in Figure 8, the Partek Flow Uploader will be included in an existing Run Plan. From the Planned Runs page, click the gear icon in the last column, and choose Edit.

In the Edit Plan page, go to the Plugins tab (Figure 9) and select the checkbox next to the PartekFlowUploader.

Click the Configure hyperlink next to the PartekFlowUploader (Figure 10). If necessary, enter the Partek Flow URL, your username and your password. These are the same credentials you use to access Partek Flow directly on a web browser. Note that some fields may already be pre-populated depending on the global plugin configuration; you can edit the entries as needed. All fields are required to successfully run the plugin.

The Project Name field will be used in Partek Flow to create a new project where the run results will be exported. However, if a project with that name already exists, the samples will be added to that existing project. This enables you to combine multiple runs into one project. Project Names are limited to 30 characters. If not specified, the plugin will use the Run Name as the Project Name. Click the Check configuration button to see if you typed a valid username and password. When ready, click Save Changes to proceed.

Proceed with your Run Plan. The plugin will wait for the base calling to be finished before exporting the data to Partek Flow.

Once the Run Plan is executed, data will be automatically exported to the Partek Flow Server. In the Run Report, go to the Plugin Summary tab and the plugin status will be displayed. An example of a successful Plugin upload is shown in Figure 11.

To access the project, click on the Partek Flow hyperlink in the plugin results (Figure 11). You can also go directly to Partek Flow in a new browser window and access your account. In your Partek Flow homepage (Figure 12), you will now see the project created by the Partek Flow Uploader.

Running the Plugin from a Report

You can manually invoke the plugin from a completed run report. This allows you to export the data from the Torrent Server if you did not include the plugin in the original run plan. This also gives you the flexibility to export the same run results onto different project(s). Open the run report and scroll down to the bottom of the page (Figure 13). In the Plugin Summary tab, click the Select Plugins to Run button.

From the plugin list (Figure 14), select the PartekFlowUploader plugin.

Configure the Partek Flow Uploader. Enter the Partek Flow URL, your username and your password (Figure 15). These are the same credentials you use to access Partek Flow directly on a web browser. Although some fields may already be pre-populated depending on the global plugin configuration, you can edit the entries as needed. All fields are required to successfully run the plugin.

When ready, click Export to Partek Flow to proceed. If you wish to cancel, click on the X on the lower right of the dialog box.

Note that configuring the Plugin from a report (Figure 15) is very similar to configuring it as part of a Run Plan (Figure 10) with two notable differences:

The Check configuration button has been replaced by the Export to Partek Flow button, which immediately proceeds to the export when clicked.
The Save changes button has been removed, so changes to the configuration cannot be saved (unlike editing a run plan, where plugin settings are saved).

Once the plugin starts running, it will indicate that it is Queued on the upper right corner of the Plugins Summary (Figure 16). There will also be a blue Stop button to cancel the operation.

Click the Refresh plugin status button to update. The plugin status will show Completed once the export is done and the data is available in Partek Flow (Figure 11).

Sample table created by the Plugin

The Partek Flow Uploader plugin sends the unaligned bam files to the Partek Flow server. For each file, a Sample of the same name will be created in the Data tab (Figure 17).

Reads that had no detectable barcodes are combined in a sample whose name starts with the prefix nomatch, for example nomatch_rawlib.basecaller.bam (Row 17 in Figure 17). You can remove this sample from your analysis by clicking the gear icon next to the sample name and choosing Delete sample.

The data transferred by the Partek Flow Uploader is stored in a directory created for the Project within the user's default project output directory. For example, in Figure 17, the data for this project is stored in: /home/flow/FlowData/jsmith/Project_CHP Hotspot Panel.

Conversion of UBAM to FASTQ files

The plugin transfers the Unaligned BAM (UBAM) data from the Torrent Browser. The UBAM file format retains all the information from the Ion Torrent sequencer. In the Partek Flow project, the Analyses tab shows a circular data node named Unaligned bam. Click on the data node and the context-sensitive task menu will appear on the right (Figure 18).

Unaligned BAM files are only compatible with the TMAP aligner, which can be selected in the Aligners section of the task menu. If you wish to use other aligners, you can convert the unaligned BAM files to FASTQ using the Convert to FASTQ task under Pre-alignment tools. Some information specific to Ion Torrent data (such as Flow Order) is not retained in the FASTQ format. However, that information is only relevant to tools developed for Ion Torrent (such as the Torrent Variant Caller) and is not used by other analysis tools.

Once converted, the reads can be aligned using a variety of aligners compatible with FASTQ input (Figure 19). You can also perform other tasks such as Pre-alignment QAQC or run an existing pipeline. Alternatively, include the Convert to FASTQ task in your pipeline so that the pipeline can be invoked directly from an Unaligned bam data node.
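To illustrate why Ion Torrent-specific details do not carry over to FASTQ, the following minimal Python sketch converts one unaligned read, written as a SAM text record, into a FASTQ entry. It is not the implementation of the Convert to FASTQ task; the function name and the example record are hypothetical, and the sketch assumes the read is already available as SAM text rather than binary BAM.

```python
def sam_record_to_fastq(sam_line: str) -> str:
    """Convert one unaligned SAM text record to a four-line FASTQ entry."""
    fields = sam_line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM record needs at least 11 tab-separated fields")
    # Mandatory SAM columns: 1 = read name, 10 = sequence, 11 = base qualities.
    name, seq, qual = fields[0], fields[9], fields[10]
    # Optional tags (column 12 onward) have no FASTQ equivalent and are dropped.
    return f"@{name}\n{seq}\n+\n{qual}\n"


# Hypothetical record, for illustration only.
record = "read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\tRG:Z:demo"
print(sam_record_to_fastq(record), end="")
```

Only the read name, sequence, and quality string survive the conversion, which is consistent with header-level and tag-level information such as Flow Order not being retained.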

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.

Importing 10x Genomics .bcl Files

Partek Flow supports .bcl files based on 10x Genomics library preparation. The following document will guide you through the steps.

To start the import, create a new project and then select Import Data > Import bcl files. The Import bcl dialog will come up (Figure 1).

Use the Data directory option to point to the location of the directory holding the data. It is located at the top level of the run directory and is typically labeled Data. Please see the tool tip for more info.

Use the Run info file option to point to the RunInfo.xml file. It is located at the top level of the run directory.

Use the Sample sheet file option to point to the sample sheet file, which is usually a .csv file. Partek Flow accepts both the 10x Genomics "simple" and the Illumina Experiment Manager (IEM) sample sheet formats, which use 10x Genomics sample index set codes. Each index set code corresponds to a mixture of four sample index sequences per sample. Alternatively, Partek Flow also accepts a sample sheet file that has been correctly formatted using the sample sheet generator provided by 10x Genomics.
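As a rough illustration of the "simple" format, the Python sketch below writes a three-column CSV. The Lane, Sample, Index column layout is an assumption based on the 10x Genomics simple sample sheet convention and should be checked against the 10x Genomics documentation; the function name, sample names, and index set codes are placeholders.

```python
import csv
import io


def write_simple_sample_sheet(rows, handle):
    """Write (lane, sample, index_set_code) rows as a simple CSV sample sheet."""
    writer = csv.writer(handle, lineterminator="\n")
    # Assumed header for the "simple" layout; verify against 10x Genomics docs.
    writer.writerow(["Lane", "Sample", "Index"])
    for lane, sample, index_set in rows:
        writer.writerow([lane, sample, index_set])


# Placeholder samples and index set codes, for illustration only.
buffer = io.StringIO()
write_simple_sample_sheet(
    [(1, "sample_A", "SI-GA-A1"), (1, "sample_B", "SI-GA-B1")],
    buffer,
)
sheet_text = buffer.getvalue()
print(sheet_text, end="")
```

To produce a file for the Sample sheet file option, pass an open file handle instead of the in-memory buffer.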

Then click on the Configure link and make the following changes (Figure 2).

Min trimmed read length: 8
Mask short adapter reads: 8
Use bases mask: see below
Create fastq for index reads: OFF
Ignore missing bcls: ON
Ignore missing filter: ON
Ignore missing positions: ON
Ignore missing controls: ON

For the Use bases mask option, the read structure for the Chromium Single Cell 3' v2 prep kit is typically Y26,I8,Y98. The setting for Chromium Single Cell 3' v3/v3.1 is typically Y28,I8,Y91. Please check the read structure detailed in the RunInfo.xml file and adjust the values to match your data.
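One way to check the read structure is to read the Read entries in RunInfo.xml and print them in bases-mask notation. The Python sketch below assumes the usual Illumina layout, in which each Read element carries Number, NumCycles, and IsIndexedRead attributes; the function name and the embedded example XML are hypothetical. It reports the full cycle count of each read, so any trimming of extra cycles still has to be adjusted by hand.

```python
import xml.etree.ElementTree as ET


def suggest_bases_mask(run_info_xml: str) -> str:
    """Build a bases-mask string (e.g. Y26,I8,Y98) from RunInfo.xml text."""
    root = ET.fromstring(run_info_xml)
    # Sort by the Number attribute so the mask follows sequencing order.
    reads = sorted(root.iter("Read"), key=lambda r: int(r.get("Number")))
    parts = []
    for read in reads:
        # Index reads are written as I<cycles>, sequence reads as Y<cycles>.
        prefix = "I" if read.get("IsIndexedRead") == "Y" else "Y"
        parts.append(f"{prefix}{read.get('NumCycles')}")
    return ",".join(parts)


# Hypothetical, trimmed-down RunInfo.xml content for illustration only.
example = """<RunInfo>
  <Run>
    <Reads>
      <Read Number="1" NumCycles="26" IsIndexedRead="N" />
      <Read Number="2" NumCycles="8" IsIndexedRead="Y" />
      <Read Number="3" NumCycles="98" IsIndexedRead="N" />
    </Reads>
  </Run>
</RunInfo>"""
print(suggest_bases_mask(example))  # prints Y26,I8,Y98
```

For a real run, read the file contents first (for example with open("RunInfo.xml").read()) and compare the result with the typical values listed above.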

Click Apply to accept and then Finish to import your files.

Import a GEO / ENA project

How to import a study from GEO / ENA
Common Issues
FAQ
- What are GEO and ENA?
- How do I know if a GEO project is also in ENA?

How to import a study from GEO / ENA

If a project is publicly available in the Gene Expression Omnibus (GEO) and European Nucleotide Archive (ENA) databases, you can import associated FASTQ files and sample attributes automatically into Partek Flow.

On the homepage, click New Project to create a project and give it a name

Click Add data

Choose Single cell or Bulk as the assay type, then select fastq as the file type

Click Next
Choose GEO / ENA
Enter the BioProject ID of the data set you would like to download. The format of a BioProject ID is PRJNA followed by one to six digits (e.g., PRJNA381606).

A GEO ID can also be used, in the format GSE followed by one to five digits (e.g., GSE71578).

Click Finish

It may take a while for the download to complete depending on the size of the data. FASTQ files are downloaded from the ENA BioProject page.

FASTQ files will be added as an Unaligned reads data node in the Analyses tab
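The ID formats described in the steps above can be checked before starting an import. The Python sketch below is a minimal format check only: it does not contact GEO or ENA, the function and pattern names are hypothetical, and Partek Flow's own validation may differ.

```python
import re

# Patterns follow the formats stated above: PRJNA + 1-6 digits, GSE + 1-5 digits.
BIOPROJECT_RE = re.compile(r"PRJNA\d{1,6}")
GEO_SERIES_RE = re.compile(r"GSE\d{1,5}")


def classify_project_id(text: str) -> str:
    """Return 'BioProject', 'GEO', or 'unrecognized' for a candidate ID."""
    candidate = text.strip()
    if BIOPROJECT_RE.fullmatch(candidate):
        return "BioProject"
    if GEO_SERIES_RE.fullmatch(candidate):
        return "GEO"
    return "unrecognized"


for project_id in ("PRJNA381606", "GSE71578", "GSE1234567"):
    print(project_id, classify_project_id(project_id))
```

An ID that passes this check can still fail to import if the study is not publicly available in both GEO and ENA.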

Common Issues

Error Message - The project did not yield any data. Double-check the project ID, or try importing the data manually

If the study is not publicly available in both GEO and ENA, project import will not succeed.

The project was imported, but the Analyses tab is empty and there are no FASTQ files

If there is an ENA project, but the FASTQ files are not available through ENA, the project will be created, but data will not be imported.

Something is missing or the import failed

A variety of other issues and irregularities can cause imports to not succeed or partially succeed, including, but not limited to, a BioProject having multiple associated GSE IDs, incomplete information on the GEO or ENA page, and either the GEO or ENA project not being publicly available.

FAQ

What are GEO and ENA?

The Gene Expression Omnibus (GEO) and the European Nucleotide Archive (ENA) are web-accessible public repositories for genomic data and experiments. Access and learn more about their resources at their respective websites:

GEO - https://www.ncbi.nlm.nih.gov/geo/

ENA - https://www.ebi.ac.uk/ena

How do I know if a GEO project is also in ENA?

You can search ENA using the GEO ID (e.g., GSE71578) to check if there is a matching ENA project.

Open the Study result to view the BioProject ID (e.g., PRJNA381606) and a table with information about the samples and files included in the project.
