BaseSpace Sequence Hub is a cloud environment for analysis, storage, and sharing of genomic data. Stream data directly from your sequencing run and then analyze your results with BaseSpace Sequence Hub apps. BaseSpace Sequence Hub apps perform alignment, variant calling, data classification and visualization, and interpretation of results.
BaseSpace Sequence Hub has the following features:
Easily shares and transfers data for collaboration with colleagues and customers.
Analyzes and stores run data streamed from your sequencing system in real time.
Automatically converts data from standard file formats for use with any BaseSpace Core App or third-party analysis app.
Provides 1 Terabyte of complimentary data storage. Additional storage is available with Professional or Enterprise subscription accounts.
Records sample information and run parameters for quicker run setup.
Monitors sequencing runs using Sequencing Analysis Viewer (SAV) charts.
Offers an intuitive interface to prepare and launch complex analyses.
BaseSpace Sequence Hub includes features to support automating data analysis and management.
Import biosamples in bulk and automatically associate data from sample sheets.
Automatically set data QC according to your thresholds.
Track incoming yield and fulfillment status, so you can easily identify and requeue biosamples that need additional sequencing data.
Automatically aggregate FASTQ data sets for analysis, excluding data that does not meet your QC thresholds.
Schedule apps to automatically launch after required data or other dependencies have been met.
Automatically set QC for analyses.
Website: www.illumina.com
Email: techsupport@illumina.com
Illumina Customer Support Telephone Numbers
Safety data sheets (SDSs)—Available on the Illumina website at support.illumina.com/sds.html.
Product documentation—Available for download from support.illumina.com.
Biosamples are the original DNA samples that need to be prepared, sequenced, and analyzed to produce the desired results for a bioinformatician. In BaseSpace Sequence Hub, they are the central link for related physical entities and digital data such as libraries, pools, runs, lanes, analyses, and data sets.
Add biosamples in a biosample workflow file. You can download a template and upload completed files from the Biosamples page, available from the My Data tab. In the biosample workflow *.csv file, add information about the new biosamples, the projects you want to store data in, the library preps you want to use, the yields required to launch an app, and the analysis workflows you want to schedule. BaseSpace Sequence Hub validates the inputs and adds the biosamples to the system.
Samples can still be accessed through a Run or a Project, using the Samples tab.
Libraries and pools are not uploaded manually but are generated automatically using the information in a sample sheet with an instrument run upload. From the sample sheet, Sample ID is used for the biosample name, and Sample Name is used for the library name.
When a biosample name is recognized due to a previous biosample workflow upload or instrument run containing the biosample, BaseSpace Sequence Hub checks the sample sheet for a match to a library. If we find an exact match, we add sequencing data to the existing biosample and library. If we don't find exact matches, we create a new biosample, a new library, or both.
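The matching rule above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the service's implementation: the `Biosample` class and `register_row` function are hypothetical names, and the in-memory dictionary stands in for whatever BaseSpace Sequence Hub actually stores.

```python
from dataclasses import dataclass, field


@dataclass
class Biosample:
    name: str
    libraries: set[str] = field(default_factory=set)


def register_row(biosamples: dict[str, Biosample], sample_id: str, sample_name: str) -> str:
    """Apply one sample sheet row and report what had to be created.

    sample_id   -> biosample name (Sample ID column)
    sample_name -> library name   (Sample Name column)
    """
    biosample = biosamples.get(sample_id)
    if biosample is None:
        # Unknown biosample name: create both the biosample and its library.
        biosamples[sample_id] = Biosample(sample_id, {sample_name})
        return "created biosample and library"
    if sample_name not in biosample.libraries:
        # Known biosample without an exact library match: create the library only.
        biosample.libraries.add(sample_name)
        return "created library"
    # Exact match on both names: new data joins the existing records.
    return "matched existing biosample and library"
```

Calling `register_row` three times with ("BS1", "LibA"), ("BS1", "LibA"), and ("BS1", "LibB") walks through the three outcomes in order: both created, exact match, and library created.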
When more than one library is given for a single lane number in the sample sheet, we interpret this as a pooled sample merged together using the libraries. We automatically assign a name to the pool and link it to the biosample and libraries. If the same combination of libraries exists within the same instrument run, the generated data are linked to the same pool. Library combinations cannot be reliably matched to runs across different instruments; in those cases, new pools are created.
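As a rough sketch of the pooling rule, the following Python groups sample sheet rows by lane and treats any lane with more than one library as a pool. The function names and the (lane, library) tuple layout are assumptions made for illustration; the automatic pool naming is not modeled.

```python
from collections import defaultdict


def detect_pools(rows):
    """Group (lane, library) rows from one run and return the pooled lanes.

    Returns a dict mapping each lane that holds more than one library
    to a frozenset of the libraries pooled on it.
    """
    by_lane = defaultdict(set)
    for lane, library in rows:
        by_lane[lane].add(library)
    return {lane: frozenset(libs) for lane, libs in by_lane.items() if len(libs) > 1}


def distinct_pools(pooled_lanes):
    """Within one run, lanes with the same library combination share one pool."""
    return set(pooled_lanes.values())
```

For rows `[(1, "LibA"), (1, "LibB"), (2, "LibA"), (2, "LibB"), (3, "LibC")]`, lanes 1 and 2 are pooled and map to a single pool, while lane 3 holds a single library and is not a pool.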
The libraries and pools automatically generated from a sample sheet can be found in the Libraries tab of the biosample details page.
Library prep kits are the names of the sample preparation kits used to turn biosamples into sample libraries. They are defined in the Prep Request column in the biosample workflow upload. BaseSpace Sequence Hub uses this information to separate data during data aggregation when there are two or more library prep kits used for the same biosample. For example, if you use a TruSeq PCR-Free library prep kit to prepare your libraries but receive poor results due to a low starting concentration of DNA, then make a second attempt with a TruSeq Nano library prep kit to amplify the DNA, you can use Sequence Hub to separate the data according to the kit used to prepare each library.
Yield for a biosample is separated per unique library prep kit. View the list of prep kits for a biosample on the Summary tab of the biosample details page.
To launch an app using only the data from specific kits, select the Select Biosample button when selecting the app inputs. This option enables a biosample chooser where you can select data by library prep kit.
Biosample metadata are key-value pairs used to save custom information to biosamples. The metadata can be viewed from the biosample summary page. Biosample metadata can only be entered when first creating the biosamples through the biosample workflow spreadsheet upload. When you add custom columns to the spreadsheet and define the values for the biosamples, the biosamples are imported with the metadata.
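To make the idea of custom columns concrete, here is a minimal Python sketch that splits each spreadsheet row into standard fields and custom metadata. Apart from Prep Request, the column names in `STANDARD_COLUMNS` are hypothetical placeholders; use the headers from the downloaded template in practice.

```python
import csv
import io

# Hypothetical header names for illustration only. This document names only
# the "Prep Request" column; take the real headers from the template.
STANDARD_COLUMNS = {
    "Biosample Name",
    "Default Project",
    "Prep Request",
    "Required Yield",
    "Analysis Workflow",
}


def split_custom_metadata(csv_text: str) -> list[tuple[dict, dict]]:
    """Return (standard_fields, metadata) for each row of a workflow CSV."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        standard = {k: v for k, v in row.items() if k in STANDARD_COLUMNS}
        # Any non-empty value in a column outside the standard set is
        # treated as a custom key-value pair.
        metadata = {k: v for k, v in row.items() if k not in STANDARD_COLUMNS and v}
        rows.append((standard, metadata))
    return rows
```

A row with an extra `Tissue` column set to `Liver` would come back with `{"Tissue": "Liver"}` as its metadata, while empty custom cells are skipped.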
Biosamples do not belong in a project. Instead, biosamples are related to a project by producing sequencing data in the form of data sets, which do belong in projects. Biosamples can have a default project, which is the default location data is written to when it is produced through Generate FASTQ and other BaseSpace Sequence Hub apps.
Biosamples can be related to many projects by creating data sets in each of them. For example, a biosample may be assigned a default project named Project A, where its FASTQ data sets are saved. You can select the biosample as an input to manually launch an app and specify a different project, Project B, as the output project. The app then creates general data sets in Project B. The biosample is now linked to both projects, but does not belong to either of them.
When you upload FASTQ files, you create a new FASTQ data set which must be linked to a new biosample and library. Our new data model uses automatic aggregation of data to exclude any failures or low quality data among the biosamples, libraries, pools, lanes, and data sets. To allow auto-app launch to work, manual uploading of FASTQ files must conform to this model. The modified file import page will allow the creation of new biosamples and libraries to support adding FASTQ files to Sequence Hub.
Biosamples themselves cannot be deleted. However, the data within biosamples can be deleted, either by deleting associated analyses or by deleting individual datasets using the FASTQ Datasets or Other Datasets tabs.
Canceling a biosample affects further work initiated to be performed on biosample data. Analyses that have not already been completed or stopped are canceled and their delivery status is changed to Do Not Deliver. These biosamples no longer appear in the available list of biosamples to be selected for app launch. Lab requeues can no longer be created for these biosamples and new biosamples cannot be created with the same name.
Yield is a measure of how much sequencing data has been produced, in units of base pairs. Yield is the most commonly used app dependency to automatically launch an analysis for a biosample. BaseSpace Sequence Hub determines how much yield was produced from each flow cell lane the biosample was sequenced on, even if the biosample was merged into a pool with other biosamples.
The Generate FASTQ app runs immediately after a run completes to convert .bcl files to .fastq files and demultiplex any indexing that occurred. If the app fails to finish, the status changes to Aborted, which causes the sequencing run status to change to Failed.
You can use the Fix Sample Sheet and Requeue option in the Run Details page to restart the Generate FASTQ analysis. This initiates a new Generate FASTQ analysis and resets the sequencing run status to Analyzing.
Analysis workflows are templates that contain pre-defined settings and QC thresholds for a specific app. These workflows can be scheduled in advance to automatically launch when minimum requirements, called dependencies, are met.
You can schedule apps to automatically launch or you can manually launch them.
To automatically launch an app, schedule an analysis workflow in a biosample workflow file. When enough yield or other dependencies are met for the analysis workflow, the analysis uses the biosample as an input to launch automatically.
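The yield dependency described above can be pictured with a small Python sketch. It assumes yield is the only dependency, and the `ScheduledAnalysis` class and the "Launched" status label are hypothetical; only the Pending status comes from this documentation.

```python
from dataclasses import dataclass


@dataclass
class ScheduledAnalysis:
    workflow: str
    required_yield_bp: int
    status: str = "Pending"


def launch_ready(analyses, qc_passed_yield_bp):
    """Launch every pending analysis whose yield dependency is now met.

    Returns the names of the workflows that were launched.
    """
    launched = []
    for analysis in analyses:
        if analysis.status == "Pending" and qc_passed_yield_bp >= analysis.required_yield_bp:
            analysis.status = "Launched"  # hypothetical status label
            launched.append(analysis.workflow)
    return launched
```

With two scheduled analyses requiring 30 Gbp and 100 Gbp, a biosample holding 80 Gbp of QC-passed data would launch only the first; the second stays Pending until more yield arrives.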
Manually launch apps through the app details page. Apps that formerly required samples now require biosamples. Select inputs from a list of biosamples that contain FASTQ data sets.
The new data model supports data aggregation for the same biosample placed in multiple flow cell lanes in multiple runs. BaseSpace Sequence Hub now automatically locates and merges different samples for you before launching an app. When a biosample is linked with multiple libraries of a similar type, placed on different lanes, and placed on different flowcell runs, we can collect all of the FASTQ files produced exclusively for the original sample and input them into the app.
BaseSpace Sequence Hub excludes data that do not meet quality thresholds, which improves the chances of success in running apps. Immediately before an app is launched with biosamples as the input, BaseSpace Sequence Hub checks the statuses of all resources that produced the FASTQ data sets, including libraries, pools, data sets, runs, and lanes. For example, if a sequencing lane had failed due to quality, the app will not include any FASTQ data sets produced from that specific lane.
You can manually override QC statuses. For example, you can set a pool to Failed, which automatically excludes all FASTQ data sets produced by the pool.
The biosample workflow upload allows you to specify an existing biosample in the analysis workflow column of the spreadsheet. As long as the biosample name given is an exact match with a biosample already owned by the uploading user, the analysis workflow is added to the existing biosample.
Lab requeues are a way to request more yield when a biosample falls short of what is required to run an app successfully. When a biosample has not produced the required yield in the specified time, it is marked as Missing Yield. You can initiate a lab requeue to request the sequencing lab to produce more data to make up for the missing amount.
When you initiate a lab requeue, you specify the checkpoint in the sample prep steps the lab should begin from. You can initiate more than one requeue at the same time, but Illumina recommends that only one lab requeue be fulfilled at a time.
Datasets are bundles of one or more files output by BaseSpace Sequence Hub apps. They can be used as input to other BaseSpace Sequence Hub apps when chaining apps together. Datasets belong in projects and are included if the project they are in is shared or transferred.
Datasets can be viewed in two ways:
In a Project view, using the FASTQs or Other Datasets tabs
In a Biosamples view, using the FASTQs or Other Datasets tabs
BaseSpace Sequence Hub offers a limited 30-day free trial for new accounts. New free trial accounts have access to the following features.
1 TB free storage—Additional storage can be purchased with promotional iCredits.
250 iCredits—Promotional iCredits are valid for 30 days and can be used to pay for additional storage, compute, and third-party app fees.
All BaseSpace Sequence Hub Apps—Access to all BaseSpace Sequence Hub apps during the trial period.
After the trial period ends, accounts that have not upgraded to Professional or Enterprise subscriptions are automatically converted to Basic accounts with limited storage and app access. To ensure uninterrupted service, contact your sales representative about upgrading to a Professional or Enterprise subscription.
Basic accounts are limited to 1 TB storage. To increase your storage, contact your sales representative about upgrading to a Professional or Enterprise Account.
All Basic accounts can perform instrument runs, download data, and delete data.
Basic accounts using more than 1 TB storage cannot move data, share data, or run any apps until storage is reduced to less than 1 TB.
Basic accounts have access to a limited set of free BaseSpace Sequence Hub apps.
BaseSpace Sequence Hub may automatically delete data if storage exceeds 1 TB for more than 30 days.
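The Basic account rules above amount to a simple permission check, sketched here in Python. The action names are hypothetical labels, 1 TB is assumed to mean 10^12 bytes, and the behavior at exactly 1 TB is an assumption because the text only covers usage above the limit.

```python
ONE_TB = 10**12  # assumption: decimal terabyte, in bytes

# Hypothetical action labels for illustration.
ALWAYS_ALLOWED = {"instrument_run", "download", "delete"}
BLOCKED_OVER_LIMIT = {"move", "share", "run_app"}


def is_allowed(action: str, used_bytes: int, limit_bytes: int = ONE_TB) -> bool:
    """Return True if a Basic account may perform the action at this usage."""
    if action in ALWAYS_ALLOWED:
        return True
    if action in BLOCKED_OVER_LIMIT:
        # Blocked only while usage is above the storage limit.
        return used_bytes <= limit_bytes
    raise ValueError(f"unknown action: {action}")
```

An account at 1.5 TB can still download and delete data, which is how it gets back under the limit, but cannot share data or run apps until it does.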
Paid subscription accounts offer expanded storage options, multi-user access, and workgroup support.
You can use the same email for a private domain account and an enterprise (domain) account; however, these are treated as separate accounts with different credentials. You can be logged in to only one account at a time.
Illumina supports customers who need help to analyze and interpret their data by offering comprehensive bioinformatics consulting support. Whether you are a researcher who is new to bioinformatics, or you need to improve your current analytical skills, Illumina can help. Illumina Bioinformatics Professional Services are delivered by professionals with deep scientific and product knowledge, who are committed to helping researchers effectively pursue their scientific goals. Illumina experts can support a range of skill levels, and both standard and specialized workflows.
More information is available on the Illumina website.
Biosample Workflow File Import—Add biosamples, library prep instructions, and analysis workflow instructions via a biosample workflow file. BaseSpace Sequence Hub creates new biosamples and schedules analyses.
After validating the manifest, BaseSpace Sequence Hub creates a new biosample and associated library prep requirements, and schedules a new analysis which includes analysis QC thresholds and a yield dependency.
Library Prep—BaseSpace Sequence Hub reports library prep status and QC.
Streaming Sequencing Data—The lab loads a flow cell and starts sequencing. Sequencing data begins to arrive in BaseSpace Sequence Hub.
Automatic Lane QC—BaseSpace Sequence Hub automatically performs lane QC on the new sequencing data based on the configured lane QC thresholds.
Automatic FASTQ Dataset Generation—BaseSpace Sequence Hub performs demultiplexing, if needed, and generates FASTQ data sets for each library by lane and index (if indexing). QC-passed files are added to the yield calculations for the biosample.
Automatic App Launch—BaseSpace Sequence Hub automatically launches pending analyses that have yield dependencies met with the new data. BaseSpace Sequence Hub aggregates all FASTQ data sets for a biosample as inputs to the analysis, excluding biosamples that are QC failed or are associated with a different project. Data sets can originate from multiple flow cells or other uploads.
Automatic Analysis QC—BaseSpace Sequence Hub automatically performs analysis QC after analysis is complete.
Mark Analyses for Delivery—BaseSpace Sequence Hub users manually review analyses and change the status of acceptable analyses to Ready For Delivery.
Deliver Data—Deliver data using your preferred delivery mechanism, and mark the data as delivered.
When a project is shared or transferred, some biosample data related to the project is shared with the collaborator.
Yield amounts include only high-quality yield and exclude failed data or data produced by an entity that was marked as failed.
BaseSpace Sequence Hub automatically tracks yield and updates the status if yield is missing so you can request more sequencing data.
Lane QC thresholds are a user setting that applies to the metrics of lanes from all runs the user owns, once the run is complete. You can set the thresholds using the API. For more information, see the developer documentation at developer.basespace.illumina.com.
Use the biosample workflow upload to schedule analyses for either new or existing biosamples. The analyses remain in Pending status until they can be launched.
The delivery status of an analysis is a manually updated, independent status used for tracking the progress of sending data to another user. You can use this to mark the data to be delivered and track the status of review and delivery.
When yield arrives in the form of another sequencing run, the lab status transitions to Sequencing. If enough yield arrives to meet the requested amount, the lab requeue status is updated to Fulfilled.
Accounts with no activity for more than 180 days are considered inactive and are subject to automatic data deletion.
Region: Toll Free, Regional
North America: +1.800.809.4566
Australia: +1.800.775.688
Austria: +43 800006249, +43 19286540
Belgium: +32 80077160, +32 34002973
China: 400.066.5835
Denmark: +45 80820183, +45 89871156
Finland: +358 800918363, +358 974790110
France: +33 805102193, +33 170770446
Germany: +49 8001014940, +49 8938035677
Hong Kong, China: 800960230
Ireland: +353 1800936608, +353 016950506
Italy: +39 800985513, +39 236003759
Japan: 0800.111.5011
Netherlands: +31 8000222493, +31 207132960
New Zealand: 0800.451.650
Norway: +47 800 16836, +47 21939693
Singapore: +1.800.579.2745
South Korea: +82 80 234 5300
Spain: +34 911899417, +34 800300143
Sweden: +46 850619671, +46 200883979
Switzerland: +41 565800000, +41 800200442
Taiwan, China: 00806651752
United Kingdom: +44 8000126019, +44 2073057197
Other countries: +44.1799.534000
When launching an analysis, BaseSpace Sequence Hub automatically aggregates all data associated with the same biosample name. For more information, see Associating Biosample Data With Projects.
To correct aggregation of data, requeue the run with a corrected biosample name. Only the run owner can requeue a run.
Edit the biosample name.
If you are using a sample sheet, use Fix Sample Sheet.
If you are using the Prep tab, change the sample name in the Prep tab.
Requeue the run.
The previous run data will be marked as failed and can be deleted to reduce storage costs.
Biosamples that were converted from samples are automatically locked if they contain data produced by two or more runs. To use the data for analyses, unlock the biosample. Before unlocking a Biosample, review the data to ensure accuracy.
Navigate to the Biosample summary page and select the FASTQ Datasets tab.
Review the data and identify any data that should be excluded.
To exclude data, do the following:
From the Datasets tab, select one or more datasets to exclude.
From the Status menu, point to Change, and then select FASTQ QC.
In the Change FASTQ QC Status dialog box, select QC Failed.
[Optional] Add a comment.
Select Save.
There are several ways to unlock Biosamples:
To unlock a biosample from the Biosample details page:
Navigate to the Biosample summary page.
A locked Biosample will display a red lock icon in the Biosample Status field.
From the Status menu, point to Unlock, and then select Biosample.
Select Continue.
To unlock multiple biosamples from a biosample list:
Navigate to the Biosample list page or the Project Biosamples tab.
Use the checkboxes to select Biosamples.
From the Status menu, point to Unlock, and then select Biosample.
Select Continue.
When unlocking multiple Biosamples, any selected Biosamples that are already unlocked will be ignored.
Biosamples that have not produced the requested amount of yield are marked as Missing Yield. Use a Lab Requeue to request more sequencing yield from the lab. After the requeue request is set to Acknowledged, incoming yield is applied to the request. For more information about yield calculations, see Yield.
A lab requeue requests additional biosample data. For information about requeueing a run with a fixed sample sheet, see Fix Sample Sheet.
The lab requeue returns the biosample to one of three stages in the lab preparation process.
Biosample — The original biosample is prepared in a new library for sequencing.
Library — An existing library is used for sequencing.
Pool — An existing pool is used for sequencing.
You can track the status or cancel the requeue from the Requeues tab of the biosample details page.
Request a Biosample Requeue if there were problems with the sequencing data from a biosample, or the existing library or pool does not have enough volume to produce the required yield. This request initiates a new library preparation from the original biosample.
Open the biosample details page for the biosample you want to requeue.
Select the Summary tab.
Select the File drop-down arrow, point to Requeue, and then select Biosample.
In the Requeue Biosample dialog, specify the requeue parameters.
Library Prep kit
Requested yield (Gbp/Mbp)
Select Save.
Open the biosample details page for the biosample you want to requeue.
Select the Libraries tab.
Select the checkbox for the library you want to requeue. You can only requeue one library at a time.
Select the File drop-down arrow, point to Lab Requeue, and then select Library.
In the Requeue Library dialog, specify the requeue parameters.
Requested yield
Select Save.
Open the biosample details page for the biosample you want to requeue.
Select the Libraries tab.
Select the pool name for the desired library. If more than one pool exists for the library, select Pools, and then select the desired pool from the list.
Select the File drop-down arrow, point to Lab Requeue, and then select Pool.
In the Requeue Pool dialog, specify the requeue parameters.
Required pool yield
Unit of measurement
Select Save.
Pending Lab Requeues are listed on the Requeues tab of the Biosample Details page. When you have determined that your lab can fulfill the request, change the status to Acknowledged.
For information about acknowledging requeues using the BaseSpace Sequence Hub API, see the developer documentation at developer.basespace.illumina.com.
If the requeue is not acknowledged, incoming yield is applied to the original prep request.
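The effect of acknowledging a requeue on where incoming yield is counted can be sketched as follows. This Python is a simplified model with hypothetical names (`LabRequeue`, `apply_incoming_yield`); it tracks a single requeue and ignores the Sequencing and Fulfilled transitions.

```python
from dataclasses import dataclass


@dataclass
class LabRequeue:
    status: str  # "Pending" or "Acknowledged"
    received_bp: int = 0


def apply_incoming_yield(requeue: LabRequeue, original_received_bp: int, incoming_bp: int) -> int:
    """Route incoming yield and return the original prep request's new total.

    Acknowledged requeue: the yield is counted against the requeue.
    Otherwise: the yield is counted against the original prep request.
    """
    if requeue.status == "Acknowledged":
        requeue.received_bp += incoming_bp
        return original_received_bp
    return original_received_bp + incoming_bp
```

While the requeue is still Pending, new yield raises the original request's total; once it is Acknowledged, the same call raises the requeue's total instead.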
You can cancel a pending requeue. You might do this if the requeue is not needed, or there were errors in the requeue.
Open the biosample details page for the requeued biosample.
Select the Requeues tab.
To cancel a library, do the following:
Select a library.
Select the Status drop-down arrow, point to Cancel, and then select Library Requeue.
To cancel a pool, do the following:
Select a pool.
Select the Status drop-down arrow, point to Cancel, and then select Pool Requeue.
The following examples show possible yield scenarios:
Example 1: Required Yield Met — The lab produces the required yield specified in the biosample workflow file.
Example 2: Missing Yield — The lab does not produce all of the required yield. A lab requeue is initiated and fulfilled.
Example 3: Missing Yield After Lab Requeue — The lab does not produce the required yield from the initial request or the lab requeue. A second lab requeue is initiated and fulfilled.
Requeue Sample Sheet
The requeue sample sheet feature loads the original run sample sheet into a text editor, allowing for review and/or editing before launching the requeue. Previously this feature only supported V1 sample sheets and now supports both V1 and V2 formats. For more information, see Fix Sample Sheet.
Requeue Planned Run
The requeue planned run feature uses the Run Planning tool for reviewing and/or editing V2 sample sheets. This feature supports both starting from the original sample sheet as well as uploading a new V2 sample sheet, which can be useful when the original sample sheet is the V1 format. Previously this feature was restricted to runs from certain instrument types and storage configurations and is now available for all runs. For more information, see Requeue a Planned Run.
Supported Requeue Applications
The V1 sample sheet format supports launching the FASTQ Generation app.
The V2 sample sheet format supports launching BCL Convert and supported Illumina DRAGEN secondary analysis applications:
Illumina DRAGEN Apps version 3 or higher.
DRAGEN TruSight Oncology 500 Apps version 2.1 or higher.
To simplify workflows in BaseSpace Sequence Hub, we are consolidating New and Classic mode interfaces. With this change, Biosamples and Samples will be visible in a single view with no need to toggle between modes. Instead, choose a preferred input type when launching applications.
Around the middle of 2022, we will turn on an automated process in BaseSpace Sequence Hub that will start zipping the BCL and image files associated with old runs. Any run that hasn’t been modified for more than 90 days will be eligible for zipping. Metrics, log files, samplesheets, and other files in the run folder will be left untouched, and the ability to review run metrics remains unchanged.
For more information, see Automated Run Zipping.
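The eligibility rule for zipping reduces to a date comparison, sketched here in Python. The function name is hypothetical, and the sketch assumes "more than 90 days" is measured from the run's last modification time.

```python
from datetime import datetime, timedelta

ZIP_AFTER = timedelta(days=90)


def eligible_for_zipping(last_modified: datetime, now: datetime) -> bool:
    """A run qualifies once it has gone unmodified for more than 90 days."""
    return now - last_modified > ZIP_AFTER
```

A run last modified on 1 January would be eligible on 1 June of the same year, while a run modified on 1 May would not.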
BaseSpace Sequence Hub now includes archive storage. This feature allows accounts configured for New application mode to move runs and datasets to long-term storage, with significantly lower costs than standard storage.
Data transfer to and from archival storage takes several days to complete and incurs additional fees per TB transferred. Restoring data that have been archived for a short period of time can result in higher overall storage costs, so archiving is not recommended for data that you intend to use soon.
For more information, see Archival Storage.
The Demo Data page provides a list of run and project public data sets that you can add to your account.
Select the Demo Data tab.
Select a data set name to view details.
Select Import to import the run or project.
Select Accept to confirm the import.
If you do not need all of the files in a data set, you can download files individually.
Navigate to the analysis file you want to download.
Select the file.
Select Download.
If you are downloading a small text file, the download dialog displays a preview of the contents.
BaseSpace Sequence Hub allows you to download data as a package, individually, or as a group of FASTQ files.
The BaseSpace Sequence Hub Downloader supports downloading files through a proxy server and automatically inherits appropriate settings from the host system. The proxy server must be configured to support the SOCKS4/5 protocol for TCP connections. Contact your network administrator for help with establishing these settings.
The BaseSpace Sequence Hub Downloader has been updated and renamed to BaseSpace Sequence Hub Downloader. Make sure that you have the correct version of the downloader before proceeding.
Each sequencing run produces log files, instrument health data, run metrics, base call information (*.bcl files), and other data. BaseSpace Sequence Hub demultiplexes base call information to create the FASTQ files used in secondary analysis.
Biosamples represent the source biological sample being sequenced. They are associated with data aggregated from multiple sequencing runs according to the sample name provided in the sample sheet of each run.
Samples represent a set of FASTQ files from a single sequencing run, according to the sample name provided in the sample sheet.
Libraries are produced when a biosample is prepped with a library prep kit.
Pools are aliquots of one or more libraries, pooled together to be placed in a flow cell lane.
Datasets are sets of files produced by a BaseSpace application. Some views will refer to them as "Other Datasets" to distinguish them from Datasets containing FASTQ files. These were formerly referred to as "App Results."
FASTQ Datasets are sets of FASTQ files produced by the FASTQ-Generation or BclConvert apps. Given their prominent place in the BaseSpace data model, they're often treated as a distinct type from "Other Datasets," which aids in data management tasks like filtering and sorting.
Projects are the containers for datasets and dataset files, which can include FASTQ, BAM, and VCF files. Projects can be associated with runs, analyses, and other entities in BaseSpace Sequence Hub. If a given project contains FASTQ files, it will also be associated with one or more Samples & Biosamples.
BaseSpace apps that analyze FASTQ files can accept either Biosamples or Samples in the input form, and the system will use the proper set of FASTQ files in each situation.
BaseSpace users can select their preferred input at the top of the form, and BaseSpace will load the correct controls into the form.
Runs and Projects are compatible with both Biosamples and Samples to offer maximum flexibility to all types of users.
A run's Biosamples tab will list all of the Biosamples that this run contributed yield to.
A project's Biosamples tab will list all of the Biosamples associated with FASTQ Datasets that live in this project.
Because BaseSpace Sequence Hub tracks data for the biosample, you can easily aggregate data from a biosample that has been sequenced as part of multiple libraries or pools.
BaseSpace still has full support for classic data types like Samples (see term definitions above for more info). You can continue to use Samples and the associated set of features if that model is a good fit for your lab's needs, like launching an app with input FASTQ files from a single run.
A run's Samples tab will continue to list the samples containing FASTQ files produced from that run's sequencing data.
A project's Samples tab will continue to list the samples associated with FASTQ files that live in that project.
When launching apps using biosamples as inputs, BaseSpace Sequence Hub automatically aggregates all of the valid (QC Passed) biosample FASTQ data associated with the same default biosamples. You can control which data are used in analyses by setting the QC status of a resource as QC Passed or QC Failed.
Aggregated FASTQ data can be produced from multiple libraries, lanes, or flow cells, and can contain data with different read lengths from the same biosample.
To prevent unintended aggregation, biosamples are locked if the biosample was converted from a sample and has data produced by two or more runs. The data cannot be used in analysis until the biosample has been reviewed and unlocked; however, you can use samples to launch analysis without aggregation. For information about unlocking biosamples to make them available for analysis, see Unlock Biosamples.
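The locking condition can be stated as a one-line predicate. The Python below is a sketch with a hypothetical function name, assuming each data set carries the identifier of the run that produced it.

```python
def is_locked_on_conversion(converted_from_sample: bool, run_ids) -> bool:
    """Lock a biosample that was converted from a sample and spans 2+ runs."""
    return converted_from_sample and len(set(run_ids)) >= 2
```

A converted biosample with data from runs `["run1", "run2"]` is locked; one whose data all comes from `"run1"`, or one that was never a sample, is not.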
Samples contain data from a single sequencing run only, and therefore do not support data aggregation.
When BaseSpace Sequence Hub collects data for an app launch, it automatically excludes QC-failed lanes, libraries, pools, and any downstream data they produced. For example, if you fail a flow cell lane, all FASTQ data sets produced from that lane are excluded when aggregating data for the biosamples and libraries put on those lanes. If you fail a FASTQ dataset, only that FASTQ dataset is excluded.
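The exclusion behavior described above can be illustrated with a short Python filter. The `FastqDataset` fields and the `aggregate` function are hypothetical; the sketch only shows how a failed lane, library, or pool removes every data set downstream of it, while a failed data set removes only itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FastqDataset:
    name: str
    lane: str
    library: str
    pool: Optional[str] = None
    qc_failed: bool = False


def aggregate(datasets, failed_lanes=frozenset(), failed_libraries=frozenset(),
              failed_pools=frozenset()):
    """Keep only data sets whose own QC and upstream resources all pass."""
    kept = []
    for ds in datasets:
        upstream_failed = (
            ds.lane in failed_lanes
            or ds.library in failed_libraries
            or (ds.pool is not None and ds.pool in failed_pools)
        )
        if not ds.qc_failed and not upstream_failed:
            kept.append(ds)
    return kept
```

Failing lane `L2` drops every data set sequenced on that lane, whereas setting `qc_failed=True` on one data set from lane `L1` leaves the other `L1` data sets in the aggregation.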
If a FASTQ dataset has been copied, BaseSpace Sequence Hub uses the original FASTQ dataset, or the most recent copy if the original is not available. To use a different copy, mark the other copies as QC Failed before starting the analysis.
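The copy-selection rule above can be sketched in Python. This is an illustration only, not BaseSpace Sequence Hub code: the `DatasetCopy` record, its fields, and `pick_copy` are hypothetical names, and the sketch assumes that a QC Failed original counts as "not available".

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class DatasetCopy:
    dataset_id: str          # hypothetical identifier
    is_original: bool
    created: datetime
    qc_status: str = "QC Passed"


def pick_copy(copies) -> Optional[DatasetCopy]:
    """Prefer the original; otherwise fall back to the most recent usable copy."""
    usable = [c for c in copies if c.qc_status != "QC Failed"]
    for c in usable:
        if c.is_original:
            return c
    # No usable original: take the newest remaining copy, or None if nothing is left.
    return max(usable, key=lambda c: c.created, default=None)
```

Marking a copy as QC Failed removes it from `usable`, which mirrors how failing the other copies steers the analysis toward the copy you want.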
The following resources can be excluded from data aggregation:
Lanes—Fail lanes using Automatic Lane QC, BaseSpace Sequence Hub API, or manually in BaseSpace Sequence Hub.
Libraries—Fail libraries using the BaseSpace Sequence Hub API.
Pools—Fail pools using the BaseSpace Sequence Hub API.
FASTQ Datasets—Fail FASTQ data sets using the BaseSpace Sequence Hub API, or manually in BaseSpace Sequence Hub.
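The exclusion rules above can be sketched as a simple filter. This is a minimal illustration under stated assumptions: `FastqDataset`, its field names, and `aggregate` are hypothetical, and QC status is modeled as a plain string rather than anything returned by the actual API.

```python
from dataclasses import dataclass

QC_FAILED = "QC Failed"


@dataclass(frozen=True)
class FastqDataset:
    name: str
    lane: str
    library: str
    pool: str
    qc_status: str = "QC Passed"


def aggregate(datasets, failed_lanes=(), failed_libraries=(), failed_pools=()):
    """Return the datasets that remain after removing QC-failed resources."""
    kept = []
    for ds in datasets:
        if ds.qc_status == QC_FAILED:
            continue  # failing a dataset excludes only that dataset
        if ds.lane in failed_lanes or ds.library in failed_libraries or ds.pool in failed_pools:
            continue  # failing an upstream resource excludes everything produced from it
        kept.append(ds)
    return kept


datasets = [
    FastqDataset("ds1", lane="L1", library="libA", pool="pool1"),
    FastqDataset("ds2", lane="L2", library="libA", pool="pool1"),
    FastqDataset("ds3", lane="L3", library="libB", pool="pool2", qc_status=QC_FAILED),
]
selected = aggregate(datasets, failed_lanes={"L2"})
print([ds.name for ds in selected])
```

Here failing lane `L2` drops `ds2`, while `ds3` is dropped because the dataset itself is QC Failed.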
When using apps that have not been updated to use biosamples or data sets as inputs, BaseSpace Sequence Hub automatically converts the FASTQ data sets into samples before launching the app.
If you specify a library prep kit when selecting a biosample for analysis, the analysis launches using only FASTQ datasets from libraries of the specified prep kit. In the following example, Prep Kit B is selected as input and the FASTQ files from Prep Kit A are excluded.
Yield is the amount of sequencing data produced for a particular biosample. When you add a new prep request with a biosample workflow, you specify the amount of yield you require before launching the analysis. BaseSpace Sequence Hub tracks the QC-passed data sets it receives and adds them to the cumulative yield total.
You can use yield information to easily identify the biosamples that do not have enough data and require a lab requeue.
In BaseSpace Sequence Hub, yield is tracked by library type, so biosamples with multiple library kits will have multiple yield calculations.
Yield values appear in the Summary tab of the biosample details page.
A biosample manifest is uploaded with a prep request required yield of 105 Gbp.
80 Gbp of QC-passed FASTQ data sets are received. The actual yield is increased and the amount still pending is changed to 25 Gbp to reflect the remaining balance. The expected yield does not change because it represents the total amount expected from the lab and includes actual yield and pending yield.
The remaining 25 Gbp of FASTQ data sets are received. Actual yield is updated with the received amount. The full amount expected from the lab has been received, so there is no more pending yield.
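The arithmetic in this example can be checked with a few lines of Python. The function name `pending_yield` is hypothetical; the sketch assumes only that pending yield is the expected yield minus the actual yield, as the steps above describe.

```python
def pending_yield(expected_gbp, actual_gbp):
    """Pending yield is what is still expected from the lab, never below zero."""
    return max(expected_gbp - actual_gbp, 0)


required = expected = 105  # Gbp, from the prep request
actual = 0
for received in (80, 25):  # the two deliveries in the example
    actual += received
    print(f"received {received} Gbp: actual={actual}, pending={pending_yield(expected, actual)}")
```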
You can configure BaseSpace Sequence Hub to automatically apply a QC status to the lanes of sequencing runs as they complete. BaseSpace Sequence Hub compares the lane metrics to your predefined thresholds and assigns a lane status as QC Passed or QC Failed.
When a lane is set to QC failed, its data are excluded from downstream data analyses and total yield calculations.
Lane QC thresholds are configured per user (or per workgroup) and apply to all lanes generated for sequencing runs owned by the user.
For information about using the BaseSpace Sequence Hub API or CLI to configure lane QC thresholds, see the developer documentation at .
Until automated QC settings have been configured, all lanes are set to QC Passed by default.
You can review and manually change the lane QC status after it has been applied. For information about manually applying QC status, see .
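The automatic lane QC comparison described in this section can be sketched as follows. The metric names and threshold values are hypothetical placeholders (the actual configurable metrics are defined in the developer documentation), and the sketch assumes that every threshold is a minimum and that a missing metric fails the check.

```python
def lane_qc_status(metrics, min_thresholds):
    """Return 'QC Passed' when every configured minimum is met, else 'QC Failed'."""
    for name, minimum in min_thresholds.items():
        if metrics.get(name, 0) < minimum:
            return "QC Failed"
    # With no thresholds configured, every lane passes by default.
    return "QC Passed"


# Hypothetical metric names and values, for illustration only.
thresholds = {"percent_q30": 80.0, "yield_gbp": 30.0}
lanes = {
    1: {"percent_q30": 91.2, "yield_gbp": 42.0},
    2: {"percent_q30": 74.5, "yield_gbp": 40.1},
}
statuses = {lane: lane_qc_status(m, thresholds) for lane, m in lanes.items()}
print(statuses)
```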
BaseSpace Sequence Hub tracks the status of biosample data from library prep through sequencing and delivery of analysis files.
For information about changing statuses, see and .
For information about sequencing yield status, see .
The delivery status of the analysis. Manually update statuses to track the progress of analysis deliverables. Possible values are:
On Hold—The analysis is not ready for delivery.
Do Not Deliver—The analysis files should not be delivered. You can use this status to confirm that a QC Failed analysis has been reviewed and should not be delivered, or to mark R&D and test analyses that are for internal use only.
Deliver—The analysis files can be delivered.
Ready For Delivery—The analysis files have been reviewed and are ready for delivery.
Delivery in Progress—The analysis files are currently being transferred.
Delivery Failed—Delivery of the analysis was attempted and failed.
Delivered—The analysis files have been delivered.
The status of the analysis through execution and QC. Possible values are:
Pending—The analysis is waiting for dependencies (eg, yield or completion of another analysis) to be met.
Queued for Analysis—The analysis workflow has met its dependencies (eg, yield or completion of another analysis) and is queued for analysis. Analyses with this status cannot be deleted.
Awaiting Authorization—The analysis has been created but has not been accepted.
Initializing—The app has launched and is awaiting execution.
Running—The analysis is running. Analyses with this status cannot be deleted.
Timed Out—The analysis did not complete within 48 hours and was terminated.
Needs Attention—There is a problem with the run sample sheet.
Canceled—A user manually canceled the analysis before its dependencies were met.
Aborting—The analysis is in the process of being stopped. Analyses with this status cannot be deleted.
Aborted—The analysis has been stopped.
Complete—The analysis has finished and is ready for use.
QC Passed—All analysis metrics met established thresholds or were reviewed and manually passed.
QC Failed—One or more analysis metrics did not meet established thresholds or were reviewed and manually failed.
Discarded—The QC Failed analysis has been reviewed and can be ignored.
Higher-level status of the biosample, from library preparation through sequencing. If more than one library has been prepared from a biosample, the status refers to the most recent library being processed. Possible values are:
New—The default state for a biosample created by LIMS systems.
Active—The biosample is being worked on in the lab. The default state for a biosample created by non-LIMS systems.
Done—The analysis files have been delivered. This status is set manually.
Canceled—The biosample is canceled and will no longer be used or processed.
QC Failed—The biosample has failed Automatic QC in a LIMS system or was manually failed based on metrics.
On Hold—The biosample needs attention or further review. This status is set manually.
The status of lab tasks for the biosample, from library preparation through sequencing. If more than one library has been prepared from a biosample, the status refers to the most recent library being processed. Statuses are automatically updated by BaseSpace Sequence Hub. Possible values are:
New—The biosample has been added to BaseSpace Sequence Hub. This is the default status.
Visual QC Failed—The container of biosamples failed visual inspection quality control check and cannot continue.
Visual QC Passed—The container of biosamples passed visual inspection quality control check.
Quant QC Failed—The biosample did not meet concentration threshold at the quantification step.
Quant QC Passed—The biosample met concentration threshold at the quantification step.
Sequencing—The biosample is being sequenced by an instrument.
Requeued—A requeue for the biosample or library has been initiated and acknowledged by the lab.
Missing Yield—BaseSpace Sequence Hub has not received the required yield before the process has timed out.
Complete—The required yield for the biosample has been met.
The status of the generated or uploaded FASTQ dataset.
Default status for FASTQ data. Datasets in the default state are included in data aggregation for analysis.
QC Passed—The FASTQ data passed QC.
QC Failed—The FASTQ data failed QC.
The status of the requeue. Possible values are:
Pending—The requeue request has been initiated and is awaiting acknowledgment by a lab or LIMS.
Acknowledged—The requeue request has been acknowledged manually by a lab or automatically by a LIMS.
Canceled—The requeue has been stopped manually by a user.
Expired—The requeue has been acknowledged but has not met the minimum yield required within the configured amount of time.
Fulfilled—The requeue has met the requested yield.
Failed—The requeue request has been rejected because there is not enough volume to perform the requeue.
The status of the individual lane. Lane status can be changed by Automatic Lane QC, manually, or via API. Possible values are:
QC Passed—The lane data passed QC.
QC Failed—The lane data failed QC.
The overall status of the lanes on a run. Possible values are:
Initial—The default status for a lane until QC has been performed or is manually overwritten.
Lane Pending QC—Automatic lane QC is being processed against the run thresholds (set via an API).
Lane QC Failed—The lane has failed the automatic QC thresholds set by the user (via an API) or has been manually overwritten.
QC Passed—All lanes passed QC.
Rehybed—The run has been stopped because of a problem with the flow cell. After the flow cell is rehybed, it is used in a new run.
The status of the library. Library status is updated via the BaseSpace Sequence Hub API. Possible values are:
Active—The default status of a library.
Canceled—The library has been marked as canceled and will no longer be worked on.
Failed—The library has failed QC metrics.
Consumed—The library has exhausted its volume and can no longer be requeued.
The status of the pool. Pool status is updated via the BaseSpace Sequence Hub API. Possible values are:
QC Passed—The data from the pool has passed QC. This is the default status.
QC Failed—The data from the pool did not pass QC.
Failed—The pool failed due to contamination or other physical reason.
Consumed—The pool contents have been used and there is not enough volume for a requeue.
Requeued—A requeue has been initiated for the pool.
Active—A library kit and yield are specified in an imported biosample workflow file.
Missing Yield—BaseSpace Sequence Hub has not received the required yield before the yield timeout.
Canceled—The request to sequence the library has been canceled.
The status of the sequencing run, from planning on the Prep Tab through sequencing, file upload, and FASTQ analysis. Statuses are automatically updated from the instrument.
Planning—The run is being planned in the Prep Tab.
Ready—The run prepared on the Prep Tab is ready to be sequenced.
Unknown—The status of the run is unknown.
Running—The instrument is sequencing the run.
Paused—A user paused the run on the instrument control software.
Stopped—The instrument has been put into a Stopped state through the instrument control software.
Uploading—The instrument completed sequencing and is uploading files.
Failed Upload—The instrument failed to upload files to the BaseSpace Sequence Hub.
Pending Analysis—The file has uploaded completely and is waiting for the Generate FASTQ analysis to begin.
Analyzing—The Generate FASTQ analysis is running.
Complete—The sequencing run and subsequent Generate FASTQ analysis completed successfully.
Failed—Either the sequencing run was failed using the instrument control software or Generate FASTQ failed to complete.
Rehybing—A flow cell has been sent back to a cBot for rehybing. When the new run is complete, the rehybing status is changed to Failed.
Needs Attention—There is an issue with the sample sheet associated with the run. The run can be requeued after the sample sheet is fixed.
Timed Out—There is an issue with the sample sheet associated with the run. The run can be requeued after the sample sheet is fixed.
In this example, the full amount of required yield is not received before the yield timeout, requiring a lab requeue to request additional sequencing data.
A biosample manifest is imported with a prep request required yield of 105 Gbp.
80 Gbp of QC-passed FASTQ data sets are received. The actual yield is increased and the amount still pending is changed to 25 Gbp to reflect the remaining balance. The expected yield does not change because it represents the total amount expected from the lab and includes actual yield and pending yield.
Yield timeout is reached with no additional yield received.
The default yield timeout is four days; however, you can configure a different timeout period using the BaseSpace Sequence Hub API. For more information, see the developer documentation at .
When the yield timeout is reached, BaseSpace Sequence Hub assumes that the lab is unable to produce the requested quantity. The amount pending is classified as missing, and the expected yield is reduced to the actual yield received, indicating that no additional data are expected from the lab.
A lab requeue for more yield is initiated. In this example, the user requests more than the missing amount.
Expected yield updates. The updated amount reflects the quantity received from the initial request and the additional amount requested from the lab requeue. The required yield does not change because that is the amount of data required for analysis.
Missing yield is reset to zero.
Requested yield updates on the Requeues tab.
The lab produces 47 Gbp of QC-passed FASTQ data sets. Actual and expected yield are updated to reflect the total amount received from the lab.
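The bookkeeping in this example can be replayed with a small ledger. This is a sketch under stated assumptions, not the product's implementation: the `YieldLedger` class and its method names are hypothetical, and two rules are inferred from the example rather than stated outright (data received beyond the pending amount raises the expected yield, and a requeue reduces missing yield by the requested amount, floored at zero).

```python
class YieldLedger:
    """Tracks yield categories for one prep request, in Gbp (illustrative only)."""

    def __init__(self, required):
        self.required = required
        self.expected = required   # initially the full required amount is expected
        self.actual = 0
        self.pending = required
        self.missing = 0
        self.requested = 0

    def receive(self, gbp):
        covered = min(gbp, self.pending)
        self.actual += gbp
        self.pending -= covered
        self.expected += gbp - covered  # assumption: extra data raises expected yield

    def timeout(self):
        # Pending data that never arrived is reclassified as missing.
        self.missing += self.pending
        self.expected -= self.pending
        self.pending = 0

    def requeue(self, gbp):
        self.requested += gbp
        self.pending += gbp
        self.expected += gbp
        self.missing = max(self.missing - gbp, 0)  # assumption: floored at zero

    def totals(self):
        return (self.required, self.expected, self.actual,
                self.pending, self.missing, self.requested)


ledger = YieldLedger(required=105)
ledger.receive(80)   # partial yield received
ledger.timeout()     # yield timeout reached
ledger.requeue(30)   # lab requeue for more than the missing 25 Gbp
ledger.receive(47)   # lab delivers 47 Gbp
print(ledger.totals())  # (105, 127, 127, 0, 0, 30)
```

The tuple order is Required, Expected, Actual, Pending, Missing, Requested, so the printed totals can be compared directly with the Total row of the corresponding yield table.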
In this example, the original request and lab requeue do not produce the required yield before the yield timeout, requiring a second lab requeue.
A biosample manifest is imported with a prep request required yield of 105 Gbp.
The lab produces 80 Gbp of QC-passed FASTQ data sets. Actual yield is increased and the amount still pending is changed to 25 Gbp to reflect the remaining balance. The expected yield does not change because it represents the total amount expected from the lab and includes actual yield and pending yield.
Yield timeout is reached with no additional yield received.
The default yield timeout is four days; however, you can configure a different timeout period using the BaseSpace Sequence Hub API. For more information, see the developer documentation at .
When the yield timeout is reached, BaseSpace Sequence Hub assumes that the lab is unable to produce the requested quantity. The amount pending is classified as missing, and the expected yield is reduced to the actual yield received, indicating that no additional data are expected from the lab.
A lab requeue for more yield is initiated. In this example, the user requested more than the missing amount.
Expected yield updates. The updated amount reflects the quantity received from the initial request and the additional amount requested from the lab requeue. The required yield does not change because that is the amount of data required for analysis.
Missing yield is reset to zero.
Requested yield updates on the Requeues tab.
The lab produces 20 Gbp of QC-passed FASTQ data sets. Actual yield is increased and the amount still pending is changed to 10 Gbp to reflect the remaining balance. The expected yield does not change because it represents the total amount expected from the lab and includes actual yield and pending yield.
The lab requeue expires after the timeout period. As with the yield timeout, the amount pending is classified as missing and the expected yield is reduced to the actual yield received, indicating that no additional data are expected from the lab.
Another lab requeue is initiated for 10 Gbp. The new request is listed on the biosample Requeues tab.
The lab produces 5 Gbp of QC-passed FASTQ data. The lab requeue is not completely fulfilled, but the required yield dependency is met and analyses can automatically launch if they are not waiting for other dependencies.
The FASTQ file is a text format file used to represent sequences. Each record has four lines of data: an identifier (read descriptor), the sequence, +, and the quality scores. For a detailed description of the FASTQ format, see .
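The four-line record structure can be illustrated with a short reader. This is a minimal sketch, not a full FASTQ parser: `read_fastq` and `open_fastq` are hypothetical helper names, and the sketch assumes each record occupies exactly four lines with no line wrapping.

```python
import gzip
import io


def open_fastq(path):
    """Open a gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt")


def read_fastq(handle):
    """Yield (identifier, sequence, quality) tuples from a text-mode FASTQ handle."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"base call and quality score counts differ in {header!r}")
        yield header[1:], sequence, quality


sample = (
    "@M00900:62:000000000-A2CYG:1:1101:18016:2491 1:N:0:13\n"
    "GATTACA\n"
    "+\n"
    "IIIIIII\n"
)
records = list(read_fastq(io.StringIO(sample)))
print(records)
```

For a file on disk, pass the handle returned by `open_fastq` to `read_fastq` instead of the in-memory sample.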
Make sure the FASTQ file adheres to the following upload requirements:
FASTQ files are generated on Illumina instruments and saved in gzip format
The name of the FASTQ files conforms to the following convention: SampleName_SampleNumber_Lane_Read_FlowCellIndex.fastq.gz
Examples: SampleName_S1_L001_R1_001.fastq.gz
SampleName_S1_L001_R2_001.fastq.gz
The read descriptor in the FASTQ files conforms to the following convention: @Instrument:RunID:FlowCellID:Lane:Tile:X:Y ReadNum:FilterFlag:0:SampleNumber:
Examples: Read 1 descriptor: @M00900:62:000000000-A2CYG:1:1101:18016:2491 1:N:0:13
The corresponding Read 2 descriptor has a 2 in the ReadNum field: @M00900:62:000000000-A2CYG:1:1101:18016:2491 2:N:0:13
Quality considerations:
The number of base calls for each read equals the number of quality scores.
The number of entries for Read 1 equals the number of entries for Read 2.
The uploader determines whether files are paired-end based on matching file names in which the only difference is the ReadNum.
For paired-end reads, read 1 and read 2 files need to be uploaded together or can be combined later.
Each read has passed filter.
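The naming and read descriptor conventions above can be checked with regular expressions. The patterns below are a sketch inferred from the examples in this section, not an official specification: the digit widths, the optional trailing colon, and the function names are assumptions.

```python
import re

# SampleName_SampleNumber_Lane_Read_FlowCellIndex.fastq.gz, e.g. SampleName_S1_L001_R1_001.fastq.gz
FILENAME_RE = re.compile(
    r"^(?P<sample>.+)_S(?P<number>\d+)_L(?P<lane>\d{3})_R(?P<read>[12])_(?P<chunk>\d{3})\.fastq\.gz$"
)

# @Instrument:RunID:FlowCellID:Lane:Tile:X:Y ReadNum:FilterFlag:0:SampleNumber
DESCRIPTOR_RE = re.compile(
    r"^@(?P<instrument>[^:\s]+):(?P<run>\d+):(?P<flowcell>[^:\s]+):"
    r"(?P<lane>\d+):(?P<tile>\d+):(?P<x>\d+):(?P<y>\d+) "
    r"(?P<read>[12]):(?P<filter>[YN]):(?P<control>\d+):(?P<sample>[^:\s]+):?$"
)


def parse_filename(name):
    """Return the filename fields as a dict, or None if the name does not conform."""
    m = FILENAME_RE.match(name)
    return m.groupdict() if m else None


def parse_descriptor(line):
    """Return the descriptor fields as a dict, or None if the line does not conform."""
    m = DESCRIPTOR_RE.match(line)
    return m.groupdict() if m else None


name_fields = parse_filename("SampleName_S1_L001_R2_001.fastq.gz")
read_fields = parse_descriptor("@M00900:62:000000000-A2CYG:1:1101:18016:2491 1:N:0:13")
# A FilterFlag of N means the read was not filtered out, that is, it passed filter.
passed_filter = read_fields["filter"] == "N"
print(name_fields["read"], read_fields["flowcell"], passed_filter)
```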
In some cases, you might need to override a QC status that was automatically set by BaseSpace Sequence Hub. When a resource is set to QC failed, its data and any data produced from it are excluded from downstream data analyses and total yield calculations. For more information about excluding data from analyses, see .
Changing the QC status of a resource does not change the QC status of downstream resources produced from it.
You can fail Lanes, Libraries, Pools, and FASTQ Datasets using the API. For more information, see the developer documentation at .
For information about automatic lane QC, see .
Open the Run & Lane Metrics page for the desired run.
Select the Runs tab.
From the Runs list, select a run.
From the Run Summary page, select the Run & Lane Metrics icon.
Select one or more lanes that have the same QC status.
Select the Status drop-down arrow, point to Change, and then select Run Lane QC.
In the Change Lane Status dialog, review the selected lanes and do the following:
Select the Change Status to drop-down arrow and then select a new status, QC Passed or QC Failed.
[Optional] Add a comment.
You can change the status of a generated or uploaded FASTQ dataset.
Open the Output Files tab for the desired biosample.
Select Biosamples.
From the Biosamples master list, select a biosample.
From the Biosample Details page, select the Output Files tab.
Select one or more FASTQ data sets that have the same QC status.
Select the Status drop-down arrow, point to Change, and then select FASTQ QC.
In the Change FASTQ QC Status dialog, review the selected FASTQ data sets and do the following:
Select the Change Status to drop-down arrow and then select a new status, QC Passed or QC Failed.
[Optional] Add a comment.
You can change the status of analyses that have completed. Changing the analysis QC does not affect the QC of the FASTQ data sets.
Select Analyses to open the Analyses master list.
Select one or more analysis workflows that have the same QC status.
Select the Status drop-down arrow, point to Change, and then select Analysis QC.
In the Change Analysis Status dialog, review the selected analyses and do the following:
Select the Change Status to drop-down arrow and then select a new status.
QC Passed—All analysis data sets passed QC.
QC Failed—One or more analysis data sets failed QC.
Discarded—The QC-failed analysis has been reviewed and can be ignored.
[Optional] Add a comment.
BaseSpace Sequence Hub stores data for sequencing runs, samples, projects, and analyses. You can manage the data in your account and save space by deleting files that you no longer need.
The process to delete data requires two steps: move the data to trash, and then empty the trash. You can restore files that have been moved to trash.
Moving items to the trash, especially items with large amounts of data (eg, projects), can take time. The 'Trash Status' column indicates the percentage of data that has been moved into the trash. Once the status is 'Complete', the item can then be deleted by emptying the trash, or restored back to your account.
The site offers an API to delete base call and related run files (FASTQ and analysis files are not deleted). Use this option to keep a record of a run and its metadata, while reducing storage costs. Contact your IT department or system administrator for assistance.
Select the Runs tab, and then one or more runs to delete.
Select Move to Trash.
Select the run file types to delete.
Data files only—Deletes the data folder but keeps SAV data.
Data files and the run record—Deletes the data folder and any links to projects but keeps SAV data.
All run related files—Deletes all files, data, and project links associated with the run.
Verify the deletion, and then select Confirm.
Do one of the following on the Projects tab:
Select a project to delete.
Open a project to delete.
Select Move to Trash.
Verify the deletion, and then select Confirm.
Select the Projects tab.
Select the project containing the sample to delete.
Select the checkboxes for the samples to delete.
Select Move to Trash.
Verify the deletion, and then select Confirm.
Deleting a sample will also delete corresponding FASTQ Datasets from the related biosample.
Select the Projects tab.
Select the project containing the analysis to delete.
Select Analyses, and then select the checkboxes of the analyses to delete.
Select Move to Trash.
Verify the deletion, and then select Confirm.
Analyses with the following statuses are in process and cannot be deleted: Running, Aborting, Payment Complete, or Queued for Analysis.
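If you script against a list of analyses, the rule above amounts to a simple membership check. The status strings come from the sentence above; the function name and the sample data are hypothetical.

```python
# Statuses that mark an analysis as in process, taken from the rule above.
IN_PROCESS_STATUSES = frozenset(
    {"Running", "Aborting", "Payment Complete", "Queued for Analysis"}
)


def can_move_to_trash(status):
    """Return True when an analysis with this status can be deleted."""
    return status not in IN_PROCESS_STATUSES


analyses = {"run-qc": "Complete", "align-1": "Running", "align-2": "Aborted"}
deletable = sorted(name for name, status in analyses.items() if can_move_to_trash(status))
print(deletable)
```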
Select the Projects or Biosamples tab.
Select the project or biosample containing the datasets to delete.
Select either the FASTQs or Other Datasets tab.
Select the checkboxes for the datasets to delete.
Select Move to Trash.
Select the dataset file types to delete.
Data Files Only deletes the files but keeps the dataset record.
All Files deletes the files and removes the record of the dataset.
Verify the deletion, and then select Confirm.
Select the Trash icon.
In the Trash window, select the files you want to restore.
Select Restore.
Select the Trash icon.
Select Empty Trash.
Type | Description |
Required | The target amount of sequencing data required for analysis. This value is set in the Required Yield Gbp column of the biosample workflow file. |
Expected | The total amount of data you can expect to receive for the library prep when the runs are complete. This value is initially set to the required amount specified in the biosample workflow file, and includes FASTQ files received (Actual Yield) and sequencing data that is in process or otherwise unaccounted for (Pending Yield). When pending data is not received and marked as missing, the expected yield decreases accordingly. |
Actual | The amount of sequencing data generated and uploaded to BaseSpace Sequence Hub. QC failed data sets and data sets from failed libraries, pools, or flow cell lanes are excluded. |
Pending | The amount of data unaccounted for that you expected the lab to produce. Pending yield updates when data arrives; if the expected data does not arrive before the request timeout, it is moved from Pending to Missing Yield. Pending data values are used to calculate missing yield and are not listed for the biosample. |
Missing | The amount of yield missing from the required yield target. If pending data does not arrive in the required time, BaseSpace Sequence Hub determines that the lab was not able to produce the requested yield and you can no longer expect to receive the remaining balance. The balance is deducted from the Expected yield and marked as Missing. When you request a requeue, the requested amount is moved from Missing Yield to Pending. |
Surplus | The amount of yield received in excess of the required yield target. |
Requested | The amount requested from the lab to resequence. This is for lab requeues only and is a cumulative calculation of requeue requests. |
Step | Required | Expected | Actual | Pending |
1a: Yield Requested | +105 | +105 | 0 | +105 |
Total | 105 | 105 | 0 | 105 |
Step | Required | Expected | Actual | Pending |
1a: Yield Requested | +105 | +105 | 0 | +105 |
1b: Partial Yield Received | | | +80 | –80 |
Total | 105 | 105 | 80 | 25 |
Step | Required | Expected | Actual | Pending |
1a: Yield Requested | +105 | +105 | 0 | +105 |
1b: Partial Yield Received | | | +80 | –80 |
1c: Balance of Yield Received | | | +25 | –25 |
Total | 105 | 105 | 105 | 0 |
Step | Required | Expected | Actual | Pending | Missing |
2a: Yield Requested | +105 | +105 | 0 | +105 | 0 |
2b: Partial Yield Received | | | +80 | –80 | 0 |
2c: Yield Timeout | | –25 | | –25 | +25 |
Total | 105 | 80 | 80 | 0 | 25 |
Step | Required | Expected | Actual | Pending | Missing | Requested |
2a: Yield Requested | +105 | +105 | 0 | +105 | 0 | |
2b: Partial Yield Received | | | +80 | –80 | 0 | |
2c: Yield Timeout | | –25 | | –25 | +25 | |
2d: Lab Requeue for Additional Yield | | +30 | | +30 | –25 | +30 |
Total | 105 | 110 | 80 | 30 | 0 | 30 |
Step | Required | Expected | Actual | Pending | Missing | Requested |
2a: Yield Requested | +105 | +105 | 0 | +105 | 0 | |
2b: Partial Yield Received | | | +80 | –80 | 0 | |
2c: Yield Timeout | | –25 | | –25 | +25 | |
2d: Lab Requeue for Additional Yield | | +30 | | +30 | –25 | +30 |
2e: Balance of Yield Received | | +17 | +47 | –30 | | |
Total | 105 | 127 | 127 | 0 | 0 | 30 |
Step | Required | Expected | Actual | Pending | Missing |
3a: Yield Requested | +105 | +105 | 0 | +105 | 0 |
3b: Partial Yield Received | | | +80 | –80 | 0 |
3c: Yield Timeout | | –25 | | –25 | +25 |
Total | 105 | 80 | 80 | 0 | 25 |
Step | Required | Expected | Actual | Pending | Missing | Requested |
3a: Yield Requested | +105 | +105 | 0 | +105 | 0 | |
3b: Partial Yield Received | | | +80 | –80 | 0 | |
3c: Yield Timeout | | –25 | | –25 | +25 | |
3d: Lab Requeue for Additional Yield | | +30 | | +30 | –25 | +30 |
Total | 105 | 110 | 80 | 30 | 0 | 30 |
Step | Required | Expected | Actual | Pending | Missing | Requested |
3a: Yield Requested | +105 | +105 | 0 | +105 | 0 | |
3b: Partial Yield Received | | | +80 | –80 | 0 | |
3c: Yield Timeout | | –25 | | –25 | +25 | |
3d: Lab Requeue for Additional Yield | | +30 | | +30 | –25 | +30 |
3e: Partial Lab Requeue Data Received | | | +20 | –20 | | |
Total | 105 | 110 | 100 | 10 | 0 | 30 |
Step | Required | Expected | Actual | Pending | Missing | Requested |
3a: Yield Requested | +105 | +105 | 0 | +105 | 0 | |
3b: Partial Yield Received | | | +80 | –80 | 0 | |
3c: Yield Timeout | | –25 | | –25 | +25 | |
3d: Lab Requeue for Additional Yield | | +30 | | +30 | –25 | +30 |
3e: Partial Lab Requeue Data Received | | | +20 | –20 | | |
3f: Lab Requeue Expires | | –10 | | –10 | +10 | |
Total | 105 | 100 | 100 | 0 | 10 | 30 |
Step | Required | Expected | Actual | Pending | Missing | Requested |
3a: Yield Requested | +105 | +105 | 0 | +105 | 0 | |
3b: Partial Yield Received | | | +80 | –80 | 0 | |
3c: Yield Timeout | | –25 | | –25 | +25 | |
3d: Lab Requeue for Additional Yield | | +30 | | +30 | –25 | +30 |
3e: Partial Lab Requeue Data Received | | | +20 | –20 | | |
3f: Lab Requeue Expires | | –10 | | –10 | +10 | |
3g: New Lab Requeue | | +10 | | +10 | –10 | +10 |
Total | 105 | 110 | 100 | 10 | 0 | 40 |
Step | Required | Expected | Actual | Pending | Missing | Requested |
3a: Yield Requested | +105 | +105 | 0 | +105 | 0 | |
3b: Partial Yield Received | | | +80 | –80 | 0 | |
3c: Yield Timeout | | –25 | | –25 | +25 | |
3d: Lab Requeue for Additional Yield | | +30 | | +30 | –25 | +30 |
3e: Partial Lab Requeue Data Received | | | +20 | –20 | | |
3f: Lab Requeue Expires | | –10 | | –10 | +10 | |
3g: New Lab Requeue | | +10 | | +10 | –10 | +10 |
3h: Lab Requeue Data Received | | | +5 | –5 | –10 | |
Total | 105 | 110 | 105 | 5 | 0 | 40 |
Step | Required | Expected | Actual | Pending |
2a: Yield Requested | +105 | +105 | 0 | +105 |
Total | 105 | 105 | 0 | 105 |
Step | Required | Expected | Actual | Pending |
2a: Yield Requested | +105 | +105 | 0 | +105 |
2b: Partial Yield Received | | | +80 | –80 |
Total | 105 | 105 | 80 | 25 |
Step | Required | Expected | Actual | Pending |
3a: Yield Requested | +105 | +105 | 0 | +105 |
Total | 105 | 105 | 0 | 105 |
Step | Required | Expected | Actual | Pending |
3a: Yield Requested | +105 | +105 | 0 | +105 |
3b: Partial Yield Received | | | +80 | –80 |
Total | 105 | 105 | 80 | 25 |
Use the BaseSpace Sequence Hub Downloader to download FASTQ or general datasets. Datasets are linked to biosamples and are listed on the Datasets tab of the biosample details page. To download a package of datasets from a run, see Download Run Data Files.
The downloader has been updated and renamed to BaseSpace Sequence Hub Downloader. Make sure that you have the correct version of the downloader before proceeding.
From the Biosamples or Projects list, select the biosample or project.
Select the FASTQs or Other Datasets tab.
[Optional] Select a dataset name to view a list of file details.
Select the checkboxes for the desired datasets.
Select File, Download, then select Dataset.
Select Download. The BaseSpace Sequence Hub Downloader guides you through the download process, and starts the download of the files to the desired location.
Download data from a run as a package of FASTQ files or SAV files. Use the following steps to download a package.
Open the desired run.
Select File, point to Download, and then select Run.
Select the file type you want to download.
FASTQ—FASTQ files.
Sequencing Analysis Viewer (SAV)—InterOp and other files required to run SAV.
Data is downloaded in a compressed (*.zip) file format.
Use the BaseSpace Sequence Hub Downloader to download a package of all files in a project.
The downloader has been updated and renamed to BaseSpace Sequence Hub Downloader. Make sure that you have the correct version of the downloader before proceeding.
Select Projects.
Select a project.
Select Download.
The BaseSpace Sequence Hub Downloader guides you through the download process, and starts the download of the files to the desired location.
Use the BaseSpace Sequence Hub Downloader to download whole samples or individual files from a specific sample.
From the Projects list, select the project.
Select the Samples tab.
Select the checkboxes for the desired sample or samples.
Select File, Download, then select Sample.
Select Download. The BaseSpace Sequence Hub Downloader guides you through the download process, and starts the download of the files to the desired location.
From the Runs or Projects list, select the run or project.
Select the Samples tab.
Click the name of the desired sample to access the sample details page.
In the Files section, select the checkboxes of the file or files to download.
Select "Download Selected". The BaseSpace Sequence Hub Downloader guides you through the download process, and starts the download of the files to the desired location.
The file uploader imports the following file types to any project you have write access to: FASTQ (.fastq.gz), analysis (VCF and gVCF), manifest (.txt), or other file types. Use the file uploader when you want to analyze files generated outside of BaseSpace Sequence Hub, or to attach other information related to the project.
Open the project.
From the project, select File, Upload, and then select Files.
Select type of files to upload.
If you are uploading a FASTQ file, do as follows.
To upload FASTQs to a sample,
Set the "Save Upload To" toggle to "Sample".
Select Finish Upload.
To upload FASTQs to a biosample,
Select Select Biosample, then select an existing or create a new biosample to associate the FASTQ dataset with.
Enter a library name.
Select a prep kit.
Select Finish Upload.
If you are uploading a VCF file, do as follows.
[Optional] Select Select Biosample, then select or create a biosample that the VCF will be associated with.
Select Finish Upload.
If you are uploading a manifest file, do as follows.
[Optional] Select Select Biosample, then select or create a biosample that the manifest will be associated with.
Select Finish Upload.
If you are uploading other file types, do as follows.
[Optional] Select Select Biosample, then select or create a biosample that the files will be associated with.
Select Finish Upload.
Uploading multiple FASTQ, VCF, or manifest files in a single session requires files of the same type.
The FASTQ importer works only for complete samples; you cannot upload Read 2 of a FASTQ file by itself.
FASTQ files need to adhere to Illumina standards, as specified below:
Data for a single sample can constitute multiple files. The total number of files per sample and their combined size are limited to 16 and 25 GB respectively.
The uploader supports only gzipped FASTQ files generated on Illumina instruments.
The name of the FASTQ files must conform to the following convention: SampleName_SampleNumber_Lane_Read_FlowCellIndex.fastq.gz (for example, SampleName_S1_L001_R1_001.fastq.gz / SampleName_S1_L001_R2_001.fastq.gz)
The read descriptor in the FASTQ files must conform to the following convention: @Instrument:RunID:FlowCellID:Lane:Tile:X:Y ReadNum:FilterFlag:0:SampleNumber:
Read 1 descriptor would look like this: @M00900:62:000000000-A2CYG:1:1101:18016:2491 1:N:0:13
Read 2 would have a 2 in the ReadNum field, like this: @M00900:62:000000000-A2CYG:1:1101:18016:2491 2:N:0:13
Quality considerations
The number of base calls for each read must equal the number of quality scores
The number of entries for Read 1 must equal the number of entries for Read 2
The uploader will determine if files are paired-end based on the matching file names in which the only difference is the ReadNum
For paired-end reads, the descriptor must match for every entry for both reads 1 and 2
Each read has passed filter
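The naming and descriptor conventions above can be checked before upload. The following Python sketch is illustrative only and is not part of BaseSpace Sequence Hub. The function names and regular expressions are assumptions derived from the conventions listed above, and the uploader may apply additional checks.

```python
import re

# File name convention quoted above, e.g. SampleName_S1_L001_R1_001.fastq.gz
FILENAME_RE = re.compile(
    r"(?P<sample>.+)_S(?P<number>\d+)_L(?P<lane>\d{3})"
    r"_R(?P<read>[12])_(?P<last>\d{3})\.fastq\.gz"
)

# Read descriptor convention quoted above:
# @Instrument:RunID:FlowCellID:Lane:Tile:X:Y ReadNum:FilterFlag:0:SampleNumber
HEADER_RE = re.compile(
    r"@[^:\s]+:\d+:[^:\s]+:\d+:\d+:\d+:\d+"
    r" (?P<read>[12]):(?P<filter_flag>[YN]):\d+:\S*"
)


def is_pair(name_r1, name_r2):
    """True when two file names differ only in the read number (R1 vs R2)."""
    a = FILENAME_RE.fullmatch(name_r1)
    b = FILENAME_RE.fullmatch(name_r2)
    if a is None or b is None:
        return False
    same_rest = all(a[k] == b[k] for k in ("sample", "number", "lane", "last"))
    return same_rest and a["read"] == "1" and b["read"] == "2"


def mates_match(header_r1, header_r2):
    """True when paired descriptors agree on everything before the space."""
    return header_r1.split(" ")[0] == header_r2.split(" ")[0]


def check_record(header, sequence, quality):
    """Return a list of problems found in one FASTQ entry (empty if none)."""
    problems = []
    match = HEADER_RE.fullmatch(header)
    if match is None:
        problems.append("descriptor does not follow the convention")
    elif match["filter_flag"] == "Y":
        # Assumption: as in the standard Illumina header, Y marks a read
        # that was filtered out, so only N entries have passed filter.
        problems.append("read did not pass filter")
    if len(sequence) != len(quality):
        problems.append("base calls and quality scores differ in length")
    return problems
```

For example, is_pair("SampleName_S1_L001_R1_001.fastq.gz", "SampleName_S1_L001_R2_001.fastq.gz") is true, and check_record applied to the Read 1 descriptor shown above with equal-length sequence and quality strings returns an empty list.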
Use the BaseSpace Sequence Hub Downloader to download a package of analysis files.
The BaseSpace Sequence Hub Downloader has been updated and renamed to BaseSpace Sequence Hub Downloader. Make sure that you have the correct version of the downloader before proceeding.
Select Analyses.
Select the checkboxes for the analyses you want to download.
Select File, point to Download, then select Analysis.
Select the file type you want to download.
All files including VCF, BAM, & FASTQ
VCF files only
BAM files only
Select Download. The BaseSpace Sequence Hub Downloader guides you through the download process, and starts the download of the files to the desired location.
Select the Filters icon to filter the list.
Copy data sets from one project to another to do any of the following:
Analyze a dataset in the context of two different projects.
Transfer ownership of a dataset, but keep a copy.
Correct a dataset assigned to the wrong project.
After copying, both the original and the copy appear in the data sets tab on the biosample details page. When aggregating data for analyses, BaseSpace Sequence Hub uses the original dataset or the most recent copy if the original is not available. To use a different copy, mark the other copies as QC failed before starting the analysis.
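The selection rule described above can be pictured with a short Python sketch. It is a simplified model, not the BaseSpace Sequence Hub implementation: the Dataset class and pick_dataset function are hypothetical, and the sketch assumes that a dataset marked QC failed counts as "not available".

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Dataset:
    # Hypothetical record used only for this illustration.
    name: str
    created: datetime
    is_original: bool
    qc_failed: bool = False


def pick_dataset(datasets):
    """Prefer the original; otherwise take the most recent usable copy."""
    usable = [d for d in datasets if not d.qc_failed]
    if not usable:
        return None
    originals = [d for d in usable if d.is_original]
    if originals:
        return originals[0]
    return max(usable, key=lambda d: d.created)
```

In this model, marking the original and the newest copy as QC failed leaves an older copy as the one selected, which mirrors the instruction to mark the other copies as QC failed.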
From the Biosamples page, select the biosample.
Select the Datasets tab.
Select the checkbox of each dataset you want to copy.
Select File, point to Copy, and then select FASTQ Dataset.
[Optional] Select New to create a new project.
Select a project to copy the data sets to. You can only copy data sets to projects you own.
Select Confirm.
Transfer ownership of a run or project to give control of data to a collaborator or customer, or to move data between accounts. You can use this method to migrate data from a private account to a Professional or Enterprise workgroup. New owners must have a BaseSpace Sequence Hub account.
To transfer data to an Enterprise workgroup, both the transferring owner and the receiving owner must be members of the workgroup. For information about creating workgroups, enabling outside collaborators, and inviting users, see Workgroups.
When the new owner accepts the transfer, you lose control of the run or project. You cannot see the run or project unless the new owner shares it with you. For more information, see Sharing Data and Data Access After Share / Transfer.
Transferring a run or project does not alter any existing shares. For instructions on how to review and manage collaborators, see Manage Collaborator Access.
Select the project or run to transfer.
Select Share, point to Transfer, and then select Transfer Ownership.
Enter the email of the new owner and an optional message.
Select Continue.
Read the terms of transferring ownership, and select Transfer Now to accept. BaseSpace Sequence Hub emails the new owner to request acceptance of the ownership of the run or project. The transfer is complete when the new owner accepts.
[Optional] View a list of transfers and their status.
Select the Account menu, and then select Settings.
Select Transfer History.
If items from a project are in the trash, you cannot transfer ownership of the project.
You can cancel an ownership transfer before the collaborator accepts the transfer.
From the project or run, select Share, point to Transfer, and then select Cancel Transfer.
Select Cancel Transfer to confirm the cancellation.
A project or run that is transferred to you appears on the Dashboard. Accept the transfer to move the data into your account.
You can receive a project with a name that is the same as or similar to the name of an existing project in your account. After you have accepted the transfer, you can change the project name. For more information, see Edit Project Details.
Select the Dashboard tab.
Locate the project or run under Notifications, and then select Accept.
Select the account drop-down arrow and then select Settings.
From the Settings page, select Transfer History.
As detailed in the End User License Agreement (EULA), BaseSpace Sequence Hub may automatically and permanently delete all data in accounts that meet either of the following conditions:
Basic account storage exceeds the 1 Terabyte limit.
The account has been inactive for more than six months.
If your data is subject to deletion, BaseSpace Sequence Hub provides notice 30 days in advance.
Use the Run Planning tool in BaseSpace Sequence Hub to create and configure your run settings.
If your instrument is configured for Cloud mode or Hybrid mode, submit the run configuration to your BaseSpace Sequence Hub account. The run becomes available in the planned runs control software of the instrument.
If your instrument is configured for Local mode, use the Run Planning tool to create your sample sheet in v2 file format. Alternatively, create a sample sheet without BaseSpace Sequence Hub using a provided template.
If your instrument is not configured for Cloud mode integration, you can export the sample sheet in v2 file format. Placing the sample sheet in the instrument run folder and sequencing with Run Monitoring and Storage enabled will allow the specified analysis to launch automatically when the upload to BaseSpace Sequence Hub completes.
Planning a run in BaseSpace Sequence Hub is available for the following sequencing systems:
NextSeq 1000 and NextSeq 2000 Sequencing Systems - see Plan a NextSeq 1000/2000 Run
NovaSeq X Series Sequencing Systems - see Plan a NovaSeq X Series Run
If you want to plan a run for NextSeq 500, NextSeq 550 or MiniSeq, use the Prep Tab Run Planning.
To perform a secondary analysis, an Application may require certain types of files, such as:
AuxCnvPanelOfNormalsFile - for Enrichment workflow
AuxNoiseBaselineFile - for Enrichment workflow
BedFile - for Enrichment workflow
RnaGeneAnnotationFile - for RNA workflow
Reference files for NovaSeq X analysis are managed from the Resources page.
Select Resources from the user menu at the top right of the Sequence Hub page.
Select the Reference Files tab to see the list of reference files available for use in run planning.
Both standard and custom files are included in the list.
Import a Custom Reference File
Select Import Custom Reference File to upload a custom file.
After the file upload is complete:
Select the correct File Type. The Run Planning tool associates the new custom file with the Application based on the selected File Type.
Select one or more reference genome(s) that should be associated with the file.
[Optional] Enter a description for the custom reference file.
Select Save.
Edit a Custom Reference File
To edit a custom reference file's metadata, go to the listing page and select the file. Update the information on the Edit page and save it upon completion.
To update the file content, select Import Custom Reference File from the listing page and upload the new file.
Delete a Custom Reference File
To delete a custom reference file, go to the listing page, and select the delete icon beside the file.
You can correct errors in your index and regenerate FASTQ files using the Prep tab up to five times. This feature is available only for runs set up in BaseSpace Sequence Hub.
Go to the Summary page for the run you want to fix.
Select the Settings tab.
Select the Pool ID link at the bottom of the screen.
Select the Plate ID link in the table.
Select Edit.
Select the correct indexes from the index drop-down lists.
Select Save.
Return to the Summary page for the run.
Select Status, point to Requeue, and then select FASTQ Generation.
BaseSpace Sequence Hub regenerates the FASTQ files with the corrected indexes. The new FASTQ files are added to the biosample list and the original files are marked as QC Failed.
Archival storage moves data into long-term archives with lower data storage rates. It is intended for data that must be retained and do not require immediate access.
Data transfer to and from archival storage incurs additional fees per Terabyte transferred in the move. Restoring data that have been archived for a short period of time can result in higher storage costs.
Archived data are still reflected in overall run and project file size. To see a breakdown of storage, view your usage report (see Generate Usage Reports).
For more information about storage costs, see the illumina.com iCredits information page.
The following data can be archived:
Runs—Includes all files within the Data folder.
FASTQ and data set files—Includes all files within the data set.
Archived data appear in the run and data set lists. Filter the run or FASTQ lists to view only active or archived data.
Data from sample sheets are matched to existing biosamples, libraries, and pools in the account belonging to the run owner. If the data do not match exactly, the biosamples, libraries, or pools are added as new. To correct mismatch errors, fix the sample sheet and perform a run requeue. For more information about fixing sample sheets, see Fix Sample Sheet.
To ensure that run data is correctly matched to entities in BaseSpace Sequence Hub, upload biosamples using a biosample workflow file, CLI, or API before uploading the sample sheet. For more information about uploading biosamples, see Biosample Workflow.
The following table lists the sample sheet data that is matched to biosample data.
Sample Sheet | Biosample Data | Description |
---|---|---|
In the following example, the sample name is missing. BaseSpace Sequence Hub creates a new library using the Saliva 2 name from the provided sample ID.
Select the Runs tab, and then select the New Run drop-down.
Select Run Planning.
The Run Settings wizard loads.
In the Run Name field, enter a unique name of your preference to identify the current run. The run name can contain a maximum of 255 alphanumeric characters, spaces, dashes, and underscores.
[Optional] In the Run Description field, enter a description of the current run. The run description can contain a maximum of 255 alphanumeric characters.
Select the Instrument Platform.
Select the analysis location. Depending on the selected instrument type, not all options may be available.
BaseSpace - Analyze sequencing data in the cloud.
Local - Analyze sequencing data on-instrument or generate a Sample Sheet v2 for Local or Hybrid mode.
[Optional] In the Library Tube ID field, enter the library tube ID of the current run. The library tube ID can contain a maximum of 255 alphanumeric characters.
Select Next.
The Configuration wizard loads.
Select an analysis type and version. For more information about secondary analyses, see DRAGEN Secondary Analysis Output Files on the system guide for your instrument or the BaseSpace Sequence Hub app documentation. If you selected DRAGEN Single Cell RNA analysis, see the NextSeq 1000/2000 Products Files page for information on third-party single cell RNA library prep kit compatibility.
For on-instrument analysis, the version selected must match the version of DRAGEN installed on the instrument. To confirm the version of DRAGEN installed on the instrument, see DRAGEN Workflow and License Updates on the system guide for your instrument.
[Optional] Set up custom index kits as follows. If you are using more than one library, the libraries must have the same index read lengths.
Select Add Custom Index Adapter Kit under the Index Adapter Kit dropdown.
Select a template type and enter the kit name, adapter sequences, index strategies, and index sequences. Make sure the second index (i5) adapter sequences are in forward orientation.
Select Create New Kit.
[Optional] Set up custom library prep kit as follows.
Select Add Custom Library Prep Kit under the Library Prep Kit dropdown.
Enter the name, read types, default read cycles, and compatible index adapter kits for your custom library prep kit.
Select Create New Kit.
Select the following instrument settings. Depending on the library prep kit, recommended options are automatically selected. Some library prep kits have a hard-coded number of index reads and read types, which cannot be changed.
Library prep kit
Index adapter kit
Number of index reads
Read type
Number of sequencing cycles per read
If Not Specified is selected for library prep kit, enter the number of Index sequencing cycles to be used in the run.
Enter sample information into the Sample Data spreadsheet using one of the following options. To group samples for data aggregation during downstream analysis, assign a name for the group in the Project column.
Select Import Samples, select the type of import source file from the dropdown (CSV or Sample Sheet), and then select your source file. If you select CSV, make sure that your file follows the template, which you can download by selecting Download Template.
If Sample Sheet is selected, make sure that your sample sheet meets the formatting requirements. If CSV is selected, make sure to use the correct template as the template can be different depending on the selected index adapter kit and index strategy. For more details, see .
Paste sample IDs and either index plate well positions or i7 and i5 indexes directly from an external file. Before pasting, enter the number of sample rows in the Rows field, and then select +. Sample IDs can contain up to 100 alphanumeric characters, hyphens, and underscores.
Fixed-layout index plates require entries for well position. Indexes that do not have a fixed layout require entries for i7 and i5 indexes. i5 indexes must be entered in the forward orientation.
Manually enter sample IDs and corresponding well positions or indexes. If Not Specified is selected for the library prep kit, enter Index 2 (i5) sequences in the forward orientation.
Configure the settings for the analysis type selected for your run.
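The character limits stated in the steps above for run names and sample IDs can be checked before pasting or importing values. This is a minimal Python sketch with hypothetical function names. It assumes that "alphanumeric" means ASCII letters and digits and that empty values are not allowed; the Run Planning tool itself remains the authority on what is accepted.

```python
import re

# Run name: up to 255 alphanumeric characters, spaces, dashes, underscores.
RUN_NAME_RE = re.compile(r"[A-Za-z0-9 _-]{1,255}")

# Sample ID: up to 100 alphanumeric characters, hyphens, underscores.
SAMPLE_ID_RE = re.compile(r"[A-Za-z0-9_-]{1,100}")


def is_valid_run_name(name):
    """True when the whole string fits the run name rule."""
    return RUN_NAME_RE.fullmatch(name) is not None


def is_valid_sample_id(sample_id):
    """True when the whole string fits the sample ID rule."""
    return SAMPLE_ID_RE.fullmatch(sample_id) is not None
```

Note that a space is valid in a run name but not in a sample ID under these rules.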
The sample sheet is a comma-delimited file (SampleSheet.csv) that stores the information needed to set up and analyze a sequencing experiment. The file includes a list of samples, their index sequences, and the sequencing workflow.
Every run in BaseSpace Sequence Hub requires an associated sample sheet to define projects and samples, assign indexes, and run workflow apps.
Use Illumina Experiment Manager software to set up a sample sheet for your library prep protocol.
BaseSpace Sequence Hub maps sample sheet data to biosamples and libraries in your account. To make sure the data in your sample sheet are associated correctly, upload your biosamples before you upload the sample sheet. For more information, see .
Storage Product | Cost | Details |
---|---|---|
Standard Storage | 22.5 iCredits per TB per month | Default, instant free access |
Archive Storage | 2.0 iCredits per TB per month | Around 2 days to restore, costs below |
Transfer to Archive | 20.0 iCredits per TB per transfer | Cost to move 1 TB from Standard to Archive |
Restore from Archive | 30.0 iCredits per TB per restore | Cost to move 1 TB from Archive back to Standard |
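The rates above explain why archiving for only a short period can cost more than leaving data in Standard storage. The following Python sketch works through the arithmetic. It assumes all four rates are in iCredits, bills whole months only, and ignores any other fees; the function names are illustrative.

```python
# Rates from the storage cost listing above (assumed to be iCredits).
STANDARD_PER_TB_MONTH = 22.5
ARCHIVE_PER_TB_MONTH = 2.0
TRANSFER_PER_TB = 20.0
RESTORE_PER_TB = 30.0


def standard_cost(terabytes, months):
    """Cost of keeping data in Standard storage."""
    return terabytes * months * STANDARD_PER_TB_MONTH


def archive_cost(terabytes, months, restore=True):
    """Cost of moving data to Archive, storing it, and optionally restoring it."""
    cost = terabytes * (TRANSFER_PER_TB + months * ARCHIVE_PER_TB_MONTH)
    if restore:
        cost += terabytes * RESTORE_PER_TB
    return cost


def break_even_months(restore=True):
    """Months after which archiving becomes cheaper than Standard storage."""
    fixed = TRANSFER_PER_TB + (RESTORE_PER_TB if restore else 0.0)
    return fixed / (STANDARD_PER_TB_MONTH - ARCHIVE_PER_TB_MONTH)
```

Under these assumptions, 1 TB archived for two months and then restored costs 54 iCredits compared with 45 iCredits in Standard storage, while at three months the archive route costs 56 compared with 67.5. The break-even point with a restore is roughly 2.4 months.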
If using somatic mode, you can generate a custom noise baseline file. The noise baseline file is built from normal samples that do not come from the same subject as the samples being analyzed. The recommended number of normal samples is 50.
To generate a custom noise baseline file, use one of the following methods:
Use the DRAGEN Bio-IT Platform server. See the DRAGEN Bio-IT Platform Online Help for instructions.
Use DRAGEN Baseline Builder App on BaseSpace Sequence Hub. Use the BCL Convert pipeline in BaseSpace Sequence Hub Run Planning to generate FASTQ files. After the sequencing run is complete and 50 samples are available, input the FASTQ files into the DRAGEN Baseline Builder App.
For instructions to import noise baseline files to your instrument, refer to the system guide for your instrument: NextSeq 2000 System Guide (document # 1000000109376).
Importing samples from a CSV file is an alternative way to populate the Sample table and the sample-level analysis settings on the Configuration page.
Each CSV template is different depending on the Index Adapter Kit and number of Index Reads. To download a template, first make the appropriate selections in the Run Planning tool then use the Download Template link above the Sample Table.
The CSV template only includes editable columns in the Sample Table and the sample-level analysis settings if applicable (e.g. for RNA Seq application with Differential Expression). Import will auto-populate the derived columns from the input data, e.g. Index names and sequences are derived from the respective Well Position.
For custom or standard kit with fixed plates:
Sample ID*, Well Position*, Project, and sample-level analysis settings (if applicable)
For custom or standard kit with fixed indexes:
If single index strategy is selected: Sample ID*, I7 Index* , Project, and sample-level analysis settings (if applicable)
If dual indexes strategy is selected: Sample ID*, I7 Index* , I5 Index*, Project, and sample-level analysis settings (if applicable)
For Not Specified kit:
If single index strategy is selected: Sample ID*, Index 1*, Project, and sample-level analysis settings (if applicable)
If dual indexes strategy is selected: Sample ID*, Index 1*, Index 2*, Project, and sample-level analysis settings (if applicable)
Columns marked with asterisks are mandatory.
Import will fail if an incorrect CSV template is used (columns do not match the Sample table).
All rows in the input CSV file will be imported.
Invalid inputs will result in blank cells in the Sample table.
Invalid well position or Index name when custom or standard kit is used
Incorrect data type, e.g. string input when a number is expected
Out-of-range value
Invalid option for a select or dropdown list field
If there are multiple rows defining the same Sample ID, the sample-level analysis settings must be consistent.
If different values are specified, the resulting sample-level analysis settings will be blank.
For instruments that support loading separate lanes, the CSV template also includes Lanes column.
Enter one or more lane numbers in this column. When multiple lane numbers are specified, separate the values by commas.
If a Lanes cell contains an invalid lane number, the resulting cell will be blank.
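Two of the import rules above, the handling of the Lanes column and the consistency requirement for repeated Sample IDs, can be pre-checked on a CSV file before import. The Python sketch below is an illustration under stated assumptions: the column headers, the "Comparison Group" setting, and the lane count of 8 are hypothetical, and the downloaded template defines the real headers.

```python
import csv
import io


def parse_lanes(cell, lane_count):
    """Return lane numbers from a comma-separated cell, or None if any is invalid."""
    lanes = []
    for part in cell.split(","):
        part = part.strip()
        if not part.isdecimal() or not 1 <= int(part) <= lane_count:
            return None  # mirrors the blank cell produced by an invalid lane
        lanes.append(int(part))
    return lanes


def sample_settings(rows, setting_column):
    """Map each Sample ID to its setting, or "" when its rows disagree."""
    result = {}
    for row in rows:
        sample_id, value = row["Sample ID"], row[setting_column]
        if sample_id in result and result[sample_id] != value:
            result[sample_id] = ""  # inconsistent values end up blank
        else:
            result.setdefault(sample_id, value)
    return result


# Hypothetical file content; headers are assumptions, not the real template.
CSV_TEXT = """Sample ID,Index 1,Index 2,Project,Lanes,Comparison Group
S1,ATCACGTT,AGGCTATA,ProjA,"1,2",control
S1,ATCACGTT,AGGCTATA,ProjA,3,treated
S2,CGATGTTT,GCCTCTAT,ProjA,9,control
"""

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
lanes = [parse_lanes(row["Lanes"], lane_count=8) for row in rows]
settings = sample_settings(rows, "Comparison Group")
```

Here the third row has an out-of-range lane, so its parsed value is None, and sample S1 has conflicting settings, so its merged setting is blank.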
NovaSeq X Series systems allow you to perform multiple DRAGEN analyses in a single sequencing run. Before setting up secondary analysis, make sure you have installed the appropriate DRAGEN application on your instrument. For more information on installing DRAGEN applications, refer to the NovaSeq X Series Product Documentation.
If storing data in the cloud, you can create up to seven analysis application and reference genome combinations with an additional BCL Convert-only application. If storing data locally, you can create up to three analysis application and reference genome combinations with an additional BCL Convert-only application. For each combination, you can use up to eight configurations using a different library prep kit, index adapter kit, or configuration settings for an analysis application and reference genome combination already used.
The following combinations are included in the seven or three configuration limit:
The same analysis application and application version with a different reference genome
The same reference genome with a different application or application version
A different application or application version with a different reference genome
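The counting rule above can be sketched in Python. This is a simplified model with hypothetical names: each configuration is represented as an (application, version, reference genome) tuple, BCL Convert-only configurations are excluded from the combination count, and the application names and versions in the usage note are placeholders.

```python
from collections import Counter

# Limits stated above: combinations per storage location, configs per combination.
MAX_COMBINATIONS = {"cloud": 7, "local": 3}
MAX_CONFIGS_PER_COMBINATION = 8


def check_configurations(configs, storage):
    """Return a list of limit violations for the planned configurations."""
    # Assumption: BCL Convert-only configurations do not count toward the limit.
    analysis = [c for c in configs if c[0] != "BCL Convert"]
    per_combination = Counter(analysis)
    limit = MAX_COMBINATIONS[storage]
    problems = []
    if len(per_combination) > limit:
        problems.append(
            f"{len(per_combination)} combinations exceed the limit of {limit}"
        )
    for combination, count in per_combination.items():
        if count > MAX_CONFIGS_PER_COMBINATION:
            problems.append(
                f"{combination} has {count} configurations; "
                f"the limit is {MAX_CONFIGS_PER_COMBINATION}"
            )
    return problems
```

For example, the four combinations ("App A", "1.0", "hg38"), ("App A", "1.0", "hg19"), ("App A", "2.0", "hg38"), and ("App B", "1.0", "hg38") plus one BCL Convert entry fit within the cloud limit but exceed the local limit of three.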
The MiniSeq, NextSeq, and NeoPrep systems provide an option to set up a run in BaseSpace Sequence Hub using the Prep tab.
Set Up a Custom Library Prep Kit
When you complete run setup in BaseSpace Sequence Hub, the run becomes available in the control software of the instrument. For instructions to complete the run, see the system guide for your instrument:
MiniSeq System Guide (document # 1000000002695)
NextSeq 550 System Guide (document # 15069765)
NextSeq 500 System Guide (document # 15046563)
NeoPrep System Guide (document # 15049720)
An Analysis Configuration Template contains the settings for a secondary analysis so that a run can be planned in Clarity LIMS. Analysis Configuration Templates created on BaseSpace Sequence Hub can be retrieved and used in a workflow on Clarity Lab Information Management System (LIMS).
To use Analysis Configuration Templates, turn on Advanced LIMS Run Planning from the User Account Settings page (Workgroup admin privilege is required). Once the setting is enabled, go to the Resources page by selecting Resources from the User Account menu, and then select the Analysis Configuration Templates tab.
In the Template Name field, enter a unique name of your preference to identify the template.
[Optional] In the Template Description field, enter a description for the template.
Select the Instrument Platform.
Analysis Configuration Template is currently only supported for NovaSeq X Series.
Select the analysis location.
Cloud - Analyze sequencing data using Illumina pipeline in the cloud.
Local - Analyze sequencing data on-instrument.
Select an analysis type and version from the Application dropdown.
Select the Reference Genome (only applicable if analysis type is not BCL Convert).
Enter other settings that are applicable for the selected analysis.
RNA Differential Expression settings are not supported in Analysis Configuration Templates. Set up the Differential Expression on BaseSpace Sequence Hub after creating the planned run on Clarity.
Select Save to save the template.
The saved template is available on Clarity LIMS. After a run is planned on Clarity LIMS, it can be opened in BaseSpace Sequence Hub Run Planning for further editing. Sample and kit information are not editable.
To begin editing, select the template name hyperlink.
Update the settings.
Select Save to save the changes, or select Cancel to discard them.
To delete a template, select the trash button in the rightmost column.
Use the following steps to import biological samples from an external .csv file.
From the runs page, select New Run, and then select Prep Tab.
From the Prep page, select Biological Samples.
Select Import.
[Optional] Create a .csv file as follows.
Select the spreadsheet image to download a template.
Complete the following fields for each sample:
Sample ID—Enter a unique sample ID.
Name—Enter a descriptive name for the biological sample.
[Optional] Species—Enter the appropriate species.
Project—Enter the name of the project to save samples to. The selected project is the default project that contains biosample data output.
Nucleic Acid—Enter RNA or DNA.
Save the file.
Select Choose .csv File.
Browse to the appropriate file and select Open. Information from the .csv file populates the Biological Samples page.
[Optional] Select additional samples as follows.
Select Save & Continue Later.
Select the checkbox for each sample you want to use.
Select Prep Libraries.
Use the following instructions to plan a run for the NovaSeq X series systems in BaseSpace Sequence Hub.
Select the Runs tab, and then select the New Run drop-down.
Select Run Planning.
In the Run Name field, enter a unique name of your preference to identify the current run. The run name can contain a maximum of 255 alphanumeric characters, spaces, dashes, and underscores.
[Optional] Enter a description for the run. The run description can contain a maximum of 255 characters.
Select your sequencing system as the instrument platform.
Select one of the following analysis locations.
BaseSpace — Analyze sequencing data in the cloud.
Local — Analyze sequencing data on-instrument. When this option is selected, the planned run can only be exported to a sample sheet v2 file.
If analyzing data locally, select one of the following FASTQ output format options: The output format is not available for cloud storage and only applicable if you select to keep FASTQ files when setting up secondary analysis.
gzip — Save the FASTQ files in gzip format.
DRAGEN — Save FASTQ files in ora format. See DRAGEN Compression for more details.
Enter the number of cycles performed in each read: If using multiple analysis configurations, use the longest read length required by the configuration. When setting up a configuration, override automatically trims the length based on the recommended lengths for the selected library prep kit.
Read 1 — Enter up to 151 cycles.
Index 1 — Enter the number of cycles for the Index 1 (i7) primer. For a PhiX-only run, enter 0 in both index fields.
Index 2 — Enter the number of cycles for the Index 2 (i5) primer.
Read 2 — Enter up to 151 cycles. This value is typically the same as the Read 1 value.
[Optional] Enter the ID for your library tube. The library tube ID is located on the label of your library tube strip.
Select Next.
Please take note of the following when setting up a configuration.
Instrument Platform and Analysis location in Run Settings page are not editable once a Configuration is created.
Application version cannot be changed once a Configuration is saved. You need to delete the configuration and create a new one instead.
Select your analysis application.
[Optional] Enter a description for the configuration.
Select a library prep kit or add a new custom library prep kit as follows.
Select Add Custom Library Prep Kit under the Library Prep Kit dropdown.
Enter the name, read types, default read cycles, and compatible index adapter kits for your custom library prep kit.
Select Create New Kit.
Select an index adapter kit or add a new custom index kit as follows. If you are using more than one library, the libraries must have the same index read lengths.
Select Add Custom Index Adapter Kit under the Index Adapter Kit dropdown.
Select a template type and enter the kit name, adapter sequences, index strategies, and index sequences. Make sure the second index (i5) adapter sequences are in forward orientation.
Select Create New Kit.
If applicable to your application, select a reference genome.
Select Next to configure secondary analysis settings.
Complete the following steps to create biological samples one at a time. For information about adding multiple samples, see Import Biological Samples.
Select the Prep tab, and then select Biological Samples.
Select Create.
Provide the following information:
Sample ID—Enter a unique sample ID.
Name—Enter a descriptive name for the biological sample.
[Optional] Species—Select a species from the drop-down list.
Project—Select Select Project to select or create a project to add your samples to. The selected project is the default project that contains biosample data output.
Nucleic Acid—Select the type of nucleic acid.
[Optional] Select additional samples as follows.
Select Save & Continue Later.
Select the checkbox for each sample you want to use.
Select Prep Libraries.
BaseSpace Sequence Hub converts sample name underscores to dashes in the output FASTQ files. To avoid unexpected FASTQ file names, use only alphanumeric characters and dashes in sample names.
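A short Python sketch can show what the conversion described above does to a sample name and how to test a name in advance. The function names are hypothetical, and the check assumes that "alphanumeric" means ASCII letters and digits.

```python
import re


def fastq_sample_name(sample_name):
    """Sample name as it would appear in FASTQ file names (underscores become dashes)."""
    return sample_name.replace("_", "-")


def is_safe_sample_name(sample_name):
    """True when the name uses only alphanumeric characters and dashes."""
    return re.fullmatch(r"[A-Za-z0-9-]+", sample_name) is not None
```

A name such as Saliva_2 would appear as Saliva-2 in the FASTQ file names, so naming the sample Saliva-2 from the start avoids the mismatch.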
The Run Planning tool can be used to requeue planned analyses for supported Cloud applications.
Illumina DRAGEN Apps version 3 or higher
DRAGEN TruSight Oncology 500 Apps version 2.1 or higher
To initiate a requeue, navigate to the Run Summary page and select Requeue > Planned Run from the Status menu.
You can use the sample sheet from the selected run or select a new sample sheet file to load. This loads the analysis configurations contained in the selected sample sheet into the Run Planning tool and allows edits before requeuing the analysis.
Depending on the instrument platform and analysis specified, some fields may be non-editable, including Run Name and Library Tube ID. See the Plan Runs page for specific instructions on using the Run Planning tool.
For instruments that support multiple analysis configurations, existing configurations can be edited or deleted. New configurations can also be added before requeueing.
When all changes have been made (if desired), review the settings on the Run Review page and click "Requeue" to initiate the requeue.
This method is appropriate for the following:
Prepared libraries
Samples with assigned indexes
If your samples do not have indexes assigned and are not in libraries, use Import Biological Samples.
On the Libraries page, select Import.
[Optional] Create a CSV file as follows.
Select the spreadsheet image to download a template.
For each sample, enter the following information.
Sample ID—The unique sample ID.
Name—A descriptive name for the biological sample.
[Optional] Species—The appropriate species.
[Optional] Project—The name of the project to save samples to. Although optional at this step, a project is required later to store the data.
NucleicAcid—The nucleic acid, either RNA or DNA.
Well—The plate well.
Index1Name—The Index 1 name.
Index1Sequence—The Index 1 sequence.
Index2Name—The Index 2 name.
Index2Sequence—The Index 2 sequence.
Save the file.
Select Choose .csv File.
Browse to and select the appropriate CSV file, and then select Open. The information from the file populates the Import Sample Libraries page.
[Optional] Select additional libraries as follows.
Select Save & Continue Later.
Select the checkbox of each library you want to use.
Select Pool Libraries.
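The library import file described above can be produced and pre-checked with a script. The Python sketch below is a hedged illustration: the exact header spellings (for example SampleID) are assumptions, so use the downloaded template as the authoritative header row. It treats only Species and Project as optional, as stated in the field list, and the index names in the usage note are placeholders.

```python
import csv
import io
import re

# Assumed header spellings; the downloaded template is authoritative.
COLUMNS = ["SampleID", "Name", "Species", "Project", "NucleicAcid", "Well",
           "Index1Name", "Index1Sequence", "Index2Name", "Index2Sequence"]
OPTIONAL = {"Species", "Project"}


def check_row(row):
    """Return a list of problems found in one library row (empty if none)."""
    problems = [f"missing {c}" for c in COLUMNS
                if c not in OPTIONAL and not row.get(c)]
    if row.get("NucleicAcid") and row["NucleicAcid"] not in ("RNA", "DNA"):
        problems.append("NucleicAcid must be RNA or DNA")
    for column in ("Index1Sequence", "Index2Sequence"):
        if row.get(column) and not re.fullmatch(r"[ACGT]+", row[column]):
            problems.append(f"{column} contains characters other than A, C, G, T")
    return problems


def write_csv(rows):
    """Serialize rows to CSV text with the assumed header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # missing optional fields are written as empty cells
    return buffer.getvalue()
```

A row with SampleID S1, Name Saliva 1, NucleicAcid DNA, Well A01, and two index name and sequence pairs passes check_row, while a row with NucleicAcid set to Protein is reported.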
On the Pools page, drag a library from the plate to a pool. For Nextera Rapid Capture libraries, assign samples from the same enrichment to 1 pool.
In the Pool ID field, enter a unique name for the pool.
[Optional] Select Add Pool to add a pool.
[Optional] Use 1 of the following methods to pool libraries from multiple plates:
Select the Plate ID drop-down arrow to switch between plates.
Select Save & Continue Later, select the checkbox of each pool to merge, and then select Merge Pools.
Select Plan Run.
To help manage multiple pools, the color of each sample well matches the color of the pool the sample was added to.
If your kit is not listed but is compatible with your sequencing system, set up a custom library prep kit or select a kit that uses the same index adapter set. For example, select Nextera Rapid Capture for TruSight Cardio.
In the Libraries page, select the Library Prep Kit drop-down arrow, and select a kit.
In the Plate ID field, enter a unique plate ID.
[Optional] In the Notes field, enter any notes. For Nextera Rapid Capture libraries, specify that the plate contains multiple enrichments.
[Optional] Assign the sample to a different project.
Select the checkbox of each library to assign to a project.
Select Set Project.
In the Select Project dialog, select or create a project to assign libraries to, and then select Select.
Select Auto Prep to automatically assign each library to a well and index location. For Nextera Rapid Capture libraries, samples belonging to the same enrichment must be in the same row. Each plate has a maximum layout of 12 indexes by 8 indexes. If you require more wells or indexes, create multiple plates.
[Optional] To change indexes, select an index on the plate and select a new index from the drop-down list. NOTE: This option is not available for fixed layout plates.
[Optional] Move libraries to a different well as follows. Select the checkbox of the library to move. Drag the selected library from the Libraries area to the appropriate well.
Select Export to save your library prep settings in a .csv file. This file serves as a reference when preparing samples in the lab.
[Optional] Select additional plates as follows.
Select Save and Continue Later to return to Libraries.
Select the checkbox of each library you want to use.
Select Pool Libraries.
Biological samples are assigned an index even if you are not performing indexed sequencing. During run setup, you can specify no indexing.
On the Plan Run page, click the Select Instrument drop-down arrow, and select a sequencing system, either MiniSeq or NextSeq.
In the Name field, enter a name for the sequencing run.
[Optional] In the Reagent Barcode field, enter the barcode ID of the reagent kit used for the run. Entering the barcode ID links the reagent kit to the run.
[Optional] Select Use Custom Primer options:
R1 — Use custom primer for Read 1.
R2 — Use custom primer for Read 2.
Select a read type, either Single Read or Paired End.
Enter the number of cycles for each read in the sequencing run:
Read 1 Cycles — Enter a value up to 151 cycles.
Read 2 Cycles — Enter a value up to 151 cycles. This value is typically the same number of cycles as Read 1.
Review the indexing scheme for the run. To make changes, override the defaults as follows.
Select the Override default indexing scheme checkbox.
Select an indexing scheme:
Single Index — Performs a run with 1 index read.
Dual Index — Performs a run with 2 index reads.
No Index — Performs a non-indexed run.
Enter the number of cycles for each index read*:
Index 1 Cycles — Enter the number of cycles required for the Index 1 (i7) primer.
Index 2 Cycles — Enter the number of cycles required for the Index 2 (i5) primer.
Make sure that a pool is present.
Select 1 of the following buttons to continue:
Sequence — The run appears on the Planned Runs list with a status of Ready. The run becomes available from the control software of your sequencing system.
Save and Continue Later — The run appears on the Planned Runs list with a status of Planning. When you are ready to sequence, select the checkbox for the run and select Sequence. The run then becomes available from the control software of your sequencing system.
*Indexing is required when sequencing multiple libraries.
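The run parameters above can be sanity-checked before a run is planned. The following is a minimal sketch, not part of BaseSpace Sequence Hub: the function name `validate_run_plan` and its arguments are hypothetical, and the lower bound of 1 cycle is an assumption (the text states only the upper limit of 151).

```python
MAX_READ_CYCLES = 151  # upper limit stated for Read 1 and Read 2


def validate_run_plan(read_type, read1, read2=0, scheme="Dual Index",
                      index1=0, index2=0, library_count=1):
    """Return a list of problems found in a planned run (empty if none)."""
    problems = []
    if read_type not in ("Single Read", "Paired End"):
        problems.append(f"unknown read type: {read_type}")
    # Assumption: at least 1 cycle is required; the text gives only the maximum.
    if not 1 <= read1 <= MAX_READ_CYCLES:
        problems.append("Read 1 cycles must be between 1 and 151")
    if read_type == "Paired End" and not 1 <= read2 <= MAX_READ_CYCLES:
        problems.append("Read 2 cycles must be between 1 and 151")
    if scheme not in ("Single Index", "Dual Index", "No Index"):
        problems.append(f"unknown indexing scheme: {scheme}")
    if scheme == "No Index" and library_count > 1:
        problems.append("indexing is required when sequencing multiple libraries")
    if scheme in ("Single Index", "Dual Index") and index1 < 1:
        problems.append("Index 1 cycles are required for an indexed run")
    if scheme == "Dual Index" and index2 < 1:
        problems.append("Index 2 cycles are required for a dual-index run")
    return problems


print(validate_run_plan("Paired End", 151, 151, "Dual Index", 8, 8, library_count=12))
```

An empty list means that the plan is consistent with the limits described in this section. The control software of the sequencing system remains the authority on what a given instrument accepts.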
The library prep run setup for NeoPrep includes the following three parts:
Create and configure the run.
Assign samples to wells.
Review the run setup.
At any point during this process, you can select the Save button to save your work. Saved runs appear on the NeoPrep Runs page with a status of Planning. When you are ready to resume work, select the run name.
When prepping a library, select + Custom Library Prep Kit in the Library Prep Kit dropdown menu. The Custom Library Prep Kit Definition page opens.
Enter a name for the custom library prep kit. The name must meet the following requirements:
Unique for your account.
Characters: only alphanumeric characters, hyphens, underscores, and spaces are accepted.
Length: 50 characters or fewer.
Select at least 1 of the supported read types.
Select at least 1 of the indexing strategies. Selecting only None is not allowed.
Fill out the default number of cycles.
Select template to download the index definition file template.
Fill out the Settings section the following way:
For single read only: no adapter (blank), or 1 adapter sequence for Read 1.
For paired-end: no adapter (blank), or 2 adapter sequences, 1 for Read 1 and 1 for Read 2.
Each adapter sequence meets the following criteria:
Sequence of A, T, C, or G characters.
Length from 1 to 20 characters.
Fill out the Index1Sequences and Index2Sequences sections as follows:
For Single Index, with or without None:
1 to 100 Index 1 names
For each Index 1 name, an associated Index 1 sequence
For Dual Index, with or without None and Single Index:
1 to 100 Index 1 names
For each Index 1 name, an associated Index 1 sequence
1 to 100 Index 2 names
For each Index 2 name, an associated Index 2 sequence
Each index name meets the following criteria:
Unique within the file
Length from 1 to 8 characters, using only alphanumeric, hyphen, or underscore characters
Each index sequence meets the following criteria:
Sequence of A, T, C, or G characters
Length from 1 to 20 characters
All index sequence lengths (Read 1 and Read 2) are equal
Index 1 sequences are unique within the file set of Index 1 sequences
Index 2 sequences are unique within the file set of Index 2 sequences
If the supported indexing strategy specifies Single Index, you can set up Default Layout By Well the following way:
Each well unique from A01 to H12
For each well, an associated index name exists in the specified Index1Sequences section
If the supported indexing strategy specifies Single Index or Dual Index, you can set up Default Layout By Column the following way:
Each column number unique from 1 to 12
For each column, an associated index name exists in the specified Index1Sequences section
If the supported indexing strategy specifies Dual Index, you can set up Default Layout By Row the following way:
Each row letter unique from A to H
For each row, an associated index name exists in the specified Index2Sequences section
Select the Choose .csv File button to select and upload your custom index file.
Select Create New Kit to complete the process.
The custom library prep kit is added to the Library Prep Kit drop-down list.
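Before uploading an index definition file, the naming and sequence rules above can be checked locally. The following is a minimal sketch under stated assumptions: the function names `validate_kit_name` and `validate_indexes` are hypothetical, uppercase bases are assumed, and account-level uniqueness of the kit name cannot be checked offline.

```python
import re

NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,8}")       # index name: 1 to 8 characters
SEQ_RE = re.compile(r"[ACGT]{1,20}")              # sequence: 1 to 20 bases
KIT_NAME_RE = re.compile(r"[A-Za-z0-9_ -]{1,50}") # kit name: 50 characters or fewer


def validate_kit_name(name):
    """True if the kit name uses only allowed characters and length."""
    return KIT_NAME_RE.fullmatch(name) is not None


def validate_indexes(index1, index2=None):
    """index1 and index2 map index names to sequences. Returns a list of problems."""
    index2 = index2 or {}
    problems = []
    if not 1 <= len(index1) <= 100:
        problems.append("Index 1 needs 1 to 100 names")
    if len(index2) > 100:
        problems.append("Index 2 allows at most 100 names")
    # Index names must be unique within the whole file.
    for name in sorted(set(index1) & set(index2)):
        problems.append(f"index name {name} is not unique within the file")
    lengths = set()
    for label, table in (("Index 1", index1), ("Index 2", index2)):
        for name, seq in table.items():
            if not NAME_RE.fullmatch(name):
                problems.append(f"{label} name {name!r} is invalid")
            if not SEQ_RE.fullmatch(seq):
                problems.append(f"{label} sequence {seq!r} is invalid")
            lengths.add(len(seq))
        # Sequences must be unique within their own set.
        if len(set(table.values())) != len(table):
            problems.append(f"{label} sequences are not unique")
    if len(lengths) > 1:
        problems.append("index sequences differ in length")
    return problems


# Illustrative names and sequences only.
print(validate_indexes({"N701": "TAAGGCGA", "N702": "CGTACTAG"}, {"S501": "TAGATCGC"}))
```

The upload step performs its own validation, so a check like this only helps catch obvious problems earlier.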
In the Review Run Details page, review the run parameters.
If the parameters are acceptable, select Finish.
[Optional] Select the run setup details or library card mapping to view or print details.
Select Done to return to the NeoPrep Runs page.
The new run is listed with a status of Ready, which means the run is listed in the NeoPrep control software.
In the Assign Samples to Wells page, set up each well as follows.
In the Sample ID field, select Select Sample.
In the Select a Biological Sample dialog box, select a sample, and then select Select.
In the Library ID field, enter a unique name for the library.
Select the Index drop-down arrow, and select a unique index to add to the sample.
If you selected mixed insert sizes on the previous page, select the Insert Size drop-down arrow, and select the insert size of the library.
Select Next.
From the Runs page, select New Run, and then select Prep Tab.
From the Prep page, select NeoPrep.
From the NeoPrep Runs page, select Create New Run to open the Configure Run screen.
Select the Protocol drop-down arrow, and select a protocol, either TruSeq Nano DNA or TruSeq Stranded mRNA.
Select the Version drop-down arrow, and select a version of NeoPrep control software.
[Optional] In the Notes field, enter any notes.
In the Run Name field, enter a name for the run.
Select the Default Project field, select the project to configure for the run, and then select Confirm.
Select the checkbox for each process you want to run:
Prep Library—Requires preparation of libraries.
Quantify—Quantifies libraries after library prep is complete.
Normalize—Normalizes libraries after quantification is complete.
Complete the following fields as applicable. Read-only or unavailable fields are default selections for the run.
Select the Sample Count drop-down arrow, and select the number of samples to include in the run.
Select the Index Type drop-down arrow, and select the indexing scheme for the samples.
Select the Insert Size drop-down arrow, and select the insert size of the libraries. Mixed insert sizes are specified on the next page.
Select the PCR Cycles drop-down arrow, and select the number of PCR cycles for the run.
Select Next.
Sample ID
Biosample Name
If the Sample ID does not exactly match the name of a biosample associated with the specified default project in the run owner's account, BaseSpace Sequence Hub creates a new biosample from the Sample ID and associates incoming FASTQ data with the new biosample.
If the Sample ID matches a biosample name in the run owner's account, its data are aggregated to the existing biosample name.
For MiSeq instruments running Targeted RNA or Amplicon DS, the biosample name is created from the sample sheet as SampleName-SampleID, and the library name is set to default.
Project
Default Project
Sample Name
Library name
If the library is not already associated with the biosample, BaseSpace Sequence Hub creates a new library using the sample name.
If the sample name is not defined in the sample sheet, BaseSpace Sequence Hub creates a library name with the same name as the sample ID.
n/a
Library Prep Kit
If the biosample exists and has an active Prep Request, the Library Prep Kit from the Prep Request is used. If there is no Prep Request, the Library Prep Kit is set to Unknown.
Sample Plate
Container name
Sample Well
Container Position
Lanes
Pool
New pools are created for each lane with more than one library. If the same libraries (same names and indexes) are present in more than one lane of a run, a single pool is created and associated with each lane. However, if a lane has libraries that match a pool from a prior run, a new pool is created.
If there is no Lane data, all libraries are combined into a single pool.
One pool is created for each unique group in the lane column.
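The library naming and pooling rules above can be illustrated with a short sketch. This is a simplified reading of the rules, not the BaseSpace Sequence Hub implementation: the function names and the record layout (`name`, `index`, `lane`) are hypothetical, and the rule about pools from prior runs is not modeled.

```python
from collections import defaultdict


def library_name(sample_id, sample_name=""):
    """Use the sample name when defined, otherwise fall back to the sample ID."""
    return sample_name or sample_id


def plan_pools(libraries):
    """Map each pool (a frozenset of (name, index) pairs) to the lanes it is linked to."""
    lanes = defaultdict(set)
    for lib in libraries:
        lanes[lib.get("lane")].add((lib["name"], lib["index"]))
    # No lane data: all libraries are combined into a single pool.
    if set(lanes) == {None}:
        return {frozenset(lanes[None]): []}
    pools = defaultdict(list)
    for lane in sorted(key for key in lanes if key is not None):
        members = lanes[lane]
        # A pool is created only for lanes with more than one library; lanes
        # that hold the same libraries share one pool.
        if len(members) > 1:
            pools[frozenset(members)].append(lane)
    return dict(pools)


libraries = [
    {"name": "L1", "index": "N701", "lane": 1},
    {"name": "L2", "index": "N702", "lane": 1},
    {"name": "L1", "index": "N701", "lane": 2},
    {"name": "L2", "index": "N702", "lane": 2},
    {"name": "L3", "index": "N703", "lane": 3},
]
print(list(plan_pools(libraries).values()))
```

In this example, lanes 1 and 2 hold the same two libraries and share one pool, while lane 3 holds a single library and receives no pool.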
Run Planning provides a list of Illumina index adapter kits used for sequencing. If your index adapter kit is not available, use the following instructions to create a custom kit.
Creating a custom index adapter kit can be done from within Run Planning (when creating a Configuration) or from the Resources page (select Index Adapter Kit tab).
A custom index kit can be configured in yaml or tsv.
The following are basic rules to follow when configuring in yaml:
Three dashes (---) indicate the start of the definition.
Begin a comment line with the '#' character.
Each line is typically in the format SettingName: SettingValue. A setting whose value is a string must be enclosed in double quotes. Other types, such as numeric or Boolean, do not require double quotes.
When a setting contains more complex information, it is usually defined across multiple lines. Use the correct indentation to maintain the structure. Use two space characters instead of a tab character.
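The formatting rules above can be illustrated by generating a small definition programmatically. The following is a minimal sketch that renders text according to those rules. The setting names come from the settings described in this document, while the kit name, organization, index names, and sequences are illustrative placeholders. The result is not a complete or validated kit definition, and the sketch does not escape quotes inside string values.

```python
def quote(value):
    """Strings are enclosed in double quotes; Boolean and numeric values are not."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    return '"' + str(value) + '"'


def render(mapping, indent=0):
    """Render nested settings using two spaces per indentation level (never tabs)."""
    pad = "  " * indent
    lines = []
    for key, value in mapping.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(render(value, indent + 1))
        elif isinstance(value, list):
            lines.append(f"{pad}{key}:")
            lines.extend(f"{pad}  - {quote(item)}" for item in value)
        else:
            lines.append(f"{pad}{key}: {quote(value)}")
    return lines


# Hypothetical kit used only to demonstrate the formatting rules.
kit = {
    "Name": "example-custom-kit",
    "DisplayName": "Example Custom Kit",
    "Organization": "Example Lab",
    "AllowedIndexStrategies": ["Dual", "Single"],
    "IndexSequences": {
        "i7Index1": {"D701": "ATTACTCG"},
        "i5Index2": {"D501": "TATAGCCT"},
    },
    "Settings": {"DefaultIndexStrategy": "Dual", "FixedLayout": False},
}

header = ["---", "# Hypothetical custom index adapter kit"]
yaml_text = "\n".join(header + render(kit)) + "\n"
print(yaml_text)
```

The provided templates remain the recommended starting point. A sketch like this is mainly useful for generating long index lists without indentation mistakes.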
Three yaml templates are provided.
Non-fixed Layout: for non-fixed layout kit where any index can be selected for any sample.
Fixed Layout - Single Plate: for fixed layout kit with single plate, where each well has a defined index combination.
Fixed Layout - Multi Plate: for fixed layout kit with multi-plate, where each well has a defined index combination.
Level-1 Setting | Level-2 Setting | Description |
---|---|---|
The supported values are "Dual", "Single", and "NoIndex". Create a list below this setting. Each list item should be preceded with a dash (-) character and enclosed in double quotes. Use two spaces for indentation. See Example 1.
i7Index1
Create a list of Index1 sequences below this setting. Each index should be in the format of IndexName: "IndexSequence". See example 2.
i5Index2
Create a list of Index2 sequences below this setting. Each index should be in the format of IndexName: "IndexSequence". See example 2.
The mapping should be defined in the format of "WellPosition/Index1Name-Index2Name" or "Plate-WellPosition/Index1Name-Index2Name". The allowed well positions are A01 - H12. If the kit requires well positions defined in a different format, define FixedLayoutPositionKeyByIndexId: true. See example 4, example 5, example 6.
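The mapping format described above can be checked with a small parser. The following is a minimal sketch under stated assumptions: the function name `parse_position` is hypothetical, the plate label is assumed to be whatever precedes the last dash before the well position, and index names are assumed not to contain a dash. It covers only the default A01 to H12 well format.

```python
import re

WELL_RE = re.compile(r"[A-H](0[1-9]|1[0-2])")  # allowed well positions A01 to H12


def parse_position(entry):
    """Split 'Well/Index1-Index2' or 'Plate-Well/Index1-Index2' into its parts."""
    left, slash, right = entry.partition("/")
    if not slash:
        raise ValueError(f"missing '/' in {entry!r}")
    # Assumption: the plate label is everything before the last dash.
    plate, _, well = left.rpartition("-")
    if not WELL_RE.fullmatch(well):
        raise ValueError(f"well position {well!r} is outside A01 to H12")
    # Assumption: index names do not contain a dash themselves.
    index1, dash, index2 = right.partition("-")
    if not dash or not index1 or not index2:
        raise ValueError(f"expected Index1Name-Index2Name in {entry!r}")
    return {"plate": plate or None, "well": well, "index1": index1, "index2": index2}


print(parse_position("A01/D701-D501"))
print(parse_position("Plate2-H12/D712-D508"))
```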
The value is true or false. However, the usage is currently restricted to instrument platforms that allow multi-configuration. Because each configuration allows only one Override Cycles value, separate samples with different index lengths into different configurations when setting up a run.
EnableCustomIndexCycles
The value is true or false. If the setting is set to false or is not defined, the Override Cycles value uses the Y;I;I;Y pattern.
NumCyclesIndex1Override
The value should be a numeric value. If this setting is not defined, the number of Index1 cycles follows the number of bases in the Index1 sequences.
NumCyclesIndex2Override
The value should be a numeric value. If this setting is not defined, the number of Index2 cycles follows the number of bases in the Index2 sequences. See example 7.
OverrideCycles
The value should be defined in this format: "Y{{Read1Length}};I{{Index1Length}};I{{Index2Length}};Y{{Read2Length}}?", where:
{{Read1Length}} is the number of cycles for Read1,
{{Read2Length}} is the number of cycles for Read2,
{{Index1Length}} is the number of cycles for Index1, and
{{Index2Length}} is the number of cycles for Index2.
If UMI is used, update the pattern accordingly.
For example, if Read1 and Read2 cycles include 7 UMI cycles and 1 skipped cycle: "U7N1Y{{Read1Length-8}};I{{Index1Length}};I{{Index2Length}};U7N1Y{{Read2Length-8}}?"
For example, if the kit is a single index kit, with UMI cycles instead of Index2: "Y50N{{Read1Length-50}};I8N{{Index1Length-8}};N{{Index2Length-16}}U16;Y50N{{Read2Length-50}}?". See example 7.
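To see how a pattern resolves for a specific run, the placeholders can be expanded by hand or with a short script. The following is a minimal sketch of that substitution, not the Run Planning implementation: the function name `expand_override_cycles` is hypothetical, and the example patterns omit the trailing ? because its meaning is not described here.

```python
import re

# Matches {{Name}} or {{Name-N}}, for example {{Read1Length-8}}.
PLACEHOLDER = re.compile(r"\{\{([A-Za-z0-9]+)(?:-(\d+))?\}\}")


def expand_override_cycles(pattern, lengths):
    """Replace each placeholder with the run length minus the optional offset."""
    def substitute(match):
        name, offset = match.group(1), match.group(2)
        value = lengths[name] - int(offset or 0)
        if value < 0:
            raise ValueError(f"{name} is too short for offset {offset}")
        return str(value)
    return PLACEHOLDER.sub(substitute, pattern)


# Illustrative run lengths.
lengths = {"Read1Length": 151, "Index1Length": 8, "Index2Length": 8, "Read2Length": 151}

default = "Y{{Read1Length}};I{{Index1Length}};I{{Index2Length}};Y{{Read2Length}}"
with_umi = "U7N1Y{{Read1Length-8}};I{{Index1Length}};I{{Index2Length}};U7N1Y{{Read2Length-8}}"

print(expand_override_cycles(default, lengths))   # Y151;I8;I8;Y151
print(expand_override_cycles(with_umi, lengths))  # U7N1Y143;I8;I8;U7N1Y143
```

In the UMI example, each read keeps its total of 151 cycles: 7 UMI cycles, 1 skipped cycle, and 143 sequence cycles.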
This section contains custom BCL Convert settings. The settings will be included in the sample sheet generated by Run Planning.
TrimUMI
Indicates if the UMI should be excluded from fastq files. The value is "0" or "1". Set to "0" if BCL Convert should still output UMI cycles to fastq files. See example 8.
CreateFastqForIndexReads
Indicates if the UMI in Index cycles should be trimmed or not. The value is "0" or "1". Set to "1" if BCL Convert should still output UMI cycles in Index to fastq files. Note that TrimUMI should also be set to "0".
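The dependency between these two settings is easy to overlook, so a quick consistency check can help. The following is a minimal sketch with a hypothetical function name, `check_bcl_convert_settings`. It checks only the two settings described above and assumes the values are supplied as quoted strings.

```python
def check_bcl_convert_settings(settings):
    """Return a list of problems for the TrimUMI and CreateFastqForIndexReads settings."""
    problems = []
    for key in ("TrimUMI", "CreateFastqForIndexReads"):
        if key in settings and settings[key] not in ("0", "1"):
            problems.append(f'{key} must be "0" or "1"')
    # Keeping UMI cycles from index reads also requires TrimUMI to be "0".
    if settings.get("CreateFastqForIndexReads") == "1" and settings.get("TrimUMI") != "0":
        problems.append('TrimUMI should be "0" when CreateFastqForIndexReads is "1"')
    return problems


print(check_bcl_convert_settings({"TrimUMI": "0", "CreateFastqForIndexReads": "1"}))
```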
Similar to yaml, three tsv templates are provided. Note that the .tsv format currently supports fewer custom kit settings than the .yaml format.
Non-fixed Layout: for non-fixed layout kit where any index can be selected for any sample.
Fixed Layout - Single Plate: for fixed layout kit with single plate, where each well has a defined index combination.
Fixed Layout - Multi Plate: for fixed layout kit with multi-plate, where each well has a defined index combination.
A tsv file consists of three sections, namely [IndexKit], [Resources], and [Indices], where each section contains rows of tab-separated values.
Each row in the [Resources] section consists of four columns: Name, Type, Format, and Value. This section defines the Adapter Read settings and the type of index kit (fixed layout with single or multiple plates, or non-fixed layout). For a fixed layout kit only, this section also includes the mappings of well positions to index names (see No 5 in the table below).
Index1 and Index2 sequences are defined in the [Indices] section. Each row consists of three columns: Name, Sequence, and IndexReadNumber.
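The three-section layout can be sketched as follows. This is an illustration under stated assumptions, not the official file specification: the function name `build_index_kit_tsv` is hypothetical, including a column header row in the [Resources] and [Indices] sections is an assumption, and all values are placeholders. The downloadable templates remain the authoritative layout.

```python
def build_index_kit_tsv(kit_fields, resources, indices):
    """Assemble the [IndexKit], [Resources], and [Indices] sections as tab-separated text."""
    lines = ["[IndexKit]"]
    lines += [f"{name}\t{value}" for name, value in kit_fields.items()]
    # Assumption: each of the next two sections starts with a column header row.
    lines += ["[Resources]", "Name\tType\tFormat\tValue"]
    lines += ["\t".join(row) for row in resources]
    lines += ["[Indices]", "Name\tSequence\tIndexReadNumber"]
    lines += [f"{name}\t{sequence}\t{read}" for name, sequence, read in indices]
    return "\n".join(lines) + "\n"


# Placeholder values for a hypothetical non-fixed layout, dual-index kit.
tsv_text = build_index_kit_tsv(
    kit_fields={
        "Name": "example-custom-kit",
        "DisplayName": "Example Custom Kit",
        "Description": "Hypothetical kit for illustration",
        "IndexStrategy": "DualOnly",
    },
    resources=[
        ("Adapter", "Adapter", "string", "CTGTCTCTTATACACATCT"),
        ("FixedLayout", "FixedLayout", "bool", "false"),
    ],
    indices=[("D701", "ATTACTCG", 1), ("D501", "TATAGCCT", 2)],
)
print(tsv_text)
```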
The following prerequisites are needed to get started with DRAGEN Array Cloud:
Illumina Connected Analytics subscription: An ICA Basic, Professional or Enterprise subscription can be used which include access to BaseSpace Sequence Hub. Follow the to register the software.
Workgroup setup: Workgroups must be created before login. Using a workgroup allows all members of the workgroup to share access to resources, analyses, and data. Learn more about .
Designating a workgroup as ‘Collaborative’ allows projects to be shared with collaborators or Illumina Tech Support to assist with troubleshooting. To create a collaborative workgroup, select the Enable collaborators outside of this domain checkbox during workgroup creation.
Software consumables: iCredits can be purchased for storage on the cloud platform and analysis pipelines with a compute charge. Per sample analysis can be purchased for relevant pipelines as listed in and . Follow the (found under Example 3: Configuring the Software Consumables) to register the software consumables.
[Optional] iScan integration: The iScan System is integrated with Illumina Connected Platform and can send IDATs for further analysis. The iScan System must be running iScan Control Software version 4.2.1 or later.
EULA acceptance: Accept all necessary End User License Agreements in BaseSpace Sequence Hub before scanning begins.
Internet connection: For uploading product files or IDATs, a network connection 1 GbE or faster is recommended.
Note: Accessioning BeadChips before scanning and starting analysis is no longer a required step and has been automated within the system.
Before beginning analysis, ensure workgroup context is being used so analysis can be viewed by all members of your workgroup. The name of your workgroup should appear in the top right corner.
Use the following steps to run the Microarray Analysis Setup on BaseSpace Sequence Hub:
Select the Runs tab
Select New Run
Select Microarray Analysis Setup
Enter the Analysis Name (Figure 1)
Use the Select Project link to choose the project for your output files. To select an existing project, click the radio button next to the desired project name. You can also create a project by clicking the New button in the project selection window.
Select the Type of Analysis. Further detail of each Type of Analysis is available in and
(Optional) Create a custom configuration via the "Add Custom Configuration" option in Configuration Settings. Custom configurations must be assigned a name and product files can be uploaded or selected (Figure 2). Custom configuration options vary by Type of Analysis including:
DRAGEN Array - Genotyping provides flexibility for turning specific output files on or off and adjusting the GenCall score cutoff. It is recommended to turn off VCF output for non-human species and Final Report output for large sample numbers.
DRAGEN Array - Methylation - QC provides options to adjust thresholds as detailed in section DRAGEN Array Methylation QC.
Select your preferred option in the Configuration Settings drop-down menu. Configuration setup will vary based on the Type of Analysis selected. More details are available in section .
Select Next
Select either Import Sample Sheet, Select BeadChips, or Import IDAT Files (Figure 3)
Import Sample Sheet presents a link to upload a sample sheet. Users may download a template sample sheet by selecting the Download Template link.
Select BeadChips allows users to select BeadChips from the displayed list of available BeadChips. To select specific samples within a BeadChip, use the Import Sample Sheet option instead.
Import IDAT Files allows users to upload the IDAT files from a local folder to the cloud platform for use with the current and future analyses by users within the same workgroup.
Select Launch Analysis
On the Analyses tab, view the analysis status, e.g., initializing or complete.
After the analysis is complete, select the analysis and select the Files tab.
From the Files tab, select the Output folder.
BaseSpace Sequence Hub offers a wide variety of powerful NGS data analysis apps, including:
Illumina Core Apps: Developed or optimized and fully supported by Illumina
BaseSpace Lab Apps: Developed using an accelerated process to make them available to BaseSpace Sequence Hub users faster than conventional Illumina Apps, and provided as-is
Third-party apps: Developed and supported by a thriving ecosystem of third-party app providers in BaseSpace Sequence Hub
In addition, BaseSpace Sequence Hub enables users to develop custom bioinformatics apps in BaseSpace Sequence Hub. App developers are able to keep apps private, share with collaborators, or submit for publication.
Together, these apps cover the common data analysis methods used with Illumina sequencing data. These methods include RNA-Seq, exome/enrichment, amplicon, whole-genome sequencing (WGS), de novo assembly, 16S metagenomics, and more.
For more information about BaseSpace Sequence Hub Apps, refer to the
For information about manually launching apps, see .
For information about scheduling an analysis to automatically launch, see .
The Data Management tab allows you to view and manage all your scanned IDAT files in the cloud. Before viewing, ensure workgroup context is being used so all data available to your workgroup can be seen. The name of your workgroup should appear in the top right corner.
Use the following steps to view and manage scanned array data on BaseSpace Sequence Hub:
Select the Runs tab
Select New Run
Select Microarray Analysis Setup
Select Data Management, next to Analysis Setup. Cancel Analysis Setup.
View data and delete IDAT files as needed (Figure 1).
To view data, filter by Upload Status, or sort and filter by Upload Date.
To delete data, check boxes for individual samples on the left-hand side or delete multiple by using the top checkbox to select all on the current page.
Deleting the selected items will permanently delete them and the action cannot be undone. Deleting items can affect ongoing analysis. Ensure there is no ongoing analysis with the selected items before proceeding.
You can launch apps that perform additional analysis, visualization, or annotation of data. Running apps can incur a charge.
These instructions do not apply to sample sheet-driven apps (from MiSeq), which are launched automatically.
To start an app, do one of the following:
Navigate to the project or sample that you want to run the app on, select the Launch Apps button, and select the desired app from the drop-down list.
From the Apps page, select the desired app from the list and select Launch.
Read the End-User License Agreement and permissions, and then select Accept.
The app guides you through the start-up process. BaseSpace Sequence Hub has limited storage capacity and checks the free space available before starting an app. If there is not enough available space, BaseSpace Sequence Hub displays an error message. See to learn how to save space in your account.
Custom index kit settings in the yaml file:
Level-1 Setting | Level-2 Setting | Description |
---|---|---|
Name | | Name of the kit. It is an internal name, which has to be unique within a domain. |
DisplayName | | Display name of the kit. It is used for the index kit display label in the Run Planning. |
Organization | | Organization name. It is informational and not used in planned run creation. |
AllowedIndexStrategies | | The index strategies supported by the kit. See AllowedIndexStrategies. |
AdapterSequenceRead1 | | Adapter sequence for Read 1. Remove the line if it is not applicable. |
AdapterSequenceRead2 | | Adapter sequence for Read 2. Remove the line if it is not applicable. |
IndexSequences | i7Index1 | A section of Index1 sequences. See IndexSequences. |
IndexSequences | i5Index2 | A section of Index2 sequences. See IndexSequences. |
Settings | DefaultIndexStrategy | The default index strategy. It should be one of the strategies defined in AllowedIndexStrategies. |
Settings | FixedLayout | Indicates if the kit has a fixed layout (true) or not (false). See example 3. |
Settings | Multiplate | |
Settings | FixedIndexPositions | A section containing mappings of well position to index names. It is only applicable for a fixed-layout kit. See Settings - FixedIndexPositions. |
Settings | AllowVariableLengthIndexSequences | Indicates if the kit can have index sequences with different lengths. See Settings - AllowVariableLengthIndexSequences. |
Settings | EnableCustomIndexCycles | Indicates if the kit uses a custom Override Cycles. See OverrideCycles. |
Settings | OverrideCycles | The custom pattern for the Override Cycles. See OverrideCycles. |
Settings | NumCyclesIndex1Override | Used to override the default Index1 cycles. See OverrideCycles. |
Settings | NumCyclesIndex2Override | Used to override the default Index2 cycles. See OverrideCycles. |
Settings | CustomBclConvertSettings | A section of custom BCL Convert settings. See Settings - CustomBclConvertSettings. |
[IndexKit] section of the tsv file:
No | Field Name | Field Value |
---|---|---|
1 | Name | Name of the kit. It is an internal name, which has to be unique within a domain. |
2 | DisplayName | Display name of the kit. It is used for the index kit display label in the Run Planning. |
3 | Description | Description of the kit. It is displayed below the index kit field when the kit is selected in the Run Planning. |
4 | IndexStrategy | The index strategies supported by the kit. See 4.1 - 4.7 for the supported values. |
4.1 | | NoIndex: only allow No Index |
4.2 | | SingleOnly: only allow Single Index |
4.3 | | DualOnly: only allow Dual Indexes |
4.4 | | NoAndSingle: allow No Index and Single Index; default is No Index |
4.5 | | NoAndDual: allow No Index and Dual Indexes; default is No Index |
4.6 | | SingleAndDual: allow Single Index and Dual Indexes; default is Single Index |
4.7 | | All: allow No Index, Single Index and Dual Indexes; default is No Index |
[Resources] section of the tsv file:
No | Name | Type | Format | Value |
---|---|---|---|---|
1 | Adapter | Adapter | string | The Adapter sequence for Read 1. |
2 | AdapterRead2 | AdapterRead2 | string | The Adapter sequence for Read 2. Include this line only when applicable. |
3 | FixedLayout | FixedLayout | bool | Indicates if it is a fixed layout kit. Value is true or false. |
4 | Multiplate | Multiplate | bool | Indicates if it is a fixed layout kit with multi- or single- plate. Value is true or false. |
5 | {Well position name} | FixedIndexPosition | string | Index1 and Index2 names separated by a dash, e.g. D701-D501. |
[Indices] section of the tsv file:
Name | Sequence | IndexReadNumber |
---|---|---|
{Index name} | {Index sequence} | Value is 1 (for Index1) or 2 (for Index2) |
When an analysis fails or doesn't meet quality standards, you can schedule a new analysis using the original input or different inputs.
In cases where the analysis failed to successfully complete in BaseSpace Sequence Hub, you can relaunch the analysis using the original input.
In cases where the analysis did not meet quality standards, you can exclude or include data and relaunch the analysis using only data that passed QC.
Requirements:
If you are scheduling an analysis workflow that uses more than one sample as input (eg, Tumor Normal), all existing biosamples must have been imported from the same original biosample workflow file.
The biosample workflow must be imported using the account that owns the biosamples. If the biosample workflow is uploaded to an account that does not own the biosample, the biosamples are added to the account as new biosamples.
[Optional] Exclude or QC fail any of the following resources. Excluding a resource also excludes any downstream data produced by the resource. For information about changing QC, see Manual QC.
Biosample
Library
Pool
Lane
Run
FASTQ Dataset
Add the following information to a new biosample workflow file.
Biosample Name
Analysis Workflow
[Optional] Analysis Group
[Optional] Sample Label
[Optional] Delivery Mode
All template columns must be present in the biosample workflow file, but the Default Project cannot be changed using a biosample workflow. If the biosample does not exist in the specified project, BaseSpace Sequence Hub creates a new biosample in the project. For information about changing the default project for a biosample, see Change Default Project.
Data owners can share data with you via a unique URL or email invitation. You can accept the share into your personal account or one of your workgroups and then view the data when you are signed into that account.
Do one of the following:
Select the link to the shared data.
Select Accept in the dashboard or email notification.
In the Share dialog, select the context (personal or workgroup) to accept the invite.
Select Accept.
CAUTION: If the owner deletes or revokes access to data they have shared with you, it is no longer available in your account. To get permanent access, download the data or have the data transferred to you.
Analysis workflows are packaged templates of BaseSpace Sequence Hub Apps with predefined settings and QC thresholds to support automation of running analyses. Use an analysis workflow to automatically launch the same app configurations on different biosamples in the project.
When you assign an analysis workflow using a biosample workflow file, the workflow status is listed as Pending. After data are received and conditional dependencies are met, the analysis is launched and the status is listed as Running.
For information about creating an Analysis Workflow that uses your own settings or QC thresholds with an existing BaseSpace Sequence Hub app, see the developer documentation at developer.basespace.illumina.com.
Analysis Workflows can be modified by their owners or members of the workgroup in which the workflow was created. If you need to modify a workflow, Illumina recommends creating a new workflow and adding a version number to the workflow name.
Upload a Biosample Manifest with analysis workflows set for one or more biosamples. See Biosample Workflow.
Sequence data for the biosample.
BaseSpace Sequence Hub checks that dependencies are met, collects data sets associated with the biosample, and launches the analysis.
BaseSpace Sequence Hub excludes datasets that have been QC Failed or are linked to QC failed libraries, pools, or lanes.
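The exclusion rule above can be pictured as a simple filter. The following is a minimal sketch, not the BaseSpace Sequence Hub implementation: the function name `eligible_datasets`, the record fields (`qc`, `library_qc`, `pool_qc`, `lane_qc`), and the status value "Failed" are all hypothetical.

```python
UPSTREAM_QC_FIELDS = ("library_qc", "pool_qc", "lane_qc")


def eligible_datasets(datasets):
    """Return names of datasets that are not QC failed and have no QC failed upstream resource."""
    selected = []
    for dataset in datasets:
        if dataset.get("qc") == "Failed":
            continue  # the dataset itself is QC failed
        if any(dataset.get(field) == "Failed" for field in UPSTREAM_QC_FIELDS):
            continue  # linked to a QC failed library, pool, or lane
        selected.append(dataset["name"])
    return selected


datasets = [
    {"name": "ds1", "qc": "Passed", "library_qc": "Passed", "pool_qc": "Passed", "lane_qc": "Passed"},
    {"name": "ds2", "qc": "Failed"},
    {"name": "ds3", "qc": "Passed", "lane_qc": "Failed"},
]
print(eligible_datasets(datasets))  # ['ds1']
```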
Select one or more analyses that are Running or Pending.
Select the Status drop-down arrow, point to Cancel, and then select Analysis.
BaseSpace Sequence Hub includes two types of apps:
Native applications — Run within the BaseSpace Sequence Hub infrastructure in the Amazon cloud. Native apps consist of an input form, analysis engine, and an in-browser report. Native apps can be automated using the command-line interface and BaseMount.
Web applications — Run outside of the BaseSpace Sequence Hub infrastructure, web apps connect to BaseSpace Sequence Hub using the API or SDK and can run on secure cloud infrastructure, user desktops, or mobile devices. With web apps, analysis can be configured and started from the BaseSpace Sequence Hub input form or from within the app.
BaseSpace Sequence Hub offers four categories of apps:
Illumina Core applications — Apps that are developed by Illumina bioinformaticians and software engineers and rigorously tested and documented. Illumina Technical Support fully supports core apps.
BaseSpace Labs applications — Apps that are developed by Illumina bioinformaticians and software engineers and released as beta apps. Illumina Technical Support does not fully support BaseSpace Labs apps.
Third-party applications — Apps that are developed by third-party groups outside of Illumina. Illumina reviews these native and web apps for quality and robustness before being published. The apps are supported by the third-party developers.
Private applications — Any apps that are developed but not published to the BaseSpace App store. These apps can be shared directly with collaborators to maintain privacy and access to the app.
When you share data in BaseSpace Sequence Hub, you give collaborators access to it while you retain ownership and write access. To transfer ownership of data, see Transfer Ownership.
You can share data with collaborators in the following ways:
Add collaborators to a workgroup—Workgroup collaborators automatically gain read and write access to all data in the workgroup. For information about sharing data between Enterprise accounts (eg, illumina.com) and Personal or Professional accounts, see Create Workgroups and Assign Administrators.
Share a project—You can grant access to a project and any associated biosamples, libraries, analyses, and data sets.
Share a run - You can grant read-only access to a specific run. Associated biosamples, libraries, analyses, and data sets are not shared. However, some biosample names appear on run data pages.
Runs and projects have separate permissions. When you share a run, the associated project is not automatically shared. Biosamples and other resources are not accessible to collaborators of a run.
If you want to share FASTQ data sets without sharing project data, download the data set and share it separately. Shared data cannot be used in biosample workflow files because biosamples are not transferred.
Use the Send Invitation option to invite a collaborator to share a project or run, and to manage collaborator access to shared resources. The shared project or run is visible to the collaborator when they are logged in to their account.
Use this option when you want greater control over who can view your data. For information about using a unique URL that can be used to invite a large group of people, see Share by Link.
When you share a project with a collaborator, the collaborator is granted read-only access to related biosamples, libraries, analyses, and data sets. You can optionally allow write access to the project, which allows collaborators to launch apps that write to the shared project.
When you share a run, related entities are not shared; however, the names of some entities might be visible in the run details.
Select a project or run.
Select Share, then select Send Invitation.
Enter the email address for the collaborator.
[Optional] If you are sharing a run and want to share run data only, clear the Share Associated Project checkbox.
Select Add Collaborator.
[Optional] If you are sharing a project, select the access level.
Read Only—The collaborator can view data.
Write—The collaborator can launch apps that write to the project.
Select Save Settings.
Use the Send Invitation option to manage collaborator permissions. You can review the list of collaborators, change the access level for a project, or revoke access to a run or project.
Select the shared project or run.
Select Share, and then select Send Invitation.
In the Collaborator Settings section of the Share dialog, change or revoke collaborator access.
Select Save Settings.
The collaborator list uses the email address of the personal account or workgroup owner. Because users can accept shares on behalf of workgroups, the listed collaborator might not be the same as the user who received the invitation to share.
Runs and projects have separate permissions. When you change access to a project, access to shared runs is not automatically changed.
Use analysis delivery status to track the delivery of analysis reports or to make sure that analyses for test runs or R&D projects are not shared with customers.
Use the Biosample Workflow to specify an initial delivery status.
Select Analyses.
Select one or more analyses from the list.
Select Status, point to Change, and then select Analysis Delivery Status.
In the Change Delivery Status dialog, select the Change status to drop-down arrow, and then select a delivery status.
[Optional] Add a comment.
Select Save.
You can share project and run data outside your workgroup in two ways:
Invite individual collaborators by email.
Generate a unique URL invitation that can be shared with anyone.
Collaborators can accept shared data to a personal or workgroup account, and then view the data when they are signed into that account.
When you share data, you maintain ownership and write access. If you delete the data, your collaborators lose access to it. For information about data access, and about transferring projects or runs to new owners to allow permanent access, see Data Access After Share or Transfer and Transfer Ownership.
You can configure BaseSpace Sequence Hub to automatically apply a QC status to analyses as they complete. BaseSpace Sequence Hub compares the analysis metrics to your predefined thresholds and assigns a dataset or an analysis a status of QC Passed or QC Failed.
Automatic Analysis QC uses the BaseSpace Sequence Hub API. For more information, see the developer documentation at developer.basespace.illumina.com.
When a dataset is set to QC failed, its data are excluded from downstream data analyses and total yield calculations.
Automatic Analysis QC uses Analysis Workflows and requires the following:
The app used in the Analysis Workflow must register analysis metrics.
The Analysis Workflow must be launched automatically. See Biosample Workflow.
The Analysis Workflow must contain metric thresholds. See Create an Analysis Workflow.
To view the analysis metrics, select Metrics from the analysis details page. For information about manually applying QC status, see Manual QC.
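The threshold comparison described above can be pictured with a short Python sketch. This is only an illustration of the pass/fail logic, not the BaseSpace Sequence Hub API: the function name `qc_status`, the metric names, the limits, and the rule that a missing metric fails QC are all assumptions made for the example.

```python
from typing import Dict, Optional, Tuple

# Hypothetical threshold format: metric name -> (minimum, maximum).
# Use None for a bound that is not set.
Thresholds = Dict[str, Tuple[Optional[float], Optional[float]]]


def qc_status(metrics: Dict[str, float], thresholds: Thresholds) -> str:
    """Return "QC Passed" only if every thresholded metric is within bounds."""
    for name, (low, high) in thresholds.items():
        value = metrics.get(name)
        if value is None:
            # Assumption: a metric the app did not register fails QC.
            return "QC Failed"
        if low is not None and value < low:
            return "QC Failed"
        if high is not None and value > high:
            return "QC Failed"
    return "QC Passed"


# Illustrative metric names and limits only.
thresholds = {"percent_q30": (80.0, None), "percent_duplicates": (None, 15.0)}
print(qc_status({"percent_q30": 91.2, "percent_duplicates": 8.4}, thresholds))
print(qc_status({"percent_q30": 72.5, "percent_duplicates": 8.4}, thresholds))
```

The first call is within both limits and prints QC Passed. The second falls below the minimum for `percent_q30` and prints QC Failed.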
Use the Get Link option to share a run or project with any collaborator who has access to the invitation link. This option is an easy way to share project or run data without having to specify email addresses or set permissions.
When you activate a link and share the URL, collaborators can accept the invitation and view the data in their account. When the link is deactivated, access is limited to collaborators who have already accepted the invitation.
Select a project or run.
Select Share, then select Get Link.
Select Activate.
Copy the URL to share with collaborators.
To deactivate the link, navigate to the project or run, select Share, point to Get Link, and then select Deactivate.
The deactivated URL is permanently disabled and cannot be reactivated. To enable sharing again, activate a new link.
Runs and projects have separate permissions. When you share a run, the associated project is not automatically shared. Biosamples and other resources are not accessible to collaborators of a run.
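The link behavior described in this section (activation, permanent deactivation, and continued access for collaborators who already accepted) can be modeled as a small state machine. The sketch below is a toy model in Python. The class `ShareLink`, its methods, and the use of random tokens are hypothetical and do not reflect how BaseSpace Sequence Hub implements links.

```python
import uuid
from typing import Optional, Set


class ShareLink:
    """Toy model of the Get Link lifecycle (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.active_token: Optional[str] = None
        self.retired_tokens: Set[str] = set()
        self.collaborators: Set[str] = set()

    def activate(self) -> str:
        # A new activation issues a new token; retired tokens are never reused.
        if self.active_token is None:
            self.active_token = uuid.uuid4().hex
        return self.active_token

    def deactivate(self) -> None:
        # Deactivation is permanent for the current token.
        if self.active_token is not None:
            self.retired_tokens.add(self.active_token)
            self.active_token = None

    def accept(self, token: str, email: str) -> bool:
        # Only the currently active token admits new collaborators.
        if self.active_token is None or token != self.active_token:
            return False
        self.collaborators.add(email)
        return True

    def can_view(self, email: str) -> bool:
        # Collaborators who accepted earlier keep access after deactivation.
        return email in self.collaborators


link = ShareLink()
first = link.activate()
link.accept(first, "a@example.com")
link.deactivate()
print(link.accept(first, "b@example.com"))  # False: the old link is disabled
print(link.can_view("a@example.com"))       # True: accepted before deactivation
```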
Transfer analysis files using any of the following methods, then use the analysis delivery status to track the delivery progress.
Share or transfer the project in BaseSpace Sequence Hub. See Share a Project or Run With Collaborators.
Share outside of BaseSpace Sequence Hub using BaseMount or a custom script. See the developer documentation at developer.basespace.illumina.com.
Download and share the files outside of BaseSpace Sequence Hub. See Download Files.
Enterprise domain administrators can create workgroups, rename workgroups, and assign administrators. Workgroup administrators can also add or remove workgroup users.
With an Enterprise subscription, the domain administrator can create multiple workgroups and assign one workgroup administrator to each group.
Select Settings from the Account drop-down list.
On the Settings page, select Account, and then select Manage Workgroups. The Workgroup Admin Console opens.
From the Dashboard, select New.
In the Create Workgroup dialog box, enter the workgroup information.
Enter a unique workgroup name in the Name field.
Enter a description of the workgroup in the Description field.
Enter the email for the person you want to be the workgroup administrator in the Administrator E‑mail field.
[Optional] To create a collaborative workgroup that includes users outside your Enterprise domain (for example, a core lab), select the Enable collaborators outside of this domain checkbox.
Select Create.
Users in your workgroup can incur costs related to data storage, compute, and analysis. To share project and run data only, use the Get Link or Share options. For more information, see Share a Project or Run With Collaborators.
You can access your workgroups through the Account drop-down list.
When you are in your personal account, your name is displayed next to the Account drop-down arrow.
If you are in a workgroup, the workgroup name is displayed.
To return to your personal account, select Personal in the Account drop-down list.
When you use a workgroup to access BaseSpace Sequence Hub, you use the group account. This account has different data, settings, and resources than your personal account.
If you are an administrator, you can rename a workgroup in BaseSpace Sequence Hub by doing the following:
Select Settings from the Account drop-down list.
On the Settings page, select Account, and then select Manage Workgroups. The Workgroup Admin Console opens.
Select the workgroup you want to rename.
Select Change Settings.
In the dialog box, enter the new name.
Select Save.
View and change access levels for workgroup members:
Select Settings from the Account drop-down list.
On the Settings page, select Account, and then select Manage Workgroups. The Workgroup Admin Console opens.
Select Users.
Select the checkbox for each user you want to change.
Select Change Access.
Select the access level for each region and product.
Select Change access.
The drop-down selections show the default access levels, which might differ from the access levels currently assigned to the user.
The workgroup administrator adds users to a workgroup. A workgroup can contain an unlimited number of users.
Select Settings from the Account drop-down list.
On the Settings page, select Account, and then select Manage Workgroups. The Workgroup Admin Console opens.
Select a workgroup in the Dashboard.
Select Users.
Select Invite.
In the Invite new user dialog box, enter the email addresses for the users you want to add. Enter one address per line or as a comma-separated list.
Select the BaseSpace Sequence Hub drop-down list, and then select the access level.
No Access—The user does not have access to the workgroup.
Has Access—The user has access to the workgroup. This is the default selection.
[Optional] Select the access level for regional instances or other products available to the workgroup.
Select Grant access.
The invited user receives an email invitation and a dashboard notification. Email Illumina Technical Support to revoke an invitation to someone who is not a registered user of BaseSpace Sequence Hub.
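If you prepare a long invitee list outside the dialog box, a small script can normalize it to the accepted input format (one address per line or a comma-separated list) before you paste it in. The following Python sketch is one way to do that. The function name `parse_invitees` is hypothetical, and removing duplicates is an assumption added for convenience rather than documented dialog behavior.

```python
import re
from typing import List


def parse_invitees(text: str) -> List[str]:
    """Split input on commas and line breaks; drop blanks and duplicates."""
    seen = set()
    result = []
    for part in re.split(r"[,\n]", text):
        address = part.strip()
        key = address.lower()  # compare addresses case-insensitively
        if address and key not in seen:
            seen.add(key)
            result.append(address)
    return result


raw = "a@example.com, b@example.com\nc@example.com"
# Emit one address per line, ready to paste into the dialog box.
print("\n".join(parse_invitees(raw)))
```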
The workgroup administrator removes users from a workgroup.
Select Settings from the Account drop-down list.
On the Settings page, select Account, and then select Manage Workgroups. The Workgroup Admin Console opens.
Select the Workgroup drop-down list on the Settings page, and then select the appropriate workgroup.
Select Users.
Select the checkbox for each user you want to remove.
Select Remove.
The workgroup administrator adds administrators to a workgroup. A workgroup can contain an unlimited number of users.
Select Settings from the Account drop-down list.
On the Settings page, select Account, and then select Manage Workgroups. The Workgroup Admin Console opens.
Select a workgroup in the Dashboard.
Select Overview.
Select Invite.
In the Invite to workgroup administration dialog box, enter the email addresses for the administrators you want to add. Enter one address per line or as a comma-separated list.
Select Grant access.
The invited user receives an email invitation and a dashboard notification. Email Illumina Technical Support to revoke an invitation to someone who is not a registered user of BaseSpace Sequence Hub.
You can remove an administrator from a workgroup, or you can revoke their administrator access and keep them in the workgroup as a basic user.
Select Settings from the Account drop-down list.
On the Settings page, select Account, and then select Manage Workgroups. The Workgroup Admin Console opens.
Select a workgroup in the Dashboard.
Select Overview.
Select the checkbox for each administrator you want to remove.
Select Remove.
To remove the administrator from the workgroup, select Remove these users from this workgroup.
Select Remove.
The following tables list the data read and write permissions after shares and transfers associated with projects and runs.
For Shared Resources:
For Transferred Resources:
The firewall protects the iScan control computer by filtering incoming traffic to remove potential threats. The firewall is enabled by default to block all inbound connections. Keep the firewall enabled and allow outbound connections.
For the instrument to connect to BaseSpace Sequence Hub, add the regional platform endpoints and instrument-specific endpoints to the allow list on your firewall. Regional endpoints and further details can be found in Security and Networking for Illumina instrument control computers.
The following table shows the applicable endpoints for the iScan.
Endpoint | Category | Purpose |
---|---|---|
A workgroup consists of users who share data, storage, and other resources.
Each workgroup has one designated administrator who can modify workgroup settings and add or remove users. The administrator cannot be changed or transferred.
With a Professional subscription, an Illumina representative sets up the workgroup and designates one workgroup administrator.
With an Enterprise subscription, an Illumina representative creates a domain and designates a domain administrator. Domain administrators can create multiple workgroups and assign one administrator to each workgroup. The domain administrator can access and administer any workgroups in the domain. See .
Users in your workgroup can incur costs related to data storage, compute, and analysis. To share project and run data only, use the Get Link or Share options. For more information, see Share a Project or Run With Collaborators.
Entity | Access in Shared Project | Access in Shared Run |
---|---|---|
Project | Shared as read-only. The owner retains read and write permissions. | Not shared unless explicitly included when the run is shared. |
Run | Associated runs are shared as read-only. The owner retains read and write permissions. | Shared as read-only. The owner retains read and write permissions. |
Biosamples | Associated biosamples are shared as read-only. The owner retains read and write permissions. | Not shared. Some names might be visible on the run details biosamples page. |
Libraries | Associated libraries are shared as read-only. The owner retains read and write permissions. | Not included. |
Pools and Lab Requeues | Associated pools and lab requeues are shared as read-only. The owner retains read and write permissions. | Not included. |
Analyses | Associated analyses are shared as read-only. The owner retains read and write permissions. | Not included. |
Datasets | Associated datasets are shared as read-only. The owner retains read and write permissions. | Not included. |
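For scripts or internal checklists, the default sharing rules listed above can be written as a simple lookup. The Python sketch below transcribes those defaults. The names `SHARED_ACCESS` and `collaborator_access` and the keys `project_share` and `run_share` are hypothetical. The sketch does not model optional write access to a shared project or explicitly including the project when a run is shared.

```python
from typing import Dict

# Default collaborator access per entity, transcribed from the rules above.
# "read" = shared as read-only, "none" = not shared or not included.
SHARED_ACCESS: Dict[str, Dict[str, str]] = {
    "project": {"project_share": "read", "run_share": "none"},
    "run": {"project_share": "read", "run_share": "read"},
    "biosamples": {"project_share": "read", "run_share": "none"},
    "libraries": {"project_share": "read", "run_share": "none"},
    "pools_and_lab_requeues": {"project_share": "read", "run_share": "none"},
    "analyses": {"project_share": "read", "run_share": "none"},
    "datasets": {"project_share": "read", "run_share": "none"},
}


def collaborator_access(entity: str, share_type: str) -> str:
    """Look up what a collaborator can do with an entity after a share."""
    return SHARED_ACCESS[entity][share_type]


print(collaborator_access("datasets", "project_share"))  # read
print(collaborator_access("datasets", "run_share"))      # none
```

In every case the owner retains read and write permissions, so the lookup only describes the collaborator's side.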
Entity | Access in Transferred Project | Access in Transferred Run |
---|---|---|
Project | The transferee is the new owner and has read and write access. The original owner loses all access. | Not included. |
Run | | The transferee is the new owner of the run and run files. FASTQ files are excluded from the transfer. |
Biosamples | Ownership does not transfer. Associated biosamples are shared as read-only. The owner retains read and write permissions. | Ownership is not shared or transferred. Some names might be visible on the run details biosamples page. |
Libraries | Ownership does not transfer. Associated libraries are shared as read-only. The owner retains read and write permissions. | Not included. |
Pools and Lab Requeues | Ownership does not transfer. Associated pools and lab requeues are shared as read-only. The owner retains read and write permissions. | Not included. |
Analyses | Ownership does not transfer. Associated analyses are shared as read-only. The owner retains read and write permissions. | Not included. |
Datasets | Ownership does not transfer. Associated datasets are shared as read-only. The owner retains read and write permissions. | Not included. |
Endpoint | Category | Purpose |
---|---|---|
ica.illumina.com | Required | Send IDAT files to ICA |
o.ss2.us | Required | Certificate authorization |
ocsp.digicert.com | Required | Certificate authorization |
ocsp.pki.goog/gsr2 | Required | Certificate authorization |
ocsp.rootca1.amazontrust.com | Required | Certificate authorization |
ocsp.rootg2.amazontrust.com | Required | Certificate authorization |
ocsp.sca1b.amazontrust.com | Required | Certificate authorization |
fonts.gstatic.com | Required | Display fonts |
fonts.googleapis.com | Recommended | Display fonts |
cdn.walkme.com | Recommended | Telemetry |
cdn3.userzoom.com | Recommended | Telemetry |
dpm.demdex.net | Recommended | Telemetry |
illuminainc.demdex.net | Recommended | Telemetry |
illuminainc.tt.omtrdc.net | Recommended | Telemetry |
smetrics.illumina.com | Recommended | Telemetry |
google.com | Recommended | Telemetry |
google-analytics.com | Recommended | Telemetry |
stats.g.doubleclick.net | Recommended | Telemetry |
illumina.com | Optional | Access Illumina support material |
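After updating the firewall allow list, you might want to confirm that outbound connections to the listed endpoints succeed. The Python sketch below shows one possible check using only the standard library. It is not an Illumina tool. The helper names are hypothetical, only a few endpoints are included as examples, and the port (443) is an assumption, because ports are not listed here. Confirm ports against Security and Networking for Illumina instrument control computers.

```python
import socket
from typing import Callable, Iterable, List, Tuple

# A few example rows from the endpoint list: (endpoint, category).
ENDPOINTS = [
    ("ica.illumina.com", "Required"),
    ("ocsp.pki.goog/gsr2", "Required"),
    ("fonts.googleapis.com", "Recommended"),
    ("illumina.com", "Optional"),
]


def host_of(endpoint: str) -> str:
    """Drop any path component, e.g. "ocsp.pki.goog/gsr2" -> "ocsp.pki.goog"."""
    return endpoint.split("/", 1)[0]


def tcp_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if an outbound TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def unreachable(
    endpoints: Iterable[Tuple[str, str]],
    port: int = 443,  # assumed port; not specified in this document
    probe: Callable[[str, int], bool] = tcp_reachable,
) -> List[Tuple[str, str]]:
    """Return the (endpoint, category) pairs that the probe could not reach."""
    return [(e, c) for e, c in endpoints if not probe(host_of(e), port)]
```

Calling `unreachable(ENDPOINTS)` on the control computer returns the endpoints that could not be reached, together with their category, so Required entries can be addressed first. The `probe` parameter exists so that the check can be exercised without network access.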