Once a cluster is started, the cluster manager can be accessed from the workspace node.
Job resources
Every cluster member has a certain capacity which is determined by the selected model for the cluster member.
The following complex values have been added to the SGE cluster environment and are requestable.
static_cores (default: 1)
static_mem (default: 2G)
These values are used to avoid oversubscription of a node which can result in Out-Of-Memory or unresponsiveness. You need to ensure these limits are not exceeded.
To ensure stability of the system, some headroom is deducted from the total node capacity.
Scaling
These two values are used by the SGE auto scaler when running in dynamic mode. The SGE auto scaler will summarise all pending jobs and their requested resources to determine the scale up/down operation within the defined range.
Cluster members will remain in the cluster for at least 300 seconds. The auto scaler only executes one scale up/down operation at a time and waits for the cluster to stabilise before taking on a new operation.
Job requests that require more resources than the capacity of the selected resource model will be ignored by the auto scaler and will wait indefinitely.
The operation of the auto scaler can be monitored in the log file /data/logs/sge-scaler.log
Submitting jobs
Submitting a single job
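For example, a job can be submitted with qsub while requesting the complex values described above (the script name and resource values are illustrative):
qsub -l static_cores=2,static_mem=4G my_job.sh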
Submitting a job array
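A job array can be submitted with the -t option (the range, resources and script name are illustrative):
qsub -t 1-100 -l static_cores=1,static_mem=2G my_array_job.sh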
Do not limit the job concurrency amount as this will result in unused cluster members.
Monitoring members
Listing all members of the cluster
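For example, qhost shows all members of the cluster together with their load and memory:
qhost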
Managing running/pending jobs
Listing all jobs in the cluster
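For example, qstat lists the jobs of the current user; add -u "*" to list the jobs of all users:
qstat -u "*"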
Showing the details of a job.
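For example (replace <job_id> with the job number reported by qstat):
qstat -j <job_id>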
Deleting a job.
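For example:
qdel <job_id>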
Managing executed jobs
Showing the details of an executed job.
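Finished jobs can be inspected through the SGE accounting records, for example:
qacct -j <job_id>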
SGE Reference documentation
SGE command line options and configuration details can be found .
Bench workspaces require setting a Docker image to use as the image for the workspace. Illumina Connected Analytics (ICA) provides a default Docker image with JupyterLab installed.
JupyterLab supports Jupyter Notebook documents (.ipynb). Notebook documents consist of a sequence of cells which may contain executable code, markdown, headers, and raw text.
The JupyterLab Docker image contains the following environment variables:
ICA_URL: https://ica.illumina.com/ica (ICA server URL)
ICA_PROJECT (Obsolete): ICA project ID
ICA_PROJECT_UUID: Current ICA project UUID
ICA_SNOWFLAKE_ACCOUNT: ICA Snowflake (Base) Account ID
ICA_SNOWFLAKE_DATABASE: ICA Snowflake (Base) Database ID
ICA_PROJECT_TENANT_NAME: Name of the owning tenant of the project where the workspace is created.
ICA_STARTING_USER_TENANT_NAME: Name of the tenant of the user who last started the workspace.
ICA_COHORTS_URL: URL of the Cohorts web application used to support the Cohorts view.
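These variables can be read from a terminal or notebook shell in the workspace, for example:
echo "$ICA_URL"
echo "$ICA_PROJECT_UUID"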
To export data from your workspace to your local machine, it is best practice to move the data in your workspace to the /data/project/ folder so that it becomes available in your project under projects > your_project > Data.
ICA Python Library
Included in the default JupyterLab Docker image is a python library with APIs to perform actions in ICA, such as adding data, launching pipelines, and operating on Base tables. The python library is generated from the ICA API specification.
The ICA Python library API documentation can be found in folder /etc/ica/data/ica_v2_api_docs within the JupyterLab Docker image.
See the tutorial notebooks included in the image for examples on using the ICA Python library.
Bench
ICA provides a tool called Bench for interactive data analysis. This is a sandboxed workspace which runs a Docker image with access to the data and pipelines within a project. The workspace runs on Amazon Web Services (with storage on Amazon S3) and comes with associated processing and provisioning costs. It is therefore best practice not to keep your Bench instances running indefinitely, but to stop them when not in use.
Access
Having access to Bench depends on the following conditions:
Bench needs to be included in your ICA subscription.
The project owner needs to enable Bench for their project.
Individual users of that project need to be given access to Bench.
Enabling Bench for your project
After creating a project, go to Projects > your_project > Bench > Workspaces page and click the Enable button. The entitlements you have determine the available resources for your Bench workspaces. If you have multiple entitlements, all the resources of your individual entitlements are taken into account. Once Bench is enabled, users with matching permissions have access to the Bench module in that project.
If you do not see the Enable button for Bench, then either your tenant subscription does not include Bench or the tenant to which you belong is not the one where the project was created. Users from other tenants can create workspaces in Bench once Bench is enabled, but they cannot enable the Bench module itself.
Setting user-level access
Once Bench has been enabled for your project, the combination of roles and teams settings determines if a user can access Bench.
Tenant administrators and project owners are always able to access Bench and perform all actions.
The teams settings page at Projects > your_project > Project Settings > Team determines the role for the user/workgroup.
No Access means you have no access to the Bench workspace for that project.
Contributor gives you the right to start and stop the Bench workspace and to access the workspace contents, but not to create or edit the workspace.
Administrator gives you the right to create, edit, delete, start and stop the Bench workspace, and to access the actual workspace contents. In addition, the administrator can also build new derived Bench images and tools.
Finally, a verification is done of your user rights against the required workspace permissions. You will only have access when your user rights meet or exceed the required workspace permissions. The possible required Workspace permissions include:
Upload / Download rights (Download rights are mandatory for technical reasons)
Project Level (No Access / Data Provider / Viewer / Contributor)
Flow (No Access / Viewer / Contributor)
Base (No Access / Viewer / Contributor)
Flow diagram of access to Bench
Workspaces
The main concept in Bench is the Workspace. A workspace is an instance of a Docker image that runs the framework which is defined in the image (for example JupyterLab, R Studio). In this workspace, you can write and run code and graphically represent data. You can use API calls to access data, analyses, Base tables and queries in the platform. Via the command line, R-packages, tools, libraries, IGV browsers, widgets, etc. can be installed.
You can create multiple workspaces within a project and each workspace runs on an individual node and is available in different resource sizes. Each node has local storage capacity, where files and results can be temporarily stored and exported from to be permanently stored in a Project. The size of the storage capacity can range from 1GB – 16TB.
For each workspace, the status is indicated by its color.
Once a workspace is started, it will be restarted every 30 days for security reasons. Even when you have automatic shutdown configured to be more than 30 days, the workspace will be restarted after 30 days and the remaining days will be counted in the next cycle.
You can see the remaining time until the next event (Shutdown or restart) in the workspaces overview and on the workspace details.
Create Workspace
If this is the first time you are using a workspace in a Project, click Enable to create new Bench Workspaces. In order to use Bench, you first need to have a workspace. This workspace determines which docker image will be used with which node and storage size.
Complete the following fields and save the changes.
(*1) URLs must comply with the following rules:
URLs can be between 1 and 263 characters including dot (.).
URLs can begin with a leading dot (.).
Domain and Sub-domains:
Can include alphanumeric characters (Letters A-Z and digits 0-9). Case insensitive.
Can contain hyphens (-) and underscores (_), but not as a first or last character.
Length between 1 and 63 characters.
Dot (.) must be placed after a domain or sub-domain.
If you use a trailing slash like in the path ftp.example.net/folder/ then you will not be able to access the path ftp.example.net/folder without the trailing slash included.
Regex for URL : [(http(s)?):\/\/(www\.)?a-zA-Z0-9@:%._\+~#=-]\{2,256}\.[a-z]\{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&\/\/=]*)
(*2) When you grant workspace access to multiple users, you need to provide an API key to access the workspace. Authenticate using the icav2 config set command; the CLI will prompt for an x-api-key value. Enter the API key generated from the product dashboard. See here for more information.
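For example, from a terminal in the workspace (the CLI prompts for the remaining settings):
icav2 config set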
Example URLs
The following are example URLs which will be considered valid.
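A few illustrative URLs that satisfy the rules above (these are examples only, not an exhaustive or required list):
.github.com
quay.io
registry.hub.docker.com
ftp.example.net/folder/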
The workspace can be edited afterwards when it is stopped, on the Details tab within the workspace. The changes will be applied when the workspace is restarted.
Workspace permissions
When Access limited to workspace owner is selected, only the workspace owner can access the workspace. Everything created in that workspace will belong to the workspace owner.
Administrator vs Contributor
Bench administrators are able to create, edit and delete workspaces and start and stop workspaces. If their permissions match or exceed those of the workspace, they can also access the workspace contents.
Contributors are able to start and stop workspaces and if their permissions match or exceed those of the workspace, they can also access the workspace contents.
Setting Workspace Permissions
The teams setting determines if someone is an administrator or contributor, while the dedicated permissions you set on the workspace level indicate what the workspace itself can and cannot do within your project. For this reason, the users need to meet or exceed the required permissions to enter this workspace and use it.
For security reasons, the Tenant administrator and Project owner can always access the workspace.
If one of your permissions is not high enough as a Bench contributor, you will see the following message: "You are not allowed to use this workspace as your user permissions are not sufficient compared to the permissions of this workspace".
The permissions that a Bench workspace can receive are the following:
Upload rights
Download rights (required)
Project (No Access - Dataprovider - Viewer - Contributor)
Flow (No Access - Viewer - Contributor)
Base (No Access - Viewer - Contributor)
Based on these permissions, you will be able to upload or download data to your ICA project (upload and download rights) and will be allowed to take actions in the Project, Flow and Base modules related to the granted permission.
If you encounter issues when uploading/downloading data in a workspace, the security settings for that workspace may be set to not allow uploads and downloads. This can result in RequestError: send request failed and read: connection reset by peer. This is by design: restricted workspaces limit data access to your project via /data/project to prevent the extraction of large amounts of (proprietary) data.
Workspaces which were created before this functionality existed can be upgraded by enabling these workspace permissions. If the workspaces are not upgraded, they will continue working as before.
Delete workspace (Bench Administrators Only)
To delete a workspace, go to Projects > your_project > Bench > Workspaces > your_workspace and click “Delete”. Note that the delete option is only available when the workspace is stopped.
The workspace will no longer be accessible, nor will it be shown in the list of workspaces. Its content will be deleted, so if there is any information that should be kept, either save it in a Docker image which you can use as a starting point next time, or export it using the API.
Use workspace
The workspace is not always accessible. It needs to be started before it can be used. From the moment a workspace is Running, a node with a specific capacity is assigned to this workspace. From that moment on, you can start working in your workspace.
As long as the workspace is running, the resources provided for this workspace will be charged.
Start workspace
To start the workspace, follow the next steps:
Go to Projects > your_project > Bench > Workspaces > your_workspace > Details
Click on Start Workspace button
On the top of the details tab, the status changes to “Starting”. When you click on the >_Access tab, the message “The workspace is starting” appears.
Wait until the status is “Running” and the “Access” tab can be opened. This can take some time because the necessary resources have to be provisioned.
You can refresh the workspace status by selecting the round refresh symbol at the top right.
Once a workspace is running, it can be manually stopped or it will be automatically shut down after the amount of time configured in the Automatic Shutdown field. Even with automatic shutdown, it is still best practice to stop your workspace run when you no longer need it to save costs.
You can edit running workspaces to update the shutdown timer, shutdown reminder and auto restart reminder.
If you want to open a running workspace in a new tab, then select the link at Projects > your_project > Bench > Workspaces > Details tab > Access. You can also copy the link with the copy symbol in front of the link.
Stop workspace
When you exit a workspace, you can choose to stop the workspace or keep it running. Keeping the workspace running means that it will continue to use resources and incur associated costs. To stop the workspace, select stop in the displayed dialog. You can also stop a workspace by opening it and selecting stop at the top right.
Stopping the workspace will stop the notebook, but will not delete local data. Content will no longer be accessible and no actions can be performed until it is restarted. Any work that has been saved will stay stored.
Storage will continue to be charged until the workspace is deleted.
Administrators have a delete option for the workspace in the exit screen.
The project/tenant administrator can enter and stop workspaces for their project/tenant even if they did not start those workspaces at Projects > your_project > Bench > Workspaces > your_workspace > Details. Be careful not to stop workspaces that are processing data. For security reasons, a log entry is added when a project/tenant administrator enters and exits a workspace.
You can see who is using a workspace in the workspace list view.
Workspace Tabs
Access tab
Once the Workspace is running, the default applications are loaded. These are defined by the start script of the docker image.
The docker images provided by Illumina will load JupyterLab by default. It also contains Tutorial notebooks that can help you get started. Opening a new terminal can be done via the Launcher, + button above the folder structure.
Docker Builds tab (Bench Administrators only)
To ensure that packages (and other objects, including data) are permanently installed on a Bench image, a new Bench image needs to be created, using the BUILD option in Bench. A new image can only be derived from an existing one. The build process uses the DOCKERFILE method, where an existing image is the starting point for the new Docker Image (The FROM directive), and any new or updated packages are additive (they are added as new layers to the existing Docker file).
The Dockerfile commands are all run as ROOT, so it is possible to delete or interfere with an image in such a way that the image is no longer running correctly. The image does not have access to any underlying parts of the platform so will not be able to harm the platform, but inoperable Bench images will have to be deleted or corrected.
In order to create a derived image, open up the image that you would like to use as the basis and select the Build tab.
Name: By default, this is the same name as the original image and it is recommended to change the name.
Version: Required field which can be any value.
Description: The description for your docker image (for example, indicating which apps it contains).
Code: The Docker file commands must be provided in this section.
The first 4 lines of the Docker file must NOT be edited. It is not possible to start a docker file with a different FROM directive. The main docker file commands are RUN and COPY. More information on them is available in the official Docker documentation.
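A minimal sketch of the kind of additive commands that could be appended below the preserved lines (the package names are only examples and assume the base image provides pip):
RUN pip install --no-cache-dir numpy pandas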
Once all information is present, click the Build button. Note that the build process can take a while. Once building has completed, the docker image will be available on the Data page within the Project. If the build has failed, the log will be displayed here and the log file will be in the Data list.
Tools (Bench Administrators Only)
From within the workspace it is possible to create a tool from the Docker image.
Click the Manage > Create CWL Tool button in the top right corner of the workspace.
Give the tool a name.
Replace the description of the tool to describe what it does.
Add a version number for the tool.
Click the Docker Build tab.
Here the image that accompanies the tool will be created.
Change the name for the image.
Change the version.
Replace the description to describe what the image does.
Below the line where it says “#Add your commands below.”, write the code necessary for running this docker image.
Click the General tab. This tab and all next tabs will look familiar from Flow. Enter the information required for the tool in each of the tabs. For more detailed instruction check out the section in the Flow documentation.
Click the Save button in the upper, right-hand corner to start the build process.
The building can take a while. When it has completed, the tool will be available in the Tool Repository.
Workspace Data
To export data from your workspace to your local machine, it is best practice to move the data in your workspace to the /data/project/ folder so that it becomes available in your project under projects > your_project > Data. Although this storage is slow, it offers read and write access and access to the content from within ICA.
For fast read-only access, link folders with the CLI command workspace-ctl data create-mount --mode read-only.
For fast read/write access, link non-indexed folders which are visible, but whose contents are not accessible from ICA. Use the CLI command workspace-ctl data create-mount --mode read-write to do so. You cannot have fast read-write access to indexed folders, as the indexing mechanism on those would degrade performance.
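For example (the mount path and source folder are illustrative):
workspace-ctl data create-mount --mode read-only --mount-path /data/mounts/refdata --source /data/project/refdata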
Every workspace you start has a read-only /data/.software/ folder which contains the icav2 command-line interface (and readme file).
File Mapping
Activity tab
The last tab of the workspace is the activity tab. On this tab all actions performed in the workspace are shown. For example, the creation of the workspace, starting or stopping of the workspace,etc. The activities are shown with their date, the user that performed the action and the description of the action. This page can be used to check how long the workspace has run.
In the general Activity page of the project, there is also a Bench activity tab. This shows all activities performed in all workspaces within the project, even when the workspace has been deleted. The Activity tab in the workspace only shows the action performed in that workspace. The information shown is the same as per workspace, except that here the workspace in which the action is performed is listed as well.
DRAGEN can be run:
In either FPGA mode (hardware-accelerated) or software mode when using FPGA instances. This can be useful when comparing performance gains from hardware acceleration or to distribute concurrent processes between the FPGA and CPU.
In software mode when using non-FPGA instances.
To run DRAGEN in software mode, you need to use the DRAGEN --sw-mode parameter.
The DRAGEN command line parameters to specify the location of the licence file are different.
DRAGEN software is provided in specific Bench images with names starting with Dragen. For example (available versions may vary):
Dragen 4.4.1 - Minimal provides DRAGEN 4.4.1 and SSH access
Dragen 4.4.6 provides DRAGEN 4.4.6, SSH and JupyterLab.
Prerequisites
Memory
The instance type is selected during workspace creation (Projects > your_project > Bench > Workspaces). The amount of RAM available on the instance is critical. 256GiB RAM is a safe choice to run DRAGEN in production. All FPGA2 instances offer 256GiB or more of RAM.
When running in Software mode, use (348GiB RAM) or (144 GiB RAM) to ensure enough RAM is available for your runs.
During pipeline development, when typically using small amounts of data, you can try to scale down in instance types to save costs. You can start at hicpu-large and progressively use smaller instances, though you will need at least standard-xlarge.
If DRAGEN runs out of available memory, the system is rebooted, losing your currently running commands and interface.
DRAGEN version 4.4.6 and later verify if the system has at least 128GB of memory available. If not enough memory is available, you will encounter an error stating that the Available memory is less than the minimum system memory required 128GB.
This can be overridden with the command line parameter dragen --min-memory 0
FPGA-mode
Using an fpga2-medium instance.
Example
Software-mode
Using a standard-xlarge instance.
Software mode is activated with the DRAGEN --sw-mode parameter.
Example
FUSE Driver
Bench Workspaces use a FUSE driver to mount project data directly into a workspace file system. There are both read and write capabilities with some limitations on write capabilities that are enforced by the underlying AWS S3 storage.
As a user, you are allowed to do the following actions from Bench (when having the correct user permissions compared to the workspace permissions) or through the CLI:
Copy project data
Delete project data
Mount project data (CLI only)
Unmount project data (CLI only)
When you have a running workspace, you will find a file system in Bench under the project folder along with the basic and advanced tutorials. When opening that folder, you will see all the data that resides in your project.
This is a fully mounted version of the project data. Changes in the workspace to project data cannot be undone.
Copy project data
The FUSE driver allows the user to easily copy data from /data/project to the local workspace and vice versa.
There is a file size limit of 500 GB per file for the FUSE driver.
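For example, copying a file from the mounted project data to the local workspace disk and copying a result back (the paths are illustrative):
cp /data/project/inputs/sample1.fastq.gz /data/work/
cp /data/work/sample1.vcf.gz /data/project/results/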
Delete project data
The FUSE driver also allows you to delete data from your project. This differs from earlier Bench behaviour, where you worked on a local copy and the original file remained in your project.
Deleting project data through Bench workspace through the FUSE driver will permanently delete the data in the Project. This action cannot be undone.
CLI
Using the FUSE driver through the CLI is not supported for Windows users. Linux users will be able to use the CLI without any further actions; Mac users will need to install the kernel extension from macFUSE.
MacOS uses hidden metadata files beginning with ._, which are copied over and exposed during CLI copies to your project data. These can be safely deleted from your project.
Mount and unmount of data needs to be done through the CLI. In Bench this happens automatically and is not needed anymore.
Do NOT use the cp -f command to copy or move data to a mounted location. This will result in data loss as data on the destination location will be deleted.
Restrictions
Once a file is written, it cannot be changed! You will not be able to update it in the project location because of the restrictions mentioned above.
Trying to update files or saving your notebook in the project folder will typically result in File Save Error for fusedrivererror.ipynb Invalid response: 500 Internal Server Error.
Some examples of other actions or commands that will not work because of the above mentioned limitations:
Save a jupyter notebook or R script on the /project location
Add/remove a file from an existing zip file
Redirect with append to an existing file e.g. echo "This will not work" >> myTextFile.txt
Rename a file due to the existing association between ICA and AWS
A file can be written only sequentially. This is a restriction that comes from the library the FUSE driver uses to store data in AWS. That library supports only sequential writing, random writes are currently not supported. The FUSE driver will detect random writes and the write will fail with an IO error return code. Zip will not work since zip writes a table of contents at the end of the file. Please use gzip.
Listing data (ls -l) reads data from the platform. The actual data comes from AWS and there can be a short delay between the writing of the data and the listing being up to date. As a result, a file that was just written may temporarily appear as a zero-length file, and a file that was just deleted may still appear in the file list. This is a tradeoff: the FUSE driver caches some information for a limited time, and during that time the information may seem wrong. Note that besides the FUSE driver, the library used by the FUSE driver to implement the raw FUSE protocol and the OS kernel itself may also do caching.
Jupyter notebooks
To use a specific file in a jupyter notebook, you will need to use '/data/project/filename'.
Old Bench workspaces
This functionality won't work for old workspaces unless you enable the permissions for that old workspace.
Containers in Bench
Bench has the ability to handle containers inside a running workspace.
This allows you to install and package software more easily as a container image and provides capabilities to pull and run containers inside a workspace.
Bench offers a container runtime as a service in your running workspace. This allows you to do standardized container operations such as pulling in images from public and private registries, build containers at runtime from a Dockerfile, run containers and eventually publish your container to a registry of choice to be used in different ICA products such as ICA Flow.
Setup
The Container Service is accessible from your Bench workspace
The container service uses the workspace disk to store any container images you pulled in or created.
To interact with the Container Service, a container remote client CLI is exposed automatically in the /data/.local/bin folder. The Bench workspace environment is preconfigured to automatically detect where the Container Service is made available using environment variables. These environment variables are automatically injected into your environment and are not determined by the Bench Workspace Image.
Container Management
Use either docker or podman cli to interact with the Container Service. Both are interchangeable and support all the standardized operations commonly known.
Pulling a Container Image
To run a container, the first step is to either build a container from a source container or pull in a container from a registry
Public Registry
A public image registry does not require any form of authentication to pull the container layers.
The following command line example shows how to pull in a commonly known image.
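A minimal example (alpine is just an illustrative public image):
docker pull alpine:latest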
The Container Service uses Dockerhub by default to pull images from if no registry hostname is defined in the container image URI.
Private Registry
To pull images from a private registry, the Container Service needs to authenticate to the Private Registry.
The following command line example shows how to instruct the Container Service to log in to the private registry.hub.docker.com registry.
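A minimal sketch (you will be prompted for your registry credentials):
docker login registry.hub.docker.com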
Depending on your authorisations in the private registry you will be able to pull and push images. These authorisations are managed outside of the scope of ICA.
Pushing a Container Image
Depending on the Registry setup you can publish Container Images with or without authentication.
If Authentication is required, follow the login procedure described in Private Registry
The following command line example shows how to publish a locally available Container Image to a private registry in Dockerhub.
Saving a Container Image as an Archive
The following example shows how to save a locally available Container Image as a compressed tar archive.
This lets you upload the container image into the Private ICA Docker Registry.
Listing Locally Available Container Images
The following example shows how to list all locally available Container Images
Deleting a Container Image
Container Images require storage capacity on the Bench Workspace disk. The capacity is shown when listing the locally available container images. The container Images are persisted on disk and remain available whenever a workspace stops and restarts.
The following example shows how to clean up a locally available Container Image
When a Container Image has multiple tags, all the tags need to be removed individually to free up disk capacity.
Running a Container
A Container Image can be instantiated in a Container running inside a Bench Workspace.
By default the workspace disk (/data) will be made available inside the running Container. This lets you access data from the workspace environment.
When running a Container, the default user defined in the Container Image manifest will be used and mapped to the uid and the gid of the user in the running Bench Workspace (uid:1000, gid: 100). This will ensure files created inside the running container on the workspace disk will have the same file ownership permissions.
Run a Container as a normal user
The following command line example shows how to run an instance of a locally available Container Image as a normal user.
Run a Container as root user
Running a Container as root user maps the uid and gid inside the running Container to the running non-root user in the Bench Workspace. This lets you act as user with uid 0 and gid 0 inside the context of the container.
By enabling this functionality, you can install system level packages inside the context of the Container. This can be leveraged to run tools that require additional system level packages at runtime.
The following command line example shows how to run an instance of a locally available Container as root user and install system level packages
When no specific mapping is defined using the --userns flag, the user in the running Container will be mapped to an undefined uid and gid based on an offset of id 100000. Files created on your workspace disk from the running Container will also use this uid and gid to define the ownership of the file.
Building a Container
To build a Container Image, you need to describe the instructions in a Dockerfile.
This next example builds a local Container Image and tags it as myimage:1.0. The Dockerfile used in this example is:
The following command line example will build the actual Container Image
When defining the build context location, keep in mind that using the HOME folder (/data) will index all files available in /data, which can be a lot and will slow down the process of building. Hence the reason to use a minimal build context whenever possible.
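# Example: running the DRAGEN mapper in FPGA mode (e.g. on an fpga2-medium instance)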
mkdir /data/demo
cd /data/demo
# download ref
wget --progress=dot:giga https://s3.amazonaws.com/stratus-documentation-us-east-1-public/dragen/reference/Homo_sapiens/hg38.fa -O hg38.fa
# => 0.5min
# Build ht-ref
mkdir ref
dragen --build-hash-table true --ht-reference hg38.fa --output-directory ref
# => 6.5min
# run DRAGEN mapper
FASTQ=/opt/edico/self_test/reads/midsize_chrM.fastq.gz
# Next line is needed to resolve "run the requested pipeline with a pangenome reference, but a linear reference was provided" in DRAGEN (4.4.1 and others). Comment out when encountering unrecognised option '--validate-pangenome-reference=false'.
DRAGEN_VERSION_SPECIFIC_PARAMS="--validate-pangenome-reference=false"
# License Parameters
LICENSE_PARAMS="--lic-instance-id-location /opt/dragen-licence/instance-identity.protected --lic-credentials /opt/dragen-licence/instance-identity.protected/dragen-creds.lic"
mkdir out
dragen -r ref --output-directory out --output-file-prefix out -1 $FASTQ --enable-variant-caller false --RGID x --RGSM y ${LICENSE_PARAMS} ${DRAGEN_VERSION_SPECIFIC_PARAMS}
# => 1.5min (10 sec if fpga already programmed)
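# Example: running the DRAGEN mapper in software mode (e.g. on a standard-xlarge instance); note the --sw-mode license parameter below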
mkdir /data/demo
cd /data/demo
# download ref
wget --progress=dot:giga https://s3.amazonaws.com/stratus-documentation-us-east-1-public/dragen/reference/Homo_sapiens/hg38.fa -O hg38.fa
# => 0.5min
# Build ht-ref
mkdir ref
dragen --build-hash-table true --ht-reference hg38.fa --output-directory ref
# => 6.5min
# run DRAGEN mapper
FASTQ=/opt/edico/self_test/reads/midsize_chrM.fastq.gz
# Next line is needed to resolve "run the requested pipeline with a pangenome reference, but a linear reference was provided" in DRAGEN (4.4.1 and others). Comment out when encountering ERROR: unrecognised option '--validate-pangenome-reference=false'.
DRAGEN_VERSION_SPECIFIC_PARAMS="--validate-pangenome-reference=false"
# When using DRAGEN 4.4.6 and later, the line above should be extended with --min-memory 0 to skip the memory check.
DRAGEN_VERSION_SPECIFIC_PARAMS="--validate-pangenome-reference=false --min-memory 0"
# License Parameters
LICENSE_PARAMS="--sw-mode --lic-credentials /opt/dragen-licence/instance-identity.protected/dragen-creds-sw-mode.lic"
mkdir out
dragen -r ref --output-directory out --output-file-prefix out -1 $FASTQ --enable-variant-caller false --RGID x --RGSM y ${LICENSE_PARAMS} ${DRAGEN_VERSION_SPECIFIC_PARAMS}
# => 2min
# Push a Container Image to a Private registry in Dockerhub
/data $ docker pull alpine:latest
/data $ docker tag alpine:latest registry.hub.docker.com/<privateContainerUri>:<tag>
/data $ docker push registry.hub.docker.com/<privateContainerUri>:<tag>
# Save a Container Image as a compressed archive
/data $ docker pull alpine:latest
/data $ docker save alpine:latest | bzip2 > /data/alpine_latest.tar.bz2
# List all local available images
/data $ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
docker.io/library/alpine latest aded1e1a5b37 3 weeks ago 8.13 MB
# Remove a locally available image
/data $ docker rmi alpine:latest
# Run a Container as a normal user
/data $ docker run -it --rm alpine:latest
~ $ id
uid=1000(ica) gid=100(users) groups=100(users)
# Run a Container as root user
/data $ docker run -it --rm --userns keep-id:uid=0,gid=0 --user 0:0 alpine:latest
/ # id
uid=0(root) gid=0(root) groups=0(root)
/ # apk add rsync
...
/ # rsync
rsync version 3.4.0 protocol version 32
...
# Run a Container as a non-mapped root user
/data $ docker run -it --rm --user 0:0 alpine:latest
/ # id
uid=0(root) gid=0(root) groups=100(users),0(root)
/ # touch /data/myfile
/ #
# Exited the running Container back to the shell in the running Bench Workspace
/data $ ls -al /data/myfile
-rw-r--r-- 1 100000 100000 0 Mar 13 08:27 /data/myfile
FROM alpine:latest
RUN apk add rsync
COPY myfile /root/myfile
# Build a Container image locally
/data $ mkdir /tmp/buildContext
/data $ touch /tmp/buildContext/myFile
/data $ docker build -f /tmp/Dockerfile -t myimage:1.0 /tmp/buildContext
...
/data $ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
docker.io/library/alpine latest aded1e1a5b37 3 weeks ago 8.13 MB
localhost/myimage 1.0 06ef92e7544f About a minute ago 12.1 MB
Bench Clusters
Managing a Bench cluster
Introduction
Workspaces can have their own dedicated cluster which consists of a number of nodes. First the workspace node, which is used for interacting with the cluster, is started. Once the workspace node is started, the workspace cluster can be started.
The cluster consists of 2 components:
The manager node which orchestrates the workload across the members.
Anywhere between 0 and a maximum of 50 member nodes.
Clusters can run in two modes.
Static - A static cluster has a manager node and a static number of members. At start-up of the cluster, the system ensures the predefined number of members are added to the cluster. These nodes will keep running as long as the entire cluster runs. The system will not automatically remove or add nodes depending on the job load. This gives the fastest resource availability, but at additional cost as unused nodes stay active, waiting for work.
Dynamic - A dynamic cluster has a manager node and a dynamic number of workers up to a predefined maximum (with a hard limit of 50). Based on the job load, the system will scale the number of members up or down. This saves resources, as only as many worker nodes as needed to perform the work are used.
Configuration
You manage Bench Clusters via the Illumina Connected Analytics UI in Projects > your_project > Bench > Workspaces > your_workspace > Details.
The following settings can be defined for a bench cluster:
Web access: Enable or disable web access to the cluster manager.
Dedicated Cluster Manager: Use a dedicated node for the cluster manager. This means that an entire machine of the type defined at resource model is reserved for your cluster manager. If no dedicated cluster manager is selected, one core per cluster member will be reserved for scheduling. For example, if you have 2 nodes of standard-medium (4 cores) and no dedicated cluster manager, then only 6 (2x3) cores are available to run tasks, as each node reserves 1 core for the cluster manager.
Type: Choose between static and dynamic cluster members.
Scaling interval: For static, set the number of cluster member nodes (maximum 50); for dynamic, choose the minimum and maximum (up to 50) number of cluster member nodes.
Resource model: The type of compute node on which the cluster member(s) will run. For every cluster member, one of these machines is used as a resource, so be aware of the possible cost impact when running many machines with a high individual cost.
Economy mode: Economy mode uses AWS spot instances. This halves many compute iCredit rates vs standard mode, but nodes may be interrupted. Refer to the resource model overview for a list of which resource models support economy pricing.
Include ephemeral storage: Select this to create scratch space for your nodes. Enabling it will make the storage size selector appear. The data stored in this space is deleted when the instance is terminated. When you deselect this option, the storage size is 0.
Storage size: How much storage space (1 GB - 16 TB) should be reserved per node as dedicated scratch space, available at /scratch.
Operations
Once the workspace is started, the cluster can be started at Projects > your_project > Bench > Workspaces > your_workspace > Details and the cluster can be stopped without stopping the workspace. Stopping the workspace will also stop all clusters in that workspace.
Managing Data in a Bench cluster
Data in a bench workspace can be divided into three groups:
Workspace data is accessible in read/write mode and can be accessed from all workspace components (workspace node, cluster manager node, cluster member nodes) at /data. The size of the workspace data is defined at the creation of the workspace but can be increased when editing a workspace in the Illumina Connected Analytics UI. This is persistent storage and data remains when a workspace is shut down.
Project data can be accessed from all workspace components at /data/project. Every component will have their own dedicated mount to the project. Depending on the project data permissions you will be able to access it in either Read-Only or Read-Write mode.
Scratchdata is available on the cluster members at /scratch and can be used to store intermediate results for a given job dedicated to that member. This is temporary storage, and all data is deleted when a cluster member is removed from the cluster.
Managing these mounts is done via the workspace CLI /data/.local/bin/workspace-ctl in the workspace. Every node will have its own dedicated mount.
For fast data access, Bench offers a mount solution to expose project data on every component in the workspace. This mount provides read-only access to a given location in the project data and is optimized for high read throughput per single file with concurrent access to files. It will try to utilise the full bandwidth capacity of the node.
All mounts occur in path /data/mounts/
Show mounts
workspace-ctl data get-mounts
Creating a mount
For fast read-only access, link folders with the CLI command workspace-ctl data create-mount --mode read-only.
workspace-ctl data create-mount --mount-path /data/mounts/mydata --source /data/project/mydata
This has the same effect as using the --mode read-only option, because read-only is applied by default when using workspace-ctl data create-mount.
Deleting a mount
workspace-ctl data delete-mount --mount-path /data/mounts/mydata
Spark on ICA Bench
Running a Spark application in a Bench Spark Cluster
Running a pyspark application
The JupyterLab environment is by default configured with 3 additional kernels:
PySpark – Local
PySpark – Remote
PySpark – Remote - Dynamic
When one of the above kernels is selected, the spark context is automatically initialised and can be accessed using the sc object.
PySpark - Local
The PySpark - Local runtime environment launches the spark driver locally on the workspace node and all spark executors are created locally on the same node. It does not require a spark cluster to run and can be used for running smaller spark applications which don’t exceed the capacity of a single node.
The spark configuration can be found at /data/.spark/local/conf/spark-defaults.conf.
Making changes to the configuration requires a restart of the Jupyter kernel.
PySpark - Remote
The PySpark – Remote runtime environment launches the spark driver locally on the workspace node and interacts with the Manager for scheduling tasks onto executors created across the Bench Cluster.
This configuration will not dynamically spin up executors, hence it will not trigger the cluster to auto scale when using a Dynamic Bench cluster.
The spark configuration can be found at /data/.spark/remote/conf/spark-defaults.conf.
Making changes to the configuration requires a restart of the Jupyter kernel.
PySpark – Remote - Dynamic
The PySpark – Remote - Dynamic runtime environment launches the spark driver locally on the workspace node and interacts with the Manager for scheduling tasks onto executors created across the Bench Cluster.
This configuration will increase/decrease the required executors, which will cause the cluster to auto scale when using a Dynamic Bench cluster.
The spark configuration can be found at /data/.spark/remote/conf-dynamic/spark-defaults.conf.
Making changes to the configuration requires a restart of the Jupyter kernel.
Job resources
Every cluster member has a certain capacity depending on the selection of the Resource model for the member.
A spark application consists of 1 or more jobs. Each job consists of one or more stages. Each stage consists of one or more tasks. Tasks are handled by executors, and executors run on a worker (cluster member).
The following setting defines the number of CPUs needed per task:
spark.task.cpus 1
The following settings define the size of a single executor which handles the execution of tasks:
spark.executor.cores 4
spark.executor.memory 4g
The above example allows an executor to handle 4 tasks concurrently and share a total capacity of 4 GB of memory. Depending on the resource model chosen (e.g. standard-2xlarge), a single cluster member (worker node) is able to run multiple executors concurrently (e.g. 32 cores and 128 GB for 8 concurrent executors on a single cluster member).
Spark User Interface
The Spark UI can be accessed via the cluster's Web Access URL, which is displayed on the Workspace details page.
This Spark UI will register all applications submitted when using one of the Remote Jupyter kernels. It will provide an overview of the registered workers (Cluster members) and the applications running in the Spark cluster.
Bench images are Docker images tailored to run in ICA with the necessary permissions, configuration and resources. For more information on Docker images, please refer to the official Docker documentation.
The following steps are needed to get your bench image running in ICA.
Requirements
You need to have Docker installed in order to build your images.
For your Docker bench image to work in ICA, it must run on Linux x86 architecture and have the correct user ID and initialization script in the Dockerfile.
For easy reference, you can find examples of preconfigured Bench images on the Illumina website which you can copy to your local machine and edit to suit your needs.
Bench-console provides an example to build a minimal image compatible with ICA Bench that runs an SSH daemon.
Bench-web provides an example to build a minimal image compatible with ICA Bench that runs a web daemon.
Bench-rstudio provides an example to build a minimal image compatible with ICA Bench that runs RStudio Open Source.
These examples come with information on the available parameters.
Scripts
The following scripts must be part of your Docker bench image. Please refer to the examples from the Illumina website for more details.
Init Script (Dockerfile)
This part of the Dockerfile copies the ica_start.sh file, which takes care of the initialization and termination of your workspace, to the location from where ICA starts it when you request to start your workspace.
# Init script invoked at start of a bench workspace
COPY --chmod=0755 --chown=root:root ${FILES_BASE}/ica_start.sh /usr/local/bin/ica_start.sh
User (Dockerfile)
The user settings must be set up so that Bench runs with UID 1000.
# Bench workspaces need to run as user with uid 1000 and be part of group with gid 100
RUN adduser -H -D -s /bin/bash -h ${HOME} -u 1000 -G users ica
Shutdown Script (ica_start.sh)
To do a clean shutdown, you can capture the SIGTERM signal, which is transmitted 30 seconds before the workspace is terminated.
# Terminate function
function terminate() {
# Send SIGTERM to child processes
kill -SIGTERM $(jobs -p)
# Send SIGTERM to waitpid
echo "Stopping ..."
kill -SIGTERM ${WAITPID}
}
# Catch SIGTERM signal and execute terminate function.
# A workspace will be informed 30s before forcefully being shutdown.
trap terminate SIGTERM
# Hold init process until TERM signal is received
tail -f /dev/null &
WAITPID=$!
wait $WAITPID
Building a Bench Image
Once you have Docker installed and completed the configuration of your Docker files, you can build your bench image.
Open the command prompt on your machine.
Navigate to the root folder of your Docker files.
Execute docker build -f Dockerfile -t mybenchimage:0.0.1 . with mybenchimage being the name you want to give to your image and 0.0.1 replaced with the version number which you want your bench image to be. For more information on this command, see https://docs.docker.com/reference/cli/docker/buildx/build/
Once the image has been built, save it as a docker tar file with the command docker save mybenchimage:0.0.1 | bzip2 > ../mybenchimage-0.0.1.tar.bz2. The resulting tar file will appear next to the root folder of your docker files.
If you want to build on a mac with Apple Silicon, then the build command is docker buildx build --platform linux/amd64 -f Dockerfile -t mybenchimage:0.0.1 .
Upload Your Docker Image to ICA
Open ICA and log in.
Go to Projects > your_project > Data.
For small Docker images, upload the docker image file which you generated in the previous step. For large Docker images, use the service connector for better performance and reliability when importing the Docker image.
Select the uploaded image file and perform Manage > Change Format.
From the format list, select DOCKER and save the change.
Go to System Settings > Docker Repository > Create > Image.
Select the uploaded docker image and fill out the other details.
Name: The name by which your docker image will be seen in the list
Version: A version number to keep track of which version you have uploaded. In our example this was 0.0.1
Description: Provide a description explaining what your docker image does or is suited for.
Type: The type of this image is Bench. The Tool type is reserved for tool images.
Cluster compatible: Indicates if this docker image is suited for cluster computing.
Access: This setting must match the available access options of your Docker image. You can choose web access (HTTP), console access (SSH) or both. What is selected here becomes available on the + New Workspace screen. Enabling an option here which your Docker image does not support will result in access denied errors when trying to run the workspace.
Regions: If your tenant has access to multiple regions, you can select to which regions to replicate the docker image.
Once the settings are entered, select Save. The creation of the Docker image typically takes between 5 and 30 minutes. The status of your docker image will be partial during creation and available once completed.
Start Your Bench Image
Navigate to Projects > your_project > Bench > Workspaces.
Create a new workspace with + Create Workspace or edit an existing workspace.
Fill in the bench workspace details according to Workspaces.
Save your changes.
Select Start Workspace
Wait for the workspace to start; you can then access it via the console or the GUI.
Access Bench Image
Once your bench image has been started, you can access it via console, web or both, depending on your configuration.
Web access (HTTP) is done from either the Projects > your_project > Bench > Workspaces > your_Workspace > Access tab or from the link provided in your running workspace at Projects > your_project > Bench > Workspaces > your_Workspace > Details tab > Access section.
Console access (SSH) is performed from your command prompt using the path provided in your running workspace at Projects > your_project > Bench > Workspaces > your_Workspace > Details tab > Access section.
The password needed for SSH access is any one of your personal API keys.
Command-line Interface
To execute the commands, your workspace needs a way to run them, such as an SSH daemon, whether integrated into your web access image or into your console access image. There is no need to download the workspace command-line interface; you can run it from within the workspace.
Restrictions
Root User
The bench image will be instantiated as a container which is forcibly started as a user with UID 1000 and GID 100.
You cannot elevate your permissions in a running workspace.
Do not run containers as root as this is bad security practice.
Read-only Root Filesystem
Only the following folders are writeable:
/data
/tmp
All other folders are mounted as read-only.
Network Access
For inbound access, the following ports on the container are publicly exposed, depending on the selection made at startup.
Web: TCP/8888
Console: TCP/2222
For outbound access, a workspace can be started in two modes:
Public: Access to public IPs is allowed using the TCP protocol.
Restricted: Access to a list of URLs is allowed.
Context
Environment Variables
At runtime, the following Bench-specific environment variables are made available to the workspace instantiated from the Bench image.
ICA_WORKSPACE: The unique identifier related to the started workspace. This value is bound to a workspace and will never change. Example: 32781195
ICA_CONSOLE_ENABLED: Whether Console access is enabled for this running workspace. Example values: true, false
ICA_WEB_ENABLED: Whether Web access is enabled for this running workspace. Example values: true, false
ICA_SERVICE_ACCOUNT_USER_API_KEY: An API key that allows interaction with ICA using the ICA CLI and is bound to the permissions defined at startup of the workspace.
ICA_BENCH_URL: The host part of the public URL which provides access to the running workspace. Example: use1-bench.platform.illumina.com
ICA_PROJECT_UUID: The unique identifier related to the ICA project in which the workspace was started.
The proxy endpoint in case the workspace was started in restricted mode.
HOME: The home folder. Example: /data
Configuration Files
The following files and folders will be provided to the workspace and made accessible for reading at runtime.
/etc/workspace-auth: Contains the SSH RSA public/private keypair which is required to run the workspace SSHD.
Software Files
At runtime, ICA-related software will automatically be made available at /data/.software in read-only mode.
New versions of ICA software will be made available after a restart of your workspace.
Important Folders
/data: This folder contains all data specific to your workspace. Data in this folder is not persisted in your project and will be removed at deletion of the workspace.
/data/project: This folder contains all your project data.
/data/.software: This folder contains ICA-related software.
Bench Lifecycle
Workspace Lifecycle
When a bench workspace is instantiated from your selected bench image, the following script is invoked: /usr/local/bin/ica_start.sh
This script needs to be available and executable otherwise your workspace will not boot.
This script is the main process in your running workspace and cannot run to completion as it will stop the workspace and instantiate a restart (see init script).
This script can be used to invoke other scripts.
When you stop a workspace, a TERM signal is sent to the main process in your bench workspace. You can trap this signal to handle the stop gracefully (see shutdown script) and shut down child processes of the main process. The workspace will be forcibly shut down after 30 seconds if your main process hasn't stopped within that period.
Troubleshooting
Build Argument
If you get the error "docker buildx build" requires exactly 1 argument when trying to build your docker image, then a possible cause is missing the last . of the command.
Server Connection Error
When you stop the workspace when users are still actively using it, they will receive a message showing a Server Connection Error.
nf-core Pipelines
Introduction
This tutorial shows you how to import a public nf-core pipeline into Bench, run its validation test in Bench, deploy it as an ICA Flow pipeline, and launch the validation test in Flow.
For this tutorial, the instance size depends on the flow you import, and whether you use a Bench cluster:
If using a cluster, choose standard-small or standard-medium for the workspace master node
Otherwise, choose at least standard-large as nf-core pipelines often need more than 4 cores to run.
Select the single user workspace permissions (aka "Access limited to workspace owner"), which allows us to deploy pipelines
Specify at least 100GB of disk space
Optional: After choosing the image, enable a cluster with at least one standard-large instance type
Start the workspace, then (if applicable) start the cluster
Import nf-core Pipeline to Bench
If conda and/or nextflow are not installed, pipeline-dev will offer to install them.
The Nextflow files are pulled into the nextflow-src subfolder.
A larger example that still runs quickly is nf-core/sarek
Result
Run Validation Test in Bench
All nf-core pipelines conveniently define a "test" profile that specifies a set of validation inputs for the pipeline.
The following command runs this test profile. If a Bench cluster is active, it runs on your Bench cluster, otherwise it runs on the main workspace instance.
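Presumably this is the command suggested by the import step (run it from the imported pipeline folder):
pipeline-dev run-in-bench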
The pipeline-dev tool uses "nextflow run ..." to run the pipeline. The full nextflow command is printed on stdout and can be copied and adjusted if you need additional options.
Result
Monitoring
When a pipeline is running locally (i.e. not on a Bench cluster), you can monitor the task execution from another terminal with docker ps
When a pipeline is running on your Bench cluster, a few commands help to monitor the tasks and cluster. In another terminal, you can use:
qstat to see the tasks being pending or running
tail /data/logs/sge-scaler.log.<latest available workspace reboot time> to check if the cluster is scaling up or down (it currently takes 3 to 5 minutes to get a new node)
Data Locations
The output of the pipeline is in the outdir folder
Nextflow work files are under the work folder
Log files are .nextflow.log* and output.log
Deploy as Flow Pipeline
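The deployment command, as listed under the import step's suggested actions, is presumably:
pipeline-dev deploy-as-flow-pipeline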
After generating a few ICA-specific files (JSON input specs for Flow launch UI + list of inputs for next step's validation launch), the tool identifies which previous versions of the same pipeline have already been deployed (in ICA Flow, pipeline versioning is done by including the version number in the pipeline name, so that's what is checked here). It then asks if you want to update the latest version or create a new one.
Choose "3" and enter a name of your choice to avoid conflicts with other users following this same tutorial.
At the end, the URL of the pipeline is displayed. If you are using a terminal that supports it, Ctrl+click or middle-click can open this URL in your browser.
Run Validation Test in Flow
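The corresponding command from the import step's suggested actions is presumably:
pipeline-dev launch-validation-in-flow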
This launches an analysis in ICA Flow, using the same inputs as the nf-core pipeline's "test" profile.
Some of the input files will have been copied to your ICA project to allow the launch to take place. They are stored in the folder bench-pipeline-dev/temp-data.
Hints
Using older versions of Nextflow
Some older nf-core flows are still using DSL1, which only works up to Nextflow 22.
An easy solution is to create a conda environment for nextflow 22:
conda create -n nextflow22
# If, like me, you never ran "conda init", do it now:
conda init
bash -l # To load the conda's bashrc changes
conda activate nextflow22
conda install -y nextflow=22
# Check
nextflow -version
# Then use the pipeline-dev tools as in the demo
mkdir demo
cd demo
pipeline-dev import-from-nextflow nf-core/demo
/data/demo $ pipeline-dev import-from-nextflow nf-core/demo
Creating output folder nf-core/demo
Fetching project nf-core/demo
Fetching project info
project name: nf-core/demo
repository : https://github.com/nf-core/demo
local path : /data/.nextflow/assets/nf-core/demo
main script : main.nf
description : An nf-core demo pipeline
author : Christopher Hakkaart
Pipeline “nf-core/demo” successfully imported into nf-core/demo.
Suggested actions:
cd nf-core/demo
pipeline-dev run-in-bench
[ Iterative dev: Make code changes + re-validate with previous command ]
pipeline-dev deploy-as-flow-pipeline
pipeline-dev launch-validation-in-flow
Description: Provide a description explaining what your docker image does or is suited for.
Type: The type of this image is Bench. The Tool type is reserved for tool images.
Cluster compatible: Indicates if this docker image is suited for cluster computing.
Access: This setting must match the available access options of your Docker image. You can choose web access (HTTP), console access (SSH) or both. What is selected here becomes available on the + New Workspace screen. Enabling an option here which your Docker image does not support will result in access denied errors when trying to run the workspace.
Regions: If your tenant has access to multiple regions, you can select the regions to which the docker image is replicated.
ICA_BENCH_URL
The host part of the public URL which provides access to the running workspace.
use1-bench.platform.illumina.com
ICA_PROJECT_UUID
The unique identifier related to the ICA project in which the workspace was started.
The proxy endpoint in case the workspace was started in restricted mode.
HOME
The home folder.
/data
Import to Bench
From public nf-core pipelines
From existing ICA Flow Nextflow pipelines
Run in Bench
Modify and re-run in Bench, providing fast development iterations
Deploy to Flow
Launch validation in Flow
Prerequisites
Recommended workspace size: Nf-core Nextflow pipelines typically require 4 or more cores to run.
The pipeline development tools require
Conda, which is automatically installed by “pipeline-dev” if conda-miniconda.installer.ica-userspace.sh is present in the image.
Nextflow (version 24.10.2 is automatically installed using conda, or you can use other versions)
git (automatically installed using conda)
jq, curl (which should be made available in the image)
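As a quick sanity check, you can confirm from a workspace terminal that these prerequisites are available (the commands below are the standard version checks for each tool; versions will vary):
nextflow -version
conda --version
git --version
jq --version
curl --version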
NextFlow Requirements / Best Practices
Pipeline development tools work best when the following items are defined:
Nextflow profiles:
test profile, specifying inputs appropriate for a validation run
docker profile, instructing NextFlow to use Docker
nextflow_schema.json, as described in the nf-core documentation. This is useful for the launch UI generation. The nf-core CLI tool (installable via pip install nf-core) offers extensive help to create and maintain this schema.
ICA Flow adds one additional constraint: the output directory out is the only one automatically copied to the Project data when an ICA Flow Analysis completes. The --outdir parameter recommended by nf-core should therefore be set to --outdir=out when running as a Flow pipeline.
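For illustration, a minimal nextflow.config following these conventions could look like the sketch below; the parameter names and test values are placeholders to adapt to your own pipeline:
// Sketch of nextflow-src/nextflow.config (placeholder values)
params {
    // Must be 'out' when the pipeline runs as an ICA Flow pipeline
    outdir = 'out'
}

profiles {
    // Small inputs suitable for a validation run
    test {
        params.input = 'test_samplesheet.csv'   // placeholder
    }
    // Instruct Nextflow to run processes in Docker containers
    docker {
        docker.enabled = true
    }
}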
Pipeline Development Tools
New Bench pipeline development tools only become active after a workspace reboot.
These are installed in /data/.software (which should be in your $PATH); the pipeline-dev script is the front-end to the other pipeline-dev-* tools.
Pipeline-dev fulfils a number of roles:
Checks that the environment contains the required tools (conda, nextflow, etc) and offers to install them if needed.
Checks that the fast data mounts are present (/data/mounts/project etc.) – it is useful to check regularly, as they get unmounted when a workspace is stopped and restarted.
Redirects stdout and stderr to .pipeline-dev.log, with the history of log files kept as .pipeline-dev.log.<log date>.
Launches the appropriate sub-tool.
Prints out errors with backtrace, to help report issues.
Usage
1) Starting a new Project
A pipeline-dev project relies on the following folder structure, which is auto-generated when using the pipeline-dev import* tools.
If you start a project manually, you must follow the same folder structure.
Project base folder
nextflow-src: Platform-agnostic Nextflow code, for example the GitHub contents of an nf-core pipeline, or your usual Nextflow source code.
main.nf
nextflow.config
nextflow_schema.json
pipeline-dev.project-info: contains project name, description, etc.
nextflow-bench.config (automatically generated when needed): contains definitions for bench.
ica-flow-config: Directory of files used when deploying pipeline to Flow.
inputForm.json (if not present, gets generated from nextflow-src/nextflow_schema.json): input form as defined in ICA Flow.
onSubmit.js, onRender.js (optional, generated at the same time as inputForm.json): javascript code to go with the input form.
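Laid out as a directory tree, the same structure looks like this (generated files included for completeness):
<project base folder>
├── pipeline-dev.project-info
├── nextflow-bench.config            (generated when needed)
├── nextflow-src/
│   ├── main.nf
│   ├── nextflow.config
│   └── nextflow_schema.json
└── ica-flow-config/
    ├── inputForm.json
    ├── launchPayload_inputFormValues.json
    ├── onSubmit.js
    └── onRender.js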
Pipeline Sources
When starting a pipeline from scratch, the above-mentioned project structure must be generated manually. The nf-core CLI tools can assist in generating the nextflow_schema.json. Tutorial Pipeline from Scratch goes into more details about this use case.
When importing a public nf-core or Nextflow pipeline, a directory with the same name as the pipeline is created, and the Nextflow files are pulled into the nextflow-src subdirectory.
Tutorial Nf Core Pipelines goes into more details about this use case.
When importing an existing ICA Flow pipeline and analysis, a directory called imported-flow-analysis is created and the analysis and pipeline assets are downloaded.
Tutorial goes into more details about this use case.
Currently only pipelines with publicly available Docker images are supported. Pipelines with ICA-stored images are not yet supported.
2) Running in Bench
Optional parameters --local / --sge can be added to force the execution on the local workspace node, or on the workspace cluster (when available). Otherwise, the presence of a cluster is automatically detected and used.
The script then launches Nextflow. The full nextflow command line is printed and executed.
In case of errors, full logs are saved as .pipeline-dev.log.
Currently, not all corner cases are covered by command line options. Please start from the nextflow command printed by the tool and extend it based on your specific needs.
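For example (the --local/--sge flags are optional, as described above):
pipeline-dev run-in-bench            # auto-detect a running Bench cluster
pipeline-dev run-in-bench --local    # force execution on the workspace node
pipeline-dev run-in-bench --sge      # force execution on the Bench cluster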
Output Example
Nextflow output
Container (Docker) images
Nextflow can run processes with and without Docker images. In the context of pipeline development, the pipeline-dev tools assume Docker images are used, in particular when executing Nextflow with --profile docker.
In NextFlow, Docker images can be specified at the process level
This is done with the container "<image_name:version>" directive, which can be specified
in nextflow config files (preferred method when following the nf-core best practices)
or at the start of each process definition.
Each process can use a different docker image
It is highly recommended to always specify an image. If no Docker image is specified, Nextflow will report this. In ICA, a basic image will be used but with no guarantee that the necessary tools are available.
Resources such as CPU count and memory can be specified at the process level as well. See Containers or our tutorials for details about the Nextflow Docker syntax.
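As an illustration, a process declaring both its container and its resources could look like this sketch (the image name and values are placeholders):
process EXAMPLE_TASK {
    // Docker image used for this process only
    container 'docker.io/library/ubuntu:22.04'

    // Resource requests picked up by Nextflow (and by the SGE scheduler on a Bench cluster)
    cpus 2
    memory '4 GB'

    input:
    path input_file

    output:
    path 'result.txt'

    script:
    """
    wc -l ${input_file} > result.txt
    """
}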
Bench can push/pull/create/modify Docker images, as described in Containers.
3) Deploying to ICA Flow
This command does the following:
Generate the JSON file describing the ICA Flow user interface.
If ica-flow-config/inputForm.json doesn’t exist: generate it from nextflow-src/nextflow_schema.json.
Generate the JSON file containing the validation launch inputs.
If ica-flow-config/launchPayload_inputFormValues.json doesn’t exist: generate it from nextflow --profile test inputs.
If local files are used as validation inputs or as default input values:
Identify the pipeline name to use for this new pipeline deployment:
If a deployment has already occurred in this project, or if the project was imported from an existing Flow pipeline, start from this pipeline name. Otherwise start from the project name.
Identify which already-deployed pipelines have the same base name, with or without suffixes that could be some versioning (_v<number>, _<number>, _<date>).
A new ICA Flow pipeline gets created (except in case of a pipeline update).
The current Nextflow version in Bench is used to select the best Nextflow version to be used in Flow.
The nextflow-src folder is uploaded file by file as pipeline assets.
Output Example:
The pipeline name, id and URL are printed out, and if your environment allows, Ctrl+Click/Option+Click/Right click can open the URL in a browser.
Opening the URL of the pipeline and clicking on Start Analysis shows the generated user interface:
4) Launching Validation in Flow
The ica-flow-config/launchPayload_inputFormValues.json file generated in the previous step is submitted to ICA Flow to start an analysis with the same validation inputs as “nextflow --profile test”.
Output Example:
launch-validation-in-flow
The analysis name, id and URL are printed out, and if your environment allows, Ctrl+Click/Option+Click/Right click can open the URL in a browser.
launchPayload_inputFormValues.json (if not present, gets generated from the test profile): used by “pipeline-dev launch-validation-in-flow”.
copy them to /data/project/pipeline-dev-files/temp.
get their ICA file ids.
use these file ids in the launch specifications.
If remote files are used as validation inputs or as default input values of an input of type “file” (and not “string”): do the same as above.
Ask the user if they prefer to update the current version of the pipeline, create a new version, or enter a new name of their choice; alternatively, the --create/--update parameters can be specified for scripting without user interaction.
For this tutorial, any instance size will work, even the smallest standard-small.
Select the single user workspace permissions (aka "Access limited to workspace owner"), which allows us to deploy pipelines.
A small amount of disk space (10GB) will be enough.
We are going to wrap the "gzip" Linux compression tool with inputs:
1 file
compression level: integer between 1 and 9
We intentionally do not include sanity checks, to keep this scenario simple.
Creation of test file:
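The exact content does not matter; for example, a small text file generated like this works (the file name is a placeholder used in the rest of this tutorial):
seq 1 100000 > test.txt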
Wrapping in Nextflow
Here is an example of Nextflow code that wraps the gzip command and publishes the final output in the “out” folder:
nextflow-src/main.nf
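The following is a minimal sketch of such a wrapper (the process name and defaults are illustrative, not the exact tutorial file):
nextflow.enable.dsl = 2

// User-facing parameters (defaults are placeholders)
params.input_file        = null
params.compression_level = 5

process GZIP {
    // Publish the compressed file into the 'out' folder
    publishDir 'out', mode: 'copy'

    input:
    path input_file

    output:
    path "${input_file}.gz"

    script:
    """
    gzip -${params.compression_level} -c ${input_file} > ${input_file}.gz
    """
}

workflow {
    GZIP(Channel.fromPath(params.input_file))
}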
Save this file as nextflow-src/main.nf, and check that it works:
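For example, using the test file created earlier:
nextflow run nextflow-src/main.nf --input_file test.txt --compression_level 5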
Result
Wrap the Pipeline in Bench
We now need to:
Use Docker
Follow some nf-core best practices to make our source+test compatible with the pipeline-dev tools
Using Docker:
In NextFlow, Docker images can be specified at the process level
Each process may use a different docker image
It is highly recommended to always specify an image. If no Docker image is specified, Nextflow will report this. In ICA, a basic image will be used but with no guarantee that the necessary tools are available.
Specifying the Docker image is done with the container '<image_name:version>' directive, which can be specified
at the start of each process definition
or in nextflow config files (preferred when following nf-core guidelines)
For example, create nextflow-src/nextflow.config:
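A minimal sketch, assuming the process only needs gzip and can therefore run in a stock image (the image name is a placeholder):
process {
    container = 'docker.io/library/ubuntu:22.04'
}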
We can now run with nextflow's -with-docker option:
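For example:
nextflow run nextflow-src/main.nf --input_file test.txt --compression_level 5 -with-docker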
Following some nf-core best practices to make our source+test compatible with the pipeline-dev tools:
Create NextFlow “test” profile
Here is an example of “test” profile that can be added to nextflow-src/nextflow.config to define some input values appropriate for a validation run:
nextflow-src/nextflow.config
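A minimal sketch, reusing the test file created earlier:
profiles {
    test {
        params.input_file        = 'test.txt'
        params.compression_level = 5
    }
}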
With this profile defined, we can now run the same test as before with this command:
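For example:
nextflow run nextflow-src/main.nf -profile test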
Create NextFlow “docker” profile
A “docker” profile is also present in all nf-core pipelines. Our pipeline-dev tools will make use of it, so let’s define it:
nextflow-src/nextflow.config
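A minimal sketch, equivalent to the -with-docker option used earlier:
profiles {
    docker {
        docker.enabled = true
    }
}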
We can now run the same test as before with this command:
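For example:
nextflow run nextflow-src/main.nf -profile test,docker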
We also have enough structure in place to start using the pipeline-dev command:
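For example, a first run can be attempted with:
pipeline-dev run-in-bench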
In order to deploy our pipeline to ICA, we need to generate the user interface input form.
This is done by using nf-core's recommended nextflow_schema.json.
For our simple example, we generate a minimal one by hand (done by using one of the nf-core pipelines as an example):
nextflow-src/nextflow_schema.json
In the next step, this gets converted to the ica-flow-config/inputForm.json file.
Note: For large pipelines, as described on the nf-core website:
Manually building JSONSchema documents is not trivial and can be very error prone. Instead, the nf-core pipelines schema build command collects your pipeline parameters and gives interactive prompts about any missing or unexpected params. If no existing schema is found it will create one for you.
We recommend looking into "nf-core pipelines schema build -d nextflow-src/", which comes with a web builder to add descriptions etc.
Deploy as a Flow Pipeline
We just need to create a final file, which we had skipped until now: our project description file, which can be created via the command pipeline-dev project-info --init:
pipeline-dev.project-info
We can now run:
After generating the ICA-Flow-specific files in the ica-flow-config folder (JSON input specs for Flow launch UI + list of inputs for next step's validation launch), the tool identifies which previous versions of the same pipeline have already been deployed (in ICA Flow, pipeline versioning is done by including the version number in the pipeline name).
It then asks if we want to update the latest version or create a new one.
Choose "3" and enter a name of your choice to avoid conflicts with all the others users following this same tutorial.
At the end, the URL of the pipeline is displayed. If you are using a terminal that supports it, Ctrl+click or middle-click can open this URL in your browser.
Run Validation Test in Flow
This launches an analysis in ICA Flow, using the same inputs as the pipeline's "test" profile.
Some of the input files will have been copied to your ICA project in order for the analysis launch to work. They are stored in the folder /data/project/bench-pipeline-dev/temp-data.
{
"$defs": {
"input_output_options": {
"title": "Input/output options",
"properties": {
"input_file": {
"description": "Input file to compress",
"help_text": "The file that will get compressed",
"type": "string",
"format": "file-path"
},
"compression_level": {
"type": "integer",
"description": "Compression level to use (1-9)",
"default": 5,
"minimum": 1,
"maximum": 9
}
}
}
}
}
$ pipeline-dev project-info --init
pipeline-dev.project-info not found. Let's create it with 2 questions:
Please enter your project name: demo_gzip
Please enter a project description: Bench gzip demo
pipeline-dev deploy-as-flow-pipeline
pipeline-dev launch-validation-in-flow
/data/demo $ pipeline-dev launch-validation-in-flow
pipelineId: 331f209d-2a72-48cd-aa69-070142f57f73
Getting Analysis Storage Id
Launching as ICA Flow Analysis...
ICA Analysis created:
- Name: Test demo_gzip
- Id: 17106efc-7884-4121-a66d-b551a782b620
- Url: https://stage.v2.stratus.illumina.com/ica/projects/1873043/analyses/17106efc-7884-4121-a66d-b551a782b620
an analysis exercising this pipeline, preferably with a short execution time, to use as validation test
Start Bench Workspace
For this tutorial, the instance size depends on the flow you import, and whether you use a Bench cluster:
When using a cluster, choose standard-small or standard-medium for the workspace master node
Otherwise, choose at least standard-large if you re-import a pipeline that originally came from nf-core, as they typically need 4 or more CPUs to run.
Select the "single user workspace" permissions (aka "Access limited to workspace owner "), which allows us to deploy pipelines
Specify at least 100GB of disk space
Optional: After choosing the image, enable a cluster with at least one standard-large instance type.
Start the workspace, then (if applicable) also start the cluster
Import Existing Pipeline and Analysis to Bench
The starting point is the analysis id that is used as pipeline validation test (the pipeline id is obtained from the analysis metadata).
If no --analysis-id is provided, the tool lists all the successful analyses in the current project and lets the developer pick one.
If conda and/or nextflow are not installed, pipeline-dev will offer to install them.
A folder called imported-flow-analysis is created.
Pipeline Nextflow assets are downloaded into the nextflow-src sub-folder.
Pipeline input form and associated javascript are downloaded into the ica-flow-config sub-folder.
Analysis input specs are downloaded to the ica-flow-config/launchPayload_inputFormValues.json file.
The analysis inputs are converted into a "test" profile for Nextflow, stored, among other items, in nextflow-bench.config
Results
Run Validation Test in Bench
The following command runs this test profile. If a Bench cluster is active, it runs on your Bench cluster, otherwise it runs on the main workspace instance:
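The command (also visible in the suggested actions of the import output below) is:
cd imported-flow-analysis
pipeline-dev run-in-bench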
The pipeline-dev tool is using "nextflow run ..." to run the pipeline. The full nextflow command is printed on stdout and can be copy-pasted+adjusted if you need additional options.
Monitoring
When a pipeline is running on your Bench cluster, a few commands help to monitor the tasks and cluster. In another terminal, you can use:
qstat to see the tasks being pending or running
tail /data/logs/sge-scaler.log.<latest available workspace reboot time> to check if the cluster is scaling up or down (it currently takes 3 to 5 minutes to get a new node)
Data Locations
The output of the pipeline is in the outdir folder
Nextflow work files are under the work folder
Log files are .nextflow.log* and output.log
Modify Pipeline
Nextflow files (located in the nextflow-src folder) are easy to modify.
Depending on your environment (SSH access / Docker image with JupyterLab or VNC, with or without Visual Studio Code), various source code editors can be used.
After modifying the source code, you can run a validation iteration with the same command as before:
Identify Docker Image
Modifying the Docker image is the next step.
Nextflow (and ICA) allow the Docker images to be specified in different places:
in config files such as nextflow-src/nextflow.config
in nextflow code files:
grep container may help locate the correct files:
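For example:
grep -rn container nextflow-src/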
Docker Image Update: Dockerfile Method
Use case: Update some of the software (mimalloc) by compiling a new version
With the appropriate permissions, you can then "docker login" and "docker push" the new image.
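For example, reusing the IMAGE_AFTER variable from the Dockerfile snippet shown below (registry and permissions depend on your setup):
docker login docker.io
docker push ${IMAGE_AFTER}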
Docker Image Update: Interactive Method
With the appropriate permissions, you can then "docker login" and "docker push" the new image.
Fun fact: VS Code with the "Dev Containers" extension lets you edit the files inside your running container:
Beware that this extension creates a lot of temp files in /tmp and in $HOME/.vscode-server. Don't include them in your image...
Update the nextflow code and/or configs to use the new image
Validate your changes in Bench:
Deploy as Flow Pipeline
After generating a few ICA-specific files (JSON input specs for Flow launch UI + list of inputs for next step's validation launch), the tool identifies which previous versions of the same pipeline have already been deployed (in ICA Flow, pipeline versioning is done by including the version number in the pipeline name, so that's what is checked here).
It then asks if we want to update the latest version or create a new one.
At the end, the URL of the pipeline is displayed. If you are using a terminal that supports it, Ctrl+click or middle-click can open this URL in your browser.
Result
Run Validation Test in Flow
This launches an analysis in ICA Flow, using the same inputs as the pipeline's "test" profile.
Some of the input files will have been copied to your ICA project to allow the launch to take place. They are stored in the folder /data/project/bench-pipeline-dev/temp-data.
mkdir demo-flow-dev
cd demo-flow-dev
pipeline-dev import-from-flow
or
pipeline-dev import-from-flow --analysis-id=9415d7ff-1757-4e74-97d1-86b47b29fb8f
Enter the number of the entry you want to use: 21
Fetching analysis 9415d7ff-1757-4e74-97d1-86b47b29fb8f ...
Fetching pipeline bb47d612-5906-4d5a-922e-541262c966df ...
Fetching pipeline files... main.nf
Fetching test inputs
New Json inputs detected
Resolving test input ids to /data/mounts/project paths
Fetching input form..
Pipeline "GWAS pipeline_1.
_2_1_20241215_130117" successfully imported.
pipeline name: GWAS pipeline_1_2_1_20241215_130117
analysis name: Test GWAS pipeline_1_2_1_20241215_130117
pipeline id : bb47d612-5906-4d5a-922e-541262c966df
analysis id : 9415d7ff-1757-4e74-97d1-86b47b29fb8f
Suggested actions:
pipeline-dev run-in-bench
[ Iterative dev: Make code changes + re-validate with previous command ]
pipeline-dev deploy-as-flow-pipeline
pipeline-dev run-in-flow
cd imported-flow-analysis
pipeline-dev run-in-bench
/data/demo $ tail /data/logs/sge-scaler.log.*
2025-02-10 18:27:19,657 - SGEScaler - INFO: SGE Marked Overview - {'UNKNOWN': 0, 'DEAD': 0, 'IDLE': 0, 'DISABLED': 0, 'DELETED': 0, 'UNRESPONSIVE': 0}
2025-02-10 18:27:19,657 - SGEScaler - INFO: Job Status - Active jobs : 0, Pending jobs : 6
2025-02-10 18:27:26,291 - SGEScaler - INFO: Cluster Status - State: Transitioning,
Online Members: 0, Offline Members: 2, Requested Members: 2, Min Members: 0, Max Members: 2
code nextflow-src # Open in Visual Studio Code
code . # Open current dir in Visual Studio Code
vi nextflow-src/main.nf
pipeline-dev run-in-bench
/data/demo-flow-dev $ head nextflow-src/main.nf
nextflow.enable.dsl = 2
process top_level_process {
container 'docker.io/ljanin/gwas-pipeline:1.2.1'
IMAGE_BEFORE=docker.io/ljanin/gwas-pipeline:1.2.1
IMAGE_AFTER=docker.io/ljanin/gwas-pipeline:tmpdemo
# Create directory for Dockerfile
mkdir dirForDockerfile
cd dirForDockerfile
# Create Dockerfile
cat <<EOF > Dockerfile
FROM ${IMAGE_BEFORE}
RUN mkdir /mimalloc-compile \
&& cd /mimalloc-compile \
&& git clone -b v2.0.6 https://github.com/microsoft/mimalloc \
&& mkdir -p mimalloc/out/release \
&& cd mimalloc/out/release \
&& cmake ../.. \
&& make \
&& make install \
&& cd / \
&& rm -rf mimalloc-compile
EOF
# Build image
docker build -t ${IMAGE_AFTER} .
IMAGE_BEFORE=docker.io/ljanin/gwas-pipeline:1.2.1
IMAGE_AFTER=docker.io/ljanin/gwas-pipeline:1.2.2
docker run -it --rm ${IMAGE_BEFORE} bash
# Make some modifications
vi /scripts/plot_manhattan.py
<Fix "manhatten.png" into "manhattAn.png">
<Enter :wq to save and quit vi>
<Start another terminal (try Ctrl+Shift+T if using wezterm)>
# Identify container id
# Save container changes into new image layer
CONTAINER_ID=c18670335247
docker commit ${CONTAINER_ID} ${IMAGE_AFTER}
sed --in-place "s/${IMAGE_BEFORE}/${IMAGE_AFTER}/" nextflow-src/main.nf
/data/demo $ pipeline-dev deploy-as-flow-pipeline
Generating ICA input specs...
Extracting nf-core test inputs...
Deploying project nf-core/demo
- Currently being developed as: dev-nf-core-demo
- Last version updated in ICA: dev-nf-core-demo_v3
- Next suggested version: dev-nf-core-demo_v4
How would you like to deploy?
1. Update dev-nf-core-demo (current version)
2. Create dev-nf-core-demo_v4
3. Enter new name
4. Update dev-nf-core-demo_v3 (latest version updated in ICA)
/data/demo $ pipeline-dev launch-validation-in-flow
pipelineId: 26bc5aa5-0218-4e79-8a63-ee92954c6cd9
Getting Analysis Storage Id
Launching as ICA Flow Analysis...
ICA Analysis created:
- Name: Test dev-nf-core-demo_v4
- Id: cadcee73-d975-435d-b321-5d60e9aec1ec
- Url: https://stage.v2.stratus.illumina.com/ica/projects/1873043/analyses/cadcee73-d975-435d-b321-5d60e9aec1ec
Bench Command Line Interface
Command Index
The following is a list of available Bench CLI commands and their options.
Please refer to the examples from the Illumina website for more details.
workspace-ctl
workspace-ctl completion
workspace-ctl compute
workspace-ctl compute get-cluster-details
workspace-ctl compute get-logs
workspace-ctl compute get-pools
workspace-ctl compute scale-pool
workspace-ctl data
workspace-ctl data create-mount
workspace-ctl data delete-mount
workspace-ctl data get-mounts
workspace-ctl help
workspace-ctl help completion
workspace-ctl help compute
workspace-ctl help compute get-cluster-details
workspace-ctl help compute get-logs
workspace-ctl help compute get-pools
workspace-ctl help compute scale-pool
workspace-ctl help data
workspace-ctl help data create-mount
workspace-ctl help data delete-mount
workspace-ctl help data get-mounts
workspace-ctl help help
workspace-ctl help software
workspace-ctl help software get-server-metadata
workspace-ctl help software get-software-settings
workspace-ctl help workspace
workspace-ctl help workspace get-cluster-settings
workspace-ctl help workspace get-connection-details
workspace-ctl help workspace get-workspace-settings
workspace-ctl software
workspace-ctl software get-server-metadata
workspace-ctl software get-software-settings
workspace-ctl workspace
workspace-ctl workspace get-cluster-settings
workspace-ctl workspace get-connection-details
workspace-ctl workspace get-workspace-settings
Usage:
workspace-ctl [flags]
workspace-ctl [command]
Available Commands:
completion Generate completion script
compute
data
help Help about any command
software
workspace
Flags:
--X-API-Key string
--base-path string For example: / (default "/")
--config string config file path
--debug output debug logs
--dry-run do not send the request to server
-h, --help help for workspace-ctl
--help-tree
--help-verbose
--hostname string hostname of the service (default "api:8080")
--print-curl print curl equivalent do not send the request to server
--scheme string Choose from: [http] (default "http")
Use "workspace-ctl [command] --help" for more information about a command.
cmd execute error: accepts 1 arg(s), received 0
Usage:
workspace-ctl compute [flags]
workspace-ctl compute [command]
Available Commands:
get-cluster-details
get-logs
get-pools
scale-pool
Flags:
-h, --help help for compute
--help-tree
--help-verbose
Global Flags:
--X-API-Key string
--base-path string For example: / (default "/")
--config string config file path
--debug output debug logs
--dry-run do not send the request to server
--hostname string hostname of the service (default "api:8080")
--print-curl print curl equivalent do not send the request to server
--scheme string Choose from: [http] (default "http")
Use "workspace-ctl compute [command] --help" for more information about a command.
Usage:
workspace-ctl compute get-cluster-details [flags]
Flags:
-h, --help help for get-cluster-details
--help-tree
--help-verbose
Global Flags:
--X-API-Key string
--base-path string For example: / (default "/")
--config string config file path
--debug output debug logs
--dry-run do not send the request to server
--hostname string hostname of the service (default "api:8080")
--print-curl print curl equivalent do not send the request to server
--scheme string Choose from: [http] (default "http")
Usage:
workspace-ctl compute get-logs [flags]
Flags:
-h, --help help for get-logs
--help-tree
--help-verbose
Global Flags:
--X-API-Key string
--base-path string For example: / (default "/")
--config string config file path
--debug output debug logs
--dry-run do not send the request to server
--hostname string hostname of the service (default "api:8080")
--print-curl print curl equivalent do not send the request to server
--scheme string Choose from: [http] (default "http")
Usage:
workspace-ctl compute get-pools [flags]
Flags:
--cluster-id string Required. Cluster ID
-h, --help help for get-pools
--help-tree
--help-verbose
Global Flags:
--X-API-Key string
--base-path string For example: / (default "/")
--config string config file path
--debug output debug logs
--dry-run do not send the request to server
--hostname string hostname of the service (default "api:8080")
--print-curl print curl equivalent do not send the request to server
--scheme string Choose from: [http] (default "http")
Usage:
workspace-ctl compute scale-pool [flags]
Flags:
--cluster-id string Required. Cluster ID
-h, --help help for scale-pool
--help-tree
--help-verbose
--pool-id string Required. Pool ID
--pool-member-count int Required. New pool size
Global Flags:
--X-API-Key string
--base-path string For example: / (default "/")
--config string config file path
--debug output debug logs
--dry-run do not send the request to server
--hostname string hostname of the service (default "api:8080")
--print-curl print curl equivalent do not send the request to server
--scheme string Choose from: [http] (default "http")
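As an example, scaling a pool to two members could look like this (the IDs are placeholders; real values can be obtained from get-cluster-details and get-pools):
workspace-ctl compute scale-pool --cluster-id <cluster-id> --pool-id <pool-id> --pool-member-count 2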
Usage:
workspace-ctl data [flags]
workspace-ctl data [command]
Available Commands:
create-mount Create a data mount under /data/mounts. Return newly created mount.
delete-mount Delete a data mount
get-mounts Returns the list of data mounts
Flags:
-h, --help help for data
--help-tree
--help-verbose
Global Flags:
--X-API-Key string
--base-path string For example: / (default "/")
--config string config file path
--debug output debug logs
--dry-run do not send the request to server
--hostname string hostname of the service (default "api:8080")
--print-curl print curl equivalent do not send the request to server
--scheme string Choose from: [http] (default "http")
Use "workspace-ctl data [command] --help" for more information about a command.
Create a data mount under /data/mounts. Return newly created mount.
Usage:
workspace-ctl data create-mount [flags]
Aliases:
create-mount, mount
Flags:
-h, --help help for create-mount
--help-tree Display commands as a tree
--help-verbose Extended help topics and options
--mode string Enum:["read-only","read-write"]. Mount mode i.e. read-only, read-write
--mount-path string Where to mount the data, e.g. /data/mounts/hg38data (or simply hg38data)
--source string Required. Source data location, e.g. /data/project/myData/hg38 or fol.bc53010dec124817f6fd08da4cf3c48a (ICA folder id)
--wait Wait for new mount to be available on all nodes before sending response
--wait-timeout int Max number of seconds for wait option. Absolute max: 300 (default 300)
Global Flags:
--X-API-Key string
--base-path string For example: / (default "/")
--config string config file path
--debug output debug logs
--dry-run do not send the request to server
--hostname string hostname of the service (default "api:8080")
--print-curl print curl equivalent do not send the request to server
--scheme string Choose from: [http] (default "http")
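For example, mounting a project folder read-only and waiting until it is available (paths taken from the flag descriptions above):
workspace-ctl data create-mount --source /data/project/myData/hg38 --mount-path hg38data --mode read-only --wait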
Delete a data mount
Usage:
workspace-ctl data delete-mount [flags]
Aliases:
delete-mount, unmount
Flags:
-h, --help help for delete-mount
--help-tree
--help-verbose
--id string Id of mount to remove
--mount-path string Path of mount to remove
Global Flags:
--X-API-Key string
--base-path string For example: / (default "/")
--config string config file path
--debug output debug logs
--dry-run do not send the request to server
--hostname string hostname of the service (default "api:8080")
--print-curl print curl equivalent do not send the request to server
--scheme string Choose from: [http] (default "http")
Returns the list of data mounts
Usage:
workspace-ctl data get-mounts [flags]
Aliases:
get-mounts, list-mounts
Flags:
-h, --help help for get-mounts
--help-tree
--help-verbose
Global Flags:
--X-API-Key string
--base-path string For example: / (default "/")
--config string config file path
--debug output debug logs
--dry-run do not send the request to server
--hostname string hostname of the service (default "api:8080")
--print-curl print curl equivalent do not send the request to server
--scheme string Choose from: [http] (default "http")
Usage:
workspace-ctl [flags]
workspace-ctl [command]
Available Commands:
completion Generate completion script
compute
data
help Help about any command
software
workspace
Flags:
--X-API-Key string
--base-path string For example: / (default "/")
--config string config file path
--debug output debug logs
--dry-run do not send the request to server
-h, --help help for workspace-ctl
--help-tree
--help-verbose
--hostname string hostname of the service (default "api:8080")
--print-curl print curl equivalent do not send the request to server
--scheme string Choose from: [http] (default "http")
Use "workspace-ctl [command] --help" for more information about a command.
To load completions:
Bash:
$ source <(yourprogram completion bash)
# To load completions for each session, execute once:
# Linux:
$ yourprogram completion bash > /etc/bash_completion.d/yourprogram
# macOS:
$ yourprogram completion bash > /usr/local/etc/bash_completion.d/yourprogram
Zsh:
# If shell completion is not already enabled in your environment,
# you will need to enable it. You can execute the following once:
$ echo "autoload -U compinit; compinit" >> ~/.zshrc
# To load completions for each session, execute once:
$ yourprogram completion zsh > "${fpath[1]}/_yourprogram"
# You will need to start a new shell for this setup to take effect.
fish:
$ yourprogram completion fish | source
# To load completions for each session, execute once:
$ yourprogram completion fish > ~/.config/fish/completions/yourprogram.fish
PowerShell:
PS> yourprogram completion powershell | Out-String | Invoke-Expression
# To load completions for every new session, run:
PS> yourprogram completion powershell > yourprogram.ps1
# and source this file from your PowerShell profile.
Usage:
workspace-ctl completion [bash|zsh|fish|powershell]
Flags:
-h, --help help for completion
Usage:
workspace-ctl compute [flags]
workspace-ctl compute [command]
Available Commands:
get-cluster-details
get-logs
get-pools
scale-pool
Flags:
-h, --help help for compute
--help-tree
--help-verbose
Use "workspace-ctl compute [command] --help" for more information about a command.
Usage:
workspace-ctl compute get-cluster-details [flags]
Flags:
-h, --help help for get-cluster-details
--help-tree
--help-verbose
Usage:
workspace-ctl compute get-logs [flags]
Flags:
-h, --help help for get-logs
--help-tree
--help-verbose
Usage:
workspace-ctl compute get-pools [flags]
Flags:
--cluster-id string Required. Cluster ID
-h, --help help for get-pools
--help-tree
--help-verbose
Usage:
workspace-ctl compute scale-pool [flags]
Flags:
--cluster-id string Required. Cluster ID
-h, --help help for scale-pool
--help-tree
--help-verbose
--pool-id string Required. Pool ID
--pool-member-count int Required. New pool size
Usage:
workspace-ctl data [flags]
workspace-ctl data [command]
Available Commands:
create-mount Create a data mount under /data/mounts. Return newly created mount.
delete-mount Delete a data mount
get-mounts Returns the list of data mounts
Flags:
-h, --help help for data
--help-tree
--help-verbose
Use "workspace-ctl data [command] --help" for more information about a command.
Create a data mount under /data/mounts. Return newly created mount.
Usage:
workspace-ctl data create-mount [flags]
Aliases:
create-mount, mount
Flags:
-h, --help help for create-mount
--help-tree
--help-verbose
--mount-path string Where to mount the data, e.g. /data/mounts/hg38data (or simply hg38data)
--source string Required. Source data location, e.g. /data/project/myData/hg38 or fol.bc53010dec124817f6fd08da4cf3c48a (ICA folder id)
--wait Wait for new mount to be available on all nodes before sending response
--wait-timeout int Max number of seconds for wait option. Absolute max: 300 (default 300)
Delete a data mount
Usage:
workspace-ctl data delete-mount [flags]
Aliases:
delete-mount, unmount
Flags:
-h, --help help for delete-mount
--help-tree
--help-verbose
--id string Id of mount to remove
--mount-path string Path of mount to remove
Returns the list of data mounts
Usage:
workspace-ctl data get-mounts [flags]
Aliases:
get-mounts, list-mounts
Flags:
-h, --help help for get-mounts
--help-tree
--help-verbose
Help provides help for any command in the application.
Simply type workspace-ctl help [path to command] for full details.
Usage:
workspace-ctl help [command] [flags]
Flags:
-h, --help help for help
Usage:
workspace-ctl software [flags]
workspace-ctl software [command]
Available Commands:
get-server-metadata
get-software-settings
Flags:
-h, --help help for software
--help-tree
--help-verbose
Use "workspace-ctl software [command] --help" for more information about a command.
Usage:
workspace-ctl software get-server-metadata [flags]
Flags:
-h, --help help for get-server-metadata
--help-tree
--help-verbose
Usage:
workspace-ctl software get-software-settings [flags]
Flags:
-h, --help help for get-software-settings
--help-tree
--help-verbose
Usage:
workspace-ctl workspace [flags]
workspace-ctl workspace [command]
Available Commands:
get-cluster-settings
get-connection-details
get-workspace-settings
Flags:
-h, --help help for workspace
--help-tree
--help-verbose
Use "workspace-ctl workspace [command] --help" for more information about a command.
Usage:
workspace-ctl workspace get-cluster-settings [flags]
Flags:
-h, --help help for get-cluster-settings
--help-tree
--help-verbose
Usage:
workspace-ctl workspace get-connection-details [flags]
Flags:
-h, --help help for get-connection-details
--help-tree
--help-verbose
Usage:
workspace-ctl workspace get-workspace-settings [flags]
Flags:
-h, --help help for get-workspace-settings
--help-tree
--help-verbose
Usage:
workspace-ctl software [flags]
workspace-ctl software [command]
Available Commands:
get-server-metadata
get-software-settings
Flags:
-h, --help help for software
--help-tree
--help-verbose
Global Flags:
--X-API-Key string
--base-path string For example: / (default "/")
--config string config file path
--debug output debug logs
--dry-run do not send the request to server
--hostname string hostname of the service (default "api:8080")
--print-curl print curl equivalent do not send the request to server
--scheme string Choose from: [http] (default "http")
Use "workspace-ctl software [command] --help" for more information about a command.
Usage:
workspace-ctl software get-server-metadata [flags]
Flags:
-h, --help help for get-server-metadata
--help-tree
--help-verbose
Global Flags:
--X-API-Key string
--base-path string For example: / (default "/")
--config string config file path
--debug output debug logs
--dry-run do not send the request to server
--hostname string hostname of the service (default "api:8080")
--print-curl print curl equivalent do not send the request to server
--scheme string Choose from: [http] (default "http")
Usage:
workspace-ctl software get-software-settings [flags]
Flags:
-h, --help help for get-software-settings
--help-tree
--help-verbose
Global Flags:
--X-API-Key string
--base-path string For example: / (default "/")
--config string config file path
--debug output debug logs
--dry-run do not send the request to server
--hostname string hostname of the service (default "api:8080")
--print-curl print curl equivalent do not send the request to server
--scheme string Choose from: [http] (default "http")
Usage:
workspace-ctl workspace [flags]
workspace-ctl workspace [command]
Available Commands:
get-cluster-settings
get-connection-details
get-workspace-settings
Flags:
-h, --help help for workspace
--help-tree
--help-verbose
Global Flags:
--X-API-Key string
--base-path string For example: / (default "/")
--config string config file path
--debug output debug logs
--dry-run do not send the request to server
--hostname string hostname of the service (default "api:8080")
--print-curl print curl equivalent do not send the request to server
--scheme string Choose from: [http] (default "http")
Use "workspace-ctl workspace [command] --help" for more information about a command.
Usage:
workspace-ctl workspace get-cluster-settings [flags]
Flags:
-h, --help help for get-cluster-settings
--help-tree
--help-verbose
Global Flags:
--X-API-Key string
--base-path string For example: / (default "/")
--config string config file path
--debug output debug logs
--dry-run do not send the request to server
--hostname string hostname of the service (default "api:8080")
--print-curl print curl equivalent do not send the request to server
--scheme string Choose from: [http] (default "http")
Usage:
workspace-ctl workspace get-connection-details [flags]
Flags:
-h, --help help for get-connection-details
--help-tree
--help-verbose
Global Flags:
--X-API-Key string
--base-path string For example: / (default "/")
--config string config file path
--debug output debug logs
--dry-run do not send the request to server
--hostname string hostname of the service (default "api:8080")
--print-curl print curl equivalent do not send the request to server
--scheme string Choose from: [http] (default "http")
Usage:
workspace-ctl workspace get-workspace-settings [flags]
Flags:
-h, --help help for get-workspace-settings
--help-tree
--help-verbose
Global Flags:
--X-API-Key string
--base-path string For example: / (default "/")
--config string config file path
--debug output debug logs
--dry-run do not send the request to server
--hostname string hostname of the service (default "api:8080")
--print-curl print curl equivalent do not send the request to server
--scheme string Choose from: [http] (default "http")