Reference Data

Reference Data are reference genome sets used to look for deviations and to compare your data against.

Creating Reference Data

Reference data properties are located at the main navigation level and consist of the following free text fields.

  • Types

  • Species

  • Reference Sets

Once these are configured,

  1. Go to your data in Projects > your_project > Data.

  2. Select the data you want to use as reference data and Manage > Use as reference data.

  3. Fill out the configuration screen.

You can see the result at the main navigation level > Reference Data (outside of projects) or in Projects > your_project > Flow > Reference Data.

Linking Reference Data to your Project

To use a reference set from within a project, you must first link it: select Projects > your_project > Flow > Reference Data > Link, then select a reference set to add to your project.

Note: Reference sets are only supported in Graphical CWL pipelines.

Copying Reference Data to other Regions

  1. Navigate to Reference Data (Not from within a project, but outside of project context, so at the main navigation level).

  2. Select the data set(s) you wish to add to another region and select Copy to another project.

  3. Select a project located in the region where you want to add your reference data.

Note: You only need one copy of each reference data set per region. Adding reference data sets to additional projects in the same region does not create extra copies, but creates links instead. This is done from inside the project at Projects > <your_project> > Flow > Reference Data > Manage > Add to project. You can check in which region(s) reference data is present by opening the reference set and viewing Copy Details. Allow a few minutes for new copies to become available before use.

Creating a Pipeline with Reference Data

To create a pipeline with reference data, use the graphical mode: Projects > your_project > Flow > Pipelines > +Create > CWL Graphical. Use the reference data icon instead of the regular input icon. On the right-hand side, use the Reference files submenu to specify the name, the format, and the filters. You can specify the options for an end user to choose from and a default selection. You can select more than one file, but only one at a time, so repeat the process to select multiple reference files. If you select only one reference file, that file will be the only one users can use with your pipeline. The screenshot shows a reference data input with two options.

Note: Safari is not supported for graphical CWL editing.

If your pipeline was built to let users choose among multiple reference files, they will see the option to select among the configured reference files under Settings. After clicking the magnifying glass icon, the user can select from the provided options.

Flow

Flow provides tooling for building and running secondary analysis pipelines. The platform supports analysis workflows constructed using Common Workflow Language (CWL) and Nextflow. Each step of an analysis pipeline executes a containerized application using inputs passed into the pipeline or output from previous steps.

You can configure the following components in Illumina Connected Analytics Flow:

  • Reference Data — Reference Data for Graphical CWL flows. See Reference Data.

  • Pipelines — One or more tools configured to process input data and generate output files. See Pipelines.

  • Analyses — Launched instance of a pipeline with selected input data. See Analyses.


    Tips and Tricks

    Developing on the cloud incurs inherent runtime costs due to compute and storage used to execute workflows. Here are a few tips that can facilitate development.

  • Leverage the cross-platform nature of these workflow languages. Both CWL and Nextflow can be run locally as well as on ICA. When possible, test locally before attempting to run in the cloud. For Nextflow, configuration files can be used to specify settings for either local or ICA execution. An example of advanced config usage is applying the scratch directive to a set of process names (or labels) so that they use the higher-performance local scratch storage attached to an instance instead of the shared network disk (see the scratch configuration example below).

  • When testing on the cloud, it is often beneficial to create scripts to automate the deployment and launching/monitoring process. This can be done either with the ICA CLI or by creating your own scripts integrating with the REST API.

  • For scenarios in which instances are terminated prematurely without warning (for example, while using spot instances), you can implement scripts like the following to retry the job a number of times. Adding the retry configuration shown below to nextflow.config allows each job to be retried up to four times (five attempts in total), with an increasing delay between tries.

      Note: Adding the retry script where it is not needed might introduce additional delays.

  • When hardening a Nextflow pipeline to handle resource shortages (for example, exit code 2147483647), an immediate retry will in most circumstances fail because the resources have not yet been made available. It is best practice to use Dynamic retry with backoff, which has an increasing backoff delay, allowing the system time to provide the necessary resources.

  • When publishing your Nextflow pipeline, make sure you have defined a container such as 'public.ecr.aws/lts/ubuntu:22.04' and are not using the default container 'ubuntu:latest' (see the container sketch after this list).

  • To limit potential costs, there is a timeout of 96 hours: if the analysis does not complete within four days, it goes to a 'Failed' state. This time starts counting as soon as the input data begins downloading, which takes place during the ICA 'Requested' step of the analysis, before it goes to 'In Progress'. When parallel tasks are executed, their running time is counted only once. As an example, assume the initial period before being picked up for execution is 10 minutes and consists of the request, queueing and initializing. Then the data download takes 20 minutes. Next, a task runs on a single node for 25 minutes, followed by 10 minutes of queue time. Finally, three tasks execute simultaneously, taking 25, 28, and 30 minutes respectively, followed by uploading the outputs for one minute. The overall analysis time is then 20 + 25 + 10 + 30 (the longest of the three parallel tasks) + 1 = 86 minutes (see the timing table below):

    If there are no available resources or your project priority is low, the time before download commences will be substantially longer.

  • By default, Nextflow does not generate the trace report. If you want to enable it, add the trace configuration shown below to your nextflow.config file.
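
    A minimal nextflow.config sketch for the container tip above (the image name is only an example):

    // Pin a specific container instead of relying on the default 'ubuntu:latest'
    process.container = 'public.ecr.aws/lts/ubuntu:22.04'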

    Scratch configuration example (for the scratch directive tip above):

    withName: 'process1|process2|process3' { scratch = '/scratch/' }
    withName: 'process3' { stageInMode = 'copy' } // Copy the input files to scratch instead of symlinking to the shared network disk

    Retry configuration for nextflow.config (for the spot instance tip above):

    process {
        maxRetries = 4
        errorStrategy = { sleep(task.attempt * 60000 as long); return 'retry' } // Retry with increasing delay
    }

    Trace report configuration for nextflow.config:

    trace.enabled = true
    trace.file = '.ica/user/trace-report.txt'
    trace.fields = 'task_id,hash,native_id,process,tag,name,status,exit,module,container,cpus,time,disk,memory,attempt,submit,start,complete,duration,realtime,queue,%cpu,%mem,rss,vmem,peak_rss,peak_vmem,rchar,wchar,syscr,syscw,read_bytes,write_bytes,vol_ctxt,inv_ctxt,env,workdir,script,scratch,error_action'

    Analysis timing example for the 96 hour limit (counted time = 20 + 25 + 10 + 30 + 1 = 86 minutes):

    | Analysis task      | Status in ICA      | 96 hour limit    |
    | ------------------ | ------------------ | ---------------- |
    | request            | requested          | 1m (not counted) |
    | queued             | queued             | 7m (not counted) |
    | initializing       | initializing       | 2m (not counted) |
    | input download     | preparing inputs   | 20m              |
    | single task        | in progress        | 25m              |
    | queue              | in progress        | 10m              |
    | parallel tasks     | in progress        | 30m              |
    | generating outputs | generating outputs | 1m               |
    | completed          | succeeded          | -                |

    Useful Links

    • Nextflow on Kubernetes: Best Practices

    • The State of Kubernetes in Nextflow
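
    For the dynamic retry with backoff mentioned in the tips above, a minimal nextflow.config sketch (the delay values are only an illustration) could look like this:

    process {
        maxRetries = 5
        // Exponential backoff: wait 30s, 60s, 120s, ... before each retry
        errorStrategy = { sleep(Math.pow(2, task.attempt - 1) * 30000 as long); return 'retry' }
    }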

    Pipelines

    A Pipeline is a series of Tools with connected inputs and outputs configured to execute in a specific order.

    Linking Existing Pipelines

Linking a pipeline (Projects > your_project > Flow > Pipelines > Link) adds that pipeline to your project. This is not a copy but the actual pipeline, so any changes to the pipeline are automatically propagated to and from any project which has this pipeline linked.

    You can link a pipeline if it is not already linked to your project and it is from your tenant or available in your bundle or activation code.

Note: Activation codes are tokens which allow you to run your analyses and are used for accounting and allocating the appropriate resources. ICA will automatically determine the best matching activation code, but this can be overridden if needed.

If you unlink a pipeline, it is removed from your project, but it remains part of the list of pipelines of your tenant, so it can be linked to other projects later on.

Note: There is no way to permanently delete a pipeline.


    Create a Pipeline

    Pipelines are created and stored within projects.

    1. Navigate to Projects > your_project > Flow > Pipelines > +Create.

2. Select Nextflow (XML / JSON / Git), CWL Graphical, or CWL code (XML / JSON / Git) to create a new pipeline.

3. Configure pipeline settings in the pipeline property tabs.

4. When creating a graphical CWL pipeline, drag connectors to link tools to input and output files in the canvas. Required tool inputs are indicated by a yellow connector.

5. Select Save.

Warning: Pipelines use the tool definitions that were current when the pipeline was last saved. Tool changes do not automatically propagate to the pipeline. To update the pipeline with the latest tool changes, edit the pipeline definition by removing the tool and adding it back to the pipeline.

Note: Individual pipeline files are limited to 20 MB. If you need more than this, split your content over multiple files.

    Pipeline Statuses

For pipeline authors sharing and distributing their pipelines, the draft, released, deprecated, and archived statuses provide a structured framework for managing pipeline availability, user communication, and transition planning. To change the pipeline status, select it at Projects > your_project > Pipelines > your_pipeline > change status.

Draft

  • Purpose: Use the draft status while developing or testing a pipeline version internally.

  • Best Practice: Only share draft pipelines with collaborators who are actively involved in development.

Released

  • Purpose: The released status signals that a pipeline is stable and ready for general use.

  • Best Practice: Share your pipeline when it is ready for broad use. Ensure users have access to current documentation and know where to find support or updates. Releasing a pipeline is only possible if all tools of that pipeline are in released status.

Deprecated

  • Purpose: Deprecation is used when a pipeline version is scheduled for retirement or replacement. Deprecated pipelines can not be linked to bundles, but will not be unlinked from existing bundles. Users who already have access will still be able to start analyses. You can add a message (max 256 chars) when deprecating pipelines.

  • Best Practice: Deprecate in advance of archiving a pipeline, making sure the new pipeline is available in the same bundle as the deprecated pipeline. This allows the pipeline author to link the new or alternative pipeline in the deprecation message field.

Archived

  • Purpose: Archiving a pipeline version removes it from active use; users can no longer launch analyses. Archived pipelines can not be linked to bundles, but are not automatically unlinked from bundles or projects. You can add a message (max 256 chars) when archiving pipelines.

  • Best Practice: Warn users in advance: deprecate the pipeline before archiving to allow existing users time to transition. Use the archive message to point users to the new or alternative pipeline.

Note: You can edit pipelines while they are in Draft status. Once they move out of draft, pipelines can no longer be edited. Pipelines can be cloned (top right in the details view) to create a new editable version.


    Pipeline Properties

    The following sections describe the properties that can be configured in each tab of the pipeline editor.

Depending on how you design the pipeline, the displayed tabs differ between the graphical and code definitions. For CWL you can choose how to define the pipeline; Nextflow is always defined in code mode.

CWL Graphical

  • Details

  • Documentation

  • Definition

  • Analysis Report

  • Metadata Model

  • Report

CWL Code

  • Details

  • Documentation

  • Inputform files (JSON) or XML Configuration (XML)

  • CWL Files

  • Metadata Model

  • Report

Nextflow Code

  • Details

  • Documentation

  • Inputform Files (JSON) or XML Configuration (XML)

  • Nextflow files

  • Metadata Model

  • Report

Any additional source files related to your pipeline are displayed in alphabetical order.

See the Nextflow and CWL pages for language-specific details on defining pipelines.


    Details

    The details tab provides options for configuring basic information about the pipeline.

  • Code: The name of the pipeline. The name must be unique within the tenant, including linked and unlinked pipelines.

  • Nextflow Version: User-selectable Nextflow version, available only for Nextflow pipelines.

  • Categories: One or more tags to categorize the pipeline. Select from existing tags or type a new tag name in the field.

  • Description: A short description of the pipeline.

  • Proprietary: Hide the pipeline scripts and details from users who do not belong to the tenant that owns the pipeline. This also prevents cloning the pipeline.

  • Status: The release status of the pipeline.

  • Storage size: User-selectable storage size for running the pipeline. This must be large enough to run the pipeline, but setting it too large incurs unnecessary costs.

  • Family: A group of pipeline versions. To specify a family, select Change, and then select a pipeline or pipeline family. To change the order of the pipelines, select Up or Down. The first pipeline listed is the default and the remaining pipelines are listed as Other versions. The current pipeline appears in the list as this pipeline.

  • Version comment: A description of changes in the updated version.

  • Links: External reference links (max 100 chars as name and 2048 chars as link).

    The following information becomes visible when viewing the pipeline details.

  • ID: Unique identifier of the pipeline.

  • URN: Identification of the pipeline in Uniform Resource Name format.

The clone action is shown at the top right of the pipeline details. Cloning a pipeline allows you to make modifications without impacting the original pipeline; when you clone a pipeline, you become the owner of the clone. You must give the clone a unique name, because duplicate names are not allowed across all projects of the tenant. You may still see the same pipeline name twice when a pipeline linked from another tenant is cloned under that name in your tenant: each name remains unique per tenant, but both are visible in your tenant.

    When you clone a Nextflow pipeline, a verification of the configured Nextflow version is done to prevent the use of deprecated versions.

    Documentation

The Documentation tab is where you explain to users how your pipeline works. The description appears in the tool repository but is excluded from exported CWL definitions. If no documentation has been provided, this tab is empty.

    Definition (Graphical)

    When using graphical mode for the pipeline definition, the Definition tab provides options for configuring the pipeline using a visualization panel and a list of component menus.

  • Tool repository: A list of tools available to be used in the pipeline.

  • Machine profiles: Compute types available to use with tools in the pipeline.

  • Shared settings: Settings for pipelines used in more than one tool.

  • Reference files: Descriptions of reference files used in the pipeline.

  • Input files: Descriptions of input files used in the pipeline.

  • Output files: Descriptions of output files used in the pipeline.

  • Tool: Details about the tool selected in the visualization panel.
Note: In graphical mode, you can drag and drop inputs into the visualization panel to connect them to the tools. Make sure to connect the input icons to the tool before editing the input details in the component menu. Required tool inputs are indicated by a yellow connector.

Safari is not supported as a browser for graphical editing.

Warning: When creating a graphical CWL pipeline, do not use spaces in the input field names; use underscores instead. The API normalizes input names when running the analysis to prevent issues with special characters (such as accented letters) by replacing them with their more common (unaccented) counterparts. Part of this normalization is replacing spaces in names with underscores. The normalization is applied to file input names, reference file input names, step ids and step parameters.

You will encounter the error ICA_API_004 "No value found for required input parameter" when trying to run an API analysis on a graphical pipeline that has been designed with spaces in input parameters.

    XML Configuration / JSON Inputform Files (Code)

    This page is used to specify all relevant information about the pipeline parameters.

Note: There is a limit of 200 reports per report pattern which will be shown when you have multiple reports matching your regular expression.

    Compute Resources

    Compute Nodes

    For each process defined by the workflow, ICA will launch a compute node to execute the process.

    • For each compute type, the standard (default - AWS on-demand) or economy (AWS spot instance) tiers can be selected.

    • When selecting an fpga instance type for running analyses on ICA, it is recommended to use the medium size. While the large size offers slight performance benefits, these do not proportionately justify the associated cost increase for most use cases.

  • When no type is specified, the default compute node type is standard-small.

Note: You can see which resources were used in the different analysis steps at Projects > your_project > Flow > Analyses > your_analysis > Steps tab. (For child steps, these are displayed on the parent step.)

    By default, compute nodes have no scratch space. This is an advanced setting and should only be used when absolutely necessary as it will incur additional costs and may offer only limited performance benefits because it is not local to the compute node.

    For simplicity and better integration, consider using shared storage available at /ces. It is what is provided in the Small/Medium/Large+ compute types. This shared storage is used when writing files with relative paths.

Scratch space notes

If you do require scratch space via a Nextflow pod annotation or a CWL resource requirement, the path is /scratch.

  • For Nextflow, the pod annotation 'volumes.illumina.com/scratchSize' with value '1TiB' will reserve 1 TiB.

  • For CWL, adding - class: ResourceRequirement with tmpdirMin: 5000 to your requirements section will reserve 5000 MiB.

Avoid the following as they do not align with ICAv2 scratch space configuration:

  • Container overlay tmp path: /tmp

  • Legacy paths: /ephemeral

  • Environment variables ($TMPDIR, $TEMP and $TMP)

  • Bash command mktemp

  • CWL runtime.tmpdir
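
A minimal Nextflow process sketch for reserving scratch space (the process name and size are only illustrative):

process uses_scratch {
    // Reserve 1 TiB of scratch, available at /scratch inside the task
    pod annotation: 'volumes.illumina.com/scratchSize', value: '1TiB'

    script:
    """
    echo "writing temporary data" > /scratch/tmp.txt
    """
}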

    Compute Types

Daemon sets and system processes consume approximately 1 CPU and 2 GB memory from the base values shown in the table. Consumption will vary based on the activity of the pod.

| Compute Type        | CPUs | Mem (GiB) | Nextflow (pod.value) | CWL (type, size)  |
| ------------------- | ---- | --------- | -------------------- | ----------------- |
| standard-small      | 2    | 8         | standard-small       | standard, small   |
| standard-medium     | 4    | 16        | standard-medium      | standard, medium  |
| standard-large      | 8    | 32        | standard-large       | standard, large   |
| standard-xlarge     | 16   | 64        | standard-xlarge      | standard, xlarge  |
| standard-2xlarge    | 32   | 128       | standard-2xlarge     | standard, 2xlarge |
| standard-3xlarge    | 64   | 256       | standard-3xlarge     | standard, 3xlarge |
| hicpu-small         | 16   | 32        | hicpu-small          | hicpu, small      |
| hicpu-medium        | 36   | 72        | hicpu-medium         | hicpu, medium     |
| hicpu-large         | 72   | 144       | hicpu-large          | hicpu, large      |
| himem-small         | 8    | 64        | himem-small          | himem, small      |
| himem-medium        | 16   | 128       | himem-medium         | himem, medium     |
| himem-large         | 48   | 384       | himem-large          | himem, large      |
| himem-xlarge (2)    | 92   | 700       | himem-xlarge         | himem, xlarge     |
| hiio-small          | 2    | 16        | hiio-small           | hiio, small       |
| hiio-medium         | 4    | 32        | hiio-medium          | hiio, medium      |
| fpga2-medium (1)    | 24   | 256       | fpga2-medium         | fpga2, medium     |
| fpga2-large (1)     | 48   | 512       | fpga2-large          | fpga2, large      |
| gpu-small           | 8    | 61        | gpu-small            | gpu, small        |
| gpu-medium          | 32   | 244       | gpu-medium           | gpu, medium       |
| transfer-small (3)  | 4    | 10        | transfer-small       | transfer, small   |
| transfer-medium (3) | 8    | 15        | transfer-medium      | transfer, medium  |
| transfer-large (3)  | 16   | 30        | transfer-large       | transfer, large   |

Warning: (1) DRAGEN pipelines running on the fpga2 compute type incur a DRAGEN license cost of 0.10 iCredits per gigabase of data processed, with volume discounts as shown below.

  • 80 or less gigabase per sample - no discount - 0.10 iCredits per gigabase

  • > 80 to 160 gigabase per sample - 20% discount - 0.08 iCredits per gigabase

  • > 160 to 240 gigabase per sample - 30% discount - 0.07 iCredits per gigabase

  • > 240 to 320 gigabase per sample - 40% discount - 0.06 iCredits per gigabase

  • > 320 and more gigabase per sample - 50% discount - 0.05 iCredits per gigabase

  • DRAGEN Iterative gVCF Genotyper (iGG) will incur a license cost of 0.6216 iCredits per gigabase. For example, a sample of 3.3 gigabase human reference will result in 2 iCredits per sample. The associated compute costs will be based on the compute instance chosen.

The ORA (Original Read Archive) compression pipeline is part of the DRAGEN platform. It performs lossless genomic data compression to reduce the size of FASTQ and FASTQ.GZ files (up to 4-6x smaller) while preserving data integrity with internal checksum verification. The ORA compression pipeline has a license cost of 0.017 iCredits per input Gbase; decompression does not have an associated license cost.

Warning: The DRAGEN_Map_Align pipeline running on fpga2 has the standard DRAGEN license cost of 0.10 iCredits per Gbase processed, but replaces the standard volume discounts with the discounts shown below.

  • 10 or less gigabase per sample - no discount - 0.10 iCredits per gigabase

  • > 10 to 25 gigabase per sample - 30% discount - 0.07 iCredits per gigabase

  • > 25 to 60 gigabase per sample - 70% discount - 0.03 iCredits per gigabase

  • > 60 and more gigabase per sample - 85% discount - 0.015 iCredits per gigabase

Note: (2) The compute type himem-xlarge has low availability.

Warning: FPGA1 instances were decommissioned on Nov 1st 2025. Please migrate to F2 for improved capacity and performance with up to 40% reduced turnaround time for analysis.

Note: (3) The transfer compute type size is selected based on the chosen storage size and is used during upload and download system tasks.

    Nextflow/CWL Files (Code)

    Syntax highlighting is determined by the file type, but you can select alternative syntax highlighting with the drop-down selection list. The following formats are supported:

  • DIFF (.diff)

  • GROOVY (.groovy .nf)

  • JAVASCRIPT (.js .javascript)

  • JSON (.json)

  • SH (.sh)

  • SQL (.sql)

  • TXT (.txt)

  • XML (.xml)

  • YAML (.yaml .cwl)

Note: If the file type is not recognized, it will default to text display. This can result in the application interpreting binary files as text when trying to display the contents.

    Main.nf (Nextflow code)

    The Nextflow project main script.

    Nextflow.config (Nextflow code)

    The Nextflow configuration settings.

    Workflow.cwl (CWL code)

    The Common Workflow Language main script.

    Adding Files

    Multiple files can be added by selecting the +Create option at the bottom of the screen to make pipelines more modular and manageable.

    Metadata Model

See Metadata Models.

    Report

    Here patterns for detecting report files in the analysis output can be defined. On opening an analysis result window of this pipeline, an additional tab will display these report files. The goal is to provide a pipeline-specific user-friendly representation of the analysis result.

To add a report, select the + symbol on the left side. Provide your report with a unique name, a regular expression matching the report, and optionally select the format of the report. This must be the source format of the report data generated during the analysis.

Note: There is a limit of 20 reports per report pattern which will be shown when you have multiple reports matching your regular expression.


    Start a New Analysis

    Use the following instructions to start a new analysis for a single pipeline.

    1. Select Projects > your_project > Flow > Pipelines.

    2. Select the pipeline or pipeline details of the pipeline you want to run.

3. Select Start Analysis.

4. Configure analysis settings (see below).

5. Select Start Analysis.

6. View the analysis status on the Analyses page.

  • Requested—The analysis is scheduled to begin.

  • In Progress—The analysis is in progress.

  • Succeeded—The analysis is complete.

  • Failed—The analysis has failed.

  • Aborted—The analysis was aborted before completing.

7. To end an analysis, select Abort.

8. To perform a completed analysis again, select Re-run.

    Analysis Settings

    The Start Analysis screen provides the configuration options for the analysis.

  • User Reference: The unique analysis name.

  • Pipeline: Not editable, but provides a link to the pipeline in case you want to look up its details.

  • Pricing: Select a subscription to which the analysis will be charged.

  • Input: Select the input files to use in the analysis (max. 50,000).

  • Settings (optional): Provide input settings.

  • Resources: Select the storage size for your analysis. The available storage sizes depend on your selected pricing subscription. See Storage for more information.

  • User tags (optional): One or more tags used to filter the analysis list. Select from existing tags or type a new tag name in the field.

  • Notification (optional): Enter your email address if you want to be notified when the analysis completes.

  • Output Folder (1): Select a folder in which the output folder of the analysis should be located. When no folder is selected, the output folder will be located in the root of the project. When you open the folder selection dialog, you have the option to create a new folder (bottom of the screen). You can create nested folders by using the folder/subfolder syntax. Do not use a / before the first folder or after the last subfolder in the folder creation dialog.

  • Logs Folder: Select a folder in which the logs of the analysis should be located. When no logs folder is selected, the logs will be stored as a subfolder in the output folder. When a logs folder is selected which is different from the output folder, the outputs and logs folders are separated. Files that already exist in the logs folder will be overwritten with new versions. When you open the folder selection dialog, you have the option to create a new folder (bottom of the screen). You can create nested folders by using the folder/subfolder syntax. Choose a folder that is empty and not in use for other analyses, as files will be overwritten. Do not use a / before the first folder or after the last subfolder in the folder creation dialog.

(1) When using the API, you can redirect analysis outputs to be outside of the current project.

    Aborting Analyses

    You can abort a running analysis from either the analysis overview (Projects > your_project > Flow > Analyses > your_analysis > Manage > Abort) or from the analysis details (Projects > your_project > Flow > Analyses > your_analysis > Details tab > Abort).

    View Analysis Results

    You can view analysis results on the Analyses page or in the output folder on the Data page.

    1. Select a project, and then select the Flow > Analyses page.

    2. Select an analysis.

    3. From the output files tab, expand the list if needed and select an output file.



    If you want to add or remove any user or technical tags, you can do so from the data details view.
  • If you want to download the file, select Download.

  • To preview the file, select the View tab.

  • Return to Flow > Analyses > your_analysis.

  • View additional analysis result information on the following tabs:

    • Details - View information on the pipeline configuration.

    • Report - Shows the reports defined on the pipeline report tab.

    • Output files - View the output of the Analysis.

    • Steps - stderr and stdout information.

    • Nextflow timeline - Nextflow process execution timeline.

    • Nextflow execution - Nextflow analysis report. Showing the run times, commands, resource usage and tasks for Nextflow analyses.


    CWL

ICA supports running pipelines defined using the Common Workflow Language (CWL).

    Compute Type

To specify a compute type for a CWL CommandLineTool, either define the RAM and number of cores with a CWL ResourceRequirement, or use the ICA resource type and size. The ICA compute type will automatically be determined based on the coresMin/coresMax (CPU) and ramMin/ramMax (memory) values using a "best fit" strategy that meets the minimum specified requirements (see the Compute Types table for the mapping).

For example, take the following ResourceRequirement:

requirements:
    ResourceRequirement:
      ramMin: 10240
      coresMin: 6

This will result in a best fit of the standard-large ICA compute type being requested for the task.

Note: If the specified requirements can not be met by any of the presets, the task will be rejected and failed.

See the examples below for using ResourceRequirement in the CWL workflow with the predefined compute types:

requirements:
    ResourceRequirement:
        https://platform.illumina.com/rdf/ica/resources:type: fpga2
        https://platform.illumina.com/rdf/ica/resources:size: medium
        https://platform.illumina.com/rdf/ica/resources:tier: standard

requirements:
    ResourceRequirement:
        https://platform.illumina.com/rdf/ica/resources:type: himem
        https://platform.illumina.com/rdf/ica/resources:size: small
        https://platform.illumina.com/rdf/ica/resources:tier: economy

Note:

  • FPGA requirements can not be set by means of CWL ResourceRequirements.

  • The Machine Profile Resource in the graphical editor will override whatever is set for requirements in the ResourceRequirement.

Standard vs Economy

For each compute type, you can choose between the

  • Standard - AWS on-demand (default) or

  • Economy - AWS spot instance tiers.

You can set economy mode with the "tier" parameter, as shown in the himem example above.

Considerations

If no Docker image is specified, Ubuntu will be used as default. Both : and / can be used as separator.

CWL Overrides

ICA supports overriding workflow requirements at load time using the Command Line Interface (CLI) with JSON input. Please refer to the CWL documentation for more details on the CWL overrides feature.

In ICA you can provide the "override" recipes as part of the input JSON. The following example uses CWL overrides to change the environment variable requirement at load time.

icav2 projectpipelines start cwl cli-tutorial --data-id fil.a725a68301ee4e6ad28908da12510c25 --input-json '{
  "ipFQ": {
    "class": "File",
    "path": "test.fastq"
  },
  "cwltool:overrides": {
    "tool-fqTOfa.cwl": {
      "requirements": {
        "EnvVarRequirement": {
          "envDef": {
            "MESSAGE": "override_value"
          }
        }
      }
    }
  }
}' --type-input JSON --user-reference overrides-example

    JSON Scatter Gather Pipeline

    Let's create the Nextflow Scatter Gather pipeline with a JSON input form.

Note: Pay close attention to uppercase and lowercase characters when creating pipelines.

    Select Projects > your_project > Flow > Pipelines. From the Pipelines view, click the +Create > Nextflow > JSON based button to start creating a Nextflow pipeline.

In the Details tab, add values for the required Code (unique pipeline name) and Description fields. Nextflow Version and Storage size default to preassigned values.

    Nextflow files

    split.nf

    First, we present the individual processes. Select Nextflow files > + Create and label the file split.nf. Copy and paste the following definition.

    sort.nf

    Next, select +Create and name the file sort.nf. Copy and paste the following definition.

    merge.nf

    Select +Create again and label the file merge.nf. Copy and paste the following definition.

    main.nf

    Edit the main.nf file by navigating to the Nextflow files > main.nf tab and copying and pasting the following definition.

Here, the operators flatten and collect are used to transform the emitted channels. The flatten operator transforms a channel in such a way that every item of type Collection or Array is flattened, so that each single entry is emitted separately by the resulting channel. The collect operator collects all the items emitted by a channel into a List and returns the resulting object as a sole emission.
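
A minimal illustration of the two operators outside the tutorial (the values are arbitrary):

workflow {
    Channel.of( [1, 2], [3, 4] ).flatten().view()  // emits 1, 2, 3, 4 as separate items
    Channel.of( 1, 2, 3, 4 ).collect().view()      // emits [1, 2, 3, 4] once
}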

    Inputform files

    On the Inputform files tab, edit the inputForm.json to allow selection of a file.

    inputForm.json

    Click the Simulate button (at the bottom of the text editor) to preview the launch form fields.

    The onSubmit.js and onRender.js can remain with their default scripts and are just shown here for reference.

    onSubmit.js

    onRender.js

    Click the Save button to save the changes.

    XML Input Form

    Pipelines defined using the "Code" mode require either an XML-based or JSON-based input form to define the fields shown on the launch view in the user interface (UI). The XML-based input form is defined in the "XML Configuration" tab of the pipeline editing view.

    The input form XML must adhere to the input form schema.

    Empty Form

    During the creation of a Nextflow pipeline the user is given an empty form to fill out.

    Files

The input files are specified within a single DataInputs node. An individual input is then specified in a separate DataInput node. A DataInput node contains the following attributes:

  • code: a unique id. Required.

  • format: the format of the input: FASTA, TXT, JSON, UNKNOWN, etc. Multiple entries are possible (see the example below). Required.

  • type: FILE or DIRECTORY. Multiple entries are not allowed. Required.

  • required: is this input required for the execution of a pipeline? Required.

  • multiValue: are multiple files allowed as an input? Required.

  • dataFilter: TBD. Optional.

    Additionally, DataInput has two elements: label for labelling the input and description for a free text description of the input.

    Single file input

    An example of a single file input which can be in a TXT, CSV, or FASTA format.

    Folder as an input

    To use a folder as an input the following form is required:

    Multiple files as an input

For multiple files, set the attribute multiValue to true. The corresponding variable is then considered to be a list [], so adapt your pipeline when changing from single value to multiValue (see the sketch below).
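
A minimal Nextflow sketch for consuming a multiValue input (the parameter name tumor_fastqs is only an example):

workflow {
    // A multiValue input arrives as a list, so build the channel from the list.
    // Wrapping a single value keeps the workflow usable if the form later changes to single value.
    def fastqs = params.tumor_fastqs instanceof List ? params.tumor_fastqs : [params.tumor_fastqs]
    Channel.fromList(fastqs).view()
}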

    Settings

Settings (as opposed to files) are specified within the steps node. Settings represent any non-file input to the workflow, including but not limited to strings, booleans and integers. The following hierarchy of nodes must be followed: steps > step > tool > parameter. The parameter node must contain the following attributes:

  • code: a unique id. This is the parameter name that is passed to the workflow.

  • minValues: how many values (at least) should be specified for this setting. If the setting is required, minValues should be set to 1.

  • maxValues: how many values (at most) should be specified for this setting.

  • classification: is this setting specified by the user?

In the code below a string setting with the identifier inp1 is specified.

    Examples of the following types of settings are shown in the subsequent sections. Within each type, the value tag can be used to denote a default value in the UI, or can be left blank to have no default. Note that setting a default value has no impact on analyses launched via the API.

    Integers

    For an integer setting the following schema with an element integerType is to be used. To define an allowed range use the attributes minimumValue and maximumValue.

    Options

    Options types can be used to designate options from a drop-down list in the UI. The selected option will be passed to the workflow as a string. This currently has no impact when launching from the API, however.

    Option types can also be used to specify a boolean, for example

    Strings

    For a string setting the following schema with an element stringType is to be used.

    Booleans

    For a boolean setting, booleanType can be used.

    Limitations

One known limitation of the schema presented above is the inability to specify a parameter that can have multiple types, e.g. File or String. One way to implement this requirement is to define two optional parameters: one for the File input and one for the String input. At the moment the ICA UI does not validate whether at least one of these parameters is populated; this check can be done within the pipeline itself.

Below you can find both a main.nf and an XML configuration of a generic pipeline with two optional inputs, which can be used as a template to address similar issues. If the file parameter is set, it will be used. If the str parameter is set but file is not, the str parameter will be used. If neither is provided, the pipeline aborts with an informative error message.

    process split {
        cpus 1
        memory '512 MB'
        
        input:
        path x
        
        output:
        path("split.*.tsv")
        
        """
        split -a10 -d -l3 --numeric-suffixes=1 --additional-suffix .tsv ${x} split.
        """
        }
    process sort {
        cpus 1
        memory '512 MB'
        
        input:
        path x
        
        output:
        path '*.sorted.tsv'
        
        """
        sort -gk1,1 $x > ${x.baseName}.sorted.tsv
        """
    }
    process merge {
      cpus 1
      memory '512 MB'
     
      publishDir 'out', mode: 'move'
     
      input:
      path x
     
      output:
      path 'merged.tsv'
     
      """
      cat $x > merged.tsv
      """
    }
    nextflow.enable.dsl=2
     
    include { sort } from './sort.nf'
    include { split } from './split.nf'
    include { merge } from './merge.nf'
     
    params.myinput = "test.test"
     
    workflow {
        input_ch = Channel.fromPath(params.myinput)
        split(input_ch)
        sort(split.out.flatten())
        merge(sort.out.collect())
    }
    {
      "fields": [
        {
          "id": "myinput",
          "label": "myinput",
          "type": "data",
          "dataFilter": {
            "dataType": "file",
            "dataFormat": ["TSV"]
          },
          "maxValues": 1,
          "minValues": 1
        }
      ]
    }
    function onSubmit(input) {
        var validationErrors = [];
    
        return {
            'settings': input.settings,
            'validationErrors': validationErrors
        };
    }
    function onRender(input) {
    
        var validationErrors = [];
        var validationWarnings = [];
    
        if (input.currentAnalysisSettings === null) {
            //null first time, to use it in the remainder of he javascript
            input.currentAnalysisSettings = input.analysisSettings;
        }
    
        switch(input.context) {
            case 'Initial': {
                renderInitial(input, validationErrors, validationWarnings);
                break;
            }
            case 'FieldChanged': {
                renderFieldChanged(input, validationErrors, validationWarnings);
                break;
            }
            case 'Edited': {
                renderEdited(input, validationErrors, validationWarnings);
                break;
            }
            default:
                return {};
        }
    
        return {
            'analysisSettings': input.currentAnalysisSettings,
            'settingValues': input.settingValues,
            'validationErrors': validationErrors,
            'validationWarnings': validationWarnings
        };
    }
    
    function renderInitial(input, validationErrors, validationWarnings) {
    }
    
    function renderEdited(input, validationErrors, validationWarnings) {
    }
    
    function renderFieldChanged(input, validationErrors, validationWarnings) {
    }
    
    function findField(input, fieldId){
        var fields = input.currentAnalysisSettings['fields'];
        for (var i = 0; i < fields.length; i++){
            if (fields[i].id === fieldId) {
                return fields[i];
            }
        }
        return null;
    }
    <pipeline code="" version="1.0" xmlns="xsd://www.illumina.com/ica/cp/pipelinedefinition">
        <dataInputs>
        </dataInputs>
        <steps>
        </steps>
    </pipeline>
            <pd:dataInput code="in" format="TXT, CSV, FASTA" type="FILE" required="true" multiValue="false">
                <pd:label>Input file</pd:label>
                <pd:description>Input file can be either in TXT, CSV or FASTA format.</pd:description>
            </pd:dataInput>
        <pd:dataInput code="fastq_folder" format="UNKNOWN" type="DIRECTORY" required="false" multiValue="false">
             <pd:label>fastq folder path</pd:label>
            <pd:description>Providing Fastq folder</pd:description>
        </pd:dataInput>
    <pd:dataInput code="tumor_fastqs" format="FASTQ" type="FILE" required="false" multiValue="true">
        <pd:label>Tumor FASTQs</pd:label>
        <pd:description>Tumor FASTQ files to be provided as input. FASTQ files must have "_LXXX" in its filename to denote the lane and "_RX" to denote the read number. If either is omitted, lane 1 and read 1 will be used in the FASTQ list. The tool will automatically write a FASTQ list from all files provided and process each sample in batch in tumor-only mode. However, for tumor-normal mode, only one sample each can be provided.
        </pd:description>
    </pd:dataInput>
        <pd:steps>
            <pd:step execution="MANDATORY" code="General">
                <pd:label>General</pd:label>
                <pd:description>General parameters</pd:description>
                <pd:tool code="generalparameters">
                    <pd:label>generalparameters</pd:label>
                    <pd:description></pd:description>
                    <pd:parameter code="inp1" minValues="1" maxValues="3" classification="USER">
                        <pd:label>inp1</pd:label>
                        <pd:description>first</pd:description>
                        <pd:stringType/>
                        <pd:value></pd:value>
                    </pd:parameter>
                </pd:tool>
            </pd:step>
        </pd:steps>
    <pd:parameter code="ht_seed_len" minValues="0" maxValues="1" classification="USER">
        <pd:label>Seed Length</pd:label>
        <pd:description>Initial length in nucleotides of seeds from the reference genome to populate into the hash table. Consult the DRAGEN manual for recommended lengths. Corresponds to DRAGEN argument --ht-seed-len.
        </pd:description>
        <pd:integerType minimumValue="10" maximumValue="50"/>
        <pd:value>21</pd:value>
    </pd:parameter>
    <pd:parameter code="cnv_segmentation_mode" minValues="0" maxValues="1" classification="USER">
        <pd:label>Segmentation Algorithm</pd:label>
        <pd:description> DRAGEN implements multiple segmentation algorithms, including the following algorithms, Circular Binary Segmentation (CBS) and Shifting Level Models (SLM).
        </pd:description>
        <pd:optionsType>
            <pd:option>CBS</pd:option>
            <pd:option>SLM</pd:option>
            <pd:option>HSLM</pd:option>
            <pd:option>ASLM</pd:option>
        </pd:optionsType>
        <pd:value>false</pd:value>
    </pd:parameter>
    <pd:parameter code="output_format" minValues="1" maxValues="1" classification="USER">
        <pd:label>Map/Align Output</pd:label>
        <pd:description></pd:description>
        <pd:optionsType>
            <pd:option>BAM</pd:option>
            <pd:option>CRAM</pd:option>
        </pd:optionsType>
        <pd:value>BAM</pd:value>
    </pd:parameter>
    <pd:parameter code="output_file_prefix" minValues="1" maxValues="1" classification="USER">
        <pd:label>Output File Prefix</pd:label>
        <pd:description></pd:description>
        <pd:stringType/>
        <pd:value>tumor</pd:value>
    </pd:parameter>
    <pd:parameter code="quick_qc" minValues="0" maxValues="1" classification="USER">
        <pd:label>quick_qc</pd:label>
        <pd:description></pd:description>
        <pd:booleanType/>
        <pd:value></pd:value>
    </pd:parameter>
    nextflow.enable.dsl = 2
    
    // Define parameters with default values
    params.file = false
    params.str = false
    
    // Check that at least one of the parameters is specified
    if (!params.file && !params.str) {
        error "You must specify at least one input: --file or --str"
    }
    
    process printInputs {
        
        container 'public.ecr.aws/lts/ubuntu:22.04'
        pod annotation: 'scheduler.illumina.com/presetSize', value: 'standard-small'
    
        input:
        file(input_file)
    
        script:
        """
        echo "File contents:"
        cat $input_file
        """
    }
    
    process printInputs2 {
    
        container 'public.ecr.aws/lts/ubuntu:22.04'
        pod annotation: 'scheduler.illumina.com/presetSize', value: 'standard-small'
    
        input:
        val(input_str)
    
        script:
        """
        echo "String input: $input_str"
        """
    }
    
    workflow {
        if (params.file) {
            file_ch = Channel.fromPath(params.file)
            file_ch.view()
            str_ch = Channel.empty()
            printInputs(file_ch)
        }
        else {
            file_ch = Channel.empty()
            str_ch = Channel.of(params.str)
            str_ch.view()
            file_ch.view()
            printInputs2(str_ch)
        } 
    }
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <pd:pipeline xmlns:pd="xsd://www.illumina.com/ica/cp/pipelinedefinition" code="" version="1.0">
        <pd:dataInputs>
            <pd:dataInput code="file" format="TXT" type="FILE" required="false" multiValue="false">
                <pd:label>in</pd:label>
                <pd:description>Generic file input</pd:description>
            </pd:dataInput>
        </pd:dataInputs>
        <pd:steps>
            <pd:step execution="MANDATORY" code="general">
                <pd:label>General Options</pd:label>
                <pd:description locked="false"></pd:description>
                <pd:tool code="general">
                    <pd:label locked="false"></pd:label>
                    <pd:description locked="false"></pd:description>
                    <pd:parameter code="str" minValues="0" maxValues="1" classification="USER">
                        <pd:label>String</pd:label>
                        <pd:description></pd:description>
                        <pd:stringType/>
                        <pd:value>string</pd:value>
                    </pd:parameter>
                </pd:tool>
            </pd:step>
        </pd:steps>
    </pd:pipeline>

    Nextflow

ICA supports running pipelines defined using Nextflow. See this tutorial for an example.

    In order to run Nextflow pipelines, the following process-level attributes within the Nextflow definition must be considered.

    System Information

Warning: Version 20.10 on Illumina Connected Analytics will be obsoleted on April 22nd, 2026. After this date, all existing pipelines using Nextflow v20.10 will no longer run. See the Planned Obsolescence Notice.

The following table shows the status of each Nextflow version:

    • default (⭐) This version will be proposed when creating a new Nextflow pipeline.

    • supported (✅) This version can be selected when you do not want the default Nextflow version.

    • deprecated (⚠️) This version can not be selected for new pipelines, but pipelines using this version will still work.

    • removed (❌). This version can not be selected when creating new pipelines and pipelines using this version will no longer work.

    The switchover happens in the January release of that year.

    Nextflow Version

    You can select the Nextflow version while building a pipeline as follows:

    Compute Type

    To specify a compute type for a Nextflow process, you can either define the cpu and memory (recommended) or use the compute type predefined sizes (required for specific hardware such as FPGA2).

Note: Do not mix these definition methods within the same process; use either one or the other method.

    CPU and Memory

Specify the task resources using Nextflow directives in the workflow script (.nf) or in the configuration file (nextflow.config): cpus defines the number of CPU cores allocated to the process, and memory defines the amount of RAM that will be allocated.

A process file example and a configuration file example are shown in the labeled code blocks at the end of this page.

    ICA will convert the required resources to the correct predefined size. This enables porting public Nextflow pipelines without configuration changes.
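
A minimal process sketch (hypothetical process name) showing how a cpus/memory request maps to a preset: 6 CPUs and 10 GB cannot be satisfied by standard-medium (4 CPUs / 16 GiB), so ICA would select standard-large (8 CPUs / 32 GiB) as the best fit.

process example_align {
    // Requesting 6 CPUs and 10 GB of memory; best fit in the Compute Types table is standard-large
    cpus 6
    memory '10 GB'

    script:
    """
    echo "resource request example"
    """
}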

    Predefined Sizes

To use the predefined sizes, use the pod directive within each process. Set the annotation to scheduler.illumina.com/presetSize and the value to the desired compute type. The default compute type, when this directive is not specified, is standard-small (2 CPUs and 8 GB of memory).

    For example, if you want to use FPGA 2 medium, you need to add the line below
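
A minimal sketch (hypothetical process name) of where that line goes inside a process:

process dragen_task {
    // Request the fpga2-medium preset for this task
    pod annotation: 'scheduler.illumina.com/presetSize', value: 'fpga2-medium'

    script:
    """
    echo "running on an fpga2-medium node"
    """
}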

Note: Often, there is a need to select the compute size for a process dynamically based on user input and other factors. The Kubernetes executor used on ICA does not use the cpu and memory directives, so instead you can dynamically set the pod directive, as mentioned here (see the dynamic compute size example at the end of this page).

It can also be specified in the configuration file; see the configuration file example for predefined sizes at the end of this page.

    Standard vs Economy

    Concept

    For each compute type, you can choose between the

  • scheduler.illumina.com/lifecycle: standard - AWS on-demand (default) or

  • scheduler.illumina.com/lifecycle: economy - AWS spot instance tiers.

On-Demand Instance

  • Pricing: Fixed per second with 60-second minimum.

  • Availability: Guaranteed capacity with full control of starting, stopping, and terminating.

  • Best for: Ideal for critical workloads and urgent scaling needs.

Spot Instance

  • Pricing: Cheaper than On-Demand.

  • Availability: Not guaranteed. Depends on unused AWS capacity. Can be terminated and reclaimed by AWS with 2 minutes notice when the capacity is needed for other processes.

  • Best for: Best for cost optimization and non-critical workloads, as interruptions can occur at any time.

    Configuration

You can switch to economy in the process itself with the pod directive or in the nextflow.config file. See the economy tier process example and nextflow.config example at the end of this page.

    Inputs

    Inputs are specified via the JSON-based input form or XML input form. The specified code in the XML will correspond to the field in the params object that is available in the workflow. Refer to the tutorial for an example.

    Outputs

Outputs for Nextflow pipelines are uploaded from the out folder in the attached shared filesystem. The publishDir directive can be used to symlink (recommended), copy, or move data to the correct folder. Symlinking is faster and does not increase storage cost, as it creates a file pointer instead of copying or moving data. Data will be uploaded to the ICA project after the pipeline execution completes.
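
A minimal process sketch (hypothetical process name) publishing its result to the out folder via a symlink:

process gather_outputs {
    // Symlink the result into 'out'; ICA uploads it after the run completes
    publishDir 'out', mode: 'symlink'

    output:
    path 'result.txt'

    script:
    """
    echo "done" > result.txt
    """
}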

Nextflow version 20.10.10 (Deprecated)

    Version 20.10 will be obsoleted on April 22nd, 2026. After this date, all existing pipelines using Nextflow v20.10 will no longer be able to run.

    For Nextflow version 20.10.10 on ICA, using the "copy" method in the publishDir directive for uploading output files that consume large amounts of storage may cause workflow runs to complete with missing files. The underlying issue is that file uploads may silently fail (without any error messages) during the publishDir process due to insufficient disk space, resulting in incomplete output delivery.

    Solutions:

1. Use "symlink" instead of "copy" in the publishDir directive. Symlinking creates a link to the original file rather than copying it, which does not consume additional disk space. This can prevent the issue of silent file upload failures due to disk space limitations.

2. Use Nextflow 22.04 or later and enable the "failOnError" publishDir option. This option ensures that the workflow will fail and provide an error message if there is an issue with publishing files, rather than completing silently without all expected outputs.

    Nextflow Configuration

    During execution, the Nextflow pipeline runner determines the environment settings based on values passed via the command-line or via a configuration file (see Nextflow Configuration documentationarrow-up-right). When creating a Nextflow pipeline, use the nextflow.config tab in the UI (or API) to specify a nextflow configuration file to be used when launching the pipeline.

    Syntax highlighting is determined by the file type, but you can select alternative syntax highlighting with the drop-down selection list.

    circle-info

    If no Docker image is specified, Ubuntu will be used as default.

The following configuration settings will be ignored if provided as they are overridden by the system:

• executor.name

• executor.queueSize

• k8s.namespace

• k8s.serviceAccount

• k8s.launchDir

• k8s.projectDir

• k8s.workDir

• k8s.storageClaimName

• k8s.storageMountPath

• trace.enabled

• trace.file

• trace.fields

• timeline.enabled

• timeline.file

• report.enabled

• report.file

• dag.enabled

• dag.file

    hashtag
    Best Practices

    hashtag
    Process Time

Setting a timeout of between 2 and 4 times the expected processing time with the timearrow-up-right directive on processes or tasks ensures that no stuck processes remain running indefinitely. Stuck processes keep incurring costs for the occupied resources, so if a process cannot complete within that timespan, it is safer and more economical to end the process and retry.
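A minimal sketch of this practice (the process name, the 4-hour limit and the retry count are assumptions to be tuned to your own expected runtimes):

```
process ALIGN_READS {
    // end the task if it runs longer than ~2-4x the expected processing time
    time '4h'
    // retry a stuck or failed task a limited number of times before giving up
    errorStrategy 'retry'
    maxRetries 2

    script:
    """
    your_command_here
    """
}
```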

    hashtag
    Sample Sheet File Ingestion

When you want to use a sample sheet with references to files as Nextflow input, add an extra input to the pipeline. This extra input lets the user select the files mentioned in the sample sheet from their project. At run time, those files are staged in the working directory, and when Nextflow parses the sample sheet and looks for those files without paths, it will find them there. You cannot use file paths in a sample sheet without selecting the files in the input form, because files are only passed as file/folder ids in the API payload when the analysis is launched.

You can include public data such as http urls because Nextflow is able to download those. Nextflow is also able to download publicly accessible S3 urls (s3://...). You cannot use Illumina's urn:ilmn:ica:region:... structure.
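A minimal sketch of this pattern, assuming a hypothetical sample sheet with columns named sample and fastq, where fastq contains bare filenames matching the files the user selected through the extra input:

```
// Parse the sample sheet and resolve the bare filenames against the working
// directory, where ICA stages the files selected in the extra pipeline input.
Channel
    .fromPath(params.samplesheet)
    .splitCsv(header: true)
    .map { row -> tuple(row.sample, file(row.fastq)) }
    .set { samples_ch }
```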


    Analyses

    An Analysis is the execution of a pipeline.

    hashtag
    Starting Analyses

You can start an analysis either from the dedicated Analyses screen or from the pipeline itself.

    hashtag
    From Analyses

    1. Navigate to Projects > Your_Project > Flow > Analyses.

    2. Select Start.

3. Select a single Pipeline.

4. Configure the analysis settings.

5. Select Start Analysis.

6. Refresh to see the analysis status. See lifecycle for more information on statuses.

7. If for some reason you want to end the analysis before it can complete, select Projects > Your_Project > Flow > Analyses > Manage > Abort. Refresh to see the status update.

    hashtag
    From Pipelines or Pipeline details

    1. Navigate to Projects > <Your_Project> > Flow > Pipelines

    2. Select the pipeline you want to run or open the pipeline details of the pipeline which you want to run.

3. Select Start Analysis.

4. Configure analysis settings.

5. Select Start Analysis.

6. View the analysis status on the Analyses page. See lifecycle for more information on statuses.

7. If for some reason you want to end the analysis before it can complete, select Manage > Abort on the Analyses page.

    hashtag
    Aborting Analyses

    You can abort a running analysis from either the analysis overview (Projects > your_project > Flow > Analyses > your_analysis > Manage > Abort) or from the analysis details (Projects > your_project > Flow > Analyses > your_analysis > Details tab > Abort).

    hashtag
    Rerunning Analyses

    Once an analysis has been executed, you can rerun it with the same settings or choose to modify the parameters when rerunning. Modifying the parameters is possible on a per-analysis basis. When selecting multiple analyses at once, they will be executed with the original parameters. Draft pipelines are subject to updates and thus can result in a different outcome when rerunning. ICA will display a warning message to inform you of this when you try to rerun an analysis based on a draft pipeline.

    circle-info

    When rerunning an analysis, the user reference will be the original user reference (up to 231 characters), followed by _rerun_yyyy-MM-dd_HHmmss.

When there is an XML configuration change on a pipeline for which you want to rerun an analysis, ICA will display a warning and will not fill out the parameters, as it cannot guarantee their validity for the new XML. You will need to provide the input data and settings again to rerun the analysis.

Some restrictions apply when trying to rerun an analysis.

| Analyses | Rerun | Rerun with modified parameters |
| --- | --- | --- |
| Analyses with draft pipeline | Warn | Warn |
| Analyses with XML configuration change | Warn | Warn |
| Analyses using external data | Allowed | - |
| Analyses using mount paths on input data | Allowed | - |
| Analyses using user-provided input json | Allowed | - |
| Analyses using advanced output mappings | - | - |

To rerun one or more analyses with the same settings:

    1. Navigate to Projects > Your_Project > Flow > Analyses.

    2. In the overview screen, select one or more analyses.

    3. Select Manage > Rerun. The analyses will now be executed with the same parameters as their original run.

    To rerun a single analysis with modified parameters:

    1. Navigate to Projects > Your_Project > Flow > Analyses.

    2. In the overview screen, open the details of the analysis you want to rerun by clicking on the analysis user reference.

3. Select Rerun (at the top right).

4. Update the parameters you want to change.

5. Select Start Analysis. The analysis will now be executed with the updated parameters.

    hashtag
    Lifecycle

| Status | Description | Final State |
| --- | --- | --- |
| Requested | The request to start the Analysis is being processed | No |
| Queued | Analysis has been queued | No |
| Initializing | Initializing environment and performing validations for Analysis | No |
| Preparing Inputs | Downloading inputs for Analysis | No |
| In Progress | Analysis execution is in progress | No |
| Generating outputs | Transferring the Analysis results | No |
| Aborting | Analysis has been requested to be aborted | No |
| Aborted | Analysis has been aborted | Yes |
| Failed | Analysis has finished with error | Yes |
| Succeeded | Analysis has finished with success | Yes |

    When an analysis is started, the availability of resources may impact the start time of the pipeline or specific steps after execution has started. Analyses are subject to delay when the system is under high load and the availability of resources is limited.

    During analysis start, ICA runs a verification on the input files to see if they are available. When it encounters files that have not completed their upload or transfer, it will report "Data found for parameter [parameter_name], but status is Partial instead of Available". Wait for the file to be available and restart the analysis.

    circle-info

    When the underlying storage provider runs out of storage resources, the Status field of the Analysis details will indicate this. There is no need to abort or rerun the analysis.

    hashtag
    Analysis steps logs

During the execution of an analysis, logs are produced for each process involved in the analysis lifecycle. In the analysis details view, the Steps tab is used to view the steps in near real time as they're produced by the running processes. A grid layout is used for analyses with more than 50 steps and a tiled view for analyses with 50 steps or less, though you can also use the grid layout for those by means of the tile/grid button at the top right of the analysis log tab. The Steps tab also shows which compute type was used in the different main analysis steps. (For child steps, these are displayed on the parent step.)

There are system processes involved in the lifecycle of all analyses (i.e., downloading inputs, uploading outputs, etc.) and there are processes which are pipeline-specific, such as the processes which execute the pipeline steps. The table below describes the system processes. You can choose to display or hide these system processes with the Show technical steps option.

| Process | Description |
| --- | --- |
| Setup Environment | Validate analysis execution environment is prepared |
| Run Monitor | Monitor resource usage for billing and reporting |
| Prepare Input Data | Download and mount input data to the shared file system |
| Pipeline Runner | Parent process to execute the pipeline definition |
| Finalize Output Data | Upload Output Data |

    Additional log entries will show for the processes which execute the steps defined in the pipeline.

    Each process shows as a distinct entry in the steps view with a Queue Date, Start Date, and End Date.

| Timestamp | Description |
| --- | --- |
| Queue Date | The time when the process is submitted to the process scheduler for execution |
| Start Date | The time when the process has started execution |
| End Date | The time when the process has stopped execution |

    The time between the Start Date and the End Date is used to calculate the duration. The time of the duration is used to calculate the usage-based cost for the analysis. Because this is an active calculation, sorting on this field is not supported.

    Each log entry in the Steps view contains a checkbox to view the stdout and stderr log files for the process. Clicking a checkbox adds the log as a tab to the log viewer where the log text is displayed and made available for download.

    hashtag
    Analysis Cost

    To see the price of an analysis in iCredits, look at Projects > your_project > Flow > Analyses > your_analysis > Details tab. The pricing section will show you the entitlement bundle, storage detail and price in iCredits once the analysis has succeeded, failed or been aborted.

    hashtag
    Log Files

By default, the stdout and stderr files are located in the ica_logs subfolder within the analysis. This location can be changed by selecting a different logs folder in the current project at the start of the analysis. Do not use a folder which already contains log files, as these will be overwritten. To set the log file location, you can also use the CreateAnalysisLogs section of the Create Analysis endpointsarrow-up-right.

    circle-exclamation

    If you delete these files, no log information will be available on the analysis details > Steps tab.

You can access the log files from the analysis details (Projects > your_project > Flow > Analyses > your_analysis > Details tab).

    hashtag
    Log Streaming

    Logs can also be streamed using websocket client tooling. The API to retrieve analysis step details returns websocket URLs for each step to stream the logs from stdout/stderr during the step's execution. Upon completion, the websocket URL is no longer available.
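A minimal sketch of a websocket log consumer, assuming you already retrieved a step's stdout websocket URL from the analysis step details API and use the Node.js ws package (both the environment variable and the package choice are assumptions):

```javascript
// Stream a running step's stdout from the websocket URL returned by the API.
const WebSocket = require('ws');

const stdoutUrl = process.env.STEP_STDOUT_WEBSOCKET_URL; // taken from the step details response
const socket = new WebSocket(stdoutUrl);

socket.on('message', (data) => process.stdout.write(data.toString()));
socket.on('close', () => console.error('log stream closed'));
socket.on('error', (err) => console.error('stream error:', err.message));
```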

    hashtag
    Analysis Output Mappings

    circle-exclamation

    Currently, only FOLDER type output mappings are supported

    By default, analysis outputs are directed to a new folder within the project where the analysis is launched. Analysis output mappings may be specified to redirect outputs to user-specified locations consisting of project and path. An output mapping consists of:

    • the source path on the local disk of the analysis execution environment, relative to the working folder.

    • the data type, either FILE or FOLDER

• the target project ID to direct outputs to; analysis launcher must have contributor access to the project.

• the target path relative to the root of the project data to write the outputs.

    circle-exclamation

If the output folder already exists, any existing contents with the same filenames as those output from the pipeline will be overwritten by the new analysis.

    chevron-rightExamplehashtag

In this example, 2 analysis output mappings are specified. The analysis writes data during execution in the working directory at paths out/test1 and out/test2. The data contained in these folders is directed to the project with ID 4d350d0f-88d8-4640-886d-5b8a23de7d81 at paths /output-testing-01/ and /output-testing-02/ respectively, relative to the root of the project data.

The JSON example shown after the Troubleshooting section below demonstrates the construction of the request body to start an analysis with the output mappings described above. When the analysis completes, the outputs can be seen in the ICA UI within the folders designated in the payload JSON during pipeline launch (output-testing-01 and output-testing-02).

    You can jump from the Analysis Details to the individual files and folders by opening the output files tab on the detail view (Projects > your_project > Flow > Analyses > your_analysis > Output files tab > your_output_file) and selecting Open in data.

    circle-info

    The Output files section of the analyses will always show the generated outputs, even when they have since been deleted from storage. This is done so you can always see which files were generated during the analysis. In this case it will no longer be possible to navigate to the actual output files.

| Analysis output | Logs output | Notes |
| --- | --- | --- |
| Default | Default | Logs are a subfolder of the analysis output. |
| Mapped | Default | Logs are a subfolder of the analysis output. |
| Default | Mapped | Outputs and logs may be separated. |
| Mapped | Mapped | Outputs and logs may be separated. |

    hashtag
    Tags

    You can add and remove tags from your analyses.

    1. Navigate to Projects > Your_Project > Flow > Analyses.

    2. Select the analyses whose tags you want to change.

3. Select Manage > Manage tags.

4. Edit the user tags, reference data tags (if applicable) and technical tags.

5. Select Save to confirm the changes.

Both system tags and custom tags exist. User tags are custom tags which you set to help identify and process information, while technical tags are set by the system for processing. Both run-in and run-out tags are set on data to identify which analyses use the data. Connector tags determine data entry methods and reference data tags identify where data is used as reference data.

    hashtag
    Hyperlinking

    If you want to share a link to an analysis, you can copy and paste the URL from your browser when you have the analysis open. The syntax of the analysis link will be <hostURL>/ica/link/project/<projectUUID>/analysis/<analysisUUID>. Likewise, workflow sessions will use the syntax <hostURL>/ica/link/project/<projectUUID>/workflowSession/<workflowsessionUUID>. To prevent third parties from accessing data via the link when it is shared or forwarded, ICA will verify the access rights of every user when they open the link.

    hashtag
    Restrictions

    Input for analysis is limited to a total of 50,000 files (including multiple copies of the same file). Concurrency limits on analyses prevent resource hogging which could result in resource starvation for other tenants. Additional analyses will be queued and scheduled when currently running analyses complete and free up positions. The theoretical limit is 20, but this can be less in practice, depending on a number of external factors.

    hashtag
    Troubleshooting

When your analysis fails, open the analysis details view (Projects > your_project > Flow > Analyses > your_analysis) and select Display failed steps. This gives you the Steps view filtered on those steps that had non-zero exit codes. If there is only one failed step which has log files, the stderr of that step will be displayed.

    circle-info

For pipeline developers: add automatic retrying to the individual steps that fail with error 55 / 56 (provided these steps are idempotent). See tips and tricks for retries; a minimal sketch follows the list below.

    • Exit code 55 indicates analysis failure on economy instances due to an external event such as spot termination. You can retry the analysis.

• Exit code 56 indicates analysis failure due to pod disruption and deletion by Kubernetes' Pod Garbage Collector (PodGC) because the node it was running on no longer exists. You can retry the analysis.
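A minimal sketch of such a retry for an idempotent step (the retry count is an assumption):

```
process foo {
    // retry only on the spot-termination / pod-disruption exit codes described above
    errorStrategy { task.exitStatus in [55, 56] ? 'retry' : 'terminate' }
    maxRetries 3

    script:
    """
    your_command_here
    """
}
```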


The example request body for the Analysis Output Mappings example above:

```json
    {
    ...
        "analysisOutput":
        [
            {
                "sourcePath": "out/test1",
                "type": "FOLDER",
                "targetProjectId": "4d350d0f-88d8-4640-886d-5b8a23de7d81",
                "targetPath": "/output-testing-01/"
            },
            {
                "sourcePath": "out/test2",
                "type": "FOLDER",
                "targetProjectId": "4d350d0f-88d8-4640-886d-5b8a23de7d81",
                "targetPath": "/output-testing-02/"
            }
        ]
    }
    ```

    JSON Schema

In the InputForm.json, use the syntax for the individual components you want, as listed below. This is a listing of all currently available components; it is not meant to be used as-is, but adapted to the inputs your input form needs. For more information on JSON schema syntax, please see the json-schema websitearrow-up-right.

    {
      "$id": "#ica-pipeline-input-form",
      "$schema": "http://json-schema.org/draft-07/schema#",
      "title": "ICA Pipeline Input Forms",
      "description": "Describes the syntax for defining input setting forms for ICA pipelines",
      "type": "object",
      "additionalProperties": false,
      "properties": {
        "fields": {
          "description": "The list of setting fields",
          "type": "array",
          "items": {
            "$ref": "#/definitions/ica_pipeline_input_form_field"
          }
        }
      },
      "required": [
        "fields"
      ],
      "definitions": {
        "ica_pipeline_input_form_field": {
          "$id": "#ica_pipeline_input_form_field",
          "type": "object",
          "additionalProperties": false,
          "properties": {
            "id": {
              "description": "The unique identifier for this field. Will be available with this key to the pipeline script.",
              "type": "string",
              "pattern": "^[a-zA-Z-0-9\\-_\\.\\s\\+\\[\\]]+$"
            },
            "type": {
              "type": "string",
              "enum": [
                "textbox",
                "checkbox",
                "radio",
                "select",
                "number",
                "integer",
                "data",
                "section",
                "text",
                "fieldgroup"
              ]
            },
            "label": {
              "type": "string"
            },
            "minValues": {
              "description": "The minimal amount of values that needs to be present. Default is 0 when not provided. Set to >=1 to make the field required.",
              "type": "integer",
              "minimum": 0
            },
            "maxValues": {
              "description": "The maximal amount of values that needs to be present. Default is 1 when not provided.",
              "type": "integer",
              "exclusiveMinimum": 0
            },
            "minMaxValuesMessage": {
              "description": "The error message displayed when minValues or maxValues is not adhered to. When not provided a default message is generated.",
              "type": "string"
            },
            "helpText": {
              "type": "string"
            },
            "placeHolderText": {
              "description": "An optional short hint (a word or short phrase) to aid the user when the field has no value.",
              "type": "string"
            },
            "value": {
             "description": "The value for the field. Can be an array for multi-value fields. For 'number' type values the exponent needs to be between -300 and +300 and max precision is 15. For 'integer' type values the value needs to between -100000000000000000 and 100000000000000000"
             },
            "minLength": {
              "type": "integer",
              "minimum": 0
            },
            "maxLength": {
              "type": "integer",
              "exclusiveMinimum": 0
            },
            "min": {
              "description": "Minimal allowed value for 'integer' and 'number' type. Exponent needs to be between -300 and +300 and max precision is 15.",
              "type": "number"
            },
            "max": {
              "description": "Maximal allowed value for 'integer' and 'number' type. Exponent needs to be between -300 and +300 and max precision is 15.",
              "type": "number"
            },
            "choices": {
              "type": "array",
              "items": {
                "$ref": "#/definitions/ica_pipeline_input_form_field_choice"
              }
            },
            "fields": {
              "description": "The list of setting sub fields for type fieldgroup",
              "type": "array",
              "items": {
                "$ref": "#/definitions/ica_pipeline_input_form_field"
              }
            },
            "dataFilter": {
              "description": "For defining the filtering when type is 'data'.",
              "type": "object",
              "additionalProperties": false,
              "properties": {
                "nameFilter": {
                  "description": "Optional data filename filter pattern that input files need to adhere to when type is 'data'. Eg parts of the expected filename",
                  "type": "string"
                },
                "dataFormat": {
                  "description": "Optional dataformat name array that input files need to adhere to when type is 'data'",
                  "type": "array",
                  "contains": {
                    "type": "string"
                  }
                },
                "dataType": {
                  "description": "Optional data type (file or directory) that input files need to adhere to when type is 'data'",
                  "type": "string",
                  "enum": [
                    "file",
                    "directory"
                  ]
                }
              }
            },
            "regex": {
              "type": "string"
            },
            "regexErrorMessage": {
              "type": "string"
            },
            "hidden": {
              "type": "boolean"
            },
            "disabled": {
              "type": "boolean"
            },
            "emptyValuesAllowed": {
              "type": "boolean",
              "description": "When maxValues is greater than 1 and emptyValuesAllowed is true, the values may contain null entries. Default is false."
            },
            "updateRenderOnChange": {
              "type": "boolean",
              "description": "When true, the onRender javascript function is triggered ech time the user changes the value of this field. Default is false."
            },
            "streamable": {
              "type": "boolean",
              "description": "EXPERIMENTAL PARAMETER! Only possible for fields of type 'data'. When true, the data input files will be offered in streaming mode to the pipeline instead of downloading them."
        }
      },
          "required": [
            "id",
            "type"
          ],
          "allOf": [
            {
              "if": {
                "description": "When type is 'textbox' then 'dataFilter', 'fields', 'choices', 'max' and 'min' are not allowed",
                "properties": {
                  "type": {
                    "enum": [
                      "textbox"
                    ]
                  }
                },
                "required": [
                  "type"
                ]
              },
              "then": {
                "propertyNames": {
                  "not": {
                    "enum": [
                      "dataFilter",
                      "fields",
                      "choices",
                      "max",
                      "min"
                    ]
                  }
                }
              }
            },
            {
              "if": {
                "description": "When type is 'checkbox' then 'dataFilter', 'fields', 'choices', 'placeHolderText', 'regex', 'regexErrorMessage', 'maxLength', 'minLength', 'max' and 'min' are not allowed",
                "properties": {
                  "type": {
                    "enum": [
                      "checkbox"
                    ]
                  }
                },
                "required": [
                  "type"
                ]
              },
              "then": {
                "propertyNames": {
                  "not": {
                    "enum": [
                      "dataFilter",
                      "fields",
                      "choices",
                      "placeHolderText",
                      "regex",
                      "regexErrorMessage",
                      "maxLength",
                      "minLength",
                      "max",
                      "min"
                    ]
                  }
                }
              }
            },
            {
              "if": {
                "description": "When type is 'radio' then 'dataFilter', 'fields', 'placeHolderText', 'regex', 'regexErrorMessage', 'maxLength', 'minLength', 'max' and 'min' are not allowed",
                "properties": {
                  "type": {
                    "enum": [
                      "radio"
                    ]
                  }
                },
                "required": [
                  "type"
                ]
              },
              "then": {
                "propertyNames": {
                  "not": {
                    "enum": [
                      "dataFilter",
                      "fields",
                      "placeHolderText",
                      "regex",
                      "regexErrorMessage",
                      "maxLength",
                      "minLength",
                      "max",
                      "min"
                    ]
                  }
                }
              }
            },
            {
              "if": {
                "description": "When type is 'select' then 'dataFilter', 'fields', 'regex', 'regexErrorMessage', 'maxLength', 'minLength', 'max' and 'min' are not allowed",
                "properties": {
                  "type": {
                    "enum": [
                      "select"
                    ]
                  }
                },
                "required": [
                  "type"
                ]
              },
              "then": {
                "propertyNames": {
                  "not": {
                    "enum": [
                      "dataFilter",
                      "fields",
                      "regex",
                      "regexErrorMessage",
                      "maxLength",
                      "minLength",
                      "max",
                      "min"
                    ]
                  }
                }
              }
            },
            {
              "if": {
                "description": "When type is 'number' or 'integer' then 'dataFilter', 'fields', 'choices', 'regex', 'regexErrorMessage', 'maxLength' and 'minLength' are not allowed",
                "properties": {
                  "type": {
                    "enum": [
                      "number",
                      "integer"
                    ]
                  }
                },
                "required": [
                  "type"
                ]
              },
              "then": {
                "propertyNames": {
                  "not": {
                    "enum": [
                      "dataFilter",
                      "fields",
                      "choices",
                      "regex",
                      "regexErrorMessage",
                      "maxLength",
                      "minLength"
                    ]
                  }
                }
              }
            },
            {
              "if": {
                "description": "When type is 'data' then 'dataFilter' is required and 'fields', 'choices', 'placeHolderText', 'regex', 'regexErrorMessage', 'maxLength', 'minLength', 'max' and 'min' are not allowed",
                "properties": {
                  "type": {
                    "enum": [
                      "data"
                    ]
                  }
                },
                "required": [
                  "type"
                ]
              },
              "then": {
                "required": [
                  "dataFilter"
                ],
                "propertyNames": {
                  "not": {
                    "enum": [
                      "fields",
                      "choices",
                      "placeHolderText",
                      "regex",
                      "regexErrorMessage",
                      "max",
                      "min",
                      "maxLength",
                      "minLength"
                    ]
                  }
                }
              }
            },
            {
              "if": {
                "description": "When type is 'section' or 'text' then 'disabled', 'fields', 'updateRenderOnChange', 'classification', 'value', 'minValues', 'maxValues', 'minMaxValuesMessage', 'dataFilter', 'choices', 'placeHolderText', 'regex', 'regexErrorMessage', 'maxLength', 'minLength', 'max' and 'min' are not allowed",
                "properties": {
                  "type": {
                    "enum": [
                      "section",
                      "text"
                    ]
                  }
                },
                "required": [
                  "type"
                ]
              },
              "then": {
                "propertyNames": {
                  "not": {
                    "enum": [
                      "disabled",
                      "fields",
                      "updateRenderOnChange",
                      "classification",
                      "value",
                      "minValues",
                      "maxValues",
                      "minMaxValuesMessage",
                      "dataFilter",
                      "choices",
                      "regex",
                      "placeHolderText",
                      "regexErrorMessage",
                      "maxLength",
                      "minLength",
                      "max",
                      "min"
                    ]
                  }
                }
              }
            },
            {
              "if": {
                "description": "When type is 'fieldgroup' then 'fields' is required and then 'dataFilter', 'choices', 'placeHolderText', 'regex', 'regexErrorMessage', 'maxLength', 'minLength', 'max' and 'min' and 'emptyValuesAllowed' are not allowed",
                "properties": {
                  "type": {
                    "enum": [
                      "fieldgroup"
                    ]
                  }
                },
                "required": [
                  "type",
                  "fields"
                ]
              },
              "then": {
                "propertyNames": {
                  "not": {
                    "enum": [
                      "dataFilter",
                      "choices",
                      "placeHolderText",
                      "regex",
                      "regexErrorMessage",
                      "maxLength",
                      "minLength",
                      "max",
                      "min",
                      "emptyValuesAllowed"
                    ]
                  }
                }
              }
            }
          ]
        },
        "ica_pipeline_input_form_field_choice": {
          "$id": "#ica_pipeline_input_form_field_choice",
          "type": "object",
          "additionalProperties": false,
          "properties": {
            "value": {
            "description": "The value which will be set when selecting this choice. Must be unique over the choices within a field"
            },
            "text": {
              "description": "The display text for this choice, similar as the label of a field. ",
              "type": "string"
            },
            "selected": {
              "description": "Optional. When true, this choice value is picked as default selected value.  As in selected=true has precedence over an eventual set field 'value'. For clarity it's better however not to use 'selected' but use field 'value' as is used to set default values for the other field types.  Only maximum 1 choice may have selected true.",
              "type": "boolean"
            },
            "disabled": {
              "type": "boolean"
            },
            "parent": {
              "description": "Value of the parent choice item. Can be used to build hierarchical choice trees."
            }
          },
          "required": [
            "value",
            "text"
          ]
        }
      }
    }

    JSON-Based input forms

    hashtag
    Introduction

    Pipelines defined using the "Code" mode require an XML or JSON-based input form to define the fields shown on the launch view in the user interface (UI).

    To create a JSON-based Nextflow (or CWL) pipeline, go to Projects > your_project > Flow > Pipelines > +Create > Nextflow (or CWL) > JSON-based.

    Three files, located on the inputform files tab, work together for evaluating and presenting JSON-based input.

    • inputForm.json contains the actual input form which is rendered when starting the pipeline run.

    • onRender.js is triggered when a value is changed.

    • onSubmit.js is triggered when starting a pipeline via the GUI or API.

    Use + Create to add additional files and Simulate to test your inputForms.

Script execution supports cross-field validation of the values, hiding fields, making them required, and similar behavior, based on value changes.


    hashtag
    inputForm.json

The JSON schema that allows you to define the input parameters. See the JSON Schema page for syntax details.

    circle-info

    The inputForm.json file has a size limit of 10 MB and a maximum of 200 fields.
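A minimal sketch of an inputForm.json with one text setting and one data input (the field ids, the FASTQ format name and the name filter are illustrative assumptions):

```json
{
  "fields": [
    {
      "id": "sample_id",
      "type": "textbox",
      "label": "Sample ID",
      "minValues": 1
    },
    {
      "id": "reads",
      "type": "data",
      "label": "Read files",
      "minValues": 1,
      "maxValues": 2,
      "dataFilter": {
        "dataType": "file",
        "dataFormat": ["FASTQ"],
        "nameFilter": "fastq.gz"
      }
    }
  ]
}
```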

    hashtag
    Parameter types

    Type
    Usage

    hashtag
    Parameter Attributes

    These attributes can be used to configure all parameter types.

    Attribute
    Purpose
    chevron-rightTree structure examplehashtag

    "choices" can be used for a single list or for a tree-structured list. See below for an example for how to set up a tree structure.

    hashtag
    Experimental Features

    Feature

    hashtag
    onSubmit.js

The onSubmit.js javascript function receives an input object which holds information about the chosen values of the input form, the pipeline, and the pipeline execution request parameters. This javascript function is not only triggered when submitting a new pipeline execution request in the user interface, but also when submitting one through the REST API.
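As an illustrative, non-authoritative sketch, an onSubmit.js response object rejecting a submission could take this shape, built from the return values documented below (the field id and message are assumptions):

```json
{
  "validationErrors": [
    {
      "fieldId": "normal_sample",
      "message": "A matched normal sample is required for this pipeline."
    }
  ]
}
```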

    hashtag
    Input parameters

    Value
    Meaning

    hashtag
    Return values (taken from the response object)

    Value
    Meaning

    hashtag
    AnalysisError

    This is the object used for representing validation errors.

    Value
    Meaning

    hashtag
    onRender.js

Receives an input object which contains information about the current state of the input form, the chosen values, and the field value change that triggered the onRender call. It also contains pipeline information. Changed objects are present in the onRender return value object; any object not present is considered to be unmodified. Changing the storage size in the start analysis screen triggers an onRender execution with storageSize as the changed field.
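As an illustrative sketch, an onRender.js response that only raises a warning on one field could look like this, using the return values documented below (the field id and message are assumptions):

```json
{
  "validationWarnings": [
    {
      "fieldId": "tumor_sample",
      "index": 0,
      "message": "This file looks unusually small for a tumor sample."
    }
  ]
}
```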

    hashtag
    Input Parameters

    hashtag
    Return values (taken from the response object)

    Value
    Meaning

    hashtag
    RenderMessage

    This is the object used for representing validation errors and warnings. The attributes can be used with first letter lowercase (consistent with the input object attributes) or uppercase.

    Value
    Meaning

data

Data such as files.

    section

    For splitting up fields, to give structure. Rendered as subtitles. No values are to be assigned to these fields.

    text

    To display informational messages. No values are to be assigned to these fields.

    fieldgroup

    Can contain parameters or other groups. Allows to have repeating sets of parameters, for instance when a father|mother|child choice needs to be linked to each file input. So if you want to have the same elements multiple times in your form, combine them into a fieldgroup. Does not support the emptyValuesAllowed attribute.
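A sketch of a possible fieldgroup with repeating sub-fields (all ids and labels are hypothetical):

```json
{
  "id": "family_member",
  "type": "fieldgroup",
  "label": "Family member",
  "maxValues": 3,
  "fields": [
    {
      "id": "relation",
      "type": "select",
      "label": "Relation",
      "choices": [
        { "value": "father", "text": "Father" },
        { "value": "mother", "text": "Mother" },
        { "value": "child", "text": "Child" }
      ]
    },
    {
      "id": "sample",
      "type": "data",
      "label": "Sample file",
      "dataFilter": { "dataType": "file" }
    }
  ]
}
```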

    value

    The value of the parameter. Can be considered default value.

    minLength

    Only applied on type="textbox". Value is a positive integer.

    maxLength

    Only applied on type="textbox". Value is a positive integer.

    min

    Minimal allowed value for 'integer' and 'number' type.

    • for 'integer' type fields the minimal and maximal values are -100000000000000000 and 100000000000000000.

    • for 'number' type fields the max precision is 15 significant digits and the exponent needs to be between -300 and +300.

    max

    Maximal allowed value for 'integer' and 'number' type.

    • for 'integer' type fields the minimal and maximal values are -100000000000000000 and 100000000000000000.

    • for 'number' type fields the max precision is 15 significant digits and the exponent needs to be between -300 and +300.

    choices

A list of choices, each with a "value", "text" (the label), "selected" (only one true supported), and "disabled". "parent" can be used to build hierarchical choice trees. "availableWhen" can be used for conditional presence of the choice based on the values of other fields. Parent and value must be unique; you cannot use the same value for both.

    fields

    The list of sub fields for type fieldgroup.

    dataFilter

    For defining the filtering when type is 'data'. Use nameFilter for matching the name of the file, dataFormat for file format and dataType for selecting between files and directories. Tip: To see the data formats, open the file details in ICA and look at the Format on the data details. You can expand the dropdown list to see the syntax.

    regex

    The regex pattern the value must adhere to. Only applied on type="textbox".

    regexErrorMessage

    The optional error message when the value does not adhere to the "regex". A default message will be used if this parameter is not present. It is highly recommended to set this as the default message will show the regex which is typically very technical.
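For example, a sketch of a textbox constrained by a regex (the id, pattern and message are illustrative):

```json
{
  "id": "run_name",
  "type": "textbox",
  "label": "Run name",
  "regex": "^[A-Za-z0-9_-]+$",
  "regexErrorMessage": "Use only letters, digits, '-' or '_'."
}
```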

    hidden

    Makes this parameter hidden. Can be made visible later in onRender.js or can be used to set hardcoded values of which the user should be aware.

    disabled

    Shows the parameter but makes editing it impossible. The value can still be altered by onRender.js.

    emptyValuesAllowed

    When maxValues is 1 or not set and emptyValuesAllowed is true, the values may contain null entries. Default is false.

    updateRenderOnChange

    When true, the onRender javascript function is triggered each time the user changes the value of this field. Default is false.

    dropValueWhenDisabled

    When this is present and true and the field has disabled being true, then the value will be omitted during the submit handling (on the onSubmit result).

analysisSettings

The input form json as saved in the pipeline, i.e., the original json without any changes applied.

    currentAnalysisSettings

The current input form JSON as rendered to the user. This can contain already applied changes from earlier onRender passes. Null in the first call, when context is 'Initial' or when the analysis is created through the CLI/API.

storageSize

The storage size as chosen by the user. This will initially be null. StorageSize is an object containing an 'id' and 'name' property.

    storageSizeOptions

    The list of storage sizes available to the user when creating an analysis. Is a list of StorageSize objects containing an 'id' and 'name' property.

    textbox

    Corresponds to stringType in xml.

    checkbox

    A checkbox that supports the option of being required, so can serve as an active consent feature. (corresponds to the booleanType in xml).

    radio

    A radio button group to select one from a list of choices. The values to choose from must be unique.

    select

    A dropdown selection to select one from a list of choices. This can be used for both single-level lists and tree-based lists.

    number

    The value is of Number type in javascript and Double type in java. (corresponds to doubleType in xml).

    integer

    Corresponds to java Integer.

    label

    The display label for this parameter. Optional but recommended, id will be used if missing.

    minValues

    The minimal amount of values that needs to be present. Default when not set is 0. Set to >=1 to make the field required.

    maxValues

    The maximal amount of values that need to be present. Default when not set is 1.

    minMaxValuesMessage

    The error message displayed when minValues or maxValues is not adhered to. When not set, a default message is generated.

    helpText

    A helper text about the parameter. Will be displayed in smaller font with the parameter.

    placeHolderText

    An optional short hint ( a word or short phrase) to aid the user when the field has no value.

    Streamable inputs

    Adding "streamable":true to an input field of type "data" makes it a streamable input.

    settings

    The value of the setting fields. Corresponds to settingValues in the onRender.js. This is a map with field id as key and an array of field values as value. For convenience, values of single-value fields are present as the individual value and not as an array of length 1. In case of fieldGroups, the value can be multiple levels of arrays. For fields of type data the values in the json are data ids (fil.xxxx). To help with validation, these are expanded and made available as an object here containing the id, name, path, format, size and a boolean indicating whether the data is external. This info can be used to validate or pick the chosen storageSize.

    settingValues

    To maximize the opportunity for reusing code between onRender and onSubmit, the 'settings' are also exposed as settingValues like in the onRender input.

    pipeline

    Info about the pipeline: code, tenant, version, and description are all available in the pipeline object as string.

    analysis

    Info about this run: userReference, userName, and userTenant are all available in the analysis object as string.

    storageSize

    The storage size as chosen by the user. This will initially be null. StorageSize is an object containing an 'id' and 'name' property.

    storageSizeOptions

    The list of storage sizes available to the user when creating an analysis. Is a list of StorageSize objects containing an 'id' and 'name' property.

    settings

    The value of the setting fields. This allows modifying the values or applying defaults and such. Or taking info of the pipeline or analysis input object. When settings are not present in the onSubmit return value object, they are assumed to be not modified.

    validationErrors

A list of AnalysisError messages representing validation errors. Submitting a pipeline execution request is not possible while there are still validation errors.

    analysisSettings

    The input form json with potential applied changes. The discovered changes will be applied in the UI when viewing the analysis.

    fieldId / FieldId

The field which has an erroneous value. When not present, a general error/warning is displayed. To display an error on the storage size, use storageSize as the fieldId.

    index / Index

    The 0-starting index of the value which is incorrect. Use this when a particular value of a multivalue field is not correct. When not present, the entire field is marked as erroneous. The value can also be an array of indexes for use with fieldgroups. For instance, when the 3rd field of the 2nd instance of a fieldgroup is erroneous, a value of [ 1 , 2 ] is used.

    message / Message

    The error/warning message to display.

    context

    "Initial"/"FieldChanged"/"Edited".

    • Initial is the value when first displaying the form when a user opens the start run screen.

    • The value is FieldChanged when a field with 'updateRenderOnChange'=true is changed by the user.

    • Edited (Not yet supported in ICA) is used when a form is displayed later again, this is intended for draft runs or when editing the form during reruns.

    changedFieldId

The id of the field that changed and which triggered this onRender call; context will be FieldChanged. When the storage size is changed, the changedFieldId will be storageSize.

    analysisSettings

    The input form json as saved in the pipeline. This is the original json, without changes.

    currentAnalysisSettings

The current input form json as rendered to the user. This can contain already applied changes from earlier onRender passes. Null in the first call, when context is Initial.

    settingValues

    The current value of all settings fields. This is a map with field id as key and an array of field values as value for multivalue fields. For convenience, values of single-value fields are present as the individual value and not as an array of length 1. In case of fieldGroups, the value can be multiple levels of arrays. For fields of type data the values in the json are data ids (fil.xxxx). To help with validation, these are expanded and made available as an object here containing the id, name, path, format, size and a boolean indicating whether the data is external. This info can be used to validate or pick the chosen storageSize.

    pipeline

    Information about the pipeline: code, tenant, version, and description are all available in the pipeline object as string.

    analysis

    Information about this run: userReference, userName, and userTenant are all available in the analysis object as string.

    analysisSettings

    The input form json with potential applied changes. The discovered changes will be applied in the UI.

    settingValues

    The current, potentially altered map of all setting values. These will be updated in the UI.

    validationErrors

    A list of RenderMessages representing validation errors. Submitting a pipeline execution request is not possible while there are still validation errors.

    validationWarnings

    A list of RenderMessages representing validation warnings. A user may choose to ignore these validation warnings and start the pipeline execution request.

    storageSize

    The suitable value for storageSize. Must be one of the options of input.storageSizeOptions. When absent or null, it is ignored.

    validation errors and validation warnings can use 'storageSize' as fieldId to let an error appear on the storage size field. 'storageSize' is the value of the changedFieldId when the user alters the chosen storage size.

    fieldId / FieldId

The field which has an erroneous value. When not present, a general error/warning is displayed. To display an error on the storage size, use storageSize as the fieldId.

    index / Index

    The 0-starting index of the value which is incorrect. Use this when a particular value of a multivalue field is not correct. When not present, the entire field is marked as erroneous. The value can also be an array of indexes for use with fieldgroups. For instance, when the 3rd field of the 2nd instance of a fieldgroup is erroneous, a value of [ 1 , 2 ] is used.

    message / Message

    The error/warning message to display.


    {
      "fields": [
        {
          "id": "myTreeList",
          "type": "select",
          "label": "Selection Tree Example",
          "choices": [
            {
              "text": "trunk",
              "value": "treetrunk"
            },
            {
              "text": "branch",
              "value": "treebranch",
              "parent":"treetrunk"
            },
            {
              "text": "leaf",
              "value": "treeleaf",
              "parent":"treebranch"
            },
            {
              "text": "bird",
              "value": "happybird",
              "parent":"treebranch"
            },
            {
              "text": "cat",
              "value": "cat",
              "parent": "treetrunk",
              "disabled": true
            }
          ],
          "minValues": 1,
          "maxValues": 3,
          "helpText": "This is a tree example"
        }
      ]
    }