Data Upload from User Storage (Connected Insights - Cloud and Connected Insights - Local)
Connected Insights - Cloud: The Data Uploader tool supports uploading VCF files and analysis output from user-provided machines and storage devices. It can be downloaded from the Configuration -> Data Upload section of the application.
Connected Insights - Local: Connected Insights - Local includes the Data Uploader tool, which is installed on the DRAGEN server v4 as part of the Connected Insights - Local installation and reads secondary analysis results from the configured external storage drive. The Data Uploader tool is detected and identified as Default-CLI-Installation in the Data Upload section of the Configuration page. If Default-CLI-Installation does not show as Online, refer to Software Errors and Corrective Actions.
The Data Uploader logs can be found at /staging/ici/logs/tss-cli/. With Connected Insights - Local, these logs can also be downloaded from the Activity tab in the Case Details pane on the Cases page. For more information, refer to Case Details.
The secondary analysis input logs can be found at <ExternalStorageDriveMountPath>/d53e4b2d-0428-4b3e-92bf-955f7153c360/d53e4b2d-0428-4b3e-92bf-955f7153c360/upload-logs/<DataUploadConfiguredMonitoringLocation>/<SecondaryAnalysisRunFolder>/run_<Timestamp>.json
Setup (Connected Insights - Cloud Only)
This section identifies the requirements for uploading data from user storage, enabling ingestion from the network drive, and adding an existing pipeline. For Connected Insights - Cloud, this section also covers how to download and install the Data Uploader tool and has instructions for generating an API key.
Requirements
To upload data from user storage, the following requirements must be met:
Make sure you have Java 8 or a newer version installed on your computer. You can check by opening a terminal or command prompt and typing java -version. If you don't have it installed, you can download and install it from the official Java website.
Data Uploader with Java Virtual Machine (JVM) 8+ that is compatible with Mac, Windows, or Linux CentOS.
An API key and access to the workgroup to which the data is uploaded. The API key comes with the installer. To generate a new API key, refer to [Optional] Generate an API Key.
Administrator access on the computer where Data Uploader is installed.
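For example, a quick way to confirm the Java requirement before installing (the exact output wording varies by Java distribution):
java -version
The reported version must be 1.8 (Java 8) or higher.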
Before you begin, perform the following actions:
Create custom case data.
Create test components.
Create test definitions.
For more information, refer to Configuration.
Download and Install Data Uploader
Download and install the Data Uploader tool as follows.
In Connected Insights, select the gear at the top right of the page.
In the General tab, under Data Upload, select the From Local Storage tab.
Under Download and Launch the Data Uploader, select the storage device operating system from the drop-down list.
Select Download Data Uploader. A progress bar displays during the download.
Copy the installation directory of the storage device. Make sure that Data Uploader is in a location that has access to your secondary analysis output folder.
Use the following tar command to extract the files. Replace {ici uploader script} with the applicable file name:
tar xvzf {ici uploader script}.tar.gz
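For example, if the downloaded archive is named ici-uploader-linux.tar.gz (the file name is illustrative; use the name of the archive you downloaded):
tar xvzf ici-uploader-linux.tar.gz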
❗ Extracting files can vary depending on the operating system. Most terminals support the tar command. For Windows, you can use a zip file extraction application (e.g., 7-Zip) to extract the content of the tar archive.
Make sure that the files in the following table are in the unzipped Data Uploader file.
| Component | Description |
| --- | --- |
| ici-uploader-daemon.jar | Used to run Data Uploader as a daemon. |
| ici-uploader.jar | Used to run Data Uploader on demand. |
| uploader-config.json | This file contains the configuration for Data Uploader to allow it to connect to Connected Insights. |
| ici-uploader.exe | The installer for ici-uploader on Windows. |
| Mac: com.illumina.isi.daemon.plist | Used to set up Data Uploader as a system service on Mac. |
| Linux: com.illumina.isi.daemon.service | Used to set up Data Uploader as a system service on Linux. |
| ici-uploader or ici-uploader.exe | The installer for ici-uploader on Mac or Windows. |
| wrapper.exe | This file is used to set up Data Uploader as a system service on Windows. |
| README.txt | Third-party licensing information for Data Uploader. |
Install Data Uploader as follows:
a. [Windows] Start the command prompt as an administrator and run ici-uploader.exe install.
❗ For Windows environments, you must be in the installation directory to install or uninstall the application. If the installation fails, start the daemon manually using ici-uploader.exe start-daemon.
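For example, assuming Data Uploader was extracted to C:\ici-uploader (the directory is illustrative):
cd C:\ici-uploader
ici-uploader.exe install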
b. [Mac and Linux] Run the following command on the terminal
./ici-uploader install
❗ If the installation fails, start the Data Uploader daemon manually using ici-uploader start-daemon.
c. Follow any prompts in Data Uploader.
d. Enter your API key and press Enter.
❗ If you did not download with an auto-generated API key, provide one to the installer when prompted. If you are installing Data Uploader for the first time, you must generate an API key. For more information, refer to [Optional] Generate an API Key.
e. Under Define and Monitor Data Uploads, make sure that your machine is displayed.
Data Uploader is now set up for automatic and manual ingestion. You can change the name of the installed Data Uploader by selecting ... and then Edit Server Name. To download the logs associated with Data Uploader from the last 24 hours, select ... and then Download Logs.
❗ If the user is not a system administrator, the daemon can be started with the ici-uploader start-daemon command for Mac/Linux or with the ici-uploader.exe start-daemon command for Windows. This command does not run the process as a system service and requires you to start and stop the service manually. To stop the daemon, run the command ici-uploader stop-daemon.
After Data Uploader has been downloaded and installed on the user storage, you can set up configurations for the tool to upload data into Connected Insights automatically. Each configured pipeline monitors user storage for new molecular data.
f. You can stop ici-uploader when it is running as a system service by using the ici-uploader stop-service command for Mac/Linux or the ici-uploader.exe stop-service command for Windows. You can start the uploader again with ici-uploader start-service.
❗ If Data Uploader is already running as a system service and you also run the uploader manually via ici-uploader start-daemon, it may cause issues. Stop the running Data Uploader with ici-uploader stop-service before starting the uploader manually.
Enable Upload from Network Drive (Connected Insights - Cloud)
If you are using Linux, automatic case upload is already enabled and this section is not applicable. For Mac, contact Illumina Technical Support. For Windows, run the "illumina ICI Auto Launch Service" to make the network drive available for Data Uploader as follows:
When the Data Uploader is running, open the Services application from the Start menu and locate the "illumina ICI Auto Launch Service".
Right-click "illumina ICI Auto Launch Service" and select Properties from the drop-down menu.
In Properties, select the Log On tab and enter your account ID and password. Confirm the password.
Select the General tab.
Select Stop, then select Start to start the service. The network drive is available for ingestion for Data Uploader.
After completing these steps, the Data Uploader details are visible in Connected Insights.
Enable Ingestion from Proxy Server Setting
If the computer where Data Uploader is installed is behind a proxy server, the proxy setting for the uploader must be enabled. Before you install Data Uploader, run the following command on the terminal (Linux).
❗ export JDK_JAVA_OPTIONS='-Dhttps.proxyHost=<IP address of the proxy server e.g. 1.2.3.4> -Dhttps.proxyPort=<Port of the proxy server e.g. 8080>'
For Windows, set it by using the following command.
❗ setx JDK_JAVA_OPTIONS "-Dhttps.proxyHost=<IP Address of the proxy server e.g. 1.2.3.4> -Dhttps.proxyPort=<Port of the proxy server e.g. 8080>"
Remember to replace the placeholder proxy host and port with the actual values for your environment.
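For example, for a proxy server at 10.10.1.5 listening on port 3128 (the values are illustrative):
Linux: export JDK_JAVA_OPTIONS='-Dhttps.proxyHost=10.10.1.5 -Dhttps.proxyPort=3128'
Windows: setx JDK_JAVA_OPTIONS "-Dhttps.proxyHost=10.10.1.5 -Dhttps.proxyPort=3128"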
Select Compatible Pipeline
In Configuration Settings, select the radio button next to Choose compatible pipeline from catalog. For supported pipelines, refer to Supported Pipelines.
Select a pipeline from the drop-down list (for example, DRAGEN TruSight Oncology 500 Analysis Software v2.5.2).
❗ When running the DRAGEN Somatic Whole Genome pipeline in Tumor-Only mode, you must set --output-file-prefix to match the sample-id (the RGSM of the sample in the FASTQ list) of the run.
For Test Definition, select the applicable definition.
For Choose a folder to monitor for case metadata (optional), enter the path for the folder in the secondary analysis folder created by Data Uploader.
Select Save.
[Optional] Generate an API Key
If you are using Data Uploader for the first time, then you must generate a new API key. All Data Uploader operations require an API key for authentication. The Data Uploader bundle can be downloaded with an autogenerated API key. If the bundle includes an API key, skip this section.
❗ If you are installing Data Uploader on multiple machines, manually create and track your API key, and apply it (for both automatic and manual runs) with the following command:
ici-uploader configure --api-key={apiKey}
Make sure that the API key is enclosed in single quotation marks (for example, '{SYSTEM_API-KEY}').
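For example, with a placeholder key (not a real key):
ici-uploader configure --api-key='0a1b2c3d-example-api-key'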
In Connected Insights, select Manage API Keys from the Account drop-down menu.
Select Generate.
Enter a name for the API key.
To generate a global API key, select All workgroups and roles.
Select Generate.
In the API Key Generated window, select one of the following options:
Show — Reveals the API key.
Download API Key — Downloads the API key in .TXT file format.
Select Close after you have stored the API key.
❗ The API key cannot be viewed again after closing this window. Download the API key or save it in a secure location.
The API key is added to the Manage API keys list.
Perform any of the following actions in the Manage API Keys list:
Select Regenerate to generate a new API key with the existing API key name.
Select Edit to edit the API key name or change the workgroups and roles selection.
Select Delete to delete the API key.
Automation of Data Upload from User Storage (Connected Insights - Cloud and Connected Insights - Local)
The following information is applicable to both Connected Insights - Local and Connected Insights - Cloud. Create each configuration as follows.
Input the {monitoring location} for secondary analysis output.
The monitoring location is the full path to the location where secondary analysis output data is deposited in the user's storage.
For example:
/rest/of/storage path/{monitoring location}/{runFolder}/{sample folders, sample sheet, inputs, tumor_fastq_list.csv}
For DRAGEN server v4 standalone results:
/rest/of/storage path/{monitoring location}/{sample name folders}/{sample sheet, inputs, VCFs, .bams}
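For example, a monitoring location might be laid out as follows (all folder and file names are illustrative):
/mnt/external-storage/dragen-output            <- {monitoring location}
    Run_20240501_TSO500                        <- {runFolder}
        SampleSheet.csv
        tumor_fastq_list.csv
        Sample_01/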
Associate the workflow schema for this pipeline.
a. Under Choose compatible pipeline from the catalog, select the applicable pipeline from the drop-down list (for example, DRAGEN WGS Somatic v4.2).
b. If you are running a custom workflow, then upload a workflow schema that corresponds to the data that the configuration uploads. For more information, refer to Custom Pipeline Configuration.
Select the test definition to be associated with cases created by this configuration. For more information, refer to Test Definition Setup.
[Optional] Input the location where custom case data is stored. This location is the full path to where custom case data is deposited in user storage. For more information, refer to Custom Case Data Upload.
Select Save to complete the configuration. If Data Uploader is running, it monitors this configuration for data upload.
Perform any of the following actions:
Edit - You can modify the pipeline configuration by clicking Edit against a configuration.
Note: Changes to the configuration do not impact cases that are already ingested.
Delete - You can delete the pipeline configuration by clicking Delete against a configuration.
Requeue (Connected Insights - Local) - You can resume or reupload a secondary analysis output folder by clicking Requeue against a configuration. Upon clicking Requeue, the application re-attempts to create the case from the run folders that previously failed due to one of the following errors:
SampleSheet validation error
Case ingestion has stopped due to zero GEs balance
Case ingestion has stopped due to low space on external mounted storage or /staging
On-Demand Data Upload from User Storage (Connected Insights - Cloud Only)
Upload an analysis output by running the following command:
ici-uploader analysis upload --folder={path-to-analysis} --pipeline-name={pipelineName}
--folder={path-to-analysis} - The absolute path to the analysis output to upload into Connected Insights. This folder contains the sample sheet.
--pipeline-name={pipelineName} - The name of the pipeline created in Connected Insights to apply to cases uploaded from this analysis. Pipeline names must include only letters, numbers, underscores, and dashes. The name cannot include spaces or special characters.
[Optional] --runId={run-id} - The ID of the run to be created in place of the run ID determined by the run folder name.
[Optional] --pair-id={pair-id} - The pair ID of the analysis to upload from the sample sheet when limiting upload to a single analysis.
[Optional] --case-display-id={case-display-id} - The ID of the case to be created in place of the pair ID when uploading a single analysis with --pair-id={pair-id}.
[Optional] ici-uploader logs show - Run to display the logs in ICI_Data_Upload_log.json.
[Optional] ici-uploader logs download - Run from the command prompt to download Data Uploader logs in a zipped folder.
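For example, to upload a single analysis for one pair (the path, pipeline name, and pair ID are illustrative):
ici-uploader analysis upload --folder=/data/runs/Run_20240501 --pipeline-name=DRAGEN_TSO500_Analysis --pair-id=Sample01_Pair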
Upload the custom case data file associated to one or more cases by running the following command:
ici-uploader case-data --filePath={absolute-path-of-csv-file}
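For example, with a custom case data file saved on the local storage (the path is illustrative):
ici-uploader case-data --filePath=/data/case-data/custom_case_data.csv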
For more information on custom case data files, refer to Custom Case Data Upload.
Case Processing Performance (Connected Insights - Local Only)
The following table shows the approximate time it takes for example datasets to upload and receive the Ready for Interpretation status. The duration was evaluated by analyzing a batch of samples ingested through Connected Insights tertiary analysis, run in conjunction with DRAGEN secondary analysis on similar datasets on the DRAGEN server v4.
| Sample Size | Duration | CPU Usage | Memory Usage | Network I/O |
| --- | --- | --- | --- | --- |
| 8 samples, TSO 500 DNA cases with an average of 90,000 variants | 55 minutes | 88.4% | 26.04% | write: 22573.098 kb/s, read: 25442.099 kb/s |
| 8 samples, TSO 500 ctDNA | 1h 5m | 96.65% | 51.0% | write: 288181.985 kb/s, read: 176519.153 kb/s |
| 8 samples, WGS Tumor Normal | 15m per sample | 97.05% | 46.32% | write: 22336.569 kb/s, read: 35373.43 kb/s |
| 8 samples, WGS Tumor Only | 3h 41m 21s per sample | 94.05% | 49.9% | write: 377878.867 kb/s, read: 197353.179 kb/s |