
Managing data storage

Manage Azure Blob storage

Before you proceed to this article, make sure you understand data storage management basics.

Update Azure Blob Storage Credentials

In Settings > Management Tab, add or edit the required credentials: CLIENT_ID, CLIENT_SECRET, TENANT_ID, and ACCOUNT_URL.

See the table below to learn where to look for them in your Azure account.

Emedgene setting
Corresponding client (Azure) setting

Blob Integration Setup

Create an App registration

  1. In Microsoft Entra ID, click App registrations.

  2. Select New registration.

  3. Enter a name for the application and press "Register".

  4. The registered app page opens. From this page you can retrieve the Application ID (CLIENT_ID) and the Tenant ID (TENANT_ID). Both are marked in the screenshot.

  5. Press "Certificates & secrets".

  6. Press "New client secret".

  7. Fill in the "Description" and set the expiry to 12 months (or according to your organization's policy), then press "Add".

  8. Get the CLIENT_SECRET from this page.

  9. Give this App registration roles and read access to the relevant Blob.

Azure Blob configuration

  1. Go to Azure Storage accounts.

  2. Open the relevant Storage account.

  3. Press "Containers".

  4. Press the relevant container.

  5. Press "Properties".

  6. Copy the ACCOUNT_URL.


For Internal support:

Errors caused by bad connections can be found in CloudWatch, in the relevant FRY log stream.

Search for: BlobApi, BlobFs, azure.

CLIENT_ID: application_id. Format: ########-####-####-####-############ (letters/numbers)

CLIENT_SECRET: Value of the client_secret tuple (Value, Secret ID). Format: #####-#######-######-###### (letters/digits/special chars)

TENANT_ID: ID of the tenant. Format: ########-####-####-####-############ (letters/numbers)

ACCOUNT_NAME: An arbitrary name that the customer must supply to define the ACCOUNT_URL. Format: string

CONTAINER_NAME: An arbitrary name that the customer must supply to define the ACCOUNT_URL. Format: string

ACCOUNT_URL: The account_url of the Azure account. Format: https://account_name.blob.core.windows.net/container_name
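
The formats listed above can be checked before the values are entered in Emedgene. The following is a minimal sketch, not part of the Emedgene platform: the function names (build_account_url, looks_like_guid, check_blob_settings) are hypothetical, and the checks only cover the formats stated in this article.

```python
import uuid
from urllib.parse import urlparse


def build_account_url(account_name: str, container_name: str) -> str:
    """Compose ACCOUNT_URL in the documented format."""
    return f"https://{account_name}.blob.core.windows.net/{container_name}"


def looks_like_guid(value: str) -> bool:
    """Return True if value has the form ########-####-####-####-############."""
    try:
        # str(UUID) is the canonical lowercase hyphenated form.
        return str(uuid.UUID(value)) == value.lower()
    except ValueError:
        return False


def check_blob_settings(settings: dict) -> list:
    """Return a list of format problems; an empty list means the formats look plausible."""
    problems = []
    for key in ("CLIENT_ID", "TENANT_ID"):
        if not looks_like_guid(settings.get(key, "")):
            problems.append(f"{key} is not in GUID format")
    if not settings.get("CLIENT_SECRET"):
        problems.append("CLIENT_SECRET is empty")
    url = urlparse(settings.get("ACCOUNT_URL", ""))
    if url.scheme != "https" or not url.netloc.endswith(".blob.core.windows.net"):
        problems.append("ACCOUNT_URL host is not <account_name>.blob.core.windows.net")
    if not url.path.strip("/"):
        problems.append("ACCOUNT_URL has no container name")
    return problems
```

A passing check only means the values are well formed. It does not confirm that the App registration actually has access to the container.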

Manage data storages

To directly import files from your own storage, link it to an organization's storage in Emedgene.

Note: to manage data storage, you must have the Manager and Multiple Storage user roles.

How to link your storage to Emedgene:

1

Click on the user initials or profile picture at the rightmost corner of the top navigation panel and select Settings

2

Select the Management tab and go to the Storage card, which lists the currently linked storages.

3

To add a new storage:

How to edit storage information:

Click Manage to the right of the storage details.

How to remove a link to storage:

Click Delete to the right of the storage details.


If data is deleted or moved from the customer's storage, it might adversely affect the case. To learn more about possible consequences, check out this table:

Manage S3 credentials

Whenever an organization is created, we automatically allocate bucket folders in AWS S3 cloud storage to it:

  • Path for upload

Folder intended to store input case files.

Authorized user has view and upload privileges.

  • Path for download

This folder contains a partially annotated (excluding results of proprietary algorithms) VCF file per case.

Authorized user has view and download privileges.

  • Path for DRAGEN output

This folder contains DRAGEN output files.

Authorized user has view and download privileges.

To get access to your upload, download, and DRAGEN output folders, you need a key pair consisting of an access key ID and a secret access key. Creating, deactivating, activating, and deleting credentials is available for users with the Manager and Manage S3 Credentials roles.

You can create and use up to two dynamic access keys at the same time.

When you require technical support, you have the option to generate a new key pair specifically for the troubleshooting process. After the issue has been resolved, you can delete the credentials to ensure security of your system.

The newly generated credentials will only be saved in AWS Identity and Access Management (IAM) and not in our database.
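
The three folders follow the path templates listed in this document. As a rough sketch, the templates can be filled in for one organization as shown below. The mapping of each template to a folder is inferred from the bucket names, and the function names and example values are hypothetical.

```python
def org_s3_paths(region_cloud: str, org_name: str) -> dict:
    """Fill the documented path templates for one organization."""
    return {
        # <Region_Cloud>-emg-auto-samples/<org_name>/upload/
        "upload": f"{region_cloud}-emg-auto-samples/{org_name}/upload/",
        # <Region_Cloud>-emg-downloads/<org_name>/
        "download": f"{region_cloud}-emg-downloads/{org_name}/",
        # <Region_Cloud>-emg-auto-results/<org_name>/
        "dragen_output": f"{region_cloud}-emg-auto-results/{org_name}/",
    }


def split_bucket_and_prefix(path: str) -> tuple:
    """Split 'bucket/some/prefix/' into ('bucket', 'some/prefix/')."""
    bucket, _, prefix = path.partition("/")
    return bucket, prefix
```

For example, org_s3_paths("myregion", "myorg")["upload"] gives "myregion-emg-auto-samples/myorg/upload/". An S3 client would then use the bucket and prefix parts separately.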

How to create a key pair

  1. In Settings > Management > S3 Credentials, click on Create Access Key.

  2. You can retrieve the secret access key only when you initially create the key pair. If you lose it, you have to create a new key pair. To immediately copy the secret access key to a secure location, use the Copy to clipboard button.

How to deactivate a key pair

In Settings > Management > S3 Credentials, click on Deactivate in the corresponding key pair card.

How to activate an inactive key pair

In Settings > Management > S3 Credentials, click on Activate in the corresponding key pair card.

How to delete a key pair

In Settings > Management > S3 Credentials, click on Delete in the corresponding key pair card. Only inactive key pairs can be deleted.

Path for upload: <Region_Cloud>-emg-auto-samples/<org_name>/upload/

Storage providers

Path for download: <Region_Cloud>-emg-downloads/<org_name>/
Path for DRAGEN output: <Region_Cloud>-emg-auto-results/<org_name>/
  1. Click Add Storage

  2. Choose a storage type from:

    1. Azure Data Lake

    2. Azure Blob

    3. AWS S3

    4. File Transfer Protocol (FTP)

    5. Google Cloud

    6. Secure File Transfer Protocol (SFTP)

    7. Illumina BaseSpace (BSSH)

    8. Illumina Connected Analytics (ICA)

  3. Fill in the required credentials

  4. Click Add storage

4

Check the connection to confirm that the storage is successfully linked.

To do this, find the storage in the list and check the cloud icon status:

  • If it's green, the connection is set correctly

  • If it's red and struck through, something went wrong. Hover over the icon to see details

Manage ICA storage

Prerequisites for managing ICA storage

To manage ICA storage, the user must have:

  • The Storage Provider user role

  • Upload and download permissions on the ICA project, either granted individually or via an entire workgroup

How to get your ICA credentials:

1

Log in to your Illumina private domain via a URL in the following format: yourcompanyname.login.illumina.com. This opens the Connected Platform Home

2

In the left navigation panel, select User > API keys

3

Name the key

4

Choose one of the following options:

A. Grant access to all workgroups across the domain. If your domain includes multiple workgroups and you want the API key to apply universally, select "All current and future Workgroups and roles (Global API Key)"

B. Grant access to specific workgroups. Select one or more workgroups from the list. For each selected workgroup, assign the following application roles:

  • Emedgene Has Access

  • Illumina Connected Analytics - Has Access

  • Platform-home Workgroup Admin

5

Click Generate. Once the API key is generated, copy it to your clipboard or download it as a file.

⚠️ Important: The API key is only accessible while the API Key Generated popup window is open. After closing the window, the key cannot be retrieved. If you didn’t copy or download it, you’ll need to generate a new key.

To connect ICA:

1

Log into your Emedgene domain and go to the workgroup where you want to link ICA storage

2

Click on the user avatar and select Settings from the dropdown

3

Select the Management tab

4

In the Storage card, click Add Storage

5

Select Illumina Connected Analytics (not Illumina Connected Analytics V1!) from the Storage type dropdown

6

Fill in the storage credentials:

  • "Api_key": the API key generated before

  • "Project": the name of the Project in ICA that contains and will contain the data you want to connect

  • "Path": the folder within the project where the data is located. This can be used to restrict the user to accessing only data within the specified folder. Using only "/" allows all folders within your ICA project

7

Click Add Storage


Manage BaseSpace storage

Log in to Emedgene and navigate to Settings in the upper right-hand corner of the page.

Click on the Management tab and then on Add Storage.

Choose Illumina BaseSpace storage type.

Fill in the Client Key, Client Secret, and App Token as provided by BaseSpace (how to obtain them is described below), then click Add storage to complete the setup.

Via Command Line

Prerequisite

Install BaseSpace CLI (Command Line Interface)

Follow the instructions on the BaseSpace CLI Installation Page if needed. Be aware of the BaseSpace regional instance you are working on (us, euc1, aps2, euw2).

Authenticate

On BSSH, log in to the workgroup you want to connect as the storage.

Once the BaseSpace CLI is installed, run the authentication command (bs auth) in the terminal.

The command directs you to a link where you must log in.

After authentication completes successfully, find the access token in the config file (.basespace/default.cfg).

The config file contains the apiServer and accessToken entries.

Populate the App_token with the accessToken value, and Server with the apiServer URL from the BSSH config file.

Client_key will be displayed in subsequent menus, so a descriptive name such as the workgroup name can be used.

Client_secret is unused when the App_token is available and can be set to "x".
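
The mapping from the BSSH config file to the Emedgene fields can be sketched in a few lines of Python. This is only an illustration, assuming the config file consists of plain "key = value" lines as in the example in this document. The function names are hypothetical and not part of the BaseSpace CLI.

```python
from pathlib import Path


def read_basespace_config(cfg_path: Path) -> dict:
    """Parse 'key = value' lines from a BaseSpace CLI config file."""
    values = {}
    for line in cfg_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not key = value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def to_emedgene_fields(cfg: dict, client_key: str) -> dict:
    """Build the storage fields as described above."""
    return {
        "Client_key": client_key,         # descriptive name, e.g. the workgroup name
        "Client_secret": "x",             # unused when App_token is available
        "App_token": cfg["accessToken"],  # accessToken from the config file
        "Server": cfg["apiServer"],       # apiServer URL from the config file
    }
```

For the default profile, the config file would typically be read with read_basespace_config(Path.home() / ".basespace" / "default.cfg").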

Via BaseSpace Developer Portal

Go to the BaseSpace developer portal and log in. Be aware of the BaseSpace regional instance you are working on (us, euc1, aps2, euw2).

Go to My Apps and click Create a new Application.

Fill in the details for the application and click to create the application.

Fill in the details and press Save.

All requested fields must be filled in; enter "NA" where you have no value to provide.

Go to My Apps and click on your new app. Then go to the Credentials tab.

There you will find the Client ID (Client Key), Client Secret, and App Token to enter in the Emedgene platform.

Adding BSSH account to your Emedgene account

  1. Log in to the desired Emedgene organization.

  2. Go to Settings

  3. Go to the Management tab

  4. Click on Add Storage

  5. Add the information from the Credentials tab of the app previously created in BSSH.

Select BaseSpace:

Example - connect integration1 workgroup as storage.
# Linux
$ wget "https://launch.basespace.illumina.com/CLI/latest/amd64-linux/bs" -O $HOME/bin/bs
# Mac
$ wget "https://launch.basespace.illumina.com/CLI/latest/amd64-osx/bs" -O $HOME/bin/bs
# or
$ brew tap basespace/basespace && brew install bs-cli
# Windows
$ wget "https://launch.basespace.illumina.com/CLI/latest/amd64-windows/bs.exe" -O bs.exe
$ bs auth
$ cat .basespace/default.cfg
apiServer   = https://api.basespace.illumina.com
accessToken = 

Bring Your Own Key

Scope

Bring Your Own Key (BYOK) is a security feature that allows organizations to use their own encryption keys to protect their data. This ensures that they maintain control over their encryption keys and, consequently, their data.

BYOK is only available for Enterprise-level support accounts.

BYOK setup

For versions earlier than v100.39.0, BYOK setup requires Illumina Support.

For versions v100.39.0 and later, you can complete the setup from Organization settings.

Supported Key Management Services

Illumina integrates with leading Key Management Services (KMS), including Azure Key Vault and AWS KMS, so organizations can maintain full control over their encryption keys. These integrations combine Illumina’s Bring Your Own Key (BYOK) feature with your preferred KMS provider to deliver robust key management and enhanced data security.

Azure Key Vault

Azure Key Vault is a cloud service that provides a secure way to store and manage sensitive information like API keys, passwords, and certificates. It offers robust features for key management, including key generation, storage, and lifecycle management.

AWS KMS

AWS Key Management Service (KMS) allows you to create and control encryption keys used to encrypt your data across a wide range of AWS services and applications. It provides centralized management of encryption keys and integrates seamlessly with other AWS services.

Risk of losing a key

Losing the encryption key means that all data encrypted with that key will be inaccessible. This can lead to permanent loss of access to crucial information.


Setup

Azure Key Vault Setup

The API server encrypts the organization's information before storing it in the database and decrypts it when needed (e.g., during pipeline execution). The key vault is managed by the organization.

To configure encryption in Emedgene, you need the following information from Azure Key Vault:

Application tokens:

  • Client Id

  • Tenant Id

  • Client Secret

The key information:

  • Key URL

Create a new application

1

Navigate to App registrations

2

Click Register to create a new application and fill in the required details

3

Add a client secret

1

In the left menu, select Certificates & Secrets

2

Click New client secret. Copy and save the Value (Client Secret) immediately, as it is shown only once.

Create a new key

1

Click New Key (Create key vault)

2

Specify the key vault name, region (for example, East US), and pricing tier

3

Find key details

1

Navigate to the newly created Key vault

2

In the left menu, select Keys, and then select the key

3

Select the current version

AWS Key Management Service (KMS) Setup

Description is coming soon.


Please reach out to [email protected] to get help with this setup.


Architecture

The API server will encrypt the client's information before storing it in a database and decrypt that information when needed (e.g., running the pipeline). The key vault is managed by the client, and Emedgene will only be provided with access to encrypt/decrypt functions in that key vault. This guarantees that clients control access to the information.

Illustration of data flow when creating a case in Emedgene platform:

Illustration of data flow when reading case data from the Emedgene platform:

A preliminary step to this solution is having a key vault owned by the client, and a key that Emedgene is given access to.

The client will create an access policy in the key vault of type “Application” and provide the matching key and secret to Emedgene. The access policy must contain permissions to perform encrypt and decrypt actions.

In order for Emedgene to integrate with the key, depending on the key vault provider, the client needs to provide the following information:

  • Client Id

  • Client Secret

  • Tenant Id

  • Key vault name
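
The write and read flows described in this section can be illustrated with a toy model. This is a sketch under loud assumptions: FakeKeyVault is a reversible stand-in and NOT real encryption, ApiServer and its method names are hypothetical, and a dictionary stands in for the Emedgene DB. A real integration would call the encrypt and decrypt functions of the client-owned key vault.

```python
import base64
from typing import Dict, Protocol


class KeyVault(Protocol):
    """The only operations the API server is given access to."""

    def encrypt(self, plaintext: bytes) -> bytes: ...

    def decrypt(self, ciphertext: bytes) -> bytes: ...


class FakeKeyVault:
    """Stand-in for the client-owned key vault. NOT real encryption."""

    def encrypt(self, plaintext: bytes) -> bytes:
        return base64.b64encode(plaintext[::-1])

    def decrypt(self, ciphertext: bytes) -> bytes:
        return base64.b64decode(ciphertext)[::-1]


class ApiServer:
    def __init__(self, vault: KeyVault) -> None:
        self._vault = vault
        self._db: Dict[str, bytes] = {}  # stands in for the Emedgene DB

    def add_test_request(self, case_id: str, phi: str) -> None:
        # Write flow: encrypt via the key vault, then store only the ciphertext.
        self._db[case_id] = self._vault.encrypt(phi.encode("utf-8"))

    def get_test_request(self, case_id: str) -> str:
        # Read flow: fetch the ciphertext, decrypt via the key vault, return plaintext.
        return self._vault.decrypt(self._db[case_id]).decode("utf-8")
```

The point of the design is that the database only ever holds ciphertext, so revoking the application's access in the key vault makes the stored data unreadable to the API server.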

Searching Encrypted Fields

Since some of our platform search capabilities run directly on the DB, encrypted data cannot be searched directly. To overcome this, we will implement hash-based search as follows.

  • The case data will still be fully encrypted in the DB, as it is today

  • For specific fields that the customer defines as "searchable", we will save their hash value alongside the encrypted data.

  • Hashing will be done using SHA-256 and will include a securely generated random salt of 32 characters, which will be added to the value.
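
A minimal sketch of salted-hash search, using only the Python standard library, is shown below. It is an illustration, not the platform's implementation: the function names are hypothetical, and appending the salt to the value is an assumption, since the document does not state exactly how the salt is combined with it.

```python
import hashlib
import secrets


def new_salt() -> str:
    """Generate a random 32-character salt (16 random bytes rendered as hex)."""
    return secrets.token_hex(16)


def hash_value(value: str, salt: str) -> str:
    """SHA-256 of the value with the salt appended (salt placement is an assumption)."""
    return hashlib.sha256((value + salt).encode("utf-8")).hexdigest()


def search(index: dict, query: str, salts: list) -> list:
    """Hash the query with every known salt and collect the matching case IDs.

    `index` maps a stored hash value to the list of case IDs that carry it.
    """
    hits = []
    for salt in salts:
        hits.extend(index.get(hash_value(query, salt), []))
    return hits
```

Because only exact hashes are compared, this approach supports exact-match search on the chosen fields and not substring or fuzzy search.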

Illustration of data flow when searching in Emedgene platform:

Illustration of data flow when creating a case with searchable field in Emedgene platform:

Appendix

Appendix: Control flows text

Write:

Read

Write Searchable

Read Searchable

Bring Your Own Bucket


This feature only concerns saving DRAGEN output files in your own bucket when using DRAGEN through Emedgene (without ICA).

If you are looking to:

  • Import data from AWS S3 to Emedgene, go to Manage data storages

  • Integrate any data storage with Emedgene, go to Manage data storages

  • Download any data from Emedgene, go to Manage S3 credentials


Bring Your Own Bucket is only available for Enterprise-level support accounts and requires Illumina Support for setup.


Bring Your Own Bucket

Bring Your Own Bucket, also known as BYOB, enables you to control your DRAGEN file outputs.

By default, the Emedgene-managed DRAGEN solution saves the DRAGEN output files in a dedicated AWS S3 bucket that you can access using your S3 credentials.

However, if you have an Enterprise account and would like the Emedgene-managed DRAGEN solution to save the DRAGEN output files in your own bucket, reach out to [email protected] and follow these steps:

1. Create an AWS bucket

Emedgene requires access to the root folder, so a dedicated bucket might be appropriate.

2. Edit Bucket policy

The bucket policy should allow the Emedgene user access to the bucket.

Example bucket policy:

3. Allow illumina.com and emedgene.com for CORS

Emedgene visualizes data in IGV directly from your AWS S3 bucket. To allow this, enable CORS for the Emedgene application URLs.

Example CORS policy:

4. Test and validate the configuration with Illumina support

We will need to run a case and validate that the managed DRAGEN pipeline finishes successfully and that all features are available in the platform.

With the BYOB solution you manage your own data, so if you accidentally delete or move the data, the integration with Emedgene might break. You are responsible for your DRP and data backup solutions.


Managing AWS S3 Lifecycle policy

If a customer enables an AWS S3 Lifecycle policy in order to archive or change the S3 tiers for different files, they might create an adverse effect on the platform.

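Case Type | File Type | Expected effect
VCF | BAM/CRAM (visualizations) | Visualization will fail
VCF | VCF (input) | Reanalysis will fail
VCF | CSV, etc | Reanalysis will fail (will be fixed)
FASTQ | FASTQ/BAM/CRAM (input) | Reanalysis will fail (will be fixed)
FASTQ | CRAM (Output) | Reanalysis will fail
FASTQ | VCFs | Reanalysis will fail
FASTQ | CSV, etc | Reanalysis will fail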
It is crucial to securely store and manage your keys to prevent such risks.
After registration, copy and save the Application (Client) ID and Directory (Tenant) ID.
Please note the expiration date. If the secret expires, encryption will fail.
Click Next to go to Access Policies.
4

Select Add access policy, and set Key permissions:

  • Key Management Operations

  • Cryptographic Operations: Decrypt, Encrypt, Unwrap Key, Wrap Key

5

Set Secret permissions:

  • Secret Permission: Get

  • Select Principal: select the application you created earlier

6

Finish with Review + create

4

Copy the Key Identifier (Key URL):

Key name

The salt is unique and will not be used anywhere else in the platform.
  • When the user enters a string to search, we will hash that value using all the salt values, and search those hash values.


Effect of deleting or moving data from the customer's storage:

Case Type | File Type | Expected effect
VCF | BAM/CRAM (visualizations) | Visualization will fail
VCF | VCF (input) | Reanalysis will fail
VCF | CSV, etc | Reanalysis will fail (will be fixed)
FASTQ | FASTQ/BAM/CRAM (input) | Reanalysis will fail (will be fixed)
FASTQ | CRAM (Output) | Reanalysis will fail
FASTQ | VCFs | Reanalysis will fail
FASTQ | CSV, etc | Reanalysis will fail

    https://<key-vault-name>.vault.azure.net/keys/<key-name>/<key-version>
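
The Key Identifier format above can be split into its parts with the standard library. This is a small sketch with a hypothetical function name (parse_key_url). It only checks the URL shape shown in this document and makes no call to Azure.

```python
from urllib.parse import urlparse

HOST_SUFFIX = ".vault.azure.net"


def parse_key_url(key_url: str) -> dict:
    """Split https://<key-vault-name>.vault.azure.net/keys/<key-name>/<key-version>."""
    parsed = urlparse(key_url)
    if parsed.scheme != "https" or not parsed.netloc.endswith(HOST_SUFFIX):
        raise ValueError("expected https://<key-vault-name>.vault.azure.net/...")
    parts = parsed.path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "keys":
        raise ValueError("expected path /keys/<key-name>/<key-version>")
    return {
        "key_vault_name": parsed.netloc[: -len(HOST_SUFFIX)],
        "key_name": parts[1],
        "key_version": parts[2],
    }
```

A Key URL without the version segment is rejected, which matches the step that asks you to select the current version before copying the identifier.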
    Write:
    Client->Emedgene API: Add New Test Request
    note right of Emedgene API: Process Request
    Emedgene API->Key Vault: PHI
    note right of Key Vault: Encrypt
    Key Vault->Emedgene API: Encrypted PHI
    Emedgene API->Emedgene DB: Store Encrypted PHI

    Read:
    Client->Emedgene API: Get Test Request
    Emedgene DB->Emedgene API: Encrypted PHI
    Emedgene API->Key Vault: Encrypted PHI
    note right of Key Vault: Decrypt
    Key Vault->Emedgene API: Decrypted PHI
    Emedgene API->Client: Decrypted PHI

    Write Searchable:
    Client->Emedgene API: Add New Test Request
    note right of Emedgene API: Process Request
    Emedgene API->Key Vault: PHI
    note right of Key Vault: Encrypt
    Key Vault->Emedgene API: Encrypted PHI
    Emedgene API->Emedgene DB: Get Salt
    Emedgene API->Emedgene API: Hash Value using Salt
    Emedgene API->Emedgene DB: Store Encrypted PHI + Hashed value

    Read Searchable:
    Client->Emedgene API: Search string
    Emedgene API->AWS Secrets: Get Salt
    Emedgene API->Emedgene API: Hash string using Salt
    Emedgene API->Emedgene DB: Search hashed string
    Emedgene DB->Emedgene API: Search results
    Emedgene API->Client: Search results
    {
        Coming Soon
    }
    {
        Coming Soon
    }

    Manage Google Cloud storage

    Google Cloud Storage Credentials update procedure

    How to get the client credentials?

    1. Go to the Google Cloud Console.

    2. Navigate to IAM & Admin: in the left sidebar, go to IAM & Admin > Service Accounts.

    3. Create a New Service Account: Click on the "Create Service Account" button at the top.

    4. Fill in the Service Account Details:

      • Service account name: Give your service account a name.

      • Service account ID: This will be automatically generated based on the name.

    5. Assign Roles to the Service Account:

      • In the Grant this service account access to project step, you’ll assign the necessary roles.

      • Grant these roles:

    Add the storage provider to Emedgene platform:

    • Add the following 3 values into the appropriate fields:

      • Client_credentials_base64: paste the Base64-encoded key printed by the encoding step.

      • Bucket: the bucket name.

    CORS - Visualisation

    1. Download and install the Google Cloud SDK from the Google Cloud SDK Install page.

    2. Select Your Platform (Windows, macOS, or Linux), download and run.

    3. Initialize and Authenticate with Google Cloud: In the Cloud SDK Shell/terminal, run: gcloud init This will open a browser window to authenticate your Google account. Follow the instructions to log in and select your project.

    Note:

    • origin: if using Illumina cloud, use:

      https://host_name.emg.illumina.com

      Otherwise (Emedgene cloud), use:

      https://host_name.emedgene.com

    4. Apply the CORS configuration to your bucket by running: gcloud storage buckets update gs://your-bucket-name --cors-file=cors.json

    5. Verify the CORS configuration: gcloud storage buckets describe gs://your-bucket-name

    Description: Optionally, provide a description for the service account.

    Click "Create and Continue".

    example:

    "storage object viewer" (read-only access)

  • Create the Service Account:

    • After assigning the roles, click "Done".

  • Generate and Download a Key:

    • Find your newly created service account, click the three dots on the right, and select "Manage Keys".

    • Click Add Key > Create New Key and choose the JSON format.

    • Download the key and store it securely, as it is used for authentication in your code or applications.

  • Encode the key in Base64:

    • Use the Python function below: put it and your JSON key file (here named json_file.json) in the same directory and run it.

    • Save the printed output.

  • Path: by default, fill in /. Otherwise, enter your path in the bucket, separating directories with /

    Set CORS Configuration via gcloud: Create a JSON file (cors.json) on your machine with the CORS rules. It should look like the example below:
    import json
    import base64
    
    
    def encode_json_to_base64(json_file):
        # Read JSON data from file
        with open(json_file, 'r') as file:
            json_data = json.load(file)
    
        # Convert the JSON data to a string
        json_str = json.dumps(json_data)
    
        # Encode the string to bytes, then to Base64
        json_bytes = json_str.encode('utf-8')
        base64_bytes = base64.b64encode(json_bytes)
    
        # Convert Base64 bytes back to a string
        base64_str = base64_bytes.decode('utf-8')
        # Print the Base64-encoded string
        print(base64_str)
    
    
    encode_json_to_base64('json_file.json')
    [
        {
          "origin": ["https://<host_name>.emg.illumina.com"],
          "method": ["GET"],
          "responseHeader": ["emgauthorization"],
          "maxAgeSeconds": 3600
        }
    ]
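
The cors.json file can also be generated rather than written by hand. The sketch below follows the example above and the two origin forms given in this article. The function names and the illumina_cloud flag are hypothetical conveniences.

```python
import json


def build_cors_config(host_name: str, illumina_cloud: bool = True) -> list:
    """Build the CORS rules for a bucket, choosing the origin by cloud."""
    domain = "emg.illumina.com" if illumina_cloud else "emedgene.com"
    return [
        {
            "origin": [f"https://{host_name}.{domain}"],
            "method": ["GET"],
            "responseHeader": ["emgauthorization"],
            "maxAgeSeconds": 3600,
        }
    ]


def write_cors_file(config: list, path: str = "cors.json") -> None:
    """Write the rules to a file that can be passed to --cors-file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(config, fh, indent=2)
```

After calling write_cors_file(build_cors_config("host_name")), the resulting cors.json can be applied with the gcloud command given in the steps of this article.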