CSV format requirements

General CSV format requirements

The following are the general format requirements for a CSV file used to create multiple cases:

  1. The file must have a .csv extension.

  2. The file must contain a [Data] header.

  3. The row after the [Data] header must contain the field names that identify the data in each column. Column names are case-sensitive.

  4. Each row after the column-name row represents one sample.

  5. Each column represents a data field.

  6. There must be no empty rows between the [Data] header and the last sample row.

  7. A file can contain at most 50 cases.

  8. On versions before 34.0, cells should not contain commas. Consider replacing commas with semicolons.
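The general requirements above can be checked before upload. The following is a minimal sketch, not the platform's own validator: the function name `check_general_format` is hypothetical, and it assumes that one case corresponds to one distinct Family Id value, which the requirements do not state explicitly.

```python
import csv
import io
from pathlib import Path

MAX_CASES = 50  # limit stated in the general requirements


def is_blank(row: list[str]) -> bool:
    """True when a CSV row has no non-whitespace cell."""
    return not any(cell.strip() for cell in row)


def check_general_format(filename: str, text: str) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if Path(filename).suffix != ".csv":
        problems.append("file must have a .csv extension")

    rows = list(csv.reader(io.StringIO(text)))
    # Locate the [Data] header row.
    data_idx = next(
        (i for i, row in enumerate(rows) if row and row[0].strip() == "[Data]"),
        None,
    )
    if data_idx is None:
        problems.append("missing [Data] header")
        return problems

    body = rows[data_idx + 1:]
    # Trailing blank rows come after the last sample row, so ignore them.
    while body and is_blank(body[-1]):
        body.pop()
    if not body:
        problems.append("missing column-name row after [Data]")
        return problems
    if any(is_blank(row) for row in body):
        problems.append("empty row between [Data] and the last sample row")

    header = body[0]
    samples = [row for row in body[1:] if not is_blank(row)]
    # Assumption: one case per distinct Family Id.
    if "Family Id" in header:
        col = header.index("Family Id")
        families = {row[col] for row in samples if len(row) > col}
        if len(families) > MAX_CASES:
            problems.append(f"more than {MAX_CASES} cases in one file")
    return problems
```

Calling `check_general_format("batch.csv", open("batch.csv").read())` returns an empty list when none of these problems is found.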


CSV schema

1. Mandatory fields

These fields must always be present in the sample table.

  1. Case Type;

  2. Family Id;

  3. Phenotypes OR Phenotypes Id.

2. Conditionally mandatory fields

Leaving these fields empty creates an empty sample.

  1. BioSample Name;

  2. Files Names;

  3. Storage Provider Id.

This field is mandatory if Files Names is empty:

  1. Sample Type.

This field is required if the "auto" option is used for Files Names (only relevant for BSSH):

  1. Default Project.
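The mandatory and conditionally mandatory rules above can be expressed as a per-row check. This is a sketch under assumptions: `missing_fields` is a hypothetical helper, the row is treated as the proband row, and only the rules listed in this section are covered.

```python
def missing_fields(row: dict[str, str]) -> list[str]:
    """List the fields a proband row still needs, per the rules above."""

    def filled(name: str) -> bool:
        return bool(row.get(name, "").strip())

    # Mandatory fields.
    missing = [name for name in ("Case Type", "Family Id") if not filled(name)]
    if not (filled("Phenotypes") or filled("Phenotypes Id")):
        missing.append("Phenotypes or Phenotypes Id")

    # Sample Type is mandatory when Files Names is empty.
    if not filled("Files Names") and not filled("Sample Type"):
        missing.append("Sample Type")

    # Default Project is required with the "auto" option (BSSH only).
    if row.get("Files Names", "").strip() == "auto" and not filled("Default Project"):
        missing.append("Default Project")
    return missing
```

For example, a row with `Files Names` set to `auto` and no `Default Project` is reported as missing `Default Project`.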

3. Optional fields

The sample table may include these supported optional columns.

  1. Boost Genes;

  2. Clinical Notes;

  3. Date Of Birth;

  4. Due Date;

  5. Execute now;

  6. Sex (labeled Gender in versions before 33.0);

  7. Gene List Id;

  8. Kit Id;

  9. Label Id;

  10. Opt In;

  11. Relation;

  12. Selected Preset;

  13. Visualization Files.
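A batch file with the schema above can be generated with Python's standard `csv` module. The sketch below is illustrative only: `build_batch_csv` is a hypothetical helper, the column selection is one possible subset, and the values are taken from the examples on this page.

```python
import csv
import io

COLUMNS = [
    "Family Id", "Case Type", "BioSample Name", "Relation",
    "Phenotypes", "Files Names", "Storage Provider Id",
]

SAMPLES = [
    {
        "Family Id": "RM8392",
        "Case Type": "Whole Genome",
        "BioSample Name": "NA24385",
        "Relation": "proband",
        "Phenotypes": "Abnormal pupillary function;Orthotopic os odontoideum;",
        "Files Names": "/GIAB_cases/1/NA24385.dragen.hard-filtered.gvcf.gz",
        "Storage Provider Id": "208",
    },
]


def build_batch_csv(columns: list[str], samples: list[dict[str, str]]) -> str:
    """Write the [Data] header, the column-name row, then one row per sample."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["[Data]"])
    writer.writerow(columns)
    for sample in samples:
        # Fields missing from a sample are written as empty cells.
        writer.writerow([sample.get(col, "") for col in columns])
    return buf.getvalue()


if __name__ == "__main__":
    print(build_batch_csv(COLUMNS, SAMPLES))
```

Multi-value cells use semicolons, so they do not clash with the comma delimiter.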

4. Custom fields

The sample table may contain custom columns to suit your specific needs and include any relevant information that is important for your workflow.

Note: In cases with more than one sample, custom fields are recognized and added to the case information only if their values appear in the row where the Relation field is "proband".
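The proband-row rule in the note can be illustrated as follows. This is a sketch, not the platform's logic: `custom_fields` is a hypothetical helper, the rows are assumed to belong to a single case, an empty Relation is treated as "proband" (the documented default), and the comparison is case-insensitive by assumption.

```python
KNOWN_COLUMNS = {
    "Case Type", "Family Id", "Phenotypes", "Phenotypes Id",
    "BioSample Name", "Files Names", "Storage Provider Id",
    "Sample Type", "Default Project", "Boost Genes", "Clinical Notes",
    "Date Of Birth", "Due Date", "Execute now", "Gender", "Sex",
    "Gene List Id", "Kit Id", "Label Id", "Opt In", "Relation",
    "Selected Preset", "Visualization Files",
}


def custom_fields(rows: list[dict[str, str]]) -> dict[str, str]:
    """Return the non-empty custom columns found on the proband row."""
    for row in rows:
        relation = (row.get("Relation") or "proband").strip().lower()
        if relation == "proband":
            return {
                name: value
                for name, value in row.items()
                if name not in KNOWN_COLUMNS and value
            }
    return {}  # no proband row: nothing is picked up
```

Custom values placed only on a non-proband row, such as the mother's, are not returned.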

Custom field examples:

| Field (column) name | Expected input | Field details | Example |
| --- | --- | --- | --- |
| Institution | Free text | Custom | GenoMed Solutions |
| Sample_Received_Date | Free text | Custom | 24-02-2022 |
| Sample_Type | Free text | Custom | Amniotic Fluid |


Batch case .csv file validation rules

| Field (column) name | Expected input | Field details | Example |
| --- | --- | --- | --- |
| BioSample Name | Free text | Conditionally mandatory. An empty sample will be created if the field is left blank. | NA24385 |
| Boost Genes | 1. "TRUE" 2. "FALSE" | | TRUE |
| Case Type | 1. "Whole Genome" 2. "Exome" 3. "Custom Panel" 4. "Array" 5. Custom case type | Mandatory. Only considered for proband. | Whole Genome |
| Clinical Notes | Free text | Optional | A 14-year-old boy with a visual acuity of 20/200 in both eyes in whom hearing loss was first noted at 5 years of age on routine screening; audiometry revealed sensorineural hearing loss. |
| Date Of Birth | Date "YYYY-MM-DD" | Optional | 2013-01-22 |
| Default Project | Free text | Conditionally mandatory. Must be filled in if the "auto" option is used for Files Names (only relevant for BSSH). | GIAB |
| Due Date | Date "YYYY-MM-DD" | Optional | 2023-05-03 |
| Execute now | 1. "TRUE" 2. "FALSE" | Optional. Default value is "TRUE". Use "FALSE" if you don't want to run the case upon uploading the file. Only considered for proband. | FALSE |
| Family Id | Free text | Mandatory | RM8392 |
| Files Names | 1. Semicolon-separated list of paths, without spaces, to .fastq, .fastq.gz, .vcf, .vcf.gz, .bam, .cram, *gt_sample_summary.json files 2. "existing" 3. "auto" | | /GIAB_cases/1/NA24385.dragen.hard-filtered.gvcf.gz;/QA_cases/Other/NA24385.dragen.cnv.vcf.gz;/QA_cases/Other/NA24385.dragen.repeats.vcf; |
| Sex / Gender* | 1. "F" 2. "M" 3. "U" | Optional. Default value is "U". *The field is labeled as Sex in versions 33.0 and later, and as Gender in older versions. | M |
| Gene List Id | Integer | Optional. Must be the id of a previously defined Gene List. Only considered for proband. | 12345 |
| Kit Id | Integer | Optional. Must be the id of a previously defined Kit. Only considered for proband. | 23456 |
| Label Id | Integer | Optional. Must be the id of a previously defined Case Label. Only considered for proband. | 34567 |
| Opt In | 1. "TRUE" 2. "FALSE" | | FALSE |
| Phenotypes | 1. Semicolon-separated list of HPO phenotype terms 2. "Unaffected" is used for non-affected family members. | Mandatory for proband sample if Phenotypes Id is empty. The list must contain fewer than 100 terms. Non-HPO terms can be included if Phenotypes Id is empty. | Abnormal pupillary function;Orthotopic os odontoideum; |
| Phenotypes Id | Semicolon-separated list of HPO phenotype IDs | Mandatory for proband sample if Phenotypes is empty. The list must contain fewer than 100 IDs. | HP:0007686;HP:0025375; |
| Relation | 1. "proband" 2. "mother" 3. "father" 4. "sibling" | Optional. Default value is "proband". The values "proband", "father", and "mother" can each be used only once per Family Id. One sample with Relation "proband" is required per Family Id. | Mother |
| Sample Type | 1. "FASTQ" 2. "VCF" | Conditionally mandatory. Required if Files Names is empty. Only considered for proband. | FASTQ |
| Selected Preset | 1. Free text 2. "Default" | Optional. Must be the name of a previously defined Preset. If set to "Default", the default Preset will be applied. If left empty, no Preset will be applied. | High quality candidates |
| Storage Provider Id | Integer | Conditionally mandatory. Required if Files Names is not empty. Must be from the configured storage provider ID list. | 208 |
| Visualization Files | Semicolon-separated list of paths to sequence alignment data files with extension .bam, .cram; 🆕34.0+: also .tn.bw, .baf.bw, .roh.bed, .lrr.bedgraph, .baf.bedgraph | Optional | /giab_project/NA24385.bam |
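Several of the value rules in the table above can be checked before upload. The following is a minimal sketch rather than the platform's validation: `check_values` is a hypothetical helper, it covers only the boolean, date, and Relation rules, and it compares Relation values case-insensitively as an assumption (the table's example value is capitalized).

```python
from collections import Counter
from datetime import datetime

BOOLEAN_FIELDS = ("Boost Genes", "Execute now", "Opt In")
DATE_FIELDS = ("Date Of Birth", "Due Date")
RELATIONS = {"proband", "mother", "father", "sibling"}


def is_date(value: str) -> bool:
    """True for a zero-padded YYYY-MM-DD calendar date."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return len(value) == 10


def check_values(rows: list[dict[str, str]]) -> list[str]:
    """Return a list of rule violations found in the sample rows."""
    errors = []
    seen = Counter()  # (Family Id, relation) -> number of samples
    for n, row in enumerate(rows, start=1):
        for field in BOOLEAN_FIELDS:
            if row.get(field) and row[field] not in ("TRUE", "FALSE"):
                errors.append(f"row {n}: {field} must be TRUE or FALSE")
        for field in DATE_FIELDS:
            if row.get(field) and not is_date(row[field]):
                errors.append(f"row {n}: {field} must be YYYY-MM-DD")
        # An empty Relation falls back to the documented default.
        relation = (row.get("Relation") or "proband").strip().lower()
        if relation not in RELATIONS:
            errors.append(f"row {n}: unknown Relation {relation!r}")
        seen[(row.get("Family Id", ""), relation)] += 1

    for family in sorted({fam for fam, _ in seen}):
        if seen[(family, "proband")] != 1:
            errors.append(f"family {family}: exactly one proband is required")
        for relation in ("mother", "father"):
            if seen[(family, relation)] > 1:
                errors.append(f"family {family}: {relation} used more than once")
    return errors
```

An empty result means that none of these particular rules is violated; the remaining rules in the table are not covered by the sketch.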

Required BSSH file path format:

For BSSH, paths must use the actual numeric IDs of the project, app result, and file:

/projects/3824821/appresults/2319318/files/119675608

instead of aliases:

/projects/ABC_DEF_2022-12-22_DEv395/appresults/ABC-GM58342-def/files/ABC-GM58342-def.hard-filtered.vcf.gz
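A quick way to tell the two forms apart is a pattern check. This sketch assumes that every BSSH path follows the projects/appresults/files layout of the single example above; other layouts may exist and are not covered. The name `is_numeric_bssh_path` is hypothetical.

```python
import re

# Assumption: numeric-ID paths always have this three-level layout.
BSSH_ID_PATH = re.compile(r"/projects/\d+/appresults/\d+/files/\d+")


def is_numeric_bssh_path(path: str) -> bool:
    """True when every identifier in the path is a number, not an alias."""
    return BSSH_ID_PATH.fullmatch(path) is not None
```

With this check, the first example path above is accepted and the alias-based path is rejected.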

Human-Readable Path for BSSH files in Batch CSV (Version 37)

Version 37 introduced an enhancement to the batch upload process: the batch CSV can now reference BSSH files by a human-readable path.

Validations

When a batch CSV includes a human-readable path, the system performs the following validations for paths in BSSH storage:

  1. Single File in the Path:

    • If the provided path contains exactly one file or dataset, the batch upload proceeds successfully.

  2. Two Files in the Path:

    • If the path contains two files with the same name (for example, two pairs of FASTQs in a dataset), the system will:

      • Select the dataset marked as QCPassed.

      • Fail the batch upload if both datasets are marked as QCPassed, as this indicates conflicting data.

  3. More Than Two Files in the Path:

    • If the path contains more than two files or datasets, the system fails the batch upload, as the path is considered ambiguous or invalid.
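The three validation outcomes above can be sketched as a small selection function. This is an illustration, not the platform's code: `Dataset`, `AmbiguousPathError`, and `select_dataset` are hypothetical names. The rules above do not say what happens when no file matches or when neither of two datasets is marked QCPassed; the sketch treats both as failures by assumption.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    qc_passed: bool


class AmbiguousPathError(ValueError):
    """Raised when a human-readable path cannot be resolved to one dataset."""


def select_dataset(matches: list[Dataset]) -> Dataset:
    """Pick the dataset to use for a human-readable BSSH path."""
    if len(matches) == 1:
        return matches[0]
    if len(matches) == 2:
        passed = [d for d in matches if d.qc_passed]
        if len(passed) == 1:
            return passed[0]
        if len(passed) == 2:
            raise AmbiguousPathError("both datasets are marked QCPassed")
        # Assumption: the source does not cover this case.
        raise AmbiguousPathError("neither dataset is marked QCPassed")
    # More than two matches fail per the rules; zero matches is an assumption.
    raise AmbiguousPathError(f"expected one or two matches, found {len(matches)}")
```

A caller would catch `AmbiguousPathError` and ask for a more specific path.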

Error Scenarios

  • Multiple QCPassed Datasets: If two datasets in the same path are marked as QCPassed, the batch upload will fail with a descriptive error indicating the conflict.

  • Excessive Files in the Path: If more than two files are found for the provided path, the batch upload will fail, instructing the user to provide a more specific or valid path.

Benefits

  • Enables customers to use intuitive, human-readable paths in their workflows.

  • Automatically handles dataset selection based on quality control status.


Last updated 2 months ago

Custom fields: Each custom field must be assigned a unique name without spaces. Data from custom columns is saved per case under the Additional information section of Case Info.

Batch case .csv file validation rules: Mandatory (highlighted in red), Conditionally mandatory (highlighted in orange), and Optional fields should be filled in according to the rules in the validation rules table.

Boost Genes (field details): Optional. Indicates whether the Boost genes mode will be used. "TRUE" means that variants in the targeted genes will receive upgraded scores during prioritization by the AI Shortlist algorithm. Default value is "FALSE". Only considered for proband.

Files Names (field details): Conditionally mandatory. An empty sample will be created if the field is left blank. The "existing" option automatically locates FASTQ files based on the BioSample Name. Note: If data files for an existing case were sourced from the customer's external bucket and later removed, attempting to create a case from those files will result in an error. With the "auto" option, BSSH users can automatically locate FASTQ files based on the BioSample Name and Default Project provided. When using BSSH without the "auto" option, ensure that your file path is formatted correctly.

Opt In (field details): Optional. Indicates whether the case subject consented to the extended sharing of data with your network(s). Default value is "TRUE".