VCF Files

VCF is a text file format that contains information about variants found at specific positions in a reference genome. The file format consists of meta-information lines, a header line, and then data lines. Each data line contains information about a single variant.

Use VCF files for direct interpretation or as a starting point for further analysis with downstream tools that are compatible with VCF, such as IGV or the UCSC Genome Browser. Do not use VCF files with tools that are not compatible with the VCF format, such as Outlook.

If you use a BaseSpace Sequence Hub app that uses VCF files as input, the app locates the file when launched. To use a VCF file in an external tool, download the file first.

File Format

Some of the following information about the VCF format may be outdated for our newer apps. Refer to the DRAGEN user guide for up-to-date information.

The file naming convention for VCF files is as follows: SampleName_S#.vcf (where # is the sample number determined by ordering in the sample sheet).
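
As an illustration of the naming convention, the following sketch splits a file name into its sample name and sample number. It assumes the sample number is a run of decimal digits; the function name `parse_vcf_filename` and the example file name are hypothetical and not part of BaseSpace Sequence Hub.

```python
import re

# Pattern for the documented convention SampleName_S#.vcf.
# Assumption: "#" is one or more decimal digits.
VCF_NAME = re.compile(r"^(?P<sample_name>.+)_S(?P<sample_number>\d+)\.vcf$")


def parse_vcf_filename(filename):
    """Return (sample_name, sample_number), or None if the name does not match."""
    match = VCF_NAME.match(filename)
    if match is None:
        return None
    return match.group("sample_name"), int(match.group("sample_number"))


# Hypothetical file name, used only for illustration.
print(parse_vcf_filename("NA12878_S1.vcf"))
```

Because the sample-name group is greedy, a sample name that itself contains "_S" is kept whole and only the final "_S#" is read as the sample number.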

The header of the VCF file describes the tags used in the remainder of the file and has the column header:

##fileformat=VCFv4.1
##fileDate=20120317
##source=SequenceAnalysisReport.vshost.exe
##reference=
##phasing=none
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##INFO=<ID=TI,Number=.,Type=String,Description="Transcript ID">
##INFO=<ID=GI,Number=.,Type=String,Description="Gene ID">
##INFO=<ID=CD,Number=0,Type=Flag,Description="Coding Region">
##FILTER=<ID=q20,Description="Quality below 20">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
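
To show how the header lines can be read programmatically, the following sketch collects the INFO, FILTER, and FORMAT definitions and the column names from a shortened copy of the example header. The regular expressions are a simplification that assumes descriptions contain no embedded double quotes; `parse_header` is a hypothetical helper, not an Illumina API. For production use, an established VCF library is likely a better choice.

```python
import re

# Shortened copy of the example header shown above.
HEADER = """##fileformat=VCFv4.1
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##INFO=<ID=CD,Number=0,Type=Flag,Description="Coding Region">
##FILTER=<ID=q20,Description="Quality below 20">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE"""

# Matches structured meta-information lines such as ##INFO=<...>.
STRUCTURED = re.compile(r"^##(INFO|FILTER|FORMAT)=<(.*)>$")
# Matches key=value pairs; a value is either quoted or runs to the next comma.
PAIR = re.compile(r'(\w+)=("[^"]*"|[^,]*)')


def parse_header(text):
    """Return (definitions, columns) parsed from VCF header text."""
    definitions = {"INFO": {}, "FILTER": {}, "FORMAT": {}}
    columns = []
    for line in text.splitlines():
        match = STRUCTURED.match(line)
        if match:
            kind, body = match.groups()
            fields = {key: value.strip('"') for key, value in PAIR.findall(body)}
            definitions[kind][fields["ID"]] = fields
        elif line.startswith("#CHROM"):
            # The column header line starts with a single "#".
            columns = line.lstrip("#").split()
    return definitions, columns


definitions, columns = parse_header(HEADER)
print(definitions["INFO"]["DP"]["Description"])
print(columns)
```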

A sample line of the VCF file, with the data that is used to populate each column described below:

chr22 16285888 rs76548004 T C 17 d15;q20 DP=11;TI=NM_001136213;GI=POTEH;CD GT:GQ 1/0:17
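
As a minimal sketch, a data line like the sample above can be split into its ten columns as follows. The code assumes a single sample column and whitespace-separated fields (VCF files are tab-delimited, which `split()` also handles); `parse_data_line` is a hypothetical helper.

```python
COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT",
           "QUAL", "FILTER", "INFO", "FORMAT", "SAMPLE"]


def parse_data_line(line):
    """Map one VCF data line onto the column names from the header."""
    values = line.split()
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} columns, got {len(values)}")
    record = dict(zip(COLUMNS, values))
    record["POS"] = int(record["POS"])
    # QUAL can be the missing value "." in general VCF files.
    record["QUAL"] = None if record["QUAL"] == "." else float(record["QUAL"])
    return record


line = ("chr22 16285888 rs76548004 T C 17 d15;q20 "
        "DP=11;TI=NM_001136213;GI=POTEH;CD GT:GQ 1/0:17")
record = parse_data_line(line)
print(record["CHROM"], record["POS"], record["REF"], record["ALT"])
```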

ALT: The alleles that differ from the reference read. For example, an insertion of a single T could show reference A and alternate AT.

CHROM: The chromosome of the reference genome. Chromosomes appear in the same order as in the reference FASTA file (generally karyotype order).

FILTER: If all filters are passed, 'PASS' is written. The possible filters are as follows:

  • q20 – The variant score is less than 20. (Configurable using the VariantFilterQualityCutoff setting in the config file.)

  • r8 – For an indel, the number of repeats in the reference (of a 1- or 2-base repeat) is greater than 8. (Configurable using the IndelRepeatFilterCutoff setting in the config file.)
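
The FILTER column can be interpreted with a few lines of code. This sketch assumes that multiple failed filters are separated by semicolons, as in the sample line (d15;q20); `failed_filters` is a hypothetical helper.

```python
def failed_filters(filter_field):
    """Return the list of filters a variant failed; empty if it passed all."""
    if filter_field == "PASS":
        return []
    return filter_field.split(";")


print(failed_filters("d15;q20"))
print(failed_filters("PASS"))
```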

FORMAT: The format column lists fields (separated by colons), for example, "GT:GQ". The list of fields provided depends on the variant caller used. The available fields are as follows:

  • AD – Entry of the form X,Y where X is the number of reference calls, Y the number of alternate calls

  • GQ – Genotype quality

  • GT – Genotype. 0 corresponds to the reference base, 1 corresponds to the first entry in the ALT column, 2 corresponds to the second entry in the ALT column, etc. The '/' indicates that there is no phasing information.

  • NL – Noise level; an estimate of base calling noise at this position

  • SB – Strand bias at this position. Larger negative values indicate less bias; values near zero indicate more strand bias.

  • VF – Variant frequency. The percentage of reads supporting the alternate allele.
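
The FORMAT column names the fields, and the SAMPLE column holds the matching values in the same order. The following sketch pairs the two and translates a GT value into alleles. It assumes an unphased, fully called genotype (no missing '.' alleles) and comma-separated ALT entries; both function names are hypothetical.

```python
def parse_sample(format_field, sample_field):
    """Pair FORMAT keys with SAMPLE values, for example GT:GQ with 1/0:17."""
    return dict(zip(format_field.split(":"), sample_field.split(":")))


def genotype_alleles(gt, ref, alt):
    """Translate GT indexes into bases: 0 is REF, 1 is the first ALT entry, and so on."""
    alleles = [ref] + alt.split(",")
    return [alleles[int(index)] for index in gt.split("/")]


sample = parse_sample("GT:GQ", "1/0:17")
print(sample)
print(genotype_alleles(sample["GT"], ref="T", alt="C"))
```

Values are kept as strings here; converting GQ or other numeric fields is left to the caller because the available fields depend on the variant caller.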

ID: The rs number for the SNP obtained from dbSNP. If there are multiple rs numbers at this location, the list is semicolon-delimited. If no dbSNP entry exists at this position, the missing value ('.') is used.

INFO: The possible entries in the INFO column:

  • AD – Entry of the form X,Y where X is the number of reference calls, Y the number of alternate calls.

  • CD – A flag indicating that the SNP occurs within the coding region of at least one RefGene entry

  • DP – The depth (number of base calls aligned to this position)

  • GI – A comma-separated list of gene IDs read from RefGene

  • NL – Noise level; an estimate of base calling noise at this position.

  • TI – A comma-separated list of transcript IDs read from RefGene

  • SB – Strand bias at this position.

  • VF – Variant frequency. The percentage of reads supporting the alternate allele.
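
The INFO column is a semicolon-separated list in which most entries have the form key=value and flags such as CD appear as a bare key. A minimal sketch of reading it, using the hypothetical helper `parse_info`:

```python
def parse_info(info_field):
    """Parse an INFO string such as DP=11;TI=NM_001136213;GI=POTEH;CD."""
    info = {}
    for entry in info_field.split(";"):
        key, separator, value = entry.partition("=")
        # Entries without "=" are flags; record them as True.
        info[key] = value if separator else True
    return info


info = parse_info("DP=11;TI=NM_001136213;GI=POTEH;CD")
print(info)
print(int(info["DP"]), "CD" in info)
```

Values are left as strings. Comma-separated entries such as TI and GI can be split further with `value.split(",")` if a list is needed.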

POS: The 1-based position of this variant in the reference chromosome. The convention for VCF files is that, for SNPs, this base is the reference base with the variant. For insertions or deletions, this base is the reference base immediately before the variant. Variants are in order of position.

QUAL: A Phred-scaled quality score assigned by the variant caller. Higher scores indicate higher confidence in the variant (and lower probability of errors). For a quality score of Q, the estimated probability of an error is 10^(-Q/10). For example, the set of Q30 calls has a 0.1% error rate. Many variant callers assign quality scores (based on their statistical models) which are high relative to the error rate observed in practice.
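
The relationship between a Phred-scaled score Q and the estimated error probability, 10 raised to the power -Q/10, can be checked with a short calculation. The function names below are hypothetical.

```python
import math


def phred_to_error_probability(q):
    """Estimated probability that a call with quality score q is wrong."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Inverse conversion: Phred-scaled score for an error probability p."""
    return -10 * math.log10(p)


for q in (17, 20, 30):
    print(f"Q{q}: {phred_to_error_probability(q):.4%} estimated error rate")
```

As noted above, these are model-based estimates; the error rate observed in practice can be higher than the score suggests.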

REF: The reference genotype. For example, a deletion of a single T can read reference TT and alternate T.
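
Taken together, the REF and ALT examples suggest a simple way to tell variant types apart by comparing allele lengths. This sketch assumes a single ALT allele per call; `classify_variant` is a hypothetical helper and not how any particular variant caller labels its output.

```python
def classify_variant(ref, alt):
    """Classify a single REF/ALT pair by comparing allele lengths."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    return "other"


print(classify_variant("T", "C"))    # sample line: T to C
print(classify_variant("A", "AT"))   # insertion of a single T
print(classify_variant("TT", "T"))   # deletion of a single T
```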

SAMPLE: The sample column gives the values specified in the FORMAT column. One MAXGT sample column is provided for the normal genotyping (assuming the reference). For reference, a second column is provided for genotyping assuming the site is polymorphic. See the Starling documentation for more details.

Variant files for Isaac also contain off-target variant calls, with filter.

Last updated 1 year ago