Describes the controls on the Input Form and their functions.
Item name | Description | Choices | Default | Required |
---|---|---|---|---|
Save Results To | Project to run the analysis in. | | | Required |
Input Type | This app can accept samples or a project as input.<br>Biosamples: Select up to 60 individual samples, from any project(s).<br>Project: Select a single project containing up to 1536 samples. The app will analyze every FASTQ sample in that project (FASTQ datasets with QcStatus=QcFailed will be excluded). | Biosamples, Project | Biosamples | Required |
Input Biosample | Select one or more samples to analyze. Either Input Biosample or Input Project can be selected, not both. | | | Required if Input Type is set to 'Biosamples' |
Input Project | Select a project containing up to 1536 samples to be analyzed. The analysis will process all samples from that project (FASTQ datasets with QcStatus=QcFailed will be excluded). There is currently no way to filter specific samples from a project. If the project contains more than 1536 biosamples, the app will appear to launch but will then exit immediately. | | | Required if Input Type is set to 'Project' |
Experiment Type | This app can analyze samples generated from enrichment or amplicon sequencing experiments. Select one, not both. | Enrichment, Amplicon | Enrichment | Required |
Enrichment Panel | Select the enrichment panel used to generate the data. This determines the set of reference genomes the app uses, so different selections will produce different results. Choose 'Custom' to provide your own reference genomes below. | Viral Surveillance Panel (VSP)<br>Pan-Coronavirus Panel (Pan-Cov)<br>Respiratory Virus Oligo Panel (RVOP)<br>Custom | | Required if Experiment Type is set to 'Enrichment' |
Amplicon Primer Set | Select the virus genome to align to and the primer set used to generate the data. Primer locations determine primer trimming locations and amplicon definitions. If processing SARS-CoV-2 data from a non-amplicon protocol, choose 'SARS-CoV-2, no primers'. Different selections will produce different results. Choose 'Custom' to provide your own reference genomes and primer set below. | SARS-CoV-2, ARTIC v5.3.2 primers<br>SARS-CoV-2, ARTIC v4.1 primers<br>SARS-CoV-2, ARTIC v4 primers<br>SARS-CoV-2, ARTIC v3 primers<br>SARS-CoV-2, no primers<br>Influenza A, Universal primers<br>Influenza B, Universal primers<br>Influenza A and B, Universal primers<br>Chikungunya Virus, Grubaugh Lab primers<br>Chikungunya Virus, Illumina primers<br>Dengue Virus Serotype 1 (DENV1), 400-bp DengueSeq primers<br>Dengue Virus Serotype 1 (DENV1), Illumina primers<br>Monkeypox Virus (MPXV) Clade II, Grubaugh Lab primers<br>Respiratory Syncytial Virus (RSV), CDC primers<br>Respiratory Syncytial Virus (RSV), WCCRRI primers<br>Zika Virus, Grubaugh Lab primers<br>Custom | | Required if Experiment Type is set to 'Amplicon' |
Custom Reference: Custom Reference FASTA For Consensus Generation | Provide a custom reference FASTA to use for consensus generation. Either Enrichment Panel or Amplicon Primer Set must be set to 'Custom' to enable this field.<br>- Sequence names must be unique and must not contain spaces. If the FASTA header contains a space, the part before the first space is taken as the sequence name.<br>- It is recommended to use only the following in sequence names: letters, numbers, underscore (`_`), hyphen (`-`), parentheses (`(`, `)`), and period (`.`). Otherwise, the sequence names may appear different in the output.<br>- It is recommended to keep sequence names short (e.g. NC_045512.2). If needed, full names can be provided in the genomeName column of the Reference BED below.<br>- The FASTA file name must not include spaces, must not exceed 25 characters, and must use the extension .fasta or .fa. | | | Required if either Enrichment Panel or Amplicon Primer Set is set to 'Custom' |
Custom Reference: Custom Reference BED | Provide a custom reference BED to describe each sequence in the Custom Reference FASTA. See Genome definition BED file format. | | | Optional if Enrichment Panel or Amplicon Primer Set is set to 'Custom'. Otherwise not applicable |
Custom Reference: Custom PCR Primer Definitions | Provide a file defining the primers used in amplicon sequencing. See Primer definition file formats. | | | Optional if Amplicon Primer Set is set to 'Custom'. Otherwise not applicable |
Custom Reference: NextClade Datasets | Select one or more available NextClade Datasets from the drop-down menu. Hold the Ctrl/Command key to select multiple datasets or to deselect. | | | Optional if either Enrichment Panel or Amplicon Primer Set is set to 'Custom'. Otherwise not applicable |
Pangolin | Run Pangolin on applicable consensus genomes. | True, False | True | Optional if any Enrichment Panel is selected, any SARS-CoV-2 Amplicon Primer Set is selected, or 'Custom' is selected for Enrichment Panel or Amplicon Primer Set. Otherwise not applicable |
NextClade | Run NextClade on applicable consensus genomes. If providing a Custom Reference, select NextClade Datasets above to enable NextClade; otherwise it is not applicable. | True, False | True | Optional if any Enrichment Panel is selected, if a genome with a NextClade dataset available is selected for Amplicon Primer Set, or if 'Custom' is selected for Enrichment Panel or Amplicon Primer Set. Otherwise not applicable |
Advanced Workflow Settings: Dehost | If checked, input FASTQs are scrubbed of all human reads before the Map/Align stage, so that the output BAM includes only viral reads. | True, False | True | Required |
Advanced Workflow Settings: Trim Consensus Sequences | Remove any leading and trailing masked nucleotides from the resulting consensus sequences. Does not affect internal masked regions. | True, False | True | Required |
Advanced Workflow Settings: Minimum percentage of amplicons with at least 90% coverage ≥ 1x to enable variant calling and consensus sequence generation | At low input concentrations, errors produced by the reverse transcriptase enzyme can propagate to high frequencies, leading to false positive sequence variants. The app therefore attempts to infer the sample concentration from the amplicon coverage using this metric. If you wish to adjust the threshold, we advise conducting internal studies of variant call reproducibility between replicates to determine a value that produces acceptable quality levels for your application. Only applicable to amplicon sequencing where primers are defined. See Special considerations for amplicon sequencing with IMAP protocols. | | 80.0% | Required if Experiment Type is set to 'Amplicon' |
Advanced Workflow Settings: Minimum read coverage depth for consensus sequence generation | Genomic positions with read coverage below this threshold are considered indeterminate and are hard-masked in the final consensus sequence. | | 10 | Required |
Advanced Workflow Settings: Minimum percentage of consensus sequence generated to label as confident | Consensus sequences with a percentage of callable bases below this threshold are considered 'low confidence'. Callability is defined by the minimum coverage depth for consensus sequence generation (above). | | 5.0% | Required |
Additional DRAGEN Command Line Arguments: Additional DRAGEN Map/Align Command Line Arguments | USE AT YOUR OWN RISK. This field allows the user to add any DRAGEN command line argument, which can cause DRAGEN to:<br>- Crash/fail/hang<br>- Run for a very long time<br>- Generate unexpected or invalid results<br>The app appends this input text to the DRAGEN command line after removing invalid characters (valid characters are alphanumeric plus `._-"'`). Note that the contents are not validated. If you use this field and the appsession aborts, the `output*.log` appsession log file may help identify the cause of the failure. | | | Optional |
Additional DRAGEN Command Line Arguments: Additional DRAGEN Variant Calling (Somatic) Command Line Arguments | USE AT YOUR OWN RISK. This field allows the user to add any DRAGEN command line argument, which can cause DRAGEN to:<br>- Crash/fail/hang<br>- Run for a very long time<br>- Generate unexpected or invalid results<br>The app appends this input text to the DRAGEN command line after removing invalid characters (valid characters are alphanumeric plus `._-"'`). Note that the contents are not validated. If you use this field and the appsession aborts, the `output*.log` appsession log file may help identify the cause of the failure. | | | Optional |
Organisms to Report (VSP) | Only the checked organisms will be reported (consensus sequences and metrics). This does not affect the underlying bioinformatics pipeline, only which outputs are provided. | | All VSP organisms | Optional if Enrichment Panel is set to 'VSP'. Otherwise not applicable |
Organisms to Report (RVOP) | Only the checked organisms will be reported (consensus sequences and metrics). This does not affect the underlying bioinformatics pipeline, only which outputs are provided. | | All RVOP organisms | Optional if Enrichment Panel is set to 'RVOP'. Otherwise not applicable |
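
The naming rules listed for the Custom Reference FASTA entry above can be checked before upload. The following is a minimal sketch, not part of the app: the function names `sequence_names` and `check_custom_fasta` are hypothetical, and it assumes the 25-character limit applies to the whole file name including the extension. Violations of the "must" rules are returned as errors, and departures from the recommended character set as warnings.

```python
import re

# Characters recommended for sequence names: letters, digits, _ - ( ) .
RECOMMENDED_NAME = re.compile(r"[A-Za-z0-9_\-().]+")


def sequence_names(fasta_text):
    """Return each record's sequence name: the header text before the first whitespace."""
    names = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            names.append(re.split(r"\s", line[1:], maxsplit=1)[0])
    return names


def check_custom_fasta(file_name, fasta_text):
    """Return (errors, warnings) for a custom reference FASTA.

    Assumption: the 25-character limit counts the full file name,
    extension included.
    """
    errors, warnings = [], []

    # File-name rules: no spaces, at most 25 characters, .fasta or .fa extension.
    if any(ch.isspace() for ch in file_name):
        errors.append("file name contains a space")
    if len(file_name) > 25:
        errors.append("file name exceeds 25 characters")
    if not file_name.endswith((".fasta", ".fa")):
        errors.append("file name must end in .fasta or .fa")

    # Sequence-name rules: present, non-empty, unique.
    names = sequence_names(fasta_text)
    if not names:
        errors.append("no FASTA records found")
    if any(name == "" for name in names):
        errors.append("a FASTA header has an empty sequence name")
    duplicates = sorted({n for n in names if n and names.count(n) > 1})
    if duplicates:
        errors.append("duplicate sequence names: " + ", ".join(duplicates))

    # Recommended (not required) character set.
    for name in names:
        if name and not RECOMMENDED_NAME.fullmatch(name):
            warnings.append(f"sequence name {name!r} uses characters outside the recommended set")

    return errors, warnings
```

For example, `check_custom_fasta("my_ref.fasta", ">NC_045512.2 description\nACGT\n")` returns two empty lists, because only `NC_045512.2` is treated as the sequence name.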
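
The amplicon coverage setting above ("Minimum percentage of amplicons with at least 90% coverage ≥ 1x") can be read as a two-level threshold. The sketch below is one possible interpretation, not the app's implementation. It assumes per-base depths are available as a list, that amplicons are given as 0-based, end-exclusive intervals, and that an amplicon passes when at least 90% of its positions have depth of at least 1. The function names are hypothetical.

```python
def percent_amplicons_covered(depths, amplicons, min_fraction=0.90, min_depth=1):
    """Percentage of amplicons whose covered fraction reaches min_fraction.

    depths:    per-base read depth along the reference
    amplicons: list of (start, end) intervals, 0-based and end-exclusive (assumed)
    """
    if not amplicons:
        return 0.0
    passing = 0
    for start, end in amplicons:
        window = depths[start:end]
        covered = sum(1 for d in window if d >= min_depth)
        if window and covered / len(window) >= min_fraction:
            passing += 1
    return 100.0 * passing / len(amplicons)


def variant_calling_enabled(percent_covered, threshold=80.0):
    """Apply the form's default threshold of 80.0%."""
    return percent_covered >= threshold


# Four 10-base amplicons: fully covered, 9/10 covered, 8/10 covered, fully covered.
depths = [5] * 10 + [5] * 9 + [0] + [5] * 8 + [0, 0] + [5] * 10
amplicons = [(0, 10), (10, 20), (20, 30), (30, 40)]
pct = percent_amplicons_covered(depths, amplicons)
print(pct, variant_calling_enabled(pct))
```

In this toy input three of the four amplicons reach 90% coverage, so the metric is 75%, which falls below the 80.0% default and would leave variant calling and consensus generation disabled under this reading.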