⚙️App Settings

Describes the controls on the Input Form and their function


Item name	Description	Choices	Default	Required
Save Results To	Project to run the analysis in			Required
Input Type	This app accepts either samples or a project as input. Samples: select up to 60 individual samples from any project(s). Project: select a single project containing up to 1536 samples; the app analyzes every FASTQ sample in that project (FASTQ datasets with QcStatus=QcFailed are excluded).	Biosamples, Project	Biosamples	Required
Input Biosample	Select one or more samples to analyze. Select either Input Biosample or Input Project, not both.			Required if Input Type is set to 'Samples'
Input Project	Select a Project containing up to 1536 samples to be analyzed. The analysis will process all samples from that project (FASTQ datasets with QcStatus=QcFailed will be excluded). There is currently no way to filter specific samples from a project. If the project contains more than 1536 Biosamples, the app will appear to launch, but then will immediately exit.			Required if Input Type is set to 'Project'
Experiment Type	This app can analyze samples generated from enrichment or amplicon sequencing experiments. Either can be selected - not both.	Enrichment, Amplicon	Enrichment	Required
Enrichment Panel	Select the enrichment panel used to generate the data. This determines the set of reference genomes the app uses, so different selections produce different results. Choose 'Custom' to provide your own reference genomes below.	Viral Surveillance Panel (VSP); Pan-Coronavirus Panel (Pan-Cov); Respiratory Virus Oligo Panel (RVOP); Custom		Required if Experiment Type is set to 'Enrichment'
Amplicon Primer Set	Select the virus genome to align to and the primer set used to generate the data. Primer locations determine primer trimming locations and amplicon definitions, so different selections produce different results. If processing SARS-CoV-2 data from a non-amplicon protocol, choose 'SARS-CoV-2, no primers'. Choose 'Custom' to provide your own reference genomes and primer set below.	SARS-CoV-2, ARTIC v5.3.2 primers; SARS-CoV-2, ARTIC v4.1 primers; SARS-CoV-2, ARTIC v4 primers; SARS-CoV-2, ARTIC v3 primers; SARS-CoV-2, no primers; Influenza A, Universal primers; Influenza B, Universal primers; Influenza A and B, Universal primers; Chikungunya Virus, Grubaugh Lab primers; Chikungunya Virus, Illumina primers; Dengue Virus Serotype 1 (DENV1), 400-bp DengueSeq primers; Dengue Virus Serotype 1 (DENV1), Illumina primers; Monkeypox Virus (MPXV) Clade II, Grubaugh Lab primers; Respiratory Syncytial Virus (RSV), CDC primers; Respiratory Syncytial Virus (RSV), WCCRRI primers; Zika Virus, Grubaugh Lab primers; Custom		Required if Experiment Type is set to 'Amplicon'
Custom Reference: Custom Reference FASTA For Consensus Generation	Provide a custom reference FASTA to use for consensus generation. Either Enrichment Panel or Amplicon Primer Set must be set to 'Custom' to enable this field. Sequence names must be unique and must not contain spaces. If the FASTA header contains a space, the part before the first space is taken as the sequence name. Sequence names should use only letters, numbers, underscore (`_`), hyphen (`-`), parentheses (`(`,`)`), and period (`.`); other characters may cause the names to appear differently in the output. Keep sequence names short (e.g. NC_045512.2). If needed, full names can be provided in the genomeName column of Reference BED below. The FASTA file name must not include spaces, must not exceed 25 characters, and must use the extension .fasta or .fa.			Required if either Enrichment Panel or Amplicon Primer Set is set to 'Custom'
Custom Reference: Custom Reference BED	Provide a custom reference BED to describe each sequence in Custom Reference FASTA. See Genome definition BED file format			Optional if Enrichment Panel or Amplicon Primer Set is set to 'Custom'. Otherwise not applicable
Custom Reference: Custom PCR Primer Definitions	Provide a file defining primers used in amplicon sequencing. See Primer definition file formats			Optional if Amplicon Primer Set is set to 'Custom'. Otherwise not applicable
Custom Reference: NextClade Datasets	Select one or more available NextClade Datasets from the drop-down menu below. Hold the Ctrl/Command key to select multiple datasets or to deselect one.			Optional if either Enrichment Panel or Amplicon Primer Set is set to 'Custom'. Otherwise not applicable
Pangolin	Run Pangolin on applicable consensus genomes	True, False	True	Optional if any Enrichment Panel is selected, any SARS-CoV-2 Amplicon Primer Set is selected, or 'Custom' is selected for Enrichment Panel or Amplicon Primer Set. Otherwise not applicable
NextClade	Run NextClade on applicable consensus genomes. If providing a Custom Reference, select NextClade Datasets above to enable.	True, False	True	Optional if any Enrichment Panel is selected, if a genome with NextClade dataset available is selected for Amplicon Primer Set, or if 'Custom' is selected for Enrichment Panel or Amplicon Primer Set. Otherwise not applicable
Advanced Workflow Settings: Dehost	If checked, all human reads are scrubbed from the input FASTQs before the Map/Align stage, so that the output BAM includes only viral reads.	True, False	True	Required
Advanced Workflow Settings: Trim Consensus Sequences	Remove any leading and trailing masked nucleotides from the resulting consensus sequences. Does not affect internal masked regions.	True, False	True	Required
Advanced Workflow Settings: Minimum percentage of amplicons with at least 90% coverage ≥ 1x to enable variant calling and consensus sequence generation	At low input concentrations, errors produced by the reverse transcriptase enzyme can propagate to high frequencies, leading to false positive sequence variants. Therefore, we attempt to infer the sample concentration from the amplicon coverage using this metric. If you wish to adjust this, we advise conducting internal studies to examine variant call reproducibility between replicates to determine a threshold that will produce acceptable quality levels for your application. Only applicable to amplicon sequencing where primers are defined. See Special considerations for amplicon sequencing with IMAP protocols		80.0%	Required if Experiment Type is set to 'Amplicon'
Advanced Workflow Settings: Minimum read coverage depth for consensus sequence generation	Genomic positions with read coverage below this threshold will be considered indeterminate and hard-masked in the final consensus sequence		10	Required
Advanced Workflow Settings: Minimum percentage of consensus sequence generated to label as confident	Consensus sequences with percentage of callable bases below this threshold will be considered 'low confidence'. Callability is defined based on minimum coverage depth for consensus sequence generation (above)		5.0%	Required
Additional DRAGEN Command Line Arguments: Additional DRAGEN Map/Align Command Line Arguments	USE AT YOUR OWN RISK. This field allows the user to add any DRAGEN command line argument, which can cause DRAGEN to crash, fail, or hang; run for a very long time; or generate unexpected or invalid results. The app appends this input text to the DRAGEN command line after removing invalid characters (valid characters are alphanumeric plus `._-"'`). Note that there is no validation of the contents. If you use this field and the appsession aborts, the output*.log appsession log file may help to understand the cause of the failure.			Optional
Additional DRAGEN Command Line Arguments: Additional DRAGEN Variant Calling (Somatic) Command Line Arguments	USE AT YOUR OWN RISK. This field allows the user to add any DRAGEN command line argument, which can cause DRAGEN to crash, fail, or hang; run for a very long time; or generate unexpected or invalid results. The app appends this input text to the DRAGEN command line after removing invalid characters (valid characters are alphanumeric plus `._-"'`). Note that there is no validation of the contents. If you use this field and the appsession aborts, the output*.log appsession log file may help to understand the cause of the failure.			Optional
Organisms to Report (VSP)	Only the checked organisms will be reported (consensus sequences and metrics). This will not affect the underlying bioinformatics pipeline, only which outputs are provided.		All VSP organisms	Optional if Enrichment Panel is set to 'VSP'. Otherwise, not applicable
Organisms to Report (RVOP)	Only the checked organisms will be reported (consensus sequences and metrics). This will not affect the underlying bioinformatics pipeline, only which outputs are provided.		All RVOP organisms	Optional if Enrichment Panel is set to 'RVOP'. Otherwise, not applicable
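
The naming rules listed for Custom Reference FASTA can be checked locally before uploading a file. The following is a minimal sketch of such a pre-flight check, not the app's own validation: the function names are hypothetical, and it assumes the 25-character limit counts the file extension, which the table does not state.

```python
import re

# Characters the settings table recommends for sequence names:
# letters, numbers, underscore, hyphen, parentheses, and period.
RECOMMENDED_NAME = re.compile(r"^[A-Za-z0-9_\-().]+$")
MAX_FILENAME_LENGTH = 25  # assumption: the limit includes the extension
ALLOWED_EXTENSIONS = (".fasta", ".fa")


def check_fasta_filename(filename):
    """Return a list of problems with the FASTA file name (empty if none)."""
    problems = []
    if " " in filename:
        problems.append("file name contains a space")
    if len(filename) > MAX_FILENAME_LENGTH:
        problems.append("file name exceeds 25 characters")
    if not filename.endswith(ALLOWED_EXTENSIONS):
        problems.append("file name must end in .fasta or .fa")
    return problems


def check_sequence_names(fasta_text):
    """Return a list of problems with the sequence names in FASTA text."""
    problems = []
    seen = set()
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            continue  # sequence line, not a header
        # The part of the header before the first space is the sequence name.
        name = line[1:].split(" ", 1)[0]
        if not name:
            problems.append("empty sequence name")
            continue
        if name in seen:
            problems.append(f"duplicate sequence name: {name}")
        seen.add(name)
        if not RECOMMENDED_NAME.match(name):
            problems.append(f"name has non-recommended characters: {name}")
    return problems
```

For example, `check_fasta_filename("my ref.fasta")` reports the space in the file name, and a FASTA with two `>NC_045512.2` headers is reported as containing a duplicate sequence name.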

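The three consensus-related settings (minimum read coverage depth, Trim Consensus Sequences, and the minimum percentage of consensus sequence generated) interact, and a small worked example may help. This is an illustrative sketch only, not the app's implementation: the function names are hypothetical, the mask character is assumed to be `N`, and the callable percentage is assumed to be computed over the untrimmed sequence.

```python
def mask_low_coverage(bases, depths, min_depth=10, mask="N"):
    """Hard-mask every position whose read depth is below min_depth."""
    return "".join(b if d >= min_depth else mask for b, d in zip(bases, depths))


def trim_masked_ends(consensus, mask="N"):
    """Remove leading and trailing masked bases; internal masks are kept."""
    return consensus.strip(mask)


def percent_callable(masked_consensus, mask="N"):
    """Percentage of positions that are not masked."""
    if not masked_consensus:
        return 0.0
    called = sum(1 for b in masked_consensus if b != mask)
    return 100.0 * called / len(masked_consensus)


def confidence_label(masked_consensus, min_percent=5.0):
    """Label a consensus as confident or low confidence (labels are assumed)."""
    if percent_callable(masked_consensus) >= min_percent:
        return "confident"
    return "low confidence"


bases = "ACGTACGTAC"
depths = [0, 3, 12, 15, 9, 20, 30, 11, 2, 0]

masked = mask_low_coverage(bases, depths)   # "NNGTNCGTNN"
trimmed = trim_masked_ends(masked)          # "GTNCGT"
print(masked, trimmed, percent_callable(masked), confidence_label(masked))
```

With the default depth of 10, five of the ten positions remain callable (50%), the internal masked base survives trimming, and the sequence is above the 5.0% default for the confidence label.
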

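The amplicon setting "Minimum percentage of amplicons with at least 90% coverage ≥ 1x" can be read as a two-level threshold. The sketch below shows one plausible reading and is not the app's actual computation: it assumes an amplicon passes when at least 90% of its positions have depth of 1 or more, and that variant calling and consensus generation proceed when the share of passing amplicons meets the configured value (default 80.0%). All function names are hypothetical.

```python
def amplicon_passes(depths, min_fraction=0.90, min_depth=1):
    """True if at least min_fraction of the amplicon's positions are covered."""
    if not depths:
        return False
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths) >= min_fraction


def percent_amplicons_covered(amplicon_depths):
    """Percentage of amplicons that pass the per-amplicon coverage check."""
    if not amplicon_depths:
        return 0.0
    passing = sum(1 for depths in amplicon_depths if amplicon_passes(depths))
    return 100.0 * passing / len(amplicon_depths)


def variant_calling_enabled(amplicon_depths, threshold=80.0):
    """Compare the sample-level percentage against the configured threshold."""
    return percent_amplicons_covered(amplicon_depths) >= threshold


# Five toy amplicons of ten positions each: four fully covered, one half covered.
sample = [[5] * 10] * 4 + [[0] * 5 + [5] * 5]
print(percent_amplicons_covered(sample), variant_calling_enabled(sample))
```

Here four of five amplicons pass, so the sample sits exactly at the 80.0% default and would be processed under this reading; raising the threshold would exclude it.
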

Last updated 6 months ago