This walk-through is meant to represent a typical workflow when building and studying a cohort of rare genetic disorder cases.
Create a new Project to track your study:
Log in to ICA
Navigate to Projects
Create a new project using the New Project button.
Give your project a name and click Save.
Navigate to the ICA Cohorts module by clicking COHORTS in the left navigation panel, then choose Cohorts.
Navigate to the ICA Cohorts module by clicking Cohorts in the left navigation panel.
Click the Create Cohort button.
Enter a name for your cohort, like Rare Disease + 1kGP, at the top, to the left of the pencil icon.
From the Public Data Sets list, select:
DRAGEN-1kGP
All Rare genetic disease cohorts
Notice that a cohort can also be created based on Technology, Disease Type, and Tissue.
Under Selected Conditions in the right panel, click Apply.
A new page opens with your cohort in a top-level tab.
Expand Query Details to see the study makeup of your cohort.
A set of 4 Charts will be open by default. If they are not, click Show Charts.
Use the gear icon in the top-right of the Charts pane to change chart settings.
The bottom section is demarcated by 8 tabs (Subjects, Marker Frequency, Genes, GWAS, PheWAS, Correlation, Molecular Breakdown, CNV).
The Subjects tab displays a list of exportable Subject IDs and attributes.
Clicking a Subject ID link opens a pop-up Subject details page.
A recent GWAS publication identified 10 risk genes for intellectual disability (ID) and autism. Our task is to evaluate them in ICA Cohorts: TTN, PKHD1, ANKRD11, ARID1B, ASXL3, SCN2A, FHL1, KMT2A, DDX3X, SYNGAP1.
First, click Hide charts for more visual space.
Click the Genes tab, where you need to query a gene to see and interact with results.
Type SCN2A into the Gene search field and select it from the autocomplete dropdown options.
The Gene Summary tab now lists information and links to public resources about SCN2A.
Click on the Variants tab to see an interactive Legend and analysis tracks.
The Needle Plot displays gnomAD Allele Frequency for variants in your cohort.
Note that some are in SCN2A conserved protein domains.
In Legend, switch the Plot by option to Sample Count in your cohort.
In Legend, uncheck all Variant Types except Stop gained. Now you should see 7 variants.
Hover over pin heads to see pop-up information about particular variants.
The Primate AI track displays Scores for potential missense variants, based on polymorphisms observed in primate species. Points above the dashed line for the 75th percentile may be considered "likely pathogenic" because the cross-species sequence is highly conserved; high conservation is often seen at functional domains. Points below the 25th percentile may be considered "likely benign".
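The percentile rule described for the Primate AI track can be sketched offline in Python. This is a minimal illustration, not ICA's implementation: it assumes the 25th and 75th percentile thresholds are computed from whatever list of scores you supply, the function name `classify_scores` and the "unclassified" label for the middle band are hypothetical, and the example scores are placeholders rather than real Primate AI output.

```python
from statistics import quantiles


def classify_scores(scores):
    """Label each score against the 25th and 75th percentiles of `scores`."""
    # quantiles(n=4) returns the three quartile cut points; "inclusive"
    # treats the supplied scores as the whole population of interest.
    q25, _median, q75 = quantiles(scores, n=4, method="inclusive")
    labels = []
    for score in scores:
        if score > q75:
            labels.append("likely pathogenic")
        elif score < q25:
            labels.append("likely benign")
        else:
            # Hypothetical label: the walk-through does not name the middle band.
            labels.append("unclassified")
    return q25, q75, labels


# Placeholder scores for illustration only; not real Primate AI output.
example_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
q25, q75, labels = classify_scores(example_scores)
for score, label in zip(example_scores, labels):
    print(f"{score:.1f}\t{label}")
```

In practice the thresholds shown in the track may be derived from a different reference distribution, so treat the labels from this sketch as a rough guide only.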
The Pathogenic variants track displays markers from ClinVar color-coded by variant type. Hover over to see pop-ups with more information.
The Exons track shows mRNA exon boundaries with click and zoom functionality at the ends.
Below the Needle Plot and analysis tracks is a list of "Variants observed in the selected cohort".
The Export Gene Variants table icon is above the legend on the right side.
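Once the gene variants table has been exported, the Stop gained filter applied in the Legend can be reproduced offline. The sketch below assumes the export is a CSV file; the column names `variant`, `consequence`, and `sample_count` are hypothetical, as are the placeholder rows, so adjust them to match the actual export.

```python
import csv
import io


def load_variants(csv_text):
    """Parse exported CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def keep_consequence(rows, consequence="Stop gained"):
    """Return only the rows whose consequence matches, ignoring case."""
    wanted = consequence.strip().lower()
    return [row for row in rows if row["consequence"].strip().lower() == wanted]


# Placeholder rows standing in for an exported table; not real cohort data.
EXAMPLE_EXPORT = """variant,consequence,sample_count
var_A,Stop gained,1
var_B,Missense,3
var_C,Stop gained,2
var_D,Synonymous,5
"""

rows = load_variants(EXAMPLE_EXPORT)
stop_gained = keep_consequence(rows)
total_samples = sum(int(row["sample_count"]) for row in stop_gained)
print(len(stop_gained), "stop-gained variants in", total_samples, "samples")
```

For a real export, read the file contents with `open(path, newline="").read()` and pass the text to `load_variants`.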
Now let's click on the Gene Expression tab to see a Bar chart of 50 normal tissues from GTEx in transcripts per million (TPM). SCN2A is highly expressed in certain Brain tissues, which is consistent with where good markers for intellectual disability and autism would be expected.
As a final exercise in discovering good markers, click on the tab for Genetic Burden Test. The table here associates Phenotypes with Mutations Observed in each Study selected for our cohort, alongside Mutations Expected to derive p-values. Given all considerations above, SCN2A is a good marker for intellectual disability (p < 1.465 × 10^-22) and autism (p < 5.290 × 10^-9).
Continue by checking the other genes of interest from the list above.
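To build intuition for how the Genetic Burden Test turns Mutations Observed and Mutations Expected into a p-value, here is a minimal Python sketch. It assumes a one-sided Poisson test; the walk-through does not state which statistical model ICA Cohorts uses, so this is an illustration of the general idea and is not expected to reproduce the p-values in the table. The function name `poisson_upper_tail` is hypothetical.

```python
import math


def poisson_upper_tail(observed, expected):
    """Return P(X >= observed) for X ~ Poisson(expected).

    `observed` must be a non-negative integer count and `expected` a
    positive mean count. The Poisson model is an assumption of this sketch.
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    if observed <= 0:
        return 1.0

    def log_pmf(k):
        # log of e^-expected * expected^k / k!
        return -expected + k * math.log(expected) - math.lgamma(k + 1)

    if observed <= expected:
        # The tail holds most of the mass, so use the complement of the lower sum.
        lower = sum(math.exp(log_pmf(k)) for k in range(observed))
        return max(0.0, 1.0 - lower)

    # observed > expected: sum the tail directly so that very small
    # p-values are not lost to rounding in 1 - CDF.
    term = math.exp(log_pmf(observed))
    total = 0.0
    k = observed
    while True:
        total += term
        k += 1
        term *= expected / k  # ratio of consecutive Poisson probabilities
        if term <= total * 1e-16:
            break
    return min(total, 1.0)


# Placeholder counts for illustration only; not values from ICA Cohorts.
print(poisson_upper_tail(3, 1.0))
print(poisson_upper_tail(30, 2.0))
```

A large excess of observed over expected mutations yields a very small p-value, which is the pattern to look for when checking each gene of interest.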