# Oncology Walk-through

This walk-through is intended to represent a typical workflow when building and studying a cohort of oncology cases.

{% embed url="<https://www.youtube.com/watch?v=wr19Y8BVaQ4&list=PLKRu7cmBQlaiQT6Giou9aSkZ4C0LMIGbc&index=8>" %}
Multi-Omic Cancer Workflow
{% endembed %}

## Create a Cancer Cohort and View Subject Details

1. Click the `Create Cohort` button.
2. Select the following studies to add to your cohort:
   1. TCGA – BRCA – Breast Invasive Carcinoma
   2. TCGA – Ovarian Serous Cystadenocarcinoma
3. Set the `Cohort Name` to TCGA Breast and Ovarian\_1472.
4. Click `Apply`.
5. Expand `Show query details` to see the study makeup of your cohort.
6. `Charts` will be open by default. If not, click `Show charts`.
7. Use the gear icon in the top-right to change viewable chart settings.

   > Tip: `Disease Type`, `Histological Diagnosis`, `Technology`, and `Overall Survival` have interesting data about this cohort.
8. The `Subject` tab is displayed below the charts. It lists all Subjects, with a link to each Subject by ID and other high-level information, such as the Data Types measured and reported. Clicking a Subject ID brings you to the data collected at the Subject level.
9. Search for subject `TCGA-E2-A14Y` and view the data about this Subject.
10. Click the `TCGA-E2-A14Y` Subject ID link to view clinical data for this Subject that was imported via the metadata.tsv file on ingest.

    > Note: the Subject is a 35-year-old female with vital status and other phenotypes that feed into the `Subject` attribute selection criteria when creating or editing cohorts.
11. Click `X` to close the Subject details.
12. Click `Hide charts` to free up more screen space for the interactive views.

## Data Analysis, Multi-Omic Biomarker Discovery, and Interpretation

1. Click the `Marker Frequency` tab, then click the `Somatic Mutation` tab.
2. Review the gene list and mutation frequencies.
3. Note that PIK3CA has a high rate of mutation in the Cohort (ranked 2nd with 33% mutation frequency in 326 of the 987 Subjects that have Somatic Mutation data in this cohort).
   1. Do Subjects with PIK3CA mutations have changes in PIK3CA RNA Expression?
4. Click the `Gene Expression` tab and search for `PIK3CA`.
   1. PIK3CA RNA is down-regulated in 27% of the Subjects relative to normal samples.
      1. Switch the Reference from `normal` to `disease`, so that the denominator for each Subject is the median of all disease samples in your cohort.
      2. Note the count of matching vs. total Subjects with up-regulated PIK3CA RNA, which may indicate a distinctive sub-phenotype.
5. Click the `PIK3CA` gene link in the `Gene Expression` table.
6. This opens the `Gene` tab at the `Gene Summary` sub-tab, which lists information and links to public resources about PIK3CA.
7. Click the `Variants` tab, then click `Show legend and filters` if the legend is not open by default.
8. Below the interactive legend you see a set of analysis tracks: Needle Plot, Primate AI, Pathogenic variants, and Exons.
9. The Needle Plot allows toggling the plot by `gnomAD frequency` and `Sample Count`. Select `Sample Count` in the `Plot by` legend above the plot.
   1. There are 87 mutations distributed across the 1,068-amino-acid sequence. They are listed below the analysis tracks and can be exported into a table via the icon.
10. We know that missense variants can severely disrupt translated protein activity. Deselect all `Variant Types` except for `Missense` from the `Show Variant Type` legend above the needle plot.
    1. Many mutations are in the functional domains of the protein as seen by the colored boxes and labels on the x-axis of the Needle Plot.
11. Hover over the variant with the highest sample count in the yellow `PI3Ka` protein domain.
    1. The pop-up shows variant details for the 64 Subjects observed with it: 63 in the Breast Cancer study and 1 in the Ovarian Cancer Study.
12. Use the Exon zoom bar from each end of the Amino Acid sequence to zoom in to the `PI3Ka` domain to better separate observations.
13. There are three different missense mutations at this locus, changing the wild-type Glutamic acid to Lysine (64), Glycine (6), or Alanine (2) at different frequencies.
14. The `Pathogenic Variant` track shows 7 ClinVar entries for mutations stacked at this locus, affecting amino acid 545. Hover over the purple triangles to see pop-up details with pathogenicity calls, phenotypes, submitter, and a link to the ClinVar entry.
15. Note the `Primate AI` track and high Primate AI score.
    1. The `Primate AI` track displays scores for potential missense variants, based on polymorphisms observed in primate species. Points above the dashed line for the 75th percentile may be considered "likely pathogenic" because the cross-species sequence is highly conserved; high conservation is often seen at the functional domains. Points below the 25th percentile may be considered "likely benign".
16. Click the `Expression` tab and notice that, in the GTEx RNA-seq tissue data, normal Breast and normal Ovarian tissue have relatively high PIK3CA RNA expression, although the gene is ubiquitously expressed.
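
The percentages and thresholds quoted in the steps above reduce to simple arithmetic. The following is a minimal Python sketch of that arithmetic, not the platform's implementation. The function names, the 2-fold expression cutoff, and the example percentile values are assumptions made for illustration; only the counts (326 of 987 Subjects) and the 25th/75th percentile rule come from the walk-through.

```python
from statistics import median


def mutation_frequency(mutated: int, with_data: int) -> float:
    """Fraction of Subjects with Somatic Mutation data that carry a mutation."""
    if with_data <= 0:
        raise ValueError("with_data must be positive")
    return mutated / with_data


def expression_call(value: float, reference: list[float], fold: float = 2.0) -> str:
    """Call up- or down-regulation against the median of the reference samples.

    `reference` holds expression values from either the normal or the disease
    samples and must have a positive median. The 2-fold cutoff is a
    hypothetical choice, not a documented platform setting.
    """
    ratio = value / median(reference)
    if ratio >= fold:
        return "up"
    if ratio <= 1 / fold:
        return "down"
    return "unchanged"


def primate_ai_call(score: float, p25: float, p75: float) -> str:
    """Apply the 25th/75th percentile rule described for the Primate AI track.

    `p25` and `p75` are the score values at those percentiles; the caller
    supplies them, since the walk-through does not state the numbers.
    """
    if score > p75:
        return "likely pathogenic"
    if score < p25:
        return "likely benign"
    return "uncertain"


# PIK3CA: 326 mutated Subjects out of 987 with Somatic Mutation data.
print(f"{mutation_frequency(326, 987):.0%}")  # 33%
```

Switching the Reference from `normal` to `disease` corresponds to passing a different `reference` list to `expression_call`, which changes the median used as the denominator and can therefore change the call for the same Subject.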
