File name: {SampleID}_CombinedVariantOutput.tsv
The combined variant file contains the variants and biomarkers in a single file. The output contains the following variant types and biomarkers:
Small variants (including EGFR complex variants)
Copy number variants
Tumor Mutational Burden (TMB)
MSI
DNA Fusions
The combined variant output file also contains Analysis Details and Sequencing Run Details sections. The details of each are listed in the following table:
Section | Details
Analysis Details | Sample ID, Output date, Output time, Pipeline version (Docker image version number)
Sequencing Run Details | Run name, Run date, Sample index ID, Instrument ID, Instrument control software version, Instrument type, RTA version, SBS reagent cartridge lot number, Cluster reagent cartridge lot number
Combined variant output produces small variants with blank fields in the following situations:
The variant has been matched to a canonical RefSeq transcript on an overlapping gene not targeted by TruSight Oncology 500 ctDNA.
The variant is located in a region designated iSNP, indel, or Flanking in the TST500_Manifest.bed file located in the Resources folder.
Small Variants - All variants with the FILTER field marked as PASS and which have a canonical RefSeq transcript are present in the combined variant output.
Gene and transcript information is only present for variants belonging to canonical transcripts that are within the Gene list–Small Variants.
Copy Number Variants - Copy number variants must meet the following conditions:
FILTER field marked as PASS.
ALT field is <DUP> or <DEL>.
Gene is part of the copy number variant gene list.
Fusion Variants - Fusion variants must meet the following conditions:
Passing variant call (KeepFusion field is true).
Contains at least one gene on the fusion allow list.
Genes separated by a dash (-) indicate that the fusion directionality could be determined. Genes separated by a slash (/) indicate that the fusion directionality could not be determined.
The exon and gene coverage report files are tab-separated value (TSV) files with coverage values for the exons and genes, respectively, that are specified in the manifest file.
Refer to DNA Analysis Methods for more information.
File name: {SAMPLE_ID}_hard-filtered.gvcf.gz
The small variant genome VCF file includes the variant call status for all targeted intervals, left-padded by 25 bp.
The Epidermal Growth Factor Receptor (EGFR) complex variant VCF includes phased EGFR variants. The FILTER column in the genome VCF determines the variant status. Refer to the following table for more information.
ALT | FILTER | Description
. | PASS | Wild type (WT).
., A, C, G, etc.¹ | low_depth² | Reference or filtered variant candidate with depth < 1000.
A, C, G, etc.¹ | PASS | PASS variants.
A, C, G, etc.¹ | weak_evidence | Filtered variant candidate with low SQ score (< 2).
A, C, G, etc.¹ | excluded_regions³ | Position with high background noise. Not available for variant detection.
A, C, G, etc.¹ | systematic_noise | Filtered variant candidate with low AQ score (< 20 for hotspots, < 60 for nonhotspots).
A, C, G, etc.¹ | mapping_quality | Filtered variant candidate with low median mapping quality (< 30).
A, C, G, etc.¹ | read_position | Filtered variant candidate showed bias clustered at fragment ends.
A, C, G, etc.¹ | multiallelic | Filtered if there are two or more ALT alleles at this location.
A, C, G, etc.¹ | low_frac_info_reads | Filtered if the fraction of informative reads is low (< 0.5).
¹ Etc refers to other variant types not mentioned in the table.
² Reference positions and nonpassing variants with coverage below 1000X directly translate into low_depth. For variant calls, low_depth is not applied when a position has a PASS filter.
³ This is a static list of regions compiled by Illumina. Email Illumina Technical Support for more information.
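As a rough illustration of how the FILTER values described here can be summarized, the following Python sketch tallies the FILTER column of VCF records. The function name `count_filters` is hypothetical and not part of the analysis software. The sketch assumes standard VCF layout (tab-separated, FILTER in the seventh column, multiple filters separated by semicolons).

```python
from collections import Counter


def count_filters(vcf_lines):
    """Tally FILTER values from an iterable of VCF text lines.

    Header lines (starting with '#') and blank lines are skipped.
    A record carrying several filters (for example 'low_depth;weak_evidence')
    contributes one count to each filter name.
    """
    counts = Counter()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # not a complete VCF record
        for name in fields[6].split(";"):
            counts[name] += 1
    return counts
```

For a compressed gVCF, the lines could be supplied with `gzip.open(path, "rt")` from the Python standard library.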
File name: {SAMPLE_ID}_DNAVariants_Annotated.json.gz
The small variants annotated file provides variant annotation information for all non-reference positions in the VCF, which includes non-pass variants. The variant consequence definition is available on the Sequence Ontology website.
All pass variant calls are annotated using the Illumina Annotation Engine (IAE), also known as Nirvana, with the following information (using the RefSeq transcript):
HGNC Gene
Transcript
Exon
Consequence
c.HGVS
p.HGVS
COSMIC
File name: {Sample_ID}.tmb.metrics.csv
The TMB metrics file contains the tumor mutational burden metrics for each DNA sample. The file format uses the following CSV column convention, similar to other metric CSV files.
Metric | Description
Filtered Variant Count | Remaining variants after variant and germline filters.
Eligible Region (MB) | The specified custom regions, in megabases, that meet the minimum coverage threshold.
TMB | Filtered variants normalized by the eligible regions.
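The relationship between the three TMB metrics can be sketched as a simple ratio. This is an illustration only: it assumes that "normalized by the eligible regions" means dividing the filtered variant count by the eligible region size in megabases, and the function name `tmb_per_mb` is hypothetical.

```python
def tmb_per_mb(filtered_variant_count, eligible_region_mb):
    """Return variants per megabase, assuming TMB = count / eligible region (MB)."""
    if eligible_region_mb <= 0:
        raise ValueError("eligible_region_mb must be positive")
    return filtered_variant_count / eligible_region_mb
```

For example, under this assumption 30 filtered variants over a 1.5 MB eligible region would correspond to 20 variants per megabase.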
File name: {SampleID}_SampleAnalysisResults.json
The sample analysis results file (SARJ) is an aggregated results file created for each sample. The SARJ file is used for the generation of downstream outputs. The file contains passing variants and passing variant annotations.
File name: {Sample_ID}_cnv.vcf
The copy number variants (CNV) file contains calls for DNA libraries of the CNV genes targeted by TruSight Oncology 500 ctDNA v2 and v1 assays. The CNV call indicates fold change results for each gene classified as reference, deletion, or amplification.
The value in the QUAL column of the copy number variants VCF is a Phred transformation of the p-value, represented by the following equation:
QUAL = -10 × log10(p-value)
The p-value is derived from the t-test between the fold change (FC) of the gene against the rest of the genome. Higher Q-scores indicate higher confidence in the CNV call.
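A minimal sketch of the standard Phred transformation, Q = -10 × log10(p), which is assumed here to be the transformation applied to the CNV p-value. The function names are hypothetical.

```python
import math


def phred_qual(p_value):
    """Convert a p-value in (0, 1] to a Phred-scaled score."""
    if not 0 < p_value <= 1:
        raise ValueError("p_value must be in the interval (0, 1]")
    return -10.0 * math.log10(p_value)


def p_value_from_qual(qual):
    """Invert the Phred transformation to recover the p-value."""
    return 10.0 ** (-qual / 10.0)
```

Under this transformation, a p-value of 0.001 corresponds to a score of about 30, so smaller p-values give higher scores.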
In the VCF notation, <DUP> indicates the detected FC is greater than a predefined amplification cutoff. <DEL> indicates the FC is less than a predefined deletion cutoff for that gene. This cutoff can vary from gene to gene.
Each copy number variant is reported as the fold change on normalized read depth in a testing sample relative to the normalized read depth in diploid genomes. Given tumor purity, the ploidy of a gene in the sample can be inferred from the reported fold change.
Given tumor purity X%, for a reported fold change Y, the copy number n can be calculated by using the following equation:
n = [(200 × Y) - 2 × (100 - X)] / X
For example, in a testing sample of tumor purity at 30%, MET with a fold change of 2.2x indicates that 10 copies of MET DNA are observed.
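The MET example can be reproduced with a short sketch. It assumes a simple mixture model in which tumor cells carry n copies of the gene and all remaining cells are diploid, so that the fold change Y equals [X × n + (100 - X) × 2] / 200 for tumor purity X%. The function name `copy_number` is hypothetical.

```python
def copy_number(fold_change, tumor_purity_pct):
    """Infer gene copy number n from fold change Y and tumor purity X (percent).

    Solves Y = (X * n + (100 - X) * 2) / 200 for n, which gives
    n = (200 * Y - 2 * (100 - X)) / X.
    """
    if not 0 < tumor_purity_pct <= 100:
        raise ValueError("tumor_purity_pct must be in the interval (0, 100]")
    return (200.0 * fold_change - 2.0 * (100.0 - tumor_purity_pct)) / tumor_purity_pct
```

Calling `copy_number(2.2, 30)` gives approximately 10, matching the MET example. Because purity is in the denominator, small errors in a low purity estimate change the inferred copy number substantially.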
Copy number variant metrics are reported on a per sample level.
Sex Genotyper—The predicted sex of the sample.
Number of alignment records—The number of alignment records in the sample.
Bases in reference genome—The number of bases in the reference genome.
Average alignment coverage over genome—The average alignment coverage across the reference genome.
Number of filtered records (total)—The number of total filtered records.
Number of filtered records (duplicates)—The number of duplicated filtered records.
Number of filtered records (MAPQ)—The number of MAPQ filtered records.
Number of filtered records (unmapped)—The number of unmapped filtered records.
Number of target intervals—The number of target intervals in the sample.
Number of segments—The number of segments in the sample. Applicable only to CNV SLM mode.
Number of amplifications—The number of amplifications in the sample. Applicable only to CNV SLM mode.
Number of deletions—The number of deletions in the sample. Applicable only to CNV SLM mode.
Number of passing amplifications—The number of passing amplifications in the sample. Applicable only to CNV SLM mode.
Number of passing deletions—The number of passing deletions in the sample. Applicable only to CNV SLM mode.
File name: {SampleID}_hard-filtered.vcf
The small variant file contains both phased variants and all other small variants. The header sections from both the phased variant (complex) VCF and the small variant VCF are included in this merged VCF. Variants that are found for both phased variants and small variants are only displayed as phased variants.
File name: {Sample_ID}_Fusions.csv
The fusions file contains all candidate fusions identified by the analysis pipeline.
The fusion columns are described in the following table. If you use Microsoft Excel to view this file, genes that are convertible to dates (for example, MARCH1) automatically convert to dd‑mm format (1-Mar).
Column | Description
Sample | Input sample ID.
Name | Fusion name as reported by the DRAGEN fusion caller.
Chr1 | The chromosome of the first breakend.
Pos1 | The position of the first breakend.
Chr2 | The chromosome of the second breakend.
Pos2 | The position of the second breakend.
Direction | The direction of how the breakends are joined.
Alt_Depth | The number of read-pairs supporting the fusion call.
Total_Depth | Max number of read-pairs aligned to a fusion breakend.
BP1_Depth | Number of read-pairs aligned to the first breakend.
BP2_Depth | Number of read-pairs aligned to the second breakend.
VAF | Variant allele frequency.
Gene1 | Genes that overlap the first breakend.
Gene2 | Genes that overlap the second breakend.
Contig | The fusion contig.
Filter | Indicates whether the fusion has passed all of the fusion filters.
Is_Cosmic_GenePair | Indicates whether the gene pair has been reported by COSMIC (True/False).
Fusion Directionality Known | Indicates whether the fusion direction is known, and indicated by the order of the genes (True/False).
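One way to avoid spreadsheet date conversion of gene names is to read the file programmatically. The sketch below uses Python's standard `csv` module and keeps every gene field as text. It assumes the file has a single header row that uses the column names described here; the function name `load_fusions` and the sample values in the usage note are hypothetical.

```python
import csv
import io

# Columns converted to numbers; all other columns stay as plain text.
INT_COLUMNS = ("Pos1", "Pos2", "Alt_Depth", "Total_Depth", "BP1_Depth", "BP2_Depth")


def load_fusions(handle):
    """Read fusion rows from an open text handle into a list of dicts."""
    rows = []
    for row in csv.DictReader(handle):
        for column in INT_COLUMNS:
            if row.get(column):
                row[column] = int(row[column])
        if row.get("VAF"):
            row["VAF"] = float(row["VAF"])
        rows.append(row)
    return rows
```

For instance, `load_fusions(io.StringIO("Sample,Gene1,Pos1\nS1,MARCH1,100\n"))` returns one row whose `Gene1` value remains the string `MARCH1`.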
The following table lists the meaning of the values in the direction column. The values are in the format used by Samtools.
Direction | Notation | Meaning
L1R2 | t[p[ | The left of breakend1 is joined with the right of breakend2.
L1rL2 | t]p] | The left of breakend1 is joined with the reverse complement of the left of breakend2.
L2R1 | ]p]t | The left of breakend2 is joined with the right of breakend1.
rR2R1 | [p[t | The reverse complement of the right of breakend2 is joined with the right of breakend1.
When the analysis run completes, the DRAGEN TruSight Oncology 500 ctDNA Analysis Software generates an analysis output folder in a specified location.
To view analysis output, navigate to the analysis output folder and select the files that you want to view.
Single output folder structure is as follows.
Logs_Intermediates
AdditionalSarjMetrics
Annotation—Contains outputs for small variant annotation.
Subfolders per sample ID—Contains the aligned small variants JSON.
CombinedVariantOutput
Subfolders per sample ID—Contains the combined variant output TSV files.
A combined output log file.
Contamination
Subfolders per sample ID—Contains the contamination metrics JSON file and output logs.
CoverageReports
DnaFusionFiltering
DragenCaller
Subfolders per sample ID—Contains the aligned BAM and index files, small variant VCF and gVCF, copy number variant VCF, MSI JSON, exon coverage report bed, and QC outputs in CSV format.
FastqValidation—Contains the FASTQ validation output log for the samples.
FastqGeneration
MetricsOutput
Subfolders per sample ID—Contains the metrics output TSV files.
A combined output log file.
ResourceVerification—Contains the resource file checksum verification logs.
Run QC—Contains the Run QC metrics JSON, Intermediate Run QC metrics JSON, and log file.
SampleAnalysisResults
Subfolders per sample ID—Contains the Sample Analysis Results JSON and detailed log file.
SampleSheetValidation—Contains the Intermediate sample sheet and validation log.
Passing Sample Steps - JSON file that contains the steps passed for each Sample ID
Tmb
Subfolders per sample ID—Contains the TMB metrics CSV, TMB trace TSV, and related files and logs.
pipeline_trace.txt: A summary and troubleshooting file that lists each Nextflow task executed and its status (for example, COMPLETED or FAILED).
run.log: A complete trace-level log file describing the Nextflow pipeline execution.
run_report.html: High-level run statistics (performance, usage, etc.).
run_timeline.html: Timeline-related information about the analysis run.
Results
Metrics Output TSV (all Sample IDs)
Sample ID—The following outputs are produced for each sample:
Combined Variant Output TSV
Metrics Output TSV
TMB Trace TSV
Small Variant Genome VCF
Small Variant VCF
Small Variant Annotated JSON
Copy Number Variant VCF
MSI JSON
Fusions CSV
Exon Coverage Report TSV
Gene Coverage Report TSV
This section describes each output folder generated during analysis and where to find metric and analytic files when the pipeline is executed. The same output folder structure and content exist in ICA and BaseSpace Sequence Hub.
Run ID
TSO500_Nextflow_logs
_manifest.json
Results
_tags.json
Logs_intermediates
Errors—This folder is only present when analysis fails
The TSO_500_Nextflow_Logs folder provides information about the execution of the pipeline on ICA, both as a whole and for specific nodes (when an analysis is split across multiple nodes). It contains the files used to execute parts of the workflow on different nodes, as well as records of the Nextflow execution on those nodes.
TSO_500_Nextflow_Logs
_manifest.json
The Results folder contains the aggregated MetricsOutput.tsv file at the root level and a subfolder for each sample ID.
Results
MetricsOutput.tsv
Sample_1
Sample_2
Sample_<#>
The Results subfolder contains the following files:
Results
MetricsOutput.tsv
<Sample_id>
CombinedVariantOutput.tsv
Fusions.csv
tmb.trace.tsv
hard-filtered.gvcf
hard-filtered.vcf
SmallVariants_Annotated.json.gz
cnv.vcf
exon_cov_report.tsv
gene_cov_report.tsv
MetricsOutput.tsv
microsat_output.json
The Logs_intermediates folder contains a folder for each submodule in the DRAGEN TSO 500 ctDNA on ICA pipeline. Each folder contains a copy of the files required to create the metric output files and report files, with the combined log files at the root level and a subfolder for each sample.
Logs_intermediates
AdditionalSarjMetrics
Annotation
CombinedVariantOutput
Contamination
CoverageReports
DnaFusionFiltering
DragenCaller
FastqValidation
FastqGeneration
MetricsOutput
PassingSampleSteps
ResourceVerification
Run QC
SampleAnalysisResults
SampleSheetValidation
Tmb
All logs in Logs_Intermediates are generated from the running analysis software. Inputs to the running Docker container (for example, the run folder, sample sheet, and FASTQ folder) are mapped from native locations on the server to the following locations in the container:
The paths in the log messages refer to paths within the running Docker container, not paths on the server.
The Errors folder contains Errors.tsv, which summarizes all errors encountered during pipeline execution.
Errors
Errors.tsv
The following files and folders are created during analysis by NovaSeq 6000Dx Analysis Application:
analysisResults.json
CopyComplete.txt
edgeos.nextflow.config
inputs/
sampleMapping.json
SampleSheet.csv
SampleSheet.json
Logs_Intermediates
Manifest.tsv
params.json
Results/
workflowLogs/
nf-main-***.log
When the analysis run completes, the analysis application generates an analysis output in a specified location. To view analysis output, follow the steps below:
On the “Completed” runs tab, select the run.
Review the run details page, which provides the information needed to access the output folder:
External Location: The input for the run.
Analysis Output Folder: Where the output is stored. To navigate to this location, follow the “server location” and the gds analysis output folder.
Navigate to the directory that contains the analysis output folder
Open the folder, and then select the files that you want to view
File Name: MetricsOutput.tsv
The metrics output file is a final combined metrics report that provides sample status, key analysis metrics, and metadata in a tab-separated values (TSV) file. Sample metrics within the report indicate guideline‑suggested lower specification limits (LSL) and upper specification limits (USL) for each sample in the run.
One metrics output file is generated for the entire run. An additional file is generated for each sample.
Run metrics from the analysis module indicate the quality of the sequencing run.
Review the following metrics to assess run data quality:
The values in the Run Metrics section are listed as NA in the following situations:
The analysis was started from FASTQ files.
The analysis was started from BCL files and the InterOp files are missing or corrupt.
[NovaSeq X Plus only] NovaSeq X Plus runs do not produce a PCT_PF_READS value, so PCT_PF_READS is always NA.
Review the following metrics to assess sample data quality:
*The recommended threshold of 0.059 for GENE_SCALED_MAD only applies to real cell‑free DNA.
DNA expanded metrics are provided for information only. They can be informative for troubleshooting but are provided without explicit specification limits and are not directly used for sample quality control. For additional guidance, contact Illumina Technical Support.
For troubleshooting information, refer to
Metric | Description | Recommended threshold
PCT_PF_READS | Percentage of reads on the sequencing flow cell that pass the filter. | ≥ 55.0 (No lower specification limit for NovaSeq X Plus)
PCT_Q30_R1 | Percentage of bases with a quality score ≥ 30 from Read 1. | ≥ 80.0 (≥ 85.0 for NovaSeq X Plus)
PCT_Q30_R2 | Percentage of bases with a quality score ≥ 30 from Read 2. | ≥ 80.0 (≥ 85.0 for NovaSeq X Plus)
Metric | Description | Recommended threshold | Variant class
CONTAMINATION_SCORE | The contamination score based on VAF distribution of SNPs. | ≤ 1227 | All
MEDIAN_EXON_COVERAGE | Median exon fragment coverage across all exon bases. | ≥ 1300 | Small variant, TMB, fusion, MSI
PCT_EXON_1000X | Percent exon bases with 1000X fragment coverage. | ≥ 80.0 | Small variant, TMB
GENE_SCALED_MAD | The median of absolute deviations normalized by gene fold change. | ≤ 0.059* | CNV
MEDIAN_BIN_COUNT_CNV_TARGET | The median raw bin count per CNV target. | ≥ 6.0 | CNV
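The sample-level thresholds listed here lend themselves to a simple automated check. The following sketch is illustrative only and is not how the analysis software evaluates samples. The dictionary and function names are hypothetical, the limits are copied from the guideline values in this section, and the GENE_SCALED_MAD limit applies only to real cell-free DNA.

```python
# Guideline limits: metric name -> (comparison, limit).
SAMPLE_QC_LIMITS = {
    "CONTAMINATION_SCORE": ("<=", 1227),
    "MEDIAN_EXON_COVERAGE": (">=", 1300),
    "PCT_EXON_1000X": (">=", 80.0),
    "GENE_SCALED_MAD": ("<=", 0.059),
    "MEDIAN_BIN_COUNT_CNV_TARGET": (">=", 6.0),
}


def evaluate_sample_qc(metrics):
    """Return {metric: True/False} for each known metric present in `metrics`.

    True means the value is within the guideline limit. Metrics that are
    absent from the input are omitted from the result rather than failed.
    """
    results = {}
    for name, (comparison, limit) in SAMPLE_QC_LIMITS.items():
        if name not in metrics:
            continue
        value = metrics[name]
        results[name] = value <= limit if comparison == "<=" else value >= limit
    return results
```

A sample with a CONTAMINATION_SCORE of 500 and a MEDIAN_EXON_COVERAGE of 1200 would pass the first check and fail the second.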
TOTAL_PF_READS (count)
Total number of non-supplementary, non-secondary, and passing QC reads after alignment to the whole genome sequence.
Primarily driven by the data output of the sequencer, the quality of the library, and the balancing of the library in the library pool. If TOTAL_PF_READS is in line with other samples but coverage metrics are lower, this may suggest non-specific enrichment.
Low values for all samples indicate a poor quality run with possible low cluster numbers or low numbers of Q30 and PF%.
A low value for an individual sample indicates poor pooling of this library into the final pool.
MEAN_FAMILY_SIZE (count)
A UMI family is a group of reads that all have the same UMI barcode. The family size is the number of reads in the family. MEAN_FAMILY_SIZE is the mean of the entire population of reads assembled into UMI families.
The mean UMI family size decreases with increased unique read numbers, and more input DNA leads to more unique reads. Conversely, oversequencing a fixed population of unique DNA molecules leads to increased family size.
As a guide, for a good run with optimal cluster density, passing specs, even sample pooling, and good quality DNA, we usually observe values <10.
UMI family size = 1 is not ideal as it is harder to correct for errors.
UMI family size of 2 to 5 enables efficient error correction without wasting sequencing capacity on high percentages of duplicate reads.
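To make the family size concept concrete, the sketch below groups reads by UMI barcode and reports the mean number of reads per family. This is a simplification: it assumes a family is defined by the UMI string alone, whereas a real pipeline would typically also consider alignment position. The function name `mean_family_size` is hypothetical.

```python
from collections import Counter


def mean_family_size(umis):
    """Return the mean number of reads per UMI family.

    `umis` is an iterable with one UMI barcode string per read.
    Returns 0.0 when no reads are supplied.
    """
    families = Counter(umis)
    if not families:
        return 0.0
    return sum(families.values()) / len(families)
```

Three reads with the barcodes AAA, AAA, and CCC form two families of sizes 2 and 1, so the mean family size is 1.5.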
MEDIAN_TARGET_COVERAGE (count)
Median depth across all the unique loci occurring in all regions of the manifest file.
Lower median target coverage may be due to poor sample input/quality, library preparation issues or low sequencing output.
PCT_CHIMERIC_READS (%)
Chimeric reads occur when one sequencing read aligns to two distinct portions of the genome with little or no overlap. The metric is reported as a proportion of the total number of non-supplementary, non-secondary, and passing QC reads after alignment to the whole genome sequence.
While this can be indicative of large-scale structural rearrangement of the genome, values that are elevated above the usual baseline may indicate enrichment probe contamination during library preparation. A suggested metric USL is 8%. Samples with higher values might see decreased performance in small variant calling and TMB scores.
PCT_EXON_500X (%)
Percentage of exon bases with 500X fragment coverage. Calculated against all regions in manifest containing _exon in name.
Can be used in combination with other PCT_EXON metrics to understand under or over coverage of exons.
PCT_EXON_1500X (%)
Percentage of exon bases with 1500X fragment coverage. Calculated against all regions in manifest containing _exon in name.
Can be used in combination with other PCT_EXON metrics to understand under or over coverage of exons.
PCT_READ_ENRICHMENT (%)
Percentage of reads that have overlapping sequence with the target regions defined in the sample manifest.
Indicative of general enrichment performance. Reduced proportions of enriched reads may indicate issues with the enrichment portion of the library preparation.
PCT_USABLE_UMI_READS (%)
Percentage of reads that have valid UMI sequences associated with them.
As UMI reads are sequenced at the start of each read, loss of valid UMI sequence may be caused by sequencing issues impacting the quality of base calling in this portion of the sequencing read.
MEAN_TARGET_COVERAGE (count)
Mean depth across all the unique loci defined in the manifest file.
Lower mean target coverage may be due to poor sample input/quality, library preparation issues, or low sequencing output. Large differences between the median and mean target coverage values may indicate a skewed distribution of target coverage.
PCT_ALIGNED_READS (%)
Proportion of aligned reads that are non-supplementary, non-secondary and pass QC versus aligned reads that are non-supplementary, non-secondary, mapped and pass QC.
PCT_CONTAMINATION_EST (%)
This metric should only be evaluated if the CONTAMINATION_SCORE metric exceeds the USL. This metric estimates the amount of contamination in a sample. The contamination level is computed by taking 2.0 times the average of the adjusted allele frequencies of all variants that were selected. The adjusted allele frequency is either the actual allele frequency of the variant if it is less than 0.5, or 1 minus the allele frequency if it is greater than or equal to 0.5.
If the sample does not fail the CONTAMINATION_SCORE, this metric has no intended meaning, as it will be driven by statistical noise (for example, the few variants that naturally fall outside an expected interval around 0.5 due to random chance).
High contamination estimates may be due to any of the following:
Inter-sample contamination caused by mixing of samples during extraction or library preparation.
Intra-sample contamination, due to mixing of clonally different cell populations during extraction.
Large scale genomic rearrangements that cause unexpected VAFs for large numbers of variants.
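The PCT_CONTAMINATION_EST calculation described for this metric can be sketched in a few lines. How the pipeline selects the variants is not covered here, so the input is simply assumed to be the allele frequencies of the already selected variants, expressed as fractions between 0 and 1. The function name `contamination_estimate` is hypothetical, and the result is returned as a fraction (an assumption), so it would need to be multiplied by 100 to express a percentage.

```python
def contamination_estimate(allele_frequencies):
    """Return 2.0 times the mean adjusted allele frequency.

    The adjusted allele frequency is the allele frequency itself when it is
    below 0.5, and 1 minus the allele frequency otherwise.
    Returns 0.0 when no variants are supplied.
    """
    adjusted = [af if af < 0.5 else 1.0 - af for af in allele_frequencies]
    if not adjusted:
        return 0.0
    return 2.0 * sum(adjusted) / len(adjusted)
```

For allele frequencies of 0.1, 0.9, and 0.4, the adjusted values are 0.1, 0.1, and 0.4, giving an estimate of about 0.4.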
PCT_TARGET_0.4X_MEAN (%)
Percentage of target (all locations in manifest) reads that have a coverage depth greater than 0.4x the mean target coverage depth (see definition above).
Provides an indication of uniformity of coverage of the target regions in the manifest file. When trended over time reductions in this metric may indicate an issue with the enrichment process resulting in coverage bias.
PCT_TARGET_500X (%)
Percentage of target bases with 500X fragment coverage. Calculated against all regions in manifest file.
Can be used in combination with other PCT_TARGET metrics to understand under or over coverage of targets.
PCT_TARGET_1000X (%)
Percentage of target bases with 1000X fragment coverage. Calculated against all regions in manifest file.
Can be used in combination with other PCT_TARGET metrics to understand under or over coverage of targets.
PCT_TARGET_1500X (%)
Percentage of target bases with 1500X fragment coverage. Calculated against all regions in manifest file.
Can be used in combination with other PCT_TARGET metrics to understand under or over coverage of targets.
PCT_DUPLEXFAMILIES (%)
Percent of collapsed reads that are duplex (that is, composed of original forward strand and original reverse strand reads). Calculated as the number of families that are merged as duplex over the total number of families.
Higher is more desirable. Lower family depth leads to lower percent duplex families. If low, check for underclustering or chemistry concerns.
MEDIAN_INSERT_SIZE (bp)
Median fragment size for sample.
A low median insert size could be a sign of low sample quality or degradation.
MAX_SOMATIC_AF
Max somatic allele frequency of a variant; a proxy for tumor fraction. The TMB step flags the variants by potential somatic status using database, VAF, and clonal hematopoiesis information. The remaining variants are ranked by variant allele frequency in descending order. The variant allele frequency of the first COSMIC hotspot (count >50) or confident somatic variant (having significantly shorter fragment size) is reported as the MaxSomaticVaf for each sample. If no such variant exists, the 4th variant is reported.
This metric is driven by sample tumor fraction.
PCT_SOFT_CLIPPED_BASES (%)
Percentage of bases that were not used for alignment but retained as part of the alignment file.
Soft clipped reads are used as a part of the downstream analysis for small variants calling. A higher-than-expected number could indicate a low-quality enrichment step.
PCT_Q30_BASES (%)
Average percentage of bases ≥ Q30. A prediction of the probability of an incorrect base call (Q‑score).
An indicator of sequencing run quality, low Q30 across all samples on a run could be the result of run overclustering.
Input | Location in container
Run folder | /opt/illumina/run-folder
Sample sheet | /opt/illumina/SampleSheet.csv
FASTQ folder | /opt/illumina/fastq-folder
Resources | /opt/illumina/resources
Analysis output folder | /opt/illumina/analysis-folder
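Because log messages report container paths, it can help to translate them back to server paths when troubleshooting. The sketch below is one possible approach and is not part of the analysis software. The container locations come from the mapping listed here, while the dictionary keys, the function name `to_host_path`, and the server paths in the usage note are hypothetical.

```python
from pathlib import PurePosixPath

# Container mount points, keyed by hypothetical short names.
CONTAINER_MOUNTS = {
    "run_folder": "/opt/illumina/run-folder",
    "sample_sheet": "/opt/illumina/SampleSheet.csv",
    "fastq_folder": "/opt/illumina/fastq-folder",
    "resources": "/opt/illumina/resources",
    "analysis_output": "/opt/illumina/analysis-folder",
}


def to_host_path(container_path, host_locations):
    """Translate a container path from a log message to a server path.

    `host_locations` maps the short names above to the native server
    locations that were mounted. Returns None when the path is not under
    any known mount point.
    """
    path = PurePosixPath(container_path)
    for key, mount in CONTAINER_MOUNTS.items():
        if key not in host_locations:
            continue
        try:
            relative = path.relative_to(mount)
        except ValueError:
            continue  # path is not under this mount point
        return str(PurePosixPath(host_locations[key]) / relative)
    return None
```

With a hypothetical server output location of `/data/out`, the container path `/opt/illumina/analysis-folder/Logs_Intermediates/run.log` translates to `/data/out/Logs_Intermediates/run.log`.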
The following table lists the genes that have associated block-listed sites. For the exact location of a block-listed site, contact Illumina Technical Support.
Gene | Sites | Gene | Sites | Gene | Sites
ABL1 | 5 | FGFR2 | 144 | PAX7 | 5
AKT2 | 5 | FGFR3 | 1 | PAX8 | 275
AKT3 | 20 | FGFR4 | 36 | PBRM1 | 3
ALK | 90 | FLCN | 2 | PDCD1 | 2
ANKRD11 | 6 | FLI1 | 36 | PDGFRA | 5
ANKRD26 | 9 | FLT1 | 91 | PDGFRB | 2
AR | 81 | FLT4 | 3 | PDK1 | 1
ARID1A | 40 | FOXA1 | 48 | PDPK1 | 6
ARID1B | 87 | FOXL2 | 4 | PGR | 5
ARID2 | 1 | FOXO1 | 2 | PHF6 | 2
ASXL1 | 3 | FOXP1 | 3 | PHOX2B | 15
ASXL2 | 5 | FUBP1 | 1 | PIK3C2G | 2
ATM | 2 | GATA4 | 6 | PIK3CA | 18
ATR | 3 | GATA6 | 12 | PIK3CB | 42
ATRX | 17 | GEN1 | 1 | PIK3R1 | 6
AURKA | 1 | GID4 | 3 | PIK3R2 | 2
AXIN2 | 4 | GNAQ | 4 | PLCG2 | 3
AXL | 74 | GNAS | 11 | PLK2 | 2
BBC3 | 2 | GPR124 | 3 | PMAIP1 | 7
BCL10 | 2 | GRM3 | 1 | PMS2 | 1
BCL2L11 | 16 | H3F3A | 1 | POLE | 3
BCOR | 2 | H3F3C | 2 | PPARG | 446
BCORL1 | 1 | HGF | 1 | PRDM1 | 1
BCR | 64 | HIST1H1C | 2 | PRKCI | 2
BIRC3 | 1 | HLA-A | 72 | PRKDC | 5
BLM | 4 | HNF1A | 2 | PTCH1 | 13
BMPR1A | 4 | HNRNPK | 9 | PTEN | 41
BRAF | 283 | HOXB13 | 1 | PTPRS | 14
BRCA1 | 49 | HSP90AA1 | 4 | PTPRT | 2
BRCA2 | 21 | ICOSLG | 6 | QKI | 2
BRD4 | 16 | IFNGR1 | 2 | RAD21 | 1
CARD11 | 4 | iIndel | 91 | RAD50 | 5
CASP8 | 2 | INHBA | 4 | RAD51 | 18
CBL | 8 | INPP4A | 1 | RAD51B | 8
CCND1 | 25 | INPP4B | 1 | RAF1 | 98
CCND3 | 49 | IRS1 | 9 | RANBP2 | 12
CCNE1 | 72 | IRS2 | 19 | RARA | 2
CD74 | 50 | iSNP | 4 | RASA1 | 1
CDH1 | 4 | JAK2 | 4 | RB1 | 5
CDK12 | 3 | JUN | 7 | RBM10 | 13
CDK4 | 46 | KAT6A | 5 | RECQL4 | 3
CDK6 | 13 | KDM5A | 7 | REL | 3
CDK8 | 4 | KDM5C | 2 | RET | 3
CDKN2B | 2 | KDM6A | 2 | RFWD2 | 22
CEBPA | 12 | KDR | 1 | RICTOR | 1
CHD2 | 5 | KIF5B | 7 | ROS1 | 287
CHD4 | 12 | KIT | 5 | RPS6KA4 | 3
CHEK1 | 75 | KMT2B | 51 | RPS6KB1 | 109
CHEK2 | 64 | KMT2C | 118 | RUNX1 | 3
chrY | 93 | KMT2D | 108 | SDHA | 18
CIC | 2 | KRAS | 44 | SDHB | 3
CREBBP | 4 | LAMP1 | 64 | SDHD | 17
CSNK1A1 | 4 | LATS1 | 1 | SETBP1 | 7
CTNNB1 | 1 | LATS2 | 4 | SETD2 | 26
CUL3 | 1 | LoH | 85 | SF3B1 | 1
CUX1 | 9 | LRP1B | 3 | SH2B3 | 4
DAXX | 5 | LZTR1 | 1 | SH2D1A | 2
DDR2 | 1 | MAGI2 | 2 | SLIT2 | 1
DDX41 | 1 | MALT1 | 4 | SLX4 | 2
DIS3 | 2 | MAP2K2 | 1 | SMARCA4 | 4
DNAJB1 | 6 | MAP2K4 | 5 | SMC1A | 1
DNMT1 | 1 | MAP3K1 | 8 | SMC3 | 8
DNMT3A | 4 | MAP3K14 | 2 | SMO | 2
DOT1L | 2 | MAP3K4 | 10 | SOX10 | 7
E2F3 | 70 | MAPK1 | 6 | SOX17 | 1
EGFR | 304 | MAPK3 | 6 | SOX9 | 14
EIF4E | 12 | MCL1 | 1 | SPEN | 4
EML4 | 9 | MDC1 | 23 | STAG1 | 5
EP300 | 1 | MDM2 | 53 | STAG2 | 2
ERBB2 | 14 | MDM4 | 67 | STAT4 | 1
ERBB3 | 62 | MED12 | 28 | STAT5A | 1
ERCC1 | 53 | MGA | 6 | STAT5B | 4
ERCC2 | 57 | MLL | 9 | SUFU | 5
ERCC3 | 4 | MLLT3 | 18 | SUZ12 | 9
ERCC5 | 4 | MRE11A | 5 | TAF1 | 9
ERG | 2 | MSH3 | 10 | TBX3 | 1
ESR1 | 32 | MSH6 | 2 | TCEB1 | 1
ETS1 | 45 | MSI | 148 | TCF3 | 2
ETV1 | 862 | MST1 | 18 | TCF7L2 | 6
ETV4 | 502 | MYB | 402 | TERT | 2
ETV5 | 11 | MYC | 78 | TET1 | 1
ETV6 | 187 | MYCL1 | 28 | TET2 | 23
EWSR1 | 364 | MYCN | 69 | TFE3 | 299
EZH2 | 2 | MYOD1 | 3 | TFRC | 33
FANCA | 1 | NAB2 | 10 | TGFBR1 | 6
FANCD2 | 11 | NCOA3 | 28 | TGFBR2 | 2
FANCG | 10 | NCOR1 | 9 | TMEM127 | 5
FANCI | 1 | NF1 | 3 | TMPRSS2 | 236
FANCL | 1 | NKX2-1 | 4 | TOP2A | 1
FAT1 | 2 | NOTCH1 | 4 | TP53 | 22
FBXW7 | 4 | NOTCH3 | 7 | TRAF7 | 4
FGF1 | 25 | NOTCH4 | 9 | TSC1 | 4
FGF10 | 17 | NPM1 | 5 | TSC2 | 1
FGF14 | 15 | NRAS | 29 | U2AF1 | 1
FGF19 | 102 | NRG1 | 47 | VEGFA | 7
FGF2 | 26 | NTRK1 | 134 | WISP3 | 2
FGF23 | 38 | NTRK2 | 145 | WT1 | 10
FGF3 | 60 | NTRK3 | 13 | XIAP | 1
FGF4 | 25 | NUTM1 | 134 | XPO1 | 2
FGF5 | 14 | PAK1 | 68 | XRCC2 | 1
FGF6 | 9 | PAK3 | 8 | YAP1 | 1
FGF7 | 9 | PALB2 | 1 | ZBTB7A | 11
FGF8 | 30 | PARK2 | 23 | ZFHX3 | 56
FGF9 | 21 | PARP1 | 2 | ZNF703 | 7
FGFR1 | 26 | PAX3 | 156 | ZRSR2 | 2