Can I use exome data for CNV detection?
Yes. CNV calling from exome FASTQ is done automatically. The prerequisite is defining a Panel of Normals (PON) for each enrichment kit you're using.
A PON helps establish a baseline coverage pattern and accounts for recurrent technical artifacts that are specific to your workflow. The depth of coverage for each sequenced region is averaged across the PON samples; if a test sample shows a significant increase or decrease from this baseline, a CNV is called.
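To make the baseline idea concrete, here is a minimal sketch in Python. It is not the actual calling algorithm: it assumes depths have already been normalized per sample, and it uses a simple z-score with a hypothetical cutoff (`z_threshold`) as a stand-in for "significant increase or decrease". The function name `call_cnvs` and the region labels are illustrative only.

```python
from statistics import mean, stdev


def call_cnvs(pon_depths, test_depths, z_threshold=3.0):
    """Flag regions whose depth in the test sample deviates from the PON baseline.

    pon_depths:  dict mapping region -> list of normalized depths, one per PON sample
    test_depths: dict mapping region -> normalized depth in the test sample
    z_threshold: assumed cutoff for "significant"; not a documented default
    """
    calls = {}
    for region, depths in pon_depths.items():
        if region not in test_depths or len(depths) < 2:
            continue  # stdev needs at least two PON samples
        baseline = mean(depths)   # average depth across the PON
        spread = stdev(depths)    # sample-to-sample variation within the PON
        if spread == 0:
            continue              # cannot score a region with no variation
        z = (test_depths[region] - baseline) / spread
        if z >= z_threshold:
            calls[region] = "gain"
        elif z <= -z_threshold:
            calls[region] = "loss"
    return calls


# Toy data: four PON samples, three regions (values are made up).
pon = {
    "exon1": [1.0, 1.1, 0.9, 1.0],
    "exon2": [1.0, 1.05, 0.95, 1.0],
    "exon3": [1.0, 0.9, 1.1, 1.0],
}
test_sample = {"exon1": 1.02, "exon2": 0.5, "exon3": 1.6}

print(call_cnvs(pon, test_sample))
```

In this toy example only exon2 (depth well below baseline) and exon3 (depth well above baseline) are flagged. With very few PON samples the spread estimate is noisy, which is one reason a small PON tends to produce more false positives.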
Recommendations for creating a PON to call CNVs from exome data:
Samples for a PON should be derived from healthy individuals.
In our experience, a PON of at least 40-50 samples yields the best results. A smaller PON is better than nothing, but keep in mind that you may encounter more false positives.
Prepare all PON samples in a uniform manner to avoid batch effects. Please log any differences in library preparation.