Background: Not all cells in a given individual are identical in their genomic makeup. Mosaicism describes such a phenomenon where a mixture of genotypic states in certain genomic segments exists within the same individual. Mosaicism is a prevalent and impactful class of non-integer state copy number variation (CNV). Mosaicism implies that certain cell types or subset of cells contain a CNV in a segment of the genome while other cells in the same individual do not. Several studies have investigated the impact of mosaicism in single patients or small cohorts but no comprehensive scan of mosaic CNVs has been undertaken to accurately detect such variants and interpret their impact on human health and disease. Results: We developed a tool called Montage to improve the accuracy of detection of mosaic copy number variants in a high throughput fashion. Montage directly interfaces with ParseCNV2 algorithm to establish disease phenotype genome-wide association and determine which genomic ranges had more or less than expected frequency of mosaic events. We screened for mosaic events in over 350,000 samples using 1% allele frequency as the detection limit. Additionally, we uncovered disease associations of multiple phenotypes with mosaic CNVs at several genomic loci. We additionally investigated the allele imbalance observations genome-wide to define non-diploid and non-integer copy number states. Conclusions: Our novel algorithm presents an efficient tool with fast computational runtime and high levels of accuracy of mosaic CNV detection. A curated mosaic CNV callset of 3716 events in 2269 samples is presented with comparability to previous reports and disease phenotype associations. The new algorithm can be freely accessed via: https://github.com/CAG-CNV/MONTAGE.
All Science Journal Classification (ASJC) codes
- Copy number variation