Document Type
Article
Publication Date
3-8-2025
Identifier
DOI: 10.1038/s41467-025-57505-2; PMCID: PMC11890787
Abstract
Variant calling is hindered in segmental duplications by sequence homology. We developed Paraphase, a HiFi-based informatics method that resolves highly similar genes by phasing all haplotypes of paralogous genes together. We applied Paraphase to 160 long (>10 kb) segmental duplication regions across the human genome with high (>99%) sequence similarity, encoding 316 genes. Analysis across five ancestral populations revealed highly variable copy numbers of these regions. We identified 23 paralog groups with exceptionally low within-group diversity, where extensive gene conversion and unequal crossing over contribute to highly similar gene copies. Furthermore, our analysis of 36 trios identified 7 de novo SNVs and 4 de novo gene conversion events, 2 of which are non-allelic. Finally, we summarized extensive genetic diversity in 9 medically relevant genes previously considered challenging to genotype. Paraphase provides a framework for resolving gene paralogs, enabling accurate testing in medically relevant genes and population-wide studies of previously inaccessible genes.
Journal Title
Nat Commun
Volume
16
Issue
1
First Page
2340
Last Page
2340
MeSH Keywords
Humans; Genome, Human; Polymorphism, Single Nucleotide; DNA Copy Number Variations; Haplotypes; Gene Conversion; Sequence Analysis, DNA; Gene Duplication
PubMed ID
40057485
Keywords
Human Genome; Single Nucleotide Polymorphism; DNA Copy Number Variations; Haplotypes; Gene Conversion; DNA Sequence Analysis; Gene Duplication
Recommended Citation
Chen X, Baker D, Dolzhenko E, et al. Genome-wide profiling of highly similar paralogous genes using HiFi sequencing. Nat Commun. 2025;16(1):2340. Published 2025 Mar 8. doi:10.1038/s41467-025-57505-2
Comments
This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
Publisher's Link: https://www.nature.com/articles/s41467-025-57505-2