DOI: 10.3389/fphar.2023.1195778; PMCID: PMC10324673


Complex regions of the human genome, such as repeat motifs, pseudogenes, structural variants (SVs), and copy number variations (CNVs), present ongoing challenges to accurate genetic analysis, particularly for short-read next-generation sequencing (NGS) technologies. One such region is the highly polymorphic CYP2D locus, which contains CYP2D6, a clinically relevant pharmacogene contributing to the metabolism of >20% of common drugs, and two highly similar pseudogenes, CYP2D7 and CYP2D8. Multiple complex SVs, including CYP2D6/CYP2D7-derived hybrid genes, are known to occur in different configurations and frequencies across populations and are difficult to detect and characterize accurately. This can lead to incorrect enzyme activity assignment and affect drug dosing recommendations, often disproportionately affecting underrepresented populations. To improve CYP2D6 genotyping accuracy, we developed a PCR-free, CRISPR-Cas9-based enrichment method for targeted long-read sequencing that fully characterizes the entire CYP2D6-CYP2D7-CYP2D8 locus. Clinically relevant sample types, including blood, saliva, and liver tissue, were sequenced, generating high-coverage sets of continuous single-molecule reads spanning the entire targeted region of up to 52 kb, regardless of the SV present (n = 9). This allowed fully phased dissection of the entire locus structure, including breakpoints, to accurately resolve complex CYP2D6 diplotypes with a single assay. Additionally, we identified three novel CYP2D6 suballeles and fully characterized 17 unique CYP2D7 and 18 unique CYP2D8 haplotypes. This method for CYP2D6 genotyping has the potential to significantly improve the accuracy of clinical phenotyping to inform drug therapy and can be adapted to overcome testing limitations in other clinically challenging genomic regions.

Journal Title

Front Pharmacol



CRISPR; CYP2D6; PCR-free; clinical testing; pharmacogenetics; precision medicine; single-molecule long-read sequencing


Grant support

AT, AD, UB, and GS efforts were supported in part by SBIR 1R43FD007247-01A1. The human liver tissue sample was obtained through the Liver Tissue Cell Distribution System, Minneapolis, MN and Pittsburgh, PA, which was funded by NIH Contract #HHSN276201200017C.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.