Long-read sequencing enables SLCO1B1 haplotype phasing and facilitates the discovery of novel star alleles

Presenter Status

Post-Doctoral Research

Abstract Type

Translational Research

Primary Mentor or Principal Investigator

Laura B Ramsey

Presentation Type

Poster

Start Date

21-5-2026 12:00 PM

End Date

21-5-2026 1:00 PM

Abstract Text

Background: Pharmacogenomics (PGx) studies how genetic variation influences individual responses to medications, enabling personalized drug selection to improve efficacy and minimize adverse effects. Clinically, genetic variants are organized into star (*) alleles, and paired star alleles (diplotypes) are used to infer activity/function phenotypes that guide prescribing. SLCO1B1 encodes one of the major hepatic transporters, organic anion transporting polypeptide 1B1 (OATP1B1), which affects the hepatic uptake of many clinically used drugs, including statins. However, current PGx methods are limited by panels that test few variants and by short-read phasing errors, which can miss rare or novel variants and result in incorrect star allele or phenotype assignment. Long reads overcome these limitations by enabling haplotype-resolved star allele calling.

Objectives: To assess how long-read sequencing refines SLCO1B1 allele and phenotype frequency estimates and to characterize novel haplotypes.

Methods: Long-read sequencing data were generated as part of the Genomic Answers for Kids (GA4K) program, an IRB-approved (CMH-IRB-11120514) pediatric genomics initiative at Children’s Mercy. Self-identified race and ethnicity were collected at enrollment and categorized as White, Black, Hispanic, Multiracial, Other (including Asian and American Indian), or Unknown. Sequencing on Sequel IIe or Revio platforms, alignment to GRCh38 with pbmm2, and SNV calling using DeepVariant were performed by the Children’s Mercy Center for Genomic Medicine. SLCO1B1 diplotypes were initially assigned using pb-StarPhase (v2.0.1), with novel haplotypes identified using Aldy and a custom R-based tool (staR-SLCO1B1).

Results: A total of 2,119 long-read samples were initially available, including index cases, parents, and siblings. After excluding samples with read coverage of <10x (n=366) and related individuals (n=383), data for 1,370 unrelated subjects were retained for analysis. Across the cohort, the most common allele was SLCO1B1*1 (53%), followed by *14 (15%), *15 (12%), *37 (11%), *20 (5.2%), and *5 (2.5%). The allele frequencies were largely consistent with estimates from the All of Us and UK Biobank cohorts; however, the *14 allele frequency agreed more closely with All of Us and was higher than that reported in the UK Biobank. Rare and novel alleles collectively accounted for 1.6% of all observed alleles and were more frequent among non-White individuals (4.2%) than White individuals (1.3%). Phenotype distributions also differed between White and non-White individuals. For example, decreased-function phenotypes were less frequent in non-White individuals (20.79% vs. 26.51%), whereas the indeterminate SLCO1B1 phenotype was more common in non-White individuals (5.62%) than in White individuals (1.87%), suggesting higher haplotype diversity and a higher frequency of rare and novel alleles. Long-read sequencing identified 13 novel star alleles and 9 suballeles in 14 of 1,370 individuals (1%), which were submitted to PharmVar for star allele designation. Seven novel star alleles (SLCO1B1*59, *60, *63, *65, *68, *70, and *71) were defaulted to *1 by pb-StarPhase because this caller relies on existing PharmVar star allele definitions.
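
As an illustration of the bookkeeping behind these figures, the following minimal Python sketch checks the cohort exclusion arithmetic using the counts reported above and shows one way allele frequencies could be tallied from diplotype calls. The function name `allele_frequencies` and the toy diplotype list are hypothetical and are not study data or the authors' code; the sketch assumes diplotypes are written as strings such as "*1/*15".

```python
from collections import Counter


def allele_frequencies(diplotypes):
    """Return {allele: frequency} from diplotype strings such as '*1/*15'.

    Each diplotype contributes two alleles, so the denominator is
    twice the number of individuals.
    """
    counts = Counter()
    for diplotype in diplotypes:
        first, second = diplotype.split("/")
        counts[first] += 1
        counts[second] += 1
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


# Cohort bookkeeping with the counts reported in the Results.
total_samples = 2119
low_coverage = 366   # read coverage below 10x
related = 383        # related individuals
retained = total_samples - low_coverage - related

# Toy diplotypes (hypothetical, for illustration only).
toy_diplotypes = ["*1/*1", "*1/*14", "*15/*37", "*1/*5"]
freqs = allele_frequencies(toy_diplotypes)

print(f"Retained unrelated subjects: {retained}")
for allele, freq in sorted(freqs.items(), key=lambda item: -item[1]):
    print(f"{allele}: {freq:.1%}")
```

Because frequencies are computed per allele rather than per individual, a homozygous diplotype such as "*1/*1" counts twice toward its allele.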

Conclusions: Long-read sequencing improves haplotype resolution and enables more accurate phenotype prediction, with implications for pharmacogenomic implementation. The GA4K long-read data are an invaluable resource, and ongoing work is focused on characterizing other important pharmacogenes.

Comments

Restricted to Title/Author List/Abstract only as requested by primary author

Poster Board Number: 38
