

Background: Untargeted plasma metabolomic profiling combined with novel machine learning (ML) tools may lead to the discovery of metabolic profiles that inform our understanding of pediatric chronic kidney disease (CKD) etiologies and identify potential therapeutic targets.

Objective: We sought to identify metabolomic signatures in pediatric CKD based on etiology: focal segmental glomerulosclerosis (FSGS), obstructive uropathy (OU), aplasia/dysplasia/hypoplasia (A/D/H), and reflux nephropathy (RN).

Design/Methods: Untargeted GC/MS2- and LC/MS2-based metabolomic quantification (Metabolon) was performed on baseline plasma samples from 702 Chronic Kidney Disease in Children (CKiD) participants. Participants per etiology were: FSGS (n=63), OU (n=122), A/D/H (n=109), and RN (n=86). Lasso-penalized logistic regression was used for feature selection, adjusting for age, sex, race, BMI z-score, proteinuria, estimated glomerular filtration rate, hypertension, medication usage, and CKD duration. Four methods were then applied to the selected metabolites to stratify significance: logistic regression, support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGB). Features were considered important if they were significant in at least 2 of the 4 modeling approaches. For the ML models, significance was defined as ranking in the top 10th percentile of weighted input features.
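The consensus rule described above (a feature counts as important when at least 2 of the 4 models flag it, with "flagged" meaning top 10th percentile by weight) can be sketched roughly as follows. This is an illustrative sketch only, not the study's code: the function names, the metabolite labels `m0` to `m9`, and all weights are hypothetical, and it assumes every model, including logistic regression, exposes one numeric weight per feature.

```python
import math


def top_percentile_features(importances, percentile=10.0):
    """Return the feature names whose absolute weight ranks in the top `percentile` percent."""
    ranked = sorted(importances, key=lambda name: abs(importances[name]), reverse=True)
    keep = max(1, math.ceil(len(ranked) * percentile / 100.0))
    return set(ranked[:keep])


def consensus_features(model_importances, min_models=2, percentile=10.0):
    """Keep the features flagged by at least `min_models` of the supplied models."""
    votes = {}
    for importances in model_importances.values():
        for name in top_percentile_features(importances, percentile):
            votes[name] = votes.get(name, 0) + 1
    return sorted(name for name, count in votes.items() if count >= min_models)


# Hypothetical weights for 10 placeholder metabolites (not study data).
features = [f"m{i}" for i in range(10)]
model_importances = {
    "logistic_regression": dict(zip(features, [-0.9, 0.1, 0.2, 0.3, 0.0, 0.1, 0.2, 0.1, 0.0, 0.1])),
    "svm": dict(zip(features, [0.8, 0.2, 0.1, 0.4, 0.1, 0.3, 0.0, 0.1, 0.2, 0.1])),
    "random_forest": dict(zip(features, [0.10, 0.05, 0.05, 0.30, 0.05, 0.20, 0.05, 0.05, 0.10, 0.05])),
    "xgboost": dict(zip(features, [0.15, 0.05, 0.05, 0.20, 0.05, 0.35, 0.05, 0.05, 0.03, 0.02])),
}

selected = consensus_features(model_importances, min_models=2, percentile=10.0)
print(selected)
```

Ranking by absolute value lets signed coefficients (linear models) and non-negative importances (tree ensembles) share one rule; whether the study handled them this way is not stated in the abstract.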

Results: Participant characteristics differed by CKD etiology. Metabolomic profiles were identified for each CKD etiology (Table 1). ML models were evaluated on hold-out validation subsets using 4 metrics: area under the receiver operating characteristic curve, area under the precision-recall curve, F1 score, and Matthews correlation coefficient. On all metrics, the ML models outperformed no-skill prediction (Table 2). FSGS had strong lipid signals that remained significant when the comparison was restricted to the glomerular CKD cohort. Histidine metabolites were associated with OU.
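As a rough illustration of how such metrics compare against a no-skill baseline, the sketch below computes F1, Matthews correlation coefficient, and ROC AUC for a binary classifier in plain Python. The labels, scores, 0.5 threshold, and function names are all hypothetical and are not drawn from the study. The baselines rely on standard properties of a random classifier: ROC AUC of 0.5, precision-recall AUC equal to the positive-class prevalence, and MCC of 0.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true positives, false positives, false negatives, and true negatives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, written in terms of counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient; defined as 0 when any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def no_skill_baselines(y_true):
    """Expected values for a classifier that guesses at random."""
    prevalence = sum(y_true) / len(y_true)
    return {"roc_auc": 0.5, "pr_auc": prevalence, "mcc": 0.0}


# Hypothetical hold-out labels (1 = etiology of interest) and model scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
f1 = f1_score(tp, fp, fn)
mcc = matthews_corrcoef(tp, fp, fn, tn)
auc = roc_auc(y_true, scores)
baselines = no_skill_baselines(y_true)
print(f1, mcc, auc, baselines)
```

Because the precision-recall baseline tracks prevalence, it is lower for rarer classes, which is one reason precision-recall AUC and MCC are often reported alongside ROC AUC for imbalanced comparisons.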

Conclusion(s): We successfully trained ML models on the CKiD metabolomics data to identify metabolomic signatures based on CKD etiology. Using newer techniques such as Lasso, SVM, RF, and XGB in conjunction with traditional statistical approaches increased our confidence in these findings. Sphingomyelin dysmetabolism has been previously described in smaller FSGS studies. This is the largest cohort of pediatric FSGS showing associations with lipid dysmetabolism. We also identified unique, previously undescribed histidine signals in OU.

Presented at the 2021 PAS Virtual Conference

Nephrology | Pediatrics


Using machine learning to identify metabolomic signatures based on pediatric chronic kidney disease etiology