Abstract
Purpose :
Advances in artificial intelligence have enabled the development of predictive models for glaucoma. However, most work is single-center and uncertainty exists regarding the generalizability of such models. The purpose of this study was to build and evaluate machine learning (ML) approaches to predict glaucoma progression requiring surgery using data from a large multicenter consortium of electronic health records (EHR).
Methods :
Structured EHR data from 5 academic eye centers participating in SOURCE were identified, including demographics, diagnosis codes, medications, and clinical information (intraocular pressure, visual acuity, refractive status, and central corneal thickness). We developed machine learning models to predict whether glaucoma patients (identified by ICD codes) would progress to glaucoma surgery (identified by CPT codes) using the following modeling approaches: 1) penalized logistic regression (lasso, ridge, and elastic net); 2) random forest. One site was reserved as an “external site” test set (N=2574); of the patients from the remaining sites, 3000 each were randomly selected to be in development and test sets, with the remaining 34747 reserved for model training. Evaluation metrics included area under the receiver operating characteristic curve (AUROC) on the test set and the external site.
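The partitioning described above can be illustrated with a minimal sketch. It is not the study's actual pipeline: the function `split_patients`, the site labels, and the synthetic cohort are hypothetical, and only the partition sizes (2574 / 3000 / 3000 / 34747) come from the abstract.

```python
import random

def split_patients(patients, external_site, n_dev=3000, n_test=3000, seed=0):
    """Partition (patient_id, site) tuples into train/dev/test/external sets.

    Hypothetical helper: one site is held out entirely; the remaining
    patients are shuffled and split into development, test, and training sets.
    """
    external = [p for p in patients if p[1] == external_site]
    internal = [p for p in patients if p[1] != external_site]
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(internal)
    return {
        "dev": internal[:n_dev],
        "test": internal[n_dev:n_dev + n_test],
        "train": internal[n_dev + n_test:],
        "external": external,
    }

# Synthetic cohort sized to match the abstract (site names are made up).
patients = [(i, "site_E") for i in range(2574)]
patients += [(2574 + i, "site_" + "ABCD"[i % 4]) for i in range(40747)]

splits = split_patients(patients, external_site="site_E")
sizes = {name: len(members) for name, members in splits.items()}
print(sizes)
```

Holding out a whole site, rather than a random sample of patients, is what allows the external test set to probe generalizability across institutions.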
Results :
Of 43321 patients, 5544 (12.8%) underwent glaucoma surgery. Model performance is summarized in the figure. Overall, the AUROC ranged from 0.651 to 0.675 on the random test set and from 0.623 to 0.673 on the external test site, with the random forest model performing best on both sets. The performance decrease from the random test set to the external test site was greater for the penalized regression models than for the random forest model.
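For readers less familiar with the metric, AUROC can be read as the probability that a randomly chosen patient who underwent surgery receives a higher predicted score than a randomly chosen patient who did not. The following is a minimal sketch of that rank-based (Mann-Whitney) computation; the function name `auroc` and the toy labels and scores are illustrative assumptions, not study data or the study's code.

```python
def auroc(labels, scores):
    """Rank-based AUROC for binary labels (1 = surgery, 0 = no surgery)."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores and give each member the average rank.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    # Mann-Whitney U statistic for the positives, normalized to [0, 1].
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Toy example (made-up values): three of four positive/negative pairs are ordered correctly.
toy_auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(toy_auc)  # 0.75
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.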
Conclusions :
ML models developed using EHR data can predict whether glaucoma patients will need surgery. Performance of our predictive models was similar to that of prior models trained on structured EHR data from a single center. Caution should accompany deployment of predictive models to populations at sites external to the original training set. Additional research is needed to investigate the impact of protected class characteristics such as race or gender on model performance and fairness.
This abstract was presented at the 2023 ARVO Annual Meeting, held in New Orleans, LA, April 23-27, 2023.