Abstract
Purpose :
As healthcare shifts towards value-based care, there has been an increased focus on providing efficient and cost-effective clinical services. Late patient arrival is an important barrier to clinic efficiency. Predicting which patients will be late can allow clinic schedulers to adjust and optimize the schedule to minimize the disruption caused by lateness. However, predicting late patients is challenging because many factors are associated with late arrival for appointments. The purpose of this study was to develop machine learning models to predict late patients in pediatric ophthalmology clinics at Oregon Health & Science University (OHSU) Casey Eye Institute.
Methods :
Study data were collected from office visits at OHSU from 2012 to 2018. Time-stamp and office visit data were extracted from the enterprise-wide clinical warehouse and used to calculate time-related variables. Patients who checked in more than 10 minutes after their scheduled appointment time were considered late. Models using random forest, gradient boosting machine (GBM), support vector machine (SVM), and logistic regression were developed to predict whether the patient would arrive late. We used 10-fold cross-validation to reduce over-fitting. Area under the receiver operating characteristic curve (AUC-ROC) scores were used to evaluate the accuracy of the prediction models. We also ranked the importance of predictors based on the mean decrease in impurity in the GBM and random forest models.
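The late-arrival definition and the AUC-ROC score described above can be illustrated with a minimal Python sketch. The visit records and model scores are hypothetical, and `is_late` and `roc_auc` are illustrative helper names rather than the study's code. The AUC is computed here with the pairwise-ranking formulation (the probability that a late visit is scored higher than an on-time visit), which is an assumption about presentation, not a statement of how the authors computed it.

```python
from datetime import datetime, timedelta

LATE_THRESHOLD = timedelta(minutes=10)  # cutoff stated in the Methods


def is_late(scheduled: datetime, checked_in: datetime) -> bool:
    """True when check-in is MORE than 10 minutes after the scheduled time."""
    return checked_in - scheduled > LATE_THRESHOLD


def roc_auc(labels: list, scores: list) -> float:
    """AUC-ROC as the fraction of (late, on-time) pairs in which the late
    visit has the higher score; tied scores count as one half."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one late and one on-time visit")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical visits: (scheduled time, check-in time, model score)
visits = [
    (datetime(2018, 3, 1, 9, 0), datetime(2018, 3, 1, 9, 15), 0.8),     # 15 min after
    (datetime(2018, 3, 1, 9, 30), datetime(2018, 3, 1, 9, 40), 0.6),    # exactly 10 min
    (datetime(2018, 3, 1, 10, 0), datetime(2018, 3, 1, 9, 50), 0.2),    # early
    (datetime(2018, 3, 1, 10, 30), datetime(2018, 3, 1, 10, 45), 0.5),  # 15 min after
]
labels = [is_late(scheduled, checked_in) for scheduled, checked_in, _ in visits]
scores = [score for _, _, score in visits]
auc = roc_auc(labels, scores)
```

In this toy data the visit checked in exactly 10 minutes after the scheduled time is labeled on time, since the definition requires more than 10 minutes. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of late and on-time visits.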
Results :
Figure 1 shows the ROC curves and AUC-ROC scores of the four machine learning models for distinguishing late patients. The GBM had the best accuracy (AUC=0.654), sensitivity (66%), and specificity (64%). The random forest model had the second-best performance (AUC=0.652), followed by the logistic regression model (AUC=0.643) and SVM (AUC=0.639). The three most important predictors identified in the GBM model were clinic volume, whether the patient arrived late to the previous visit, and previous exam length.
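Sensitivity and specificity, as reported above, depend on the score threshold used to call a patient late. The sketch below shows how the two quantities are derived from predictions at a single cutoff. The labels, scores, and the 0.5 threshold are hypothetical; the abstract does not state which threshold was used.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) when scores >= threshold predict 'late'."""
    tp = fn = tn = fp = 0
    for late, score in zip(labels, scores):
        predicted_late = score >= threshold
        if late and predicted_late:
            tp += 1  # late patient correctly flagged
        elif late:
            fn += 1  # late patient missed
        elif predicted_late:
            fp += 1  # on-time patient wrongly flagged
        else:
            tn += 1  # on-time patient correctly left unflagged
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: 3 late visits and 4 on-time visits with model scores
labels = [True, True, True, False, False, False, False]
scores = [0.9, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
```

Lowering the threshold flags more patients as late, which raises sensitivity at the cost of specificity; the ROC curve traces this trade-off across all thresholds.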
Conclusions :
Machine learning models built on secondary use of EHR data can predict late patients with reasonable success. More work is needed to refine the models and improve their accuracy. Late arrival prediction has implications for improving clinical scheduling efficiency and patient satisfaction.
This is a 2020 ARVO Annual Meeting abstract.