Abstract
Purpose :
Diabetic retinopathy (DR) causes preventable blindness. Previously, we found a DR prevalence of 21% in a Temple Primary Care Clinics (TPCC) screening program, suggesting that an annual diabetic eye exam is unnecessary for many patients. To optimize the frequency of DR screenings, we used machine learning to develop a DR risk calculator.
Methods :
Diabetic patients were screened for DR from 2016 to 2020 in TPCC. A chart review of 30 clinical parameters and 1 response variable (DR as defined by the International Clinical Diabetic Retinopathy scale) was completed. Five models (generalized linear model/logistic regression: GLM, support vector machine: SVM, recursive partitioning and regression trees: RPART, random forest: RF, and gradient boosted machine: GBM) were trained using 10-fold cross-validation to maximize the area under the receiver operating characteristic curve (AUC). A simple product predictor (PP) was defined as the product of HgbA1c and years with diabetes (DMY). The maximum AUCs of the 5 models and the PP, along with the predicted rates of normal (sensitivity) and abnormal (specificity) retinas, were determined from the resampled data of the training set, and each model was compared with PP by pairwise t-tests on the differences; p values < 0.05 were considered significant. The predictive model was derived on a random allotment of 2/3 of the subjects.
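To make the PP and its AUC concrete, the following is a minimal Python sketch, not the analysis code used in this study. The function names and the six patient records are hypothetical and invented purely for illustration. The AUC is computed from its rank interpretation: the probability that a randomly chosen subject with DR has a higher score than a randomly chosen subject without DR, with ties counted as one half.

```python
from typing import Sequence


def product_predictor(hgba1c: float, dm_years: float) -> float:
    """PP as defined in the Methods: HgbA1c multiplied by years with diabetes (DMY)."""
    return hgba1c * dm_years


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (DR, no-DR) pairs in which the DR subject scores higher.

    Ties contribute 0.5. Labels are 1 for DR and 0 for no DR.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one subject in each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical records: (HgbA1c in %, years with diabetes, DR present?)
patients = [
    (6.5, 2, 0),
    (7.0, 5, 0),
    (9.0, 3, 1),
    (8.0, 3, 0),
    (10.5, 20, 1),
    (7.5, 15, 1),
]

scores = [product_predictor(a1c, years) for a1c, years, _ in patients]
labels = [dr for _, _, dr in patients]
pp_auc = auc(scores, labels)
print(f"PP AUC on the toy data: {pp_auc:.3f}")
```

Because the PP has no fitted parameters, its AUC can be evaluated on any resample without training, which is what makes it a convenient baseline against the five tuned models.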
Results :
A total of 1930 subjects were reviewed; 340 were excluded for missing data, leaving 1590 subjects for analysis. Of the 30 clinical variables, 14 were excluded for statistical insignificance by univariate analysis (p > 0.1), near-zero variance (>0.9/0.1 distribution), or high correlation (R > 0.85). The remaining 16 features included age, type of DM medication, DMY, HgbA1c, mean arterial pressure, creatinine, glomerular filtration rate, chronic kidney disease stage, microalbuminuria, hypertension, hyperlipidemia, coronary artery disease, cerebrovascular accident, peripheral artery disease and angioplasty. Table 1 shows the AUC comparison of the 5 optimized models and PP. None of the differences in AUC or sensitivity between the machine learning models and PP were significant; only the specificity of the GLM was significantly greater than that of PP.
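Two of the three feature-exclusion rules above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's pipeline: the helper names, the toy columns, and the choice to keep the first of two highly correlated features are all hypothetical, and the univariate p-value screen is omitted.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length, non-constant columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def near_zero_variance(values: List[float], max_share: float = 0.9) -> bool:
    """True when the most common value makes up more than max_share of the column."""
    top = max(values.count(v) for v in set(values))
    return top / len(values) > max_share


def screen(features: Dict[str, List[float]], r_max: float = 0.85) -> List[str]:
    """Drop near-zero-variance columns, then columns with |R| > r_max against a kept column.

    Assumption: of two highly correlated columns, the one seen first is kept.
    """
    kept: List[str] = []
    for name, col in features.items():
        if near_zero_variance(col):
            continue
        if any(abs(pearson_r(col, features[k])) > r_max for k in kept):
            continue
        kept.append(name)
    return kept


# Hypothetical toy columns, invented for illustration only
dm_years = [1, 3, 5, 8, 12, 15, 20, 2, 6, 10, 4]
features = {
    "dm_years": dm_years,
    "dm_months": [12 * v for v in dm_years],  # perfectly correlated with dm_years
    "rare_flag": [0] * 10 + [1],              # 10 of 11 values identical (> 0.9)
    "hgba1c": [7.1, 6.5, 9.0, 7.8, 6.9, 8.4, 7.2, 10.1, 6.8, 7.5, 8.8],
}

kept = screen(features)
print(kept)
```

On the toy data, the redundant and near-constant columns are dropped and the two informative columns remain.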
Conclusions :
In resampled training data, a simple PP predicted DR as well as optimized, complex machine learning models. This finding needs to be validated on an unseen test set.
This abstract was presented at the 2023 ARVO Annual Meeting, held in New Orleans, LA, April 23-27, 2023.