Nikita Mokhashi, Julia Grachevskaya, Lorrie Cheng, Daohai Yu, Jeffrey Henderer; A comparison of artificial intelligence and human diabetic retinal image interpretation in an urban health system. Invest. Ophthalmol. Vis. Sci. 2020;61(7):5311.
Artificial intelligence (AI) diabetic retinopathy (DR) software has the potential to decrease time spent by clinicians on image interpretation, provide point-of-care results to the patient, and expand the scope of DR screening. We are beginning to employ AI in our Temple University Health System screening program. To compare AI to the human grader, we performed a retrospective review to determine the sensitivity and specificity of Eyenuk’s EyeArt system (Woodland Hills, CA) compared to Temple Ophthalmology optometry (OD) grading using the International Classification of Diabetic Retinopathy (ICDR) scale.
260 consecutive diabetic patients from the Temple Faculty Practice Internal Medicine clinic underwent 2-field retinal imaging between April 1, 2019 and August 1, 2019. At least 1 optic nerve-centered image and 1 macula-centered image from each eye were analyzed by EyeArt, yielding a reading of non-referable DR, referable DR, or ungradable for each eye. No DR (ICDR 0) or mild DR (ICDR 1) was classified as non-referable DR. Moderate (ICDR 2), severe (ICDR 3), or proliferative (ICDR 4) DR was classified as referable DR. The OD reading was classified using the identical system. For interpretable images, if either eye was classified as referable, the patient was classified as referable; if both eyes were classified as non-referable, the patient was classified as non-referable. EyeArt and OD results were compared by patient-level agreement and were analyzed for sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV).
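The eye- and patient-level classification rules described above can be sketched as follows. This is a minimal illustration of the stated logic, not the study's software; the function names and example grades are hypothetical.

```python
# ICDR grades 0-1 are non-referable; grades 2-4 (moderate, severe,
# proliferative DR) are referable, per the classification described above.
REFERABLE_GRADES = {2, 3, 4}

def eye_referable(icdr_grade: int) -> bool:
    """Return True if a single eye's ICDR grade is referable (2-4)."""
    return icdr_grade in REFERABLE_GRADES

def patient_referable(right_eye_grade: int, left_eye_grade: int) -> bool:
    """A patient is referable if either eye is referable."""
    return eye_referable(right_eye_grade) or eye_referable(left_eye_grade)

print(patient_referable(0, 2))  # → True  (moderate DR in one eye)
print(patient_referable(1, 0))  # → False (mild/no DR in both eyes)
```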
The average age was 61 years (range, 21-90 years); 39.2% of patients were male and 60.8% were female. The breakdown of ungradable images is shown in Figure 1. Overall, there was a significant difference between EyeArt and OD grading (p<0.0001). After excluding ungradable images, EyeArt sensitivity was 100% and specificity was 77.3%; PPV was 23.5% and NPV was 100% (Figure 2).
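The four reported metrics follow from a 2x2 agreement table with the OD grading as the reference standard. The sketch below shows the standard definitions; the counts used in the example are hypothetical and are not the study's data, chosen only to illustrate how perfect sensitivity can coexist with a low PPV when false positives are common.

```python
# Screening metrics from a 2x2 table, treating the human (OD) grading
# as the reference standard. Counts here are hypothetical.

def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    return {
        "sensitivity": tp / (tp + fn),  # fraction of referable patients flagged
        "specificity": tn / (tn + fp),  # fraction of non-referable patients cleared
        "ppv": tp / (tp + fp),          # fraction of AI positives truly referable
        "npv": tn / (tn + fn),          # fraction of AI negatives truly non-referable
    }

# With no false negatives but many false positives, sensitivity is 100%
# while PPV is low — the same pattern as the reported results.
m = screening_metrics(tp=10, fp=30, fn=0, tn=160)
print(m["sensitivity"], m["ppv"])  # → 1.0 0.25
```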
EyeArt image grading yielded a different result than human grading. EyeArt is very unlikely to miss disease, but it produces a substantial false positive rate when compared with the OD grading as the gold standard. A positive EyeArt result may therefore not be sufficient to rule in disease.
This is a 2020 ARVO Annual Meeting abstract.