Edward Korot, Siegfried Wagner, Livia Faes, Dun Jack Fu, Xiaoxuan Liu, Daniel Ferraz, Hagar Khalid, Reena Chopra, Gabriella Moraes, Gongyu Zhang, Zeyu Guan, Konstantinos Balaskas, Pearse Andrew Keane; AI building AI: Deep Learning Detection of Referable Diabetic Retinopathy Sans-coding. Invest. Ophthalmol. Vis. Sci. 2020;61(7):2025.
© ARVO (1962-2015); The Authors (2016-present)
To create a deep learning algorithm that classifies fundus photos for referable diabetic retinopathy (DR) without coding. Deployment enables batch classification of large fundus photo datasets, so that performance can be compared against clinicians and regulatory-approved models.
We used publicly available fundus photo datasets, including Messidor-2 and EyePACS. Retina specialist-adjudicated ground truth labels for DR level, macular edema (ME), and gradeability were sourced from the Kaggle platform. Referable DR was defined as moderate or greater DR (ICDR scale) and/or the presence of ME, as determined by hard exudates within one disc diameter of the fovea. In datasets without provided gradeability ground truths, we manually labeled gradeability for a selection of photos (n=3982). A grading and adjudication process was employed to resolve disagreements between 3 retina specialists. Ungradable images were excluded from analysis. Google Cloud AutoML was used to train a deep learning classification model through a code-free online interface. The algorithm was exported as edge and cloud models, which were integrated into a framework for batch fundus photo prediction. The edge model was uploaded to GitHub for public research use.
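The referable DR definition above can be written as a small labeling rule. The following is a minimal sketch, not the authors' pipeline: the names `ICDR` and `is_referable` are hypothetical, and it assumes DR grades are encoded 0 to 4 on the ICDR scale (none, mild, moderate, severe, proliferative) with ME supplied as a boolean.

```python
from enum import IntEnum


class ICDR(IntEnum):
    """ICDR severity levels, assuming the common 0-4 integer encoding."""
    NONE = 0
    MILD = 1
    MODERATE = 2
    SEVERE = 3
    PROLIFERATIVE = 4


def is_referable(dr_grade: int, macular_edema: bool) -> bool:
    """Return True if an image meets the referable DR definition.

    Referable DR: moderate or greater DR on the ICDR scale,
    and/or macular edema present.
    """
    # ICDR(dr_grade) raises ValueError for grades outside 0-4.
    return ICDR(dr_grade) >= ICDR.MODERATE or macular_edema
```

Under this rule, mild DR without ME is non-referable, while any grade with ME present is referable. Ungradable images would be filtered out before this step, since they were excluded from analysis.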
Area under the precision-recall curve (AUPRC) was 0.964. The model’s overall accuracy was 88.5%. When the threshold was adjusted to a liberal operating point (optimizing for sensitivity to mimic commercial DR screening models), sensitivity was 85%, specificity 90%, PPV 73%, and NPV 96%. At a conservative operating point (optimizing for specificity to mimic ophthalmologists), sensitivity was 78%, specificity 98%, PPV 92%, and NPV 93%.
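To illustrate how moving the decision threshold trades sensitivity against specificity, here is a minimal sketch that computes the four reported metrics from model scores and ground truth labels. The function name `screening_metrics`, the toy scores, and the two thresholds are hypothetical and chosen only for illustration; they are not the study's data or operating points.

```python
import math
from typing import Dict, Sequence


def screening_metrics(scores: Sequence[float],
                      labels: Sequence[int],
                      threshold: float) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV at a given score threshold.

    labels: 1 = referable DR, 0 = non-referable.
    An image is predicted referable when its score >= threshold.
    """
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif not predicted and label:
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> float:
        # Undefined when the denominator is zero.
        return num / den if den else math.nan

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy data (illustrative only).
toy_scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
toy_labels = [1, 1, 0, 1, 0, 0]

liberal = screening_metrics(toy_scores, toy_labels, threshold=0.35)
conservative = screening_metrics(toy_scores, toy_labels, threshold=0.70)
```

On the toy data, the lower threshold catches every referable case at the cost of a false positive, while the higher threshold removes the false positive but misses one referable case. This is the same direction of trade-off as between the liberal and conservative operating points reported above.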
We demonstrate the use of automated machine learning (AutoML) sans-coding for the creation of an algorithm to detect referable DR. Performance matched or exceeded published diagnostic accuracy metrics of ophthalmologists and of commercial models used for screening. We published a version of the model for public use as part of a research toolkit. In designing such an algorithm, we elucidate the potential of AutoML. This framework may democratize access to machine learning by allowing clinicians without specialized computing expertise to drive model development. We hope these results encourage ophthalmologists and researchers to design their own AutoML models for the benefit of the community.
This is a 2020 ARVO Annual Meeting abstract.