Paolo S Silva, Drew Lewis, Jerry Cavallerano, Mohamed Ashraf, Cris Martin P. Jacoba, Duy Doan, Frank S Wang, Jennifer K Sun, Lloyd P Aiello; Automated machine learning (AutoML) model for diabetic retinopathy (DR) image classification from ultrawide field (UWF) retinal images. Invest. Ophthalmol. Vis. Sci. 2022;63(7):2095 – F0084.
© ARVO (1962-2015); The Authors (2016-present)
To create and validate automated deep learning models for DR that are trained on UWF images obtained from a DR teleophthalmology program.
AutoML Vision (Google) models were generated from nonmydriatic UWF images from the Joslin Vision Network (JVN). Image labeling was based on standardized evaluation by the JVN reading center following the clinical Early Treatment Diabetic Retinopathy Study Severity Scales (ETDRS-SS). Images for the initial model were split 8:1:1 into training, validation and testing sets to detect referable DR [(refDR), defined as moderate nonproliferative DR (NPDR) or worse]. External testing of the autoML model was performed using a published image set with matching nonmydriatic UWF images, clinical exam and standard 7-field ETDRS photos (N=192 eyes). Sensitivity and specificity (SN/SP) for refDR were calculated. Based on published FDA requirements, prespecified performance thresholds were defined at 0.85/0.825 for SN/SP.
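The labeling rule, the 8:1:1 split and the SN/SP calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the abstract does not say how the split was actually performed or how AutoML Vision was configured, so the function names, the severity label strings and the random seed below are all assumptions made for the example.

```python
import random

# Severity levels in increasing order, using the category names from the abstract.
SEVERITY_ORDER = ["no DR", "mild NPDR", "moderate NPDR", "severe NPDR", "PDR"]


def is_referable(grade):
    """refDR as defined in the abstract: moderate NPDR or worse."""
    return SEVERITY_ORDER.index(grade) >= SEVERITY_ORDER.index("moderate NPDR")


def split_8_1_1(items, seed=0):
    """Shuffle and split items 8:1:1 (hypothetical helper, not the authors' code)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 8 // 10
    n_val = n // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for boolean refDR labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    return tp / (tp + fn), tn / (tn + fp)


def meets_thresholds(sn, sp, sn_min=0.85, sp_min=0.825):
    """Compare SN/SP with the prespecified 0.85/0.825 thresholds."""
    return sn >= sn_min and sp >= sp_min
```

A model would pass the prespecified criteria only if `meets_thresholds` returns `True` for the sensitivity and specificity computed on the external test set.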
Distribution of ETDRS-SS in the training set (N=3,999 images): no DR 33.8%, mild NPDR 16.2%, moderate NPDR 29.4%, severe NPDR 5.0%, PDR 16.6%. RefDR was present in 50.0% of images. Area under the precision-recall curve (AUPRC) was 0.947 (Figure 1). The model’s overall accuracy for refDR was 92.2%. External testing set distribution of ETDRS-SS by UWF/clinical exam/ETDRS photos: no DR 10.9/8.9/12.5%, mild NPDR 22.9/18.7/22.9%, moderate NPDR 33.8/33.8/29.7%, severe NPDR 10.9/12.0/8.3%, PDR 21.3/26.6/26.6%. RefDR was present in 66.1/72.4/64.6%. SN/SP for refDR on the external test set were 0.79/0.83 against UWF, 0.76/0.90 against clinical exam and 0.79/0.81 against ETDRS photos. Table 1 shows a comparison with reported metrics from FDA-approved and UWF algorithms.
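As a quick arithmetic check on these figures, the following Python sketch compares the reported external-test SN/SP values with the prespecified 0.85/0.825 thresholds and recomputes refDR prevalence as the sum of the moderate NPDR, severe NPDR and PDR percentages. The dictionary and function names are illustrative assumptions; only the numbers are taken from the abstract.

```python
# Prespecified thresholds from the methods (sensitivity, specificity).
SN_MIN, SP_MIN = 0.85, 0.825

# Reported SN/SP on the external test set, by reference standard.
reported = {
    "UWF": (0.79, 0.83),
    "clinical exam": (0.76, 0.90),
    "ETDRS photos": (0.79, 0.81),
}

# Reported external-set percentages for (moderate NPDR, severe NPDR, PDR).
referable_categories = {
    "UWF": (33.8, 10.9, 21.3),
    "clinical exam": (33.8, 12.0, 26.6),
    "ETDRS photos": (29.7, 8.3, 26.6),
}


def check(sn, sp):
    """Report whether each metric reaches its prespecified threshold."""
    return {"sensitivity_ok": sn >= SN_MIN, "specificity_ok": sp >= SP_MIN}


threshold_results = {ref: check(sn, sp) for ref, (sn, sp) in reported.items()}

# refDR share = moderate NPDR + severe NPDR + PDR, rounded to one decimal place.
refdr_share = {ref: round(sum(parts), 1) for ref, parts in referable_categories.items()}

for ref in reported:
    print(ref, threshold_results[ref], refdr_share[ref])
```

On the reported numbers, none of the three comparisons reaches the 0.85 sensitivity threshold, while specificity reaches 0.825 against UWF and clinical exam but not against ETDRS photos. The summed UWF categories give 66.0% against the reported 66.1%, a difference that would be consistent with rounding of the individual percentages.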
Despite the increasing adoption of UWF imaging, there are no commercially available artificial intelligence DR algorithms for UWF images. This study demonstrates the feasibility of using autoML models to identify refDR from UWF images obtained in a teleophthalmology program. Despite the large image size and complexity of UWF compared to standard retinal images, the performance approaches published diagnostic accuracy metrics of commercial models used for DRSP. This proof-of-concept use highlights the broad future potential of autoML in allowing programs with large image datasets to address emerging clinical needs with AI applications.
This abstract was presented at the 2022 ARVO Annual Meeting, held in Denver, CO, May 1-4, 2022, and virtually.
Figure 1: Area under the precision-recall curve: 0.957