Abstract
Purpose :
Development of preventive therapies to reduce progression of non-proliferative diabetic retinopathy (NPDR) to proliferative diabetic retinopathy (DR) is underway. Clinical trial enrollment criteria typically include a baseline DR severity scale (DRSS) level of moderately severe NPDR (level 47) or severe NPDR (level 53), determined from stereoscopic 7-field retinal color photographs by certified graders at the Wisconsin Reading Center (WRC). The estimated screen failure rate after WRC confirmation is 50%, with the most common reason being a DRSS level < 47. We developed an AI algorithm to enable real-time prescreening of color photographs for DRSS level in order to reduce the screen failure rate.
Methods :
Grader training images from WRC were used to train an EfficientNetB4 model, initialized from ImageNet weights and fine-tuned as a binary classifier at an input resolution of 768x768, with an 80:20 split between training and internal validation. The dataset included 572 training images stratified across the NPDR scale (levels 35, 43, 47 and 53); 132 images were set aside for internal validation with the same stratification. The algorithm was trained to distinguish ineligible eyes (levels 35-43) from eligible eyes (levels 47-53) using macula-centered field 2 images only. WRC graders used the DRSS to assess NPDR severity in all 572 eyes.
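The label mapping and stratified split described above can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: the function names, the (image_id, level) record layout, the rounding rule, and the random seed are all assumptions, and model training itself is omitted.

```python
import random
from collections import defaultdict

# DRSS levels named in the abstract; the grouping follows the stated eligibility rule.
INELIGIBLE_LEVELS = {35, 43}
ELIGIBLE_LEVELS = {47, 53}


def to_binary_label(drss_level):
    """Map a DRSS level to 1 (eligible, levels 47-53) or 0 (ineligible, levels 35-43)."""
    if drss_level in ELIGIBLE_LEVELS:
        return 1
    if drss_level in INELIGIBLE_LEVELS:
        return 0
    raise ValueError(f"DRSS level {drss_level} is outside the levels used here")


def stratified_split(records, val_fraction=0.2, seed=0):
    """Split (image_id, drss_level) records so each level keeps the same share.

    Hypothetical helper: shuffles within each level, then holds out
    val_fraction of that level for internal validation.
    """
    by_level = defaultdict(list)
    for image_id, level in records:
        by_level[level].append((image_id, level))

    rng = random.Random(seed)
    train, val = [], []
    for level in sorted(by_level):
        items = by_level[level]
        rng.shuffle(items)
        n_val = round(len(items) * val_fraction)
        val.extend(items[:n_val])
        train.extend(items[n_val:])
    return train, val
```

Splitting within each level, rather than over the pooled images, keeps the validation set from over- or under-representing any one severity level.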
Results :
Of the 37 eyes considered eligible by the AI, the grader agreed on 62.9%. Of the 97 eyes considered ineligible by the AI, the grader agreed on 90.7%. The best model had an area under the curve (AUC) of 0.91. The overall accuracy (weighted by class) was 83%. For the eligible class, sensitivity was 0.59, specificity was 0.71, and the F1 score was 0.65. A review of the false-positive and false-negative images showed that image quality and an imbalance in DR features between field 2 and the peripheral fields were the two largest contributing factors to false results.
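Under the standard definitions, grader agreement on AI-eligible eyes is the positive predictive value, and grader agreement on AI-ineligible eyes is the negative predictive value. The short Python sketch below shows how these quantities, together with sensitivity, specificity and F1, follow from a 2x2 confusion matrix. The function name is hypothetical and the counts in the usage example are invented for illustration; they are not the study's confusion matrix.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute standard binary metrics, treating 'eligible' as the positive class.

    tp: AI eligible, grader eligible      fp: AI eligible, grader ineligible
    fn: AI ineligible, grader eligible    tn: AI ineligible, grader ineligible
    Assumes every denominator is non-zero.
    """
    sensitivity = tp / (tp + fn)   # share of grader-eligible eyes the AI finds
    specificity = tn / (tn + fp)   # share of grader-ineligible eyes the AI rejects
    ppv = tp / (tp + fp)           # grader agreement on AI-eligible eyes
    npv = tn / (tn + fn)           # grader agreement on AI-ineligible eyes
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Illustrative counts only (not from the study).
example = binary_metrics(tp=6, fp=2, fn=4, tn=8)
```

Note that F1 combines the positive predictive value with sensitivity, so it cannot be computed from sensitivity and specificity alone.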
Conclusions :
The AI algorithm is effective at identifying eyes with DRSS levels < 47, which can then be excluded from clinical trial submissions. A tiered approach, with AI prescreening for eligible participants followed by WRC grader confirmation, reduces the screen failure rate, improving cost efficiency and reducing the burden on participants and clinical site staff. Real-time automated assessment of potential eligibility could improve enrollment in DR clinical trials.
This abstract was presented at the 2022 ARVO Annual Meeting, held in Denver, CO, May 1-4, 2022, and virtually.