Rui Fan, Christopher Bowd, Mark Christopher, Nicole Brye, James A Proudfoot, Jasmin Rezapour, Akram Belghith, Robert N Weinreb, David Kriegman, Linda M Zangwill; Deep learning for detecting glaucoma in the Ocular Hypertension Treatment Study: Implications for clinical trial endpoints. Invest. Ophthalmol. Vis. Sci. 2021;62(8):1006.
© ARVO (1962-2015); The Authors (2016-present)
To investigate the diagnostic accuracy of deep learning (DL) algorithms, trained on fundus photographs from the Ocular Hypertension Treatment Study (OHTS), for detecting primary open-angle glaucoma (POAG).
A total of 74,678 photographs from 3,272 eyes of 1,636 OHTS participants with a mean follow-up (range) of 10.7 (0.0, 14.3) years were used to train a ResNet-50 deep learning model to detect the OHTS I and II Endpoint Committee POAG determination based on optic disc (n=287 eyes, 3,502 photographs) and/or visual field (n=198 eyes, 2,300 visual fields) changes. OHTS training, validation, and testing sets were randomly determined using an 85-5-10 percentage split by subject. Three independent test sets (1: UCSD Diagnostic Innovations in Glaucoma Study (DIGS), 2: ACRIMA (Spain), and 3: Large-scale Attention-based Glaucoma (LAG, China)) were used to estimate the generalizability of the model. Areas under the receiver operating characteristic curve (AUROC) and sensitivities at fixed specificities were used to compare model performance. False positive rates at a fixed specificity of 90% were evaluated to determine whether the DL model detected glaucoma before the Endpoint Committee determination.
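The subject-level split described above keeps every photograph from both eyes of a participant in the same set, so that test performance is not inflated by images of an eye the model has already seen. A minimal sketch of such a split follows. The function name, the record layout, and the synthetic records are illustrative assumptions, not the authors' code.

```python
import random
from collections import defaultdict


def split_by_subject(subject_ids, fractions=(0.85, 0.05, 0.10), seed=0):
    """Map each unique subject to 'train', 'val', or 'test'.

    The split is done over subjects, not photographs, so no subject
    contributes images to more than one set.
    """
    unique = sorted(set(subject_ids))
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    rng.shuffle(unique)

    n_train = round(fractions[0] * len(unique))
    n_val = round(fractions[1] * len(unique))

    assignment = {}
    for i, sid in enumerate(unique):
        if i < n_train:
            assignment[sid] = "train"
        elif i < n_train + n_val:
            assignment[sid] = "val"
        else:
            assignment[sid] = "test"
    return assignment


# Hypothetical records: (subject_id, eye, photo_index); 100 subjects,
# 2 eyes each, 3 photographs per eye.
records = [(s, eye, k) for s in range(100) for eye in ("OD", "OS") for k in range(3)]

assignment = split_by_subject([r[0] for r in records])

# Every photograph follows its subject into that subject's set.
splits = defaultdict(list)
for rec in records:
    splits[assignment[rec[0]]].append(rec)
```

Because the percentages apply to subjects, the share of photographs in each set only approximates 85-5-10 when participants have different numbers of photographs.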
For the OHTS test set, the DL model achieved an AUROC (95% CI) of 0.87 (0.80, 0.91) for the overall OHTS POAG endpoint. For the OHTS endpoints based on optic disc changes or visual field changes, AUROCs were 0.90 (0.87, 0.93) and 0.87 (0.80, 0.91), respectively. False positive rates (at 90% specificity) were higher in earlier photographs of ocular hypertensive eyes that later developed POAG by disc or visual field (19.1%) than in ocular hypertensive eyes that did not develop POAG (7.3%) during their OHTS follow-up. The diagnostic accuracy of the DL model developed on the OHTS optic disc endpoint was lower on the 3 independent datasets, with AUROCs of 0.74 (0.69, 0.79) for DIGS, 0.74 (0.70, 0.77) for ACRIMA, and 0.79 (0.78, 0.81) for LAG.
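The quantities reported here (AUROC, sensitivity at a fixed specificity, and the rate of positive calls in photographs taken before the endpoint) can be illustrated with a small sketch. All scores below are made-up values for illustration, and the choice to set the 90% specificity threshold on eyes that never developed POAG is an assumption; the abstract does not state how the threshold was derived.

```python
import math


def auroc(scores_pos, scores_neg):
    """Rank-based AUROC: probability that a random positive scores
    higher than a random negative, counting ties as one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def threshold_at_specificity(scores_neg, specificity=0.90):
    """Smallest observed score t such that at least `specificity`
    of the negative scores are <= t (positives are scores > t)."""
    ordered = sorted(scores_neg)
    k = math.ceil(specificity * len(ordered))
    return ordered[k - 1]


def positive_rate(scores, threshold):
    """Fraction of scores called positive at the given threshold."""
    return sum(s > threshold for s in scores) / len(scores)


# Hypothetical model outputs (probability of POAG), not study data.
never_converted = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.35, 0.40, 0.55, 0.70]
after_endpoint = [0.50, 0.62, 0.75, 0.90]          # photographs at or after the endpoint
before_endpoint = [0.15, 0.45, 0.60, 0.65, 0.80]   # earlier photographs of converting eyes

t = threshold_at_specificity(never_converted, 0.90)

auc = auroc(after_endpoint, never_converted)
sensitivity = positive_rate(after_endpoint, t)
fpr_never_converted = positive_rate(never_converted, t)
fpr_before_endpoint = positive_rate(before_endpoint, t)
```

In this framing, a positive call on a pre-endpoint photograph counts as a false positive against the Endpoint Committee label, so a higher rate in eyes that later convert is consistent with the model flagging some of them early.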
The high diagnostic accuracy of the current DL model suggests that DL can be used to automate the determination of POAG for clinical trials and management. In addition, the higher false positive rate in early photographs of eyes that later developed POAG suggests that DL models detected POAG in some eyes earlier than the OHTS POAG Endpoint Committee.
This is a 2021 ARVO Annual Meeting abstract.