June 2020
Volume 61, Issue 7
Open Access
ARVO Annual Meeting Abstract | June 2020
Performance of Deep Learning Glaucoma Suspect Models Compared to Various Reference Standards
Author Affiliations & Notes
  • Lu Yang
    Google Health, California, United States
  • Carter Dunn
    Google Health, California, United States
  • Abigail E Huang
    Google Health, California, United States
  • Naama Hammel
    Google Health, California, United States
  • Ilana Traynis
    Advanced Clinical, Deerfield, Illinois, United States
  • Monica Gandhi
    Dr. Shroff’s Charity Eye Hospital, India
  • Jonathan Krause
    Google Health, California, United States
  • Sonia Phene
    Google Health, California, United States
  • Footnotes
    Commercial Relationships   Lu Yang, Google (E); Carter Dunn, Google (E); Abigail Huang, Verily (E); Naama Hammel, Google (E); Ilana Traynis, Google (C); Monica Gandhi, None; Jonathan Krause, Google (E); Sonia Phene, Google (E)
  • Footnotes
    Support  None
Investigative Ophthalmology & Visual Science June 2020, Vol.61, 4538. doi:

      Lu Yang, Carter Dunn, Abigail E Huang, Naama Hammel, Ilana Traynis, Monica Gandhi, Jonathan Krause, Sonia Phene; Performance of Deep Learning Glaucoma Suspect Models Compared to Various Reference Standards. Invest. Ophthalmol. Vis. Sci. 2020;61(7):4538.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Purpose : To train deep learning models that identify glaucoma suspects from color fundus photos and to compare their performance against various reference standards.

Methods : We trained two deep learning models on fundus photos to predict the presence of any referable anatomical abnormality (ANA) indicative of glaucoma. The first model (“3ANA”) outputs whether any of the following three ANAs is present in the fundus photo: vertical cup-to-disc ratio > 0.7, neuroretinal rim notch, or retinal nerve fiber layer defect. The second model (“4ANA”) assesses for any of the same three ANAs or disc hemorrhage. We then measured the performance of 3ANA and 4ANA on two data sets with different reference standards. The primary validation set (n=1119) uses a reference standard based on three glaucoma specialists’ assessment of a single fundus photo. The secondary validation set consists of 346 eyes from 346 patients at an independent institution and uses a reference standard based on a complete clinical glaucoma workup, with each eye classified as glaucoma, glaucoma suspect, or not glaucoma.
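The two prediction targets described in the Methods can be written as simple "any of" rules over per-image findings. The sketch below is illustrative only: the `AnaFindings` class and its field names are hypothetical and are not taken from the abstract, and the code describes the label being predicted, not the deep learning models that predict it from pixels. The only value taken from the abstract is the cup-to-disc ratio threshold of 0.7.

```python
from dataclasses import dataclass

# Threshold stated in the abstract: vertical cup-to-disc ratio > 0.7.
CDR_THRESHOLD = 0.7


@dataclass(frozen=True)
class AnaFindings:
    """Hypothetical per-image findings; names are illustrative assumptions."""
    vertical_cdr: float      # vertical cup-to-disc ratio
    rim_notch: bool          # neuroretinal rim notch present
    rnfl_defect: bool        # retinal nerve fiber layer defect present
    disc_hemorrhage: bool    # disc hemorrhage present


def label_3ana(f: AnaFindings) -> bool:
    """True if any of the three ANAs named for the 3ANA target is present."""
    return f.vertical_cdr > CDR_THRESHOLD or f.rim_notch or f.rnfl_defect


def label_4ana(f: AnaFindings) -> bool:
    """True if any of the three ANAs, or disc hemorrhage, is present."""
    return label_3ana(f) or f.disc_hemorrhage


# An image whose only finding is a disc hemorrhage is positive for the
# 4ANA target but negative for the 3ANA target.
hemorrhage_only = AnaFindings(0.5, False, False, True)
print(label_3ana(hemorrhage_only), label_4ana(hemorrhage_only))
```

Under these definitions every 3ANA-positive image is also 4ANA-positive, so the two targets differ only on images where disc hemorrhage is the sole finding.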

Results : When evaluated on a reference standard based on three glaucoma specialists’ assessment of a single fundus photo, 3ANA and 4ANA achieve AUCs of 0.890 and 0.861, respectively. When compared against a reference standard based on a full glaucoma workup, 3ANA and 4ANA achieve AUCs of 0.778 and 0.782, respectively.
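For readers less familiar with the metric, the AUC reported here can be read as the probability that a randomly chosen reference-positive image receives a higher model score than a randomly chosen reference-negative image, with ties counted as one half. The sketch below computes it that way on a toy input; the `roc_auc` function name and the example labels and scores are assumptions made for illustration and are not data or code from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for reference-standard positive, 0 for negative.
    scores: model outputs, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example (not study data): 3 of the 4 positive/negative pairs are
# ordered correctly.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

The pairwise form is quadratic in the number of images, which is fine at the scale of the validation sets described here; a rank-based implementation would be preferable for much larger sets.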

Conclusions : The models developed to detect the presence of glaucoma-related ANAs agree fairly well with glaucoma specialists’ assessment of fundus photos alone. The apparent decrease in performance, relative to the primary validation set, when evaluating against a reference standard based on the full glaucoma workup may be due to differences in patient populations or in the breadth of clinical data used to arrive at the diagnosis.

This is a 2020 ARVO Annual Meeting abstract.

 

Figure 1. Receiver operating characteristic curve (ROC) of the 3ANA and 4ANA models on the primary validation dataset (n=1119), against a reference standard based on glaucoma specialists’ assessment of a single fundus photograph.

Receiver operating characteristic curve (ROC) of the 3ANA and 4ANA models on the secondary validation dataset (n=346), against a reference standard based on a full glaucoma workup.
