June 2020
Volume 61, Issue 7
ARVO Annual Meeting Abstract  |   June 2020
Impact of Reference Standard, Data Augmentation, and OCT Input on Glaucoma Detection Accuracy by CNNs on a New Test Set
Author Affiliations & Notes
  • Kaveri Thakoor
    Biomedical Engineering, Columbia University, New York, United States
  • Emmanouil (Manos) Tsamis
    Psychology, Columbia University, New York, United States
  • C Gustavo De Moraes
    Ophthalmology, Columbia University Medical Center, New York, United States
  • Paul Sajda
    Biomedical Engineering, Electrical Engineering, Radiology (Physics), Columbia University, New York, United States
  • Donald C Hood
    Psychology, Ophthalmology, Columbia University, New York, United States
  • Footnotes
    Commercial Relationships   Kaveri Thakoor, None; Emmanouil (Manos) Tsamis, Topcon, Inc. (R); C Gustavo De Moraes, Belite (C), Carl Zeiss (C), Galimedix (C), Heidelberg (R), NIH (R), Novartis (C), Perfuse Therapeutics (C), Reichert (C), Research to Prevent Blindness (R), Topcon (R); Paul Sajda, None; Donald Hood, Heidelberg Eng, Inc. (F), Heidelberg Eng, Inc. (C), Heidelberg Eng, Inc. (R), Novartis (F), Novartis (C), Novartis (R), Topcon, Inc. (F), Topcon, Inc. (C), Topcon, Inc. (R)
  • Footnotes
    Support  EY-025253, EY-02115
Investigative Ophthalmology & Visual Science June 2020, Vol.61, 4540. doi:
      Kaveri Thakoor, Emmanouil (Manos) Tsamis, C Gustavo De Moraes, Paul Sajda, Donald C Hood; Impact of Reference Standard, Data Augmentation, and OCT Input on Glaucoma Detection Accuracy by CNNs on a New Test Set. Invest. Ophthalmol. Vis. Sci. 2020;61(7):4540.


      © ARVO (1962-2015); The Authors (2016-present)


Purpose : To evaluate the performance of Convolutional Neural Networks (CNNs), previously shown [1] to detect glaucoma from Optical Coherence Tomography (OCT) Retinal Nerve Fiber Layer (RNFL) probability maps, on a new dataset collected at a different location, by different operators, and on a different OCT instrument, while varying the reference standard (RS), the training procedure, and the input data.

Methods : Five CNNs, previously trained to detect early glaucomatous damage from OCT RNFL probability maps with high accuracy (95%) [1], were evaluated on a new test set without any re-training. Performance was scored against 4 new reference standards (RS). RS1 through RS3 were an OCT expert's gradings based on full OCT reports (RS1), on only RNFL and RGCP (Retinal Ganglion Cell Plexiform) probability maps (RS2), or on only RNFL probability maps, the format provided to the CNNs [1] (RS3). RS4 was the consensus of 3 graders who had access to OCT and visual field information. For the best-performing CNN, the impact on performance of data augmentation during training and of input format (RNFL probability maps alone vs. RNFL and RGCP maps together) was assessed. False positive (FP) and false negative (FN) RNFL images were visualized with Grad-CAMs [2] and quantitatively assessed via abnormal structure and abnormal function (aS-aF) agreement [3].
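
The scoring step described above can be illustrated with a minimal sketch: one fixed set of CNN predictions is compared against several reference standards, and accuracy, FP, and FN counts are tallied for each. The names `Confusion`, `score_against`, and `score_all`, along with the toy labels, are hypothetical and are not taken from the abstract; labels are assumed to be binary (1 = glaucoma, 0 = healthy).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Confusion:
    """Counts of true/false positives and negatives for one reference standard."""
    tp: int = 0
    fp: int = 0
    tn: int = 0
    fn: int = 0

    @property
    def accuracy(self) -> float:
        total = self.tp + self.fp + self.tn + self.fn
        return (self.tp + self.tn) / total if total else float("nan")


def score_against(predictions: List[int], reference: List[int]) -> Confusion:
    """Tally a confusion matrix; both lists are assumed to hold only 0s and 1s."""
    if len(predictions) != len(reference):
        raise ValueError("predictions and reference must have the same length")
    counts = Confusion()
    for pred, ref in zip(predictions, reference):
        if pred == 1 and ref == 1:
            counts.tp += 1
        elif pred == 1 and ref == 0:
            counts.fp += 1
        elif pred == 0 and ref == 0:
            counts.tn += 1
        else:  # pred == 0 and ref == 1 under the binary-label assumption
            counts.fn += 1
    return counts


def score_all(predictions: List[int],
              standards: Dict[str, List[int]]) -> Dict[str, Confusion]:
    """Score the same predictions against every reference standard."""
    return {name: score_against(predictions, labels)
            for name, labels in standards.items()}


# Toy data for illustration only; these are not the study's labels.
toy_predictions = [1, 1, 0, 0, 1, 0]
toy_standards = {
    "RS1": [1, 1, 0, 0, 0, 0],
    "RS4": [1, 0, 0, 1, 1, 0],
}
toy_scores = score_all(toy_predictions, toy_standards)
for name, counts in toy_scores.items():
    print(name, counts, round(counts.accuracy, 3))
```

Because the predictions are held fixed, any difference in accuracy between rows of such a tally reflects only the change in reference standard, which is the comparison the Methods describe.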

Results : The ResNet-18 + Random Forest model, trained with data augmentation and given RNFL probability maps alone as input, was the best-performing model, achieving 83.0% accuracy when transferred to the new test set with RS1 and 81.1% accuracy with the clinically relevant RS4 (Table). aS-aF analysis of FP and FN indicated that the number of aS-aF locations was significantly greater for true positives (TP) than for FN (p < 0.05) (Fig., lower panel). Regions highlighted in Grad-CAMs were also regions with aS-aF agreement (Fig., upper panels).
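
The TP-versus-FN comparison of aS-aF location counts can be sketched as a two-group significance test. The abstract does not state which test was used, so the exact one-sided permutation test below is a swapped-in assumption, and the function name `exact_permutation_p` and the per-eye counts are hypothetical illustrations rather than study data.

```python
from itertools import combinations
from statistics import mean
from typing import Sequence


def exact_permutation_p(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """One-sided exact permutation p-value for mean(group_a) > mean(group_b).

    Enumerates every way of relabelling the pooled values into groups of the
    original sizes, so it is only practical for small samples.
    """
    if not group_a or not group_b:
        raise ValueError("both groups must be non-empty")
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = mean(group_a) - mean(group_b)
    pooled_sum = sum(pooled)

    at_least_as_extreme = 0
    total = 0
    for chosen in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in chosen)
        diff = sum_a / n_a - (pooled_sum - sum_a) / n_b
        # Small tolerance guards against floating-point ties with the observed value.
        if diff >= observed - 1e-12:
            at_least_as_extreme += 1
        total += 1
    return at_least_as_extreme / total


# Hypothetical aS-aF location counts per eye (not data from the study).
tp_counts = [6, 7, 8, 9, 10]
fn_counts = [1, 2, 3, 4]
p_value = exact_permutation_p(tp_counts, fn_counts)
print(f"one-sided p = {p_value:.4f}")
```

A permutation test is used here only because it makes no distributional assumption about the counts; a rank-based or parametric test would be an equally plausible reading of the abstract.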

Conclusions : When transferring CNNs to a new test set, the choice of reference standard, data augmentation, and input image format can improve performance. In this study, RNFL maps alone enabled better performance than RNFL and RGCP maps combined as CNN input. Gradings based on full OCT reports served as the optimal RS for transfer. aS-aF analysis indicated that CNNs miss cases (FNs) when there are significantly fewer aS-aF locations, suggesting that such CNNs could serve to screen RNFL images with extreme damage.

1. Thakoor et al., EMBC 2019
2. Selvaraju et al., ICCV 2017
3. Hood et al., IOVS 2019

This is a 2020 ARVO Annual Meeting abstract.



