June 2020
Volume 61, Issue 7
Open Access
ARVO Annual Meeting Abstract  |   June 2020
How much benefit can a deep learning system provide for diabetic retinopathy grading?: A Wizard-of-Oz study
Author Affiliations & Notes
  • Rory Sayres
    Google, Mountain View, California, United States
  • Lin Yang
    Verily Life Sciences, Mountain View, California, United States
  • Abigail Huang
    Verily Life Sciences, Mountain View, California, United States
  • Shawn Xu
    Verily Life Sciences, Mountain View, California, United States
  • Siva Balasubramanian
    Advanced Clinical, Deerfield, Illinois, United States
  • Ilana Traynis
    Advanced Clinical, Deerfield, Illinois, United States
  • Anna Iurchenko
    Google, Mountain View, California, United States
  • Sonali Verma
    Adecco, Mountain View, California, United States
  • Daniel Golden
    Verily Life Sciences, Mountain View, California, United States
  • Footnotes
    Commercial Relationships   Rory Sayres, Google (E), Google (I), Google (F); Lin Yang, Verily Life Sciences (E), Verily Life Sciences (I); Abigail Huang, Verily Life Sciences (E), Verily Life Sciences (I); Shawn Xu, Verily Life Sciences (E), Verily Life Sciences (I); Siva Balasubramanian, Google (C); Ilana Traynis, Google (C); Anna Iurchenko, Google (E), Google (I); Sonali Verma, Verily Life Sciences (C); Daniel Golden, Verily Life Sciences (E), Verily Life Sciences (I)
  • Footnotes
    Support  None
Investigative Ophthalmology & Visual Science June 2020, Vol.61, 3315. doi:

      Rory Sayres, Lin Yang, Abigail Huang, Shawn Xu, Siva Balasubramanian, Ilana Traynis, Anna Iurchenko, Sonali Verma, Daniel Golden; How much benefit can a deep learning system provide for diabetic retinopathy grading?: A Wizard-of-Oz study. Invest. Ophthalmol. Vis. Sci. 2020;61(7):3315.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Purpose : Deep-learning systems (DLS) may improve diabetic retinopathy (DR) assessment of retinal fundus images (Sayres et al, Ophthalmology 2019). However, previous methods left room for further gains in accuracy, and grading tended to be slower with assistance. The heatmap methods used previously may also have limited explanatory power for clinicians.

We ask: What is the upper bound on performance benefit from a DR assistant, if the DLS output is as accurate and interpretable as possible? And can a localization-based assistant provide clearer explanations to clinicians?

Methods : We ran a “Wizard-of-Oz” style study in which assistive overlays were edited for optimal accuracy, and presented to clinicians as automatically generated output. We trained a segmentation-based DLS on 13,647 hand-segmented images to localize DR-relevant pathologies such as hemorrhages and neovascularization. The trained DLS was applied to an evaluation set of 400 fundus images, enriched for cases with DR. DLS output was manually edited by an expert ophthalmologist to reflect the underlying pathology as accurately as possible.

Four readers (2 retina specialists, 2 optometrists) read each case in a multi-reader, multi-case study with full crossover. Readers assessed DR gradability and severity for each case. Each batch was read either assisted by the DLS or unassisted; when assisted, the lesion-localization overlay could be toggled on/off as needed. Readers read each case in each arm, with a one-month washout in between.
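The full-crossover reading design described above can be sketched as a schedule generator. This is only an illustration under stated assumptions: the abstract does not say how cases were batched or how arm order was balanced, so the interleaved two-batch split, the alternation of the starting arm by reader, and the name `crossover_schedule` are all hypothetical.

```python
def crossover_schedule(readers, case_ids, n_batches=2):
    """Build a full-crossover reading schedule.

    Returns a list of (reader, session, case_id, arm) tuples in which every
    reader reads every case once per arm. Session 2 is assumed to take place
    after the washout period.

    Assumptions (not taken from the abstract): cases are split into
    interleaved batches, and the arm used in session 1 alternates by batch
    and by reader to balance order effects.
    """
    arms = ("assisted", "unassisted")
    schedule = []
    for r_idx, reader in enumerate(readers):
        for c_idx, case in enumerate(case_ids):
            batch = c_idx % n_batches
            first = (batch + r_idx) % 2          # arm index used in session 1
            schedule.append((reader, 1, case, arms[first]))
            schedule.append((reader, 2, case, arms[1 - first]))
    return schedule


# Example with the study's dimensions: 4 readers, 400 cases.
schedule = crossover_schedule(["R1", "R2", "R3", "R4"], list(range(400)))
```

With four readers and 400 cases this yields 3,200 reads in total, 1,600 per arm.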

Results : DR grading accuracy, evaluated against an adjudicated reference standard, increased substantially with assistance. For moderate or worse DR, specificity increased from 92.4% unassisted to 96.6% assisted (p = 0.01, Obuchowski-Rockette analysis), while sensitivity remained high, rising from 92.4% unassisted to 95.6% assisted (p = 0.14). Cohen’s quadratically weighted kappa for the 5-point DR grade increased significantly from 90.4% to 96.0%. Mean grading time decreased overall with assistance, from 99.6 sec to 87.8 sec (p = 0.002, t-test).
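For readers less familiar with the metrics reported above, the following is a minimal sketch of how sensitivity and specificity at the "moderate or worse" threshold, and a quadratically weighted Cohen's kappa, can be computed from 5-point grades. It is not the authors' analysis code. The grade coding (0 = no DR through 4 = proliferative DR, with 2 = moderate) and the function names are assumptions, and the Obuchowski-Rockette significance analysis is not covered here.

```python
def sensitivity_specificity(reference, predicted, threshold=2):
    """Sensitivity and specificity after binarizing 5-point grades.

    A grade >= threshold counts as positive. threshold=2 corresponds to
    'moderate or worse' under the assumed 0..4 coding. Raises
    ZeroDivisionError if the reference has no positives or no negatives.
    """
    tp = fn = tn = fp = 0
    for ref, pred in zip(reference, predicted):
        ref_pos, pred_pos = ref >= threshold, pred >= threshold
        if ref_pos and pred_pos:
            tp += 1
        elif ref_pos:
            fn += 1
        elif pred_pos:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def quadratic_weighted_kappa(a, b, n_grades=5):
    """Cohen's kappa with quadratic disagreement weights.

    kappa = 1 - sum(w * observed) / sum(w * expected), where
    w[i][j] = (i - j)^2 / (n_grades - 1)^2 and the expected counts come
    from the product of the two raters' marginal distributions.
    Undefined (division by zero) if all grades fall in a single category.
    """
    n = len(a)
    observed = [[0] * n_grades for _ in range(n_grades)]
    for i, j in zip(a, b):
        observed[i][j] += 1
    row = [sum(observed[i]) for i in range(n_grades)]
    col = [sum(observed[i][j] for i in range(n_grades)) for j in range(n_grades)]
    num = den = 0.0
    for i in range(n_grades):
        for j in range(n_grades):
            w = (i - j) ** 2 / (n_grades - 1) ** 2
            num += w * observed[i][j]
            den += w * row[i] * col[j] / n
    return 1.0 - num / den


# Toy grades for illustration only; these are not study data.
reference = [0, 1, 2, 3, 4, 2, 0, 1]
predicted = [0, 1, 2, 3, 4, 1, 0, 2]
sens, spec = sensitivity_specificity(reference, predicted)
kappa = quadratic_weighted_kappa(reference, predicted)
```

The abstract reports kappa as a percentage, so a value returned by this sketch would be multiplied by 100 to be comparable. Quadratic weights penalize a two-step disagreement four times as heavily as a one-step disagreement, which suits an ordinal severity scale.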

Conclusions : Manually annotated, lesion-based localization assistance can produce significant improvements in DR grading accuracy and grading time. Further research should determine whether real-world systems can be developed with sufficiently high localization accuracy to produce the performance benefits seen in this study.

This is a 2020 ARVO Annual Meeting abstract.

 

Illustration of the lesion localization assistant.
