June 2023
Volume 64, Issue 8
Open Access
ARVO Annual Meeting Abstract  |   June 2023
Monte Carlo dropout for increased deep learning repeatability and disease classification performance in retinopathy of prematurity
Author Affiliations & Notes
  • Aaron S Coyner
    Ophthalmology, Oregon Health & Science University, Portland, Oregon, United States
  • Andreanne Lemay
    Biomedical Engineering, Polytechnique Montreal Bibliotheque Louise-Lalonde-Lamarre, Montreal, Quebec, Canada
  • Katharina Hoebel
    Radiology, Massachusetts General Hospital, Boston, Massachusetts, United States
  • Praveer Singh
    University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States
  • Susan Ostmo
    Ophthalmology, Oregon Health & Science University, Portland, Oregon, United States
  • Michael F Chiang
    National Eye Institute, Bethesda, Maryland, United States
    National Library of Medicine, Bethesda, Maryland, United States
  • Jayashree Kalpathy-Cramer
    University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States
  • J. Peter Campbell
    Ophthalmology, Oregon Health & Science University, Portland, Oregon, United States
  • Footnotes
    Commercial Relationships   Aaron Coyner Boston AI Lab, Code R (Recipient); Andreanne Lemay None; Katharina Hoebel None; Praveer Singh None; Susan Ostmo None; Michael Chiang None; Jayashree Kalpathy-Cramer Genentech, Code F (Financial Support), Boston AI Lab, Code R (Recipient); J. Peter Campbell Boston AI Lab, Code C (Consultant/Contractor), Genentech, Code F (Financial Support), Siloam Vision, Code O (Owner), Boston AI Lab, Code R (Recipient)
  • Footnotes
    Support  NIH Grants R01 EY019474, R01 EY031331, R21 EY031883, and P30 EY010572, and unrestricted departmental funding and a Career Development Award (JPC) from Research to Prevent Blindness (New York, NY).
Investigative Ophthalmology & Visual Science June 2023, Vol.64, 5124. doi:

      Aaron S Coyner, Andreanne Lemay, Katharina Hoebel, Praveer Singh, Susan Ostmo, Michael F Chiang, Jayashree Kalpathy-Cramer, J. Peter Campbell; Monte Carlo dropout for increased deep learning repeatability and disease classification performance in retinopathy of prematurity. Invest. Ophthalmol. Vis. Sci. 2023;64(8):5124.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Purpose : Image-based medical artificial intelligence (AI) algorithms can be overly sensitive to changes in image features, which can negatively affect diagnostic performance and repeatability (test/retest with different images from the same patient). Model ensembling via Monte Carlo dropout (MCD) has been proposed as a way to reduce model variance, thereby increasing repeatability and possibly classification performance [1]. We evaluate the implementation of MCD for detection of plus disease in retinopathy of prematurity (ROP).

Methods : A dataset of color retinal fundus images collected by the Imaging and Informatics in ROP Consortium (28,426 images, 965 babies) was stratified by patient into training, validation, and test datasets (60%/15%/25%) and used to train a ResNet-18 deep learning (DL) model to classify images as normal, pre-plus disease, or plus disease. Spatial dropout layers (P=0.2) were added after every residual block and, unconventionally, kept active during inference, so that each forward pass (n=2) slightly altered the model structure and its predictions. The softmax outputs of the passes were averaged and converted into a continuous 1–9 vascular severity score (VSS), which was used to measure the area under the precision-recall curve (AUPR, normal/pre-plus versus plus). Two random images from each eye were used to measure repeatability via Bland-Altman limits of agreement (LoA) and the classification disagreement rate. Significance (p < 0.05) was assessed via t-tests.
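
The inference procedure described in the Methods can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. A single linear layer (`toy_logits`, hypothetical) stands in for the ResNet-18, and dropout is applied element-wise to a feature vector rather than channel-wise as spatial dropout would be. The abstract does not state how softmax outputs are converted to the VSS, so the weights 1, 5, and 9 for normal, pre-plus, and plus are an assumption.

```python
import math
import random

# ASSUMPTION: VSS is a probability-weighted sum with weights 1 (normal),
# 5 (pre-plus), and 9 (plus). The abstract does not give the conversion.
CLASS_WEIGHTS = (1.0, 5.0, 9.0)


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dropout(features, p, rng):
    """Inverted dropout: zero each feature with probability p, rescale the rest."""
    keep = 1.0 - p
    return [x / keep if rng.random() >= p else 0.0 for x in features]


def toy_logits(features, weights):
    """Hypothetical stand-in for the network: one linear layer, no bias."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def mc_dropout_predict(features, weights, p=0.2, n_passes=2, rng=None):
    """Average softmax outputs over n_passes forward passes with dropout active."""
    rng = rng or random.Random()
    sums = [0.0] * len(weights)
    for _ in range(n_passes):
        probs = softmax(toy_logits(dropout(features, p, rng), weights))
        sums = [s + q for s, q in zip(sums, probs)]
    return [s / n_passes for s in sums]


def vascular_severity_score(probs, class_weights=CLASS_WEIGHTS):
    """Map class probabilities to a continuous score between 1 and 9."""
    return sum(w * q for w, q in zip(class_weights, probs))


# Illustrative values only; these are not data from the study.
rng = random.Random(0)
features = [0.5, -1.2, 0.8, 2.0]
weights = [
    [0.2, -0.1, 0.4, 0.0],   # normal
    [0.1, 0.3, -0.2, 0.5],   # pre-plus
    [-0.3, 0.2, 0.1, 0.9],   # plus
]
probs = mc_dropout_predict(features, weights, p=0.2, n_passes=2, rng=rng)
vss = vascular_severity_score(probs)
```

Because dropout stays active, repeated calls on the same input generally return slightly different probabilities. Averaging over passes is what reduces that variance.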

Results : MCD significantly improved AUPR, LoA, and classification disagreement rates by 12.5%, 20.4%, and 21.7%, respectively. AUPR [95% confidence interval (CI)] increased to 0.800 [0.799, 0.801] from 0.711 [0.702, 0.719] (p < 0.001), LoA decreased to 2.63 [2.61, 2.66] from 3.31 [3.27, 3.34] (p < 0.001), and the disagreement rate decreased to 22.6% [22.0%, 23.1%] from 28.8% [28.2%, 29.5%] (p < 0.001).
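
The repeatability metrics and percentage improvements reported above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code. The abstract reports LoA as a single number without saying whether it is the half-width (1.96 × SD of the paired differences) or the full width, so `bland_altman_limits` simply returns both limits. The function names are hypothetical, and the percentage improvements are assumed to be relative changes.

```python
import statistics


def bland_altman_limits(first, second, z=1.96):
    """Return (lower, upper) 95% limits of agreement for paired scores."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(diffs)   # mean difference between paired scores
    sd = statistics.stdev(diffs)    # sample standard deviation of differences
    return bias - z * sd, bias + z * sd


def disagreement_rate(labels_first, labels_second):
    """Fraction of image pairs (same eye) assigned different classes."""
    pairs = list(zip(labels_first, labels_second))
    return sum(a != b for a, b in pairs) / len(pairs)


def relative_change(before, after):
    """Absolute change expressed as a fraction of the starting value."""
    return abs(after - before) / before


# Relative changes recomputed from the rounded values reported in the Results.
aupr_gain = relative_change(0.711, 0.800)
loa_gain = relative_change(3.31, 2.63)
disagreement_gain = relative_change(28.8, 22.6)
```

Recomputing from the rounded values gives roughly 12.5%, 20.5%, and 21.5%. These are close to the reported 12.5%, 20.4%, and 21.7%, and the small differences presumably arise because the reported percentages were computed from unrounded values.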

Conclusions : Without increasing model complexity or training time, MCD provided a significant increase in performance and repeatability. The technique is simple to implement yet effective, and a strong argument could be made for using it over non-MCD DL models. This has important implications for medical AI algorithms applied to diseases diagnosed from images, such as ROP, where poor repeatability could lead to diagnostic and therapeutic errors with potentially life-altering consequences.

This abstract was presented at the 2023 ARVO Annual Meeting, held in New Orleans, LA, April 23-27, 2023.
