June 2021
Volume 62, Issue 8
Open Access
ARVO Annual Meeting Abstract  |   June 2021
External validation of a deep learning algorithm for plus disease classification on a multinational ROP dataset
Author Affiliations & Notes
  • Praveer Singh
    Massachusetts General Hospital Department of Radiology, Boston, Massachusetts, United States
    Harvard Medical School, Boston, Massachusetts, United States
  • J. Peter Campbell
    Casey Eye Institute, Oregon Health & Science University, Portland, Oregon, United States
  • Susan Ostmo
    Oregon Health & Science University, Portland, Oregon, United States
  • James Brown
    University of Lincoln, Lincoln, Lincolnshire, United Kingdom
  • Szu-Yeu Hu
    Harvard Medical School, Boston, Massachusetts, United States
  • Nathaphop Chaichaya
    Khon Kaen University, Nai Mueang, Khon Kaen, Thailand
  • Phanthipha Wongwai
    Khon Kaen University, Nai Mueang, Khon Kaen, Thailand
  • Somkiat Asawaphureekorn
    Khon Kaen University, Nai Mueang, Khon Kaen, Thailand
  • Sirinya Suwannaraj
    Khon Kaen University, Nai Mueang, Khon Kaen, Thailand
  • Michael Morley
    Harvard Medical School, Boston, Massachusetts, United States
    OCB, Boston, Massachusetts, United States
  • Parag Shah
    Aravind Eye Hospital Coimbatore, Coimbatore, Tamil Nadu, India
  • Narendran Venkatapathy
    Aravind Eye Hospital Coimbatore, Coimbatore, Tamil Nadu, India
  • Robison Vernon Paul Chan
    Department of Ophthalmology, University of Illinois at Chicago College of Medicine, Chicago, Illinois, United States
  • Michael F Chiang
    Oregon Health & Science University, Portland, Oregon, United States
    OCB, Boston, Massachusetts, United States
  • Jayashree Kalpathy-Cramer
    Massachusetts General Hospital Department of Radiology, Boston, Massachusetts, United States
    Harvard Medical School, Boston, Massachusetts, United States
  • Footnotes
    Commercial Relationships   Praveer Singh, None; J. Peter Campbell, None; Susan Ostmo, None; James Brown, None; Szu-Yeu Hu, None; Nathaphop Chaichaya, None; Phanthipha Wongwai, None; Somkiat Asawaphureekorn, None; Sirinya Suwannaraj, None; Michael Morley, None; Parag Shah, None; Narendran Venkatapathy, None; Robison Chan, None; Michael Chiang, Novartis (C); Jayashree Kalpathy-Cramer, GE (F)
  • Footnotes
    Support  None
Investigative Ophthalmology & Visual Science June 2021, Vol.62, 3266. doi:

      Praveer Singh, J. Peter Campbell, Susan Ostmo, James Brown, Szu-Yeu Hu, Nathaphop Chaichaya, Phanthipha Wongwai, Somkiat Asawaphureekorn, Sirinya Suwannaraj, Michael Morley, Parag Shah, Narendran Venkatapathy, Robison Vernon Paul Chan, Michael F Chiang, Jayashree Kalpathy-Cramer; External validation of a deep learning algorithm for plus disease classification on a multinational ROP dataset. Invest. Ophthalmol. Vis. Sci. 2021;62(8):3266.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Purpose : Deep learning (DL) algorithms have been shown to perform well for classifying plus disease in retinopathy of prematurity (ROP). However, DL algorithms commonly perform worse on external datasets than on the datasets they were trained on. In this study, we demonstrate the efficacy of a DL algorithm, trained on a North American population, on two external multinational datasets.

Methods : RetCam images were obtained from India and Thailand through databases hosted by partner institutions, Aravind Eye Hospital (AEH) and Khon Kaen University (KKU), respectively. After low-quality images were filtered out, the Indian dataset consisted of 8811 images captured from 1275 eye exams, while the Thai dataset had 1299 images from 385 eye exams, all from at-risk infants. Both the Indian and Thai datasets were additionally labelled by 2-3 North American experts, and gold standards were obtained through mutual consensus among all raters for each dataset. The performance of the iROP-DL model, trained on RetCam images from a North American population, was evaluated on both external RetCam datasets after all non-posterior-pole (PP) images were screened out.
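
The evaluation step described above (score each image, optionally drop non-PP images, then compute an AUC against the reference labels) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy records, the function names `roc_auc` and `evaluate`, and the reduction of the task to a binary plus versus not-plus label. It is not the authors' pipeline and does not reproduce the reported results.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy records, NOT study data:
# (is_posterior_pole, reference_label, model_score)
Record = Tuple[bool, int, float]
records: List[Record] = [
    (True, 1, 0.9),
    (True, 0, 0.2),
    (True, 1, 0.7),
    (True, 0, 0.4),
    (False, 0, 0.8),  # e.g. an out-of-distribution view scored high
    (False, 1, 0.1),  # e.g. an out-of-distribution view scored low
]


def evaluate(rows: Sequence[Record], pp_only: bool) -> float:
    """Compute AUC on all rows, or only on posterior-pole rows."""
    kept = [r for r in rows if r[0] or not pp_only]
    return roc_auc([r[1] for r in kept], [r[2] for r in kept])


auc_all = evaluate(records, pp_only=False)
auc_pp = evaluate(records, pp_only=True)
print(f"AUC before PP-filtering: {auc_all:.3f}")
print(f"AUC after PP-filtering:  {auc_pp:.3f}")
```

In this toy example the two non-PP records are deliberately mis-scored, so removing them raises the AUC; the same comparison can be repeated with a different set of reference labels to see the effect of relabelling.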

Results : The two external datasets included many images that were out of distribution compared to the original training and testing iROP population (multiple views of the retina, anterior segment photos, samples with considerable pigmentation), and thus presented challenges for evaluation of the algorithm. The Table shows lower performance before PP-filtering (AUCs of 0.88 and 0.78 for the Indian and Thai datasets, respectively), which improved after PP-filtering (AUCs of 0.89 and 0.84) and improved further when consensus labels were used (AUCs of 0.97 and 0.95). As shown by the UMAP projections in the Figure, AEH (blue) and KKU (red), like iROP (yellow), have normal, pre-plus and plus feature points properly aligned in space (resulting in excellent performance), though they are segregated from iROP owing to demographic differences.

Conclusions : Applying DL algorithms to external datasets is prone to challenges due to demographic or phenotypic differences, or differences in acquisition methodology. After PP-filtering, we demonstrate excellent performance for the i-ROP DL system on the international datasets compared to the original test set. UMAP visualization further substantiates this finding and highlights segregation of the external datasets owing to remaining ethnic/phenotypic differences.

This is a 2021 ARVO Annual Meeting abstract.

 

 
