Open Access
ARVO Annual Meeting Abstract  |   June 2021
Evaluating the influence of data source and labelling variability on a Deep Learning (DL) based OCT B-scan abnormality classification
Author Affiliations & Notes
  • Ganesh Babu Tumkur Chandra
    Center of Application and Research in India, Carl Zeiss India (Bangalore) Pvt. Ltd., Bangalore, Karnataka, India
  • Sandipan Chakroborty
    Center of Application and Research in India, Carl Zeiss India (Bangalore) Pvt. Ltd., Bangalore, Karnataka, India
  • Krunalkumar Ramanbhai Patel
    Center of Application and Research in India, Carl Zeiss India (Bangalore) Pvt. Ltd., Bangalore, Karnataka, India
  • Niranchana Manivannan
    Carl Zeiss Meditec, Inc., Dublin, California, United States
  • Mary K Durbin
    Carl Zeiss Meditec, Inc., Dublin, California, United States
  • Alexander Freytag
    CRT, Carl Zeiss AG, Jena, Thuringia, Germany
  • Footnotes
    Commercial Relationships: Ganesh Babu Tumkur Chandra, Carl Zeiss India (Bangalore) Pvt. Ltd. (E); Sandipan Chakroborty, Carl Zeiss India (Bangalore) Pvt. Ltd. (E); Krunalkumar Ramanbhai Patel, Carl Zeiss India (Bangalore) Pvt. Ltd. (E); Niranchana Manivannan, Carl Zeiss Meditec, Inc. (E); Mary Durbin, Carl Zeiss Meditec, Inc. (E); Alexander Freytag, Carl Zeiss AG, Jena, Germany (E)
  • Footnotes
    Support: None
Investigative Ophthalmology & Visual Science June 2021, Vol. 62, 1781.
Ganesh Babu Tumkur Chandra, Sandipan Chakroborty, Krunalkumar Ramanbhai Patel, Niranchana Manivannan, Mary K Durbin, Alexander Freytag; Evaluating the influence of data source and labelling variability on a Deep Learning (DL) based OCT B-scan abnormality classification. Invest. Ophthalmol. Vis. Sci. 2021;62(8):1781.

Abstract

Purpose : DL techniques can be used to detect abnormalities in OCT B-scans. The performance of such an algorithm depends on the quality of both the data and its labels. The data may come either from a clinical study or from busy eye clinics. While image quality, disease prevalence, subject age, etc. can be well controlled within the scope of a clinical study, the same may not hold for data collected from eye clinics. Involving multiple labelers may also introduce inconsistencies in the labels, because each expert applies their own clinical judgment, which differs depending on how and where (primary, secondary, or tertiary clinics) they practice. In this abstract, we discuss the effects of these aspects on classification performance.

Methods : To assess the effect of data source, we gathered macular OCT cube data both during clinical studies and from eye clinics. To measure the effect of labelling variability, the data were then labelled at the B-scan level by five labelers. Two of them (labelers X & Y in Fig. 2) had similar expertise and practiced in the same hospital, while the rest (labelers A, B & C) came from three different eye clinics. Each labeler's data were split into training and test sets. An Inception_V1 model was trained on each labeler's training set, and its performance was evaluated on all test sets, as sketched below. Figs. 1(a) and 2(a) show the number of samples used for training and evaluation.
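
The evaluation protocol amounts to a small cross-evaluation loop: train one model per labeler, then score each model on every labeler's test set. A minimal sketch follows, assuming PyTorch/torchvision and pre-built per-labeler DataLoaders; the loader construction, hyperparameters, and preprocessing are illustrative assumptions, not details from the abstract. Inception_V1 corresponds to torchvision's GoogLeNet.

```python
# Minimal sketch of the per-labeler train/cross-evaluate protocol.
# DataLoaders, epochs, and learning rate are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision import models

LABELERS = ["A", "B", "C", "X", "Y"]  # the five labelers referenced in Fig. 2

def make_model() -> nn.Module:
    # Inception_V1 == GoogLeNet in torchvision.
    model = models.googlenet(weights=None, aux_logits=False, init_weights=True)
    # Binary head: normal vs. abnormal B-scan. Grayscale B-scans are assumed
    # to be replicated to 3 channels by the data loader.
    model.fc = nn.Linear(model.fc.in_features, 2)
    return model

def train(model, loader, epochs=10, lr=1e-4):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model

@torch.no_grad()
def accuracy(model, loader) -> float:
    model.eval()
    correct = total = 0
    for x, y in loader:
        correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / total

def cross_evaluate(train_loaders, test_loaders):
    # Train once per labeler, then evaluate that model on every labeler's
    # test set, yielding the 5x5 performance matrix behind Fig. 2(b).
    results = {}
    for src in LABELERS:
        model = train(make_model(), train_loaders[src])
        results[src] = {dst: accuracy(model, test_loaders[dst]) for dst in LABELERS}
    return results
```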

Results : Fig 1(b) shows that evaluating a model on data from clinical trials does not always indicate good generalizability and may over-estimate model accuracy. Training on ‘uncontrolled’ data sources leads to overall improved performance in a typical clinical setting, even if such a model underperforms in the clinical-trial setting.
From Fig 2(b), we observe that models perform well on data labelled by experts with a similar background, whereas accuracy for abnormality prediction differs strongly across labelers from different backgrounds, with a significant drop in AUC.
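
For reference, the per-pair AUC values underlying such a comparison could be computed as below. The `scores` and `labels` structures are hypothetical containers for the collected model outputs and ground truth; scikit-learn's roc_auc_score supplies the metric.

```python
# Sketch of the cross-labeler AUC matrix; data structures are hypothetical.
from sklearn.metrics import roc_auc_score

def auc_matrix(scores, labels):
    # scores[src][dst]: abnormality probabilities of the model trained on
    # labeler `src`, evaluated on labeler `dst`'s test set.
    # labels[dst]: ground truth for that test set (0 = normal, 1 = abnormal).
    return {
        src: {dst: roc_auc_score(labels[dst], scores[src][dst])
              for dst in scores[src]}
        for src in scores
    }
```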

Conclusions : We conclude that prediction models do not transfer easily across labelers from different backgrounds. Furthermore, the accuracy of a model developed in a clinical trial setting may not carry over to busy clinical environments.

This is a 2021 ARVO Annual Meeting abstract.

 

Fig 1. Controlledness of datasets affecting abnormality prediction

Fig 2. Labeler background affecting model transferability
