June 2023
Volume 64, Issue 8
Open Access
ARVO Annual Meeting Abstract | June 2023
Ground truth validation of publicly available datasets utilized in artificial intelligence models for glaucoma detection
Author Affiliations & Notes
  • Ehsan Amjadian
    Computer Science, University of Waterloo, Waterloo, Ontario, Canada
    Technology & Operations, Royal Bank of Canada, Toronto, Ontario, Canada
  • Mahsa Raeisi Ardali
    College of Optometry, Nova Southeastern University Health Professions Division, Fort Lauderdale, Florida, United States
  • Riley Kiefer
    Computer Science, Florida Polytechnic University, Lakeland, Florida, United States
  • Muhammad Abid
    Computer Science, Florida Polytechnic University, Lakeland, Florida, United States
  • Jessica Steen
    College of Optometry, Nova Southeastern University Health Professions Division, Fort Lauderdale, Florida, United States
  • Footnotes
    Commercial Relationships   Ehsan Amjadian None; Mahsa Raeisi Ardali None; Riley Kiefer None; Muhammad Abid None; Jessica Steen None
  • Footnotes
    Support  None
Investigative Ophthalmology & Visual Science June 2023, Vol.64, 392. doi:

      Ehsan Amjadian, Mahsa Raeisi Ardali, Riley Kiefer, Muhammad Abid, Jessica Steen; Ground truth validation of publicly available datasets utilized in artificial intelligence models for glaucoma detection. Invest. Ophthalmol. Vis. Sci. 2023;64(8):392.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Purpose : Publicly available datasets used to train artificial intelligence (AI) models for the detection of glaucoma use various, often unspecified, methods to determine the ground truth of the presence or absence of glaucoma in fundus images. Accurate determination of ground truth is essential for training valid AI models for glaucoma detection. The purpose of this study is to validate the ground truth labels for the presence or absence of glaucoma in fundus images from 20 publicly available glaucoma datasets.

Methods : From each of 20 datasets, two images labeled ‘glaucoma’ and two labeled ‘no glaucoma’ were randomly sampled; 3 of the 20 datasets provided only a single label, for a total of 74 validation instances. All available metadata were removed, and graders were masked to the labeled reference standard. Graders independently evaluated each image for vertical cup-to-disc ratio (VCDR), presence of peripapillary atrophy, presence of retinal nerve fiber layer defect, presence of optic disc hemorrhage, integrity of the neuroretinal rim (presence of notching), and adherence to the ISNT rule. Based on evaluation of all features, the presence or absence of glaucoma was determined. Where graders disagreed, they discussed each feature and the final diagnosis. Agreement between graders, and agreement of graders with the labeled ground truth for each image, were determined by percent agreement and Cohen’s kappa coefficient.
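
The two agreement statistics named in the Methods can be illustrated with a short sketch. It assumes two graders and binary labels (1 for 'glaucoma', 0 for 'no glaucoma'). The function names and the toy label lists are hypothetical and are not taken from the study; they do not reproduce the reported results.

```python
from collections import Counter


def percent_agreement(labels_a, labels_b):
    """Fraction of images on which the two label sequences match."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two raters."""
    n = len(labels_a)
    p_o = percent_agreement(labels_a, labels_b)  # observed agreement
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    # Chance agreement: sum over classes of the product of each rater's
    # marginal proportion for that class.
    classes = set(labels_a) | set(labels_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in classes) / (n * n)
    if p_e == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is 1
    return (p_o - p_e) / (1 - p_e)


# Toy labels, invented for illustration only (not data from the abstract).
grader_1 = [1, 1, 0, 0, 1, 0, 1, 0]
grader_2 = [1, 0, 0, 0, 1, 0, 1, 1]

toy_agreement = percent_agreement(grader_1, grader_2)
toy_kappa = cohens_kappa(grader_1, grader_2)
print(f"percent agreement: {toy_agreement:.2%}, kappa: {toy_kappa:.2f}")
```

The same two functions would apply to comparing a grader's labels with a dataset's labeled reference standard by passing the dataset labels as the second sequence.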

Results : Percent agreement and Cohen’s kappa between graders on the diagnosis of glaucoma based on fundus images were 79.05% and 0.52, improving to 97.72% and 0.95, respectively, following discussion. Mean agreement of graders with the labeled reference standard and the corresponding kappa coefficient were 75.33% and 0.52, improving to 77.02% and 0.54 post-discussion. Following discussion, 8 datasets had 100% agreement and 5 datasets had 50% agreement with both graders.
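
The per-dataset figures could be tabulated along the following lines. This is a minimal sketch that assumes per-dataset agreement means the fraction of a dataset's sampled images on which the graders' post-discussion label matches the dataset label; the abstract does not spell out the calculation. The function name, dataset names, and records are hypothetical.

```python
from collections import defaultdict


def per_dataset_agreement(records):
    """Map each dataset to the fraction of its images where labels match.

    records: iterable of (dataset_name, dataset_label, grader_label) tuples.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for dataset, reference_label, grader_label in records:
        totals[dataset] += 1
        hits[dataset] += int(reference_label == grader_label)
    return {name: hits[name] / totals[name] for name in totals}


# Invented records for illustration only: 1 = 'glaucoma', 0 = 'no glaucoma'.
# dataset_C mimics a dataset that provides only a single label.
toy_records = [
    ("dataset_A", 1, 1), ("dataset_A", 1, 1),
    ("dataset_A", 0, 0), ("dataset_A", 0, 0),
    ("dataset_B", 1, 0), ("dataset_B", 1, 1),
    ("dataset_B", 0, 0), ("dataset_B", 0, 1),
    ("dataset_C", 1, 1), ("dataset_C", 1, 0),
]

agreement = per_dataset_agreement(toy_records)
for name, value in sorted(agreement.items()):
    print(f"{name}: {value:.0%}")
```

With only two to four images sampled per dataset, each dataset's agreement can take only a few discrete values, which is consistent with the 100% and 50% figures reported above.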

Conclusions : Agreement between expert clinical graders on the presence or absence of glaucoma, based on six pre-specified clinical features, was very high, whereas agreement with the established ground truth of publicly available datasets varied greatly between datasets. Consistent, established, and clearly described protocols for the evaluation and labeling of fundus images in publicly available datasets used in model development for the detection of glaucoma are necessary prior to model training and potential clinical deployment.

This abstract was presented at the 2023 ARVO Annual Meeting, held in New Orleans, LA, April 23-27, 2023.

 

 
