ARVO Annual Meeting Abstract | June 2021
Volume 62, Issue 8
Open Access
Crowdsourcing Can Match Field Grading Validity for Follicular Trachoma
Author Affiliations & Notes
  • Christopher J Brady
    Division of Ophthalmology, University of Vermont College of Medicine, Burlington, Vermont, United States
    Vermont Center on Behavior and Health, University of Vermont College of Medicine, Burlington, Vermont, United States
  • Fahd Naufal
    Dana Center for Preventive Ophthalmology, Johns Hopkins Medicine Wilmer Eye Institute, Baltimore, Maryland, United States
  • Meraf A Wolle
    Dana Center for Preventive Ophthalmology, Johns Hopkins Medicine Wilmer Eye Institute, Baltimore, Maryland, United States
  • Harran Mkocha
Kongwa Trachoma Project, United Republic of Tanzania
  • Sheila K West
    Dana Center for Preventive Ophthalmology, Johns Hopkins Medicine Wilmer Eye Institute, Baltimore, Maryland, United States
  • Footnotes
    Commercial Relationships: Christopher Brady, None; Fahd Naufal, None; Meraf Wolle, None; Harran Mkocha, None; Sheila West, None
    Support: NIH Grant P20GM103644 & Seeing is Believing Innovation Grant
Investigative Ophthalmology & Visual Science June 2021, Vol. 62, 1788. doi:
Abstract

Purpose: As trachoma is eliminated, field graders lose exposure to the disease and become less adept at identifying follicular trachoma (TF). New approaches to completing field surveys, including photography and telemedicine, may be needed to ensure elimination and accurately monitor for re-emergence. Expert grading of images is costly and time-intensive. Our purpose was to validate crowdsourcing for follicular trachoma image interpretation.

Methods: Tarsal plate images acquired using a smartphone-based device during a 2019 field survey in Tanzania (n=1000) were posted to the Amazon Mechanical Turk (AMT) crowdsourcing marketplace for grading as "not-TF," "possible TF," "probable TF," or "definite TF." Each image was graded by 7 unique graders, who received $0.05 USD per image. The 7 grades (0-3 each) were summed to create a raw score (0-21), which was analyzed by receiver operating characteristic (ROC) curve using images with concordant field and expert photo grades to determine the optimal diagnostic set-point. Kappa, sensitivity, and specificity were then assessed at several disease prevalence levels.
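For illustration, a minimal sketch of this raw-score ROC analysis in Python (data are simulated; the array names, the 0-3 grade coding per rater, and the rater behavior model are assumptions for the sketch, not the study's actual pipeline):

    # Sketch: sum 7 crowd grades (0-3 each) into a 0-21 raw score and
    # pick the ROC-optimal set-point. All data here are simulated.
    import numpy as np
    from sklearn.metrics import roc_curve, roc_auc_score

    rng = np.random.default_rng(0)
    n = 1000
    y_true = rng.random(n) < 0.057            # ~5.7% TF, as in the full sample
    # Hypothetical rater model: TF images tend to draw higher grades.
    p_tf, p_norm = [0.05, 0.15, 0.30, 0.50], [0.70, 0.20, 0.08, 0.02]
    grades = np.stack([rng.choice(4, size=7, p=p_tf if tf else p_norm)
                       for tf in y_true])

    raw_score = grades.sum(axis=1)            # 7 graders x 0-3 each -> 0-21
    auc = roc_auc_score(y_true, raw_score)
    fpr, tpr, thresholds = roc_curve(y_true, raw_score)
    best = np.argmax(tpr - fpr)               # Youden's J picks the set-point
    print(f"AUC = {auc:.3f}; optimal raw-score cutoff ~ {thresholds[best]}")

On real data, grades would hold the seven AMT ratings per image and y_true the concordant field/expert reference grades.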

Results: All 7000 grades were rendered in 1 hour for $420 USD. The raw score produced an area under the ROC curve of 0.940 (95% CI 0.902-0.977). Optimizing the set-point to a raw score of 7 produced a kappa of 0.43, sensitivity of 84.8%, specificity of 90%, and percent agreement with the master/field grade of 89.3% in the full sample, which had a TF prevalence of 5.7%. When normal images were randomly removed from the sample to mimic the prevalence levels used to validate field graders (30% & 75% TF), kappa ranged from 0.71 to 0.74, within the acceptable range per the World Health Organization. Images with discordant field and expert grades were more likely to receive a raw score in the middle of the range, suggesting disagreement among crowdsourcers as well.
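A minimal sketch of the prevalence-matching step under the same caveats (the assumption here is that normal images are randomly dropped until TF prevalence reaches the target, after which Cohen's kappa is computed for the crowd call at the raw-score cutoff of 7; data are simulated so the snippet runs standalone):

    # Sketch: subsample normal images to a target TF prevalence and
    # compute Cohen's kappa against the reference grade.
    import numpy as np
    from sklearn.metrics import cohen_kappa_score

    rng = np.random.default_rng(1)
    y_true = rng.random(1000) < 0.057                      # ~5.7% TF
    raw_score = np.where(y_true, rng.normal(12, 4, 1000),  # crude score model
                         rng.normal(3, 3, 1000)).clip(0, 21)

    def kappa_at_prevalence(y_true, raw_score, target_prev, cutoff=7, seed=0):
        sub = np.random.default_rng(seed)
        pos, neg = np.flatnonzero(y_true), np.flatnonzero(~y_true)
        # Keep every TF image; drop normals until prevalence hits the target.
        n_neg = int(round(len(pos) * (1 - target_prev) / target_prev))
        keep = np.concatenate([pos, sub.choice(neg, size=n_neg, replace=False)])
        return cohen_kappa_score(y_true[keep], raw_score[keep] >= cutoff)

    for prev in (0.30, 0.75):
        print(f"prevalence {prev:.0%}: kappa = {kappa_at_prevalence(y_true, raw_score, prev):.2f}")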

Conclusions: Crowdsourcing was able to rapidly and accurately identify TF on smartphone-acquired photographs with minimal grader training. Agreement with the reference standard was poor in a sample with low TF prevalence, but when held to the same standard as skilled field graders under the current training paradigm, crowdsourcing may be acceptable. Further testing against field grading in low-prevalence areas is needed.

This is a 2021 ARVO Annual Meeting abstract.

 

Receiver operating characteristic curve for raw crowdsourcing score for images with concordant field and expert photograph grade.

Distribution of field and expert grade within each crowdsourced raw score.
