ARVO Annual Meeting Abstract  |  June 2023
Volume 64, Issue 8
Open Access
Evaluating the performance of ChatGPT in answering ophthalmology case challenges
Author Affiliations & Notes
  • Neel Edupuganti
    Augusta University, Augusta, Georgia, United States
  • Tommy Bui
    Augusta University, Augusta, Georgia, United States
  • Parth A Patel
    Augusta University, Augusta, Georgia, United States
  • Veeral Sheth
    University Retina and Macula Associates PC, Oak Forest, Illinois, United States
    Department of Ophthalmology, University of Illinois Chicago, Chicago, Illinois, United States
  • Footnotes
    Commercial Relationships   Neel Edupuganti None; Tommy Bui None; Parth Patel None; Veeral Sheth None
  • Footnotes
    Support  None
Investigative Ophthalmology & Visual Science June 2023, Vol. 64, 4991.

      Neel Edupuganti, Tommy Bui, Parth A Patel, Veeral Sheth; Evaluating the performance of ChatGPT in answering ophthalmology case challenges. Invest. Ophthalmol. Vis. Sci. 2023;64(8):4991.

Abstract

Purpose :
Recently, applications of artificial intelligence (AI) in medicine have proliferated, with advances in deployment, image interpretation, and clinical decision support. Launched on November 30, 2022, ChatGPT is an AI chatbot created by OpenAI that responds to users’ questions with human-like, algorithmically generated responses. Here, we evaluated ChatGPT’s performance in answering American Academy of Ophthalmology (AAO) "Diagnose This Case" challenges.

Methods :
Ophthalmic clinical material was collated from the 2022 AAO "Diagnose This Case" challenges. Cases were categorized by subspecialty, the relevant segment of the eye (anterior or posterior), and difficulty, which was approximated using the percentage of prior respondents who answered correctly. Each case’s question and answer choices were then entered into ChatGPT. Because ChatGPT cannot interpret images, a description of any relevant visual findings was provided where necessary; descriptions were derived from the image interpretations in the associated answer explanations. Outputs were recorded and compared against the AAO answers and reasoning, and ChatGPT’s accuracy was compared across categories. Significance (p < 0.05) was assessed using Fisher’s exact test, the chi-squared test, and Spearman’s correlation coefficient.
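
A minimal sketch of the statistical comparisons described above, in Python with SciPy; the per-case values and difficulty-tier counts here are synthetic placeholders for illustration, not the study’s data:

from scipy import stats

# Synthetic per-case log (illustrative only): difficulty proxy is the
# percentage of prior respondents who answered correctly;
# 1 = ChatGPT matched the AAO answer, 0 = it did not.
pct_respondents_correct = [82, 74, 65, 58, 51, 44, 37, 30]
chatgpt_correct = [1, 1, 1, 0, 1, 0, 0, 0]

# Spearman's correlation between respondent accuracy and ChatGPT accuracy
rho, p_rho = stats.spearmanr(pct_respondents_correct, chatgpt_correct)
print(f"Spearman r = {rho:.2f}, p = {p_rho:.3f}")

# Chi-squared test of ChatGPT accuracy across difficulty tiers
# (rows = easy/medium/hard; columns = [correct, incorrect]; counts made up)
tiers = [[10, 3],
         [9, 8],
         [5, 16]]
chi2, p_chi2, dof, _ = stats.chi2_contingency(tiers)
print(f"chi2 = {chi2:.2f}, p = {p_chi2:.3f}, dof = {dof}")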

Results :
ChatGPT was presented with 51 clinical case challenges between December 22 and December 26, 2022. Overall, 56% of the outputs were correct, with better performance on lower-difficulty questions (χ² = 6.42, p = 0.04). There was a positive correlation between the reported percentage of respondents who chose the correct answer and ChatGPT’s accuracy (r = 0.41, p = 0.004). Performance was higher on posterior segment cases (63%; n = 24) than on anterior segment cases (48%; n = 27); however, this difference was not statistically significant (p = 0.40).
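
The segment comparison can be checked with Fisher’s exact test; the 2×2 counts below are reconstructed from the reported percentages (63% of 24 ≈ 15 correct posterior cases; 48% of 27 ≈ 13 correct anterior cases), an assumption rather than the study’s raw tally:

from scipy import stats

# 2x2 table reconstructed from the reported percentages (approximate):
# rows = posterior/anterior segment, columns = [correct, incorrect]
table = [[15, 9],    # posterior: 15/24 ≈ 63% correct
         [13, 14]]   # anterior:  13/27 ≈ 48% correct
odds_ratio, p = stats.fisher_exact(table, alternative="two-sided")
print(f"OR = {odds_ratio:.2f}, p = {p:.2f}")  # should be consistent with the reported p = 0.40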

Conclusions :
Our study demonstrated a tendency for ChatGPT to select the answers that human respondents also commonly selected in the ophthalmology case challenges, a trend consistent with the chatbot’s design goal of generating human-like responses. The higher performance on posterior segment cases may reflect the early focus of ophthalmic AI research on posterior segment disease, although this finding is limited by the sample size. Further research with expanded datasets could provide insight into performance across subspecialties.

This abstract was presented at the 2023 ARVO Annual Meeting, held in New Orleans, LA, April 23-27, 2023.
