Investigative Ophthalmology & Visual Science
June 2024, Volume 65, Issue 7
Open Access
ARVO Annual Meeting Abstract  |   June 2024
Diagnostic Performance of ChatGPT-4 in Patients with Retinal Pathologies
Author Affiliations & Notes
  • Mostafa Mafi
    UCLA Stein Eye Institute, University of California Los Angeles, Los Angeles, California, United States
  • Fateme Montazeri
    Department of Ophthalmology & Vision Science, University of California Davis, Sacramento, California, United States
  • Mohammad Mehdi Johari Moghadam
    Department of Ophthalmology & Vision Science, University of California Davis, Sacramento, California, United States
  • Masoud Mirghorbani
Farabi Eye Hospital, Tehran, Iran
  • Pasha Anvari
Eye Research Center, Iran University of Medical Sciences, Tehran, Iran
  • Khalil Ghasemi Falavarjani
Eye Research Center, Iran University of Medical Sciences, Tehran, Iran
  • Mohammad Delsoz Mahoney
Hamilton Eye Institute, Department of Ophthalmology, The University of Tennessee Health Science Center, Memphis, Tennessee, United States
  • Footnotes
    Commercial Relationships  Mostafa Mafi None; Fateme Montazeri None; Mohammad Mehdi Johari Moghadam None; Masoud Mirghorbani None; Pasha Anvari None; Khalil Ghasemi Falavarjani None; Mohammad Delsoz Mahoney None
    Support  None
Investigative Ophthalmology & Visual Science June 2024, Vol.65, 5671.
Mostafa Mafi, Fateme Montazeri, Mohammad Mehdi Johari Moghadam, Masoud Mirghorbani, Pasha Anvari, Khalil Ghasemi Falavarjani, Mohammad Delsoz Mahoney; Diagnostic Performance of ChatGPT-4 in Patients with Retinal Pathologies. Invest. Ophthalmol. Vis. Sci. 2024;65(7):5671.

© ARVO (1962-2015); The Authors (2016-present)
Abstract

Purpose : With the emergence of advanced Large Language Models, such as ChatGPT, there is a growing trend in leveraging artificial intelligence for improved clinical support. This study aims to evaluate ChatGPT-4’s diagnostic performance in analyzing retinal pathologies.

Methods : Twenty-four clinical cases covering diverse retinal conditions were sourced from a publicly available online database and input into ChatGPT-4. Each case's text was presented to the model three times, soliciting suggestions for both differential and final diagnoses. The process was then repeated with clinical images included alongside the text. Model accuracy was evaluated by comparing diagnoses and differentials with those provided by four ophthalmologists – two relying solely on textual information and two using both text and images. Statistical comparisons were performed using chi-square and Fisher's exact tests.

Results : ChatGPT's diagnostic accuracy was 54.1% (13 of 24 correct) in text-only scenarios, improving to 58.3% (14 of 24) when clinical images were included. Ophthalmologists demonstrated higher accuracy, ranging from 79.1% to 83.3%. For the top three differential diagnoses, ChatGPT reached 66.7% accuracy (16 of 24) with text-only input and 70.8% (17 of 24) when relevant images were included; specialists identified differential diagnoses with accuracy rates between 91.6% and 95.8%. In the final diagnosis, ChatGPT's performance was comparable to that of one specialist with text-only input (p=0.06) and to that of two other specialists when images were included (p=0.05 and 0.11). While ChatGPT generally demonstrated better results in differential diagnosis than in final diagnosis, its overall performance was notably lower than that of all specialists (p=0.02, p=0.03).
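The pairwise accuracy comparisons reported above can be sketched as a two-sided Fisher's exact test on a 2×2 table of correct versus incorrect diagnoses over the 24 cases. A minimal stdlib-only Python sketch follows; the specialist count (19 of 24 ≈ 79.1% correct) is inferred from the reported percentage and is illustrative only, since the abstract gives counts only for ChatGPT.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def p_table(x):
        # P(X = x) for X ~ Hypergeometric(N=n, K=col1, draws=row1)
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = p_table(a)
    lo, hi = max(0, col1 - row2), min(col1, row1)
    # Small tolerance guards against floating-point ties.
    return sum(p_table(x) for x in range(lo, hi + 1)
               if p_table(x) <= p_obs * (1 + 1e-9))


# ChatGPT-4, text-only final diagnosis: 13 of 24 correct;
# one specialist: 19 of 24 correct (assumed from the 79.1% figure).
p = fisher_exact_two_sided(13, 11, 19, 5)
print(f"p = {p:.3f}")
```

With these assumed counts the difference does not reach significance at the 0.05 level, consistent with the abstract's report that ChatGPT's final-diagnosis performance was comparable to that of some specialists.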

Conclusions : ChatGPT demonstrated improved diagnostic performance in the analysis of retinal disease when clinical images were included. While its performance was lower than that of the specialists, the discrepancy narrowed in the final diagnoses. These results underscore the potential of ChatGPT as a supplementary tool in both medical education and practice, augmenting the clinical decision-making process.

This abstract was presented at the 2024 ARVO Annual Meeting, held in Seattle, WA, May 5-9, 2024.
