Investigative Ophthalmology & Visual Science
June 2024
Volume 65, Issue 7
Open Access
ARVO Annual Meeting Abstract  |   June 2024
Accuracy and readability of after visit summaries for retinal conditions generated by a large language model
Author Affiliations & Notes
  • Nikhil Bommakanti
    Mid Atlantic Retina, Wills Eye Hospital, Philadelphia, Pennsylvania, United States
  • Fatima Rizvi
    Mid Atlantic Retina, Wills Eye Hospital, Philadelphia, Pennsylvania, United States
  • Anza Rizvi
    Mid Atlantic Retina, Wills Eye Hospital, Philadelphia, Pennsylvania, United States
  • Hana A. Mansour
    Mid Atlantic Retina, Wills Eye Hospital, Philadelphia, Pennsylvania, United States
  • Bita Momenaei
    Mid Atlantic Retina, Wills Eye Hospital, Philadelphia, Pennsylvania, United States
  • Jordan Safran
    Mid Atlantic Retina, Wills Eye Hospital, Philadelphia, Pennsylvania, United States
  • Michael Yu
    Mid Atlantic Retina, Wills Eye Hospital, Philadelphia, Pennsylvania, United States
  • Anthony Obeid
    Mid Atlantic Retina, Wills Eye Hospital, Philadelphia, Pennsylvania, United States
  • Yoshihiro Yonekawa
    Mid Atlantic Retina, Wills Eye Hospital, Philadelphia, Pennsylvania, United States
  • Footnotes
    Commercial Relationships   Nikhil Bommakanti None; Fatima Rizvi None; Anza Rizvi None; Hana Mansour None; Bita Momenaei None; Jordan Safran None; Michael Yu None; Anthony Obeid None; Yoshihiro Yonekawa None
  • Footnotes
    Support  None
Investigative Ophthalmology & Visual Science June 2024, Vol.65, 822. doi:
      Nikhil Bommakanti, Fatima Rizvi, Anza Rizvi, Hana A. Mansour, Bita Momenaei, Jordan Safran, Michael Yu, Anthony Obeid, Yoshihiro Yonekawa; Accuracy and readability of after visit summaries for retinal conditions generated by a large language model. Invest. Ophthalmol. Vis. Sci. 2024;65(7):822.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Purpose : After-visit summaries (AVS) can strengthen doctor-patient communication and improve health outcomes. AVS may not routinely be provided at every clinical visit, potentially because of the extra time required to create additional documentation. Recent legislation mandates patient access to clinical notes; however, these notes may be written at a more sophisticated reading level or contain abbreviations or jargon, all of which limit patient understanding.

ChatGPT, an artificial intelligence (AI) model, can appropriately answer questions concerning retinal disease. The purpose of this study is to determine whether ChatGPT-4, the most recent version of the model, can generate accurate, readable AVS for common retinal conditions.

Methods : One to three conditions in each of the following disease categories were selected by an experienced, fellowship-trained retina specialist, with the intention of capturing a set of conditions that an adult retina specialist may routinely encounter: “vascular,” “macular,” “peripheral,” “inflammatory or infectious,” “neoplastic,” “toxic,” “surgical.”

Two clinical notes written between 2020 and 2023 were randomly obtained for each condition. ChatGPT-4 was used to generate an AVS from each note, and the AVS were graded by three practicing ophthalmologists, with senior author adjudication, on whether they accurately described (1) the diagnosis, (2) the clinical visit, and (3) the follow-up plan. Inaccurate responses were further categorized as “Incorrect,” “Omission,” or “Hallucination.” The reading level of the responses was assessed using the Flesch-Kincaid readability test.
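The abstract does not say which tool or variant of the Flesch-Kincaid test was used. Because the results are reported as years of education, the sketch below assumes the standard Flesch-Kincaid Grade Level formula, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The function names and the vowel-group syllable counter are illustrative assumptions, not the authors' method; dedicated readability tools count syllables more carefully and may give slightly different scores.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (an assumption, not a dictionary lookup):
    count runs of vowels and discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    # Sentences are split on runs of ., ! or ? (a simplification).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters, optionally with one internal apostrophe.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59
```

A score near 8 corresponds roughly to an eighth-grade reading level. Very short, simple text can score below zero, which is a known property of the formula rather than an error.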

Results : 38 AVS were generated from clinical notes describing 19 retinal conditions and written by 12 physicians. The mean (standard deviation) word count of the notes was 242 (53) (range: 119 to 361). Descriptions of the diagnosis, clinical visit, and follow-up plan were accurate for 30 (79%), 20 (53%), and 26 (68%) of the AVS, respectively. Incorrect information was the most common type of inaccuracy (5 [13%], 12 [32%], and 7 [18%], respectively), whereas hallucination was noted in 6 (16%) notes. Accuracy did not differ as a function of note word count or author. Flesch-Kincaid scores indicated that patients would require between 3.4 and 12.5 years of education to understand the responses.
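As a quick arithmetic check, the reported percentages can be reproduced from the counts, assuming each is computed against all 38 AVS and rounded to the nearest whole percent. The variable and function names below are illustrative only.

```python
# Counts taken from the Results paragraph; the denominator is assumed
# to be the full set of 38 AVS (19 conditions x 2 notes each).
CONDITIONS = 19
NOTES_PER_CONDITION = 2
TOTAL_AVS = CONDITIONS * NOTES_PER_CONDITION

accurate = {"diagnosis": 30, "clinical visit": 20, "follow-up plan": 26}
incorrect = {"diagnosis": 5, "clinical visit": 12, "follow-up plan": 7}
hallucination = 6


def pct(count: int, total: int = TOTAL_AVS) -> int:
    """Percentage of the total, rounded to the nearest whole percent."""
    return round(100 * count / total)


accurate_pct = {k: pct(v) for k, v in accurate.items()}
incorrect_pct = {k: pct(v) for k, v in incorrect.items()}
hallucination_pct = pct(hallucination)
```

Under that assumption the computed values match the reported figures: 79%, 53%, and 68% accurate; 13%, 32%, and 18% incorrect; and 16% with hallucination.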

Conclusions : AI could be used to create clinical summaries, which may improve doctor-patient communication and health outcomes. Further work is necessary to ensure output is readable and completely accurate.

This abstract was presented at the 2024 ARVO Annual Meeting, held in Seattle, WA, May 5-9, 2024.

 
