Abstract
Purpose:
ChatGPT is a large-scale language model trained on diverse datasets to learn, analyze, and generate human-like answers to users' questions. As artificial intelligence (AI) advances and its use expands into medical education, assessing the validity of its output becomes increasingly important. Before it can be applied in medical education, more information is needed on whether its analyses produce accurate and coherent responses. This study aims to evaluate the performance and accuracy of ChatGPT on practice questions used to prepare for the Ophthalmic Knowledge Assessment Program (OKAP).
Methods:
Ophthalmology questions were obtained from a widely used study resource, Ophthoquestions. Thirteen sections, each covering a different ophthalmic subtopic, were sampled, and 10 questions were collected from each section. Questions containing images or tables were excluded, leaving 98 of the original 130 questions, which were input with their respective answer choices into ChatGPT-3.5. ChatGPT responses were evaluated for the properties of natural coherence (Table 1). Incorrect responses were categorized as logical, informational, or explicit fallacies (Table 2). ChatGPT accuracy was analyzed in Microsoft Excel, with chi-square tests used to determine the statistical significance of categorical variables.
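The chi-square analysis described above can be sketched as follows. This is a minimal standard-library Python illustration of the 2x2 test, not the authors' Excel workflow; the cell counts are hypothetical placeholders, since the abstract reports only percentages and p-values.

```python
# Sketch of a 2x2 chi-square test like the one described in Methods.
# Cell counts are HYPOTHETICAL; the abstract does not report them.

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 contingency table
    [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Rows: coherence property present / absent
# Columns: answer correct / incorrect (hypothetical counts, n = 98)
stat = chi_square_2x2(48, 33, 4, 13)

# With 1 degree of freedom, chi2 > 3.841 corresponds to p < 0.05
print(f"chi2 = {stat:.2f}, significant = {stat > 3.841}")
```

A library implementation such as `scipy.stats.chi2_contingency` would additionally return an exact p-value; the hand computation above is shown only to make the test explicit.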
Results:
ChatGPT answered 52 of 98 questions correctly (53%). Logical reasoning, internal information, and external information were identified in 82.7%, 84.7%, and 78.6% of responses, respectively. Among the incorrect answers, informational fallacy was the most frequent (43.5%), followed by logical fallacy (32.6%) and explicit fallacy (23.9%). The presence of logical reasoning (p=0.02) and internal information (p=0.01) differed significantly between correct and incorrect responses.
Conclusions:
ChatGPT may be a potential study aid in resident education and serve as an additional resource for board preparation in ophthalmology. Given the recent advancements in AI, future studies should assess whether ChatGPT can positively influence resident performance when implemented as a learning tool.
This abstract was presented at the 2024 ARVO Annual Meeting, held in Seattle, WA, May 5-9, 2024.