Abstract
Purpose :
Patients and families are increasingly turning to internet-based resources, such as large language model (LLM)-powered chatbots, to obtain medical information and advice. The accuracy of medical recommendations from these programs, particularly for rare conditions such as congenital glaucoma, remains unknown.
Methods :
A series of questions regarding common signs and symptoms of congenital glaucoma (tearing, blepharospasm, photophobia, buphthalmos, corneal haze) was posed to ChatGPT-3.5 (OpenAI, San Francisco, USA). Answers were assessed by a pediatric ophthalmologist specializing in pediatric glaucoma (CLK) and an ophthalmology resident (NC) for accuracy, concern raised for congenital glaucoma, and recommended need for further evaluation. Readability of responses was evaluated using an online readability tool, Readable.
Results :
For all questions posed to ChatGPT-3.5 regarding signs and symptoms of congenital glaucoma, the responses provided reassurance and listed several alternate etiologies, including nasolacrimal duct obstruction (NLDO), refractive error, ocular surface infection, and normal ocular development, without raising concern for childhood glaucoma. When each symptom was queried individually, the chatbot recommended routine evaluation with the patient's pediatrician. Only when several symptoms were queried simultaneously was urgent evaluation with a pediatrician or pediatric eye specialist recommended, although the list of etiologies still did not mention glaucoma, instead including NLDO, congenital cataract, and conjunctivitis. Only when specific concern about childhood glaucoma was queried did the chatbot recommend immediate evaluation with pediatric ophthalmology. The average Flesch-Kincaid Grade Level for response output was 12.2 ± 1.4; the average Flesch Reading Ease Score was 39.3 ± 9.2.
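For context, both readability metrics reported above are computed from word, sentence, and syllable counts. The following is a minimal sketch of the standard published formulas, assuming the counts are already available; the function names are hypothetical, and this is not the Readable tool's implementation, whose tokenization and syllable counting may differ.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula (higher = easier to read)."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level formula (approximate US school grade)."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    # Illustrative counts only; not data from this study.
    words, sentences, syllables = 100, 5, 150
    print(round(flesch_reading_ease(words, sentences, syllables), 2))
    print(round(flesch_kincaid_grade(words, sentences, syllables), 2))
```

Under these formulas, longer sentences and more syllables per word lower the Reading Ease score and raise the Grade Level, which is consistent with a low ease score accompanying a grade level above 12.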
Conclusions :
Although artificial intelligence continues to grow as a resource for medical information for patients and families, ChatGPT-3.5 failed to recognize several symptoms of congenital glaucoma and to recommend appropriately urgent evaluation by a pediatric eye care specialist. The readability analysis suggests that an average person would have difficulty comprehending the information provided, as the responses require more than a high school education to understand. These findings will help to inform patient counseling, public health awareness measures, and future algorithmic changes to similar programs.
This abstract was presented at the 2024 ARVO Annual Meeting, held in Seattle, WA, May 5-9, 2024.