Investigative Ophthalmology & Visual Science
June 2024
Volume 65, Issue 7
Open Access
ARVO Annual Meeting Abstract  |   June 2024
Enhancing Amblyopia Identification Using NLP: A Study of BioClinical BERT and Flan-T5 Models
Author Affiliations & Notes
  • Wei-Chun Lin
    Ophthalmology, Oregon Health & Science University, Portland, Oregon, United States
  • Caleb Reznick
    Ophthalmology, Oregon Health & Science University, Portland, Oregon, United States
  • Leah Reznick
    Ophthalmology, Oregon Health & Science University, Portland, Oregon, United States
  • Abigail Lucero
    Ophthalmology, Oregon Health & Science University, Portland, Oregon, United States
  • J. Peter Campbell
    Ophthalmology, Oregon Health & Science University, Portland, Oregon, United States
  • Hiroshi Ishikawa
    Ophthalmology, Oregon Health & Science University, Portland, Oregon, United States
  • Michelle Hribar
    Ophthalmology, Oregon Health & Science University, Portland, Oregon, United States
    National Eye Institute, Bethesda, Maryland, United States
  • Footnotes
    Commercial Relationships   Wei-Chun Lin None; Caleb Reznick None; Leah Reznick None; Abigail Lucero None; J. Peter Campbell Boston AI Lab, Code C (Consultant/Contractor), Genentech, Code F (Financial Support), Siloam, Code O (Owner); Hiroshi Ishikawa None; Michelle Hribar None
    Support  Funding Support: NIH T15LM007088, NIH R01 LM013426, and unrestricted departmental funding from Research to Prevent Blindness.
Investigative Ophthalmology & Visual Science June 2024, Vol.65, 338. doi:
      Wei-Chun Lin, Caleb Reznick, Leah Reznick, Abigail Lucero, J. Peter Campbell, Hiroshi Ishikawa, Michelle Hribar; Enhancing Amblyopia Identification Using NLP: A Study of BioClinical BERT and Flan-T5 Models. Invest. Ophthalmol. Vis. Sci. 2024;65(7):338.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Purpose : Amblyopia is one of the most common causes of treatable vision loss in children. Accurately identifying patients with amblyopia in the electronic health record (EHR) is crucial for effective clinical care: it enables practice alerts, facilitates clinical trials, and supports quality measures and research. However, amblyopia diagnosis billing codes are frequently omitted from patient records. To address this gap, we propose using natural language processing (NLP) to determine a patient's amblyopia status from visit notes, and we compare the performance of a fine-tuned, pre-trained BioClinical BERT model with zero-shot prompting of a large language model (LLM).

Methods : Our dataset included visit notes and billing codes from new patient office visits between 2015 and 2022 in Pediatrics and Strabismus clinics at OHSU, focusing on patients aged 9 years or younger. We manually reviewed these notes to categorize the amblyopia status of each patient. Notes were labeled as “Amblyopia”, “Not Amblyopia”, or “Suspect Amblyopia.” In addition, a subset of the notes from patients with amblyopia was annotated with the amblyopia subtype: strabismic, deprivation, or refractive. To evaluate the effectiveness of current billing codes, we randomly selected 2,000 patient notes and assessed the accuracy of the billing diagnoses. For the NLP approaches, we fine-tuned the BioClinical BERT model on the labeled clinical notes and evaluated the Flan-T5 LLM with zero-shot prompting. The dataset was split into training, validation, and test sets in a 70%/15%/15% ratio.
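The 70%/15%/15% split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the abstract does not say whether the split was stratified by label or grouped by patient, so the stratification, the `stratified_split` helper, and the seed are all hypothetical. The label counts are the ones reported in the Results; the notes themselves are not available.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists that preserve label proportions.

    Stratification is an assumption made for this sketch; the abstract only
    reports the 70%/15%/15% ratio.
    """
    rng = random.Random(seed)

    # Group note indices by their annotated label.
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    train, val, test = [], [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train.extend(indices[:n_train])
        val.extend(indices[n_train:n_train + n_val])
        # Whatever remains in this label group goes to the test set.
        test.extend(indices[n_train + n_val:])
    return train, val, test


# Placeholder labels built from the counts reported in the Results section.
labels = (
    ["Amblyopia"] * 2089
    + ["Not Amblyopia"] * 1339
    + ["Suspect Amblyopia"] * 298
)
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

If a patient can contribute more than one note, splitting by patient rather than by note would be the safer choice, since it keeps notes from the same patient out of both training and test sets.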

Results : A total of 3,726 notes were randomly selected and manually annotated, identifying 2,089 amblyopia cases, 1,339 non-amblyopia cases, and 298 suspect amblyopia cases. Of these, 900 notes with an amblyopia diagnosis were further annotated using the three amblyopia subtypes. The BioClinical BERT model achieved the best performance, with a macro-average AUROC of 0.992 and an accuracy of 0.977 in determining amblyopia diagnoses (Table 1 and Figure 1). The zero-shot Flan-T5 model also outperformed billing codes alone.
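For readers unfamiliar with the reported metrics, the sketch below shows one common way to compute a macro-average AUROC (one-vs-rest, averaged over the three classes) and accuracy for a three-class classifier. It assumes the one-vs-rest definition, which the abstract does not specify, and the probabilities are invented toy values, not study data. The function names are hypothetical.

```python
def binary_auroc(y_true, scores):
    """AUROC as the probability that a random positive outranks a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_auroc(labels, probs, classes):
    """Unweighted mean of the one-vs-rest AUROC for each class."""
    per_class = []
    for k, cls in enumerate(classes):
        y_true = [1 if label == cls else 0 for label in labels]
        scores = [row[k] for row in probs]
        per_class.append(binary_auroc(y_true, scores))
    return sum(per_class) / len(per_class)


def accuracy(labels, probs, classes):
    """Fraction of notes whose highest-probability class is the true label."""
    predicted = [classes[max(range(len(row)), key=lambda k: row[k])] for row in probs]
    return sum(p == t for p, t in zip(predicted, labels)) / len(labels)


classes = ["Amblyopia", "Not Amblyopia", "Suspect Amblyopia"]

# Toy example: made-up labels and class probabilities for six notes.
toy_labels = [
    "Amblyopia", "Amblyopia",
    "Not Amblyopia", "Not Amblyopia",
    "Suspect Amblyopia", "Suspect Amblyopia",
]
toy_probs = [
    [0.8, 0.1, 0.1],
    [0.5, 0.2, 0.3],
    [0.2, 0.7, 0.1],
    [0.1, 0.6, 0.3],
    [0.3, 0.2, 0.5],
    [0.6, 0.1, 0.3],  # a suspect case the toy model assigns to "Amblyopia"
]

toy_macro_auroc = macro_auroc(toy_labels, toy_probs, classes)
toy_accuracy = accuracy(toy_labels, toy_probs, classes)
print(round(toy_macro_auroc, 3), round(toy_accuracy, 3))
```

Because the macro average weights each class equally, the small "Suspect Amblyopia" class (298 of 3,726 notes) influences the score as much as the two larger classes.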

Conclusions : Our findings indicate that billing codes alone are inadequate for accurately identifying patients with amblyopia. In contrast, the NLP approaches achieved much higher accuracy and precision. Moreover, the zero-shot Flan-T5 model shows potential for rapid phenotyping and enhanced interpretability.
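To make the zero-shot approach more concrete, the sketch below shows how a visit note might be turned into a prompt and how the model's free-text answer might be mapped back to one of the three labels. The abstract does not give the prompt wording, the Flan-T5 model size, or the decoding settings, so everything here is hypothetical: the `build_prompt` and `parse_label` helpers, the prompt text, and the example note are illustrations only. The model call itself is deliberately left out.

```python
LABELS = ("Amblyopia", "Not Amblyopia", "Suspect Amblyopia")


def build_prompt(note_text):
    """Build a zero-shot classification prompt (hypothetical wording)."""
    options = ", ".join(LABELS)
    return (
        "Read the following ophthalmology visit note and answer with exactly "
        f"one of: {options}.\n\nNote: {note_text}\n\nAnswer:"
    )


def parse_label(model_output):
    """Map free-text model output to one label, or None if nothing matches."""
    text = model_output.strip().lower()
    # Check longer labels first so that "not amblyopia" and
    # "suspect amblyopia" are not mistaken for plain "amblyopia".
    for label in sorted(LABELS, key=len, reverse=True):
        if label.lower() in text:
            return label
    return None


# Invented example note; not taken from the study data.
example_note = "4-year-old with anisometropia; visual acuity 20/80 OD, 20/25 OS."
prompt = build_prompt(example_note)
print(prompt)

# The generated text would come from the LLM; here it is a stand-in string.
print(parse_label("Suspect amblyopia."))
```

No labeled training data is needed for this approach, which is what makes it attractive for rapid phenotyping; the trade-off is that outputs falling outside the label set (returned here as `None`) need a fallback rule.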

This abstract was presented at the 2024 ARVO Annual Meeting, held in Seattle, WA, May 5-9, 2024.

 

 
