Kaihua Hou, Jasdeep Sabharwal, Patrick Herbert, Chris Bradley, Chris A Johnson, Michael Wall, Pradeep Y Ramulu, Mathias Unberath, Jithin Yohannan; A Comparison of Clinician and Deep Learning Performance at Detecting Visual Field Worsening. Invest. Ophthalmol. Vis. Sci. 2022;63(7):2014 – A0455.
© ARVO (1962-2015); The Authors (2016-present)
To compare the ability of a Deep Learning Model (DLM) and clinicians to identify Visual Field (VF) worsening among a large cohort of glaucoma patients.
We conducted a retrospective longitudinal study of glaucoma patients with at least seven reliable VFs, seen across multiple glaucoma providers. The clinician's judgment of whether VF worsening was present in each eye was made at the time of the last VF in the series, during routine clinical care. We trained a 2D convolutional Long Short-Term Memory DLM to predict VF worsening from the series of VFs for each eye (Figure 1). The reference standard for VF worsening, used both to train and test the DLM and to evaluate clinician performance, was worsening on at least 4 of 6 trend-based and event-based algorithms: Mean Deviation (MD) slope, Visual Field Index (VFI) slope, Point Linear Regression (PLR) slope, Advanced Glaucoma Intervention Study (AGIS) score, Guided Progression Analysis (GPA), and Collaborative Initial Glaucoma Treatment Study (CIGTS). We split the data 80%/10%/10% into training, validation, and test sets for the DLM. The performance of the DLM and of clinicians at identifying VF worsening was evaluated on the test set using the Area Under the Receiver Operating Characteristic Curve (AUROC).
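The 4-of-6 reference standard and the 80%/10%/10% split can be illustrated with a minimal Python sketch. This is not the authors' code: the dictionary keys, the function names, and the fixed random seed are hypothetical, and the sketch assumes each of the six algorithms has already produced a yes/no worsening flag for an eye. The abstract does not say whether the split was made per eye or per patient, so the function simply splits whatever units it is given.

```python
import random

# The six algorithms are those named in the abstract; these keys are illustrative.
ALGORITHMS = ("md_slope", "vfi_slope", "plr_slope", "agis", "gpa", "cigts")


def reference_worsening(flags, min_agree=4):
    """Return True when at least `min_agree` of the six algorithms flag worsening.

    `flags` maps each algorithm key to a boolean worsening decision.
    """
    missing = [name for name in ALGORITHMS if name not in flags]
    if missing:
        raise ValueError(f"missing algorithm results: {missing}")
    return sum(bool(flags[name]) for name in ALGORITHMS) >= min_agree


def split_80_10_10(items, seed=0):
    """Shuffle `items` and return (train, validation, test) lists.

    Integer arithmetic avoids floating-point rounding; any remainder
    goes to the test list.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10
    n_val = len(shuffled) // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test
```

Passing patient identifiers rather than eye identifiers to `split_80_10_10` would keep both eyes of a patient in the same partition; whether the study did so is not stated.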
A total of 8,705 eyes from 5,099 patients were included. Under the reference standard criteria for VF worsening, 869 eyes (10%) were found to have worsening VFs over time. The DLM had an AUROC of 0.94 (95% CI: 0.93, 0.99) for detecting VF worsening on the test set. In contrast, the clinician decision had an estimated AUROC of 0.63 (95% CI: 0.56, 0.70) on the test set.
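For readers unfamiliar with the metric, AUROC can be computed as the probability that a randomly chosen worsening eye receives a higher score than a randomly chosen non-worsening eye, with ties counted as one half. The Python sketch below implements that pairwise definition; the function name and the toy data are hypothetical and are not taken from the study. For a yes/no decision, this definition reduces to the mean of sensitivity and specificity, which is one way an AUROC can be estimated for a binary judgment; the abstract does not state how the clinician estimate was obtained.

```python
def auroc(labels, scores):
    """Pairwise (Mann-Whitney) AUROC: fraction of positive/negative pairs
    in which the positive case has the higher score, ties counting 0.5.

    `labels` holds 1 for worsening and 0 for no worsening.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example with continuous model scores (made-up numbers).
model_auc = auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])

# Toy example with binary decisions: sensitivity 1/2, specificity 2/3.
binary_auc = auroc([1, 1, 0, 0, 0], [1, 0, 0, 0, 1])
```

The pairwise loop is quadratic in the number of eyes, which is fine for illustration; a rank-based implementation would be preferable for large test sets.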
A DLM was trained to identify VF worsening with good classification performance. The performance of the DLM at identifying VF worsening was superior to the performance of clinicians during routine clinical care.
This abstract was presented at the 2022 ARVO Annual Meeting, held in Denver, CO, May 1-4, 2022, and virtually.
Figure 1: Deep Learning Model Architecture