Abstract
Purpose:
To provide precise and trustworthy visual field (VF) forecasts that aid in training ophthalmic residents and encourage patients with glaucoma progression (GP) to adhere to treatment, we use artificial intelligence (AI) to build vision transformer (ViT)-based deep learning (DL) networks that take VFs as input to detect GP and predict VF appearance up to 10 years in the future.
Methods:
For training and testing, we use the largest dataset available to date, including minority populations, with 62,000+ VFs collected longitudinally from 2010 to 2023; the University of Washington Humphrey Visual Field (UWHVF) dataset serves as external validation. We first perform binary classification for GP detection to demonstrate the predictive power of transformers on 24-2 VF images padded to 9×9. For VF appearance prediction, we then customize and compare a generative vision transformer (GenViT) with a 3×3 patch size against a shallow U-Net-based architecture (sU-Net), predicting future VFs at 0-2 years, 2-5 years, and 5-10 years.
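The input preparation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the 24-2 VF is stored as an 8×9 array of sensitivity values, and the pad value and grid layout are placeholders chosen for demonstration.

```python
import numpy as np

PAD_VALUE = 0.0  # illustrative; the actual padding value is a design choice


def pad_vf_to_9x9(vf: np.ndarray) -> np.ndarray:
    """Pad a 24-2 VF array (assumed 8x9 here) into a square 9x9 image."""
    h, w = vf.shape
    out = np.full((9, 9), PAD_VALUE, dtype=vf.dtype)
    out[:h, :w] = vf  # placement within the 9x9 grid is illustrative
    return out


def to_patches(img: np.ndarray, p: int = 3) -> np.ndarray:
    """Split a 9x9 image into non-overlapping 3x3 patches (9 ViT tokens)."""
    h, w = img.shape
    return (img.reshape(h // p, p, w // p, p)
               .transpose(0, 2, 1, 3)   # group rows/cols of each patch together
               .reshape(-1, p * p))     # one flattened vector per patch


vf = np.random.rand(8, 9)              # hypothetical 24-2 VF values
tokens = to_patches(pad_vf_to_9x9(vf))
print(tokens.shape)                    # (9, 9): nine 3x3 patches, 9 values each
```

With a 3×3 patch size, a 9×9 VF image yields a short sequence of only nine tokens, which keeps the transformer's attention cost negligible for this input size.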
Results:
Our GP detection model outperforms convolutional neural networks on both slow and fast progression, reaching an F1-score of 0.65. For future VF prediction, our sU-Net generated qualitatively insightful VFs (Fig 1), with a masked mean absolute error (MAE) of 2.9 and 3.0 for the 0-2 and 2-5 year groups, respectively; GenViT performed better on the 5-10 year group, with a masked MAE of 3.7. Both models generalize well to the unseen UWHVF dataset and outperform a previous U-Net-based approach [1].
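A masked MAE restricts the error average to valid VF test points so that the padded cells of the 9×9 grid do not dilute the metric. A minimal sketch, assuming a boolean mask of valid locations (the mask shape used here is hypothetical):

```python
import numpy as np


def masked_mae(pred: np.ndarray, target: np.ndarray, mask: np.ndarray) -> float:
    """Mean absolute error over valid VF test points only (mask == True)."""
    diff = np.abs(pred - target)
    return float(diff[mask].mean())


pred = np.zeros((9, 9))
target = np.ones((9, 9))
mask = np.zeros((9, 9), dtype=bool)
mask[:8, :6] = True                     # hypothetical valid-point mask
print(masked_mae(pred, target, mask))   # 1.0
```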
[1] Wen et al., 2019. PLoS One.
Conclusions:
Our AI models outperform existing state-of-the-art approaches up to 10 years into the future using only a single VF as input. This robust tool can provide a VF-appearance reference to guide residents, improve patient follow-up, and expedite glaucoma treatment for the broadest patient populations. Toward clinical translation, we plan a simulated (retrospective) clinician decision-support experiment in which clinicians view baseline patient VFs with and without AI-generated future VFs, to assess the impact of AI on resident accuracy, speed, and diagnostic confidence. We also plan to combine optical coherence tomography (OCT) retinal nerve fiber layer thickness maps with VF inputs to capture structure-function correlations during model training.
This abstract was presented at the 2024 ARVO Annual Meeting, held in Seattle, WA, May 5-9, 2024.