Abstract
Purpose :
Faxed and scanned visual field (VF) reports are ubiquitous in clinical settings because electronic health record (EHR) systems lack interoperability. To reduce clinician workload, we developed and validated a machine learning-based optical character recognition (OCR) system that automates the extraction of VF thresholds and VF statistics from scanned images and PDFs.
Methods :
Humphrey 24-2 SITA Standard VF reports from 2015 to 2023 were extracted from the EHR for 472 glaucoma patients with multiple VF reports, comprising 1806 Zeiss Humphrey Field Analyzer 3 (HFA3) reports and 168 earlier HFA2 reports.
Using OpenCV, a free and open-source software (FOSS) library for image analysis, we preprocessed these images to detect VF reports. We then applied morphological transformations with Sobel filters to locate key textual information in each report, including patient identifiers, VF thresholds and VF statistics. We built two OCR systems on two FOSS OCR engines, Tesseract and PaddleOCR, to extract the text at the locations reported by OpenCV (Figure 1). Tesseract is based on a machine-learning long short-term memory (LSTM) and support vector model ensemble, whereas PaddleOCR is based on a residual attention-based convolutional neural network backbone. The character recognition accuracy of each system was compared against human recognition using a random subset of 50 VF reports drawn from both the HFA2 and HFA3 reports.
Results :
On VFs and VF statistics, Tesseract was more accurate on the OpenCV-processed HFA3 reports (98% ± 2%) than on HFA2 reports (95% ± 2%), whereas PaddleOCR was more accurate on HFA2 reports (98% ± 1%) than on HFA3 reports (96% ± 2%). PaddleOCR and Tesseract performed similarly on patient information (96% ± 2% vs. 95% ± 1%). Most OCR errors were caused by font kerning and by characters overlapping each other or the axes on VFs (Figure 2).
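The abstract does not state how character recognition accuracy was computed. As a hedged illustration only, one common definition is 1 minus the edit distance between the OCR output and the human transcription, divided by the length of the transcription, summarized across reports as mean and standard deviation. The function names below are hypothetical, and treating the reported ± values as standard deviations is an assumption.

```python
from statistics import mean, stdev


def edit_distance(ref, hyp):
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def char_accuracy(ref, hyp):
    """Share of reference characters recovered by the OCR output, floored at 0."""
    if not ref:
        return 1.0 if not hyp else 0.0
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))


def summarize(pairs):
    """Mean and sample standard deviation of per-report accuracy.

    pairs: iterable of (human_transcription, ocr_output) strings.
    """
    scores = [char_accuracy(ref, hyp) for ref, hyp in pairs]
    return mean(scores), (stdev(scores) if len(scores) > 1 else 0.0)
```

For example, a human transcription of "28 30" read by OCR as "28 3O" (letter O in place of zero) differs by one substitution out of five characters, giving an accuracy of 0.8 for that field.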
Conclusions :
Automated FOSS OCR engines effectively convert scanned VF reports to structured textual data, enhancing clinical workflow and research capabilities. This technology can aid in clinical trend analysis by facilitating access to previously hard-to-extract data, representing an advancement in the management and analysis of visual field information in healthcare settings.
This abstract was presented at the 2024 ARVO Annual Meeting, held in Seattle, WA, May 5-9, 2024.