Gary Lee, Thomas Callan, Charles Wu, Ashwini Tamhankar, Angelina Covita, Mary Durbin; Predicting laterality in external eye images using deep learning. Invest. Ophthalmol. Vis. Sci. 2019;60(11):PB0111.
Ophthalmic devices often acquire ancillary eye images during clinical testing. For example, external eye photos are often used for gaze tracking in perimetry, where the visual field test is highly dependent on laterality; predicting laterality automatically could expedite testing and post-processing. In this study, we explored the use of deep learning (DL) approaches to predict laterality in eye images.
A total of 22,185 grayscale eye images (472×472 pixels, 40.7×40.7 mm²) were extracted from visual field exams previously acquired on HFA3 perimeters (ZEISS, Dublin, CA) from 134 eyes (83 subjects, age range: 23 to 72 years) in a series of ethics board-approved clinical research studies. Images were split into training (12,770 images, 94 eyes, 53 subjects), validation (3,517 images, 20 eyes, 10 subjects), and test sets (5,898 images, 20 eyes, 20 subjects) in a 70:15:15 ratio of eyes, with each subject restricted to a single set. The test set contained 11 right (OD) and 9 left (OS) eyes. Two convolutional neural networks (CNNs) were trained with augmentation on adaptive histogram equalized input images in TensorFlow/Keras. VGGfrozen used a well-known CNN (VGG-16), pre-trained on ImageNet and frozen, replacing the fully connected (FCN) layers with two FCN layers (100 neurons + ReLU + dropout) and a final FCN layer (2 neurons + softmax); images were resized to 224×224×3. VGGmini used the first three convolution/pooling blocks of VGG-16 (8-fold filter reduction) with the same final FCN layers as VGGfrozen; images were resized to 64×64. Performance was assessed by computing global accuracy (percent of all images correct), class accuracy (mean of per-eye accuracy), and gradient-weighted class activation maps (Grad-CAM).
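The VGGfrozen configuration described above can be sketched in TensorFlow/Keras roughly as follows. This is a minimal sketch, not the authors' code: the dropout rate and the function name are assumptions, and `weights=None` is used so the sketch runs without downloading the ImageNet weights the study loaded.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_vgg_frozen(input_shape=(224, 224, 3), weights=None):
    """Frozen VGG-16 backbone with a new fully connected head (sketch).

    The study used ImageNet weights (weights="imagenet"); None is used
    here only so the sketch runs offline.
    """
    base = tf.keras.applications.VGG16(
        include_top=False, weights=weights, input_shape=input_shape)
    base.trainable = False  # freeze the convolutional backbone

    model = models.Sequential([
        base,
        layers.Flatten(),
        # two FCN layers: 100 neurons + ReLU + dropout (rate assumed)
        layers.Dense(100, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(100, activation="relu"),
        layers.Dropout(0.5),
        # final FCN layer: 2 neurons + softmax (OD vs. OS)
        layers.Dense(2, activation="softmax"),
    ])
    return model
```

VGGmini would differ only in the backbone: the first three convolution/pooling blocks of VGG-16 with filter counts reduced 8-fold, trained from scratch on 64×64 inputs, feeding the same head.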
Three errors were found for each CNN, resulting in global and class accuracies of 99.95% and 99.93% for VGGfrozen, compared to 99.95% and 99.94% for VGGmini (see Table 1). Errors corresponded to blinks, poor contrast, or features near the decision threshold (see Figure 1 for examples of original vs. input images plus Grad-CAM overlays).
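The two accuracy figures above are computed differently: global accuracy pools all images, whereas class accuracy averages per-eye accuracies so that every eye counts equally regardless of its image count. A minimal sketch of the distinction (function and variable names are hypothetical, not from the study):

```python
from collections import defaultdict

def global_and_class_accuracy(eye_ids, y_true, y_pred):
    """Return (global accuracy, class accuracy).

    Global accuracy: fraction of all images classified correctly.
    Class accuracy: mean of per-eye accuracies, weighting each eye
    equally even if eyes contribute different numbers of images.
    """
    correct = [t == p for t, p in zip(y_true, y_pred)]
    global_acc = sum(correct) / len(correct)

    per_eye = defaultdict(list)
    for eye, c in zip(eye_ids, correct):
        per_eye[eye].append(c)
    class_acc = sum(sum(v) / len(v) for v in per_eye.values()) / len(per_eye)
    return global_acc, class_acc
```

With a highly imbalanced eye, the two metrics diverge, which is why reporting both guards against one well-sampled eye dominating the score.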
Classic DL approaches enabled accurate prediction of laterality in eye images in a preliminary cohort. In general, DL methods combined with relevant data and labels may provide solutions for related tasks such as gaze tracking, self-alignment, and finding metrics of test quality.
This abstract was presented at the 2019 ARVO Imaging in the Eye Conference, held in Vancouver, Canada, April 26-27, 2019.
Table 1. Summary of Networks and Performance
Figure 1. Example CNN outputs: (a) typical, (b) blink, and (c) poor contrast