Abstract
Purpose :
To establish an end-to-end, data-driven learning method that detects retinal tasks and image landmarks to determine scanning laser ophthalmoscope (SLO)-based retinal tracking results.
Methods :
We present a novel method for SLO-based systems that automatically detects eye-motion task paradigms and retinal landmarks in image data, such as the location of the fovea, using a fully data-driven, end-to-end deep neural network (DNN) algorithm. The DNN approaches were tested on a 99-subject concussion/control database collected at the University of Pittsburgh for both fixation and saccade tasks. Task-based detection aimed to localize the fovea and to identify the stimulus target positions in each video sequence for use in task sorting. The detection results were evaluated against “strip-registration” based approaches to quantify model performance.
Results :
Preliminary model performance was measured with the intersection over union (IoU) metric by comparing DNN predictions of image landmarks with the annotated ground truth. We demonstrate precision >0.9 and recall >0.9 for each prediction of certain visual landmarks on the image, which gives an overall mean Average Precision (mAP) >50 for the sorting of retinal motion tasks. The detection approach generalizes well to different types of landmark detection under the same supervised training pipeline. We experimented with detecting different types of objects in the image using minimal training data, on the order of 10 SLO images, by applying transfer learning with most of the neurons in the DNN pretrained on ImageNet. This reduced both the amount of training data required and the data annotation effort.
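As an illustration of how IoU-based precision and recall of this kind can be computed, the following is a minimal sketch, not the evaluation code used in this work. It assumes that landmarks are annotated as axis-aligned bounding boxes in (x_min, y_min, x_max, y_max) form and that a prediction counts as correct when its IoU with an unmatched ground-truth box is at least 0.5; the box format, the threshold, and the function names `iou` and `precision_recall` are all assumptions made for the example.

```python
from typing import Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Width and height of the overlap region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: Sequence[Box],
    truths: Sequence[Box],
    threshold: float = 0.5,  # assumed IoU cutoff for a correct detection
) -> Tuple[float, float]:
    """Greedily match each prediction to its best unmatched ground-truth box."""
    matched = set()
    true_pos = 0
    for p in preds:
        best_j, best_iou = -1, 0.0
        for j, t in enumerate(truths):
            if j in matched:
                continue
            value = iou(p, t)
            if value > best_iou:
                best_j, best_iou = j, value
        if best_j >= 0 and best_iou >= threshold:
            matched.add(best_j)
            true_pos += 1
    precision = true_pos / len(preds) if preds else 0.0
    recall = true_pos / len(truths) if truths else 0.0
    return precision, recall
```

For example, a predicted fovea box that exactly overlaps its annotation plus one spurious detection elsewhere in the frame would yield a precision of 0.5 and a recall of 1.0 under these assumptions.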
Conclusions :
Using DNN algorithms with supervised training to extract latent features from SLO videos demonstrates strong classification and prediction power while alleviating the limitations of manual feature engineering. A generalized data-driven approach that learns the data representation automatically is essential for capturing the latent features embedded in SLO videos. In future work, this technique will be applied to quantify saccadic latency in order to differentiate concussed from healthy control subjects.
This abstract was presented at the 2022 ARVO Annual Meeting, held in Denver, CO, May 1-4, 2022, and virtually.