Abstract
Purpose:
Acquiring good-quality images is crucial for disease screening and diagnosis with fundus cameras. A quality indicator shown immediately after capture allows low-quality images to be retaken while the patient is still available. However, running a deep learning image quality algorithm on a low-cost fundus camera can be slow. TensorFlow Lite (TFL) is a deep learning framework designed for on-device inference, also known as edge computing. In this study, we investigated methods to optimize inference performance using TFL.
Methods:
A VGG-16 neural network was trained using fundus images captured with the VELARA™ 200 (ZEISS, Dublin, CA), a fully automated non-mydriatic fundus camera with a 45-degree field of view centered on the macula. 4574 images (3158 good quality and 1416 bad quality) were used for training; 597 images (353 good quality and 244 bad quality) were used for testing. Each image was graded by majority vote among 3 subject matter experts. First, the floating-point (FP) TensorFlow model was converted and saved as a .tflite file. Then, post-training dynamic range quantization was applied using TFL, quantizing the weights of the trained model to 8-bit integers. To test accuracy, both the FP and quantized TFL models were evaluated on the same test set. To test speed, an Android app was built and installed on the fundus camera tablet, which has a Qualcomm Snapdragon 439 APU.
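The conversion and quantization steps described above can be sketched as follows. This is a minimal illustration using the standard TFLiteConverter API; a tiny stand-in network is used here in place of the study's trained VGG-16, and all names are illustrative:

```python
import tensorflow as tf

def to_tflite(model: tf.keras.Model, quantize: bool) -> bytes:
    """Convert a Keras model to a TFLite flatbuffer, optionally quantized."""
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    if quantize:
        # Post-training dynamic-range quantization: weights are stored as
        # 8-bit integers; activations remain floating point at runtime.
        converter.optimizations = [tf.lite.Optimize.DEFAULT]
    return converter.convert()

# Tiny stand-in for the trained VGG-16 (illustration only).
stand_in = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64,)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),
])

fp_blob = to_tflite(stand_in, quantize=False)  # floating-point .tflite
q_blob = to_tflite(stand_in, quantize=True)    # dynamic-range quantized
```

Because dynamic-range quantization stores weights at one quarter the bit width, the quantized flatbuffer is markedly smaller, consistent with the 67.3 MB versus 17.8 MB model sizes reported below for the full VGG-16.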
Results:
The sensitivity, specificity, and AUC of the TensorFlow model were 85.7%, 99.2%, and 0.976. For the FP and quantized TFL models, they were 85.7%, 99.2%, 0.976 and 85.2%, 99.4%, 0.977, respectively. 95% confidence intervals and a detailed comparison can be found in Figure 1. Sensitivity (p=0.5) and specificity (p=0.5) were not statistically different after TFL optimization. For inference speed, the FP and quantized TFL models took 4808 ms and 3008 ms per image on the tablet CPU with 1 thread; with 8 threads, inference time fell to 1847 ms and 783 ms, respectively. On the tablet GPU, inference time was 1263 ms and 1261 ms. The FP model is 67.3 MB and the quantized model is 17.8 MB.
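The effect of thread count on inference latency can be measured with the TFLite interpreter along these lines. This is a hedged sketch of the measurement idea only: the study timed an Android app on the camera tablet's Snapdragon 439, whereas this script uses the Python interpreter with a tiny illustrative model:

```python
import time
import numpy as np
import tensorflow as tf

def ms_per_image(tflite_bytes: bytes, num_threads: int, runs: int = 10) -> float:
    """Average single-image inference latency in milliseconds."""
    interpreter = tf.lite.Interpreter(model_content=tflite_bytes,
                                      num_threads=num_threads)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    data = np.random.rand(*inp["shape"]).astype(np.float32)
    interpreter.set_tensor(inp["index"], data)
    interpreter.invoke()  # warm-up run, excluded from timing
    start = time.perf_counter()
    for _ in range(runs):
        interpreter.invoke()
    return (time.perf_counter() - start) / runs * 1000.0

if __name__ == "__main__":
    # Tiny illustrative model; the study used the converted VGG-16.
    demo = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(64,)),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(2),
    ])
    blob = tf.lite.TFLiteConverter.from_keras_model(demo).convert()
    for n in (1, 8):
        print(f"{n} thread(s): {ms_per_image(blob, n):.2f} ms/image")
```

On the tablet CPU, raising the thread count from 1 to 8 in this way is what reduced the reported per-image times from 4808 ms to 1847 ms (FP) and from 3008 ms to 783 ms (quantized).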
Conclusions:
In this study, we demonstrated a method to optimize the inference performance of a fundus image quality neural network model for edge computing using TFL.
This is a 2021 ARVO Annual Meeting abstract.