Abstract
Purpose:
There is a need for automated screening of diabetics given the projected growth in the number of diabetes patients. An ocular telehealth network of retina cameras and machine-vision software is a low-cost method of achieving broad-based screening. Additional "metadata" is often collected, such as diabetes onset date and age, as well as laboratory data such as hemoglobin A1c levels, which may be useful in automated diagnosis. We present an analysis of the effects of using this metadata in a computer-aided diagnosis system.
Methods:
We collected a set of images and metadata from 1104 examinations conducted between February 2009 and August 2011 at 9 walk-in clinics in the mid-South region of the USA. Each exam underwent a quality check and was diagnosed by an ophthalmologist as normal / mild Diabetic Retinopathy (DR) (1026 exams) or more severe disease (78 exams). Each image undergoes automated physiological location, microaneurysm detection, and exudate detection. Algorithms convert the images to a numeric "feature vector" which measures the lesion population and other characteristics of the image. The metadata is used as additional, non-image descriptors. These vectors train an automated diagnosis system to screen for disease vs. no disease, using linear feature projection methods and linear discriminant analysis, and also using an uncorrelated ensemble of neural network pattern classifiers. Three cases were studied: use of metadata, use of image data, and combined image / metadata. The linear methods were tested using hold-one-out (leave-one-out) validation and the neural network methods were tested using 4-fold validation.
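As an illustration of the validation scheme described above, the following minimal Python sketch concatenates image features with metadata and scores each exam with a model trained on all the other exams. It is not the authors' implementation: the nearest-class-mean classifier is a simple stand-in for the linear discriminant analysis named in the abstract, and every feature value, label, and function name here is hypothetical.

```python
def fit_centroids(vectors, labels):
    """Stand-in linear classifier: compute the mean feature vector per class."""
    sums, counts = {}, {}
    for x, y in zip(vectors, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict_centroid(model, x):
    """Assign x to the class whose mean is closest in squared Euclidean distance."""
    def dist2(center):
        return sum((a - b) ** 2 for a, b in zip(x, center))
    return min(model, key=lambda y: dist2(model[y]))


def leave_one_out_predictions(vectors, labels, fit, predict):
    """Hold each exam out in turn, train on the rest, and predict the held-out exam."""
    predictions = []
    for i in range(len(vectors)):
        train_x = vectors[:i] + vectors[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        predictions.append(predict(model, vectors[i]))
    return predictions


# Hypothetical toy data: two image-derived features (e.g. lesion counts)
# and one metadata feature (e.g. an HbA1c value) per exam.
image_features = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5],
                  [8.0, 6.0], [9.0, 7.0], [7.5, 6.5]]
metadata = [[5.5], [6.0], [5.8], [9.0], [9.5], [8.8]]
labels = [0, 0, 0, 1, 1, 1]  # 0 = no disease, 1 = disease

# Combined case: metadata appended to the image feature vector.
combined = [img + meta for img, meta in zip(image_features, metadata)]
preds = leave_one_out_predictions(combined, labels, fit_centroids, predict_centroid)
```

Passing `image_features` or `metadata` alone in place of `combined` corresponds to the image-only and metadata-only cases. In practice the features would typically be standardized before being concatenated, since image and laboratory values sit on different scales.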
Results:
The linear-based methods achieved a sensitivity / specificity of 97.2% and 48.1% with image and metadata; 97.8% and 42.4% with image data; and 94.8% and 29.1% using metadata. Results were comparable with the neural network, which also produced receiver operating characteristic (ROC) curves with an area under the curve (AUC) of 0.914, 0.907, and 0.821. An analysis was also conducted where the mild DR was regarded as a negative-only result; in this case the sensitivity and specificity were 94.7% and 44.0% for image and metadata; 95.8% and 38.7% for image data; and 89.6% and 26.3% for metadata. The AUC scores for the neural network methods in this case were 0.857, 0.854, and 0.702.
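For readers less familiar with the reported metrics, the short Python sketch below shows how sensitivity, specificity, and AUC can be computed from labels and classifier scores. The labels, scores, and 0.5 threshold are hypothetical toy values and are unrelated to the study data; the AUC is computed with the rank-based (Mann-Whitney) formulation, which is one standard way to obtain it and not necessarily the procedure used in the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = disease, 0 = no disease)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """Area under the ROC curve: the probability that a randomly chosen diseased
    exam scores higher than a randomly chosen non-diseased exam (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy labels and classifier scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.5, 0.3, 0.2, 0.1, 0.05]

# Thresholding the scores gives the hard decisions used for sensitivity / specificity.
threshold = 0.5
y_pred = [1 if s >= threshold else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
```

Lowering the threshold trades specificity for sensitivity, which is the usual choice in a screening setting where missed disease is the costlier error; the AUC summarizes performance across all thresholds.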
Conclusions:
Overall, both methods worked well on this data set, with comparable performance. The use of metadata alone showed a viable correlation with the disease status of the retina for this test set. When metadata was combined with image data, a slight improvement over image data alone was shown. For future work we will expand to still larger data sets, with more complete and more extensive metadata, including longitudinal lab results.
Keywords: image processing • diabetic retinopathy • detection