Two datasets were used in this project: the Kaggle retinopathy dataset [14] for training and validation, and the eOphta dataset (TeleOphta) [15] for testing. Both datasets consist of color retinal images whose heights and widths range from the low hundreds to the low thousands of pixels. The Kaggle dataset (EyePACS LLC, San Jose, CA, USA) is a collection of 35,126 retinal images, each assigned one of five diabetic retinopathy (DR) class labels (normal, mild, moderate, severe, and proliferative) (Fig. 1b). These images vary significantly in both image quality and patient demographics, and some are mislabeled. Therefore, two ophthalmologists screened a subset of these images for both correct labeling and image quality. Images with disagreement were excluded. The eOphta dataset contains retinal fundus images derived from a consortium of French hospitals and consists of 47 images with exudates, 35 exudate-free images, 148 images with microaneurysms or other small red lesions, and 233 microaneurysm-free images (
Fig. 1c). For the eOphta images, two ophthalmologists labeled every pixel as belonging to an exudate, a microaneurysm, or neither (Fig. 1d). These pixel-level classifications are provided as binary masks with the same dimensions as the original image.
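
To make the mask format concrete, the following is a minimal sketch of how a pixel-level binary mask could be checked against its image size and reduced to an image-level summary. It is an illustration only, not the pipeline used in this project: the in-memory representation (a list of rows of 0/1 values), the function names `mask_shape` and `lesion_summary`, and the toy mask are all hypothetical, and real masks would first have to be read from the dataset files.

```python
from typing import List, Tuple

# Hypothetical in-memory form of a binary mask: rows of 0/1 values.
Mask = List[List[int]]


def mask_shape(mask: Mask) -> Tuple[int, int]:
    """Return (height, width) of a rectangular mask."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    if any(len(row) != width for row in mask):
        raise ValueError("mask rows have inconsistent widths")
    return height, width


def lesion_summary(mask: Mask, image_shape: Tuple[int, int]) -> dict:
    """Check the mask against the image size and summarize its lesion pixels."""
    # The masks are described as having the same dimensions as the image,
    # so a mismatch is treated as an error rather than silently resized.
    if mask_shape(mask) != image_shape:
        raise ValueError("mask and image dimensions differ")
    lesion_pixels = sum(1 for row in mask for value in row if value)
    total_pixels = image_shape[0] * image_shape[1]
    return {
        "has_lesion": lesion_pixels > 0,
        "lesion_pixels": lesion_pixels,
        "lesion_fraction": lesion_pixels / total_pixels if total_pixels else 0.0,
    }


# Toy 4x5 mask containing a 2x2 lesion (assumed data, not from eOphta).
toy_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
summary = lesion_summary(toy_mask, image_shape=(4, 5))
print(summary)
```

Under these assumptions, an image whose mask contains at least one nonzero pixel would be counted as lesion-positive, and an all-zero mask as lesion-free.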