Abstract
Purpose :
Diabetic retinopathy (DR) is one of the most common causes of blindness, and screening with fundus images helps to detect DR at an early stage. Various types of fundus cameras are available today, differing in image quality, portability, and field of view. To screen a large population, it is essential to develop an automated DR screening system that is device agnostic. We evaluate how different device-specific solutions can be fused when the training data and underlying algorithms differ.
Methods :
From previous studies, we obtained two deep learning models with different architectures. One model (MHH) was trained exclusively on fundus images captured with a low-cost hand-held fundus camera (VISUSCOUT® 100; ZEISS, Jena, Germany). The second model (MTT) was trained on fundus images recorded with different table-top fundus cameras manufactured by various companies.
We evaluated the individual models and the combination strategies on two datasets: one from hand-held cameras (VISUSCOUT®) and one from table-top cameras (the Messidor-2 dataset). We analyzed three fusion techniques: score-level fusion, decision-level combination, and system-level combination. For the weighted score-level combination, the optimum weights were found experimentally in the range 0 to 1. In the decision-level combination, a modified OR logic was used for referable DR. For the system-level combination, the image metadata was used to determine whether the image was captured with a VISUSCOUT or a table-top fundus camera, and the image was routed to the corresponding algorithm. These strategies are visualized in Fig. 1.
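The three strategies can be illustrated with a minimal Python sketch. It is not the implementation used in the study: the abstract does not specify the "modified OR" rule, the decision thresholds, the weight search procedure, or the metadata field, so the plain OR, the 0.5 thresholds, the grid search over AUC, and the `device` labels below are all assumptions made for illustration, as are the function names.

```python
from typing import Sequence


def score_level_fusion(s_hh: float, s_tt: float, w: float) -> float:
    """Weighted average of both model scores; w in [0, 1] is the weight on MHH."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * s_hh + (1.0 - w) * s_tt


def decision_level_fusion(s_hh: float, s_tt: float,
                          t_hh: float = 0.5, t_tt: float = 0.5) -> bool:
    """Plain OR of the two thresholded decisions (assumption: the abstract's
    'modified OR' is not specified; thresholds are hypothetical)."""
    return s_hh >= t_hh or s_tt >= t_tt


def system_level_fusion(s_hh: float, s_tt: float, device: str) -> float:
    """Route by device metadata: use the model trained on that device type.
    The 'hand-held' label is a hypothetical metadata value."""
    return s_hh if device == "hand-held" else s_tt


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_weight(s_hh: Sequence[float], s_tt: Sequence[float],
                labels: Sequence[int], steps: int = 11) -> float:
    """Grid search over w in [0, 1], keeping the first weight with the highest AUC."""
    candidates = [i / (steps - 1) for i in range(steps)]

    def fused_auc(w: float) -> float:
        fused = [score_level_fusion(a, b, w) for a, b in zip(s_hh, s_tt)]
        return auc(fused, labels)

    return max(candidates, key=fused_auc)
```

In this sketch, `best_weight` would be run on a validation set containing images from both device types, so that the chosen weight is not tuned to a single camera class.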
Results :
Fig. 2 shows the diagnostic efficacy as AUC (%). Each individual model performs well on data from the device type it was trained on (MHH on data from hand-held cameras, MTT on table-top camera data), but accuracy drops significantly when it is tested on data from a different device. For fusing both models into a single system, we observe the best screening results with score-level fusion, which is clearly superior to decision-level and system-level fusion and insensitive to the data type.
Conclusions :
DR screening systems that were trained on data from different devices and with different deep learning models can be fused using various combination techniques to achieve higher system performance.
This is a 2021 ARVO Annual Meeting abstract.