Abstract
Purpose :
Software based on artificial intelligence (AI) algorithms has shown good diabetic retinopathy (DR) screening performance compared with human graders. However, automated image analysis software is known to be influenced by image resolution, contrast and fidelity to the real retinal appearance. The aim of this study was to evaluate how a new imaging device may affect the screening performance of automated DR image analysis software based on AI algorithms.
Methods :
This was an observational study based on data from consecutive patients with diabetes mellitus who attended a routine annual screening visit. Mydriatic fundus images were acquired on the same day using both a conventional flash fundus camera (Topcon, Tokyo) and a new, fully automated, confocal retinal imaging device that uses a broad-spectrum white LED as its light source (Centervue, Padova). All images were analysed for DR severity with AI-based software and classified as referral-warranted or not referral-warranted. Manual grading by two masked, certified DR graders served as the reference standard; discrepancies were adjudicated by a third specialist. Sensitivity and specificity were computed against this reference standard.
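The sensitivity and specificity calculation described above can be sketched as follows. This is a minimal illustration, not the study's analysis code: the function names, the example labels, and the choice of a normal-approximation (Wald) 95% confidence interval are all assumptions, since the abstract does not state how the intervals were derived.

```python
import math


def rate_with_ci(successes, total, z=1.96):
    """Return (proportion, lower, upper) using a Wald interval.

    Assumption: the abstract does not name its CI method; the normal
    approximation is used here only for illustration.
    """
    if total == 0:
        raise ValueError("cannot compute a rate from zero cases")
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def screening_performance(reference, predicted):
    """Compare software output with the human reference standard.

    Both arguments are sequences of booleans (or 0/1), one per gradable
    image, where True means "referral-warranted DR".
    """
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)          # correctly referred
    fn = sum(1 for r, p in pairs if r and not p)      # missed referrals
    tn = sum(1 for r, p in pairs if not r and not p)  # correctly not referred
    fp = sum(1 for r, p in pairs if not r and p)      # unnecessary referrals
    return {
        "sensitivity": rate_with_ci(tp, tp + fn),
        "specificity": rate_with_ci(tn, tn + fp),
    }


# Hypothetical labels for ten images (not data from the study).
reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
result = screening_performance(reference, predicted)
```

Running the same function once per imaging device, on the images each device produced, would give the pair of sensitivity and specificity estimates that the comparison relies on. With small samples, an exact or Wilson interval may be preferable to the Wald interval used in this sketch.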
Results :
A total of 144 subjects (288 eyes) were enrolled. The human graders judged 4% of images to be ungradable. The screening performance of the AI software was affected by the retinal imaging system. The automated algorithm achieved 90.7% sensitivity (95% CI 88.1–93.3) and 76.2% specificity (95% CI 74.3–78.1) for detecting referral-warranted DR when fed conventional flash fundus camera images. It achieved 94.7% sensitivity (95% CI 91.6–97.8) and 83.3% specificity (95% CI 81–85.6) when using images from the new confocal white LED device.
Conclusions :
Compared with human grading, the automated image analysis software achieved high sensitivity for referable retinopathy with both retinal imaging systems. However, specificity was higher with the confocal white LED device than with the conventional fundus camera. The present study shows that the accuracy and reliability of deep learning-based algorithms in DR screening depend on image quality, and that the choice of fundus camera may influence the results of screening campaigns.
This abstract was presented at the 2019 ARVO Annual Meeting, held in Vancouver, Canada, April 28 - May 2, 2019.