Catastrophic Failure in Image-Based Convolutional Neural Network Algorithms for Detecting Diabetic Retinopathy

Stephanie Klein Lynch; Abhay Shah; James C Folk; Xiaodong Wu; Michael David Abramoff

Abstract

Purpose : Convolutional neural networks (CNNs) outperform retinal specialists in detecting diabetic retinopathy (DR). There are two principal CNN designs: 1) image-based deep learning algorithms, such as that developed by Google Inc. [1], in which a CNN is trained on whole images, and 2) hybrid algorithms, in which multiple, semi-dependent CNNs are trained on the appearance of focal lesions [2]. We compared the performance of these two CNN designs on adversarial (confounder) images, in which a small fraction of pixels have been modified.

Methods : Ten DR images were selected from a reference set and subjected to slight pixel modifications through a process called adversarialization. The 10 images were presented to an image-based CNN that had already been trained to high performance on 500k DR images (AUC 0.99). The target diagnostic output for the 10 images was then re-labeled from ‘DR’ to ‘normal,’ and the resulting error was back-propagated to the input image over 450 iterations. At each iteration, the pixels were updated at the input layer by ε·sign(∇_x J(θ, x, y)), where ε is the small learning rate (0.001) and ∇_x J(θ, x, y) is the gradient of the cost function J with respect to the input image x, given model parameters θ and label y. This produced 10 adversarial images, which were then input into the hybrid algorithm, trained on a set of 5 million image components (AUC 0.98), to determine whether DR would be detected.
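The iterative sign-gradient update described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: a toy logistic classifier stands in for the image-based CNN, J is assumed to be a cross-entropy cost for the target label 'normal', the step is taken against the gradient sign so that this cost decreases, and all names (`predict_dr`, `adversarialize`, the toy weights and pixel values) are hypothetical. Only ε = 0.001 and the default of 450 iterations come from the abstract.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_dr(weights, bias, pixels):
    """Toy stand-in for the CNN: probability that the image shows DR."""
    z = sum(w * p for w, p in zip(weights, pixels)) + bias
    return sigmoid(z)


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def adversarialize(weights, bias, pixels, epsilon=0.001, iterations=450):
    """Nudge pixels step by step so the toy classifier drifts toward 'normal'."""
    x = list(pixels)
    for _ in range(iterations):
        p = predict_dr(weights, bias, x)
        # Assumed cost: J = -log(1 - p), cross-entropy with target y = 0 ('normal').
        # For the logistic model, dJ/dx_i = p * w_i.
        grad = [p * w for w in weights]
        # Step against the gradient sign to lower the cost for the target label,
        # then clip to the valid normalized intensity range [0, 1].
        x = [min(1.0, max(0.0, xi - epsilon * sign(g))) for xi, g in zip(x, grad)]
    return x


# Hypothetical toy "image": 1000 mid-gray pixels, normalized to [0, 1].
weights = [0.1 if i % 2 == 0 else -0.1 for i in range(1000)]
bias = 1.0
original = [0.5] * 1000

# Fewer iterations than the abstract's 450, so the toy perturbation stays small.
adversarial = adversarialize(weights, bias, original, iterations=20)

print("P(DR) before:", predict_dr(weights, bias, original))
print("P(DR) after: ", predict_dr(weights, bias, adversarial))
print("max pixel change:", max(abs(a - o) for a, o in zip(adversarial, original)))
```

In this toy setting, the classifier's output moves from 'DR' to 'normal' while no pixel changes by more than ε times the number of iterations. A real CNN would obtain the input gradient through back-propagation rather than a closed-form expression.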

Results : The difference between the adversarial images and the original DR images averaged 0.5-1.3 pixel values (0.12%-0.51%) [Fig. 1]. Clinicians and the hybrid algorithm correctly identified the adversarial images as ‘DR,’ even though the image-based system classified all of them as normal.
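As a rough illustration of how a per-image difference of this kind might be measured, the sketch below computes a mean absolute pixel difference and expresses it as a percentage of the intensity range. The abstract does not state its exact metric or normalization, so the choice of mean absolute difference, the 8-bit range of 255, and the function names are all assumptions, and the sample pixel values are made up.

```python
def mean_abs_pixel_difference(original, adversarial):
    """Average absolute per-pixel difference between two equally sized images."""
    if len(original) != len(adversarial):
        raise ValueError("images must have the same number of pixels")
    total = sum(abs(a - o) for a, o in zip(adversarial, original))
    return total / len(original)


def as_percent_of_range(difference, max_value=255):
    """Express a pixel difference as a percentage of the full intensity range."""
    return 100.0 * difference / max_value


# Hypothetical 8-bit pixel values, flattened to a list for simplicity.
original_pixels = [100, 150, 200, 50]
adversarial_pixels = [101, 149, 200, 52]

diff = mean_abs_pixel_difference(original_pixels, adversarial_pixels)
print("mean absolute difference:", diff)
print("percent of range:", as_percent_of_range(diff))
```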

Conclusions : Although both image-based and hybrid systems perform equivalently on validated datasets, image-based systems may fail catastrophically when confronted with adversarial images. They are sensitive to extremely small changes, potentially leading to false negatives. Hybrid algorithms based on multiple semi-dependent CNNs may offer a more robust option for clinical screening.

1. Gulshan V et al. Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA. 2016 Nov 29.
2. Abràmoff MD et al. Improved Automated Detection of Diabetic Retinopathy on a Publicly Available Dataset Through Integration of Deep Learning. IOVS. 2016 Oct;57(13):5200-5206.

This is an abstract that was submitted for the 2017 ARVO Annual Meeting, held in Baltimore, MD, May 7-11, 2017.
