Stephanie Klein Lynch, Abhay Shah, James C Folk, Xiaodong Wu, Michael David Abramoff; Catastrophic Failure in Image-Based Convolutional Neural Network Algorithms for Detecting Diabetic Retinopathy. Invest. Ophthalmol. Vis. Sci. 2017;58(8):3776.
© ARVO (1962-2015); The Authors (2016-present)
Convolutional neural networks (CNNs) outperform retinal specialists in detecting diabetic retinopathy (DR). There are two principal CNN designs: 1) image-based deep learning algorithms, such as that developed by Google Inc. [1], in which a single CNN is trained on whole images, and 2) hybrid algorithms, in which multiple, semi-dependent CNNs are trained on the appearance of focal lesions [2]. We compared the performance of these two CNN designs on adversarial (confounder) images, in which a small fraction of pixels have been modified.
Ten (10) DR images were selected from a reference set and subjected to slight pixel modifications through a process called adversarialization, performed as follows. The 10 DR images were presented to an image-based CNN that had already been trained to high performance on 500k DR images (AUC 0.99). The target diagnostic output for the 10 images was re-labeled from ‘DR’ to ‘normal.’ The resulting error was back-propagated to the input layer over 450 iterations, and the pixels of each image were updated at each iteration by ε·sign(∇ₓJ(θ, x, y)), where ε is a small step size (learning rate, 0.001) and ∇ₓJ(θ, x, y) is the gradient of the cost function J with respect to the input image x, given model parameters θ and label y. Ten (10) adversarial images resulted. These were input into the hybrid algorithm, trained on a set of 5 million image components (AUC 0.98), to determine whether DR would be detected.
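The update rule above can be illustrated with a minimal sketch. It is not the authors' implementation: a toy logistic classifier stands in for the trained image-based CNN, and the names `sigmoid`, `adversarialize`, `w`, and `b` are hypothetical. Only the step size (0.001), the iteration count (450), and the sign-of-gradient update come from the abstract; the choice of a cross-entropy cost and the clipping to a valid pixel range are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adversarialize(x, w, b, target=0.0, eps=0.001, n_iter=450):
    """Iterative sign-gradient attack on a toy logistic classifier.

    x      : flattened image with pixel values in [0, 1]
    w, b   : fixed weights of an already trained model (hypothetical)
    target : label to force; 0.0 stands in for 'normal', 1.0 for 'DR'
    """
    x_adv = x.copy()
    for _ in range(n_iter):
        p = sigmoid(w @ x_adv + b)             # model's probability of 'DR'
        # Gradient of the cross-entropy cost J w.r.t. the input pixels
        # (closed form for a logistic model; a CNN would use backprop).
        grad_x = (p - target) * w
        # Step each pixel by eps against the gradient sign, toward the target.
        x_adv = x_adv - eps * np.sign(grad_x)
        x_adv = np.clip(x_adv, 0.0, 1.0)       # assumption: keep pixels valid
    return x_adv

# Toy demonstration on synthetic data (not retinal images).
rng = np.random.default_rng(0)
w = rng.normal(size=64)
x = rng.uniform(0.2, 0.8, size=64)
b = 2.0 - w @ x                                # start with a confident 'DR' output
x_adv = adversarialize(x, w, b)
print("P(DR) before:", sigmoid(w @ x + b))
print("P(DR) after: ", sigmoid(w @ x_adv + b))
```

Because the toy model is linear and has only 64 inputs, the perturbation it needs is far larger than the sub-percent changes reported for the CNN; the sketch is meant to show the mechanics of the update, not to reproduce the magnitudes.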
The adversarial images differed from the original DR images by an average of 0.5-1.3 pixel values (0.12%-0.51%) [Fig. 1]. Clinicians and the hybrid algorithm correctly identified the adversarial images as ‘DR,’ whereas the image-based system classified all of them as normal.
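One way such a difference could be quantified is sketched below. The abstract does not state how the averages or percentages were computed, so the definitions here are assumptions: the mean absolute per-pixel difference, and that difference expressed as a percentage of an assumed 8-bit intensity range (0-255). The function name `perturbation_summary` is hypothetical.

```python
import numpy as np

def perturbation_summary(original, adversarial, max_value=255.0):
    """Return (mean absolute pixel difference, same as % of max_value).

    Assumes both images share a shape and an intensity scale whose
    maximum is max_value (255 for 8-bit images; an assumption here).
    """
    # Cast to float first: subtracting uint8 arrays would wrap around.
    diff = np.abs(adversarial.astype(np.float64) - original.astype(np.float64))
    mean_abs = float(diff.mean())
    return mean_abs, 100.0 * mean_abs / max_value

# Tiny synthetic example: two of four pixels change by one intensity level.
original = np.zeros((2, 2), dtype=np.uint8)
adversarial = np.array([[1, 0], [0, 1]], dtype=np.uint8)
mean_abs, percent = perturbation_summary(original, adversarial)
print(mean_abs, percent)
```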
Although both image-based and hybrid systems perform equivalently on validated datasets, image-based systems may fail catastrophically when confronted with adversarial images. They are sensitive to extremely small changes, potentially leading to false negatives. Hybrid algorithms based on multiple semi-dependent CNNs may offer a more robust option for clinical screening.
1. Gulshan V et al. Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA. 2016 Nov 29.
2. Abràmoff MD et al. Improved Automated Detection of Diabetic Retinopathy on a Publicly Available Dataset Through Integration of Deep Learning. IOVS. 2016 Oct;57(13):5200-5206.
This is an abstract that was submitted for the 2017 ARVO Annual Meeting, held in Baltimore, MD, May 7-11, 2017.