Abstract
Purpose :
Current GPU memory limitations do not support the analysis of OCT scans at their original resolution, and previous techniques have downsampled the inputs considerably, resulting in a loss of detail. Here, we utilise a new memory management support framework that allows for the training of large deep learning networks and apply it to the detection of glaucoma in OCT scans at their original resolution.
Methods :
A total of 1110 SDOCT volumes (Cirrus, Zeiss, CA) were acquired from both eyes of 624 subjects (139 healthy subjects and 485 glaucoma patients (POAG)). A convolutional neural network (CNN) consisting of 8 3D-convolutional layers with a total of 600K parameters was trained using a cross-entropy loss to differentiate between healthy and glaucomatous scans. To work around GPU memory constraints, the network was trained using a large model support library that automatically adds swap-in and swap-out nodes for transferring tensors from GPUs to the host and vice versa. This allowed the OCT scans to be analysed at the original resolution of 200x200x1024. The performance of the network was gauged by computing the area under the receiver operating characteristic curve (AUC). The performance of this network was also compared to (i) a previously proposed network that ingested downsampled OCT scans (50x50x128), consisted of 5 3D-convolutional layers and had a total of 222K parameters, and (ii) a machine-learning technique (random forests) that relied on segmented features (peripapillary nerve fibre thicknesses). Class activation maps (CAMs) were also generated for each of these networks to provide a qualitative view of the regions that each network deemed important and relevant to the task.
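To give a sense of the scale involved, the short sketch below compares the number of voxels in a volume at the original resolution (200x200x1024) with one at the downsampled resolution (50x50x128). It is illustrative arithmetic only: the float32 storage format and the single input channel are assumptions, the helper names are hypothetical, and the much larger memory needed for intermediate activations during training is not modelled.

```python
# Illustrative arithmetic only: voxel counts and raw input size for the two
# resolutions quoted in the Methods. Assumes one channel stored as float32;
# activation memory inside the network is NOT modelled here.

def voxel_count(shape):
    """Return the number of voxels in a volume with the given shape."""
    n = 1
    for dim in shape:
        n *= dim
    return n


def input_megabytes(shape, bytes_per_voxel=4):
    """Return the size of one volume in MB (assumed float32 = 4 bytes)."""
    return voxel_count(shape) * bytes_per_voxel / 1e6


full_res = (200, 200, 1024)    # original scan resolution
downsampled = (50, 50, 128)    # resolution used by the earlier network

full_voxels = voxel_count(full_res)
down_voxels = voxel_count(downsampled)
ratio = full_voxels // down_voxels

print(full_voxels)                   # 40960000
print(down_voxels)                   # 320000
print(ratio)                         # 128
print(input_megabytes(full_res))     # 163.84
print(input_megabytes(downsampled))  # 1.28
```

Each full-resolution volume therefore carries 128 times as many voxels as its downsampled counterpart, and every feature map computed at that resolution grows by a similar factor, which is why tensors need to be swapped between the GPU and the host.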
Results :
The AUCs computed on the test set for the networks that analysed the volumes at the original and downsampled resolutions were 0.92 and 0.91, respectively. The CAMs obtained from the high-resolution volumes showed more detail than those obtained from the downsampled volumes. The random forest technique achieved an AUC of 0.85.
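For readers less familiar with the metric, the following minimal sketch shows one standard way to compute an AUC: as the fraction of (glaucomatous, healthy) pairs in which the glaucomatous scan receives the higher score, with ties counted as one half. It is not the evaluation code used in this study; the function name `roc_auc` is hypothetical, and the labels and scores are made-up toy values, not study data.

```python
# Minimal sketch of the AUC as a pairwise ranking statistic (equivalent to
# the normalised Mann-Whitney U statistic). Toy data only, not study data.

def roc_auc(labels, scores):
    """AUC = P(score of a random positive > score of a random negative)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = glaucomatous, 0 = healthy.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.1]

auc = roc_auc(labels, scores)
print(round(auc, 4))  # 0.8333 (5 of 6 pairs ranked correctly)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes.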
Conclusions :
The two networks performed comparably for glaucoma detection, and both showed a clear improvement over the random forest that relied on segmented features (AUC 0.92 and 0.91 versus 0.85). The ability to retain detail (as shown in the CAMs) will likely allow for improvements in other tasks, such as establishing spatial correspondences between visual field test locations and retinal structure.
This is a 2020 ARVO Annual Meeting abstract.