Abstract
Purpose:
Recent applications of convolutional neural network (CNN) models have shown promising results for automatic delineation of retinal layers in optical coherence tomography (OCT) images of retinal diseases. Here, we evaluated two CNN models, sliding-window pixel classification (SW) and semantic image segmentation (UNet), for automatic segmentation of retinal layers in spectral-domain OCT (SD-OCT) B-scan images from patients with retinitis pigmentosa (RP).
Methods:
The SW model, which classifies the center pixel of small image patches (33x33 pixels), was similar to that reported previously (Fang et al., 2017). The UNet architecture followed Ronneberger et al. (2015), with an input image size of 256x32 pixels, 8 initial channels, and a 5x5 kernel. Both models were implemented in MATLAB on an iMac Pro. Training image patches (2.88 million for SW and 158,000 for UNet) were generated from 480 horizontal midline B-scans obtained from 220 RP patients and 20 normal subjects. Training was completed after 45 epochs. The testing images were 160 midline B-scans from a separate group of 80 RP patients. B-scans were segmented automatically with the Spectralis software, and the boundaries of the internal limiting membrane (ILM), inner nuclear layer (INL), ellipsoid zone (EZ), retinal pigment epithelium (RPE), and Bruch's membrane (BM) were corrected manually by one grader for the training set and by two graders for the testing set. The trained models were used to classify all pixels in the testing B-scans. For SW, post-processing was needed to obtain layer boundary lines that define the classes of all pixels in an image (see Figure). Bland-Altman and correlation analyses were conducted to compare the EZ widths measured by the models with those measured by the human graders.
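As an illustration only (not the authors' code), the following minimal MATLAB sketch shows how a UNet with the dimensions stated above could be assembled and set up for training using the Deep Learning Toolbox and the Computer Vision Toolbox helper unetLayers; the class names, encoder depth, optimizer, mini-batch size, and datastore name are assumptions not specified in the abstract.

% Illustrative sketch: build a UNet with 256x32 input, 8 initial channels,
% and 5x5 kernels, then set up training for 45 epochs.
% Requires Deep Learning Toolbox and Computer Vision Toolbox.
imageSize  = [256 32 1];                      % B-scan patch size stated in Methods
classNames = ["innerRetina","photoreceptorPlus","outerSegment","background"];  % assumed labels
numClasses = numel(classNames);

lgraph = unetLayers(imageSize, numClasses, ...
    'EncoderDepth', 3, ...                    % assumed; not stated in the abstract
    'NumFirstEncoderFilters', 8, ...          % "8 initial channels"
    'FilterSize', 5);                         % "5x5 kernel"

opts = trainingOptions('adam', ...            % optimizer assumed
    'MaxEpochs', 45, ...                      % training completed after 45 epochs
    'MiniBatchSize', 32, ...                  % assumed
    'Shuffle', 'every-epoch', ...
    'Verbose', false);

% With the 158,000 labeled training patches collected in a datastore
% (e.g., a pixelLabelImageDatastore, here called trainDS), the network
% would be trained with:
% net = trainNetwork(trainDS, lgraph, opts);

The SW model, by contrast, would be trained on the 33x33 patches with a standard image classification network, using the label of each patch's center pixel as the target.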
Results:
The mean time to classify a B-scan image was 94 sec for SW and 0.41 sec for UNet. Compared with the human graders, the accuracy in identifying pixels (thickness) of the inner retina (ILM-INL), photoreceptor+ (INL-RPE), and photoreceptor outer segment (EZ-RPE) was 93.5%, 89.6%, and 85.8%, respectively, for SW, and 92.5%, 91.4%, and 84.7%, respectively, for UNet. Bland-Altman analysis revealed a mean ± SE difference in EZ width of 0.17 ± 0.05 mm for SW and 0.03 ± 0.04 mm for UNet relative to the human graders. EZ width measured by the models was highly correlated with that measured by the human graders (r > 0.95, p < 0.05).
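As an illustration of the agreement analysis reported above, the short MATLAB function below (the function name and the use of the Statistics and Machine Learning Toolbox corr function are our own choices, not taken from the abstract) computes the Bland-Altman mean difference with its standard error, the 95% limits of agreement, and the Pearson correlation from per-scan EZ width measurements.

function [bias, se, loa, r, p] = compareEZWidth(ezModel, ezGrader)
% compareEZWidth  Bland-Altman bias/SE and Pearson correlation for EZ width.
% ezModel, ezGrader: vectors of EZ width (mm), one element per test B-scan.
d    = ezModel(:) - ezGrader(:);          % paired differences (model minus grader)
bias = mean(d);                           % Bland-Altman mean difference (bias)
se   = std(d) / sqrt(numel(d));           % standard error of the mean difference
loa  = bias + [-1.96 1.96] * std(d);      % 95% limits of agreement
[r, p] = corr(ezModel(:), ezGrader(:));   % Pearson correlation and p-value
end

A small bias with a small standard error, such as the 0.03 ± 0.04 mm reported here for UNet, would indicate close agreement with the human graders.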
Conclusions:
While the two CNN models performed comparably in delineating the retinal layers, UNet segmented B-scan images much faster than the SW model and may be a more efficient and effective method for automatic analysis of SD-OCT volume scans in RP.
This is a 2020 ARVO Annual Meeting abstract.