Abstract
Purpose:
Eye diseases such as age-related macular degeneration (AMD) and diabetic retinopathy (DR) produce localized structural changes in the retina. To enable quantitative assessment of retinal sublayer thickness in optical coherence tomography (OCT) images, many automated segmentation approaches, including deep learning (DL) methods, have been proposed. DL methods are typically trained with binary or multilayer colored masks. In contrast, we hypothesize that training on boundary-labeled images may yield better segmentation. In this study, we investigate retinal sublayer segmentation using both multilayer colored masks and boundary-annotated images across multiple encoder-decoder DL architectures.
Methods:
This is a retrospective study involving 750 enhanced depth imaging (EDI) OCT images. As depicted in Figure 1, we trained four encoder-decoder models: U-Net (U), Residual U-Net (RU), a conditional generative adversarial network Pix2Pix-GAN with U-Net as the generator (PG-U), and Pix2Pix-GAN with RU as the generator (PG-RU), separately on OCT-image/multilayer-colored-mask pairs and on OCT-image/boundary-marked-image pairs. The ground-truth labels were obtained from an open-source MATLAB File Exchange tool that detects six retinal boundaries (see Figure 1a). These segmentations were reviewed by an expert grader, and only perfectly segmented images (177 of 750) were selected as ground truth and used for training; the remainder were reserved for testing. Performance was assessed by subjective grading of 50 (of 573) randomly chosen unseen test images, with each model's segmentation graded on a scale of 0 to 100.
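The two label formats differ only in what the network is asked to predict: a filled multilayer mask versus the thin boundary curves themselves. A minimal sketch of how a boundary-marked target could be derived from a multilayer label mask, assuming NumPy and a toy layer-index encoding (the array shapes, layer indices, and function name are illustrative, not the actual output format of the MATLAB tool):

```python
import numpy as np

def mask_to_boundaries(layer_mask: np.ndarray) -> np.ndarray:
    """Convert a multilayer label mask (each pixel = layer index)
    into a boundary-marked image: 1 where the layer index changes
    between vertically adjacent pixels in each A-scan column, else 0."""
    boundaries = np.zeros_like(layer_mask, dtype=np.uint8)
    # A transition between rows r and r+1 marks a retinal boundary.
    transitions = layer_mask[1:, :] != layer_mask[:-1, :]
    boundaries[1:, :][transitions] = 1
    return boundaries

# Toy B-scan: three layers stacked vertically (6 rows x 4 columns).
toy = np.repeat(np.array([[0], [0], [1], [1], [2], [2]]), 4, axis=1)
b = mask_to_boundaries(toy)  # ones along the two layer transitions
```

Such a conversion preserves the same anatomical information in both label formats, so any performance difference between mask-trained and boundary-trained models can be attributed to the target representation rather than the annotations themselves.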
Results:
The qualitative comparison in Figure 2 shows similar performance for mask-trained and boundary-marked-trained models on good-quality images. On poor-quality images, however, the PG-RU model trained on boundary-marked images outperforms the others. Mean subjective grades for the mask-trained models are 89.14% (U), 90.08% (RU), 89.37% (PG-U), and 88.12% (PG-RU); for the boundary-marked-trained models, they are 92.08% (U), 91.67% (RU), 90.49% (PG-U), and 94.00% (PG-RU), highlighting the effectiveness of the boundary-marked PG-RU model.
Conclusions:
This study demonstrates that PG-RU trained on boundary-marked images is robust to noise and performs well even on poor-quality images, whereas on good-quality images all models yield comparable results.
This abstract was presented at the 2024 ARVO Annual Meeting, held in Seattle, WA, May 5-9, 2024.