Abstract
Purpose :
A significant limitation of OCT imaging is its high level of speckle noise, which can interfere with downstream tasks such as detailed measurement of retinal layers or of complex structures such as the lamina cribrosa (LC). A learning-based method, Noise2Noise, has previously been shown to achieve a high level of denoising, but it requires extensive sets of repeated scans from the same subject for training, which are often unavailable in real-world applications. We investigate the efficacy of a self-supervised deep learning model for mitigating speckle noise in OCT scans without the need for repeat scans.
Methods :
Noise2Noise is an unsupervised denoising approach in which a convolutional neural network is trained to learn the noise distribution from multiple repeat scans of the same subject. The Neighbor2Neighbor methodology follows a similar process but generates its training pairs by subsampling a single image. Additionally, the method introduces a term in the objective function to tolerate minor differences between the subsampled images.
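The subsampling and the extra objective term described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the implementation used in this study: the function names, the weight `gamma`, and the use of a mean squared error are hypothetical choices, and `denoiser` stands in for any fully convolutional network that accepts both full- and half-resolution images.

```python
import numpy as np

# Adjacent pixel pairs inside a 2x2 cell whose pixels are numbered
#   0 1
#   2 3
# Both orders of each horizontal and vertical pair are listed.
NEIGHBOR_PAIRS = np.array(
    [[0, 1], [1, 0], [2, 3], [3, 2], [0, 2], [2, 0], [1, 3], [3, 1]]
)


def sample_neighbor_indices(shape, rng):
    """Pick, for every 2x2 cell, two different adjacent pixel positions."""
    h, w = shape
    if h % 2 or w % 2:
        raise ValueError("image height and width must be even")
    choice = rng.integers(0, len(NEIGHBOR_PAIRS), size=(h // 2, w // 2))
    return NEIGHBOR_PAIRS[choice, 0], NEIGHBOR_PAIRS[choice, 1]


def subsample(image, idx):
    """Build a half-resolution image by taking pixel idx[i, j] from cell (i, j)."""
    h, w = image.shape
    # cells[i, j, k] is pixel k (row-major) of the 2x2 cell at (i, j)
    cells = image.reshape(h // 2, 2, w // 2, 2).transpose(0, 2, 1, 3)
    cells = cells.reshape(h // 2, w // 2, 4)
    return np.take_along_axis(cells, idx[..., None], axis=2)[..., 0]


def neighbor2neighbor_loss(denoiser, noisy, rng, gamma=1.0):
    """Reconstruction term plus a term that tolerates true differences
    between the two subsampled images (gamma is an assumed weight)."""
    idx1, idx2 = sample_neighbor_indices(noisy.shape, rng)
    g1, g2 = subsample(noisy, idx1), subsample(noisy, idx2)

    out = denoiser(g1)
    reconstruction = np.mean((out - g2) ** 2)

    # Denoise the full image and subsample it with the SAME indices.
    # In actual training, no gradient would flow through this branch.
    full = denoiser(noisy)
    expected_gap = subsample(full, idx1) - subsample(full, idx2)
    regularizer = np.mean((out - g2 - expected_gap) ** 2)

    return reconstruction + gamma * regularizer
```

For example, `neighbor2neighbor_loss(lambda x: x, image, np.random.default_rng(0))` evaluates the objective for an identity "denoiser"; in that case the second term vanishes and only the difference between the two subsampled images remains.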
To test this method on LC images, 273 raw optic nerve head scans from 54 subjects were acquired with spectral-domain OCT (Leica, Chicago, IL) using a 400x400x2048-pixel scan pattern. For efficiency, we removed regions without structural information and used only the first 600 slices. Each slice was further cropped to 384x384 pixels to fit the input size of the neural network. Given the large size of the training set, we did not apply any additional data augmentation.
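The slice selection and cropping step might look like the sketch below. It assumes the volume is held as a NumPy array of shape (400, 400, 2048) with slices indexed along the last axis, and it assumes a centered crop; the abstract states neither, and the function name `prepare_slices` is hypothetical.

```python
import numpy as np


def prepare_slices(volume, n_slices=600, crop=384):
    """Keep the first n_slices along the last axis and crop each slice.

    Assumptions (not stated in the abstract): slices are indexed along
    the last axis, and the crop window is centered.
    Returns an array of shape (n_kept, crop, crop).
    """
    h, w, depth = volume.shape
    if crop > h or crop > w:
        raise ValueError("crop size exceeds slice dimensions")
    n_kept = min(n_slices, depth)
    top = (h - crop) // 2
    left = (w - crop) // 2
    cropped = volume[top:top + crop, left:left + crop, :n_kept]
    # Move the slice axis to the front so each item is one 2D training image
    return np.moveaxis(cropped, -1, 0)
```

With the defaults, a 400x400x2048 volume yields 600 images of 384x384 pixels, leaving an 8-pixel margin on each side of every slice.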
Results :
Because the common peak signal-to-noise ratio measurement requires a clean reference, we instead applied a non-reference signal-to-noise ratio (NR-SNR) image quality metric, in which the background noise level is defined as the 66th percentile of the pixel values.
The NR-SNR score for our denoising method reached 60.09, compared with 38.08 for BM3D (a commonly used nonlinear filter) and 13.25 for the raw scan.
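A reference-free score built on the 66th-percentile threshold could be computed along the lines below. The abstract does not give the exact NR-SNR formula, so the ratio used here (mean signal above the background mean, divided by the background standard deviation) is an assumed, illustrative definition, and its values should not be compared with the scores reported above.

```python
import numpy as np


def nr_snr(image, background_percentile=66.0):
    """Illustrative no-reference SNR (assumed formula, not the study's).

    Pixels at or below the given percentile are treated as background;
    the remaining pixels are treated as signal.
    """
    pixels = np.asarray(image, dtype=float).ravel()
    threshold = np.percentile(pixels, background_percentile)
    background = pixels[pixels <= threshold]
    signal = pixels[pixels > threshold]
    if signal.size == 0:
        raise ValueError("no pixels above the background threshold")
    noise = background.std()
    contrast = signal.mean() - background.mean()
    if noise == 0:
        return float("inf")
    return contrast / noise
```

Under this definition, a denoiser that flattens the background while preserving bright structures raises the score, which matches the intent of comparing raw, BM3D, and network outputs without a clean reference.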
Conclusions :
Applying a self-supervised denoising solution yields noticeable qualitative and quantitative improvement in OCT scans. The method does not require any complex preprocessing and, most importantly, does not require repeated scanning, which is often difficult to obtain because of long acquisition times and eye-motion artifacts; it can therefore also be applied to in vivo imaging. Once trained, the model can be generalized to new scans from the same device and scanning configuration.
This abstract was presented at the 2023 ARVO Annual Meeting, held in New Orleans, LA, April 23-27, 2023.