Abstract
Purpose :
In recent years, machine learning has become increasingly prominent in fundus image analysis, but its need for large numbers of images is a barrier. Image data must normally be collected for each individual task, which requires significant resources. A common approach in information science is to create a general-purpose model, called a “pre-trained model”, and fine-tune it for each task. In this study, we created a pre-trained model using fundus images obtained with an ultra-widefield (UWF) camera and verified its generalization performance.
Methods :
A pre-trained model was created using approximately 10,000 fundus images taken with UWF at Tsukazaki Hospital. The pre-training task was to rotate each image by an angle in the range of 0-360° and have the model estimate that angle. Next, using the pre-trained model as initial parameters, disease classification was performed on fundus images of about 4,000 patients at the same hospital who had a disease. The classification targets were Age-Related Macular Degeneration, Retinal Detachment, Glaucoma, Retinal Vessel Occlusion, Diabetic Retinopathy, and normal fundus. For comparison, we conducted the same experiments without pre-training and with a model pre-trained on ImageNet, which is commonly used not only in the medical field but also in general settings.
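The rotation-based pre-training task described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state how the angle was encoded or how the images were rotated, so the (sin, cos) regression target, the nearest-neighbour rotation, and all function names (`rotate_nearest`, `angle_to_target`, `target_to_angle`, `angular_error`, `make_sample`) are assumptions introduced here.

```python
import numpy as np


def rotate_nearest(img, angle_deg):
    """Rotate a 2-D image about its centre (nearest-neighbour, zero fill)."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = np.deg2rad(angle_deg)
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for every output pixel, look up its source pixel.
    x0 = np.cos(theta) * (xs - cx) + np.sin(theta) * (ys - cy) + cx
    y0 = -np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    xi = np.rint(x0).astype(int)
    yi = np.rint(y0).astype(int)
    valid = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
    out = np.zeros_like(img)
    out[valid] = img[yi[valid], xi[valid]]
    return out


def angle_to_target(angle_deg):
    """Assumed encoding: (sin, cos), so 0 deg and 360 deg map to one target."""
    t = np.deg2rad(angle_deg)
    return np.array([np.sin(t), np.cos(t)])


def target_to_angle(target):
    """Decode a (sin, cos) pair back to an angle in [0, 360)."""
    return np.rad2deg(np.arctan2(target[0], target[1])) % 360.0


def angular_error(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def make_sample(img, rng):
    """Create one self-supervised sample: rotated image, target, true angle."""
    angle = rng.uniform(0.0, 360.0)
    return rotate_nearest(img, angle), angle_to_target(angle), angle


# Usage with a synthetic stand-in for a fundus image (no real data involved).
rng = np.random.default_rng(0)
image = rng.random((64, 64))
rotated, target, angle = make_sample(image, rng)
```

Because the label is generated from the rotation itself, no manual annotation is needed, which is what makes this kind of task usable for pre-training on unlabeled images. The circular error in `angular_error` avoids penalizing a prediction of 359° against a true angle of 1° as a large mistake.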
Results :
When the classifier was fine-tuned from the pre-trained model created from 10,000 UWF images, the performance did not differ from that without pre-training: the accuracy for the 6-class classification was 0.743. On the other hand, when the k-nearest neighbors algorithm (k-NN) was applied to features extracted from the pre-trained model, the accuracy improved slightly to 0.744. However, fine-tuning the more widely used ImageNet-pre-trained model gave the highest accuracy, 0.747.
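The k-NN evaluation on extracted features can be illustrated with a small sketch. The abstract does not report the value of k, the distance metric, or the feature dimensionality, so the Euclidean distance, the default `k`, and the names `knn_predict` and `accuracy` are assumptions made here for illustration only; the feature vectors below are toy values, not model outputs.

```python
import numpy as np


def knn_predict(train_feats, train_labels, query_feats, k=5):
    """Classify each query feature vector by majority vote of its k nearest
    training vectors (squared Euclidean distance, an assumed metric)."""
    diffs = query_feats[:, None, :] - train_feats[None, :, :]
    d2 = (diffs ** 2).sum(axis=2)
    # Indices of the k closest training vectors for every query.
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = train_labels[nearest]
    n_classes = int(train_labels.max()) + 1
    # Ties are broken in favour of the lowest class index.
    return np.array(
        [np.bincount(v, minlength=n_classes).argmax() for v in votes]
    )


def accuracy(pred, true):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(pred == true))


# Toy features standing in for vectors extracted from a frozen model.
train_feats = np.array(
    [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
     [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
)
train_labels = np.array([0, 0, 0, 1, 1, 1])
query_feats = np.array([[0.5, 0.5], [10.5, 10.5]])
query_labels = np.array([0, 1])

pred = knn_predict(train_feats, train_labels, query_feats, k=3)
acc = accuracy(pred, query_labels)
```

Since k-NN updates no model weights, its accuracy reflects only how well the frozen features separate the classes, which is why it serves as a probe of what the pre-trained model has learned.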
Conclusions :
The angle estimation task using 10,000 UWF images did not yield a pre-trained model with high generalization performance that could be adapted to downstream tasks in fundus imaging. However, the experiments using k-NN on features extracted from the same model suggest that the model may have learned some features from the UWF images. It may be possible to generate a pre-trained model with better performance by changing the learning conditions (model architectures or hyperparameters) or the pre-training task (tasks other than angle estimation).
This abstract was presented at the 2024 ARVO Annual Meeting, held in Seattle, WA, May 5-9, 2024.