Abstract
Purpose :
Machine learning depends heavily on large datasets, which increases the resources required to use it. In addition, these algorithms are often black-box systems whose behaviour is unexplained.
The aim of this study was to develop a machine learning training methodology, which we have termed interpretable staged transfer learning (iSTL), that achieves accurate, explainable classification of OCT scans from a small sample of images (Figure 1).
Methods :
The iSTL classification algorithm was trained to identify normal, diabetic retinopathy (DR), central serous retinopathy (CSR) and macular hole (MH) OCT images from a target training dataset of 50 images per class. To achieve this, the algorithm was initialised with source ImageNet pretrained weights and then trained on a large bridge dataset of OCT images from a separate open-source dataset. Bridge training reduces the domain difference between the source and target datasets. The algorithm was then trained on the target dataset, which was synthetically expanded using data augmentation at a ratio of 4:1. Data augmentation applies random image transformations to generate new disease appearances from existing data. iSTL was compared, on unseen target images, with algorithms trained using traditional direct transfer learning (DTL), which initialises with ImageNet weights only. Attention maps were generated using SHapley Additive exPlanations (SHAP).
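The augmentation step described above can be illustrated with a minimal sketch. It is not the study's pipeline: the abstract does not list the transformations used, so the flip, shift and brightness changes below are assumptions, as is the reading of "4:1" as four synthetic images per original. The function names `augment` and `expand_dataset` are hypothetical, and images are represented as plain lists of pixel rows with values in [0, 1].

```python
import random


def augment(image, rng):
    """Return one randomly transformed copy of a 2D grayscale image.

    `image` is a list of rows, each a list of floats in [0, 1].
    The transformations are illustrative assumptions, not the study's.
    """
    width = len(image[0])
    out = [row[:] for row in image]

    # Random horizontal flip (assumed acceptable for OCT B-scans).
    if rng.random() < 0.5:
        out = [row[::-1] for row in out]

    # Random horizontal shift of up to ~10% of the width, zero padded.
    max_shift = max(1, width // 10)
    shift = rng.randint(-max_shift, max_shift)
    if shift > 0:
        out = [[0.0] * shift + row[:width - shift] for row in out]
    elif shift < 0:
        out = [row[-shift:] + [0.0] * (-shift) for row in out]

    # Random brightness gain, clipped back to the valid range.
    gain = rng.uniform(0.8, 1.2)
    return [[min(1.0, max(0.0, p * gain)) for p in row] for row in out]


def expand_dataset(images, labels, copies_per_image=4, seed=0):
    """Keep every original image and add `copies_per_image` augmented copies.

    With copies_per_image=4 this is one possible reading of a 4:1 ratio.
    """
    rng = random.Random(seed)
    out_images, out_labels = list(images), list(labels)
    for image, label in zip(images, labels):
        for _ in range(copies_per_image):
            out_images.append(augment(image, rng))
            out_labels.append(label)
    return out_images, out_labels
```

Under these assumptions, a target set of 50 images in each of four classes (200 images) would grow to 1000 training images, with the class balance unchanged.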
Results :
On unseen data, the best iSTL model achieved higher overall accuracy (0.94), mean specificity (0.93), sensitivity (0.93) and F1-score (0.93) than DTL (0.86, 0.84, 0.80 and 0.80, respectively) (Table 1).
Attention maps showed that iSTL attended more finely to pathologically important areas. DTL exhibited wider attention across each image, frequently attributing importance to clinically insignificant areas (Figure 2).
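For readers who want to reproduce summary metrics of the kind reported above, the sketch below computes overall accuracy and class-averaged sensitivity, specificity and F1-score from a multi-class confusion matrix using a one-vs-rest breakdown. The abstract does not state how its means were calculated, so unweighted (macro) averaging is an assumption, and the function name `summary_metrics` is hypothetical.

```python
def summary_metrics(confusion):
    """Compute accuracy and macro-averaged metrics from a confusion matrix.

    `confusion[i][j]` is the number of images of true class i that were
    predicted as class j. Macro averaging is an assumption of this sketch.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))

    sens, spec, f1 = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                       # class k missed
        fp = sum(confusion[r][k] for r in range(n)) - tp  # others called k
        tn = total - tp - fn - fp

        sens.append(tp / (tp + fn) if tp + fn else 0.0)
        spec.append(tn / (tn + fp) if tn + fp else 0.0)
        denom = 2 * tp + fp + fn
        f1.append(2 * tp / denom if denom else 0.0)

    return {
        "accuracy": correct / total,
        "mean_sensitivity": sum(sens) / n,
        "mean_specificity": sum(spec) / n,
        "mean_f1": sum(f1) / n,
    }


# Hypothetical 4-class counts (normal, DR, CSR, MH); not the study's data.
toy_confusion = [
    [9, 1, 0, 0],
    [0, 10, 0, 0],
    [1, 0, 8, 1],
    [0, 0, 0, 10],
]
toy_metrics = summary_metrics(toy_confusion)
```

Each class is scored against all others in turn, so a model that performs well on three classes but poorly on a fourth is penalised in the means even when overall accuracy stays high.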
Conclusions :
Our results show that iSTL scores higher than DTL on unseen data, demonstrating the effectiveness of our performance-boosting methods with small sample sizes. To avoid a black-box system and to be able to act on results, clinicians must be able to interpret an algorithm's predictions. Attention maps show that iSTL uses clinical features, not uninterpretable abstractions, to make predictions. The sample size used could be gathered in clinical outpatient settings, allowing small-scale research to be carried out without significant resources.
This is a 2021 ARVO Annual Meeting abstract.