Abstract
Purpose :
Recent machine learning models have made significant progress in the automatic diagnosis of glaucoma from fundus images. However, these models are often called “black boxes” because their results lack explainability, which hinders clinical adoption. This study proposes a novel approach that enhances medical explainability by using a multi-task deep learning model for glaucoma diagnosis.
Methods :
Unlike conventional deep learning models that perform the single task of image-level glaucoma classification, the multi-task learning (MTL) model simultaneously performs three tasks:
1. Image-level glaucoma classification
2. Pixel-level segmentation of the optic cup (OC) and optic disc (OD) to derive the vertical cup-to-disc ratio (vCDR) biomarker
3. Image-level classification of peripapillary atrophy (PPA).
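As an illustration of how the vCDR in task 2 could be derived from the two segmentation outputs, the sketch below takes the ratio of the vertical extent of the cup mask to that of the disc mask. This is a minimal sketch, not the study's implementation: the function names, the binary-mask representation (lists of rows), and the assumption that image rows run along the vertical axis of the eye are all hypothetical.

```python
def vertical_extent(mask):
    """Number of image rows spanned by foreground pixels (0 if the mask is empty)."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    return rows[-1] - rows[0] + 1 if rows else 0


def vertical_cup_to_disc_ratio(cup_mask, disc_mask):
    """vCDR = vertical cup diameter / vertical disc diameter (both in pixels)."""
    disc_height = vertical_extent(disc_mask)
    if disc_height == 0:
        raise ValueError("disc mask is empty; vCDR is undefined")
    return vertical_extent(cup_mask) / disc_height


# Toy 6x4 masks (assumed data, for illustration only).
disc = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
cup = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

vcdr = vertical_cup_to_disc_ratio(cup, disc)  # 2 cup rows / 4 disc rows = 0.5
```

In practice the masks would come from thresholding the model's pixel-level predictions, and some post-processing (for example, keeping only the largest connected region) would likely be needed before measuring extents.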
The proposed model is evaluated on the Retinal Fundus Glaucoma Challenge (REFUGE) database of 1,200 retinal fundus images, supplemented by PPA annotations provided by a glaucoma specialist. The images were split into training, validation, and test sets of 400 each.
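The abstract does not state how the three tasks are combined during training. One common choice in multi-task learning is to minimize a weighted sum of per-task losses, and the sketch below shows that idea under stated assumptions: binary cross-entropy for the two classification tasks, a soft Dice loss for each segmentation target, and equal task weights. None of these choices, nor the function names, are taken from the study.

```python
import math


def binary_cross_entropy(prob, label, eps=1e-7):
    """Cross-entropy for one predicted probability and a 0/1 label."""
    p = min(max(prob, eps), 1 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))


def soft_dice_loss(pred, target, eps=1e-7):
    """1 - Dice, computed on flattened per-pixel probabilities and 0/1 targets."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return 1 - (2 * intersection + eps) / (sum(pred) + sum(target) + eps)


def multitask_loss(glaucoma, ppa, cup, disc, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three task losses (weights are an assumption).

    glaucoma, ppa: (predicted probability, 0/1 label)
    cup, disc:     (flattened predicted pixels, flattened target pixels)
    """
    w_glaucoma, w_seg, w_ppa = weights
    seg_loss = soft_dice_loss(*cup) + soft_dice_loss(*disc)
    return (
        w_glaucoma * binary_cross_entropy(*glaucoma)
        + w_seg * seg_loss
        + w_ppa * binary_cross_entropy(*ppa)
    )


# Toy example: uncertain classifiers, perfect segmentations.
loss = multitask_loss(
    glaucoma=(0.5, 1),
    ppa=(0.5, 0),
    cup=([1.0, 0.0], [1, 0]),
    disc=([1.0, 1.0], [1, 1]),
)
```

Because all three terms are computed from one shared network in such a setup, gradients from the segmentation and PPA tasks also shape the features used for glaucoma classification.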
Results :
The MTL model produces three outputs: a “yes or no” glaucoma diagnosis and two biomarkers (vCDR and PPA) that can explain and support the diagnosis. Regression analysis shows that vCDR and PPA explain 45% of the variation in glaucoma diagnosis. Moreover, by sharing feature representations learned across tasks, the MTL model outperforms the baseline single-task learning (STL) model on all performance metrics for all tasks. For glaucoma classification, area under the curve (AUC), sensitivity, specificity, and accuracy are 94.7%, 67.5%, 96.4%, and 93.5%, respectively, for MTL, compared with 88.3%, 40.0%, 92.5%, and 87.3%, respectively, for STL. Results are consistent for PPA classification. The MTL model also outperforms STL on the Dice similarity coefficient (DSC) for the segmentation tasks: for OC, DSC is 85.8% for MTL and 85.4% for STL; for OD, DSC is 82.2% for MTL and 79.8% for STL.
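For readers less familiar with the metrics reported above, the sketch below shows the standard definitions of the Dice similarity coefficient for binary masks and of sensitivity, specificity, and accuracy for binary classification. The function names and toy inputs are hypothetical and are not data from the study.

```python
def dice_coefficient(pred, target):
    """DSC = 2|A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    return 1.0 if total == 0 else 2 * intersection / total


def _ratio(numerator, denominator):
    """Safe division; returns NaN when a class is absent."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(predictions, labels):
    """Sensitivity, specificity, and accuracy from 0/1 predictions and labels."""
    pairs = list(zip(predictions, labels))
    tp = sum(1 for p, y in pairs if p and y)
    tn = sum(1 for p, y in pairs if not p and not y)
    fp = sum(1 for p, y in pairs if p and not y)
    fn = sum(1 for p, y in pairs if not p and y)
    return {
        "sensitivity": _ratio(tp, tp + fn),  # true positive rate
        "specificity": _ratio(tn, tn + fp),  # true negative rate
        "accuracy": _ratio(tp + tn, len(pairs)),
    }


# Toy inputs (assumed, for illustration only).
dsc = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])
metrics = classification_metrics([1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 0])
```

Sensitivity and specificity depend on the decision threshold applied to the model's predicted probability, whereas AUC summarizes performance across all thresholds, which is one reason a model can have a high AUC alongside a modest sensitivity.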
Conclusions :
This study introduces a multi-task deep learning model for glaucoma diagnosis. Unlike single-task learning, which generates only a “yes or no” diagnosis with no explainability, the MTL model produces relevant biomarkers that help support and validate the diagnosis. This novel approach helps address the AI “black box” problem and thus contributes toward increasing trust in AI in clinical practice. The MTL model also improves performance over the STL model.
This abstract was presented at the 2024 ARVO Annual Meeting, held in Seattle, WA, May 5-9, 2024.