Abstract
Purpose :
Machine learning (ML) methods hold promise for clinical data analysis, and many options exist, each based on different assumptions and algorithms. When choosing an ML method, however, it is vital to consider how that choice affects analysis outcomes, given the nature of the study and its datasets. Here, we illustrate this by using four ML methods to transfer open-angle glaucoma (OAG) information from a progression study to a cross-sectional one.
Methods :
The Fuzzy C-Means (FCM) algorithm had previously identified 3 clusters of OAG eyes at a higher risk of progression in the Indianapolis Glaucoma Progression Study (IGPS) [Fig. 1a]. Using those clusters as a foundation, a dataset of 134 OAG eyes from a cross-sectional study conducted at Mount Sinai (MSSM) was similarly clustered. Four different ML techniques were used to transfer knowledge gained from the IGPS clusters to MSSM [Fig. 1]: Support Vector Machine (SVM) [1b]; Gradient Boosted Decision Trees (GBDT) [1c]; induced soft partitioning [1d][1e]; and warm start [1f]. Fig. 1d, 1e, and 1f show how the soft partition was hardened to obtain crisp cluster labels. For each scenario, structural markers of eyes in the 3 clusters were compared for significant differences using a Kruskal-Wallis test. These markers included cup-to-disk ratios (C/D ratios) and retinal nerve fiber layer (RNFL) thickness obtained using optical coherence tomography angiography (OCTA).
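As a rough illustration of the induced soft partitioning and hardening steps, the sketch below computes standard FCM memberships of a new eye with respect to fixed cluster centers and then assigns the eye to its highest-membership cluster. The centers, feature values, fuzzifier m = 2, and function names are illustrative assumptions, not values or code from IGPS or MSSM, and the study's actual implementation may differ.

```python
import math

def induced_memberships(x, centers, m=2.0):
    """Standard FCM membership of sample x in each fixed cluster center.

    m is the fuzzifier (m > 1); m = 2.0 is an assumed, commonly used default.
    """
    dists = [math.dist(x, c) for c in centers]
    # A sample lying exactly on one or more centers is shared equally among them.
    zero = [d == 0.0 for d in dists]
    if any(zero):
        return [1.0 / sum(zero) if z else 0.0 for z in zero]
    power = 2.0 / (m - 1.0)
    inverse = [d ** -power for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]

def harden(memberships):
    """Return the index of the cluster with the largest membership."""
    return max(range(len(memberships)), key=memberships.__getitem__)

# Hypothetical centers in a two-feature space (toy numbers, not study data).
toy_centers = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
toy_eye = (1.0, 0.0)

u = induced_memberships(toy_eye, toy_centers)  # soft partition row for one eye
label = harden(u)                              # crisp cluster label
print(u, label)
```

Because the centers stay fixed, this only projects new eyes onto the existing partition. A warm start, by contrast, would presumably use such centers to initialize a fresh FCM run on the new data and let them move.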
Results :
Average (A), vertical (V), and horizontal (H) C/D ratios were significantly different across the clusters transferred via GBDT [Table 1]. The same pattern was seen with the SVM and induced soft partitioning methods, although to a lesser extent. No significant differences in C/D ratios were found with the warm start method. Differences in average (A), superior (S), and inferior (I) RNFL thicknesses were not significant across clusters for any of the four ML methods.
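For reference, the cluster comparisons above rest on the Kruskal-Wallis test. A minimal pure-Python sketch for exactly three groups follows, assuming the usual chi-square approximation; with three groups the statistic has 2 degrees of freedom, so the p-value reduces to exp(-H/2). The function name and marker values are hypothetical toy numbers, not study data, and a statistics package would normally be used in practice.

```python
import math

def kruskal_wallis_three_groups(groups):
    """Kruskal-Wallis H (tie-corrected) and approximate p-value for 3 groups."""
    if len(groups) != 3:
        raise ValueError("this sketch handles exactly three groups")
    # Pool all values, remembering which group each one came from.
    pooled = sorted((v, g) for g, vals in enumerate(groups) for v in vals)
    n = len(pooled)
    rank_sums = [0.0, 0.0, 0.0]
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0   # mean of ranks i+1 .. j for tied values
        t = j - i
        tie_term += t ** 3 - t
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        i = j
    h = 12.0 / (n * (n + 1)) * sum(
        r * r / len(vals) for r, vals in zip(rank_sums, groups)
    ) - 3.0 * (n + 1)
    correction = 1.0 - tie_term / (n ** 3 - n)
    if correction == 0.0:
        return 0.0, 1.0                # all values identical: no evidence of difference
    h /= correction
    p = math.exp(-h / 2.0)             # chi-square survival function with df = 2
    return h, p

# Toy marker values for three clusters (hypothetical, not study data).
toy_marker = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_stat, p_value = kruskal_wallis_three_groups(toy_marker)
print(h_stat, p_value)
```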
Conclusions :
Different ML methods of transferring clusters between IGPS and MSSM produced results with varying degrees of clinical significance. This study demonstrates that while ML is a powerful tool that could benefit the study of complex diseases such as OAG, it is important to consider both the nature of the data and the effect of the chosen method on data analysis outcomes.
This abstract was presented at the 2024 ARVO Annual Meeting, held in Seattle, WA, May 5-9, 2024.