Abstract
Purpose :
Existing foundation models (FMs) have been largely limited to a single input modality per decision, which poses challenges in common clinical scenarios where disease diagnosis relies on combined information from multiple modalities. We propose EyeNET, a proof-of-concept multimodal FM for detecting ocular and systemic disease from ocular imaging, designed to achieve cross-modality diagnosis and to be adaptable to newly developed modalities in the future.
Methods :
We pretrained EyeNET on unlabeled ocular images, including color fundus photographs [CFP], optical coherence tomography scans [OCT], ultra-widefield fundus photographs [UWF] and external eye photos, to build an FM. Validation was performed in multi-ethnic participants with labeled CFP, OCT, UWF, external eye photos and OCT angiography [OCTA] from healthy people and patients with ocular or systemic diseases. For cross-modality diagnosis, EyeNET was trained on CFP paired with ground-truth labels derived from OCT. For transferability to a new modality, EyeNET was validated on OCTA, which was not seen during pretraining. For integrated multimodal diagnosis, paired data, such as CFP and OCT or CFP and external eye photos, are input to EyeNET to predict systemic diseases.
Results :
In total, 1.5 million CFP, 0.5 million OCT, 0.5 million UWF and 0.5 million external eye photos from multi-center cohorts were included in the pretraining of EyeNET, including 0.3 million paired CFP and OCT images and 50,000 paired CFP and external eye photos. For cross-modality diagnosis of macular epiretinal membrane with CFP as input, EyeNET (AUROC 0.88) outperformed both the supervised-learning model (AUROC 0.83, p<0.01) and RETFound (AUROC 0.87, p<0.01). For transferability to an unseen modality with OCTA input, EyeNET showed the best performance in diagnosing late age-related macular degeneration (AMD) (AUROC 0.66), significantly outperforming the supervised-learning model (AUROC 0.64, p<0.05) and RETFound (AUROC 0.55 for the CFP model, p<0.01; AUROC 0.50 for the OCT model, p<0.01). Other tasks, including integrated diagnosis of systemic diseases, are still under validation.
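The AUROC values reported above can be read as the probability that a randomly chosen diseased case receives a higher model score than a randomly chosen non-diseased case. A minimal sketch of that computation follows. The function name and the toy labels and scores are hypothetical illustrations, not the study's data or evaluation code, and the abstract does not state which statistical test produced the p-values.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts, over all (positive, negative) pairs, how often the positive
    case has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: 1 = disease present, 0 = disease absent.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
print(round(auroc(labels, scores), 3))
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the figures above should be read.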
Conclusions :
As a proof-of-concept multimodal FM, EyeNET demonstrated cross-modality diagnosis and transferability to an unseen modality, supporting its potential for risk stratification in primary care centers and for adaptation to newly developed modalities in the future. Although its ability to support integrated prediction of systemic diseases is still under validation, EyeNET is expected to improve the cost-efficiency of clinical FMs.
This abstract was presented at the 2024 ARVO Annual Meeting, held in Seattle, WA, May 5-9, 2024.