Investigative Ophthalmology & Visual Science
June 2024
Volume 65, Issue 7
Open Access
ARVO Annual Meeting Abstract  |   June 2024
Enriching AI-based Predictive Models from Retinal Imaging by Multi-Modal Contrastive Pre-training
Author Affiliations & Notes
  • Emese Sükei
    Department of Ophthalmology, Medizinische Universität Wien, Wien, Austria
  • Sophie Riedl
    Department of Ophthalmology, Medizinische Universität Wien, Wien, Austria
  • Elisabeth Rumetshofer
    Institute for Machine Learning, Johannes Kepler Universität Linz, Linz, Austria
  • Niklas Schmidinger
    Institute for Machine Learning, Johannes Kepler Universität Linz, Linz, Austria
  • Andreas Mayr
    Institute for Machine Learning, Johannes Kepler Universität Linz, Linz, Austria
  • Ursula Schmidt-Erfurth
    Department of Ophthalmology, Medizinische Universität Wien, Wien, Austria
  • Günter Klambauer
    Institute for Machine Learning, Johannes Kepler Universität Linz, Linz, Austria
  • Hrvoje Bogunovic
    Department of Ophthalmology, Medizinische Universität Wien, Wien, Austria
  • Footnotes
    Commercial Relationships: Emese Sükei None; Sophie Riedl None; Elisabeth Rumetshofer None; Niklas Schmidinger None; Andreas Mayr None; Ursula Schmidt-Erfurth Apellis Pharmaceuticals, Bayer, EcoR1, AbbVie, Medscape, Johnson & Johnson, Allergan, Roche, Böhringer, Heidelberg, Novartis, Galimedix, Code C (Consultant/Contractor), Genentech, Heidelberg Engineering, Kodiak, Novartis, Roche, RetInSight, Apellis Pharmaceuticals, Code F (Financial Support), AbbVie, Apellis, Roche, Code R (Recipient); Günter Klambauer None; Hrvoje Bogunovic Apellis, Heidelberg Engineering, Code F (Financial Support)
    Support: Austrian Science Fund (FWF): FG 9 Forschungsgruppen
Investigative Ophthalmology & Visual Science June 2024, Vol.65, 450. doi:
      Emese Sükei, Sophie Riedl, Elisabeth Rumetshofer, Niklas Schmidinger, Andreas Mayr, Ursula Schmidt-Erfurth, Günter Klambauer, Hrvoje Bogunovic; Enriching AI-based Predictive Models from Retinal Imaging by Multi-Modal Contrastive Pre-training. Invest. Ophthalmol. Vis. Sci. 2024;65(7):450.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Purpose: Self-supervised pre-training has been shown to yield deep learning (DL) models with high data efficiency and strong generalization. Retinal imaging has untapped potential to exploit such approaches by leveraging matched multimodal data in the form of 2D fundus photography or near-infrared reflectance (NIR) imaging and 3D spectral-domain optical coherence tomography (SD-OCT) scans. We explore multimodal pre-training to enhance DL models on downstream predictive tasks: disease classification, structure-function prediction, and treatment forecasting.

Methods: We propose a multi-modal contrastive pre-training method for retinal imaging (Fig. 1a). For pre-training, we used extensive longitudinal reading center data comprising 153,306 pairs of OCT volumes and corresponding fundus photography (Topcon) or NIR (Spectralis & Cirrus) images from 3,790 patients with neovascular age-related macular degeneration. Pre-training aimed to bring similar instances (fundus and OCT from the same eye) close together in the latent space and push dissimilar instances apart, fostering meaningful embeddings. Linear predictive models were built on the pre-trained encoder blocks (Fig. 1b) and trained on external HARBOR clinical trial data to predict visual acuity, fluid presence, and high treatment need, and on a mixed-disease dataset of clinical trial baseline scans for retinal disease screening.
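
The pull-together, push-apart objective described in the Methods can be illustrated with a minimal sketch. It assumes a CLIP-style symmetric InfoNCE loss with cosine similarity and a fixed temperature; the abstract does not name the exact loss, so this choice, the function names, and the toy embeddings are all illustrative assumptions rather than the authors' implementation.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def log_softmax_at(logits, target):
    """Log-probability of index `target` under a softmax over `logits`."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return logits[target] - log_z

def symmetric_info_nce(fundus_emb, oct_emb, temperature=0.1):
    """Assumed CLIP-style loss: entry i of each list comes from the same eye."""
    f = [l2_normalize(v) for v in fundus_emb]
    o = [l2_normalize(v) for v in oct_emb]
    n = len(f)
    # sim[i][j]: cosine similarity of fundus i and OCT j, divided by temperature
    sim = [[sum(a * b for a, b in zip(f[i], o[j])) / temperature
            for j in range(n)] for i in range(n)]
    # Fundus -> OCT direction: each fundus image should select its own OCT volume
    loss_f2o = -sum(log_softmax_at(sim[i], i) for i in range(n)) / n
    # OCT -> fundus direction: the same criterion applied over columns
    loss_o2f = -sum(log_softmax_at([sim[i][j] for i in range(n)], j)
                    for j in range(n)) / n
    return 0.5 * (loss_f2o + loss_o2f)

# Toy 2D embeddings (hypothetical values, not data from the study)
aligned_fundus = [[1.0, 0.0], [0.0, 1.0]]
aligned_oct = [[0.9, 0.1], [0.1, 0.9]]    # each volume close to its fundus partner
shuffled_oct = [[0.1, 0.9], [0.9, 0.1]]   # pairs deliberately mismatched

loss_matched = symmetric_info_nce(aligned_fundus, aligned_oct)
loss_mismatched = symmetric_info_nce(aligned_fundus, shuffled_oct)
```

In this toy setting the loss is lower when each fundus embedding lies next to the OCT embedding of the same eye than when the pairs are shuffled, which is the behaviour the pre-training objective rewards.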

Results: The models built on multi-modal contrastive pre-trained encoders outperformed the fully supervised ones across all downstream tasks (Tab. 1), confirming the efficacy of capturing relevant biomarkers through pre-training. Notably, the pre-training also improved fundus-based prediction performance. Exploring the adaptability of our approach, we observed minimal performance decay (≤20%) when swapping imaging modalities at prediction time after training the model on OCT, underscoring its robustness and the possibility of leveraging the close mapping of image-volume pairs in the latent space.
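
The modality-swap experiment can be pictured with a small sketch: a linear probe is fitted on frozen OCT embeddings and then applied unchanged to fundus embeddings of the same eyes. Everything here is a hypothetical stand-in, including the synthetic 2D embeddings, the noise levels, the binary label, and the plain gradient-descent logistic regression; none of it reproduces the study's encoders, data, or reported numbers.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_linear_probe(embeddings, labels, lr=0.5, epochs=200):
    """Fit logistic regression on frozen embeddings with batch gradient descent."""
    dim = len(embeddings[0])
    n = len(embeddings)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(embeddings, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for k in range(dim):
                grad_w[k] += err * x[k] / n
            grad_b += err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def accuracy(w, b, embeddings, labels):
    """Fraction of samples whose thresholded linear score matches the label."""
    preds = [1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
             for x in embeddings]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

rng = random.Random(0)  # fixed seed so the toy data are reproducible
labels = [i % 2 for i in range(40)]  # hypothetical binary target, e.g. fluid present

# Hypothetical frozen OCT embeddings: the classes separate along the first axis
oct_emb = [[(1.0 if y else -1.0) + rng.gauss(0, 0.3), rng.gauss(0, 0.3)]
           for y in labels]
# Hypothetical fundus embeddings of the same eyes: near their OCT partner,
# mimicking the close image-volume mapping that contrastive pre-training targets
fundus_emb = [[x + rng.gauss(0, 0.2) for x in v] for v in oct_emb]

w, b = train_linear_probe(oct_emb, labels)         # trained on OCT only
acc_oct = accuracy(w, b, oct_emb, labels)          # same-modality evaluation
acc_fundus = accuracy(w, b, fundus_emb, labels)    # swapped-modality evaluation
```

Because the fundus embeddings sit close to their OCT partners in this toy latent space, the OCT-trained probe transfers with little loss; with unaligned encoders the same swap would be expected to fail.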

Conclusions: This study underscores the capacity of multi-modal contrastive pre-training to harness extensive unlabeled data, presenting a promising starting point for image interpretation tasks in retinal research and clinical care. Furthermore, by enhancing 2D fundus representations, our simple yet effective method may serve in (pre-)clinical settings where access to OCT is limited.

This abstract was presented at the 2024 ARVO Annual Meeting, held in Seattle, WA, May 5-9, 2024.