Robin Hirt, Christian Kungel, Caroline Dieterich, Gary C Lee, Dominik Fischer, Niranchana Manivannan, Aditya Nair, Hugang Ren, Sophia Yu, Alexander Urich; Evaluation of Federated Learning for OCT B-scan classification. Invest. Ophthalmol. Vis. Sci. 2021;62(8):2103.
© ARVO (1962-2015); The Authors (2016-present)
Building robust deep learning-based models requires large quantities of diverse training data. Due to medical data privacy regulations, it is often infeasible to collect sensitive patient data in a centralized data lake. Federated Learning (FL) sidesteps this difficulty: the data stay with their owners, and only intermediate model training updates are shared among the participating sites. In this study, we evaluate the effectiveness of FL in comparison to models trained on isolated and on centralized data.
76,544 OCT B-scans from 598 macular cubes acquired from 598 subjects using CIRRUS™ HD-OCT 5000 (ZEISS, Dublin, CA) were used as the training and test sets (478/120 split at cube level). 40% of B-scans were graded as “abnormal” by at least 1 of 2 retinal specialists. A real-world scenario was simulated by splitting the training set into 3 regions of origin: US (Region A), Europe (Region B), and Asia (Region C). For validation, the global test set consisted of 15,338 B-scans (38% “abnormal”) from the 120 cubes. A ResNet50 was used to perform binary classification at the B-scan level. To evaluate the FL approach against meaningful baselines, we additionally trained isolated models on the regional data sets and one global model on the complete data set, all sharing the same ANN settings and trained for 50 epochs. All models were tested on the global test set. For applying FL, the decentralized ML framework mlx (prenode, Karlsruhe, Germany) was used (see Fig. 1). In each iteration, the model was trained locally for one epoch per region before the updated model weights (Δw) were centralized for aggregation. This cycle was repeated for 100 iterations.
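The federated cycle described above (one local epoch per region, then central aggregation of the weight updates Δw) can be illustrated with a minimal sketch. The abstract does not state the aggregation rule used by mlx, so the sample-size-weighted mean of the updates below (in the style of federated averaging) is an assumption. The logistic-regression stand-in for the ResNet50, the synthetic data, and all function names (`local_update`, `aggregate`, `federated_training`) are hypothetical and chosen only to keep the example self-contained.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1):
    """One full-batch gradient step on a logistic-regression stand-in.

    Plays the role of "one local epoch" and returns only the update Δw,
    so the regional data (X, y) never leave this function.
    """
    probs = 1.0 / (1.0 + np.exp(-(X @ weights)))
    grad = X.T @ (probs - y) / len(y)
    new_weights = weights - lr * grad
    return new_weights - weights  # Δw

def aggregate(global_weights, deltas, n_samples):
    """Apply the sample-size-weighted mean of the regional updates.

    ASSUMPTION: FedAvg-style weighting; the abstract does not specify
    how the updates were aggregated.
    """
    total = float(sum(n_samples))
    avg_delta = sum(d * (n / total) for d, n in zip(deltas, n_samples))
    return global_weights + avg_delta

def federated_training(regions, n_features, rounds=100):
    """Repeat the train-locally-then-aggregate cycle for `rounds` iterations."""
    w = np.zeros(n_features)
    for _ in range(rounds):
        deltas = [local_update(w, X, y) for X, y in regions]
        sizes = [len(y) for _, y in regions]
        w = aggregate(w, deltas, sizes)
    return w

# Synthetic stand-ins for three regional data sets (hypothetical data).
rng = np.random.default_rng(0)
true_w = np.array([1.5, -2.0, 0.5])
regions = []
for n in (200, 120, 80):
    X = rng.normal(size=(n, 3))
    y = (X @ true_w > 0).astype(float)
    regions.append((X, y))

w_fed = federated_training(regions, n_features=3, rounds=100)
```

Only the updates Δw cross the boundary between `local_update` and `aggregate`; this is the property that lets each region keep its B-scans local.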
In Table 1, we show the performance (ACC, AUC, and 95% CI in %) of the locally best models (selected by best validation score on the global test set) using local training data alone as well as after federated learning. For Regions B and C, a 7.1% relative improvement in accuracy can be observed when the federated model is applied.
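For readers unfamiliar with the term, a relative improvement is the accuracy gain divided by the baseline accuracy, not the difference in percentage points. A minimal sketch follows; the function name and the example accuracies are hypothetical and are not taken from Table 1.

```python
def relative_improvement(baseline, improved):
    """Relative change of `improved` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline * 100.0

# Hypothetical accuracies, for illustration only (not values from Table 1):
# a rise from 0.80 to 0.84 is 4 percentage points, or about 5% relative.
gain = relative_improvement(0.80, 0.84)
```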
Our experimental results show that FL in a simulated real-world scenario can both ensure a higher level of security and trust between data owners and increase the generalizability of the model across regions. In addition, we observe no difference in performance between the federated model and a model trained directly on centralized data. Thus, we show the potential of FL to replace conventional data centralization, preserving confidentiality while still developing robust models.
This is a 2021 ARVO Annual Meeting abstract.