Abstract
Purpose :
Despite advances in globe-preserving treatments, improvements in retinoblastoma (RB) outcomes are inconsistent across income levels and geographical locations. Both socio-economic factors and healthcare structure (SEH) have been associated with outcomes. In addition, some regions have scarce literature coverage, which hinders understanding of the global landscape. We aimed to use machine learning to categorize countries by care capacity, considering both socio-economic background and clinical services, and to identify differences in research focus.
Methods :
We used hierarchical clustering to derive capacity strata from 26 SEH variables from the World Bank and 10 clinical factors (e.g., treatment modalities) covering 129 countries. We collected literature published after Jan 1, 2001 from PubMed, Embase, ScienceDirect, Web of Science, and Global Index Medicus using the search term ‘retinoblastoma’. A BERT model was trained to filter the literature by relevance. Topic modelling was performed with a hierarchical Dirichlet process (HDP). Abstracts and publication information were geotagged with the relevant locations.
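As an illustration of the clustering step, the following is a minimal sketch and not the study's actual pipeline. It assumes Ward linkage on z-scored features and a cut into four clusters; the abstract does not state the linkage method or the standardization, the function name `cluster_countries` is hypothetical, and the feature matrix is synthetic rather than the World Bank or clinical data.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def cluster_countries(features, n_clusters):
    """Agglomerative clustering of rows (countries) on standardized columns."""
    mu = features.mean(axis=0)
    sd = features.std(axis=0)
    sd[sd == 0] = 1.0  # guard against constant columns
    z = (features - mu) / sd
    # Ward linkage is an assumption; the source does not name the method.
    tree = linkage(z, method="ward")
    # Cut the dendrogram so that at most n_clusters flat clusters remain.
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Synthetic stand-in data: 4 well-separated groups of 30 "countries",
# each described by 36 variables (26 SEH + 10 clinical in the abstract).
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(4, 36))
features = np.vstack([c + rng.normal(size=(30, 36)) for c in centers])

labels = cluster_countries(features, n_clusters=4)
```

Subgroups within each major cluster could be obtained in the same way by cutting the same tree at a larger number of clusters.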
Results :
Four major country clusters with 10 subgroups were identified and interpreted with reference to SEH variables and clinical delivery. The BERT model attained 91.47% accuracy (sensitivity: 93.07%, specificity: 87.91%). In total, 28,347 abstracts were identified as relevant. HDP identified 52 sub-topics spanning genetics of RB, genetics of associated diseases, clinical presentation and work-up, and treatment options. While coverage of clinical presentation and work-up was fairly sufficient across the world, genetics of both RB and associated diseases was rarely discussed in parts of sub-Saharan Africa and the Middle East, and this literature was dominated by localities of European descent. Stratification by our computed strata identified slim coverage of genetics in Tier 1 and 2 countries and an even distribution in Tier 4.
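For readers less familiar with the reported classifier metrics, the sketch below shows how accuracy, sensitivity, and specificity follow from a confusion matrix. The function name `classifier_metrics` is hypothetical, and the counts are made-up round numbers for illustration only; they are not the study's validation data.

```python
def classifier_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of relevant abstracts kept
    specificity = tn / (tn + fp)  # share of irrelevant abstracts rejected
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


# Illustrative counts only (assumed, not taken from the study).
metrics = classifier_metrics(tp=90, fp=20, tn=80, fn=10)
```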
Conclusions :
Capacity strata could serve as a more comprehensive reference than income levels for targeted intervention, with implications for cross-border referral that should be considered when regional hubs are established. Our study shows that machine learning can feasibly map existing knowledge on a disease. The scant evidence on RB genetics in populations other than those of European descent should be addressed, as it has implications for the accuracy of genetic testing, the identification of subtypes, and the general understanding of RB.
This abstract was presented at the 2022 ARVO Annual Meeting, held in Denver, CO, May 1-4, 2022, and virtually.