Abstract
Purpose:
DARC (Detection of Apoptosing Retinal Cells) is a novel technique enabling the visualization of sick and apoptosing retinal cells using intravenous ANX776, a fluorescently labelled annexin A5. Previous experimental studies and the Phase 1 clinical trial have shown that the number of labelled cells, seen on retinal images captured with the Spectralis cSLO using ICG settings, can be used as a biomarker of disease. Here we examine manual observer inter-rater reliability and compare it to a CNN-aided automated algorithm.
Methods:
906 anonymised retinal images were used from patients in the Phase 2 DARC clinical trial (ISRCTN10751859), recruited after informed consent was obtained in accordance with the Declaration of Helsinki and study approval by the Brent Research Ethics Committee. Images were displayed in random order, on the same computer and under the same lighting conditions, to 5 masked observers using ImageJ® (National Institutes of Mental Health, USA). The ImageJ 'multi-point' tool was used to mark each structure in the image that an observer wished to label as an ANX776-positive spot. Manual observer spots for each image were compared with each other and with a newly developed CNN-aided algorithm.
Results:
Poor inter-rater reliability was found amongst the 5 observers, with a Krippendorff's alpha of 0.51, indicating low agreement between them. Individual observer counts were then compared to glaucoma progression using optical coherence tomography (OCT) global rates of progression (RNFL 3.5) eighteen months after assessment with DARC. Patients with a significant (p<0.05) negative slope were defined as progressing; those without were defined as stable. ROC curves were created for each observer. AUCs ranged from 0.76 to 0.80, with sensitivities of 0.71 and specificities from 0.75 to 0.92. In comparison, the CNN algorithm had an AUC of 0.89, with a sensitivity of 0.86 and a specificity of 0.92 for glaucoma progression.
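The diagnostic metrics above can be illustrated with a minimal sketch of how AUC, sensitivity, and specificity might be computed from per-eye spot counts against OCT-defined progression labels. The function names and toy counts below are hypothetical, not study data or the study's actual analysis code; the AUC uses the rank-based (Mann-Whitney) formulation, equivalent to the area under the empirical ROC curve.

```python
def auc_from_counts(counts_progressing, counts_stable):
    # Rank-based AUC: the probability that a randomly chosen progressing
    # eye has a higher spot count than a randomly chosen stable eye,
    # with ties contributing 0.5.
    wins = 0.0
    for p in counts_progressing:
        for s in counts_stable:
            if p > s:
                wins += 1.0
            elif p == s:
                wins += 0.5
    return wins / (len(counts_progressing) * len(counts_stable))

def sens_spec(counts_progressing, counts_stable, threshold):
    # Sensitivity: fraction of progressing eyes at or above the cutoff.
    # Specificity: fraction of stable eyes below the cutoff.
    tp = sum(1 for c in counts_progressing if c >= threshold)
    tn = sum(1 for c in counts_stable if c < threshold)
    return tp / len(counts_progressing), tn / len(counts_stable)

# Hypothetical spot counts for illustration only.
progressing = [8, 9, 10]
stable = [1, 2, 9]
print(auc_from_counts(progressing, stable))   # area under the ROC curve
print(sens_spec(progressing, stable, 5))      # (sensitivity, specificity)
```

Sweeping the threshold over all observed counts traces the full ROC curve; the reported per-observer sensitivities and specificities correspond to one chosen operating point on each curve.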
Conclusions:
Comparison of manual counts showed poor agreement between the 5 observers, and this was reflected in the ROC parameters. In comparison, the CNN-aided algorithm performed better and shows promise as an automated and objective biomarker.
Acknowledgements: Melanie Almonte, Paolo Bonetti, Jessica Bonetti, Benjamin Davis, Serge Miodragovic, Tim Yap, Eduardo Normando, Philip Bloom, Saad Younis, Madelein Walpert, Richard Nicholas, and all our Phase 2 subjects.
This is a 2020 ARVO Annual Meeting abstract.