Susan Ostmo, Veysi Yildiz, Peng Tian, James M Brown, J. Peter Campbell, Sang Jin Kim, Jennifer Dy, Stratis Ioannidis, Deniz Erdogmus, Robison Vernon Paul Chan, Jayashree Kalpathy-Cramer, Michael F Chiang; Artificial intelligence in retinopathy of prematurity (ROP): diagnostic performance of a supervised machine learning system (i-ROP ASSIST). Invest. Ophthalmol. Vis. Sci. 2018;59(9):2772.
© ARVO (1962-2015); The Authors (2016-present)
This study explores the use of a supervised machine learning (SML) algorithm (i-ROP ASSIST) to develop a quantitative severity scale in ROP using diagnostic labels (plus disease vs. pre-plus disease vs. normal) from multiple experts, as well as pairwise comparison rankings (i.e., selection of the image with more severe retinal vascular findings).
We developed separate training and testing sets of 100 images with a 3-level (plus, pre-plus, or normal) reference standard diagnosis (RSD). The training set included diagnostic labels from 13 experts, as well as pairwise comparisons (each image was ranked from least severe to most severe) by 5 graders. Using previously published methods, we automatically converted each image into a binary mask (Figure 1) and extracted multiple tortuosity- and dilation-related features from the masks. Using an SML approach with 5-fold (80:20) cross-validation, we trained a logistic regression model combining both expert diagnostic labels and pairwise comparison rankings from our training set to identify: (1) plus disease based on the RSD, (2) pre-plus or worse disease based on the RSD, and (3) a continuous severity score (1-100). We calculated the area under the receiver operating characteristic curve (AUC) to evaluate the performance of our method in the independent test set.
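As an illustration of how diagnostic labels and pairwise comparisons might be combined in one logistic model, the following is a minimal sketch and not the i-ROP ASSIST implementation. It assumes a binary label term (standard logistic loss) plus a Bradley-Terry style comparison term on feature differences. The function names, the mixing weight `alpha`, the learning rate, and the synthetic two-feature data are all hypothetical, and cross-validation is omitted for brevity.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_joint(X, y, pairs, alpha=0.5, lr=0.1, epochs=500):
    """Gradient descent on a mixed loss (hypothetical formulation).

    X: feature vectors; y: 0/1 labels;
    pairs: (i, j) meaning image i was judged more severe than image j;
    alpha: weight of the label loss relative to the comparison loss.
    """
    d = len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        # Label term: ordinary logistic regression gradient.
        for xi, yi in zip(X, y):
            err = sigmoid(dot(w, xi) + b) - yi
            for k in range(d):
                gw[k] += alpha * err * xi[k] / len(X)
            gb += alpha * err / len(X)
        # Comparison term: P(i more severe than j) = sigmoid(w . (x_i - x_j)).
        for i, j in pairs:
            diff = [a - c for a, c in zip(X[i], X[j])]
            err = sigmoid(dot(w, diff)) - 1.0
            for k in range(d):
                gw[k] += (1 - alpha) * err * diff[k] / len(pairs)
        w = [wk - lr * gk for wk, gk in zip(w, gw)]
        b -= lr * gb
    return w, b

def auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

random.seed(0)
# Synthetic stand-ins for tortuosity and dilation features (not real data).
severity = [random.random() for _ in range(60)]
X = [[s + random.gauss(0, 0.1), s + random.gauss(0, 0.1)] for s in severity]
y = [1 if s > 0.5 else 0 for s in severity]
pairs = []
for _ in range(100):
    i, j = random.sample(range(60), 2)
    pairs.append((i, j) if severity[i] > severity[j] else (j, i))

w, b = train_joint(X, y, pairs)
scores = [dot(w, x) for x in X]
print("AUC on synthetic data:", round(auc(scores, y), 3))
```

The linear score `dot(w, x)` serves both purposes in this sketch: thresholding it gives a classification, and ordering images by it gives a continuous severity ranking that could be rescaled to a fixed range.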
In the test set, the algorithm achieved an AUC of for detection of plus disease, and for detection of pre-plus or worse. Figure 2 displays the severity score (1-100) compared to the expert ordered ranking of disease severity for the training set (the only dataset with pairwise rankings to sort). Using a cutoff severity score of 40, the sensitivity for detection of pre-plus or worse disease was 93.9% with a specificity of 98.2%.
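For readers unfamiliar with how a score cutoff translates into sensitivity and specificity, a small sketch follows. The scores and labels are made up for illustration and are not the study data. Treating a score at or above the cutoff as a positive call is an assumption, since the abstract does not state whether the boundary is inclusive.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Return (sensitivity, specificity) when score >= cutoff is a positive call.

    labels: 1 for disease present (e.g. pre-plus or worse), 0 for absent.
    """
    tp = sum(1 for s, l in zip(scores, labels) if s >= cutoff and l == 1)
    fn = sum(1 for s, l in zip(scores, labels) if s < cutoff and l == 1)
    tn = sum(1 for s, l in zip(scores, labels) if s < cutoff and l == 0)
    fp = sum(1 for s, l in zip(scores, labels) if s >= cutoff and l == 0)
    return tp / (tp + fn), tn / (tn + fp)

# Made-up severity scores on a 1-100 scale, for illustration only.
example_scores = [12, 35, 41, 78, 90, 22, 55, 38]
example_labels = [0, 0, 1, 1, 1, 0, 1, 1]

sens, spec = sensitivity_specificity(example_scores, example_labels, cutoff=40)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Lowering the cutoff trades specificity for sensitivity, which is why a single operating point such as 40 is usually reported alongside the AUC.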
The i-ROP ASSIST SML algorithm was able to identify plus disease with high sensitivity and specificity, and to compute a quantitative severity score that correlated well with expert ranking of disease severity. Future work will use the output from the SML algorithm to develop a more refined quantitative scale that could be used in the screening, diagnosis, and management of ROP.
This is an abstract that was submitted for the 2018 ARVO Annual Meeting, held in Honolulu, Hawaii, April 29 - May 3, 2018.