M. S. Muller, A. E. Elsner, D. A. VanNasdale, V. Malinovsky, T. D. Peabody, M. Miura, A. Weber, A. Remky, K. Montealegre, N. Dolbee; Accounting for Image Grading Criteria in Minimally Trained Telemedicine Graders. Invest. Ophthalmol. Vis. Sci. 2010;51(13):3541.
Purpose: To model and compensate for differences in grader criteria when grading image sets with minimal training and without patient history.
Methods: Non-mydriatic near-infrared retinal images of 70 volunteers were taken with the Heidelberg Spectralis SLO and Indiana University's Laser Scanning Digital Camera (LSDC). Inclusion criteria were that images were taken with both devices during summer 2009 and that the patient was diabetic or had macular pathology in at least one eye documented in available eye exam charts. These criteria yielded 28 study subjects, with an average age of 61±12 years. The pathology in each subject eye was categorized by 27 common retinal lesions, which were then classified into 1 of 4 referral periods for the next recommended eye exam. Five graders familiar with Spectralis images marked the perceived lesions (forced choice), with randomized image sets grouped by eye and camera type. Grading was performed on macula-centered images taken with both the Spectralis (one 30° field) and the LSDC (one 36° field); a 3-field LSDC set was then graded at the end of the session. Based on the patient charts and referral classification, 8, 11, 7, and 30 eyes required referrals of 2-4 days, 1 month, 3 months, and 1 year, respectively. Graders viewed minimal sample images prior to grading, and the image sets included some images with subtle lesions and images of inferior quality. These factors increased referral disagreement between the graders and the charts and revealed grader preferences toward some pathology categories. These preferential grading criteria were compensated for by calculating the positive predictive value (PPV) for each category from the Spectralis and chart grading data. A 100-iteration Monte Carlo simulation overturned some pathology decisions with probability determined by the PPV and reclassified the referrals.
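The PPV-based Monte Carlo compensation described above can be sketched in a few lines of Python. This is a minimal illustration of the general technique, not the study's actual procedure: the lesion category names, PPV values, and referral-period mapping below are hypothetical placeholders, and the study's precise reclassification rule is not specified in the abstract. The sketch assumes each marked lesion is retained with probability equal to its category's PPV, and the eye's referral is the most urgent period among retained lesions.

```python
import random

# Hypothetical lesion categories mapped to referral urgency in days
# (smaller = more urgent); the study used 27 categories and 4 periods.
REFERRAL_DAYS = {"hemorrhage": 3, "edema": 30, "drusen": 90, "none": 365}

def ppv(true_pos, false_pos):
    """Positive predictive value for one lesion category:
    marks confirmed by the chart / all marks made by the grader."""
    total = true_pos + false_pos
    return true_pos / total if total else 0.0

def compensate(marked_lesions, ppv_by_category, n_iter=100, seed=0):
    """Monte Carlo reclassification: in each iteration, every marked
    lesion is retained with probability equal to its category's PPV,
    and the eye's referral is the most urgent period among retained
    lesions. Returns the mean referral period (days) over n_iter runs."""
    rng = random.Random(seed)
    total_days = 0
    for _ in range(n_iter):
        retained = [l for l in marked_lesions
                    if rng.random() < ppv_by_category[l]]
        total_days += min((REFERRAL_DAYS[l] for l in retained),
                          default=REFERRAL_DAYS["none"])
    return total_days / n_iter
```

A grader who frequently over-calls a low-PPV category thus has those marks probabilistically discounted, pulling the simulated referral toward the chart-based classification.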
Results: After compensation using the Spectralis data, the overall referral agreement between the graders and charts, measured by the mean κ value, showed a statistically significant improvement of >30% for each LSDC data set.
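The agreement statistic reported above is Cohen's kappa, which corrects raw percent agreement for the agreement expected by chance. A minimal self-contained implementation, for illustration only (the study's exact computation and any weighting scheme are not given in the abstract):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' labels over the same items
    (here, referral categories assigned to the same eyes)."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's marginal label counts.
    ca, cb = Counter(rater_a), Counter(rater_b)
    expected = sum(ca[k] * cb[k] for k in set(ca) | set(cb)) / (n * n)
    if expected == 1:
        return 1.0  # degenerate case: both raters use a single label
    return (observed - expected) / (1 - expected)
```

κ = 1 indicates perfect agreement and κ = 0 agreement no better than chance, so a >30% increase in mean κ reflects a substantial shift of grader referrals toward the chart-based classifications.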
Conclusions: Benchmarking and compensating for graders' image grading criteria will provide greater reliability and standardization in referral decisions and will allow more targeted development of telemedicine training.