To our knowledge, this is the first study to compare multiple digital image formats to the ETDRS film protocol for DR evaluation in the same eyes. Results showed that the accuracy in classifying full-scale ETDRS severity level using studied digital formats was comparable to that of 35-mm film. The absence of stereoscopic viewing, the use of 37:1 JPEG 2000 compression, or substituting a wide-angle mosaic for the ETDRS seven standard fields did not compromise assessment of the severity level or threshold. Interreader reliability for all digital formats was similar to that of film. Although agreement suggests that 37:1 compression had less effect on DR severity classification than removing the stereo effect, differences between digital formats were not statistically significant.
Readers more often assigned a higher level of severity using digital formats than film. Other investigators assigned higher levels of severity using film than digital (32–34). Differences between our results and those in other reports include algorithmic color balancing and supplementary green-channel viewing, which may have contributed to higher digital severity level grading.
Severity level agreement was lower for DmMos than for film and the other digital formats, perhaps because DmMos covered a larger retinal area and a slightly different retinal region. These differences may also have contributed to lower Ma and RH specificity and lower ≥15/20 threshold specificity, compared with film. Digital format differences did not otherwise affect grading agreement. Agreement between digital and film varied more widely when grading extraretinal lesions than when grading intraretinal lesions, although there was no pattern of variability between formats. Consistent with ETDRS findings, IRMA, VB, NVD, and FPD were difficult lesions to grade regardless of format. Our extraretinal abnormality results may be confounded by the smaller number of eyes and by the difficulty, with any medium, of photographing abnormalities that lie in more than one plane.
In population studies or telemedicine programs, severity thresholds and pooled severity categories may be more relevant than discrete severity levels. An epidemiologic study may involve populations with vision-threatening DR (e.g., ≥ level 53). Threshold information is necessary for planning DR evaluation programs using fundus photographs. A genotype–phenotype linkage study may use three thresholds to analyze phenotypic effect: clearly unaffected (e.g., level ≤20), indeterminate (levels 35–43), and clearly affected (e.g., level ≥47) (35). Table 5 shows a substantial κ for a three-part and a clinical five-part (36) threshold.
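As an illustration of the three-part pooling described above, the following minimal sketch maps ETDRS severity levels to the three categories and computes an unweighted Cohen's κ between two sets of gradings. It is not the analysis used in this study: the function names `pool_level` and `cohen_kappa` are hypothetical, the choice of an unweighted κ is an assumption, and the example levels are invented solely to show the arithmetic.

```python
from collections import Counter


def pool_level(level: int) -> str:
    """Pool an ETDRS severity level into the three-part scheme from the text.

    Cut points follow the text: <=20 unaffected, 35-43 indeterminate,
    >=47 affected. Values between 21 and 34 are rejected because the
    text assigns them to no category.
    """
    if level <= 20:
        return "unaffected"
    if 35 <= level <= 43:
        return "indeterminate"
    if level >= 47:
        return "affected"
    raise ValueError(f"level {level} falls outside the three pooled categories")


def cohen_kappa(a: list, b: list) -> float:
    """Unweighted Cohen's kappa for two equal-length lists of labels."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    # Observed proportion of exact agreement.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from the marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both graders used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Invented example gradings of the same six eyes (not study data).
film_levels = [10, 20, 35, 43, 47, 53]
digital_levels = [10, 35, 35, 47, 47, 53]

film_pooled = [pool_level(v) for v in film_levels]
digital_pooled = [pool_level(v) for v in digital_levels]
kappa = cohen_kappa(film_pooled, digital_pooled)
print(f"three-part kappa: {kappa:.2f}")
```

Pooling before computing κ means that disagreements within a category (for example, level 35 versus level 43) no longer count against agreement, which is one reason pooled agreement can be higher than agreement on the full scale.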
Grading DR from digital images took longer than grading from film, because readers could move a Donaldson viewer among stereo slide pairs more quickly than they could load digital image files. Large files take time to load, even on fast computers and networks. With half as many files, reviewing monoscopic digital formats took less time than viewing stereoscopic digital formats. For the same reason, time differences between grading Dm or DmMos versus F were minor compared with D or Dc. No time was saved in reviewing one mosaic image compared with seven fields. There may be a minimum time necessary for readers to examine and classify DR regardless of format. Although the readers had many years' experience in grading seven standard fields and 35-mm film, they may have been less fluent in grading mosaicked images, particularly with a customized grid simulating a seven-field retinal division.
Because there is a low prevalence of advanced retinopathy in the general population, this study is limited by having a small sample of eyes with level 53 (severe NPDR), NVD, and FPD.
Film has been the basis for diabetic retinal evaluation for many years. Criteria for color film slides in DR studies are well established. There are no widely accepted digital photography standards for acquiring and reviewing DR images. Our results suggest that, under controlled conditions, compression, absence of stereo effect, or deviation from ETDRS standard fields does not have a negative effect on DR assessment according to the ETDRS scale. Parameters maintained across all digital formats replicated properties of the ETDRS film protocol: resolution high enough to distinguish the smallest DR lesion, color balance similar to film, documentation of retinal regions essential to the ETDRS classification, and sufficient viewing magnification. We also augmented grading of color digital images with green-channel views. All studied digital formats were comparable to 35-mm film. These results may be primarily due to the translation of important film protocol characteristics into digital equivalents.
Supported by a grant from Juvenile Diabetes Foundation Research International, New York, NY (HKL) and by unrestricted grants from Research to Prevent Blindness (Department of Ophthalmology and Visual Sciences, The University of Texas Medical Branch and the Department of Ophthalmology and Visual Sciences, University of Wisconsin School of Medicine and Public Health).
The authors are grateful to staff from the Department of Ophthalmology and Visual Sciences, University of Wisconsin-Madison. From the Ocular Epidemiology Group of Barbara E.K. Klein, MD and Ronald Klein, MD: Andrew F. Ewen, Anne E. Mosher, and Maria K. Swift for grading diabetic retinopathy images, Stacy M. Meuer for grading supervision, and Daniel P. Murach for computer support. From the Fundus Photograph Reading Center directed by Ronald P. Danis, MD: Trina M. Harding for grading orientation, Qian Peng for statistical advice, Jeff T. Klaves for statistical analyses, and Matthew D. Davis, MD, for suggestions regarding this manuscript.