To our knowledge, this is the first study to use quantitative vascular parameters to generate composite wide-angle retinal images at different diagnostic thresholds of recognized ROP experts. The key findings were that (1) computer-based image analysis permits quantification of retinal vascular features, and a spectrum of abnormalities in arterial curvature and venous diameter is seen in ROP; and (2) selection of appropriate vessels from multiple photographs can produce composite plus disease images corresponding to expert opinions.
Accurate and reliable detection of plus disease is critical to the management of ROP. A standard published narrow-angle photograph displays the minimum amount of vascular abnormality required for presence of plus disease.
4 5 6 However, our previous studies have demonstrated that there is significant variability in plus disease diagnosis among recognized ROP experts reviewing wide-angle retinal photographs.
10 11 12 15 The variability among responses provided by the evaluators in this study
(Table 2) supports this notion. This highlights the subjective nature of diagnostic judgments by experts and suggests that there are not always clear distinctions between “plus” and “not plus.” Instead, there appears to be a spectrum of findings that consists of retinas that clearly do not reflect plus disease, those that clearly do reflect plus disease, and those that do not clearly fall into either category. Although the new “pre-plus” categorization in the international classification of ROP may be intended to represent this latter category,
5 we have shown in a previous study that disagreement among experts persists even with a three-level (plus, pre-plus, neither) system.
10
The spectrum of changes in arterial curvature and venous diameter, as a function of diagnoses given by experts, may be seen by comparing the images in
Figure 3. This could provide insight into how the definition of plus disease is interpreted by experts and may improve the way in which ophthalmologists are taught to interpret the range of vascular changes. As shown in
Table 1 and
Figure 3A, the image at the 75% sensitivity cutoff illustrates the level associated with 25% underdiagnosis of plus disease and therefore has lower AIC and VD than do the images at the 50% and 25% sensitivity cutoffs
(Figs. 3B, 3C). Of note, the AIC value at the 25% sensitivity cutoff (0.061) is 45.2% greater than the corresponding value at the 75% sensitivity cutoff (0.042). In comparison, the venous diameter value at the 25% sensitivity cutoff (4.272) is only 12.6% greater than the corresponding value at the 75% sensitivity cutoff (3.795). This is consistent with findings of several recent studies involving automated ROP image analysis systems, which have suggested that the curvature of retinal vessels may be a more useful measure of plus disease than is vascular dilation.
17 18 20 It is conceivable that vascular diameter values may be confounded by effects such as photographic blurring, variances in axial length or corneal power, and differences in image acquisition technique that result in variable image magnification. However, if arterial curvature is truly shown in future studies to be more important than venous diameter, then an optimal approach for image generation may require unequal weighting of parameters. We have explored the use of linear combinations of multiple parameters to improve accuracy of computer-based ROP image analysis systems,
11 12 15 and this may provide a mechanism for integrating and weighting multiple features.
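The percentage comparisons in this paragraph can be reproduced from the reported cutoff values, and the idea of unequal weighting can be sketched in a few lines. The following is a minimal illustration only: the function names, the normalization by the 75% cutoff values, and the example weights are assumptions made for this sketch and are not the method used in the study.

```python
def relative_increase(low: float, high: float) -> float:
    """Percentage by which `high` exceeds `low`."""
    return (high - low) / low * 100.0


# Values reported in the text at the 75% and 25% sensitivity cutoffs.
AIC_75, AIC_25 = 0.042, 0.061
VD_75, VD_25 = 3.795, 4.272

aic_change = relative_increase(AIC_75, AIC_25)
vd_change = relative_increase(VD_75, VD_25)
print(f"AIC increase: {aic_change:.1f}%")  # 45.2%
print(f"VD increase:  {vd_change:.1f}%")   # 12.6%


def weighted_score(aic: float, vd: float, w_aic: float, w_vd: float) -> float:
    """Hypothetical linear combination of the two parameters.

    Each parameter is divided by its 75% sensitivity cutoff value so that
    AIC and VD are on comparable scales (an assumption of this sketch).
    """
    return w_aic * (aic / AIC_75) + w_vd * (vd / VD_75)


# Equal weights versus weights that emphasize arterial curvature
# (both weight sets are illustrative, not derived from data).
equal = weighted_score(AIC_25, VD_25, w_aic=0.5, w_vd=0.5)
curvature_heavy = weighted_score(AIC_25, VD_25, w_aic=0.8, w_vd=0.2)
print(equal, curvature_heavy)
```

Because AIC changes proportionally more than VD across the cutoffs, a weighting that favors curvature separates the 25% and 75% cutoff images more strongly than equal weighting does in this toy score.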
This methodology for image generation may be applied toward developing a definition for plus disease based on quantitative parameters. To illustrate the feasibility of this approach,
Figure 4 compares the appearance of two images generated in this study to published photographs selected by the expert committee.
Figures 4A and 4B display the composite image reflecting 50% underdiagnosis of true plus disease (i.e., 50% sensitivity cutoff), which has been cropped and magnified to match the perspective of the standard photographic definition of plus disease.
6 Figures 4C and 4D display the composite image reflecting 25% underdiagnosis of true plus disease (i.e., 75% sensitivity cutoff), compared with a published example of pre-plus disease.
5 This allows direct comparison of features from these images, and illustrates potential limitations of a photographic definition with a smaller field-of-view than indirect ophthalmoscopy or wide-angle imaging. The comparison also illustrates that pre-plus disease and plus disease may represent a spectrum of disease severity that can be quantified using parameters such as integrated curvature and diameter to represent vessel characteristics.
The development of a quantitative definition for plus disease could improve diagnostic accuracy and reliability, but would require identification of the actual vascular features that most closely correlate with the presence of plus disease as judged by experts. One practical difficulty is that there is no consensus regarding which exact aspects of the published standard photograph should actually be considered during diagnosis.
4 5 6 This study used AIC and VD for image generation, because these parameters have been shown to correlate closely with detection of plus disease by experts and because the narrative description of plus disease is characterized by “arterial tortuosity” and “venous dilation.”
4 5 6 11 12 15 Responses from evaluators appear to support the notion that these composite images represent various diagnostic thresholds of disease. For example, all evaluators ranked
Figure 3A as having the least severe vascular abnormality, and all diagnosed
Figure 3C as “plus disease,” as might be predicted. However, more evaluators ranked
Figure 3B as having the most severe vascular changes among the three composite images
(Table 2). This raises the possibility that other characteristics beyond arterial tortuosity and venous dilation may be considered by experts while assessing plus disease. Techniques from cognitive science such as think-aloud methodologies
21 may provide insight about retinal features that are truly perceived as important for diagnosis and whether other attributes such as vascular branching or congestion are considered during real-world disease management. Because the current photographic standard has been shown in major studies to have prognostic significance,
6 8 development of a quantitative definition would very likely require prospective validation in a clinical trial or retrospective validation using images from premature infants in whom the natural history of untreated ROP is known.
Quantitative interpretation of anatomic characteristics is often useful for image-based diagnosis, and methodologies similar to those described in this article might be applied to other diseases. Parameters of optical coherence tomography (OCT) imaging have been used to detect the presence of glaucoma with good sensitivity and specificity.
22 Structural features of the optic nerve head, such as disc size and retinal vascular arrangement, have been correlated with expert opinion by using Rasch analysis to determine whether certain anatomic characteristics show greater heritability than others.
23 Mathematical techniques such as fractal models have been used to simulate neovascularization in corneal disease and diabetic retinopathy, and to generate computer-simulated images.
24 25 Analysis of diabetic retinopathy images, using artificial neural networks and quantification of image features, has demonstrated high sensitivity for detecting disease.
26 27 28 In general medicine, computer-assisted detection of quantitative features of breast masses can increase the sensitivity of mammography for some cancerous breast lesions.
29 30 Quantitative algorithms using mammographic features such as spiculation, border shape, and density have been used to classify breast masses.
30 31 Computer-based analysis of three-dimensional thoracic computed tomography images has been used to extract features such as pulmonary nodule structure, and these features have been used to predict the likelihood of malignancy with very high sensitivity.
32 33 Combining results from quantitative image analysis with diagnostic responses from multiple experts may make it feasible to develop similar composite images in diseases such as these for diagnosis, classification, and educational purposes.
Several limitations should be noted: (1) This study relied on a single set of 34 images that were examined by both the experts and the computer-based system. Larger studies may be necessary to validate the findings. (2) Our study did not account for any potential differences in magnification within the set of 34 images, which could have affected the measurement of system parameters such as vascular diameter. Of course, some variability in magnification may also be seen with standard indirect ophthalmoscopy. (3) System parameter thresholds were derived from analysis of vessels from all four quadrants in each image. This method may not necessarily be equivalent to clinical plus disease diagnosis, which is defined as the requisite amount of vascular abnormality in ≥2 quadrants.
5 7 Further work to derive thresholds based on the two quadrants with greatest vessel tortuosity and dilatation may be warranted.
(4) Only sensitivity cutoffs were used to generate composite images, because the specificity curves were steeper than the sensitivity curves. As shown in
Figure 2, there was a larger absolute slope from 25% to 75% specificity (AIC = 34.48, VD = 1.59) than from 25% to 75% sensitivity (AIC = 26.74, VD = 1.05). For this reason, composite images based on expert specificity over this range would look more similar to one another than composite images based on expert sensitivity.
(5) Although reference standard diagnoses in this study were based on majority consensus of recognized ROP experts, they do not necessarily reflect the true presence of plus disease. Use of alternative reference standards based on indirect ophthalmoscopy or other methodologies for obtaining expert consensus may be informative.
(6) It could be argued that the use of quantitative cutoff points as a diagnostic tool for plus disease could result in cases lost to therapy (false negatives). However, it has been shown that experienced ophthalmologists often disagree about the presence of plus disease,
10 11 12 15 34 presumably because they have different thresholds for diagnosis. An aggressive cutoff point could be selected so that no false-negative cases would occur in the opinion of any examiner, but this would result in many false-positive referrals because of the inherent tradeoff between sensitivity and specificity as the cutoff point is shifted. Therefore, we feel that it is most useful to visualize vascular changes over a range of cutoff values.
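Two quantitative points in the limitations above can be illustrated with a short sketch: the tradeoff between sensitivity and specificity as a cutoff is shifted, and the slope of a sensitivity curve between two cutoffs. The parameter values and reference labels in the first part are invented toy data, not study data, and the slope definition (change in sensitivity expressed as a fraction, divided by change in parameter value) is an assumption about how the reported slopes were computed.

```python
from typing import Sequence, Tuple


def sens_spec(values: Sequence[float], is_plus: Sequence[bool],
              cutoff: float) -> Tuple[float, float]:
    """Sensitivity and specificity of the rule: value >= cutoff means plus."""
    pairs = list(zip(values, is_plus))
    tp = sum(1 for v, p in pairs if p and v >= cutoff)
    fn = sum(1 for v, p in pairs if p and v < cutoff)
    tn = sum(1 for v, p in pairs if not p and v < cutoff)
    fp = sum(1 for v, p in pairs if not p and v >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical): a curvature-like parameter for 8 images and a
# reference-standard label for each image.
values = [0.020, 0.030, 0.035, 0.045, 0.040, 0.050, 0.060, 0.070]
is_plus = [False, False, False, False, True, True, True, True]

# Lowering the cutoff removes false negatives but adds false positives.
for cutoff in (0.040, 0.048, 0.055):
    sens, spec = sens_spec(values, is_plus, cutoff)
    print(f"cutoff={cutoff:.3f}  sensitivity={sens:.2f}  specificity={spec:.2f}")


def slope(frac_a: float, frac_b: float, param_a: float, param_b: float) -> float:
    """Absolute change in a rate (as a fraction) per unit change in parameter."""
    return abs((frac_b - frac_a) / (param_b - param_a))


# Reported (rounded) cutoff values at 25% and 75% sensitivity.
aic_slope = slope(0.25, 0.75, 0.061, 0.042)
vd_slope = slope(0.25, 0.75, 4.272, 3.795)
print(aic_slope, vd_slope)
```

Under this assumed definition, the rounded cutoff values give slopes of roughly 26.3 for AIC and 1.05 for VD, close to the reported 26.74 and 1.05; the small difference for AIC may simply reflect rounding of the published cutoffs. A steeper curve means that the 25% and 75% cutoffs lie closer together in parameter value, which is why composite images generated from them would look more alike.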
In conclusion, this study describes a methodology for quantifying characteristics of retinal images and generating composite plus disease images over a range of disease severities, based on the opinions of recognized ROP experts. The method may have application as an educational tool for ophthalmologists, may provide a mechanism for developing future quantitative definitions of plus disease, and may eventually be generalized to other image-based diseases.
The authors thank each of the 29 expert participants for their contribution to the study.