In this report, we demonstrate that manual identification of retinal boundaries on raw OCT B-scan images can yield retinal thickness maps and quantitative data virtually identical to those produced by properly functioning automated segmentation and analysis algorithms on the existing StratusOCT instrument.
In this study, the maximum difference in retinal thickness between the automated Stratus output and the manual OCTOR method in any subfield in any case was less than 8 μm (Table 4). Among the 11 output parameters, the FCP thickness showed the largest differences in comparisons between two attempts by one grader (mean percent difference, 1.6%), between two graders (mean percent difference, 2.4%), and between manual grading and the automated Stratus output (mean percent difference, 2.4%). In contrast, for all other output parameters (in any type of comparison), the mean percent difference was always less than 1.2%. The slightly greater discrepancy observed for the FCP likely arises because the FCP is based on the average of only six points, whereas the other parameters (particularly the total macular volume) are derived from many more A-scans.
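The link between the number of averaged points and run-to-run variability can be illustrated with a short simulation. The noise model below (independent Gaussian error at each sampled point, with arbitrary values for the true thickness, noise level, and point counts) is an assumption for illustration only, not data from this study; since the standard error of a mean of n points scales as 1/√n, a 6-point average is expected to be roughly √(120/6) ≈ 4.5 times noisier than a 120-point average.

```python
import random
import statistics

# Illustrative simulation only: the true thickness, noise level, and
# point counts below are assumed values, not data from this study.
random.seed(0)

TRUE_THICKNESS_UM = 250.0  # assumed true retinal thickness
NOISE_SD_UM = 10.0         # assumed per-point measurement error (SD)

def mean_of_sample(n_points):
    """Average of n_points independently noisy thickness measurements."""
    return statistics.fmean(
        TRUE_THICKNESS_UM + random.gauss(0.0, NOISE_SD_UM)
        for _ in range(n_points)
    )

def sd_of_estimate(n_points, trials=5000):
    """Empirical SD of the n-point average across repeated gradings."""
    return statistics.stdev(mean_of_sample(n_points) for _ in range(trials))

sd_fcp = sd_of_estimate(6)      # FCP-like parameter: only 6 points
sd_volume = sd_of_estimate(120) # volume-like parameter: many A-scans

# Theory predicts SDs near 10/sqrt(6) ~ 4.1 and 10/sqrt(120) ~ 0.9.
print(f"SD of 6-point mean:   {sd_fcp:.2f} um")
print(f"SD of 120-point mean: {sd_volume:.2f} um")
```

Under this model the few-point average shows several-fold greater variability, consistent with the slightly larger percent differences seen for the FCP.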
This study also demonstrates that human graders can manually draw retinal boundaries using a computer mouse with good precision and reproducibility. Intra- and intergrader reproducibility appeared to be similar with this method. The small differences between gradings observed in this study are likely tolerable for most clinical or clinical research applications. Finally, although the interpolation algorithms used by the StratusOCT have not been published, the simple polar approximation used in this study appears to mimic the Stratus results closely.
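A simple polar interpolation of the kind referred to above can be sketched as follows. The Stratus interpolation algorithm itself is unpublished, and the specifics here (six radial line scans 30° apart, linear interpolation in angle between the two nearest scans) are assumptions for demonstration, not a description of either the Stratus or the OCTOR implementation.

```python
import math

# Sketch of a simple polar interpolation between radial line scans.
# The Stratus interpolation algorithm is unpublished; the specifics
# here (6 radial scans 30 degrees apart, linear interpolation in
# angle) are illustrative assumptions, not the actual implementation.

N_SCANS = 6
SPACING = math.pi / N_SCANS  # angular spacing between scan lines

def thickness_at(r, theta, profiles):
    """Estimate thickness at polar point (r, theta).

    profiles[i](r) returns the measured thickness along scan line i at
    signed radius r (negative r is the opposite half of the line scan).
    """
    # Fold theta into [0, pi): the opposite direction of a line scan
    # is the same scan sampled at negative radius.
    theta = theta % (2.0 * math.pi)
    if theta >= math.pi:
        theta -= math.pi
        r = -r
    idx = theta / SPACING
    i0 = int(idx)
    frac = idx - i0
    i1 = (i0 + 1) % N_SCANS
    t0 = profiles[i0](r)
    # Wrapping from the last scan back to scan 0 crosses 180 degrees,
    # which is scan 0's opposite half-line.
    t1 = profiles[i1](-r if i1 == 0 else r)
    return (1.0 - frac) * t0 + frac * t1

# Toy radially symmetric example: thickness grows away from the fovea.
profiles = [lambda r: 250.0 + 20.0 * abs(r)] * 6
print(f"{thickness_at(1.0, 0.7, profiles):.1f}")  # prints 270.0
```

Any point between two measured radial lines is assigned an angular blend of the thickness values at the same radius on the neighboring scans, which is one natural way to approximate a full thickness map from six line scans.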
Clinical OCT technology has evolved dramatically over just the last several years. The StratusOCT system can now render intraretinal and subretinal features in detail that would have seemed impossible only a few years ago. Future spectral-domain [36–39] and ultra-high-resolution [40] OCT technology, with or without adaptive optics imaging, promises to improve imaging resolution even further while decreasing acquisition times. With all these unique capabilities, OCT has quickly risen to the forefront of retinal diagnostic imaging. Despite limited research evidence to support its use in clinical decision-making, it is relied on as an important diagnostic tool by many ophthalmologists, and it is beginning to make its way into organized clinical trials. Many trials of macular edema, for example, require a minimum or maximum retinal thickness for eligibility, such as an FCP thickness greater than 300 μm. Errors in Stratus algorithms may affect eligibility decisions in these patients, and manual correction may be a valuable solution.
As OCT makes the transition from a research device to a critical clinical tool, care must be taken to ensure that its usage does not exceed its capabilities. For example, it would be quite easy to assume that the quantitative accuracy of retinal thickness measurements from this device should at least be equal to the superb imaging resolution evident in its cross-sectional images. Although most clinicians interpret the quantitative information in relation to the morphologic findings and their observations from the biomicroscopic examination, some clinicians may be tempted to rely on the machine’s automated, quantitative data summaries, particularly when morphologic changes over time are not striking.
Although this reliance on processed OCT information may be based on the assumption that the machine’s quantitative output is as accurate as its imaging output, mounting evidence [29,41] suggests that this may not be the case. Based on recent advances in understanding the retinal anatomic correlates of the outer hyperreflective bands present on OCT [29], this study has identified a mean difference of 35 μm between the measured and true retinal thicknesses. Although this represents only 15% of the measured value used by clinicians, it suggests that better software algorithms and anatomic knowledge will be needed before clinicians can fully rely on the quantitative output from these devices. Indeed, investigators such as Ishikawa et al. [42] have worked to develop new automated segmentation algorithms that better detect the anatomically correct location of the RPE. In further support of the need for improved segmentation algorithms, several investigators have recently identified another set of errors in StratusOCT automated quantification that stem from problems with automated boundary detection [31,32]. Although the effects of these errors on clinical management have not been extensively studied, they can only be expected to cause greater problems as clinicians increase their reliance on and confidence in OCT data.
Furthermore, clinically relevant intraretinal and subretinal features that are clearly evident to the human observer are not identified or quantified by current versions of automated segmentation software. For example, although retinal thickness is quantified in patients with macular edema, the volume of retinal cysts is not measured. An additional limitation in eyes with subretinal fluid is that the Stratus software often combines the fluid with the neurosensory retina when calculating thickness measurements. For this reason, the traditional Stratus measurements are better termed “retinal height” (measured from the RPE) than retinal thickness.
Unfortunately, the inability of existing analyses to distinguish subretinal fluid volume from retinal volume results in a loss of potentially clinically useful data. In some patients with CNV, for example, a particular treatment may cause resorption of macular edema but have no effect on subretinal fluid or RPE elevation. Computer-assisted grading of OCT images may therefore allow a grader to select and quantify the most clinically useful features in a given patient, such as subretinal fluid, PED volume, or cystic spaces. Ongoing quality-assurance programs for masked reanalysis of OCT images at the reading center (DIRC) have demonstrated excellent reproducibility among graders in identifying the boundaries of these spaces (Romano P, unpublished data, 2006). However, it is important to note that these reproducibility data (as well as the data described in this report) were obtained by certified reading-center graders. Future studies quantifying various pathologic features, particularly those employing nonstandardized grading personnel, must demonstrate similar reproducibility before the results of those analyses can be considered meaningful.
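The kind of feature quantification described above amounts to a straightforward integration over graded boundary positions. All names, scale factors, and the data layout below are hypothetical and do not reproduce the OCTOR software's actual internals; the sketch only illustrates the principle of computing a volume from manually delineated boundaries.

```python
# Sketch of how a manually graded space (e.g., subretinal fluid) could
# be converted into a volume. The data layout and scale factors are
# illustrative assumptions and do not reproduce the OCTOR software's
# actual internals.

AXIAL_UM_PER_PIXEL = 2.0        # assumed axial sampling (um per pixel)
ASCAN_AREA_MM2 = 0.01 * 0.01    # assumed en-face footprint of one A-scan

def space_volume_mm3(inner, outer):
    """Volume (mm^3) of the space between two graded boundary surfaces.

    inner, outer: per-A-scan boundary depths in pixels, with
    outer[i] >= inner[i]; equal values mean the space is absent there.
    """
    total_um = sum(
        (o - i) * AXIAL_UM_PER_PIXEL for i, o in zip(inner, outer)
    )
    return total_um * 1e-3 * ASCAN_AREA_MM2  # convert um to mm axially

# Toy example: fluid present under three of five A-scans.
inner = [100, 100, 100, 100, 100]
outer = [100, 150, 200, 150, 100]
print(f"{space_volume_mm3(inner, outer):.6f} mm^3")  # prints 0.000040 mm^3
```

The same summation applies to any space bounded by two graded surfaces (cysts, PEDs, subretinal fluid); only the pair of boundaries supplied changes.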
Although manual correction of OCT boundary detection errors and the delineation of boundaries of other structures (such as subretinal fluid) described in this report are potentially useful short-term solutions to the limitations of existing OCT software, it is important to recognize that ongoing advances in OCT hardware are likely to necessitate improvement in automated segmentation algorithms. New spectral (Fourier domain) OCT devices are capable of capturing more than 200 B-scans within a few seconds, but purely manual correction of boundary detection errors for this large number of scans is clearly not practical. It is hoped that recent breakthroughs in image processing and high-speed computing will allow software advances to keep pace with the rapid development of enhanced imaging hardware.