December 2006
Volume 47, Issue 12
Glaucoma  |   December 2006
Measurement Variability in Heidelberg Retina Tomograph Imaging of Neuroretinal Rim Area
Author Affiliations
  • Victoria M. F. Owen
    Department of Optometry and Visual Science, City University, London, United Kingdom.
  • Nicholas G. Strouthidis
    Glaucoma Research Unit, Moorfields Eye Hospital, London, United Kingdom.
  • David F. Garway-Heath
    Glaucoma Research Unit, Moorfields Eye Hospital, London, United Kingdom.
  • David P. Crabb
    Department of Optometry and Visual Science, City University, London, United Kingdom.
Investigative Ophthalmology & Visual Science December 2006, Vol.47, 5322-5330. doi:10.1167/iovs.06-0096
Abstract

Purpose. To investigate the optimal frequency of imaging during follow-up to detect glaucoma progression by characterizing variability (noise) in neuroretinal rim area (RA) measured by Heidelberg Retina Tomograph (HRT; Heidelberg Engineering, Heidelberg, Germany).

Methods. RA noise was estimated from patient data and characterized by fitting theoretical distributions to the observed data. Multilevel regression was used to determine factors that significantly affect noise. Computer simulations of disease progression were performed by adding noise generated from the distribution derived from the observed data to the average rate of loss in RA estimated from longitudinal data. Rates of detection of disease progression were investigated for various progression rates, follow-up periods, and rates of imaging.

Results. Noise was not normally distributed and was best characterized by the hyperbolic distribution, which fit averages well while allowing for extreme values. Noise was greatly influenced by image quality, but age did not have a significant effect. Rates of detection improved for more frequent imaging, better quality images, and faster rates of disease progression.

Conclusions. Noise in HRT measurement of RA is well characterized by the hyperbolic distribution. Sensitivity of detection improves with more frequent testing, but if consistently poor-quality images are yielded for a patient, the probability of detection is low. Results from this work could be used to tailor individual follow-up patterns for patients with different rates of RA loss and image quality, especially in a clinical trial setting.

The Heidelberg Retina Tomograph (HRT; Heidelberg Engineering, Heidelberg, Germany), which has been available for more than a decade for use in the clinical management of patients with glaucoma, provides objective three-dimensional images of the optic disc and peripapillary retina. It has proved to be a useful and reliable tool in the diagnosis of primary open-angle glaucoma (POAG), and studies on diagnostic performance report reasonable levels of sensitivity and specificity. 1 However, one real promise of the HRT lies in the reliable detection of disease progression in glaucoma. One approach to detecting progression is to quantify changes in the morphologic features of the optic disc, typically expressed as stereometric parameters. Neuroretinal rim area (RA) is a reliable measure because it exhibits less test-retest variability than other stereometric parameters and has been shown to be reproducible when, for example, different observers acquire the image. 2 3 RA has been shown to give good separation between glaucomatous and normal eyes in test samples. 4 Given the relative precision of RA measurements and its straightforward interpretation, we consider RA a useful indicator for detecting glaucomatous progression. Furthermore, RA is clinically meaningful because the loss of RA tissue parallels the loss of retinal ganglion cell axons typical of glaucomatous damage. 5 6 7  
In using instruments to assess visual structure and function, measurements are rarely made without error. Thus, variability is inherent in the measurement of RA and can be separated into variability resulting from true physiological change and variability resulting from measurement error. This measurement error (or noise) may be attributed to a range of factors, including patient characteristics (e.g., lens opacity 8 9 ), machine characteristics (e.g., image alignment 10 ), and operator characteristics (e.g., different operators, 11 placement of the contour line outlining the optic disc margin 12 13 ). 
The practical consequences of “noisy” measurements of RA are clear: if a deterioration in RA is seen in the sequential imaging of glaucomatous eyes, how can the clinician determine whether this change is caused by worsening disease or by measurement error? Making an incorrect decision has consequences for the patient and the health provider: failing to detect true change (a false-negative) may leave the patient with worsening glaucoma that is undertreated, thus compromising his or her visual function. Alternatively, incorrectly diagnosing progression (a false-positive) may result in unnecessary and expensive intervention or treatment changes. Consequently, the major challenge in the quantitative evaluation of RA lies in distinguishing true change in RA from noise. 14
The aims of this study were to use cross-sectional and longitudinal patient data to characterize the noise in HRT measurement of RA, to determine which factors significantly affect noise, and to use this information to investigate the optimum frequency of HRT imaging in follow-up for the reliable detection of glaucomatous disease progression.
Methods
Data
The data used here came from two studies conducted at the Moorfields Eye Hospital in London. A cross-sectional study was conducted with 74 patients (43 with ocular hypertension [OHT] and 31 with POAG). Each patient was imaged five times on two consecutive visits by two operators, giving a total of five images per patient. OHT was defined as intraocular pressure (IOP) greater than 21 mm Hg on two or more occasions and an AGIS visual field (VF) score of 0 (indicating normal VF). 15 POAG was defined as pretreatment IOP greater than 21 mm Hg on two or more occasions and AGIS VF score consistently greater than 0. This study is described in detail elsewhere, including detailed definitions of OHT and POAG. 2 10 Longitudinal data came from a trial of betaxolol against placebo in patients with OHT. 16 The data here are from 216 of those patients, for whom HRT images were performed at 4 to 16 visits (median, 10 visits) over 2 to 7 years (median, 6 years). In the course of follow-up, early glaucomatous field loss developed in 44 patients with OHT. This conversion to early glaucoma was defined on the basis of VF change by AGIS criteria. 15 16  
Both studies used the HRT confocal scanning laser ophthalmoscope (Heidelberg Engineering) for acquisition of three-dimensional images of the posterior segment of the eye. Images were acquired as they are in the clinical setting, with technicians obtaining the best possible images and using the HRT software checks for image quality. All images were carefully inspected by one of the authors (NGS) for clinically apparent misalignment and were manually realigned if necessary. Mean images were generated and analyzed (Eye-Explorer software, version 1.7.0, and HRT Viewing Module, version 3.0.4.10; Heidelberg Engineering). The 320-μm reference plane was used for all analyses because it has been shown to result in measurements that are less variable than the standard reference plane. 8 11 17 Briefly, the height of the standard reference plane can vary depending on the height of the contour line at the temporal optic nerve head margin, whereas the 320-μm reference plane is fixed and thus yields less variable morphometric data. Studies were performed in accordance with the tenets of the Declaration of Helsinki, informed consent was obtained from the participants, and the research was approved by the appropriate ethics committee. 
Estimation of Noise
RA was calculated for the whole optic disc (global RA). The noise in RA measurements was estimated as follows:
  1. Cross-sectional data: For each patient, the mean of the five values of RA was calculated as the best available estimate of the true RA. This mean was subtracted from each of the five individual RA measurements to give five deviations from the mean for each patient, giving a total of 370 deviations. These deviations represented our best estimate of cross-sectional noise.
  2. Longitudinal data: Linear regression of RA over time was fitted to each patient’s individual image series. This effectively removed the changes over time, giving an estimate of the mean RA at each time point. Residuals from the regression model (i.e., differences between each observed point and fitted point) were taken as estimates of noise.
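The two noise estimates described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names and the example RA values are hypothetical, and the regression is an ordinary least-squares line fitted by hand.

```python
from statistics import mean

def cross_sectional_deviations(ra_by_patient):
    """Subtract each patient's mean RA from that patient's repeated measurements."""
    deviations = []
    for values in ra_by_patient.values():
        patient_mean = mean(values)  # best available estimate of the true RA
        deviations.extend(v - patient_mean for v in values)
    return deviations

def longitudinal_residuals(times, ra):
    """Residuals from an ordinary least-squares line of RA against time."""
    t_bar, ra_bar = mean(times), mean(ra)
    sxx = sum((t - t_bar) ** 2 for t in times)
    slope = sum((t - t_bar) * (r - ra_bar) for t, r in zip(times, ra)) / sxx
    intercept = ra_bar - slope * t_bar
    return [r - (intercept + slope * t) for t, r in zip(times, ra)]

# Hypothetical RA values in mm^2 (not study data).
example = {
    "patient_a": [1.50, 1.52, 1.49, 1.51, 1.48],
    "patient_b": [1.20, 1.25, 1.18, 1.22, 1.15],
}
noise = cross_sectional_deviations(example)
residuals = longitudinal_residuals([0, 1, 2, 3, 4], [1.60, 1.57, 1.59, 1.55, 1.54])
```

By construction, both the deviations within a patient and the regression residuals sum to zero, which is why the reported mean noise is 0 in both data sets.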
Characterization of Noise
Theoretical distributions were fitted to the cross-sectional noise. In approximating medical data, the normal distribution is typically used because of its broad applicability and mathematical tractability. This is a symmetrical distribution (also referred to as the Gaussian distribution) whose shape is determined by two parameters, location (mean) and spread (standard deviation, SD). If it is assumed that measurements follow a normal distribution, then certain reasonably well-known properties are true, such as approximately two thirds of the data falling within a range ±1 SD from the mean. We also assessed the suitability of the hyperbolic distribution. 18 This distribution belongs to the family of “stable” distributions, where stable refers to the property of distributions that retain shape when added together. These distributions generalize the normal distribution. They are more “peaked” (more observations fall at or near the average than in a normal distribution), and their tails are heavier than those of the normal distribution. These distributions are used widely in financial mathematics for modeling stable random variables with extreme values that occur more frequently than in the normal distribution. We hypothesized that the hyperbolic model would mimic the clinical observation of HRT measurements, in which most values are highly reproducible but in which noise sometimes increases dramatically because of image acquisition or processing difficulties. In contrast to the normal distribution, the hyperbolic distribution has four parameters: location, scale, peak, and symmetry. These parameters may be manipulated to give a family of distributions to fit data according to patient characteristics. The goodness-of-fit of these distributions was assessed by the Kolmogorov-Smirnov statistic (for which the null hypothesis states that the distribution fits the data).
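For reference, one common way of writing the hyperbolic density is given below. The symbols α, β, δ, and μ do not appear in this article, and pairing them with the peak, symmetry, scale, and location parameters named above is an assumption made here for illustration; the fitting software may use a different parameterization.

```latex
f(x;\alpha,\beta,\delta,\mu)
  = \frac{\sqrt{\alpha^{2}-\beta^{2}}}
         {2\,\alpha\,\delta\,K_{1}\!\left(\delta\sqrt{\alpha^{2}-\beta^{2}}\right)}
    \exp\!\left(-\alpha\sqrt{\delta^{2}+(x-\mu)^{2}}+\beta\,(x-\mu)\right),
  \qquad \alpha > |\beta|,\ \delta > 0
```

Here K_1 is the modified Bessel function of the second kind with index 1. The logarithm of this density is a hyperbola in x, so the tails decay exponentially rather than as exp(-x²), which is what allows occasional extreme values alongside a sharp central peak.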
The distribution that best described the test-retest noise was then validated by assessing its goodness-of-fit to the noise in the longitudinal data. This analysis was repeated for measurements of RA within the six predefined sectors of the optic disc: temporal, temporal superior, temporal inferior, nasal, nasal superior, and nasal inferior. 
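The Kolmogorov-Smirnov comparison used for these goodness-of-fit checks can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the samples are simulated, and a Laplace distribution stands in for a peaked, heavy-tailed alternative because the Python standard library has no hyperbolic distribution. Turning the statistic into a P value is omitted; note that when the model parameters are estimated from the same sample, the usual reference distribution for the statistic no longer applies exactly.

```python
import random
from statistics import NormalDist, mean, stdev

def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of `sample` and a model CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        model = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - model, model - i / n)
    return d

def ks_against_fitted_normal(sample):
    """KS statistic against a normal distribution fitted to the sample itself."""
    fitted = NormalDist(mean(sample), stdev(sample))
    return ks_statistic(sample, fitted.cdf)

rng = random.Random(0)
# Simulated noise: Gaussian versus a Laplace stand-in for heavy-tailed noise.
gaussian = [rng.gauss(0.0, 0.048) for _ in range(2000)]
laplace = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(2000)]
d_gaussian = ks_against_fitted_normal(gaussian)
d_laplace = ks_against_fitted_normal(laplace)
```

A larger statistic for the heavy-tailed sample reflects the same pattern reported for the observed noise: a normal curve cannot match a sharp peak and long tails at the same time.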
The relationship between the cross-sectional noise and potential predictive factors was investigated using regression methods to determine those factors that had a statistically significant effect on noise. Patient factors (age, sex, diagnosis, and lens opacity) and image characteristics (image quality, visit number, and operator) were considered. A measure of lens opacity was obtained using Scheimpflug lens photography (Marcher Case 2000 series; Marcher Diagnostics, Hereford, UK). This system produces a central nuclear dip (CND) value. 19 CND is a measure of the density in the center of the lens nucleus that gives an objective assessment of the degree of nuclear opacification. Image quality was assessed by the SD of the topographic images, each of which comprises the mean of three single images. This SD is known as topographic SD or mean pixel height SD (MPHSD) and is the HRT manufacturer’s index for image quality. 20  
The regression method used to model the data was multilevel modeling (MLM), a standard statistical technique used in the medical, social, and educational sciences. 21 MLM is similar to ordinary multiple linear regression in that a model between a number of predictor variables and a single outcome variable may be developed and approximated by a straight line. Ordinary linear regression makes the assumption that all outcome observations are independent of each other, but in the cross-sectional data each patient contributes five deviations so that deviations are nested within patients and are thus not independent. A deviation-level analysis ignoring this clustering may result in the underestimation of the standard errors of regression coefficients, giving overly small P values, whereas a patient-level analysis (e.g., using average deviations) loses potentially valuable information. 22 MLM adjusts for the hierarchical structure of the data, allowing for the correlation between deviations for each patient and explicitly modeling the way in which deviations are grouped within patients. Essentially, in MLM, patients are regarded as a (random) sample from the population of all patients, and inference is made about the variation between patients in general. Intercepts and slopes of the fitted regression lines can vary randomly between patients. Multilevel modeling was carried out with the use of a software package (MLwiN, version 2.01; Multilevel Models Project, Institute of Education, London, UK). 23  
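As a sketch of the structure described above, a two-level random-intercept model for deviation i within patient j could be written as follows. The notation is introduced here for illustration and is not taken from the article or from the MLwiN output.

```latex
y_{ij} = \beta_{0} + \beta_{1}\,x_{ij} + u_{j} + e_{ij},
\qquad u_{j} \sim N\!\left(0, \sigma_{u}^{2}\right),
\qquad e_{ij} \sim N\!\left(0, \sigma_{e}^{2}\right)
```

Here y_ij is the noise measure, x_ij is a predictor such as MPHSD, u_j is the patient-level departure from the average intercept β_0, and e_ij is the deviation-level residual. Allowing slopes to vary between patients as well would add a term u_{1j} x_ij with its own variance.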
Frequency of Imaging in Follow-up
Computer simulations of disease progression in patients with glaucoma were performed by combining the best estimate of noise (i.e., the cross-sectional deviations) with the best estimate of noise-free progression. Noise-free progression was calculated by performing a regression of RA over time for all patients who converted to glaucoma in the longitudinal data set (n = 44) and taking the average of these regression slopes. Conversion to glaucoma was defined on the basis of VF change by AGIS criteria. 15 16 One thousand “virtual” patients were simulated to have this rate of progression, to which noise generated from the distribution of noise observed in the test-retest data was added. The sensitivity and specificity of linear regression of RA over time for detecting disease progression (with progression defined as the average slope) were calculated for a range of frequencies of imaging and lengths of follow-up. A test outcome positive for progression was defined as a negative regression slope of RA over time, with P < 0.05. Computer simulations were performed in the statistical programming language R, version 2.0.1 (The R Foundation for Statistical Computing, Vienna, Austria), 24 and the R package HyperbolicDist 25 was used to model the hyperbolic distribution.
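The simulation logic can be sketched in Python under several stated assumptions. This is not the authors' R code: Laplace noise stands in for the fitted hyperbolic distribution, progression is tested once at the end of follow-up rather than cumulatively, the baseline RA of 1.6 mm² is the approximate average normal value, and all function names are hypothetical.

```python
import math
import random

def t_two_sided_p(t_stat, df, steps=200):
    """Two-sided P value for a Student t statistic (Simpson's rule in the angle
    theta = atan(t / sqrt(df)), where the integrand is cos(theta)**(df - 1))."""
    k = math.gamma((df + 1) / 2) / (math.sqrt(math.pi) * math.gamma(df / 2))
    theta_max = math.atan(abs(t_stat) / math.sqrt(df))
    h = theta_max / steps
    f = lambda th: math.cos(th) ** (df - 1)
    area = f(0.0) + f(theta_max)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * f(i * h)
    central = k * area * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)

def slope_and_p(times, values):
    """Least-squares slope of values against times and its two-sided P value."""
    n = len(times)
    mt, mv = sum(times) / n, sum(values) / n
    sxx = sum((t - mt) ** 2 for t in times)
    slope = sum((t - mt) * (v - mv) for t, v in zip(times, values)) / sxx
    intercept = mv - slope * mt
    rss = sum((v - intercept - slope * t) ** 2 for t, v in zip(times, values))
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0.0:  # perfect fit: no evidence against the fitted slope
        return slope, 0.0
    return slope, t_two_sided_p(slope / se, n - 2)

def detection_rate(loss_per_year, scans_per_year, years, noise_sd,
                   n_patients=1000, seed=1):
    """Fraction of virtual patients flagged (negative slope with P < 0.05)."""
    rng = random.Random(seed)
    b = noise_sd / math.sqrt(2)  # Laplace scale that gives the requested SD
    times = [(i + 1) / scans_per_year for i in range(scans_per_year * years)]
    hits = 0
    for _ in range(n_patients):
        # Assumed baseline RA of 1.6 mm^2; Laplace noise as a heavy-tailed stand-in.
        ra = [1.6 - loss_per_year * t
              + rng.expovariate(1 / b) - rng.expovariate(1 / b) for t in times]
        slope, p = slope_and_p(times, ra)
        hits += slope < 0 and p < 0.05
    return hits / n_patients

yearly = detection_rate(0.023, scans_per_year=1, years=4, noise_sd=0.048)
quarterly = detection_rate(0.023, scans_per_year=4, years=4, noise_sd=0.048)
stable = detection_rate(0.0, scans_per_year=4, years=4, noise_sd=0.048)
```

Because of these simplifications, the rates produced by this sketch are not expected to reproduce the figures reported in the Results; it only illustrates how imaging frequency, follow-up length, and noise level enter the calculation.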
Results
Estimation and Characterization of Noise
The mean of the cross-sectional deviations (noise) was 0 mm², SD 0.048 mm². The distribution of this observed noise was highly peaked, with long tails, and was poorly fitted by the normal distribution (P = 0.02; Kolmogorov-Smirnov test). This suggests that the RA measurements were usually precise but that there were also many very poor measurements. The hyperbolic distribution gave a better fit to the observed noise than the normal distribution (P = 0.60; Kolmogorov-Smirnov test). The fits of the normal and hyperbolic distributions to the cross-sectional noise are shown in Figure 1. Quantile plots show that the observed data points in the tails of the distribution are farther from the center than would be expected for either the normal or the hyperbolic distribution. This lack of tail fit is less pronounced for the hyperbolic distribution.
The distribution of noise in each sector of the optic disc is shown in Figure 2. The spread of the noise, as estimated by the variance of the deviations, was significantly different between sectors (P < 0.001; Mauchly test of sphericity). The noise was more spread in the temporal and nasal sectors and least spread in the nasal superior sector.
As was the case for noise in global RA, the hyperbolic distribution provided a better fit than the normal distribution for sectoral RA (P < 0.001 in all sectors for normal distribution; Kolmogorov-Smirnov test). 
Results of the multilevel regression analysis are summarized in Table 1, showing the statistically significant factors affecting the size of RA deviations. The factors shown not to affect noise were sex, diagnosis (POAG or OHT), visit number, operator, and, of particular importance, age. When fitted in a model as the only variable, MPHSD, age, and CND individually had a statistically significant effect on noise. However, when fitted together in a multiple regression model, image quality, as measured by MPHSD, had an overwhelming statistically significant effect on noise. An average increase in MPHSD of 1 μm produced an increase in noise of 0.0005 mm² (95% confidence interval [CI]: 0.0003–0.0007 mm²). Lens opacity, as measured by CND value, had a moderately strong effect. A unit increase in CND increased noise by 0.002 mm² (95% CI: 0.0005–0.0040 mm²). Age was not statistically significant in the multiple regression model.
Figure 3 shows the predicted cross-sectional noise from the multilevel regression of MPHSD alone on noise, parameter estimates for which are given in Table 2. Parallel regression lines were fitted for each patient, indicating significant variation in noise between patients even after adjustment for MPHSD (as evidenced by the separate lines fitted for each patient). The intercept given in Table 2 is an average intercept, with the intercepts of the individual regression lines varying randomly about this average. A more elaborate model with randomly varying slopes did not significantly improve the fit of the model to the data.
Validation on Longitudinal Data
The mean of the longitudinal noise (residuals) was 0 mm² (SD 0.038 mm²). As was seen in the cross-sectional data, the distribution of the observed noise was highly peaked. The hyperbolic distribution and its estimated parameters, which provided the best fit to the cross-sectional noise, were then fitted to the observed noise in the longitudinal data for all cases and separately for three categories of MPHSD: good (≤30 μm), acceptable (31–50 μm), and unacceptable (>50 μm). These categories of MPHSD reflect the categories given in the HRT literature. 20
Histograms of the observed longitudinal noise along with data generated from the hyperbolic and normal models are shown in Figure 4. The hyperbolic distribution provided a very close approximation to the data, whereas the normal model failed to predict the peakedness of the data or the long tails.
The distribution of observed longitudinal noise was less spread and more peaked as MPHSD improved (Fig. 5). P values obtained with the use of the Kolmogorov-Smirnov test showed that the hyperbolic distribution provided a good fit for acceptable and good values of MPHSD. However, neither the hyperbolic nor the normal distribution fitted the observed data well for unacceptable levels of MPHSD.
Frequency of Imaging in Follow-up
The median rate of loss in RA per year for the 44 patients who converted to glaucoma was 0.012 mm² (interquartile range, 0.021 mm²). This represents a loss of approximately 0.75% of an average normal RA per year (where average normal RA is approximately 1.6 mm²). 4
Figure 6 shows the cumulative detection rates for the 1000 virtual patients simulated using the noise characteristics yielded from the results reported. Column A shows detection rates assuming a zero rate of loss of RA (i.e., stable disease). Column B shows rates assuming the median rate of loss. Column C shows rates associated with a faster rate of loss: the upper quartile of loss (0.023 mm² per year, a loss of approximately 1.5% of an average normal RA per year). Simulations were repeated for the three categories of MPHSD: good (≤30 μm) shown in row I, acceptable (31–50 μm) shown in row II, and unacceptable (>50 μm) shown in row III (Fig. 6). As expected, these simulation experiments indicated that increasing the frequency of testing improved detection rates in patients with progressive disease at all lengths of follow-up and with all image qualities. For example, for a virtual patient with an average rate of RA loss and good-quality images, imaging once a year for 4 years (Fig. 6BI) resulted in a detection rate of 37%, whereas imaging four times a year for 4 years gave a more acceptable 78% detection rate. Of course, detection rates are better still in eyes with disease that progresses faster (Fig. 6CI). For the upper quartile of loss in RA, over a follow-up period of 4 years, imaging once a year will detect 71% and imaging 4 times a year will detect 98% of patients with progressive disease. Detection rates also improve as image quality improves. For example, imaging twice a year over a 5-year follow-up period will detect 98% of fast progressing disease with good-quality images (Fig. 6CI), 89% of patients with acceptable quality images (Fig. 6CII), and 56% of patients with unacceptable quality images (Fig. 6CIII). Column A of Figure 6 shows the percentage of virtual patients with nonprogressing disease incorrectly identified as progressing, giving an indication of the specificity of detection.
Specificity deteriorates over time, more steeply for more frequent testing. 
Discussion
Neuroretinal rim area, as measured by the HRT, is an effective, objective quantifiable indicator to determine whether a patient with glaucoma has stable or worsening disease. RA is often the first area to show glaucomatous changes, 26 and the measurement of RA has already been established as a reliable tool for separating glaucomatous eyes from normal eyes. 27 28 29 Any true deterioration in RA will only be identified as such if it can be distinguished from variability, or noise, in RA measurements. It is clear that the observed noise in the cross-sectional and longitudinal data sets was not normally distributed. This is a significant finding because most statistical analyses in medical applications make an assumption of normality, which in this case would be inappropriate. The normal distribution is used frequently because of its central importance to sampling theory. 30 The noise in the cross-sectional data and the longitudinal data was very much more peaked and had longer tails than in the normal distribution, indicating that scans of most patients are reliable (with values of noise close to 0), but a small number of scans give rise to extreme values of noise. This pattern of observed noise in the cross-sectional data was shown to be well approximated by the hyperbolic distribution. Recently this distribution, originally developed by geomorphologists in the 1940s, 31 has been widely used in economics because it gives a better fit to certain types of financial data than the normal distribution. 32 The hyperbolic distribution has higher peaks and longer tails than the normal distribution and provides a good model for averages while also interpreting exceptional behavior. The fit of this distribution was confirmed on the noise in the longitudinal data, a more realistic data set in terms of what is available to the clinician determining whether progression has occurred. 
When the cross-sectional noise was separated into the six segments of the optic disc, significant differences were clear in the spread of noise across the sectors. It has been suggested that early glaucomatous changes often result in narrowing of RA in the inferior and superior temporal sectors. 33 Therefore, it is particularly important that any reduction in RA in these areas be reliably detected. We found the noise in these areas to be relatively small compared with the noise in the temporal and nasal sectors, which showed the greatest spread. Any RA changes occurring in these latter two sectors would therefore have to be of larger magnitude to be reliably detected. The results described in this study provide a foundation for developing a technique for detecting progression in sectoral RA. The differences in noise distribution in the different disc sectors cannot be explained by a relationship between RA and variability, whereby RA in more damaged discs is noisier than RA in discs with early damage. No relationship has been established between RA and variability, 2 and no statistically significant differences have been found in the test-retest variability of HRT II stereometric parameters between glaucomatous and normal eyes. 34  
Modeling the relationship between noise and possible predictive patient or scan factors allows us to understand which patients are likely to have reliable (low noise) scans. Because the nature of the test-retest data—i.e., more than one measurement of RA per patient—violates the independence assumption of ordinary linear regression, we used multilevel techniques to account for this clustering in the data. This nonindependence is also true of the longitudinal data because patients underwent repeated imaging over time. MLM may also be used to model this sort of data structure. MLM is particularly appealing because the interpretation of the parameter estimates is similar to that of estimates arising from ordinary linear regression. Results from this analysis suggest MPHSD to be the factor with an overriding effect on noise to the exclusion of most other patient factors, including age. Thus one important clinical finding from this study is that useful scans can be obtained during follow-up of older patients with glaucoma, a significant finding given the high prevalence of glaucoma in the elderly population. In fact, taking MPHSD into account when interpreting changes in the RA of patients with POAG would remove much of the uncertainty in deciding how frequently and over how long a follow-up period to image. Noise was found to be less spread in images of better quality, enabling true change to be more easily distinguished from noise and requiring less frequent imaging for the reliable detection of disease progression in patients with high-quality images. This important finding should be incorporated into planned methods for detecting change in RA over time. Our measure of lens opacity, CND, had a statistically significant effect on noise independently of MPHSD; however, CND is primarily used as a research tool and is not readily available in the clinic. 
Our computer simulation experiments of frequency of testing indicated that, in general, the sensitivity of detection of disease progression increased with more frequent testing, with longer follow-up, and with better quality images. For example, if we consider a virtual patient progressing at an average rate, imaging twice a year over 4 years gives a sensitivity of 42% for good quality images and 29% for acceptable images. However, sensitivities of 61% and 36% are achieved by imaging four times a year over 4 years for good and acceptable quality images, respectively. Of course, faster rates of loss are detected with better precision; for a virtual patient whose disease is progressing at the upper quartile of loss, imaging twice a year over 4 years would give sensitivities of 86% and 64% for good and acceptable quality images, respectively, and imaging four times a year over 4 years would result in detection rates of 95% and 82%. In these analyses, the mean (in the cross-sectional sample) and regression line (of longitudinal sample) are only estimates of true RA and might have been biased, indicating that the deviations (or residuals) could have underestimated the true noise. We must also emphasize that the simulation experiments simply demonstrate how ordinary linear regression performs in the presence of measurement noise sampled from a hyperbolic distribution; of course, the process of fitting trend lines by the method of least squares assumes that the errors (more precisely, residuals from the fit) are normally distributed. Alternative methods for fitting a trend to a series of observations, in which the process considers these attributes of the data, may provide more accurate estimates of rates of loss but are the subject of future work. It is hoped that these might improve the diagnostic precision we report from the current computer experiments.
One important caveat regarding the assessment for progression at each point in time during repeated sequential imaging is that it results in deteriorating specificity analogous to an inflated type I error brought about by repeated statistical hypothesis testing. Corrective statistical methods are required to maintain an acceptable level of specificity throughout follow-up, and any method for detecting change should incorporate solutions for this. Additionally, further modifications may be carried out to reflect the relative importance of tests conducted over a fixed observation period, such as the duration of a clinical trial. 
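The inflation of false-positive rates with repeated testing can be illustrated with a short calculation. This sketch assumes independent tests, which sequential regression tests on an accumulating image series are not, and it uses a Bonferroni threshold purely as a familiar stand-in; the article does not specify a corrective method, and the function names are hypothetical.

```python
def familywise_false_positive_rate(alpha, n_tests):
    """Chance of at least one false positive across n independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the overall rate at or below alpha."""
    return alpha / n_tests

# Compare an uncorrected and a corrected schedule for several numbers of tests.
for n_tests in (1, 4, 8, 16):
    uncorrected = familywise_false_positive_rate(0.05, n_tests)
    corrected = familywise_false_positive_rate(bonferroni_threshold(0.05, n_tests), n_tests)
    print(n_tests, round(uncorrected, 3), round(corrected, 3))
```

Under these assumptions the uncorrected rate grows with every additional test, whereas the corrected rate stays at or below the nominal level, at the cost of reduced sensitivity at each individual test.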
As is customary in statistical methods, the computer simulations were based on average rates of RA loss, and this use of averages is often at odds with the needs of clinicians who necessarily think in terms of individual patients. However, it may be possible to tailor rates of progression and rates of imaging to individual patients. In a larger data set, patients may be divided according to their rates of loss and their values of MPHSD (the factor that determines the level of noise and thus the rate of imaging necessary to detect progression). The rates of change in RA used in our simulation experiments were based on data from patients with glaucoma that developed according to VF criteria. 15 The patterns of VF change in glaucomatous progression are well documented, but given the lack of any criterion for progression and the measurement error inherent in perimetric assessment, these rates of change are necessarily approximations of any true underlying change. 14  
The value of the HRT for detecting glaucomatous progression will be realized as standards for specificity and optimal image acquisition frequencies are established. Alternative techniques for detecting glaucomatous progression in series of HRT images include topographic change analysis (TCA) 35 36 and, more recently, statistical image mapping (SIM). 37 These methods detect change at the pixel level (or group of pixels in TCA) rather than with summary measures such as RA. Change is evaluated within each patient, thus obviating the need for average measures of change and variability. The noise characteristics in RA measurements may also be apparent in these analyses; this is the subject of future work. Methods that make use of stereometric parameters such as RA still have a role in determining progression; they are clinically familiar, and change in an area is easier to grasp than change in topographic height. Summary measures of disc changes are useful for describing disease progression in large samples of patients in clinical trials. It is likely that the complete analysis of longitudinal HRT data may be best served by an amalgam of global analysis of changes in the optic nerve head coupled with techniques that can help the clinician visualize the localized areas of likely change.
In conclusion, we have established that the distribution of measurement error in HRT imaging of RA is best approximated by the hyperbolic distribution, thus allowing for computer simulations of progression and estimates of the sensitivity and specificity of detection of progression using RA. Issues concerning the attributes of noise may be relevant to other imaging modalities and other structural measures. Image quality is critical in terms of determining progression, and any method for detecting change must take this into account. Detection rates will improve with more frequent imaging, but techniques for correcting false-positive rates must also be applied. The results presented here will be used to develop statistical methods that will improve rates of detection and monitor change more reliably. 
 
Figure 1.
 
Top: histogram of observed noise in cross-sectional data with theoretical normal and hyperbolic probability curves generated from the cross-sectional data. Solid bars: data values >3 SD from the mean: 2.2% of observed data values are >3 SD from the mean, approximately 7 times more than would be expected if the noise followed a normal distribution. Bottom: quantiles of the cross-sectional data plotted against theoretical quantiles from the normal (left) and hyperbolic (right) distributions. If these distributions fit the data exactly, points should fall along the reference lines. The greater the departure from these reference lines, the greater the evidence for concluding that the distributions do not fit the data well, with the emphasis on the behavior in the tails of the distribution.
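The tail comparison and the quantile plot described in this caption can be sketched as follows. The 2.2% figure is taken from the caption; the function names and the plotting position (i - 0.5)/n are assumptions made for illustration, not details from the study.

```python
from statistics import NormalDist


def normal_tail_fraction(k: float) -> float:
    """Two-sided probability that a normal value lies more than k SD from its mean."""
    return 2.0 * (1.0 - NormalDist().cdf(k))


def normal_qq_pairs(sample):
    """Pair each sorted value with the quantile of a normal fitted to the sample.

    Plotting position (i - 0.5) / n is one common convention (an assumption here).
    """
    n = len(sample)
    fitted = NormalDist.from_samples(sample)
    ordered = sorted(sample)
    return [(fitted.inv_cdf((i - 0.5) / n), value)
            for i, value in enumerate(ordered, start=1)]


observed_fraction = 0.022                      # 2.2% of values beyond 3 SD (caption)
expected_fraction = normal_tail_fraction(3.0)  # about 0.27% under normality
ratio = observed_fraction / expected_fraction

if __name__ == "__main__":
    print(f"expected under normality: {expected_fraction:.4%}")
    print(f"observed / expected: {ratio:.1f}")
```

With the unrounded normal expectation the ratio comes out nearer 8; rounding the expectation to 0.3% gives a little over 7, in line with the caption's "approximately 7 times". Points returned by `normal_qq_pairs` that drift away from the identity line in the tails indicate the heavier-than-normal tails the caption describes.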
Figure 2.
 
Box and whisker plots showing the distribution of noise in each sector of the optic disc. The boxes represent the interquartile ranges (central 50% of values), and the horizontal lines in the boxes are the median values. The whiskers extend to the minimum and maximum values in each sector.
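The quantities drawn in such a plot can be computed directly. The sketch below is illustrative only: the sector values are made up, and the "inclusive" quartile method is an assumption, since the caption does not say how quartiles were calculated. Whiskers run to the minimum and maximum, as stated in the caption.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Minimum, quartiles, and maximum, matching a min-to-max box and whisker plot."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": q2, "q3": q3, "max": max(values)}


if __name__ == "__main__":
    # Hypothetical noise values for one sector; not data from the study.
    sector_noise = [-0.08, -0.03, -0.01, 0.0, 0.01, 0.02, 0.05, 0.11]
    summary = box_plot_summary(sector_noise)
    print(summary)
    print("interquartile range:", summary["q3"] - summary["q1"])
```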
Table 1.
 
Results of Multilevel Regression of MPHSD and CND on Cross-sectional Noise
| Parameter | Estimate | Standard Error | P | 95% CI |
| --- | --- | --- | --- | --- |
| Intercept | −0.0231 | | | |
| MPHSD (μm) | 0.0005 | 0.0001 | <0.001 | (0.0003, 0.0007) |
| CND | 0.0022 | 0.0009 | 0.020 | (0.0004, 0.0040) |
Figure 3.
 
Predicted values of noise for multilevel model of MPHSD on noise. This differs from an ordinary regression plot in that separate regression lines are fitted for each patient, with the intercept for each line varying randomly about an average intercept. The slope of the lines indicates the magnitude of the relationship between MPHSD and noise. Parallel lines indicate that the relationship between MPHSD and noise is the same for all patients.
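The parallel-lines structure of a random-intercept model can be sketched in a few lines. The fixed intercept (0.0111) and MPHSD slope (0.0006) are the Table 2 estimates; the between-patient intercept SD is hypothetical, because it is not reported here, and the function names are illustrative.

```python
import random

FIXED_INTERCEPT = 0.0111  # average intercept (Table 2)
FIXED_SLOPE = 0.0006      # change in noise per micrometre of MPHSD (Table 2)


def patient_intercepts(n_patients, intercept_sd, seed=0):
    """Draw one intercept per patient, varying randomly about the average intercept."""
    rng = random.Random(seed)
    return [FIXED_INTERCEPT + rng.gauss(0.0, intercept_sd) for _ in range(n_patients)]


def predict_noise(intercept, mphsd_um):
    """Predicted noise on one patient's line; the slope is shared by all patients."""
    return intercept + FIXED_SLOPE * mphsd_um


if __name__ == "__main__":
    # intercept_sd=0.01 is a hypothetical value chosen only for illustration.
    for intercept in patient_intercepts(n_patients=3, intercept_sd=0.01):
        line = [round(predict_noise(intercept, m), 4) for m in (10, 25, 50)]
        print(line)
```

Every patient's line rises by the same amount for a given increase in MPHSD, which is what the parallel lines in the figure express; only the vertical offset differs between patients.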
Table 2.
 
Results of Multilevel Regression of MPHSD on Cross-sectional Noise
| Parameter | Estimate | Standard Error | P | 95% CI |
| --- | --- | --- | --- | --- |
| Intercept | 0.0111 | | | |
| MPHSD (μm) | 0.0006 | 0.0001 | <0.001 | (0.0004, 0.0008) |
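As a consistency check, the reported 95% confidence intervals in Tables 1 and 2 can be compared with the usual large-sample interval, estimate ± 1.96 × standard error. That the intervals were computed this way is an assumption; the sketch only shows that the tabulated values agree with it to four decimal places.

```python
def wald_interval(estimate, standard_error, z=1.96):
    """Large-sample 95% interval, rounded to the four decimals used in the tables."""
    half_width = z * standard_error
    return (round(estimate - half_width, 4), round(estimate + half_width, 4))


# (estimate, standard error, reported interval) copied from Tables 1 and 2.
reported = {
    "Table 1, MPHSD": (0.0005, 0.0001, (0.0003, 0.0007)),
    "Table 1, CND": (0.0022, 0.0009, (0.0004, 0.0040)),
    "Table 2, MPHSD": (0.0006, 0.0001, (0.0004, 0.0008)),
}

if __name__ == "__main__":
    for label, (estimate, se, interval) in reported.items():
        print(label, wald_interval(estimate, se), "reported:", interval)
```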
Figure 4.
 
Histograms (top) of observed noise in the longitudinal data and noise generated from the hyperbolic and normal models using cross-sectional parameters. Quantile plots (bottom) show the fit of the generated distributions to the data.
Figure 5.
 
Histograms of observed longitudinal noise and noise generated from the normal and hyperbolic models for poor, acceptable, and good MPHSD. P values are from the Kolmogorov-Smirnov goodness-of-fit test, for which the null hypothesis states that the generated distribution fits the observed data.
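A Kolmogorov-Smirnov comparison of observed and generated noise can be sketched as below. This is a two-sample version with an asymptotic P value that uses a commonly applied small-sample adjustment; whether the study used a one-sample or two-sample form is not stated here, so treat this as an illustration rather than a reproduction.

```python
import math


def ks_statistic(sample_a, sample_b):
    """Largest gap between the two empirical cumulative distribution functions."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both empirical CDFs past every value equal to x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_p_value(d, n_a, n_b, terms=100):
    """Asymptotic P value for the null hypothesis that both samples share a distribution."""
    n_eff = n_a * n_b / (n_a + n_b)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.1:
        return 1.0  # the series converges slowly here; the true value is essentially 1
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


if __name__ == "__main__":
    # Hypothetical values standing in for observed and model-generated noise.
    observed = [-0.06, -0.02, -0.01, 0.0, 0.01, 0.03, 0.09]
    generated = [-0.05, -0.03, -0.01, 0.0, 0.02, 0.02, 0.07]
    d = ks_statistic(observed, generated)
    print(d, ks_p_value(d, len(observed), len(generated)))
```

A large P value means the test finds no evidence against the generated distribution fitting the observed data; it does not prove that the fit is correct.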
Figure 6.
 
Results from simulation experiments giving the cumulative percentage of 1000 virtual patients with glaucoma identified as disease progressing over follow-up for various rates of progression. Three different rates of change in RA are assumed: (A) zero rate of loss (stable patients), (B) median rate of loss, and (C) upper quartile of loss. Detection rates are shown separately for images of good quality (I), acceptable quality (II), and unacceptable quality (III).
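The kind of experiment summarized in this figure can be sketched as follows. Several elements are assumptions made for illustration and are not the study's settings: Laplace noise stands in for the fitted hyperbolic model, progression is flagged when a least-squares slope falls below a fixed threshold on its t ratio, and the baseline RA, noise scale, and rates of loss are hypothetical.

```python
import math
import random

BASELINE_RA = 1.2  # hypothetical baseline rim area (mm^2)


def laplace_noise(rng, scale):
    """Heavy-tailed stand-in for the fitted noise model (difference of two exponentials)."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def slope_t_ratio(times, values):
    """Least-squares slope divided by its standard error."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    slope = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values)) / sxx
    intercept = mean_v - slope * mean_t
    rss = sum((v - intercept - slope * t) ** 2 for t, v in zip(times, values))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope / se if se > 0 else 0.0


def cumulative_detection_rate(rate_per_year, noise_scale, years, scans_per_year,
                              n_patients=1000, threshold=-1.645, seed=0):
    """Fraction of virtual patients flagged at any point during follow-up."""
    rng = random.Random(seed)
    n_scans = int(years * scans_per_year) + 1
    times = [k / scans_per_year for k in range(n_scans)]
    flagged = 0
    for _ in range(n_patients):
        series = [BASELINE_RA + rate_per_year * t + laplace_noise(rng, noise_scale)
                  for t in times]
        # Re-test after every new scan once three images are available.
        if any(slope_t_ratio(times[:m], series[:m]) < threshold
               for m in range(3, n_scans + 1)):
            flagged += 1
    return flagged / n_patients


if __name__ == "__main__":
    for label, rate in (("stable", 0.0), ("progressing", -0.02)):
        for scans in (1, 2, 4):
            detected = cumulative_detection_rate(rate, 0.03, 5, scans)
            print(f"{label:12s} {scans} scans/year: {detected:.2f}")
```

Because the series is re-tested after every scan, the rate for stable patients is a cumulative false-positive rate; the fixed threshold is crude for short series, so the stable-patient figures from this sketch overstate what a properly calibrated test would give.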
1. Burgoyne CF. Image analysis of optic nerve disease. Eye. 2004;18:1207–1213.
2. Strouthidis NG, White ET, Owen VMF, Ho TA, Hammond CJ, Garway-Heath DF. Factors affecting the test-retest variability of Heidelberg retina tomograph and Heidelberg retina tomograph II measurements. Br J Ophthalmol. 2005;89:1427–1432.
3. Tan JCH, Garway-Heath DF, Hitchings RA. Variability across the optic nerve head in scanning laser tomography. Br J Ophthalmol. 2003;87:557–559.
4. Wollstein G, Garway-Heath DF, Hitchings RA. Identification of early glaucoma cases with the scanning laser ophthalmoscope. Ophthalmology. 1998;105:1557–1562.
5. Ciulla TA, Regillo CD, Harris A, eds. Retina and Optic Nerve Imaging. Philadelphia: Lippincott Williams & Wilkins; 2003.
6. Varma R, Quigley HA, Pease ME. Changes in optic disk characteristics and the number of nerve fibers in experimental glaucoma. Am J Ophthalmol. 1992;114:554–559.
7. Yucel Y, Gupta N, Kalichman MW, et al. Relationship of optic disc topography to optic nerve fiber number in glaucoma. Arch Ophthalmol. 1998;116:493–497.
8. Chauhan BC, LeBlanc RP, McCormick TA, Rogers JB. Test-retest variability of topographic measurements with confocal scanning laser tomography in patients with glaucoma and control subjects. Am J Ophthalmol. 1994;118:9–15.
9. Zangwill L, Irak I, Berry CC, Garden V, de Souza Lima M, Weinreb RN. Effect of cataract and pupil size on image quality with confocal scanning laser ophthalmoscopy. Arch Ophthalmol. 1997;115:983–990.
10. Strouthidis NG, White ET, Owen VMF, Ho TA, Garway-Heath DF. Improving the repeatability of Heidelberg retina tomograph and Heidelberg retina tomograph II rim area measurements. Br J Ophthalmol. 2005;89:1433–1437.
11. Tan JCH, Garway-Heath DF, Fitzke FW, Hitchings RA. Reasons for rim area variability in scanning laser tomography. Invest Ophthalmol Vis Sci. 2003;44:1126–1131.
12. Iester M, Mikelberg FS, Courtright P, et al. Interobserver variability of optic disk variables measured by confocal scanning laser tomography. Am J Ophthalmol. 2001;132:57–62.
13. Garway-Heath DF, Poinoosawmy D, Wollstein G, et al. Inter- and intraobserver variation in the analysis of optic disc images: comparison of the Heidelberg retina tomograph and computer assisted planimetry. Br J Ophthalmol. 1999;83:664–669.
14. Artes PH, Chauhan BC. Longitudinal changes in the visual field and optic disc in glaucoma. Prog Retin Eye Res. 2005;24:333–354.
15. Advanced Glaucoma Intervention Study, 2: visual field test scoring and reliability. Ophthalmology. 1994;101:1445–1455.
16. Kamal D, Garway-Heath D, Ruben S, et al. Results of the betaxolol versus placebo treatment trial in ocular hypertension. Graefes Arch Clin Exp Ophthalmol. 2003;241:196–203.
17. Operation Manual for the Heidelberg Retina Tomograph (computer program). Versions 1.09–2.01. Heidelberg, Germany: Heidelberg Engineering; 1993–1999.
18. Barndorff-Nielsen O, Blaesild P. Hyperbolic distributions. In: Johnson NL, Kotz S, Read CB, eds. Encyclopedia of Statistical Sciences. Vol 3. New York: Wiley; 1983:700–707.
19. Hammond CJ, Snieder H, Spector TD, Gilbert CI. Genetic and environmental factors in age-related nuclear cataracts in monozygotic and dizygotic twins. N Engl J Med. 2000;342:1786–1790.
20. Burk ROW. How to Read the Printout. Heidelberg, Germany: Heidelberg Engineering; 2000.
21. Goldstein H. Multilevel Statistical Models. 3rd ed. London: Arnold; 2003.
22. Altman DG, Bland JM. Statistics notes: units of analysis (review). BMJ. 1997;314:1874.
23. Rasbash J, Browne W, Goldstein H, et al. A User’s Guide to MLwiN. London: University of London Institute of Education; 2000.
24. R Development Core Team. R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing; 2004.
25. Scott D. The Hyperbolic Distribution (computer program). Version 0.0–1. Auckland, New Zealand: David Scott; 2003.
26. Garway-Heath DF, Wollstein G, Hitchings RA. Aging changes of the optic nerve head in relation to open angle glaucoma. Br J Ophthalmol. 1997;81:840–845.
27. Garway-Heath DF, Hitchings RA. Quantitative evaluation of the optic nerve head in early glaucoma. Br J Ophthalmol. 1998;82:352–361.
28. Kiriyama N, Ando A, Fukui C, et al. A comparison of optic disc topographic parameters in patients with primary open angle glaucoma, normal tension glaucoma, and ocular hypertension. Graefes Arch Clin Exp Ophthalmol. 2003;241:541–545.
29. Zangwill LM, van Horn S, de Souza Lima M, Sample PA, Weinreb RN. Optic nerve head topography in ocular hypertensive eyes using confocal scanning laser ophthalmoscopy. Am J Ophthalmol. 1996;122:520–525.
30. Armitage P, Berry G. Statistical Methods in Medical Research. 3rd ed. Oxford: Blackwell Science; 1994.
31. Barndorff-Nielsen O. Exponentially decreasing distributions for the logarithm of particle size. Proc R Soc Lond A. 1977;353:401–419.
32. Eberlein E, Keller U. Hyperbolic distributions in finance. Bernoulli. 1995;1:281–299.
33. Jonas JB, Fernandez MC, Sturmer J. Pattern of glaucomatous neuroretinal rim loss. Ophthalmology. 1993;100:63–68.
34. Sihota R, Gulati V, Agarwal HC, et al. Variables affecting test-retest variability of Heidelberg Retina Tomograph II stereometric parameters. J Glaucoma. 2002;11:321–328.
35. Chauhan BC, Blanchard JW, Hamilton DC, LeBlanc RP. Technique for detecting serial topographic changes in the optic disc and peripapillary retina using scanning laser tomography. Invest Ophthalmol Vis Sci. 2000;41:775–782.
36. Chauhan BC, McCormick TA, Nicolela MT, LeBlanc RP. Optic disc and visual field changes in a prospective longitudinal study of patients with glaucoma. Arch Ophthalmol. 2001;119:1492–1499.
37. Patterson AJ, Garway-Heath DF, Strouthidis NG, Crabb DP. A new statistical approach for quantifying change in series of retinal and optic nerve head topography images. Invest Ophthalmol Vis Sci. 2005;46:1659–1667.