**Purpose.**:
The aim of this study was to explore how different statistical methods may lead to inconsistent inferences about the association between structure and function in glaucoma.

**Methods.**:
Two datasets from published studies were selected for their illustrative value. The first consisted of measurements of neuroretinal rim area in the superior-temporal sector paired with the corresponding visual field sensitivity. The second consisted of measurements of average retinal nerve fiber layer thickness over all sectors paired with the corresponding visual field sensitivity. Statistical methods included linear and segmented regression, and a nonparametric local-linear fit known as loess. The analyses were repeated with all measurements expressed as percent of mean normal.

**Results.**:
Slopes from linear fits to the data changed by a factor of 10 depending on the linear regression method applied. Inferences about whether structural abnormality precedes functional abnormality varied with the statistical design and the units of measure used.

**Conclusions.**:
The apparent association between structure and function in glaucoma, and consequent interpretation, varies with the statistical method and units of measure. Awareness of the limitations of any statistical analysis is necessary to avoid finding spurious results that ultimately may lead to inadequate clinical recommendations**.**

*R*

^{2}> 0.9), then one could be more confident that both measures are good surrogates for the stage of glaucoma. If the association is weak or moderate (e.g., with

*R*

^{2}< 0.5), then one should be reluctant to accept any statistical model as a good descriptor of association, as large errors are likely to happen when predicting one measure from the other for individual subjects. Moreover, understanding the nature of the association would help clinicians to grade better the severity and stage of the disease. In experimental observations, however, the nature and strength of the association are obscured by sources of variability, including test–retest and intersubject variability, and learning and fatigue effects in perimetry, and artifacts in image acquisition such as misalignments. In addition, anatomical differences between subjects result in errors in pairing structural and functional values using maps of average projections from retinal locations to sectors of the optic disc.

^{ 1–3 }

^{ 2,4–6 }An exception to this general approach was the work by Hood et al., who developed a model based on static automated perimetry and cortical evoked potentials, and tested the ability of the model to describe associations between structure and function in independent datasets.

^{ 7,8 }

^{ 9 }unless such variability is small relative to the ranges of the “true” structural and functional values, so that

*R*

^{2}is close to 1. Bias also would be present if the functional measure were taken as the explanatory variable and all variability attributed to structure, as if linear OLS were performed with the axes swapped. Many other linear regression methods exist,

^{ 10 }some of which acknowledge variability in both axes, but assumptions about the nature of data variability are inescapable and also are likely to lead to bias.

^{ 11 }segmented regression,

^{ 12 }and a type of nonparametric local-linear fit, known as loess,

^{ 13 }which was used to explore linearity between structural and functional methods. The intention of this study was to demonstrate how different fitting methods might lead to inconsistent inferences, rather than seek the most appropriate method of analysis, a much more arduous task, if not impossible.

^{ 6 }(the Hamilton dataset) and by researchers at the Department of Ophthalmology of the University of Pittsburg School of Medicine

^{ 14 }(the Pittsburgh dataset). Both reports stated that the studies followed the principles of the Declaration of Helsinki and were in compliance with the Health Insurance Portability and Accountability Act.

^{ 6 }consists of measurements of neuroretinal rim area in the six sectors of the optic disc and corresponding visual sensitivities expressed in linear (1/ Lambert) units.

^{ 2 }Data for the superior-temporal sector of the optic disc (45°–90°) were selected to allow direct comparisons with the original work (see Fig. 2 in the study of Racette et al.

^{ 6 }). Measurements were taken for 91 healthy eyes, 77 eyes at higher risk for development of glaucoma due to ocular hypertension, 125 eyes with suspected glaucoma, and 92 eyes with definite glaucoma. All subjects were part of the Diagnostic Innovations in Glaucoma Study (DIGS).

**Figure 1**

**Figure 1**

**Figure 2**

**Figure 2**

^{ 14 }consists of measurements of average RNFL thickness and relative visual field sensitivity.

^{ 7 }Measurements were taken for 72 eyes free of disease and 40 eyes with glaucoma. The data points were digitized from the graph in the top panel of Figure 1 from the report of Wollstein et al.

^{ 14 }Six points corresponding to eyes free of disease could not be digitized, due to overlapping, so the sample size of eyes free of disease used here was 66.

^{ 6 }followed the approach of Garway-Heath et al.,

^{ 2 }and used linear and polynomial regression to ask whether a linear or a decibel (dB) scale for visual sensitivities described better its relationship with rim area. Wollstein et al. asked a different question and used segmented regression to find the RNFL thickness value below which visual field damage should be detectable.

^{ 14 }Given their different research questions, it is reasonable that they used different statistical methods. Equally here, different methods were applied to each dataset.

^{ 6 }The second was a linear OLS fit of rim area on visual sensitivity, as if the axes were swapped. The third was a linear SMA fit, which accounts for the fact that rim area and visual sensitivity are subject to measurement errors and intersubject differences. The key assumption in the linear SMA fit is that the ratio from the variance of measurement errors and intersubject differences in structure to that in function equals the ratio between variances of the observed structural and functional measures. In other words, SMA assumes that the signal-to-noise ratios in structure and function are equal. A review of OLS, SMA, and other regression methods, and their fundamental differences, can be found elsewhere.

^{ 11 }The result for SMA does not change when the axes are swapped.

^{ 14 }The second was a broken-line fit of RNFL thickness on relative visual field sensitivity, as if the axes were swapped. The third and fourth were as the first and second fits, but after a change in the units of measure. Sensitivity was converted first to linear units,

^{ 7 }and then RNFL thickness and sensitivity values were divided by the average over the 66 eyes free of disease and multiplied by 100 to get percent of mean normal.

^{ 13 }also was obtained for both datasets after transforming all data to percent normal in the linear scale. The loess fit consists of estimating at each possible value of the explanatory variable the median value of the response by performing weighted least-squares linear regression locally. Locally means that only a portion of the range of possible values around the estimation point is used. The loess fit is determined by the specification of the smoothing parameter that determines which points are to be used in each local linear fit and their corresponding weights. Its value in each fit was selected manually following the guidelines of Cleveland.

^{ 15 }

^{ 16,17 }for automatic selection of the smoothing parameter. The fits obtained for automatic selection (not shown here) were less smooth than for manual selection, but the same overall patterns were found.

^{ 6 }The three linear fits agree qualitatively in that visual sensitivity increases, on average, with rim area. Nevertheless, the estimated slopes are very different from each other. The slope for the linear OLS fit of sensitivity on rim area (Figs. 1a–d, solid lines) was 360 μm

^{2}per 1/Lambert (significantly different from 0 with

*P*< 0.0001).

^{2}per 1/Lambert (

*P*< 0.0001); 10 times smaller. The slope for the SMA fit (Fig. 1d, dotted line) was 110 μm

^{2}per 1/Lambert. The slope estimated with SMA fit was found to be significantly different from slopes estimated with OLS fit of sensitivity on rim area (

*P*< 0.0001) and of rim area on sensitivity (

*P*< 0.0001), with the “common slopes test” for SMA.

^{ 18 }Therefore, from the same dataset, three statistically different apparent associations were obtained from three different methods.

*R*

^{2}= 0.09; well below the conventional cutoff for strong correlation,

*R*

^{2}= 0.50. With such low correlation, when the linear model explains less than half the variation in the data, predictions for a particular instance are likely to be severely biased. The three fits agree in the qualitative assessment that, on average, rim area increases with visual sensitivity. Therefore, structure and function, indeed, seem to be associated, even if the association is very weak. If there were no association, then the estimated slope in the OLS fits would not be significantly different from zero. Thus, the solid line in Figure 1d would be horizontal and the dashed line would be vertical. The SMA fitted line in Figure 1d is between the two OLS fits. By construction, and in contrast to OLS fits, slopes estimated with SMA are systematically different from zero even when there is no association. Therefore, SMA is not an appropriate method for assessing associations between structure and function.

^{ 14 }Six data points were lost in the digitization process. Details for Figures 2a to 2c are the same as those in Figure 1, but for a different dataset and statistical model (the broken-line model as opposed to the linear model). Thus, the solid lines in Figures 2a to 2c represent the broken-line fit of sensitivity on thickness, and the dashed line in Figure 2c represents the broken-line fit of thickness on sensitivity obtained after the axes were swapped. In Figure 2d the broken-line fits for thickness on sensitivity (dashed line) and sensitivity on thickness (solid line) were calculated after data were converted to percent of mean normal (see Methods for the broken-line analysis for details).

^{ 14 }The first question was whether the slope before the breakpoint was equal to zero. The second question was whether the slope after was equal to zero. The third question was whether the slopes before and after the breakpoint were equal to each other. For the last hypothesis the Davies' test

^{ 19 }was used. The Table shows the RNFL thickness for the breakpoint obtained from each broken-line fit in Figure 2, along with the corresponding answers to each of the three questions about slopes.

**Table.**

**Table.**

Breakpoint in μm | Slope Before = 0? | Slope After = 0? | Before = After? | |

Sensitivity on thickness* | 75 | No | Yes | No |

Thickness on sensitivity | 61 | Yes | No | Yes |

Sensitivity on thickness in % normal† | 77 | No | Yes | Yes |

Thickness on sensitivity in % normal | 81 | No | No | Yes |

^{ 14 }the fourth was consistent with a linear association between structure and function.

^{ 2 }

^{ 13 }fits were used to examine visually whether or not there was an apparent change in the association between observations of structural and functional measures. Figure 3 shows loess fits for the Hamilton and Pittsburgh datasets.

**Figure 3**

**Figure 3**

**Figure 4**

**Figure 4**

^{ 9 }and biology,

^{ 11,18 }they often are not recognized in studies of association between structure and function in glaucoma.

^{ 20 }(or a weighted version), may be more appropriate if the nature of the relationship between measurement errors and intersubject differences in structure and function becomes better understood.

^{ 10 }Among others, Passing–Bablok rank-order regression

^{ 21–23 }has been used widely in the clinical domain. Unlike the methods discussed here, Passing-Bablok regression does not require specific assumptions regarding the distribution of measurement error. Furthermore, the estimation of the line parameters is robust to outliers.

^{ 10 }Nevertheless, since the Gaussian-distribution assumptions made by classic regression methods have been shown to have little effect on the estimation of parameters,

^{ 24 }it does not seem to be always advantageous to use Passing–Bablok regression over, for example, SMA or Deming regression.

^{ 20 }Readers are kindly referred to the reports by Ripley and Thompson,

^{ 9 }and other investigators

^{ 10,11,20,24,25 }in which linear fitting procedures for bivariate data and their potential pitfalls are presented in a more detailed and systematic manner. Yet, the problem remains that the true association may not be well described by a linear model.

^{ 6 }

*IOVS*2012;53:ARVO E-Abstract 4629), for instance, have been used to extract the “latent information” shared among different indices for glaucoma (Wollstein, et al.

*IOVS*2012;53:ARVO E-Abstract 218). This approach assumes that there is an unknown, unobservable variable that quantifies the “true” stage of the disease and that can be reconstructed from the observation of the imperfect indices for glaucoma. Other approaches have used radial basis functions under a Bayesian framework,

^{ 26,27 }multiple linear regression based on principal component analysis

^{ 28 }to estimate sensitivity from structural measures at each location of the visual field, and multilayer artificial neural networks

^{ 29 }to integrate structure and function information also at each location of the visual field. These methods, however promising, are subject to other challenges, such as overfitting,

^{ 26 }which are yet to be assessed.

^{ 30 }Awareness of the limitations of any statistical method is necessary to avoid being misled by spurious apparent associations between structure and function in glaucoma.

**I. Marín-Franch**, None;

**R. Malik**, None;

**D.P. Crabb**, None;

**W.H. Swanson**, None

*. 2000; 107: 1809–1815.*

*Amer Acad Ophthalmol**. 2002; 43: 2213–2220. [PubMed]*

*Invest Ophthalmol Vis Sci**. 2012; 105: 70–78. [CrossRef] [PubMed]*

*Exp Eye Res**. 1985; 103: 205–207. [CrossRef] [PubMed]*

*Arch Ophthalmol**. 2007; 48: 763–773. [CrossRef] [PubMed]*

*Invest Ophthalmol Vis Sci**. 2007; 16: 676–684. [CrossRef] [PubMed]*

*J Glaucoma**. 2007; 26: 688–710. [CrossRef] [PubMed]*

*Prog Retin Eye Res**. 2009; 50: 4254–4266. [CrossRef] [PubMed]*

*Invest Ophthalmol Vis Sci**. 1987; 112: 377–383. [CrossRef]*

*Analyst**. 2010; 37: 692–699. [CrossRef] [PubMed]*

*Clin Exp Pharmacol Physiol**. 2006; 81: 259–291. [CrossRef] [PubMed]*

*Biol Rev Camb Philos Soc**. 2003; 22: 3055–3071. [CrossRef] [PubMed]*

*Stat Med**. 1979; 74: 829–836. [CrossRef]*

*J Amer Stat Assoc**. 2011; 96: 47–52. [CrossRef]*

*Brit J Ophthalmol**. Summit, NJ: Hobart Press; 1985: chap 3.*

*The Elements of Graphing Data**. 2002; 83: 265–287. [CrossRef]*

*J Multiv Anal**. 2002; 44: 161–174. [CrossRef]*

*Biometric J**. 1987; 74: 33–43.*

*Biometrika**. 1993; 39: 424–432. [PubMed]*

*Clin Chem**. 1983; 21: 709–720. [PubMed]*

*J Clin Chem Clin Biochem**. 1984; 22: 431–445. [PubMed]*

*J Clin Chem Clin Biochem**. 1988; 26: 783–790. [PubMed]*

*J Clin Chem Clin Biochem**. 1986; 2: 75–106. [CrossRef]*

*Econ Theory**. 2000; 46: 100–104. [PubMed]*

*Clin Chem**. 2010; 51: 5657–5666. [CrossRef] [PubMed]*

*Invest Ophthalmol Vis Sci**. 2011; 129: 1167–1174. [CrossRef] [PubMed]*

*Arch Ophthalmol**. 2011; 2: 1734–1742. [CrossRef] [PubMed]*

*Biomed Opt Express**. 2011; 11: 20. [CrossRef] [PubMed]*

*BMC Ophthalmology**. 2012; 40: 369–380. [CrossRef]*

*Clin Exp Ophthalmol*