**Purpose.**
To investigate why the specificity of the Moorfields Regression Analysis (MRA) of the Heidelberg Retina Tomograph (HRT) varies with disc size, and to derive accurate normative limits for neuroretinal rim area to address this problem.

**Methods.**
Two datasets from healthy subjects (Manchester, UK, *n* = 88; Halifax, Nova Scotia, Canada, *n* = 76) were used to investigate the physiological relationship between optic disc size and neuroretinal rim area. Normative limits for rim area were derived by quantile regression (QR) and compared with those of the MRA (derived by linear regression). Logistic regression analyses were performed to quantify the association between disc size and positive classifications with the MRA, as well as with the QR-derived normative limits.

**Results.**
In both datasets, the specificity of the MRA depended on optic disc size. The odds of observing a borderline or outside-normal-limits classification increased by ∼10% for each 0.1 mm^{2} increase in disc area (*P* < 0.1). The lower specificity of the MRA with large optic discs could be explained by the failure of linear regression to model the extremes of the rim area distribution (observations far from the mean). In comparison, the normative limits predicted by QR were larger for smaller discs (less specific, more sensitive), and smaller for larger discs, such that false-positive rates became independent of optic disc size.

**Conclusions.**
Normative limits derived by quantile regression appear to remove the size-dependence of specificity with the MRA. Because quantile regression does not rely on the restrictive assumptions of standard linear regression, it may be a more appropriate method for establishing normative limits in other clinical applications where the underlying distributions are nonnormal or have nonconstant variance.

^{1–3} it is also widely used to help detect optic discs that appear suspicious for glaucomatous damage (diagnosis).^{4,5} This application is particularly important in primary care or screening settings where expert optic disc assessment is not readily available.^{6,7} Owing to the large variation in optic disc appearance in healthy subjects and patients with glaucoma,^{8–10} recognition of early optic disc damage is not a trivial task.

^{11} This analysis is based on the well-established relationship between optic disc size and rim area: Large discs tend to have larger rim areas than small discs.^{9,12,13} By comparing the measured area of neuroretinal rim to normative limits established in a group of healthy subjects, globally as well as in six separate sectors, the MRA classifies discs as “within normal limits,” “borderline,” or “outside normal limits.” The principles of the MRA are intuitive and transparent, and its performance compares well to that of expert observers^{14,15} and other, more complex statistical analyses such as linear discriminant functions or machine-learning classifiers.^{16–18}

^{6,15,16,19–23} In healthy eyes, large optic discs are more often classified as borderline or outside normal limits than are medium-sized or small optic discs.^{20–23} Conversely, in eyes with glaucoma, the MRA is less sensitive in small discs than in large ones.^{6,16,19,22} Since the loss of neuroretinal tissue may reveal itself less readily in small discs, the lower sensitivity of the MRA in such discs is understandable. However, as the analysis has been specifically designed to accommodate the variation of optic disc size in healthy subjects, its failure to provide uniform specificity is disappointing. This finding is particularly troubling because clinicians may show a similar bias in overcalling glaucomatous damage in large but healthy discs.^{10,24}

^{25} Although a logarithmic transformation of rim area was performed, it did not completely equalize the variation of the data across the large spectrum of optic disc sizes.^{11} It is likely, therefore, that the unequal specificity of the MRA is, at least in part, caused by a failure of OLS regression to accurately predict the lower limits of neuroretinal rim area in healthy subjects.^{20}
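The mechanism described above can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not the MRA itself: the coefficients, the uniform range of the predictor, and the noise model (scatter proportional to the predictor) are all assumptions chosen for illustration, and the helper names `ols_fit` and `false_positive_rates` are hypothetical. A lower limit is placed at a constant distance (1.645 residual SDs, nominally 5%) below an OLS line, and the fraction of "healthy" points flagged is compared between small and large values of the predictor.

```python
import random
import statistics


def ols_fit(xs, ys):
    """Ordinary least-squares intercept and slope for a single predictor."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def false_positive_rates(n=20000, seed=0, z=1.645):
    """Flag rates below a constant-width OLS limit for small and large x.

    Synthetic data only: the scatter grows with x (heteroscedastic),
    which is the assumption that drives the effect.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(1.0, 3.0) for _ in range(n)]
    ys = [1.0 + 0.3 * x + rng.gauss(0.0, 0.1 * x) for x in xs]

    a, b = ols_fit(xs, ys)
    # One pooled residual SD is used for every x, as in a constant-width limit.
    resid_sd = statistics.pstdev([y - (a + b * x) for x, y in zip(xs, ys)])

    def rate(keep):
        selected = [(x, y) for x, y in zip(xs, ys) if keep(x)]
        flagged = sum(y < a + b * x - z * resid_sd for x, y in selected)
        return flagged / len(selected)

    return rate(lambda x: x < 2.0), rate(lambda x: x >= 2.0)


small_rate, large_rate = false_positive_rates()
print(f"flagged among small x: {small_rate:.3f}, among large x: {large_rate:.3f}")
```

Under these assumptions, the limit is too lenient where the scatter is small and too strict where it is large, so the flag rate falls below the nominal 5% for small values and exceeds it for large values.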

^{26–29} These methods have been designed for situations where the extremes of a distribution change at different rates from that of the average. Since these methods can directly estimate the extremes (quantiles) of a distribution, they may be particularly appropriate for deriving normative reference limits (e.g., the fifth percentile) with diagnostic tests. In this article, we show that quantile regression provides a somewhat simpler yet more accurate approach (compared to OLS regression) for modeling the relationship of optic disc size and the lower limits of rim area in healthy subjects. We demonstrate that normative limits for rim area derived by quantile regression remove the size-dependent variation in specificity reported with the MRA. Finally, we briefly discuss other potential applications of quantile regression techniques in ophthalmology.

^{30} the other at the QEII Eye Care Centre of Queen Elizabeth II Health Sciences Centre in Halifax (Nova Scotia, Canada).^{16} The two datasets are briefly described in the next sections, and the demographics are shown in Table 1. In both studies, optic disc images from healthy subjects and patients with glaucoma were obtained with the HRT 1 and subsequently converted to HRT 3 format. Contour lines had been drawn by experienced clinicians and were all reviewed by one of the authors (PHA) at the outset of the study.

| | Manchester, Controls | Manchester, Glaucoma | Halifax, Controls | Halifax, Glaucoma |
|---|---|---|---|---|
| n | 88 | 146 | 76 | 106 |
| Age, y, mean ± SD | 59.3 ± 11.1 | 70.2 ± 9.7 | 57.7 ± 12.3 | 68.1 ± 10.7 |
| Optic disc size, mm^{2}, mean ± SD | 1.83 ± 0.46 | 2.04 ± 0.43 | 1.94 ± 0.45 | 2.13 ± 0.42 |
| Optic disc size, mm^{2} (min, max) | (1.05, 3.24) | (1.19, 3.37) | (0.95, 2.87) | (1.42, 3.71) |
| Visual field MD, dB | −0.4 ± 1.1 | −6.1 ± 6.1 | −1.1 ± 1.6 | −4.8 ± 3.9 |

*P* > 10%).

*P* < 5%). If both eyes were eligible, one eye was randomly selected as the study eye.

*r*th quantile, then they are taller than the proportion *r* of the reference group of infants and shorter than the proportion 1 − *r*. The median (50th percentile) and the lower (25th) and upper (75th) quartiles are specific quantiles. The 95% reference interval commonly used for measurements in medicine captures the data between the 2.5th and 97.5th percentiles. Often a measurement varies with age, as in infant growth, and the challenge is to estimate the 95% reference intervals at specific ages. The problem then becomes one of regression, where limits around the average (mean) relationship between, say, the infant's height and age are captured. However, just as a mean often gives an incomplete picture of a single distribution of numbers (it indicates only the center of the distribution), so a single OLS regression line gives only an incomplete picture of the relationship between two variables. The alternative is quantile regression: Just as classic linear regression methods based on minimizing sums of squared residuals enable one to estimate models for conditional mean functions, regression quantile methods offer a mechanism for estimating models for the conditional median function and the full range of other conditional quantile functions. Estimation is based on a weighted sum of absolute values of residuals, with weights that depend on the quantile being estimated. In short, we can generate a regression line at any point in the distribution of values, and this is what we sought to do with rim area against optic disc size in this work. There are variants on how to fit the lines, but we adopted the standard method implemented in the quantreg package^{31,32} for the open-source statistical programming environment R.^{33,34} For more technical detail, the statistically minded reader is directed to Koenker.^{27} Those who are curious but require a more accessible description of the methods are referred to a well-written paper by Cade and Noon.^{29}
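The weighted sum of absolute residuals described above can be sketched in a few lines. The study itself used the quantreg package in R; the Python below is only an illustrative stand-in, and the function names `pinball_loss` and `fit_quantile_line` are hypothetical. It relies on a known property of linear quantile regression with an intercept and one predictor: an optimal line can always be chosen to pass through two of the data points, so an exhaustive search over pairs is exact, although far too slow for anything but small samples.

```python
from itertools import combinations


def pinball_loss(residual, tau):
    """Check (pinball) loss: weight tau above the line, 1 - tau below it."""
    return residual * tau if residual >= 0 else residual * (tau - 1.0)


def fit_quantile_line(xs, ys, tau):
    """Return (intercept, slope) minimizing the total pinball loss.

    Brute force over all pairs of points with distinct x; this is exact
    for a straight line but costs O(n^3), so it is a teaching sketch only.
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical line is not a valid candidate
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        loss = sum(
            pinball_loss(y - (intercept + slope * x), tau)
            for x, y in zip(xs, ys)
        )
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    if best is None:
        raise ValueError("need at least two points with distinct x values")
    return best[1], best[2]


# Toy data (made up): four points on y = x plus one extreme value.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 1.0, 2.0, 3.0, 100.0]
print(fit_quantile_line(xs, ys, tau=0.5))
```

With `tau=0.5` the loss reduces to half the sum of absolute residuals, so the fitted median line follows the four points on the line and is not pulled toward the extreme value, unlike a least-squares fit. Setting `tau` to a small value such as 0.1 or 0.02 yields a lower limit instead of a central line.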

Logistic regression analyses^{35} were performed separately in healthy subjects and patients with glaucoma. These analyses established how the odds [*p*/(1 − *p*)] of observing a positive test (MRA classification of borderline or outside normal limits) vary with a change in the predictor variables (disc area, in units of 0.1 mm^{2}, and age).

^{ 33,34 }
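Because odds ratios are multiplicative, an effect that looks small per 0.1 mm^{2} compounds over clinically relevant differences in disc area. The sketch below works through that arithmetic for an odds ratio of 1.10 per 0.1 mm^{2}, the approximate value reported for the MRA in healthy subjects. The baseline probability of 0.10 is a hypothetical value chosen only for illustration, and the function names are not from the study.

```python
def odds(p):
    """Convert a probability to odds, p / (1 - p)."""
    return p / (1.0 - p)


def prob(o):
    """Convert odds back to a probability."""
    return o / (1.0 + o)


def scaled_odds_ratio(or_per_unit, n_units):
    """Odds ratio across n_units steps of the predictor.

    Assumes the logistic model's log-linear form, so per-unit odds
    ratios multiply.
    """
    return or_per_unit ** n_units


def shifted_probability(p_base, or_per_unit, n_units):
    """Probability after moving n_units along the predictor from p_base."""
    return prob(odds(p_base) * scaled_odds_ratio(or_per_unit, n_units))


# A 1.0 mm^2 difference in disc area is ten steps of 0.1 mm^2.
print(round(scaled_odds_ratio(1.10, 10), 2))          # about 2.59
# Hypothetical baseline: 10% positive rate at the smaller disc size.
print(round(shifted_probability(0.10, 1.10, 10), 3))  # about 0.224
```

Under these assumptions, a disc 1 mm^{2} larger carries roughly 2.6 times the odds of a positive classification, which is why an odds ratio close to 1 per 0.1 mm^{2} can still matter in practice.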

^{2} increase in disc size, respectively (Table 2). Healthy discs classified as borderline or outside normal limits were significantly larger than those classified as within normal limits (Mann-Whitney, *P* < 0.05).

| MRA Control Subjects | Optic Disc Area (0.1 mm^{2}) | Age (y) |
|---|---|---|
| Manchester, n = 88 | 1.10 (0.99–1.22); P = 0.07 | 1.06 (1.02–1.11); P = 0.006 |
| Halifax, n = 76 | 1.15 (1.02–1.29); P = 0.02 | 1.00 (0.96–1.04); P = 0.89 |
| Combined, n = 164 | 1.10 (1.02–1.18); P = 0.01 | 1.03 (1.00–1.06); P = 0.04 |

^{11} However, Figure 1 also shows that the log transform of rim area did not equalize the scatter around the regression line. The increase in the variance was statistically significant in both datasets (*P* < 0.05, Levene's test).

^{27} the logarithmic transform of rim area is no longer needed, simplifying the analyses. Figure 4 shows the 10% and 2% limits derived by quantile regression of rim area against disc size in the combined dataset.

^{2}, and for larger rim areas in discs smaller than that. These effects will translate into a more conservative classification (more specific, less sensitive) for large discs and the opposite for small discs.

*P* = 0.61).

| QRA Healthy Subjects | Optic Disc Area (0.1 mm^{2}) | Age (y) |
|---|---|---|
| Manchester, n = 88 | 1.00 (0.99–1.01); P = 0.39 | 1.07 (1.02–1.11); P = 0.003 |
| Halifax, n = 76 | 1.00 (0.99–1.01); P = 0.97 | 1.01 (0.97–1.06); P = 0.51 |
| Combined, n = 164 | 1.00 (0.99–1.01); P = 0.81 | 1.04 (1.01–1.07); P = 0.01 |

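| QRA | Group | MRA: Within | MRA: Borderline | MRA: Outside | Total |
|---|---|---|---|---|---|
| Within | Healthy | 103 | 8 | 0 | 111 |
| Within | Glaucoma | 57 | 4 | 0 | 61 |
| Borderline | Healthy | 6 | 31 | 4 | 41 |
| Borderline | Glaucoma | 1 | 35 | 9 | 45 |
| Outside | Healthy | 0 | 0 | 12 | 12 |
| Outside | Glaucoma | 0 | 6 | 140 | 146 |
| Total | Healthy | 109 | 39 | 16 | 164 |
| Total | Glaucoma | 58 | 45 | 149 | 252 |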
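The cross-classification above can be summarized by counting how often the two analyses agree and in which direction they differ. The sketch below does this from the cell counts. It assumes that rows are QRA classes and columns are MRA classes, and that the first row of each pair (total 164) refers to healthy subjects and the second (total 252) to patients with glaucoma; the function name `compare` is hypothetical.

```python
# Rows: QRA class; columns: MRA class (within, borderline, outside).
HEALTHY = [
    [103, 8, 0],
    [6, 31, 4],
    [0, 0, 12],
]
GLAUCOMA = [
    [57, 4, 0],
    [1, 35, 9],
    [0, 6, 140],
]


def compare(table):
    """Count agreement and the direction of disagreement in a 3x3 table."""
    cells = [(i, j, table[i][j]) for i in range(3) for j in range(3)]
    return {
        "n": sum(count for _, _, count in cells),
        "same": sum(count for i, j, count in cells if i == j),
        # Column index above row index: the MRA gives the more positive class.
        "mra_more_positive": sum(count for i, j, count in cells if j > i),
        # Row index above column index: the QRA gives the more positive class.
        "qra_more_positive": sum(count for i, j, count in cells if i > j),
    }


for name, table in (("healthy", HEALTHY), ("glaucoma", GLAUCOMA)):
    print(name, compare(table))
```

Read this way, the healthy group contains 146 identical classifications out of 164, 12 eyes with a more positive MRA result, and 6 with a more positive QRA result.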

^{2}, SD 0.40; *P* < 0.001, Wilcoxon) than those whose classification did not change (mean, 1.85; SD 0.44 mm^{2}). The six healthy subjects (4%) in whom the QRA gave a more positive result than the MRA had smaller discs (mean differences, 0.23 and 0.28 mm^{2}, NS).

| | MRA | QRA |
|---|---|---|
| Manchester, n = 146 | 1.16 (1.06–1.26); P = 0.002 | 1.12 (1.02–1.21); P = 0.01 |
| Halifax, n = 106 | 1.20 (1.08–1.34); P < 0.001 | 1.17 (1.05–1.30); P = 0.003 |
| Combined, n = 252 | 1.16 (1.08–1.24); P < 0.001 | 1.13 (1.06–1.20); P < 0.001 |

*n* = 88 and 76), we were unable to estimate robust normative limits without pooling the samples. However, the increased variance of log-transformed neuroretinal rim area in larger optic discs was similarly evident in both independent datasets (Figs. 1, 2) and has also been reported by other groups,^{20} and we therefore believe that quantile regression is the correct approach. However, the true specificity of the quantile regression limits, for any given nominal value, should be estimated from a larger population-based dataset, independent of those used in the present study; such data are available.^{6}

^{22} and with expert classification by clinicians.^{10,24}

^{4} Nevertheless, they should be based on the most appropriate statistical methods to make it easier to interpret the findings. With the MRA, the false-positive rates reported in empiric studies have often been much larger than expected from the nominal designations (5% and 0.1% for borderline and outside normal limits, respectively). As the comparison to normal limits is performed seven times (globally, as well as in six optic disc sectors), it is clear that the overall false-positive rate must be somewhat higher than the nominal value. However, in our study, the positive rates of the MRA were matched, approximately, by the 10% and 2% limits of quantile regression. In part, this discrepancy may be due to systematic differences between the datasets in our study and those used by the manufacturers of the instrument, but it is also not unlikely that the distribution of rim area (or log rim area) has longer tails than a normal distribution.^{36} If this were the case, one would expect the quantile regression normative limits to provide false-positive rates closer to their nominal designations. In the absence of a large independent dataset we were unable to confirm this hypothesis; this remains an objective of future research.
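The effect of performing the comparison seven times can be bracketed with simple arithmetic. The sketch below computes the overall false-positive rate under two limiting assumptions that are not taken from the study: fully independent comparisons, and the Bonferroni upper bound that holds for any dependence. The global and sectoral rim areas of one disc are certainly correlated, so the true rate should lie between the nominal value and the upper bound. The function names are hypothetical.

```python
def familywise_rate_independent(alpha, k):
    """Chance of at least one false positive in k independent comparisons."""
    return 1.0 - (1.0 - alpha) ** k


def familywise_rate_upper_bound(alpha, k):
    """Bonferroni bound: valid whatever the dependence between comparisons."""
    return min(1.0, alpha * k)


K = 7  # one global comparison plus six sectors
for label, alpha in (("borderline", 0.05), ("outside normal limits", 0.001)):
    independent = familywise_rate_independent(alpha, K)
    bound = familywise_rate_upper_bound(alpha, K)
    print(f"{label}: nominal {alpha:.3f}, "
          f"independent {independent:.3f}, upper bound {bound:.3f}")
```

For the 5% limit this gives about 0.30 under independence and 0.35 as the upper bound; for the 0.1% limit both are about 0.007. The nominal per-comparison value is therefore only a lower limit on the overall rate.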

^{37} The classic application for quantile regression in medicine, for example, is the relationship between age and height or weight in children as modeled by growth curves.^{38} This is one example of an application where the extremes of a distribution change at rates different from the average, leading to changes in the shape and/or the variance. In vision science, similar phenomena are likely to occur with psychophysical tests that exhibit ageing effects (for example in perimetry).^{39–41} If ageing changes were to vary between subjects, as seems likely, this would lead to a greater spread with increasing age. Quantile regression techniques may prove advantageous over other techniques for estimating limits of normality in these situations. We present this article to encourage the use of these methods, especially since the software to implement them is freely available and easy to use.