**Purpose.**
Classical regression is based on certain assumptions that conflict with the nature of visual field (VF) data. We investigated and evaluated different regression models and their assumptions in order to determine point-wise VF progression in glaucoma and to better predict future field loss for personalized clinical glaucoma management.

**Methods.**
Standard automated perimetry visual fields of 130 patients with primary glaucoma and a minimum of 6 years of follow-up were included. Sensitivity estimates at each VF location were regressed on time with classical linear and exponential regression models, as well as with variants of these models that take censoring into account and allow for robust fits. These models were compared for best fit and for predictive ability. Prediction was evaluated six measurements (approximately 3 years) ahead, using varying numbers of preceding measurements.

**Results.**
For fitting the data, the classical uncensored linear regression model had the lowest root mean square error and the lowest 95th percentile of the absolute errors. For prediction, these errors decreased in all models as the number of measurements used increased, with the classical uncensored linear regression model having the lowest errors irrespective of how many measurements were included.

**Conclusions.**
All models performed similarly. Despite violation of its assumptions, the classical uncensored linear regression model appeared to provide the best fit for our data. In addition, this model appeared to perform best when predicting future VFs. However, more advanced regression models that exploit the temporal–spatial relationships of glaucomatous progression are needed to reduce prediction errors to clinically meaningful levels.

Evaluation of a longitudinal series of visual fields (VFs), as measured by standard automated perimetry, provides a method to detect early evidence of glaucoma and to determine functional deterioration.^{2} Hence, treatment strategies can then be optimized to prevent further progression of VF loss. Modeling the series of VFs is beneficial for this evaluation: it aims to extract the underlying VF progression from the observed measurements, to distinguish it from noise, and can be used to predict future VFs.^{3–5}

Modeling of individual VF points is potentially of greater interest than modeling global indices, because it preserves additional information, such as the spatial nature of the field, that is otherwise lost in global parameters.^{6}

Modeling the progression of glaucoma in individual test locations may be of particular interest for determining patterns of progression. For example, more centrally located progression would arguably be of greater significance to the patient than more peripherally located progression. In addition, if progression took place in a location of the VF that was also affected in the fellow eye, this local progression would affect the patient's quality of life more than progression elsewhere in the field, where no such overlapping defects existed. Determining and predicting the pattern and location of progression would therefore help clinicians tailor the management of glaucoma in their patients.

Caprioli et al.^{7} discussed measuring the rate of VF progression in glaucoma for each VF location independently. They explored regression analysis of the sensitivity estimates against time with three models: linear, quadratic, and nonlinear exponential.^{7} There were, however, some aspects of VF data that were not taken into account in these models. First, the maximum luminance of the stimuli of the Humphrey Field Analyzer (Carl Zeiss Meditec, Dublin, CA) perimeter is 10,000 apostilbs, which is defined as 0 dB retinal sensitivity. The lowest sensitivity that can be measured by this perimeter is therefore 0 dB, although negative values could in fact occur if it were not for the limitations of this device.^{8} It is therefore of interest to use a model that takes censoring at 0 dB into account, as suggested by Russell and Crabb.^{9} We think that this is especially true for locations showing more advanced disease progression, with high sensitivity estimates in the early follow-up period but a relatively large number of 0 dB sensitivity estimates toward the end of the follow-up period. The Tobit model estimates the relationship between variables when there is either left- or right-censoring (below- or above-censoring) in the dependent variable.^{10}

Another aspect of regression analysis that deserves attention is that classical least-squares fitting is based on certain assumptions about the measurement error, which may in fact be violated. For example, measurement error in VFs increases with damage; hence, low sensitivity estimates have high variability.^{11} However, values lower than zero cannot be measured. This inherent censoring introduces a positive bias at low sensitivity estimates, which is made worse by their increased variability.

Another problem in regression analysis is that of outliers: points that do not follow the general trend. As in most practical problems, VF data may contain several outliers. A single outlier may be easily identified by diagnostic tests, but it is generally more difficult to identify multiple outliers.^{12} Therefore, regression methods that remain effective even in the presence of multiple outliers need to be considered,^{13} such as robust methods.^{14,15}

An earlier comparison of such models^{16} concluded that the best compromise between fitting and predicting sensitivity estimates is offered by the linear model. Since glaucoma is a progressive disease, and the purpose of modeling the progression is to evaluate future field loss in a clinical setting, the predictive ability of these models is of importance. Our aim, therefore, was to introduce censored and robust regression models for the point-specific evolution of VFs and to evaluate these models, as well as their assumptions, for modeling and predicting point-wise glaucomatous VF progression.
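To make the censored-model idea concrete, the following is a minimal sketch of a left-censored (Tobit) linear regression of sensitivity on time, fit by maximum likelihood. The data, the 0 dB floor handling, and all parameter values here are hypothetical illustrations, not the study's actual implementation:

```python
# Sketch of a left-censored (Tobit) linear regression of sensitivity on time,
# fit by maximum likelihood. Hypothetical data; the 0 dB censor limit mirrors
# the perimeter's measurement floor described in the text.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
t = np.arange(12) * 0.5                                   # visit times in years
latent = 6.0 - 1.5 * t                                    # underlying linear decay (dB)
y = np.maximum(latent + rng.normal(0, 2, t.size), 0.0)    # observed, floored at 0 dB
LIMIT = 0.0

def neg_log_lik(params):
    a, b, log_s = params
    s = np.exp(log_s)                                     # enforce sigma > 0
    mu = a + b * t
    cens = y <= LIMIT
    ll = np.sum(norm.logpdf(y[~cens], mu[~cens], s))      # uncensored visits
    ll += np.sum(norm.logcdf((LIMIT - mu[cens]) / s))     # P(latent value <= floor)
    return -ll

res = minimize(neg_log_lik, x0=[y[0], -1.0, np.log(2.0)], method="Nelder-Mead")
a_hat, b_hat = res.x[0], res.x[1]
print(f"censored fit: intercept {a_hat:.2f} dB, slope {b_hat:.2f} dB/y")
```

Unlike an ordinary least-squares fit, the censored observations contribute only the probability that the latent value lies at or below the floor, which reduces the positive bias described above.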

**Table 1**

| | Mean | Median | Interquartile Range |
| --- | --- | --- | --- |
| Baseline age, y | 59.8 | 61.4 | 53.3 to 67.6 |
| Baseline MD, dB | −8.3 | −5.8 | −12.8 to −2.3 |
| Average change in MD, dB/y | −0.07 | −0.04 | −0.2 to 0.1 |
| Follow-up time, y | 7.7 | 7.6 | 7.2 to 8.2 |

**Figure 1**

**Figure 2**

*i* denotes, in our current study, an index that is unique for every VF location in each eye.

Many other robust regression methods exist,^{17} such as M-estimators^{18} and Robust Regression with Ranked Residuals;^{14} however, including all of these models would be beyond the scope of this paper.^{19}
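As an illustration of how a robust fit differs from least squares, here is a small sketch (entirely hypothetical data) of median regression, which minimizes the sum of absolute residuals and is far less sensitive to a gross outlier than the least-squares fit:

```python
# Sketch contrasting least squares with median (least-absolute-deviation)
# regression on a series containing one gross outlier. Hypothetical data.
import numpy as np
from scipy.optimize import minimize

t = np.arange(10, dtype=float)
y = 30.0 - 0.5 * t                           # exact linear decay (dB)
y[4] = 5.0                                   # a single gross outlier

# Ordinary least squares
ols_slope, ols_icpt = np.polyfit(t, y, 1)

# Median regression: minimize the sum of absolute residuals
def sad(params):
    a, b = params
    return np.sum(np.abs(y - (a + b * t)))

res = minimize(sad, x0=[20.0, -1.0], method="Nelder-Mead")
lad_icpt, lad_slope = res.x
print(f"OLS slope {ols_slope:.2f} dB/y, median-regression slope {lad_slope:.2f} dB/y")
```

The least-squares slope is pulled noticeably toward the outlier, whereas the median-regression slope stays close to the trend of the remaining points.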

**Figure 3**

To compare the fit of the models, we used the Akaike information criterion (AIC),^{20} given by AIC = 2*k* − 2ln(*L*), where *k* denotes the number of model parameters and *L* the maximized likelihood. The term −2ln(*L*) measures the model fit to the current data; by adding the model complexity, indicated by 2*k*, complex models are penalized. Generally, the AIC is used to compare nonnested models with the same response, such as the classical uncensored linear and the classical exponential model. However, if the responses differ, one should compare AIC values with caution; in particular, when comparing uncensored and censored models, it is not clear whether the AIC can be used. An alternative to the AIC is leave-one-out cross-validation. We combined the errors (given by the predicted minus the observed sensitivity estimate) from each sensitivity estimate into summary error measures, and compared the root mean square error (RMSE) and the mean absolute error (MAE) for each of the models. We compared both the RMSE and the MAE because, by definition, the classical regression models minimize the RMSE, while the median regression models minimize the MAE. We evaluated only the classical uncensored linear and classical exponential models with the AIC, ranking them by the percentage of best fits in order to compare our results with previous work,^{7} but used leave-one-out cross-validation to calculate the RMSE and MAE for the comparison of all models.
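A minimal sketch of the leave-one-out procedure for a single location's series (hypothetical data, linear model only), summarizing the errors as RMSE, MAE, and the 95th percentile of the absolute errors:

```python
# Sketch of leave-one-out cross-validation for a linear fit to one location's
# sensitivity series: each visit is held out in turn, predicted from the
# remaining visits, and the errors summarized. Hypothetical data.
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(12) * 0.5                           # visit times (years)
y = 25.0 - 1.0 * t + rng.normal(0, 1.5, t.size)   # noisy linear decay (dB)

errors = []
for i in range(t.size):
    keep = np.arange(t.size) != i                 # leave visit i out
    slope, icpt = np.polyfit(t[keep], y[keep], 1)
    errors.append((icpt + slope * t[i]) - y[i])   # predicted minus observed
errors = np.asarray(errors)

rmse = np.sqrt(np.mean(errors ** 2))
mae = np.mean(np.abs(errors))
p95 = np.percentile(np.abs(errors), 95)
print(f"LOO RMSE {rmse:.2f} dB, MAE {mae:.2f} dB, 95th pct {p95:.2f} dB")
```

By construction MAE never exceeds RMSE, which is why both are reported: each model class is optimal under one of the two criteria.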

**Table 2**

| Model | RMSE, dB | MAE, dB | 95th Percentile, dB |
| --- | --- | --- | --- |
| Classical uncensored linear | 3.65 | 2.37 | 7.93 |
| Classical exponential | 3.77 | 2.39 | 8.02 |
| Classical censored linear | 3.72 | 2.33 | 8.09 |
| Uncensored linear on the median | 3.81 | 2.34 | 8.26 |
| Exponential on the median | 3.93 | 2.37 | 8.41 |
| Censored linear on the median | 3.83 | 2.34 | 8.31 |

**Figure 4**

**Figure 5**

**Figure 6**

**Table 3**

| Model | RMSE, dB | MAE, dB | 95th Percentile, dB |
| --- | --- | --- | --- |
| Classical uncensored linear | 6.02 | 3.95 | 13.13 |
| Classical exponential | 7.29 | 4.39 | 15.00 |
| Classical censored linear | 6.34 | 4.05 | 14.00 |
| Uncensored linear on the median | 6.34 | 4.06 | 14.00 |
| Exponential on the median | 7.58 | 4.49 | 16.20 |
| Censored linear on the median | 6.51 | 4.12 | 14.22 |

**Figure 7**

Our results differ from those of Caprioli et al.,^{7} who concluded that the classical exponential model provided the best fit for their data in a very large majority of cases. This could be due to a different implementation of the models: Caprioli et al.^{7} specified a mathematical model and not a statistical model, which also specifies how the error term enters the model (i.e., as an additive term to the exponential or behind the exponential sign). Without knowing the statistical model that they used, we are unable to determine whether our implementations were the same.

However, the difference between the models was more evident between 0 dB and 20 dB, the range in which sensitivity estimates are highly variable,^{11} when comparing the models for best predictive ability than for best fitting ability. Hence, none of the models was optimized well in this range, and this problem becomes even more evident when predicting future measurements. As before, the true value of the sensitivity estimate is unknown; the measurements being predicted therefore also contain error, so an exact prediction of these measurements is not possible.
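To illustrate the prediction task discussed above, here is a small sketch (hypothetical data and visit schedule) of extrapolating a linear fit six visits ahead, with the prediction truncated at the perimeter's 0 dB floor:

```python
# Sketch of predicting a sensitivity estimate six visits (~3 years) ahead
# from the preceding visits with a linear fit, truncating the prediction
# at the perimeter's 0 dB floor. Hypothetical data.
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(16) * 0.5                                   # 16 semi-annual visits
y = np.maximum(12.0 - 2.0 * t + rng.normal(0, 1.5, t.size), 0.0)

n_fit = 10                                                # visits used for fitting
target = n_fit + 6 - 1                                    # index of the visit 6 ahead
slope, icpt = np.polyfit(t[:n_fit], y[:n_fit], 1)
pred = max(icpt + slope * t[target], 0.0)                 # truncate at the 0 dB floor
err = pred - y[target]                                    # predicted minus observed
print(f"predicted {pred:.1f} dB, observed {y[target]:.1f} dB, error {err:.1f} dB")
```

Because the held-out observation is itself noisy and floored at 0 dB, even a perfect estimate of the underlying trend leaves a residual prediction error, which is the point made in the text.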

The quadratic model was explored in previous studies.^{7,16} However, this model is not monotonic, and there are no theoretical reasons (e.g., from anatomy or functional models) for using such a quadratic model. We also evaluated the quadratic model but found no indication of its suitability in our application. We therefore argue that this model can safely be excluded from future research on modeling time series of threshold sensitivities.

Censoring sensitivity estimates at 0 dB has been suggested previously.^{9} However, recent research found that saturation of retinal ganglion cells can also account for the high test–retest variability between 0 dB and 25 dB, so that 0 dB lies outside the dynamic range of a ganglion cell,^{21} and that sensitivities below 10 dB are mainly noise.^{22} For this reason, we also considered censoring and truncating the predicted response at 10 dB. When censoring and truncation were done at 10 dB rather than at 0 dB, the MAE was higher for all the models; however, the classical uncensored linear model was still found to be the best model for fitting and for prediction.^{23}

In order to measure the true progression of the disease, the high, sensitivity-dependent variability would need to be taken into account by incorporating it into the statistical model, or new, more reproducible methods for measuring VFs need to be developed. The device should flag measurements that fall outside a clinically acceptable range determined from previous observations. This would indicate either that the measurement is wrong and the test should be redone, or that there is a real change in the rate of progression and the treatment strategy may need to be adjusted.^{24,25}

A problem with using regression models is that each location is treated as an independent sample, and we are not able to link locations belonging to the same eye. This problem can be addressed by using spatial models or mixed-effects models,^{26} which we propose for future work.^{6}
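To sketch what linking locations within an eye could buy, here is a crude numerical illustration (hypothetical data; the fixed shrinkage weight of 0.5 is an arbitrary assumption) in which noisy per-location least-squares slopes are pulled toward the eye-wide mean slope. A proper mixed-effects model would instead estimate this weight from the variance components:

```python
# Crude illustration of linking locations within one eye: per-location
# least-squares slopes are shrunk toward the eye-wide mean slope, a simple
# stand-in for the random-effects idea. Hypothetical data for 5 locations.
import numpy as np

rng = np.random.default_rng(3)
t = np.arange(12) * 0.5                                   # visit times (years)
true_slopes = np.array([-0.2, -0.5, -1.0, -0.1, -2.0])    # dB/y per location
series = np.stack([20.0 + b * t + rng.normal(0, 2, t.size) for b in true_slopes])

slopes = np.array([np.polyfit(t, s, 1)[0] for s in series])
shrinkage = 0.5                                           # assumed weight; a mixed
shrunk = shrinkage * slopes + (1 - shrinkage) * slopes.mean()  # model estimates this
print("raw    slopes:", np.round(slopes, 2))
print("shrunk slopes:", np.round(shrunk, 2))
```

Shrinking toward the eye mean reduces the spread of the slope estimates while leaving their average unchanged, which is the stabilizing effect one hopes to gain from modeling locations jointly.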

**S.R. Bryan**, None; **K.A. Vermeer**, None; **P.H.C. Eilers**, None; **H.G. Lemij**, Carl Zeiss Meditec (C); **E.M.E.H. Lesaffre**, None

**References**

- *Bull World Health Organ*. 2004; 82: 887–888.
- *Glaucoma: A Guide for Patients: An Introduction for Care-Providers: A Quick Reference*. 2nd ed. Cambridge, MA: Hogrefe & Huber Publishers; 2003.
- *Am J Ophthalmol*. 2008; 145: 343–353.
- *Ophthalmology*. 2012; 119: 1367–1374.
- *Invest Ophthalmol Vis Sci*. 2011; 52: 4030–4038.
- *Invest Ophthalmol Vis Sci*. 2012; 53: 5403–5409.
- *Invest Ophthalmol Vis Sci*. 2011; 52: 4765–4773.
- *Automated Static Perimetry*. 2nd ed. St. Louis, MO: C V Mosby; 1999.
- *Invest Ophthalmol Vis Sci*. 2011; 52: 9539–9540.
- *Econometrica*. 1958; 26: 24–36.
- *Invest Ophthalmol Vis Sci*. 2012; 53: 5985–5990.
- *Identification of Outliers*. London: Chapman and Hall; 1980.
- *Robust Regression and Outlier Detection*. New York: Wiley; 1987.
- *Applied Regression Analysis*. 3rd ed. New York: Wiley; 1998.
- *Quantile Regression*. New York: Cambridge University Press; 2005.
- *Graefes Arch Clin Exp Ophthalmol*. 1995; 233: 750–755.
- *Hermathena*. 1887; 6: 279–285.
- *Robust Statistics*. New York: John Wiley and Sons; 1981.
- *Understanding Robust and Exploratory Data Analysis*. New York: Wiley; 1983.
- *Invest Ophthalmol Vis Sci*. 2011; 52: 764–771.
- *PLoS One*. 2012; 7: e41211.
- *Invest Ophthalmol Vis Sci*. 2012; 53: 7010–7017.
- *Invest Ophthalmol Vis Sci*. 2012; 53: 6981–6990.
- *Proc Natl Acad Sci*. 2008; 105: 19690–19695.
- *Linear Mixed Models for Longitudinal Data*. New York: Springer; 2000.