**Purpose.**
We evaluated the usefulness of various regression models, including least absolute shrinkage and selection operator (Lasso) regression, to predict future visual field (VF) progression in glaucoma patients.

**Methods.**
Series of 10 VFs (Humphrey Field Analyzer 24-2 SITA-standard) from each of 513 eyes in 324 open-angle glaucoma patients, obtained over 4.9 ± 1.3 years (mean ± SD), were investigated. For each patient, the mean of all total deviation values (mTD) in the 10th VF was predicted using varying numbers of prior VFs (ranging from the first three VFs to all previous VFs) by applying ordinary least squares linear regression (OLSLR), M-estimator robust regression (M-robust), MM-estimator robust regression (MM-robust), skipped regression (Skipped), deepest regression (Deepest), and Lasso regression. Absolute prediction errors were then compared.

**Results.**
With OLSLR, prediction error varied between 5.7 ± 6.1 dB (using the first three VFs) and 1.2 ± 1.1 dB (using all nine previous VFs). Prediction accuracy was not significantly improved with M-robust, MM-robust, Skipped, or Deepest regression in almost all VF series; however, a significantly smaller prediction error was obtained with Lasso regression even with a small number of VFs (using the first three VFs, 2.0 ± 2.2 dB; using all nine previous VFs, 1.2 ± 1.1 dB).

**Conclusions.**
Prediction errors using OLSLR are large when only a small number of VFs are included in the regression. Lasso regression offers much more accurate predictions, especially in short VF series.

^{1–3} However, medical and surgical IOP reduction can be associated with various ocular and general complications.^{4–8} Therefore, it is essential to accurately predict future VF progression when making glaucoma treatment decisions. One popular approach to assess VF deterioration and predict future damage is to apply ordinary least squares linear regression (OLSLR) to global VF indices, such as mean deviation (MD), as implemented in the Guided Progression Analysis (GPA) software on the Humphrey Field Analyzer (HFA; Carl Zeiss Meditec AG, Dublin, CA, USA). However, VF sensitivity fluctuates in the short term^{9} and long term,^{10} and the reliability of a measured VF is inherently affected by a patient's concentration. Furthermore, VF measurement noise can be very large even when reliability indices are deemed good.^{11,12}

^{13–15,17} Accumulating this number of VFs can take years in many clinics; hence, various attempts have been made to develop better models to predict VF progression. For instance, Caprioli et al.^{16} recently suggested that an exponential method offers more accurate predictions, especially when VF sensitivity approaches the floor level (0 dB).^{17,18} Very recent research has developed dedicated VF regression models that take into account nonstationary variability and spatial correlations using a Bayesian approach,^{19} or spatial/temporal patterns of glaucomatous VF progression, also using a Bayesian approach.^{20}

There also has been renewed interest in applying “ready-made” regression models as an alternative to OLSLR. A number of regression models, such as M-estimator robust regression (M-robust), MM-estimator robust regression (MM-robust), skipped regression (Skipped), and deepest regression (Deepest), have been developed to overcome the sensitivity of OLSLR to outliers.^{21} In addition, others have proposed a shrinkage method for OLSLR in which the sum of the absolute values of the regression coefficients is constrained or penalized, known as least absolute shrinkage and selection operator regression (Lasso).^{22,23} The most important merit of Lasso is that the optimum penalty, that is, how much the regression model should be shrunk to accurately predict the future VF, can be decided using the actual clinical information of other patients. In contrast, M-robust, MM-robust, Skipped, and Deepest obtain robustness using only an individual's data. Lasso regression has been used in many different fields, including the analysis of human perception and genetic analysis.^{24,25} Indeed, we recently applied the method to predict the MD values in the 10-2 HFA VFs from 24-2 HFA VFs.^{26}

In this study, several different robust regression models as well as Lasso regression were applied to the mean of total deviation values (mTD) in patients' VF series, and the performance of the different methods for predicting future progression was compared.

^{27} of a cluster of ≥3 points in the pattern deviation plot in a single hemifield (superior/inferior) with *P* < 0.05, one of which must have been *P* < 0.01, excluding the outermost test point of the Humphrey Field Analyzer 30-2 program; a glaucoma hemifield test (GHT) result outside of normal limits; or an abnormal pattern standard deviation (PSD) with *P* < 0.05. The VF measurements were performed using the HFA with either the 30-2 or 24-2 program and the Swedish Interactive Threshold Algorithm Standard. When VFs were obtained with the 30-2 test pattern, only the 52 test locations overlapping with the 24-2 test pattern were used in the analysis and for the calculation of mTD. Patients' first two VFs were excluded from the analysis. Other inclusion criteria in this study were best corrected visual acuity better than 6/12, refraction within ±6 diopters (D) of ametropia, no previous ocular surgery except for cataract extraction and intraocular lens implantation, and no other anterior or posterior segment eye disease that could affect the VF, including cataract other than clinically insignificant senile cataract. Reliability criteria for VFs were applied: fixation losses less than 20% and false-positive responses less than 15%; the false-negative rate was not applied as a reliability criterion based on a previous report.^{28} The VF of a left eye was mirror-imaged to that of a right eye for statistical analyses.

_{1–3}) of each patient, and the mTD values of the 10th VF (VF_{10}) were predicted. The same procedure was carried out using the TD values in different series: VF_{1–4}, VF_{1–5}, VF_{1–6}, VF_{1–7}, VF_{1–8}, and VF_{1–9}, and the mTD values of VF_{10} were predicted every time. The predictive accuracy of each method was compared using absolute errors. As a subanalysis, predictive accuracy also was compared in eyes with an mTD progression rate < −0.25 dB/y (based on a patient's first 10 VFs), which commonly is considered to be deterioration by pressure-independent damaging factors.^{29–31}
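The prediction scheme described here (fit a trend to the first *k* VFs, extrapolate to the 10th, and record the absolute error) can be sketched as follows. This is a minimal illustration using OLSLR only, not the code used in the study; the function names `ols_fit` and `abs_prediction_errors` and the example series are hypothetical, and test times in years are assumed as the single predictor.

```python
from statistics import mean


def ols_fit(t, y):
    """Return (intercept, slope) of the least-squares line y = a + b * t."""
    t_bar, y_bar = mean(t), mean(y)
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope


def abs_prediction_errors(times, mtd, min_k=3):
    """Absolute error when the last mTD is predicted from the first k VFs.

    Returns a dict keyed by k, for k = min_k .. len(mtd) - 1.
    """
    target_t, target_y = times[-1], mtd[-1]
    errors = {}
    for k in range(min_k, len(mtd)):
        a, b = ols_fit(times[:k], mtd[:k])
        errors[k] = abs(a + b * target_t - target_y)
    return errors


# Hypothetical eye: 10 VFs at roughly 6-month intervals, mTD in dB.
times = [0.0, 0.5, 1.0, 1.6, 2.1, 2.5, 3.0, 3.6, 4.1, 4.6]
mtd = [-4.1, -3.2, -4.9, -4.4, -4.8, -4.3, -5.2, -5.0, -5.1, -5.4]
errors = abs_prediction_errors(times, mtd)
for k, err in errors.items():
    print(f"VF1-{k}: absolute error {err:.2f} dB")
```

With only three noisy points the fitted slope can be far from the long-term trend, which is why the extrapolated value tends to be least reliable for the shortest series.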

*i*th of *n* observations, the general M-estimator minimizes the objective function

$$\sum_{i=1}^{n} \rho(e_i),$$

where *e*_{i} is the residual of the *i*th observation and the function *ρ* gives the contribution of each residual to the objective function.^{32} In short, M-robust regression, due to the formula above, is much less affected by outliers than OLSLR. In the current study, M-statistics were calculated using Huber's method.^{32}
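As an illustration of how a Huber-type M-estimator limits the influence of an outlying VF, the sketch below fits a straight line by iteratively reweighted least squares. It is an assumption-laden example rather than the study's implementation: the tuning constant `c = 1.345`, the scale estimate (median absolute residual divided by 0.6745), and the function names `weighted_line` and `huber_line` are choices made here for illustration.

```python
from statistics import median


def weighted_line(t, y, w):
    """Weighted least-squares line; returns (intercept, slope)."""
    sw = sum(w)
    t_bar = sum(wi * ti for wi, ti in zip(w, t)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (ti - t_bar) ** 2 for wi, ti in zip(w, t))
    sxy = sum(wi * (ti - t_bar) * (yi - y_bar) for wi, ti, yi in zip(w, t, y))
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope


def huber_line(t, y, c=1.345, n_iter=50, tol=1e-8):
    """Huber M-estimate of a line via iteratively reweighted least squares."""
    a, b = weighted_line(t, y, [1.0] * len(t))  # start from the OLS fit
    for _ in range(n_iter):
        resid = [yi - (a + b * ti) for ti, yi in zip(t, y)]
        scale = median([abs(r) for r in resid]) / 0.6745  # robust scale
        if scale == 0.0:
            break  # at least half of the points are fitted exactly
        cutoff = c * scale
        # Huber weights: 1 inside the cutoff, shrinking as cutoff / |r| outside.
        w = [1.0 if abs(r) <= cutoff else cutoff / abs(r) for r in resid]
        a_new, b_new = weighted_line(t, y, w)
        converged = abs(a_new - a) + abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    return a, b


# Hypothetical series with one grossly low mTD at the last visit.
t = [0, 1, 2, 3, 4, 5, 6, 7, 8]
y = [-2.1, -2.4, -3.1, -3.4, -4.1, -4.4, -5.1, -5.4, -14.0]
print("OLS   (intercept, slope):", weighted_line(t, y, [1.0] * len(t)))
print("Huber (intercept, slope):", huber_line(t, y))
```

Note that the weights depend only on the residuals of the series being fitted, so the method draws on a single eye's data.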

^{33}in 1987.

^{34}

^{35–38}

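Let *x* ∈ R^{p} denote the variables and let *y* ∈ R denote the response (please note *x*_{ij} are normalized and *y* has mean zero). The Lasso algorithm solves the following problem:

$$\min_{(\beta_0,\beta)\in\mathbb{R}^{p+1}} \left[\frac{1}{2n}\sum_{i=1}^{n}\left(y_i-\beta_0-x_i^{\top}\beta\right)^2+\lambda P_{\alpha}(\beta)\right], \qquad P_{\alpha}(\beta)=(1-\alpha)\tfrac{1}{2}\lVert\beta\rVert_2^2+\alpha\lVert\beta\rVert_1, \tag{1}$$

where *λP*_{α}(*β*) is the penalty term for the shrinkage.^{22,23} Thus, the *λ* (lambda) value represents the degree of penalty in Lasso. Equation 1 is Lasso regression when *α* = 1 and Ridge regression when *α* = 0; however, this distinction is not applicable to the current study, because there is only one variable (mTD).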
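With a single standardized predictor, the Lasso problem (*α* = 1) has a closed-form solution: the least-squares coefficient is soft-thresholded by *λ*. The sketch below illustrates this under the assumption that the predictor is test time and the response is mTD; the function names `soft_threshold` and `lasso_line` and the example values are hypothetical, and this is not the software used in the study.

```python
from math import sqrt
from statistics import mean


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become zero."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_line(t, y, lam):
    """One-predictor Lasso fit; returns (intercept, slope) on the original scale.

    The predictor is standardized (mean 0, variance 1) and the response is
    centered, so the minimizer of (1 / 2n) * sum(residual^2) + lam * |beta|
    is the soft-thresholded mean of x_i * y_i.
    """
    n = len(t)
    t_bar, y_bar = mean(t), mean(y)
    sd = sqrt(sum((ti - t_bar) ** 2 for ti in t) / n)
    z = sum((ti - t_bar) / sd * (yi - y_bar) for ti, yi in zip(t, y)) / n
    beta = soft_threshold(z, lam)
    slope = beta / sd  # back-transform to dB per unit of time
    return y_bar - slope * t_bar, slope


# Hypothetical first three VFs of one eye (years, dB).
t = [0.0, 0.5, 1.0]
y = [-3.0, -4.2, -5.1]
for lam in (0.0, 0.3, 2.0):
    a, b = lasso_line(t, y, lam)
    print(f"lambda={lam}: slope {b:.2f} dB/y, predicted mTD at 4.5 y {a + b * 4.5:.2f} dB")
```

At *λ* = 0 the fit coincides with OLSLR; as *λ* grows the slope shrinks toward zero and the prediction approaches the mean of the observed mTD values.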

*n* = 323 in 324) were used to produce a diagnosis. An optimum *λ* value was identified for each iteration (patient), and the prediction error was calculated.
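The leave-one-out procedure can be sketched as below: for each held-out patient, a *λ* value is chosen using only the remaining patients, and the held-out prediction error is then computed with that value. Several details are assumptions made for illustration and are not stated in the text: the candidate `grid` of *λ* values, the use of mean absolute error as the selection criterion, and the function names.

```python
from math import sqrt
from statistics import mean


def lasso_predict(t, y, t_new, lam):
    """Predict y at t_new from a one-predictor Lasso fit (standardized t)."""
    n = len(t)
    t_bar, y_bar = mean(t), mean(y)
    sd = sqrt(sum((ti - t_bar) ** 2 for ti in t) / n)
    z = sum((ti - t_bar) / sd * (yi - y_bar) for ti, yi in zip(t, y)) / n
    beta = max(abs(z) - lam, 0.0) * (1 if z > 0 else -1)  # soft-threshold
    return y_bar + beta * (t_new - t_bar) / sd


def abs_error(series, k, lam):
    """Absolute error predicting the last value from the first k values."""
    t, y = series
    return abs(lasso_predict(t[:k], y[:k], t[-1], lam) - y[-1])


def leave_one_out(patients, k, grid):
    """Return a list of (chosen lambda, held-out absolute error) per patient."""
    results = []
    for i, held_out in enumerate(patients):
        train = patients[:i] + patients[i + 1:]
        # Select lambda on the training patients only.
        best = min(grid, key=lambda lam: mean([abs_error(s, k, lam) for s in train]))
        results.append((best, abs_error(held_out, k, best)))
    return results


# Hypothetical data: each patient is (times in years, mTD in dB).
t10 = [0.5 * i for i in range(10)]
patients = [
    (t10, [-3.0, -2.2, -3.5, -3.1, -3.4, -3.0, -3.6, -3.3, -3.5, -3.7]),
    (t10, [-6.1, -7.0, -6.2, -6.8, -7.1, -7.0, -7.4, -7.2, -7.6, -7.9]),
    (t10, [-1.0, -0.4, -1.3, -1.1, -0.9, -1.4, -1.2, -1.5, -1.3, -1.6]),
]
for best, err in leave_one_out(patients, k=3, grid=[0.0, 0.1, 0.2, 0.5, 1.0]):
    print(f"lambda={best}, held-out error {err:.2f} dB")
```

Because the held-out patient never contributes to the choice of *λ*, the reported error reflects how the method would behave for a new patient.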

*P* values for the problem of multiple testing.^{39} A linear mixed model was used to analyze the relationship between two values, whereby patients were treated as a “random effect.”
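For readers unfamiliar with this kind of correction, the sketch below computes adjusted *P* values with the Benjamini-Hochberg step-up procedure. It is assumed here that this is the procedure meant by the cited correction; the function name `bh_adjust` and the example *P* values are hypothetical.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values from several pairwise comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
print(bh_adjust(raw))
```

Each raw value is scaled by the number of tests divided by its rank, and the running minimum ensures that a smaller raw *P* value never receives a larger adjusted value than a larger one.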

**Table.**

**Figure 1**


_{1–3} could not be calculated because a leverage point could not be calculated. Absolute prediction errors became smaller as the number of VF tests included in the regression increased. There was no significant improvement in error by applying M-robust, MM-robust, Skipped, and Deepest, compared to using the OLSLR at any time point (*P* > 0.05, repeated ANOVA with Benjamini's correction for multiple testing^{39}), except for M-robust with VF_{1–8} (*P* = 0.028, repeated ANOVA with Benjamini's correction for multiple testing^{39}). The absolute prediction errors with the Lasso model were significantly better than OLSLR when VF_{1–3} to VF_{1–8} were used for prediction (*P* < 0.0001, < 0.0001, < 0.0001, < 0.0001, < 0.0001, 0.0017, 0.016, repeated ANOVA with Benjamini's correction for multiple testing^{39}). Among the 513 eyes, 234 eyes showed progression faster than −0.25 dB/y. Interestingly, as shown in Figure 3, prediction accuracy tended to be large in eyes with mTD progression rate < −0.25 dB/y. A significant improvement was observed when applying Lasso, compared to OLSLR, when the initial one or two VFs were used to predict (*P* = 0.007, 0.035, repeated ANOVA with Benjamini's correction for multiple testing^{39}).

**Figure 2**


**Figure 3**


*λ* value derived in relationship to the number of VFs used for prediction. The *λ* value decreased as the number of VFs used in the prediction increased.

**Figure 4**


*λ* value derived for VF_{1–3}, VF_{1–4}, and VF_{1–5}, and the mTD value of VF_{1} (*P* = 0.18, 0.12, and 0.31, respectively, linear mixed model). As shown in Figure 6, there was no significant relationship between the optimum *λ* value derived for VF_{1–3}, VF_{1–4}, and VF_{1–5}, and the difference between mTDs in VF_{1} and VF_{10} (*P* = 0.18, 0.33, and 0.94, respectively, linear mixed model). Figures 5 and 6 are smoothed scatter plots (plotted using the R package “graphics”), which better differentiate dense regions of points.

**Figure 5**


**Figure 6**


^{40} and consequently, the minimum number of VFs required to obtain reliable VF trend analysis results has been widely discussed in previous studies, with research suggesting that at least five or eight VFs, or even more, are required.^{13–15,17} Indeed, prediction accuracy associated with OLSLR was poor when a small number of VFs were used in the current study. This poor prediction accuracy was not improved by applying a number of different robust regression methods. On the other hand, Lasso regression performed much better because the method uses a penalty term, which helps to reduce prediction errors. In other words, the coefficient terms in Lasso regression are adjusted by the optimum penalty (*λ*), which is obtained from real data (from other patients). As a result, Lasso mTD trend analysis becomes much more robust to measurement noise and, consequently, prediction accuracy was dramatically improved. This differs from M-robust, MM-robust, Skipped, and Deepest regression, which attempt to improve robustness using only an individual's data.

*λ* value was calculated using only the training dataset; this process was repeated so that each patient was used as a testing dataset once. This is identical to the clinical situation: a new patient can be classified according to the predetermined optimum *λ* value. Furthermore, in clinical practice, a *λ* value could be calculated continuously by adding the data of new patients to an ever-growing database of patient data, which would further improve prediction accuracy. In addition, the Lasso regression performed in this study was built using free statistical software and packages, specifically “R” (ver. 3.1.0; The R Foundation for Statistical Computing).

*λ* value was observed when a small number of VFs was used for prediction and it decreased as the number of VFs used for prediction increased. This suggests that any mTD trend analysis should be penalized according to the number of VFs used. It is well known that test–retest reproducibility varies according to the level of VF sensitivity,^{19,41} and recently Zhu et al.^{19} reported a novel approach to measure VF progression that modeled nonstationary variability.^{19} Interestingly, as shown in Figures 5 and 6, there was no significant relationship between the initial mTD value (and also the difference of mTD values in VF_{10} and VF_{1}) and the optimum *λ* value; this suggests that the same penalty should be given to the MD trend analysis regardless of disease level. This could suggest that the variability of measured VF sensitivity is not merely due to inherent test–retest variability associated with the sensitivity level, but also to other elements, such as patients' concentration. However, the information obtained from a small number of VFs is fundamentally limited compared to the large variability, and clinicians should always be careful when interpreting trend analyses with a small number of VFs.^{13–15,17} In agreement, the optimum *λ* value shrank when the number of VFs used in the prediction was large.

^{20,42–44} Thus, it may be advantageous to optimize Lasso regression (the *λ* value) based on a patient's progression pattern; that is, prediction accuracy may be further improved by using a clustering approach in combination with Lasso regression. This should be investigated in a future study.

^{45} (Medisoft, Inc., London, UK) improves clinicians' decisions regarding VF progression.^{46} A possible caveat of the current results is that prediction with Lasso regression is not readily usable in the clinical setting. Therefore, it would be clinically beneficial to develop software/support tools to predict VF progression, as introduced in this study, similar to PROGRESSOR.^{45} In particular, only standard data are needed to apply the current methodology in the clinical setting, namely a record of a patient's MD values with the dates of VF measurements, since the optimum penalty (*λ*) value can be calculated from other patients.

**Y. Fujino**, None; **H. Murata**, None; **C. Mayama**, None; **R. Asaoka**, None

1. *Am J Ophthalmol*. 1999; 127: 623–625.
2. *Am J Ophthalmol*. 1999; 127: 625–626.
3. *J Glaucoma*. 1997; 6: 133–138.
4. *Textbook of Glaucoma*. Baltimore, MD: Williams & Wilkins; 1997.
5. *Jpn J Ophthalmol*. 2011; 55: 600–604.
6. *Jpn J Ophthalmol*. 2014; 58: 212–217.
7. *J Ocul Pharmacol Ther*. 2001; 17: 235–248.
8. *Ophthalmology*. 2014; 121: 1001–1006.
9. *Arch Ophthalmol*. 1984; 102: 876–879.
10. *Arch Ophthalmol*. 1984; 102: 704–706.
11. *Invest Ophthalmol Vis Sci*. 2000; 41: 2201–2204.
12. *Invest Ophthalmol Vis Sci*. 1996; 37: 444–450.
13. *Acta Ophthalmol*. 1985; 173 (suppl): 19–21.
14. *Acta Ophthalmol (Copenh)*. 1982; 60: 267–274.
15. *Invest Ophthalmol Vis Sci*. 2000; 41: 2192–2200.
16. *Invest Ophthalmol Vis Sci*. 2011; 52: 4765–4773.
17. *Arch Ophthalmol*. 2009; 127: 1610–1615.
18. *Invest Ophthalmol Vis Sci*. 2011; 52: 9539–9540.
19. *PLoS One*. 2014; 9: e85654.
20. *Invest Ophthalmol Vis Sci*. 2014; 55: 8386–8392.
21. *Introduction to Robust Estimation and Hypothesis Testing*. Amsterdam, The Netherlands: Elsevier; 2011.
22. *J R Stat Soc Series B*. 1996; 58: 267–288.
23. *J Stat Softw*. 2010; 33: 1–22.
24. *J Opt Soc Am A Opt Image Sci Vis*. 2013; 30: 1687–1697.
25. *Conf Proc IEEE Eng Med Biol Soc*. 2014; 2014: 804–807.
26. *PLoS One*. 2013; 8: e72199.
27. *Automated Static Perimetry*. St. Louis, MO: C.V. Mosby Co.; 1992.
28. *Am J Ophthalmol*. 2000; 130: 689.
29. *Ophthalmology*. 2001; 108: 247–253.
30. *Nippon Ganka Gakkai zasshi*. 2011; 115: 213–236, discussion 237.
31. *Ophthalmology*. 2008; 115: 2049–2057.
32. *Ann Stat*. 1973; 1: 799–821.
33. *Ann Stat*. 1987; 15: 642–656.
34. *Nederl Akad Wetensch Proc*. 1950; 53: 386–392, 521–525, 1397–1412.
35. *J Am Stat Assoc*. 1999; 94: 388–402.
36. *Ann Stat*. 1999; 27: 1616–1637.
37. *J Multivariate Anal*. 2000; 73: 83–106.
38. *J Multivariate Anal*. 2002; 81: 138–166.
39. *J R Stat Soc Series B*. 1995; 57: 289–300.
40. *Br J Ophthalmol*. 2010; 94: 1404–1405.
41. *Invest Ophthalmol Vis Sci*. 2002; 43: 2654–2659.
42. *Proc IEEE Int Conf Data Mining (ICDM2013)*. 2013; 1121–1126.
43. *Invest Ophthalmol Vis Sci*. 2004; 45: 2596–2605.
44. *Invest Ophthalmol Vis Sci*. 2012; 53: 6557–6567.
45. *Br J Ophthalmol*. 1996; 80: 40–48.
46. *Br J Ophthalmol*. 2003; 87: 726–730.