**Purpose**:
To validate a method for visual field (VF) progression analysis, called ANSWERS (Analysis with Non-Stationary Weibull Error Regression and Spatial Enhancement), which takes into account increasing measurement variability as glaucoma progresses and spatial correlation among test locations.

**Methods**:
ANSWERS outputs both a global index of progression and a pointwise estimate of the rate of change at each VF location. ANSWERS was compared with linear regression of mean deviation (MD) and permutation of pointwise linear regression (PoPLR). Visual field series of up to 2 years from the United Kingdom Glaucoma Treatment Study were used, comprising 9104 Swedish Interactive Thresholding Algorithm Standard 24-2 VFs. The rates of change from ANSWERS and PoPLR were used to predict the VF at the next visit using subseries that were within 7, 13, 18, or 22 months from baseline. The methods were compared on statistical sensitivity, specificity, and accuracy in predicting future VFs.

**Results**:
Across all subseries, the statistical sensitivity of ANSWERS in detecting VF deterioration was significantly better than that of linear regression of MD and PoPLR, especially in short time series. The prediction accuracy of ANSWERS was better than that of PoPLR at all series lengths, and the improvement was particularly marked in shorter series. Seventy-five percent of VF series were better predicted by ANSWERS than by PoPLR. The average prediction error of ANSWERS was 15% lower than that of PoPLR.

**Conclusions**:
ANSWERS is more sensitive in detecting VF progression and predicts future VF loss better than linear regression of MD and PoPLR, especially over short observation periods. (http://www.isrctn.com number, ISRCTN96423140.)

^{1}Accurate and precise assessment of VF change over time is essential for appropriate clinical management of glaucoma so that patients whose condition is worsening receive prompt treatment intervention while those with a stable condition are not overtreated. Currently, however, VF measurement is highly imprecise and has complex statistical properties, which make monitoring changes in VF challenging.

^{2}Summary measures, such as mean deviation (MD) from the average DLS of healthy eyes, are also often used in trend analysis; but, since glaucoma tends not to affect all locations to the same extent, global indices often have inadequate statistical sensitivity to detect worsening compared with methods assessing deterioration at individual locations.

^{3}Permutation analyses of pointwise linear regression (PoPLR),^{4} a recent advance in PLR trend analysis, involves a random permutation of the order of VFs in a series. It has been reported to provide a better estimate of overall statistical significance of change compared with PLR.

^{5}The significance of change in PoPLR is estimated in the context of permuted series of VFs, which are assumed to show no change once reordered. Because the technique depends on permutation, it cannot reliably estimate the significance of change in series with fewer than six VFs, where the number of possible permutations is limited. Moreover, although PoPLR estimates the overall statistical significance differently, its underlying regression model is still ordinary linear regression, and therefore its estimates of the rate of change and of the statistical significance of change at individual locations are identical to those of PLR.
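The limit on short series can be illustrated by counting orderings. The following is a minimal sketch, not the PoPLR implementation: it assumes that every ordering of the series is distinct and equally likely under no change, so that the smallest attainable permutation *P* value is one divided by the number of orderings. The function name `permutation_limits` is introduced here for illustration only.

```python
import math
from typing import Tuple


def permutation_limits(n_fields: int) -> Tuple[int, float]:
    """Return (number of orderings, smallest attainable permutation P value).

    Assumption for illustration: all n! orderings are distinct, and the
    observed ordering counts as one of them, so the P value cannot be
    smaller than 1 / n!.
    """
    if n_fields < 1:
        raise ValueError("a series needs at least one visual field")
    n_orderings = math.factorial(n_fields)
    return n_orderings, 1.0 / n_orderings


# Show how quickly the resolution of the permutation test improves.
for n in range(3, 8):
    count, min_p = permutation_limits(n)
    print(f"{n} VFs: {count} orderings, smallest P value {min_p:.4f}")
```

Under these assumptions a series of five VFs has only 120 orderings, whereas six VFs give 720, which is consistent with the difficulty of obtaining reliable significance estimates in very short series.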

^{6–8}For instance, the repeat measurement range (90% confidence interval [CI]) is 7 dB (26–33 dB) when DLS is healthy at 32 dB, while this range increases to 18 dB (5–27 dB) when DLS deteriorates to 20 dB.

^{9}This changing variability over time is referred to as nonstationary measurement variability. Furthermore, the traversing of the VF test grid by retinal nerve fibers results in correlation between spatially related locations.

^{10}The most widely used SAP VF measurements, such as those taken by the Humphrey Visual Field Analyzer (HFA; Carl Zeiss Meditec, Dublin, CA, USA), are made in a regular grid across a patient's field of view. Aside from the proximity of test locations, the spatial correlation is also governed by the anatomical arrangement of the retinal nerve fibers.

^{11}Betz-Stablein et al.^{12} incorporated such spatial correlation in six regions of VF corresponding to the six sectors of the optic disc and demonstrated improved performance in detecting VF progression. Therefore, without taking into account these statistical properties, the detection of change in VF with current methods is potentially delayed or requires more clinic visits than necessary.^{13}

^{9}In contrast to commonly used ordinary linear regression models, which assume fixed and normally distributed errors, ANSWERS incorporates the nonstationary variability at different levels of DLS modeled as mixtures of Weibull distributions. Spatial correlation of measurements was also included in the model using a Bayesian framework. Despite its optimized statistical attributes, ANSWERS still acts as a linear regression model that outputs both the probability of no deterioration and rate of change at individual locations in the VF and can be interpreted in the same way as PLR. It also produces a global deterioration index summarizing the overall probability of change in the series. The details about derivation and implementation of ANSWERS can be found elsewhere.^{9}

^{9}

^{14,15}a randomized, double-masked placebo-controlled clinical trial testing the hypothesis that treatment with a topical prostaglandin analogue, compared with placebo, reduces the frequency of VF deterioration events. Patients were followed up for 2 years or until reaching the endpoint criteria. During the 2-year period, patients were tested at 2, 4, 7, 10, 13, 16, 18, 20, 22, and 24 months from baseline, and two repeated VF tests were taken at baseline and at 2, 16, 18, and 24 months from baseline. Details of the dataset have been described elsewhere.

^{14}Visual field tests with false-positive reliability responses over 15% were discarded. Only series that were obtained over at least 4 months (three visits) were included in the analysis. Note that the length of series is purely for evaluation purposes and is not necessitated by ANSWERS. The resulting dataset consisted of 9104 VF tests from 659 series of 437 patients. The median (interquartile range [IQR]) time of follow-up was 22 (15–24) months, and the median (IQR) number of VFs in the series was 11 (6–12).
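The inclusion rules above can be sketched as a small filter. This is an illustrative sketch only: the `VFTest` record, its field names, and the choice to apply the series-length rule after discarding unreliable tests are assumptions made here, not details taken from the study.

```python
from dataclasses import dataclass
from typing import List

MAX_FALSE_POSITIVE = 0.15  # tests with false-positive responses over 15% are discarded
MIN_SPAN_MONTHS = 4.0      # series must cover at least 4 months
MIN_VISITS = 3             # ... which corresponds to three visits


@dataclass
class VFTest:
    """Hypothetical record for one visual field test."""
    months_from_baseline: float
    false_positive_rate: float  # expressed as a fraction, so 0.15 means 15%


def keep_reliable(series: List[VFTest]) -> List[VFTest]:
    """Drop tests whose false-positive rate exceeds the reliability limit."""
    return [t for t in series if t.false_positive_rate <= MAX_FALSE_POSITIVE]


def is_eligible(series: List[VFTest]) -> bool:
    """Check the span and visit-count rules on the reliable tests (assumed order)."""
    reliable = keep_reliable(series)
    if not reliable:
        return False
    # Repeated tests at the same visit share a time point, so count distinct times.
    visit_times = sorted({t.months_from_baseline for t in reliable})
    span = visit_times[-1] - visit_times[0]
    return len(visit_times) >= MIN_VISITS and span >= MIN_SPAN_MONTHS
```

For example, a series tested at 0, 2, 4, and 7 months in which the 2-month test is unreliable would still be eligible, because three reliable visits spanning 7 months remain.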

^{6}Fifty-two eyes of 27 patients were tested 10 times over a short period (maximum 10 weeks). The variance among VFs in these repeat measures indicates the inherent measurement variability. Furthermore, the VF series for each eye, and the same series with arbitrary reordering, represent a stable series with no underlying change. The use of randomly reordered series for the estimate of measurement variability is an established method in various studies.^{16,17}

^{4}smaller than a given threshold. For ANSWERS and ANSWER, the criterion was a deterioration index^{9} higher than a given threshold. For linear regression of MD, the criteria were a negative slope and a slope *P* value lower than a set threshold. For each method, a set of thresholds was chosen to achieve specified false-positive rates, and the statistical sensitivity of each method was then compared at equivalent false-positive rates.
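One way to choose a threshold that achieves a specified false-positive rate is to take an empirical quantile of the scores obtained on stable series. The sketch below assumes a score for which higher values indicate deterioration (as with a deterioration index) and assumes that `null_scores` were computed on series with no underlying change; the function names are hypothetical, and the procedure actually used in the study may differ.

```python
import math
from typing import Sequence


def calibrate_threshold(null_scores: Sequence[float], target_fpr: float) -> float:
    """Pick a threshold so that at most target_fpr of stable series score above it."""
    if not null_scores:
        raise ValueError("null_scores must not be empty")
    if not 0.0 <= target_fpr < 1.0:
        raise ValueError("target_fpr must be in [0, 1)")
    ranked = sorted(null_scores, reverse=True)
    # Number of stable series that are allowed to be flagged by mistake.
    allowed = math.floor(target_fpr * len(ranked))
    # At most `allowed` scores are strictly greater than this value.
    return ranked[allowed]


def false_positive_rate(null_scores: Sequence[float], threshold: float) -> float:
    """Proportion of stable series flagged as progressing at this threshold."""
    flagged = sum(score > threshold for score in null_scores)
    return flagged / len(null_scores)
```

Calibrating every method against the same stable series in this way puts them on a common false-positive rate before their positive rates are compared.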

^{18}the underlying progression status of each VF series was unknown. Therefore, the methods were compared using the positive rate, which is the proportion of series flagged as progressing in the UKGTS dataset. Given an unknown proportion (*p*%) of truly progressing series in the dataset, the positive rate was linked to statistical sensitivity as positive rate = (*p*% × sensitivity) + [(1 − *p*%) × false-positive rate]. Note that if the false-positive rate is controlled to be equivalent for all the methods, a higher positive rate implies better sensitivity of a method. Therefore, the positive rates of all the methods were compared as a surrogate comparison for statistical sensitivity. Moreover, when the false-positive rate is low, the positive rate is dominated by the sensitivity. Therefore, when comparing two methods at a lower false-positive rate, the ratio of positive rates between the methods is closer to the ratio of sensitivities. The comparison was made with series of 7, 13, 18, and 22 months from baseline.
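The relationship between positive rate and sensitivity can be checked numerically. In the sketch below, the proportion of progressing series (30%) and the two sensitivities (60% and 40%) are hypothetical values chosen only for illustration, since the true proportion in the dataset is unknown.

```python
def positive_rate(p_progressing: float, sensitivity: float, fpr: float) -> float:
    """positive rate = p * sensitivity + (1 - p) * false-positive rate."""
    return p_progressing * sensitivity + (1.0 - p_progressing) * fpr


def implied_sensitivity(positive: float, p_progressing: float, fpr: float) -> float:
    """Invert the relationship; only possible if p were known and nonzero."""
    if p_progressing <= 0.0:
        raise ValueError("p_progressing must be positive")
    return (positive - (1.0 - p_progressing) * fpr) / p_progressing


P_PROGRESSING = 0.30            # hypothetical; unknown in the real dataset
SENS_A, SENS_B = 0.60, 0.40     # hypothetical sensitivities of two methods

for fpr in (0.10, 0.05, 0.01):
    rate_a = positive_rate(P_PROGRESSING, SENS_A, fpr)
    rate_b = positive_rate(P_PROGRESSING, SENS_B, fpr)
    print(f"FPR {fpr:.2f}: positive-rate ratio {rate_a / rate_b:.3f}, "
          f"sensitivity ratio {SENS_A / SENS_B:.3f}")
```

With these illustrative values, the ratio of positive rates moves toward the ratio of sensitivities as the false-positive rate decreases, which is the behavior described above.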

^{9}of the false-positive rate and the series length, and the fit has an *r*^{2} > 0.99. This allows the threshold to be calculated for series longer than 10 VFs, which is the maximum series length in the test–retest data.

**Figure 1**


**Figure 2**


**Table 1**

**Table 2**

*P* < 0.01, Wilcoxon signed rank test) smaller estimates of the magnitude of the rate of change compared with PoPLR. ANSWERS provided a statistically significantly (*P* < 0.01%, Wilcoxon signed rank test) greater rate magnitude compared with ANSWER. The comparison of rate of change between ANSWERS, ANSWER, and PoPLR in subseries of 7, 13, 18, and 22 months is presented in Figures 3 and 4. The relative ordering of the rate magnitudes from the three methods (PoPLR > ANSWERS > ANSWER, *P* < 0.01%, Wilcoxon signed rank test) was consistent in all subseries except that at 22 months, where the rate magnitude did not differ between ANSWERS and ANSWER (*P* = 20%, Wilcoxon signed rank test). This can be seen in Figure 3, where the points scatter around a line with a slope of less than 1, and in Figure 4, where the points scatter around a line with a slope of more than 1, except for those at 22 months. For 13 and 18 months, although the difference between ANSWERS and ANSWER is statistically significant, its magnitude is minimal, so the points scatter close to the diagonal line in Figure 4.

**Figure 3**


**Figure 4**


*P* = 0.01%, Wilcoxon signed rank test) than those from PoPLR and ANSWER (median [95% CI] difference: 15% [10%–19%] and 2% [1%–4%], respectively). The comparison between the three methods for prediction of VFs at 10, 16, 20, and 24 months using subseries of 7, 13, 18, and 22 months is summarized in Table 3. ANSWERS outperformed PoPLR in all subseries. The improvement was greater in shorter subseries. The spatial enhancement in ANSWERS made it a better predictor than ANSWER for all VF predictions except for those at the 24th month.
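The prediction task can be illustrated with ordinary pointwise linear regression, the model underlying PLR and PoPLR. The sketch below is not ANSWERS, which replaces the ordinary least-squares fit with its own error model and spatial enhancement. The helper names, and the use of mean absolute error as the error measure, are assumptions made here for illustration.

```python
from typing import List, Sequence, Tuple


def fit_line(times: Sequence[float], values: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit; returns (slope per month, intercept)."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired observations")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0.0:
        raise ValueError("all observations share the same time")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def predict_field(times: Sequence[float],
                  fields: Sequence[Sequence[float]],
                  future_time: float) -> List[float]:
    """Extrapolate each location's fitted line to future_time."""
    n_locations = len(fields[0])
    predicted = []
    for loc in range(n_locations):
        slope, intercept = fit_line(times, [field[loc] for field in fields])
        predicted.append(slope * future_time + intercept)
    return predicted


def mean_absolute_error(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Average absolute difference (dB) between predicted and observed sensitivities."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)
```

For instance, a subseries with visits at 0, 2, 4, and 7 months could be passed to `predict_field` with `future_time=10`, and the result compared with the VF actually measured at 10 months.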

**Table 3**

**Figure 5**


**Table 4**

^{19}often due to limited resources, the usefulness of ANSWERS in short series is of particular interest.

^{20}This is because there are insufficient data to identify nonlinear change, should it exist, owing to the relatively short VF series acquired in clinical practice.

^{13,19}A recent study indicated that change in VF series may follow a nonlinear trend such as an exponential function.

^{21}It is, however, simple to configure ANSWERS to model nonlinear change^{9} in long VF series. Moreover, PoPLR was used to determine criteria for progression in PLR; however, other criteria defined on the combinations of slope and statistical significance are possible.^{22}

**H. Zhu**, P; **D.P. Crabb**, P; **T. Ho**, None; **D.F. Garway-Heath**, P

1. *Visual Fields*. 2nd ed. Oxford: Butterworth-Heinemann; 2000.
2. *Br J Ophthalmol*. 1997; 81: 1037–1042.
3. *Invest Ophthalmol Vis Sci*. 2003; 44: 3873–3879.
4. *Invest Ophthalmol Vis Sci*. 2012; 53: 6776–6784.
5. *JAMA Ophthalmol*. 2013; 131: 1565–1572.
6. *Invest Ophthalmol Vis Sci*. 2002; 43: 2654–2659.
7. *Invest Ophthalmol Vis Sci*. 2000; 41: 417–421.
8. *Invest Ophthalmol Vis Sci*. 2012; 53: 5985–5990.
9. *PLoS One*. 2014; 9: e85654.
10. *Invest Ophthalmol Vis Sci*. 2007; 48: 1642–1650.
11. *Ophthalmology*. 2000; 107: 1809–1815.
12. *Invest Ophthalmol Vis Sci*. 2013; 54: 1544–1553.
13. *Br J Ophthalmol*. 2008; 92: 569–573.
14. *Ophthalmology*. 2013; 120: 68–76.
15. *Lancet*. 2014; 385: 1295–1304.
16. *Invest Ophthalmol Vis Sci*. 2005; 46: 1659–1667.
17. *Human Brain Function*. San Diego: Academic Press; 1997.
18. *Invest Ophthalmol Vis Sci*. 2002; 43: 1400–1407.
19. *Br J Ophthalmol*. 2013; 97: 843–847.
20. *Invest Ophthalmol Vis Sci*. 2013; 54: 6694–6700.
21. *Invest Ophthalmol Vis Sci*. 2013; 54: 5505–5513.
22. *Invest Ophthalmol Vis Sci*. 2013; 54: 6234–6241.