**Purpose.**
To establish a method for estimating the overall statistical significance of visual field deterioration from an individual patient's data, and to compare its performance to pointwise linear regression.

**Methods.**
The Truncated Product Method was used to calculate a statistic *S* that combines evidence of deterioration from individual test locations in the visual field. The overall statistical significance (*P* value) of visual field deterioration was inferred by comparing *S* with its permutation distribution, derived from repeated reordering of the visual field series. Permutation of pointwise linear regression (PoPLR) and pointwise linear regression were evaluated in data from patients with glaucoma (944 eyes, median mean deviation −2.9 dB, interquartile range: −6.3, −1.2 dB) followed for more than 4 years (median 10 examinations over 8 years). False-positive rates were estimated from randomly reordered series of this dataset, and hit rates (proportion of eyes with significant deterioration) were estimated from the original series.

**Results.**
The false-positive rates of PoPLR were indistinguishable from the corresponding nominal significance levels and were independent of baseline visual field damage and length of follow-up. At *P* < 0.05, the hit rates of PoPLR were 12, 29, and 42%, at the fifth, eighth, and final examinations, respectively, and at matching specificities they were consistently higher than those of pointwise linear regression.

**Conclusions.**
In contrast to population-based progression analyses, PoPLR provides a continuous estimate of statistical significance for visual field deterioration individualized to a particular patient's data. This allows close control over specificity, essential for monitoring patients in clinical practice and in clinical trials.

^{1} (GCP) and Pointwise Linear Regression^{2} (PLR) are therefore more useful than global indices such as Mean Deviation (MD) and Visual Field Index (VFI).^{3,4} As yet, criteria for significant change for the entire visual field have only been derived for large groups of subjects, but not for individuals. However, visual field series from individual patients can differ substantially from each other.^{5–7} Therefore, when population-based criteria are used to define change, some patients are much more likely to produce a false-positive result than others (e.g., Artes PH, et al. *IOVS* 2011;52:ARVO E-Abstract 4148).

^{8–12} and change at individual test locations has been defined with similarly arbitrary criteria of slope and associated *P* value.^{8–11,13,14} In consequence, criteria for visual field change with PLR are not adapted to an individual patient's data. Rather, their specificity is likely to vary between patients, making results difficult to interpret. In addition, PLR does not provide a single significance value for deterioration in the whole field, one which could be varied continuously to produce any desired level of specificity.

*P* value

*combination function*^{15–17} that combines the significance of deterioration at each location into a single statistic. We then show how a *P* value for overall change can be derived through *permutation analysis*^{18–20} using only the patient's own data. This approach, which we call permutation of pointwise linear regression (PoPLR), provides a conceptually simple and intuitive way of deriving the overall significance of visual field change that is individualized to a patient's results. In a large clinical dataset of visual fields from patients with glaucoma, we demonstrate that the specificity of PoPLR is extremely close to the desired nominal level (*P* value), and that its performance compares favorably with many previously proposed PLR progression criteria.

*P* values) from individual locations across the visual field into a combined statistic *S*. The statistical significance of *S* in the observed sequence of examinations, denoted *S*_{obs}, is then calculated by comparing it with a null distribution of *S*, derived from reordered (permuted) sequences of the series. The null distribution is derived solely from the patient's own data, representing what would be expected by chance alone, and therefore the overall significance value is individualized to those data. The following section describes the approach in greater detail.

*P* value for change over time at individual locations. Since visual field sensitivity decreases with age, analyses were performed with total deviation, that is, the deviation of the measured threshold from the mean values expected in a healthy individual of the same age. PoPLR tests the null hypothesis that there is no negative change, and therefore one-sided *P* values were used.
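
As a minimal sketch of a one-sided test for negative change at a single location, the following fits an ordinary least-squares line and returns the lower-tail probability of the slope's *t* statistic. The function names `t_cdf` and `one_sided_p` are hypothetical, and the numerical integration of the *t* density is an illustrative shortcut chosen to avoid external libraries, not a description of the software used in the study.

```python
import math

def t_cdf(t, df, steps=2000):
    """Student-t CDF via Simpson's rule on the density (adequate for a sketch)."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(u * u / df))

    h = abs(t) / steps  # steps must be even for Simpson's rule
    if h == 0.0:
        return 0.5
    area = pdf(0.0) + pdf(abs(t))
    for k in range(1, steps):
        area += (4 if k % 2 else 2) * pdf(k * h)
    area *= h / 3
    return 0.5 + area if t > 0 else 0.5 - area

def one_sided_p(times, values):
    """P value for H0 'no negative change' against H1 'slope < 0' at one location."""
    n = len(times)
    if n < 3:
        raise ValueError("need at least three examinations")
    mx = sum(times) / n
    my = sum(values) / n
    sxx = sum((x - mx) ** 2 for x in times)
    sxy = sum((x - mx) * (y - my) for x, y in zip(times, values))
    slope = sxy / sxx
    sse = sum((y - my - slope * (x - mx)) ** 2 for x, y in zip(times, values))
    se = math.sqrt(sse / (n - 2) / sxx)
    if se == 0.0:  # perfect fit (assumed convention): decide by the sign of the slope
        return 0.0 if slope < 0 else 1.0
    return t_cdf(slope / se, n - 2)

# Total deviation (dB) at one location over three examinations (made-up values).
print(one_sided_p([0, 1, 2], [0.0, -1.0, -1.0]))
```

A negative slope gives a *P* value below 0.5 and a positive slope gives one above 0.5, so improving locations do not count as evidence of deterioration.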

*P* values of individual visual field locations (or *P*_{i}) were combined using the Truncated Product Method (Equation 1), a generalization of Fisher's method^{15} that is appropriate when only a small proportion of locations depart from an overall null hypothesis.^{17} Truncation means that only *P* values (*P*_{i}) below a given threshold *t*_{P} are used for the calculation of *S* (Equation 2). We choose *t*_{P} = max(0.05, min(*P*_{i})), that is, only locations with *P* values ≤ 0.05 will contribute to the test statistic; if none of the locations have a *P* value ≤ 0.05, the smallest *P* value is used. For small *P*_{i} in the observed sequence, the test statistic *S* will be large.
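
The truncation rule can be sketched in a few lines. Equations 1 and 2 are not reproduced in this text, so the form used here, *S* = −Σ ln *P*_{i} over locations with *P*_{i} ≤ *t*_{P}, is an assumption chosen to match the description (small *P* values give a large *S*). The function name `truncated_product_statistic` and the clamp against log(0) are likewise illustrative.

```python
import math

def truncated_product_statistic(p_values, floor=0.05):
    """Combine pointwise one-sided P values into one statistic S.

    Assumed form: S = -sum(ln P_i) over locations with P_i <= t_P,
    where t_P = max(floor, min(P_i)) as described in the text.
    """
    if not p_values:
        raise ValueError("no P values supplied")
    t_p = max(floor, min(p_values))
    # Clamp is an added safeguard so that a P value of exactly 0 does not break log().
    return -sum(math.log(max(p, 1e-300)) for p in p_values if p <= t_p)

# Two locations fall below 0.05 and contribute; the other two are ignored.
print(truncated_product_statistic([0.01, 0.04, 0.20, 0.60]))

# No location reaches 0.05, so only the smallest P value (0.20) is used.
print(truncated_product_statistic([0.20, 0.50]))
```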

*S*_{obs} can be inferred by comparing it to the null distribution obtained from repeated permutation of the observed visual field series.^{20,21} The examination order was repeatedly permuted, and the test statistic from each unique permutation was added to a permutation (null) distribution *S*_{P}. A total of *n*! (“n factorial”) unique permutations exist for a given number of examinations *n* (e.g., 40,320 for eight examinations). For series with more than six examinations, 5000 unique permutations were randomly selected. The overall significance of *S*_{obs} can then be derived by comparing it with the null distribution. If, for example, *S*_{obs} corresponds to the 97th percentile of the set of *S*_{P}, the overall *P* value is 0.03.
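
The permutation step can be sketched as follows. All names (`permutation_orders`, `permutation_p_value`, `neg_slope`) are hypothetical, and the toy statistic (minus the slope at a single location) merely stands in for the combined statistic *S*. The *P* value is taken as the proportion of permuted statistics at least as large as the observed one, which is one common convention and is assumed here.

```python
import itertools
import math
import random

def permutation_orders(n, max_enumerate=6, n_random=5000, rng=None):
    """All n! orders for short series, otherwise n_random distinct random orders."""
    if n <= max_enumerate:
        return list(itertools.permutations(range(n)))
    rng = rng or random.Random(0)  # fixed seed only to make the sketch repeatable
    wanted = min(n_random, math.factorial(n))
    orders = set()
    while len(orders) < wanted:
        orders.add(tuple(rng.sample(range(n), n)))
    return list(orders)

def permutation_p_value(series, statistic, rng=None):
    """Share of reordered series whose statistic is at least the observed one."""
    s_obs = statistic(series)
    null = [statistic([series[i] for i in order])
            for order in permutation_orders(len(series), rng=rng)]
    return sum(s >= s_obs for s in null) / len(null)

def neg_slope(series, location=0):
    """Toy statistic: minus the least-squares slope at one location over exam index."""
    n = len(series)
    mx = (n - 1) / 2
    sxy = sum((i - mx) * exam[location] for i, exam in enumerate(series))
    sxx = sum((i - mx) ** 2 for i in range(n))
    return -sxy / sxx

# Five examinations, one location each, steadily declining (made-up values).
exams = [[30.0], [29.0], [27.0], [24.0], [20.0]]
print(permutation_p_value(exams, neg_slope))  # 1/120, about 0.0083
```

With five examinations all 120 orders are enumerated, and only the observed order reaches *S*_{obs}, so the smallest attainable *P* value is 1/120. When orders are sampled at random, the observed order may be absent from the sample; some implementations add it explicitly so that the *P* value cannot be zero.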

*P* values, whereas, for PLR, conventional “number of significant locations” criteria were used. Significant locations in PLR were defined based on slopes <0 dB/year and <−1 dB/year, with *P* < 0.01. For MD, the *P* value of the slope from linear regression over time was used.^{22} Results were assessed at the fifth, eighth, and final examination.

**Table 1.**

*P* < 0.05, the hit rates of PoPLR were 12.3, 29.2, and 42.3% at the fifth, eighth, and final examinations, respectively. In series with a significant PoPLR result (*P* < 0.05), the median (interquartile range) MD change from baseline was −2.4 (−3.3, −1.3), −2.5 (−4.5, −1.5), and −2.9 (−5.1, −1.2) dB at the fifth, eighth, and final examinations.

*S*_{obs}) would be considered significant (*P* < 0.05, i.e., the 95th percentile of *S*_{P}), varied by a factor of almost 5, with a range of 40 to 187 (see Figs. 4D, 5D, 6D, 7D).

**Figure 1.**

**Figure 2.**

**Figure 3.**

**Figure 4.**

**Figure 5.**

**Figure 6.**

**Figure 7.**

*P* values followed a uniform distribution (Kolmogorov-Smirnov, *P* = 0.86, 0.70, 0.69 at the fifth, eighth, and final examinations). At a significance level of 0.05, for example, PoPLR identified significant deterioration in 5.3% of reordered series. Thus the false-positive rates of PoPLR closely matched the nominal significance levels. In addition, false positives of PoPLR were not associated with baseline age, series mean MD, or follow-up length (logistic regression, *P* = 0.72, *P* = 0.41, *P* = 0.80, at the final examination). In contrast, PLR criteria do not have an overall significance value, and their false-positive rate decreased with the number of examinations (Fig. 1).
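
A rough sketch of the two checks described above: the false-positive rate is the share of reordered series with a *P* value below the chosen level, and agreement with the uniform distribution can be summarized by the Kolmogorov-Smirnov distance. The function names and the five example *P* values are made up for illustration; the sketch computes only the distance *D*, not the associated significance.

```python
def false_positive_rate(null_p_values, alpha=0.05):
    """Proportion of reordered (null) series flagged as deteriorating at level alpha."""
    return sum(p < alpha for p in null_p_values) / len(null_p_values)

def ks_statistic_uniform(p_values):
    """Largest gap between the empirical CDF of the P values and the uniform CDF."""
    ps = sorted(p_values)
    n = len(ps)
    return max(max((i + 1) / n - p, p - i / n) for i, p in enumerate(ps))

# Hypothetical P values from five reordered series.
null_ps = [0.1, 0.3, 0.5, 0.7, 0.9]
print(false_positive_rate(null_ps, alpha=0.2))
print(ks_statistic_uniform(null_ps))
```

If the null *P* values are uniform, the false-positive rate at any level alpha should be close to alpha itself, which is the property reported for PoPLR.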

*P* < 0.001) but not at the fifth examination (*P* = 0.37).

**Table 2.**

| Number of Locations | Fifth: PLR False-Positive Rate (%) | Fifth: PLR Hit Rate (%) | Fifth: PoPLR Hit Rate (%) | Eighth: PLR False-Positive Rate (%) | Eighth: PLR Hit Rate (%) | Eighth: PoPLR Hit Rate (%) | Final: PLR False-Positive Rate (%) | Final: PLR Hit Rate (%) | Final: PoPLR Hit Rate (%) |
|---|---|---|---|---|---|---|---|---|---|
| ≥1 | 14.4 | 21.3 | 29.5* | 12.2 | 37.0 | 41.3* | 9.3 | 48.5 | 50.6 |
| ≥2 | 3.1 | 6.8 | 8.3 | 3.0 | 19.9 | 22.0* | 1.7 | 33.3 | 34.3 |
| ≥3 | 1.2 | 2.3 | 3.3 | 1.0 | 11.8 | 13.3 | 0.4 | 24.7 | 25.8 |

*P* < 0.01). At the final examination, 25 (6%) eyes with significant change by PoPLR (*P* < 0.05) had no significantly changing PLR locations, while 15 (3%) eyes not changing by PoPLR (*P* ≥ 0.05) had two or more significant PLR locations.

*P* = 0.003). With PLR, there were three locations with significant change (*P* < 0.01), of which one had a slope <−1 dB/year. The rate of change of the MD was −0.42 dB/year, significantly different from zero (*P* = 0.008).

*P* = 0.08). With PLR, two locations met the criteria as changing, with *P* < 0.01 and slope <−1 dB/year. The rate of change of the MD was −0.45 dB/year, not significantly different from zero (*P* = 0.23). The MD values showed large variability, and the permutation distribution exhibited a large tail.

*P* = 0.01). This change did not meet a PLR criterion of one or more locations with significant change, and the rate of MD change (−0.15 dB/year) was not significantly different from zero (*P* = 0.16).

*P* = 0.04), and the rate of change of MD (−0.08 dB/year) was statistically significant (*P* = 0.03). With PLR, no single location deteriorated with *P* < 0.01 (two-sided).

*P* value that is individualized to a particular patient's data: the false-positive rate is independent of factors such as variability, level of visual field damage, and length of follow-up. These properties distinguish PoPLR from many other techniques of determining change in the visual field.

*P* value for a clearly defined null hypothesis. Rather, the results need to be interpreted with reference to large groups of subjects (population-based criteria). With GCP, for example, criteria for “likely progression” and “possible progression” have been established in the Early Manifest Glaucoma Trial^{23} and have been shown to provide high specificity, *on average*. However, when the properties of an individual patient's data differ from those of the reference group, population-based change criteria can give misleading results. For example, visual fields with advanced damage often contain many locations at which the dynamic range of the instrument has been exhausted (sensitivity < 0 dB) such that further deterioration is no longer measurable. A population-based progression criterion that demands change at a fixed number of visual field locations, for example, will thus be more conservative (less sensitive, more specific) in patients with more advanced damage. We have previously shown that the likelihood of experiencing a false-positive result with the GCP can vary by as much as 40 times between patients (Artes PH, et al. *IOVS* 2011;52:ARVO E-Abstract 4148). With PLR, the specificity of any given criterion varies, among other factors, with the number of examinations. In our data, for example, PLR with a criterion of “≥1 location with a slope < −1.0 dB/year, at a *P* value < 0.01” gave a false-positive rate of 10.4% after five examinations, decreasing to 5.9% after eight examinations. Since the specificity of population-based criteria varies with the properties of the data, it is difficult to interpret the findings when such criteria are applied to individual patients.

*S*_{obs} and its permutation distribution in the individual patient's series of visual fields. Large variability within the series will cause the permutation distribution to be wider, and a given *S*_{obs} will therefore be associated with a larger *P* value (lesser significance) than in a series with lower variability. In the highly variable visual field series of case 2 (Fig. 5), for example, an *S*_{obs} of 32 was only borderline significant (*P* = 0.08), while the *S*_{obs} of 14 in case 4 (Fig. 7) was associated with a *P* value of 0.04. The nearly 5-fold variation in the width of the permutation distributions illustrates the importance of judging significance based on the properties of the individual visual field series, rather than by population-based cutoff values. Because PoPLR provides a continuous *P* value rather than just a categorical classification (change/no change, as with population-based criteria), it will support more differentiated judgments in borderline cases in which the *P* value is close to a particular significance level (e.g., 0.05).

*P* values followed the expected uniform distribution. This means that the false-positive rate of the approach equals the nominal significance level; for example, the probability of falsely detecting visual field deterioration in a stable series, at a significance level of *P* < 0.05, would be 5%. Our results also demonstrate that PoPLR performs at least as well as conventional PLR criteria in distinguishing between observed and randomly reordered series. With five examinations, for example, a PLR criterion of “≥1 location with slope < −1 dB/year at *P* < 0.01” detected deterioration in 17% of the series, at a specificity of 90%. At the same specificity (i.e., with a *P* value of 0.10), PoPLR detected deterioration in 23% of the observed visual field series (Fig. 2). Both PoPLR and PLR had substantially higher hit rates, at matched specificities, compared to the *P* value associated with the MD rate of change, underscoring the greater utility of localized analyses of visual field deterioration over global indices.

^{24,25} and have been applied to assess changes in the optic disc in glaucoma.^{26,27} While they make fewer assumptions than model-based approximations, they require greater computational effort that has only recently become feasible.

^{28} and different types of visual field tests.^{29–31}

1. *Perimetry Update 1990/91*. Amsterdam: Kugler & Ghedini; 1991:303–315.
2. *Br J Ophthalmol*. 1996;80:40–48.
3. *Invest Ophthalmol Vis Sci*. 1990;31:512–520.
4. *Invest Ophthalmol Vis Sci*. 2003;44:3873–3879.
5. *Invest Ophthalmol Vis Sci*. 2000;41:417–421.
6. *Invest Ophthalmol Vis Sci*. 2005;46:2451–2457.
7. *Br J Ophthalmol*. 2011;95:189–193.
8. *Graefes Arch Clin Exp Ophthalmol*. 1996;234:411–418.
9. *Br J Ophthalmol*. 1997;81:1037–1042.
10. *Graefes Arch Clin Exp Ophthalmol*. 2009;247:1659–1669.
11. *Invest Ophthalmol Vis Sci*. doi:10.1167/iovs.11-8186.
12. *Arch Ophthalmol*. 2007;125:1176–1181.
13. *Eye*. 2005;20:98–106.
14. *Br J Ophthalmol*. 2004;88:88–94.
15. *Statistical Methods for Research Workers*. Edinburgh, London: Oliver and Boyd; 1925.
16. *Encyclopedia of Biostatistics*. 2nd ed. Chichester, UK: John Wiley & Sons; 2005:987–991.
17. *Genet Epidemiol*. 2002;22:170–185.
18. *The Design of Experiments*. Edinburgh, London: Oliver and Boyd; 1935.
19. *Ann Math Stat*. 1957;28:181–187.
20. *Permutation, Parametric, and Bootstrap Tests of Hypotheses*. 3rd ed. New York: Springer; 2005.
21. *J R Stat Soc*. 1937;4(suppl):119–130.
22. *Clin Chem*. 1993;39:561–577.
23. *Acta Ophthalmol Scand*. 2003;81:286–293.
24. *J Cereb Blood Flow Metab*. 1996;16:7–22.
25. *Hum Brain Mapp*. 2002;15:1–25.
26. *Invest Ophthalmol Vis Sci*. 2005;46:1659–1667.
27. *Arch Ophthalmol*. 2010;128:560–568.
28. *Arch Ophthalmol*. 2011;129:1521–1527.
29. *Invest Ophthalmol Vis Sci*. 2009;50:974–979.
30. *Invest Ophthalmol Vis Sci*. 2009;50:4700–4708.
31. *Invest Ophthalmol Vis Sci*. 2010;51:5685–5689.