Glaucoma  |   October 2012
Visual Field Progression in Glaucoma: Estimating the Overall Significance of Deterioration with Permutation Analyses of Pointwise Linear Regression (PoPLR)
Author Notes
  • From the Department of Ophthalmology and Visual Sciences, Faculty of Medicine, Dalhousie University, Halifax, Nova Scotia, Canada. 
  • Corresponding author: Paul H. Artes, Ophthalmology and Visual Sciences, Dalhousie University, Room 2035, West Victoria, 1276 South Park Street, Halifax, NS, B3H 2Y9, Canada; paul@dal.ca
Investigative Ophthalmology & Visual Science October 2012, Vol.53, 6776-6784. doi:10.1167/iovs.12-10049
      Neil O'Leary, Balwantray C. Chauhan, Paul H. Artes; Visual Field Progression in Glaucoma: Estimating the Overall Significance of Deterioration with Permutation Analyses of Pointwise Linear Regression (PoPLR). Invest. Ophthalmol. Vis. Sci. 2012;53(11):6776-6784. doi: 10.1167/iovs.12-10049.

      © ARVO (1962-2015); The Authors (2016-present)

Abstract

Purpose. To establish a method for estimating the overall statistical significance of visual field deterioration from an individual patient's data, and to compare its performance to pointwise linear regression.

Methods. The Truncated Product Method was used to calculate a statistic S that combines evidence of deterioration from individual test locations in the visual field. The overall statistical significance (P value) of visual field deterioration was inferred by comparing S with its permutation distribution, derived from repeated reordering of the visual field series. Permutation of pointwise linear regression (PoPLR) and pointwise linear regression were evaluated in data from patients with glaucoma (944 eyes, median mean deviation −2.9 dB, interquartile range: −6.3, −1.2 dB) followed for more than 4 years (median 10 examinations over 8 years). False-positive rates were estimated from randomly reordered series of this dataset, and hit rates (proportion of eyes with significant deterioration) were estimated from the original series.

Results. The false-positive rates of PoPLR were indistinguishable from the corresponding nominal significance levels and were independent of baseline visual field damage and length of follow-up. At P < 0.05, the hit rates of PoPLR were 12, 29, and 42%, at the fifth, eighth, and final examinations, respectively, and at matching specificities they were consistently higher than those of pointwise linear regression.

Conclusions. In contrast to population-based progression analyses, PoPLR provides a continuous estimate of statistical significance for visual field deterioration individualized to a particular patient's data. This allows close control over specificity, essential for monitoring patients in clinical practice and in clinical trials.

Introduction
Visual field progression is one of the most important clinical signs of deterioration in optic neuropathies such as glaucoma, and many different techniques have been designed to measure change over time and its significance. However, there is still no consensus on which criteria are most appropriate for determining visual field progression in glaucoma. 
In patients with glaucoma, progressive changes are often localized to parts of the visual field, and methods that analyze deterioration at individual test locations, such as Glaucoma Change Probability 1 (GCP) and Pointwise Linear Regression 2 (PLR), are therefore more useful than global indices such as Mean Deviation (MD) and Visual Field Index (VFI). 3,4 As yet, criteria for significant change for the entire visual field have only been derived for large groups of subjects, but not for individuals. However, visual field series from individual patients can differ substantially from each other. 5–7 Therefore, when population-based criteria are used to define change, some patients are much more likely to produce a false-positive result than others (e.g., Artes PH, et al. IOVS 2011; 52:ARVO E-Abstract 4148). 
PLR was developed as a technique for evaluating change at individual test locations over the entire follow-up. Criteria for overall change (i.e., for the entire visual field) have been based, arbitrarily, on fixed numbers of changing test locations, 8–12 and change at individual test locations has been defined with similarly arbitrary criteria of slope and associated P value. 8–11,13,14 In consequence, criteria for visual field change with PLR are not adapted to an individual patient's data. Rather, their specificity is likely to vary between patients, making results difficult to interpret. In addition, PLR does not provide a single significance value for deterioration in the whole field, one which could be varied continuously to produce any desired level of specificity. 
In this paper, we propose an individualized analysis to estimate the overall significance of visual field deterioration, based on a P value combination function 15–17 that combines the significance of deterioration at each location into a single statistic. We then show how a P value for overall change can be derived through permutation analysis 18–20 using only the patient's own data. This approach, which we call permutation of pointwise linear regression (PoPLR), provides a conceptually simple and intuitive approach for deriving the overall significance of visual field change that is individualized to a patient's results. In a large clinical dataset of visual fields from patients with glaucoma, we demonstrate that the specificity of PoPLR is extremely close to the desired nominal level (P value), and that its performance compares favorably with many previously proposed PLR progression criteria. 
Methods
The PoPLR Approach
The aim of PoPLR is to derive a single, accurate, and easily interpretable statistic for the significance of pointwise visual field deterioration. The null hypothesis tested by PoPLR is that there is no negative change at any visual field location. The approach is to combine evidence for deterioration (P values) from individual locations across the visual field into a combined statistic S. The statistical significance of S in the observed sequence of examinations, denoted S obs, is then calculated by comparing it with a null distribution of S, derived from reordered (permuted) sequences of the series. The null distribution is derived solely from the patient's own data, representing what would be expected by chance alone, and therefore the overall significance value is individualized to those data. The following section describes the approach in greater detail. 
Significance of Change at Individual Visual Field Locations.
Simple linear regression was used to derive a P value for change over time at each individual location. Since visual field sensitivity decreases with age, analyses were performed with total deviation, that is, the deviation of the measured threshold from the mean value expected in a healthy individual of the same age. PoPLR tests the null hypothesis that there is no negative change, and therefore one-sided P values were used. 
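As an illustration of this step, the following minimal sketch fits a least-squares line to the total deviation values of one location and returns the one-sided P value for a negative slope. It is not the authors' MATLAB or R code; the function names `t_cdf` and `negative_trend_p` are hypothetical, and the Student t distribution is evaluated numerically so that only the Python standard library is needed.

```python
import math

def t_cdf(t, df, steps=2000):
    """Student t CDF, obtained by Simpson integration of cos(theta)**(df - 1)."""
    # Substituting x = sqrt(df) * tan(theta) maps the density integral onto a
    # bounded interval, so very large |t| (even infinity) is handled safely.
    upper = math.atan(abs(t) / math.sqrt(df))
    norm = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(math.pi)
    h = upper / steps
    total = 1.0 + math.cos(upper) ** (df - 1)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * math.cos(k * h) ** (df - 1)
    area = min(0.5, norm * total * h / 3)
    return 0.5 + area if t > 0 else 0.5 - area

def negative_trend_p(times, values):
    """Return (slope, one-sided P) for one location; small P means deterioration."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    sse = sum((v - mean_v - slope * (t - mean_t)) ** 2 for t, v in zip(times, values))
    if sse == 0.0:
        # Perfect fit: the t statistic is unbounded (or zero for a flat series).
        t_stat = 0.0 if slope == 0.0 else math.copysign(math.inf, slope)
    else:
        t_stat = slope / math.sqrt(sse / (n - 2) / sxx)
    # One-sided test: probability of a t statistic at least this negative.
    return slope, t_cdf(t_stat, n - 2)

# Illustrative values only (years, total deviation in dB); not study data.
slope, p = negative_trend_p([0, 1, 2, 3, 4], [0.0, -1.0, -2.1, -2.9, -4.0])
print(slope, p)
```

A positive slope yields a P value above 0.5 under this convention, so improving locations never count as evidence of deterioration.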
Combination of Significance Values across the Visual Field.
P values of individual visual field locations (or Pi) were combined using the Truncated Product Method (Equation 1), a generalization of Fisher's method 15 that is appropriate when only a small proportion of locations depart from an overall null hypothesis. 17 Truncation means that only P values (Pi) at or below a given threshold tP are used for the calculation of S (Equation 2). We chose tP = max (0.05, min (Pi)); that is, only locations with P values ≤ 0.05 contribute to the test statistic, and if no location has a P value ≤ 0.05, the smallest P value is used. For small Pi in the observed sequence, the test statistic S will be large. 
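The combination step can be sketched as follows. The equations themselves are not reproduced in this text, so the form used here, S = −Σ ln(Pi) over all locations with Pi ≤ tP, is an assumption based on the usual definition of the Truncated Product Method and on the statement that small Pi give a large S; the choice of the natural logarithm and the small floor that guards against log(0) are likewise assumptions, and the function name is hypothetical.

```python
import math

def truncated_product_statistic(p_values, base_threshold=0.05, floor=1e-300):
    """Combine pointwise one-sided P values into a single statistic S."""
    # Threshold as described in the text: tP = max(0.05, min(Pi)), so at least
    # the smallest P value always contributes.
    t_p = max(base_threshold, min(p_values))
    # Assumed form: negative sum of natural logs of the retained P values.
    return -sum(math.log(max(p, floor)) for p in p_values if p <= t_p)

# Illustrative P values only; not study data.
print(truncated_product_statistic([0.01, 0.04, 0.50]))
print(truncated_product_statistic([0.20, 0.60]))
```

In the first call only the two P values at or below 0.05 contribute; in the second, no location reaches 0.05, so only the smallest P value (0.20) is used.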
Deriving the Statistical Significance of Sobs.
The statistical significance of S obs can be inferred by comparing it to the null distribution obtained from repeated permutation of the observed visual field series. 20,21 The examination order was repeatedly permuted, and the test statistic from each unique permutation was added to a permutation (null) distribution SP. A total of n! (“n factorial”) unique permutations exist for a given number of examinations n (e.g., 40,320 for eight examinations). For series with more than six examinations, 5000 unique permutations were randomly selected. The overall significance of S obs can then be derived by comparing it with the null distribution. If, for example, S obs corresponds to the 97th percentile of the set of SP, the overall P value is 0.03. 
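The permutation step can be sketched generically as below. This is an illustration under stated assumptions, not the published implementation: `permutation_p_value` and `decline_score` are hypothetical names, the P value is taken as the proportion of permuted statistics that are at least as large as the observed one, and the demonstration uses a simple single-location decline score as a stand-in for the combined statistic S.

```python
import itertools
import math
import random

def permutation_p_value(exams, statistic, max_perms=5000, seed=0):
    """P value of statistic(exams) against its permutation distribution."""
    n = len(exams)
    positions = tuple(range(n))
    if math.factorial(n) <= max_perms:
        # Short series (six examinations or fewer): enumerate every ordering.
        orders = list(itertools.permutations(positions))
    else:
        # Longer series: draw max_perms unique random orderings.
        rng = random.Random(seed)
        unique = set()
        while len(unique) < max_perms:
            unique.add(tuple(rng.sample(positions, n)))
        orders = list(unique)
    s_obs = statistic(exams)
    null = [statistic([exams[i] for i in order]) for order in orders]
    # Assumed convention: share of reordered series scoring at least S obs.
    return sum(1 for s in null if s >= s_obs) / len(null)

def decline_score(values):
    """Stand-in statistic: larger when values fall over equally spaced visits."""
    mid = (len(values) - 1) / 2
    return -sum((i - mid) * v for i, v in enumerate(values))

# Illustrative sensitivities (dB) only; not study data.
print(permutation_p_value([30, 28, 27, 25, 22], decline_score))
```

In a full PoPLR analysis each element of `exams` would be an entire visual field, and `statistic` would refit the pointwise regressions on the reordered series before combining the resulting P values.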
Animations of PoPLR are available (see Supplementary Material). 
Evaluation of PoPLR
Dataset.
The data used in this report were obtained from the glaucoma clinics at the Queen Elizabeth II Health Sciences Centre in Halifax, NS. In accordance with the Declaration of Helsinki, the Ethics Review Board of the QEII Health Sciences Centre approved the retrospective use of these anonymized data. 
The visual fields of all patients examined using program 24-2 Swedish Interactive Threshold Algorithm Standard with the Humphrey Field Analyzer (HFA) II (Carl Zeiss Meditec, Inc., Dublin, CA) between August 1998 and January 2011 were retrieved. Only eyes with 10 or more examinations were included (1109 eyes of 614 patients), and the first two examinations of each eye were omitted to reduce learning effects. Examinations that had been repeated within 3 months were excluded, and no intervals between consecutive examinations greater than 2 years were permitted. Thereafter, only eyes with at least 8 examinations over at least 4 years were selected for analysis. If both eyes of a patient qualified, both were included. In total, 9930 visual fields from 944 eyes of 520 patients were included. 
Comparison of Performance of PoPLR with PLR and MD.
PoPLR was compared with PLR and with linear regression of the MD. Criteria for PoPLR were the overall P values, whereas, for PLR, conventional “number of significant locations” criteria were used. Significant locations in PLR were defined based on slopes <0 dB/year and <−1 dB/year, with P < 0.01. For MD, the P value of the slope from linear regression over time was used. 
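For comparison, a conventional “number of significant locations” PLR criterion can be sketched as below. The function name and parameter names are hypothetical; the slope and P value of each location are taken as given inputs, and the defaults correspond to the criterion of slope < 0 dB/year with P < 0.01 at one or more locations.

```python
def plr_flags_progression(slopes, p_values, slope_limit=0.0, p_limit=0.01, min_locations=1):
    """True if enough locations meet both the slope and the P value criterion."""
    significant = sum(
        1 for slope, p in zip(slopes, p_values)
        if slope < slope_limit and p < p_limit
    )
    return significant >= min_locations

# Illustrative slopes (dB/year) and P values only; not study data.
slopes = [-1.5, -0.3, 0.2]
p_values = [0.004, 0.008, 0.5]
print(plr_flags_progression(slopes, p_values, min_locations=2))
print(plr_flags_progression(slopes, p_values, slope_limit=-1.0, min_locations=2))
```

Unlike the PoPLR P value, the result is a yes/no classification whose false-positive rate depends on the series it is applied to.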
Because there is no independent reference standard for true visual field change, it is difficult to estimate sensitivity and specificity from clinical data. We therefore used the positive rate in the originally observed series (subsequently referred to as “hit rate,” i.e., true positives and false positives) in place of sensitivity. To estimate the false-positive rate (1 − specificity), a “no-change” dataset was created from the original data, by randomly reordering each series once. Under the assumption that random reordering removes systematic change, the false-positive rates of the different methods were then estimated from the positive rates in this reordered dataset. 
By varying the criteria applied with each method (PoPLR, PLR, and MD), curves of hit rate versus false-positive rate were constructed to evaluate and compare the performance of the methods, similar to Receiver Operator Characteristic analyses. 22 Results were assessed at the fifth, eighth, and final examination. 
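For a method that yields a P value per series, such as PoPLR or MD regression, the construction of these curves can be sketched as follows. The function names are hypothetical, and the sketch assumes one P value per eye from the observed series and one from the once-reordered “no-change” series.

```python
def positive_rate(p_values, alpha):
    """Proportion of series flagged as deteriorating at significance level alpha."""
    return sum(1 for p in p_values if p < alpha) / len(p_values)

def rate_curve(observed_p, reordered_p, alphas):
    """(alpha, false-positive rate, hit rate) for each significance level."""
    return [
        (alpha, positive_rate(reordered_p, alpha), positive_rate(observed_p, alpha))
        for alpha in alphas
    ]

# Illustrative P values only; not study data.
observed = [0.01, 0.03, 0.20, 0.60]
reordered = [0.04, 0.30, 0.50, 0.90]
for alpha, fpr, hit in rate_curve(observed, reordered, [0.05, 0.5]):
    print(alpha, fpr, hit)
```

For PLR, the same two rates would instead be computed for each “number of significant locations” criterion, giving a few discrete points rather than a continuous curve.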
Analyses were performed in MATLAB (R2010b; MathWorks, Inc., Natick, MA). Calculating the overall significance from PoPLR on a series of 10 visual field examinations took approximately 2 seconds on a 2.67-GHz Pentium processor. 
An implementation of PoPLR is freely available in visualFields, a package for the statistical analysis of visual field data for the open-source environment R (http://cran.r-project.org/web/packages/visualFields/, in the public domain and last accessed August 28, 2012). 
Results
Demographic Details of Patients
Table 1 provides details of the visual field data selected for analysis. In 424 of 520 patients (82%), both eyes met the selection criteria and were analyzed. In the remaining 96 patients, only one eye was eligible for analysis. At the fifth, eighth, and final examinations of each series the median (interquartile range) follow-up durations were 3.4 (2.7, 4.1), 6.0 (4.9, 7.0), and 8.0 (6.7, 9.2) years. 
Table 1. Details of Baseline and Follow-up (Median and Interquartile Range) for Included Visual Field Series

Characteristic            Value
Number of patients        520
Number of eyes            944
Follow-up duration, y     8.0 (6.7, 9.2)
Number of examinations    10 (9, 12)
Baseline age, y           67.2 (58.8, 75.0)
Baseline MD, dB           −2.9 (−6.3, −1.2)
Baseline PSD, dB          2.6 (1.8, 5.5)

Hit Rate
At P < 0.05, the hit rates of PoPLR were 12.3, 29.2, and 42.3% at the fifth, eighth, and final examinations, respectively. In series with a significant PoPLR result (P < 0.05), the median (interquartile range) MD change from baseline was −2.4 (−3.3, −1.3), −2.5 (−4.5, −1.5), and −2.9 (−5.1, −1.2) dB at the fifth, eighth, and final examinations. 
The permutation distributions differed greatly between series. The cutoff value of the permutation distribution beyond which the observed statistic (S obs) would be considered significant (P < 0.05, i.e., the 95th percentile of SP ), varied by a factor of almost 5, with a range of 40 to 187 (see Figs. 4D, 5D, 6D, 7D). 
Figure 1. 
 
False-positive rates of PLR with one, two, and three significantly deteriorating locations at the fifth, eighth, and final examinations. Significantly deteriorating locations were defined by slope <0 dB/year, P <0.01 (left panel) and slope <−1 dB/year, P <0.01 (right panel).
Figure 2. 
 
Hit rate versus false-positive rate of PoPLR, pointwise linear regression (PLR) with a given number (n) of significant locations, and the P value of simple linear regression of mean deviation (MD) over time, at the fifth, eighth, and final examinations.
Figure 3. 
 
Relationships between the significance of PoPLR and the number of significantly deteriorating PLR locations (slope < 0 dB/year, P < 0.01) at the fifth, eighth, and final examinations. The area of each circle is proportional to the number of series with PoPLR significance between the indicated limits (horizontal lines) and a given number of significant PLR locations. The number of series is given to the top right or inside of each circle. Where the significance of PoPLR was greater than 0.05, numbers are represented by open circles.
Figure 4. 
 
Case 1. (A) Grayscale maps of sensitivity of the first, middle, and last examinations, with patient age at each examination. (B) Mean deviation over time with fitted simple linear regression line, with slope and associated P value (two-sided). A red line indicates a significantly negative (P < 0.05) slope. The three examinations in A are indicated by black dots. (C) A map of slopes and associated P values (one-sided) of total deviation over time. Colors indicate the direction of the slope (red: slope < 0 dB/year, green: slope > 0 dB/year). Gray squares highlight locations with P < 0.05, which contribute to the Truncated Product Method test-statistic (S) used in PoPLR. To show which locations would be classified as changing by PLR criteria, locations outlined with a dark square indicate a two-sided P value < 0.01. In addition, slopes below a critical value of −1 dB/year are further indicated by a heavily outlined square. Locations with sensitivities ≤0 dB across the entire series are represented by a gray point (slope and P value not available). (D) The permutation distribution SP of the calculated test statistics S for 5000 unique, random permutations. Values of S > 70 are binned together. The test statistic for the observed series S obs is indicated by the position of the red line—the further to the right, the more significant the change in the observed series. An overall significance for deterioration, P associated with S obs, is also shown. The 95th percentile (S 95) of the distribution, indicated by a black dashed arrow, is the cutoff beyond which S obs would be considered significant (P < 0.05).
Figure 5. 
 
See legend to Figure 4.
Figure 6. 
 
See legend to Figure 4.
Figure 7. 
 
See legend to Figure 4.
False-Positive Rate
In the reordered data, PoPLR P values followed a uniform distribution (Kolmogorov-Smirnov, P = 0.86, 0.70, 0.69 at the fifth, eighth, and final examinations). At a significance level of 0.05, for example, PoPLR identified significant deterioration in 5.3% of reordered series. Thus the false-positive rates of PoPLR closely matched the nominal significance levels. In addition, false positives of PoPLR were not associated with baseline age, series mean MD, or follow-up length (logistic regression, P = 0.72, P = 0.41, P = 0.80, at the final examination). In contrast, PLR criteria do not have an overall significance value, and their false-positive rate decreased with the number of examinations (Fig. 1). 
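The uniformity check can be illustrated with the Kolmogorov-Smirnov distance between a set of P values and the uniform distribution on [0, 1]. This sketch computes only the distance D, not the significance value reported above, and the function name is hypothetical.

```python
def ks_uniform_distance(p_values):
    """Largest gap between the empirical CDF of p_values and the uniform CDF."""
    ordered = sorted(p_values)
    n = len(ordered)
    # Gap just after each step of the empirical CDF, and just before it.
    above = max((i + 1) / n - p for i, p in enumerate(ordered))
    below = max(p - i / n for i, p in enumerate(ordered))
    return max(above, below)

# Illustrative P values only; not study data.
print(ks_uniform_distance([0.1, 0.3, 0.5, 0.7, 0.9]))
print(ks_uniform_distance([0.01, 0.01, 0.01, 0.01]))
```

Evenly spread P values give a small distance, whereas P values clustered near zero, as would be expected with an inflated false-positive rate, give a distance close to 1.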
Hit Rate Versus False-Positive Rate
Figure 2 shows the hit rate versus false-positive rate of PoPLR, compared with PLR and MD. The hit rate of PoPLR was always higher than that of PLR, at matching false-positive rates (Table 2). When progression was defined by one or more significant locations, there was a statistically significant gain in hit rate with PoPLR, compared with PLR, at the eighth and final available examinations (McNemar's test, P < 0.001) but not at the fifth examination (P = 0.37). 
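A paired comparison of this kind can be sketched with an exact McNemar test on the eyes that the two methods classify differently. The text does not state whether the exact or the chi-square version of the test was used, so the exact binomial form here is an assumption, and the function name is hypothetical.

```python
import math

def mcnemar_exact(flags_a, flags_b):
    """Return (b, c, two-sided P) for paired yes/no results of two methods."""
    # b: flagged by method A only; c: flagged by method B only.
    b = sum(1 for x, y in zip(flags_a, flags_b) if x and not y)
    c = sum(1 for x, y in zip(flags_a, flags_b) if y and not x)
    n = b + c
    if n == 0:
        return b, c, 1.0
    # Under H0 each discordant eye is equally likely to favor either method.
    tail = sum(math.comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)

# Illustrative classifications only; not study data.
method_a = [True] * 10 + [False] * 5
method_b = [False] * 15
print(mcnemar_exact(method_a, method_b))
```

Only discordant eyes influence the result; eyes flagged by both methods or by neither carry no information about which method detects more.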
Table 2. Hit Rates (%) of PoPLR and PLR Criteria at Matched False-Positive Rates (%)

                       Fifth Examination        Eighth Examination       Final Examination
Number of Locations    FPR    PLR    PoPLR      FPR    PLR    PoPLR      FPR    PLR    PoPLR
≥1                     14.4   21.3   29.5*      12.2   37.0   41.3*      9.3    48.5   50.6
≥2                     3.1    6.8    8.3        3.0    19.9   22.0*      1.7    33.3   34.3
≥3                     1.2    2.3    3.3        1.0    11.8   13.3       0.4    24.7   25.8

FPR, false-positive rate of the PLR criterion; the PLR and PoPLR columns give hit rates.

At a 5% false-positive rate, the hit rates of PoPLR were similar to those of the significance of the rate of MD change at the fifth examination (12.3% vs. 13.1%) but were higher at the eighth (27.3% vs. 17.7%) and final (41.5% vs. 29.7%) examinations. 
Comparison between PoPLR and PLR
Figure 3 illustrates the significance from PoPLR versus the number of significant PLR locations (slope < 0 dB/year, P < 0.01). At the final examination, 25 (6%) eyes with significant change by PoPLR (P < 0.05) had no significantly changing PLR locations, while 15 (3%) eyes not changing by PoPLR (P ≥ 0.05) had two or more significant PLR locations. 
Case Examples
In Case 1 (Fig. 4) the change analysis for a series with 11 examinations in 7.1 years is shown. There was a statistically significant deterioration with PoPLR (P = 0.003). With PLR, there were three locations with significant change (P < 0.01), of which one had a slope <−1 dB/year. The rate of change of the MD was −0.42 dB/year, significantly different from zero (P = 0.008). 
Case 2 (Fig. 5) shows the results from a series with nine examinations in 5.9 years. There was a borderline significant change with PoPLR (P = 0.08). With PLR, two locations met the criteria as changing, with P < 0.01 and slope <−1 dB/year. The rate of change of the MD was −0.45 dB/year, not significantly different from zero (P = 0.23). The MD values showed large variability, and the permutation distribution exhibited a large tail. 
Case 3 (Fig. 6) illustrates the change analysis of an eye with nine examinations in 6.9 years. There was a reduction in central sensitivity leading to a significant PoPLR result (P = 0.01). This change did not meet a PLR criterion of one or more locations with significant change, and the rate of MD change (−0.15 dB/year) was not significantly different from zero (P = 0.16). 
Case 4 (Fig. 7) shows results from an eye with 13 examinations in 9.9 years. Many locations in the superior field had thresholds ≤0 dB throughout the series. PoPLR showed significant change (P = 0.04), and the rate of change of MD (−0.08 dB/year) was statistically significant (P = 0.03). With PLR, no single location deteriorated with P < 0.01 (two-sided). 
Discussion
The objective of this research was to establish a conceptually simple technique to derive the overall statistical significance of visual field deterioration in individual patients. By comparing an observed series to many reordered arrangements of itself, PoPLR derives a continuous P value that is individualized to a particular patient's data: the false-positive rate is independent of factors such as variability, level of visual field damage, and length of follow-up. These properties distinguish PoPLR from many other techniques of determining change in the visual field. 
Current techniques such as GCP and PLR do not provide a single P value for a clearly defined null hypothesis. Rather, the results need to be interpreted with reference to large groups of subjects (population-based criteria). With GCP, for example, criteria for “likely progression” and “possible progression” have been established in the Early Manifest Glaucoma Trial 23 and have been shown to provide high specificity, on average. However, when the properties of an individual patient's data differ from those of the reference group, population-based change criteria can give misleading results. For example, visual fields with advanced damage often contain many locations at which the dynamic range of the instrument has been exhausted (sensitivity < 0 dB) such that further deterioration is no longer measurable. A population-based progression criterion that demands change at a fixed number of visual field locations, for example, will thus be more conservative (less sensitive, more specific) in patients with more advanced damage. We have previously shown that the likelihood of experiencing a false-positive result with the GCP can vary by as much as 40 times between patients (Artes PH, et al. IOVS 2011;52:ARVO E-Abstract 4148). With PLR, the specificity of any given criterion varies, among other factors, with the number of examinations. In our data, for example, PLR with a criterion of “≥1 location with a slope < −1.0 dB/year, at a P value < 0.01” gave a false-positive rate of 10.4% after five examinations, decreasing to 5.9% after eight examinations. Since the specificity of population-based criteria varies with the properties of the data, it is difficult to interpret the findings when such criteria are applied to individual patients. 
In contrast to GCP and PLR, PoPLR tests the well-defined null hypothesis that none of the visual field locations shows negative change, with reference only to the observed statistic S obs and its permutation distribution in the individual patient's series of visual fields. Large variability within the series will cause the permutation distribution to be wider, and a given S obs will therefore be associated with a larger P value (lesser significance) than in a series with lower variability. In the highly variable visual field series of case 2 (Fig. 5), for example, an S obs of 32 was only borderline significant (P = 0.08), while the S obs of 14 in case 4 (Fig. 7) was associated with a P value of 0.04. The nearly 5-fold variation in the width of the permutation distributions illustrates the importance of judging significance based on the properties of the individual visual field series, rather than by population-based cutoff values. Because PoPLR provides a continuous P value rather than just a categorical classification (change/no change, as with population-based criteria), it will support more differentiated judgments in borderline cases in which the P value is close to a particular significance level (e.g., 0.05). 
When PoPLR was applied to randomly reordered series, the P values followed the expected uniform distribution. This means that the false-positive rate of the approach equals the nominal significance level; for example, the probability of falsely detecting visual field deterioration in a stable series, at a significance level of P < 0.05, would be 5%. Our results also demonstrate that PoPLR performs at least as well as conventional PLR criteria in distinguishing between observed and randomly reordered series. With five examinations, for example, a PLR criterion of “≥1 location with slope < −1 dB/year at P < 0.01” detected deterioration in 17% of the series, at a specificity of 90%. At the same specificity (i.e., with a P value of 0.10), PoPLR detected deterioration in 23% of the observed visual field series (Fig. 2). Both PoPLR and PLR had substantially higher hit rates, at matched specificities, compared to the P value associated with the MD rate of change, underscoring the greater utility of localized analyses of visual field deterioration over global indices. 
An assumption made in PoPLR is that the order of the examinations is irrelevant unless a real change has taken place, such that the permutation distribution is a valid approximation of the null distribution. However, no assumptions need to be made about the distribution of measurement errors. PoPLR uses simple linear regression to determine the statistical significance of negative trends at individual locations, but in principle the technique can be adapted to other methods of determining change. Since the overall significance is determined from the permutation distribution, such adaptations will affect only the sensitivity but not the specificity of the approach. It is possible that visual field measurements closer in time to each other are more related than those further apart (temporal autocorrelation) even if no glaucomatous deterioration occurs. In this study, any effects of this phenomenon were reduced by including only one examination from a set of repeat examinations and by terminating series when there were long intervals between examinations. A “blocking” strategy may be a more general solution, whereby sub-sequences of closely spaced measurements are blocked (not reordered) in order to take into account their closer relationship. However, autocorrelation properties of visual fields are, as yet, largely unknown, and therefore it is difficult to investigate their effect on approaches such as PoPLR. 
Ideally, methods for determining visual field progression would be evaluated using the classic metrics for diagnostic tests, sensitivity and specificity. However, given the lack of a reference standard for what constitutes true change in the visual field as opposed to random variability, sensitivity and specificity are difficult to determine in real clinical data. In this research, we used a different approach for comparing the performance of PoPLR with previously established PLR criteria. In place of sensitivity, we used the positive rate in the observed visual field data (“hit rate”). Similarly, the positive rate in randomly permuted visual field series was used as a surrogate measure of specificity. This approach is based on the rationale that random reordering will remove any systematic trend present in the observed series, and that a more powerful method will detect change in a larger proportion of visual field series, at the same specificity. In contrast to computer simulations, which by necessity rely on simplifying models of visual field progression and stability, our approach used real clinical data, which are more likely to reflect the complex spatiotemporal properties of visual fields. 
Permutation approaches offer powerful and adaptable solutions for assessing change in complex data. They have been widely used in neuroimaging 24,25 and have been applied to assess changes in the optic disc in glaucoma. 26,27 While they make fewer assumptions than model-based approximations, they require greater computational effort that has only recently become feasible. 
In summary, PoPLR provides a statistical significance for visual field deterioration that is tailored to the individual patient's data. This will make it useful for determining end points in clinical trials and for interpreting change in clinical practice. Because the specificity of PoPLR is, by design, independent of the properties of the underlying data (variability, length of follow-up, number of locations, dB scale), it may also have applications in comparing the evidence of visual field progression between different follow-up protocols 28 and different types of visual field tests. 29–31
Supplementary Materials
Acknowledgments
The authors thank Raymond LeBlanc, Marcelo Nicolela, Lesya Shuba, and Paul Rafuse for making available their clinical data, Ivan Marin-Franch for implementing PoPLR in the R visualFields package, Jörg Weber for licenses of Peridata, and Carl Zeiss Meditec for licenses of the extended HFA XML export function. 
References
1. Heijl A, Lindgren G, Lindgren A, et al. Extended empirical statistical package for evaluation of single and multiple fields in glaucoma: Statpac 2. In: Mills RP, Heijl A, eds. Perimetry Update 1990/91. Amsterdam: Kugler & Ghedini; 1991:303–315.
2. Fitzke FW, Hitchings RA, Poinoosawmy D, McNaught AI, Crabb DP. Analysis of visual field progression in glaucoma. Br J Ophthalmol. 1996;80:40–48.
3. Chauhan BC, Drance SM, Douglas GR. The use of visual field indices in detecting changes in the visual field in glaucoma. Invest Ophthalmol Vis Sci. 1990;31:512–520.
4. Vesti E, Johnson CA, Chauhan BC. Comparison of different methods for detecting glaucomatous visual field progression. Invest Ophthalmol Vis Sci. 2003;44:3873–3879.
5. Henson DB, Chaudry S, Artes PH, Faragher EB, Ansons A. Response variability in the visual field: comparison of optic neuritis, glaucoma, ocular hypertension, and normal eyes. Invest Ophthalmol Vis Sci. 2000;41:417–421.
6. Artes PH, Hutchison DM, Nicolela MT, LeBlanc RP, Chauhan BC. Threshold and variability properties of matrix frequency-doubling technology and standard automated perimetry in glaucoma. Invest Ophthalmol Vis Sci. 2005;46:2451–2457.
7. Fogagnolo P, Sangermani C, Oddone F, et al. Long-term perimetric fluctuation in patients with different stages of glaucoma. Br J Ophthalmol. 2011;95:189–193.
8. McNaught AI, Crabb DP, Fitzke FW, Hitchings RA. Visual field progression: comparison of Humphrey Statpac2 and pointwise linear regression analysis. Graefes Arch Clin Exp Ophthalmol. 1996;234:411–418.
9. Viswanathan AC, Fitzke FW, Hitchings RA. Early detection of visual field progression in glaucoma: a comparison of PROGRESSOR and STATPAC 2. Br J Ophthalmol. 1997;81:1037–1042.
10. Nevalainen J, Paetzold J, Papageorgiou E, et al. Specification of progression in glaucomatous visual field loss, applying locally condensed stimulus arrangements. Graefes Arch Clin Exp Ophthalmol. 2009;247:1659–1669.
11. De Moraes CG, Demirel S, Gardiner SK, et al. Effect of treatment on the velocity of visual field progression in the Ocular Hypertension Treatment Study Observation Group [published online ahead of print March 6, 2012]. Invest Ophthalmol Vis Sci. doi:10.1167/iovs.11-8186.
12. Nouri-Mahdavi K, Hoffman D, Ralli M, Caprioli J. Comparison of methods to predict visual field progression in glaucoma. Arch Ophthalmol. 2007;125:1176–1181.
13. Wilkins MR, Fitzke FW, Khaw PT. Pointwise linear progression criteria and the detection of visual field change in a glaucoma trial. Eye. 2005;20:98–106.
14. Aung T, Oen FTS, Wong H-T, et al. Randomised controlled trial comparing the effect of brimonidine and timolol on visual field loss after acute primary angle closure. Br J Ophthalmol. 2004;88:88–94.
15. Fisher RA. Statistical Methods for Research Workers. Edinburgh, London: Oliver and Boyd; 1925.
16. Westfall PH. Combining P values. In: Armitage P, Colton T, eds. Encyclopedia of Biostatistics. 2nd ed. Chichester, UK: John Wiley & Sons; 2005:987–991.
17. Zaykin DV, Zhivotovsky LA, Westfall PH, Weir BS. Truncated product method for combining P-values. Genet Epidemiol. 2002;22:170–185.
18. Fisher RA. The Design of Experiments. Edinburgh, London: Oliver and Boyd; 1935.
19. Dwass M. Modified randomization tests for nonparametric hypotheses. Ann Math Stat. 1957;28:181–187.
20. Good PI. Permutation, Parametric, and Bootstrap Tests of Hypotheses. 3rd ed. New York: Springer; 2005.
21. Pitman EJG. Significance tests which may be applied to samples from any populations. J R Stat Soc. 1937;4(suppl):119–130.
22. Zweig MH, Campbell G. Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine. Clin Chem. 1993;39:561–577.
23. Heijl A, Leske MC, Bengtsson B, Hussein M. Measuring visual field progression in the Early Manifest Glaucoma Trial. Acta Ophthalmol Scand. 2003;81:286–293.
24. Holmes AP, Blair RC, Watson JD, Ford I. Nonparametric analysis of statistic images from functional mapping experiments. J Cereb Blood Flow Metab. 1996;16:7–22.
25. Nichols TE, Holmes AP. Nonparametric permutation tests for functional neuroimaging: a primer with examples. Hum Brain Mapp. 2002;15:1–25.
26. Patterson AJ, Garway-Heath DF, Strouthidis NG, Crabb DP. A new statistical approach for quantifying change in series of retinal and optic nerve head topography images. Invest Ophthalmol Vis Sci. 2005;46:1659–1667.
27. O'Leary N, Crabb DP, Mansberger SL, et al. Glaucomatous progression in series of stereoscopic photographs and Heidelberg retina tomograph images. Arch Ophthalmol. 2010;128:560–568.
28. Nouri-Mahdavi K, Zarei R, Caprioli J. Influence of visual field testing frequency on detection of glaucoma progression with trend analyses. Arch Ophthalmol. 2011;129:1521–1527.
29. Wall M, Woodward KR, Doyle CK, Artes PH. Repeatability of automated perimetry: a comparison between standard automated perimetry with stimulus size III and V, matrix, and motion perimetry. Invest Ophthalmol Vis Sci. 2009;50:974–979.
30. Artes PH, Chauhan BC. Signal/noise analysis to compare tests for measuring visual field loss and its progression. Invest Ophthalmol Vis Sci. 2009;50:4700–4708.
31. Schiefer U, Papageorgiou E, Sample PA, et al. Spatial pattern of glaucomatous visual field loss obtained with regionally condensed stimulus arrangements. Invest Ophthalmol Vis Sci. 2010;51:5685–5689.
Footnotes
 Supported by the Glaucoma Research Foundation (PHA) and by a grant (MOP-11357) from the Canadian Institutes for Health Research (BCC).
 Presented at the annual meetings of the North American Perimetry Society, Skaneateles, New York, September 2011; International Perimetric Society, Melbourne, Australia, January 2012; and Association for Research in Vision and Ophthalmology, Fort Lauderdale, Florida, May 2012.
 Disclosure: N. O'Leary, None; B.C. Chauhan, None; P.H. Artes, None
Figure 1. 
 
False-positive rates of PLR with one, two, and three significantly deteriorating locations at the fifth, eighth, and final examinations. Significantly deteriorating locations were defined by slope <0 dB/year, P <0.01 (left panel) and slope <−1 dB/year, P <0.01 (right panel).
Figure 2. 
 
Hit rate versus false-positive rate of PoPLR, pointwise linear regression (PLR) with a given number (n) of significant locations, and the P value of simple linear regression of mean deviation (MD) over time, at the fifth, eighth, and final examinations.
Figure 3. 
 
Relationships between the significance of PoPLR and the number of significantly deteriorating PLR locations (slope < 0 dB/year, P < 0.01) at the fifth, eighth, and final examinations. The area of each circle is proportional to the number of series with PoPLR significance between the indicated limits (horizontal lines) and a given number of significant PLR locations. The number of series is given to the top right or inside of each circle. Where the significance of PoPLR was greater than 0.05, numbers are represented by open circles.
Figure 4. 
 
Case 1. (A) Grayscale maps of sensitivity of the first, middle, and last examinations, with patient age at each examination. (B) Mean deviation over time with fitted simple linear regression line, with slope and associated P value (two-sided). A red line indicates a significantly negative (P < 0.05) slope. The three examinations in A are indicated by black dots. (C) A map of slopes and associated P values (one-sided) of total deviation over time. Colors indicate the direction of the slope (red: slope < 0 dB/year, green: slope > 0 dB/year). Gray squares highlight locations with P < 0.05, which contribute to the Truncated Product Method test statistic (S) used in PoPLR. To show which locations would be classified as changing by PLR criteria, locations outlined with a dark square indicate a two-sided P value < 0.01. In addition, slopes below a critical value of −1 dB/year are further indicated by a heavily outlined square. Locations with sensitivities ≤0 dB across the entire series are represented by a gray point (slope and P value not available). (D) The permutation distribution S_P of the test statistic S, calculated for 5000 unique, random permutations. Values of S > 70 are binned together. The test statistic for the observed series, S_obs, is indicated by the position of the red line: the further to the right, the more significant the change in the observed series. The overall significance for deterioration, the P value associated with S_obs, is also shown. The 95th percentile (S_95) of the distribution, indicated by a black dashed arrow, is the cutoff beyond which S_obs would be considered significant (P < 0.05).
Figure 5. 
 
See legend to Figure 4.
Figure 6. 
 
See legend to Figure 4.
Figure 7. 
 
See legend to Figure 4.
Table 1. 
 
Details of Baseline and Follow-up (Median and Interquartile Range) for Included Visual Field Series
| Characteristic | Value |
| --- | --- |
| Number of patients | 520 |
| Number of eyes | 944 |
| Follow-up duration, y | 8.0 (6.7, 9.2) |
| Number of examinations | 10 (9, 12) |
| Baseline age, y | 67.2 (58.8, 75.0) |
| Baseline MD, dB | −2.9 (−6.3, −1.2) |
| Baseline PSD, dB | 2.6 (1.8, 5.5) |
Table 2. 
 
Hit Rates (%) of PoPLR and PLR Criteria at Matched False-Positive Rates (%)
| Examination | Number of Locations | False-Positive Rate (PLR) | Hit Rate (PLR) | Hit Rate (PoPLR) |
| --- | --- | --- | --- | --- |
| Fifth | ≥1 | 14.4 | 21.3 | 29.5* |
| Fifth | ≥2 | 3.1 | 6.8 | 8.3 |
| Fifth | ≥3 | 1.2 | 2.3 | 3.3 |
| Eighth | ≥1 | 12.2 | 37.0 | 41.3* |
| Eighth | ≥2 | 3.0 | 19.9 | 22.0* |
| Eighth | ≥3 | 1.0 | 11.8 | 13.3 |
| Final | ≥1 | 9.3 | 48.5 | 50.6 |
| Final | ≥2 | 1.7 | 33.3 | 34.3 |
| Final | ≥3 | 0.4 | 24.7 | 25.8 |