**Purpose**:
We have previously shown that sensitivities obtained at severely damaged visual field locations (<15–19 dB) are unreliable and highly variable. This study evaluates a testing algorithm that does not present very high contrast stimuli (above approximately 1000% contrast) at damaged locations, but instead concentrates on more precise estimation at the remaining locations.

**Methods**:
A trained ophthalmic technician tested 36 eyes of 36 participants twice with each of two different testing algorithms: ZEST_{0}, which allowed sensitivities within the range 0 to 35 dB, and ZEST_{15}, which allowed sensitivities between 15 and 35 dB but was otherwise identical. The difference between the two runs for the same algorithm was used as a measure of test-retest variability. These were compared between algorithms using a random effects model with homoscedastic within-group errors whose variance was allowed to differ between algorithms.

**Results**:
The estimated test-retest variance for ZEST_{15} was 53.1% of the test-retest variance for ZEST_{0}, with 95% confidence interval (50.5%–55.7%). Among locations whose sensitivity was ≥17 dB on all tests, the test-retest variance for ZEST_{15} was 86.4% of that for ZEST_{0}, with 95% confidence interval (79.3%–94.0%).

**Conclusions**:
Restricting the range of possible sensitivity estimates reduced test-retest variability, not only at locations with severe damage but also at locations with higher sensitivity. Future visual field algorithms should avoid high-contrast stimuli in severely damaged locations. Given that low sensitivities cannot be measured reliably enough for most clinical uses, it appears to be more efficient to concentrate on more precise testing of less damaged locations.

^{1,2} Simulation studies suggest that reducing variability (defined as the spread of the frequency-of-seeing [FOS] curve) by 20% would enable progression to be detected, on average, one visit sooner,^{3} and more than that for many patients.

^{4} For example, the probability that participants responded to a 20,000% (2 dB) contrast in a deep visual field defect was typically only marginally higher than the probability that they would respond to a 2000% (12 dB) contrast at the same location, and this small increase may just reflect effects of light from the stimulus being scattered toward remaining areas of higher sensitivity. For locations with sensitivity worse than 15 to 19 dB (equivalent to contrasts of 1000%–400%), we found that the relation between sensitivity measures from FOS curves and those obtained from clinical perimetry had *R*^{2} < 0.1, indicating that the true sensitivity explained less than 10% of the observed variance. This implies that it is not possible with current clinical perimetry to reliably distinguish between sensitivities of 2 dB and 12 dB. We suggested that this phenomenon might partially explain the increase in test-retest variability that is observed in moderate and severe glaucoma.^{4}
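The paired decibel and percent-contrast values quoted above (2 dB with 20,000%, 12 dB with 2000%, 15 dB with 1000%, 19 dB with 400%) are all consistent with a logarithmic conversion in which 25 dB corresponds to 100% contrast. The sketch below is inferred from those pairs only; the function name and the 25 dB anchor are assumptions for illustration, not a formula stated in the text.

```python
def db_to_contrast_percent(sensitivity_db, db_at_100_percent=25.0):
    """Convert an HFA-scale sensitivity (dB) to approximate percent contrast.

    Assumption: 10 dB is one log unit and 25 dB corresponds to 100% contrast,
    which reproduces the dB/contrast pairs quoted in the text after rounding.
    """
    return 100.0 * 10 ** ((db_at_100_percent - sensitivity_db) / 10.0)


# Print the conversions for the four sensitivities mentioned in the text.
for db in (2, 12, 15, 19):
    print(f"{db:>2} dB -> about {db_to_contrast_percent(db):,.0f}% contrast")
```

Under this assumed conversion, each 10 dB step toward 0 dB multiplies the contrast by ten, which is why the 2 dB and 12 dB stimuli differ tenfold in contrast.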

^{5} We have also suggested that it may even slightly improve the ability to detect progression using global indices such as mean deviation (MD) (Pathak M, et al. *IOVS* 2016;57:ARVO E-Abstract 3920). It has previously been shown that ceasing testing of locations with sensitivity below 10 dB does not hinder progression detection using MD, but that censoring at 20 dB resulted in lost information.^{6} However, those studies used data that had been collected using existing testing algorithms, which give sensitivity estimates down to 0 dB. In this study, we are interested in the change in variability associated with a new testing algorithm that avoids testing with very high contrast stimuli in areas of poor sensitivity. Clinicians and researchers may use this information to design visual field testing algorithms to more accurately assess visual field sensitivity and detect visual field progression with shorter test time and reduced variability in patients with moderate to advanced glaucoma.

^{7}). Additionally, participants were required to have a non–end-stage localized glaucomatous defect, which we defined as having at least two adjacent locations in the same hemifield (not including the blind spot) whose sensitivities differed by ≥6 dB on both of their most recent two visits. Exclusion criteria were an inability to perform reliable visual field testing, best-corrected visual acuity worse than 20/30 due to nonglaucomatous causes, history of angle closure, or any nonglaucomatous ocular pathology likely to affect the visual field. If both eyes met the inclusion and exclusion criteria, one was chosen at random for testing. All protocols were approved and monitored by the Legacy Health Institutional Review Board, and adhered to the Health Insurance Portability and Accountability Act of 1996 and the tenets of the Declaration of Helsinki. All participants provided written informed consent once all of the risks and benefits of participation were explained to them.

^{8} Although the decibel scales, which are defined relative to the maximal intensity stimulus of the instrument, differ between perimeters, in this study we report all measures using the HFA decibel scale. Software to run the testing, together with all analyses, was written in the R statistical programming language.^{9}

^{10} which has been shown to have good precision and low bias,^{11} and has been implemented in some clinically available perimeters.^{12} The algorithms will be denoted as ZEST_{0} (allowing sensitivities as low as 0 dB), and ZEST_{15} (only allowing sensitivities down to 15 dB). Other than the range of sensitivities, all other elements of the algorithms were identical.

*P(S)* that the true sensitivity is *S* for all values of *S*. The pdf before a stimulus presentation is known as the prior distribution, and a stimulus is presented equal to the mean of this prior (rounded to the nearest 0.1 dB due to the available precision of the instrument). According to Bayes' theorem, the posterior probability that the true sensitivity is *S* is given by multiplying the prior distribution by the likelihood that you would obtain the observed result (seen or not seen) if the true sensitivity were *S*. This likelihood function is based on the FOS curve with sensitivity *S*. In this study, this was defined as a cumulative Gaussian, with SD taken from the formula of Henson et al.^{1} (capped at an SD of 7.8 dB for sensitivities below 15.0 dB), and assuming 5% false-positive and false-negative responses. The resultant posterior distribution is then used as the prior distribution for the next presentation.
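The update rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' R implementation: the coefficients in `fos_sd` are an assumed form of the Henson et al. formula (they give an SD of about 7.8 dB at 15 dB, matching the cap in the text, but should be checked against reference 1); the pdf is discretized on a 0.1 dB grid; the starting prior is uniform; and taking the mean of the final posterior as the sensitivity estimate is also an assumption. All function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fos_sd(sensitivity_db):
    # ASSUMED coefficients for the Henson et al. formula: ln(SD) = 3.27 - 0.081 * S.
    # The 7.8 dB cap is from the text; this formula reaches about 7.8 dB at 15 dB.
    return min(math.exp(3.27 - 0.081 * sensitivity_db), 7.8)


def p_seen(stimulus_db, sensitivity_db, fp=0.05, fn=0.05):
    # Lower dB means higher contrast, so the probability of seeing rises as the
    # stimulus value falls below the sensitivity.
    z = (sensitivity_db - stimulus_db) / fos_sd(sensitivity_db)
    return fp + (1.0 - fp - fn) * normal_cdf(z)


def make_grid(lo_db, hi_db, step_db=0.1):
    n = int(round((hi_db - lo_db) / step_db))
    return [round(lo_db + i * step_db, 1) for i in range(n + 1)]


def posterior_mean(grid, pmf):
    return sum(s * p for s, p in zip(grid, pmf))


def bayes_update(grid, prior, stimulus_db, seen):
    # Posterior is proportional to prior times likelihood of the observed response.
    post = []
    for s, p in zip(grid, prior):
        ps = p_seen(stimulus_db, s)
        post.append(p * (ps if seen else 1.0 - ps))
    total = sum(post)
    return [p / total for p in post]


def run_sequence(grid, responses):
    """Start from a uniform prior and apply one Bayesian update per response."""
    pmf = [1.0 / len(grid)] * len(grid)
    stimuli = []
    for seen in responses:
        stimulus = round(posterior_mean(grid, pmf), 1)  # mean of prior, 0.1 dB steps
        stimuli.append(stimulus)
        pmf = bayes_update(grid, pmf, stimulus, seen)
    # ASSUMPTION: the final estimate is the mean of the last posterior.
    return stimuli, posterior_mean(grid, pmf)


# Five presentations that are never seen, on each allowed range.
never_seen = [False] * 5
stimuli_0, estimate_0 = run_sequence(make_grid(0, 35), never_seen)
stimuli_15, estimate_15 = run_sequence(make_grid(15, 35), never_seen)
print("0-35 dB range stimuli:", stimuli_0, "final mean:", round(estimate_0, 1))
print("15-35 dB range stimuli:", stimuli_15, "final mean:", round(estimate_15, 1))
```

Because the estimate in this sketch is the mean of a distribution confined to 15 to 35 dB, it can never reach 15 dB itself, even when no stimulus is seen.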

*P(S) = 1/35* extending from *S* = 0 to 35 dB for ZEST_{0}; and *P(S) = 1/20* from *S* = 15 to 35 dB for ZEST_{15}. Five presentations are made at each of these locations. The possible series of stimuli for these seed points are illustrated in Figure 1. In subsequent phases, an “initial guess” of the sensitivity is made, equaling the mean of the sensitivities at those neighboring locations that have already been tested. The prior pdf at these locations was given by *P(S) = C × (0.1 + k × φ(S))*. Here, *φ(S)* describes a normal distribution with mean equal to the initial guess, and SD 5 dB. *k* is a constant such that when *S* equals the initial guess, *k × φ(S) = 1*. *C* is a constant defined such that the integral of the prior pdf over the defined range equals 1, which is a necessary condition for a well-defined pdf. In these subsequent phases, four presentations are made per location, with the set of tested locations expanding with each phase. Within each phase, the location at which the next stimulus would be presented was chosen randomly among those locations that had not yet reached their designated number of presentations.
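As an illustration of the prior used in the subsequent phases, the sketch below builds *P(S) = C × (0.1 + k × φ(S))* on a 0.1 dB grid. Because *k × φ(S)* equals 1 at the initial guess, that term reduces to a Gaussian-shaped bump with peak height 1. The grid spacing, the Riemann-sum normalization, the neighbor values, and all function names are assumptions made for this example.

```python
import math


def make_grid(lo_db, hi_db, step_db=0.1):
    n = int(round((hi_db - lo_db) / step_db))
    return [round(lo_db + i * step_db, 1) for i in range(n + 1)]


def neighbor_prior(grid, initial_guess_db, step_db=0.1, sd_db=5.0, floor=0.1):
    """Prior density C * (0.1 + k * phi(S)) evaluated on the grid."""
    # k * phi(S) is scaled to equal 1 at the initial guess, so it reduces to
    # exp(-(S - guess)^2 / (2 * SD^2)).
    unnormalized = [
        floor + math.exp(-((s - initial_guess_db) ** 2) / (2.0 * sd_db ** 2))
        for s in grid
    ]
    # C makes the density integrate to 1 (Riemann-sum approximation).
    c = 1.0 / (sum(unnormalized) * step_db)
    return [c * u for u in unnormalized]


# Hypothetical sensitivities (dB) at already-tested neighboring locations.
neighbor_sensitivities = [22.3, 25.1, 24.0]
initial_guess = sum(neighbor_sensitivities) / len(neighbor_sensitivities)

grid_15 = make_grid(15, 35)
prior_15 = neighbor_prior(grid_15, initial_guess)
peak_db = grid_15[prior_15.index(max(prior_15))]
print("initial guess:", round(initial_guess, 1), "dB; prior peaks at", peak_db, "dB")
```

The constant 0.1 keeps every sensitivity in the allowed range plausible, so a location that differs sharply from its neighbors can still be estimated; far from the initial guess the density is roughly one eleventh of its peak value.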

_{0} was assumed to follow a normal distribution with variance *Var*; and the pointwise intertest difference for ZEST_{15} was assumed to follow a normal distribution with variance *(Effect × Var)*. Using this random effects formulation allows a 95% confidence interval for *Effect* to be produced. If *Effect* is less than one, then this implies that ZEST_{15} is less variable than ZEST_{0}.
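To make the meaning of *Effect* concrete, the sketch below estimates a variance ratio from test-retest differences and attaches a percentile interval by resampling whole eyes. This is a deliberately simpler, swapped-in technique (a cluster bootstrap), not the random effects model used in the study, and it runs on synthetic numbers only; the SDs, the number of locations per eye, and all names are made up for illustration.

```python
import random


def sample_variance(values):
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def effect_estimate(diffs_by_eye_0, diffs_by_eye_15):
    """Ratio of pooled test-retest variances, restricted range over full range."""
    flat_0 = [d for eye in diffs_by_eye_0 for d in eye]
    flat_15 = [d for eye in diffs_by_eye_15 for d in eye]
    return sample_variance(flat_15) / sample_variance(flat_0)


def cluster_bootstrap_ci(diffs_by_eye_0, diffs_by_eye_15, n_boot=500, seed=1):
    """Percentile 95% interval, resampling whole eyes to respect clustering."""
    rng = random.Random(seed)
    n_eyes = len(diffs_by_eye_0)
    ratios = []
    for _ in range(n_boot):
        picks = [rng.randrange(n_eyes) for _ in range(n_eyes)]
        ratios.append(effect_estimate([diffs_by_eye_0[i] for i in picks],
                                      [diffs_by_eye_15[i] for i in picks]))
    ratios.sort()
    return ratios[int(0.025 * n_boot)], ratios[int(0.975 * n_boot) - 1]


# SYNTHETIC data only: 36 eyes, 50 hypothetical locations per eye, with
# made-up SDs of 3 dB and 2 dB (true variance ratio 4/9, about 0.44).
data_rng = random.Random(0)
synthetic_0 = [[data_rng.gauss(0, 3.0) for _ in range(50)] for _ in range(36)]
synthetic_15 = [[data_rng.gauss(0, 2.0) for _ in range(50)] for _ in range(36)]

effect = effect_estimate(synthetic_0, synthetic_15)
ci_low, ci_high = cluster_bootstrap_ci(synthetic_0, synthetic_15)
print(f"Effect = {effect:.3f}, 95% interval {ci_low:.3f} to {ci_high:.3f}")
```

An interval lying entirely below 1 corresponds to the conclusion in the text that the restricted-range algorithm is less variable.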

_{15} algorithm was 17 dB. Therefore, the analyses were repeated among the subset of locations whose observed sensitivity was ≥17 dB on both runs using the ZEST_{0} algorithm. This allows comparison of the test-retest variability among those locations whose sensitivity has not been “censored” at 17 dB.

_{0} and for ZEST_{15}, and the averages compared against the sensitivity at the same location from the most recent clinical visual field examination. These visits were the second of the two clinical examinations used to assess eligibility for the study as detailed above, and were performed using the SITA standard testing algorithm.^{7} The correlations between sensitivities from the algorithms were assessed.

_{0}), was assumed to follow a normal distribution with variance Var_{S}; and the difference (SITA – ZEST_{15}) was assumed to follow a normal distribution with variance (Effect_{S} × Var_{S}). If Effect_{S} < 1, then it can be concluded that sensitivities from ZEST_{15} are more closely correlated with SITA than those from ZEST_{0}.

_{0}, and 2.36 dB for ZEST_{15}. In the random effects model, the estimated test-retest variance for ZEST_{15} was 54.5% of the test-retest variance for ZEST_{0}, with 95% confidence interval 52.0% to 57.1%. The test-retest differences are plotted against the mean of the two runs in Figure 2, showing a decreased spread of data and variability in ZEST_{15} when compared with ZEST_{0}.

_{15} could not result in sensitivity estimates below 17 dB, the analysis was repeated on just those locations that also had sensitivity ≥17 dB on both runs for ZEST_{0}. Among these locations, the SD of test-retest differences was 2.97 dB for ZEST_{0}, and 2.59 dB for ZEST_{15}. In the random effects model, the estimated test-retest variance for ZEST_{15} was 86.5% of the test-retest variance for ZEST_{0}, with 95% confidence interval 81.6% to 91.7%. This subset of the test-retest differences is plotted against the mean of the two runs in Figure 3, for each algorithm. It is clear that not only is the spread of data visibly narrower (indicating lower variability) for ZEST_{15}, but this remains true even when the sensitivity is above 30 dB. Figure 4 presents a Bland-Altman plot comparing the mean of the two sensitivities from ZEST_{15} against the mean of those from ZEST_{0}, and shows no systematic bias between the algorithms.

_{15}) was 89.2% of the variance of (SITA – ZEST_{0}), with a 95% confidence interval 85.2% to 93.4%. This implies that the correlation with sensitivities from the SITA standard algorithm was significantly stronger for ZEST_{15} than for ZEST_{0}. When excluding locations whose sensitivity (from SITA) was <0 dB, the variance of (SITA – ZEST_{15}) was 85.8% of the variance of (SITA – ZEST_{0}), with confidence interval 81.8% to 90.0%. The Pearson correlations with these SITA sensitivities were 0.161 for ZEST_{0} (95% confidence interval 0.115–0.207) and 0.171 for ZEST_{15} (0.125–0.217).

_{0} (0.093–0.192) and 0.212 for ZEST_{15} (0.163–0.259). In this case, the variance of (SITA – ZEST_{15}) was 73.5% of the variance of (SITA – ZEST_{0}), with 95% confidence interval 69.7% to 77.4%.

^{4} We have previously shown that “censoring” sensitivities below 15 dB (approximately 1000% contrast) and setting them equal to 15 dB does not harm, and may possibly improve, the ability to detect progression.^{5} It has also been shown that censoring sensitivities below 10 dB did not harm the ability to detect progression using MD.^{6} The next step is to assess actually altering the testing algorithm so that it does not test beyond 15 to 19 dB. In this study, we show that this would reduce test-retest variability, not only by removing the variability at these very low values, but also by allowing smaller step sizes and hence more accurate threshold determination at less damaged locations. In the eyes tested here, test-retest variability was reduced by 13.5% among locations that were ≥17 dB, and by an even greater degree among more severely damaged locations, without harming the ability to quantify functional loss. This suggests that visual field testing algorithms could be designed to detect visual field progression sooner without increasing test duration.

^{1} the observer would have a 14% probability of failing to respond to a 5-dB stimulus. This would cause the sensitivity estimate when using the ZEST_{0} algorithm to be below 5 dB on some test dates but above 15 dB on other test dates, giving high test-retest variability. We have previously provided evidence suggesting that the response probability does not substantially increase below 15 to 19 dB,^{4} in which case the observer would have a nearly 50% probability of failing to respond to a 5-dB stimulus, causing even higher test-retest variability. However, with the ZEST_{15} algorithm, the estimated sensitivity at such locations would be no lower than and likely very close to 17 dB every time, giving lower test-retest variability.
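The 14% and roughly 50% figures above can be reproduced with a short calculation, under one set of assumptions that is consistent with the Methods but not spelled out in this passage: a true sensitivity of 15 dB, a cumulative Gaussian FOS curve with SD 7.8 dB, and 5% false-positive and false-negative rates. The function names, and the modeling of the plateau as a response probability that stops rising below 15 dB, are illustrative choices.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_miss(stimulus_db, sensitivity_db, sd_db, fp=0.05, fn=0.05):
    """Probability of NOT responding to a stimulus under a Gaussian FOS curve."""
    p_seen = fp + (1.0 - fp - fn) * normal_cdf((sensitivity_db - stimulus_db) / sd_db)
    return 1.0 - p_seen


# ASSUMED scenario: true sensitivity 15 dB, FOS SD 7.8 dB, 5-dB stimulus.
gaussian_case = p_miss(5.0, 15.0, 7.8)

# ASSUMED plateau: the response probability stops rising for stimuli brighter
# than 15 dB, so a 5-dB stimulus is seen no more often than a 15-dB stimulus.
plateau_case = p_miss(max(5.0, 15.0), 15.0, 7.8)

print(f"Gaussian FOS curve: {gaussian_case:.2f}")  # about 0.14
print(f"Plateau below 15 dB: {plateau_case:.2f}")  # 0.50
```

In the Gaussian case the stimulus sits about 1.28 SDs above threshold in contrast, so it is detected about 90% of the time before the false-response rates are applied; the 5% rates then bring the probability of a miss to about 14%.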

_{15} were more highly correlated with SITA than those measured using ZEST_{0}, albeit with both correlations being weak (below 0.2). This suggests that the staging and quantification of functional defects may actually be improved by not using very high contrast stimuli in testing algorithms. Sensitivities from SITA are also imperfect due to necessary constraints on test duration, and it is impossible to know for certain at present whether an algorithm accurately reflects true functional status, but it is reassuring to see that narrowing the range of stimuli does not harm performance in comparison with the current clinical standard. It is also necessary to demonstrate that reducing the stimulus range would not adversely affect the structure-function relation, and patient testing is under way to examine this issue.

_{0}, or below 17.0 dB for ZEST_{15}. Clinically, more complex algorithms such as SITA standard,^{7} SITA Fast,^{13} or GATE^{14} are often used, which aim to increase efficiency and accuracy, and may involve postprocessing to filter out some of the variability. The magnitude of the reduction in variability obtained by stopping testing at 15 dB will vary between these algorithms. It remains to be seen whether this would reach the 20% reduction in variability that has been reported to be needed to reduce the average time taken to detect progression by one visit.^{3} However, our study demonstrates the principle that reducing the technical range of perimetry would reduce variability not just in highly damaged regions of the visual field, but also in less damaged areas by allowing more precise thresholding within the same test duration.

^{15} but this appears to be driven solely by the increase in sensitivity caused by use of larger stimuli,^{16} implying that it postpones but does not prevent the point at which variability is too high to be able to distinguish true change from noise in a clinic situation. It may be more efficient to present larger stimuli if the test subject does not respond to a 15-dB Size III stimulus, as is done by the Heidelberg Edge Perimeter (Heidelberg Engineering, Heidelberg, Germany). Another possibility would be to use larger stimuli throughout the range to reduce variability^{15} and extend the range of severities over which reliable measurements could be obtained.^{16,17} Size threshold perimetry, whereby stimulus size is altered rather than contrast, may also extend the dynamic range without increasing variability,^{18} although this would require more extensive modifications to clinical perimeters so as to meet the need for a sufficient number of different stimulus sizes.

1. *Invest Ophthalmol Vis Sci*. 2000; 41: 417–421.
2. *Invest Ophthalmol Vis Sci*. 2002; 43: 2654–2659.
3. *Invest Ophthalmol Vis Sci*. 2011; 52: 3237–3245.
4. *Ophthalmology*. 2014; 121: 1359–1369.
5. *Invest Ophthalmol Vis Sci*. 2016; 57: 288–294.
6. *PLoS One*. 2012; 7: e41211.
7. *Acta Ophthalmol Scand*. 1997; 75: 368–375.
8. *J Vis*. 2012; 12 (11): 22.
9. *R: A language and environment for statistical computing*. Vienna, Austria: R Foundation for Statistical Computing; 2013.
10. *Percept Psychophys*. 1983; 33: 113–120.
11. *Vision Res*. 1994; 34: 885–912.
12. *Invest Ophthalmol Vis Sci*. 2002; 43: 709–715.
13. *Acta Ophthalmol Scand*. 1998; 76: 431–437.
14. *Invest Ophthalmol Vis Sci*. 2009; 50: 488–494.
15. *Invest Ophthalmol Vis Sci*. 1997; 38: 426–435.
16. *Trans Vis Sci Tech*. 2015; 4 (2): 10.
17. *Arch Ophthalmol*. 2010; 128: 570–576.
18. *Invest Ophthalmol Vis Sci*. 2013; 54: 3975–3983.