The results of our pointwise analyses show that the differences between the estimates of the Full Threshold strategy and those of the SITA strategies did not vary in a simple, linear fashion with sensitivity (Fig. 3). Although the differences were close to zero at both extremes of the dynamic range, they reached a maximum of approximately 1.5 dB (SITA Standard) and 2.5 dB (SITA Fast) at approximately 15 dB. At this sensitivity, the estimates of SITA Standard and SITA Fast were significantly higher than those of the Full Threshold strategy. As evidenced by the disparities between the means and medians, the shapes of the underlying threshold distributions of the SITA strategies were also different from that of the Full Threshold strategy (Fig. 4).
In the implementation of the Full Threshold strategy in the HFA, sensitivity is estimated as the stimulus attenuation of the last-seen presentation, whereas the SITA algorithms derive the estimate as that stimulus attenuation with the largest likelihood of corresponding to the 50%-point on the frequency-of-seeing curve.[6,11] The SITA strategies therefore would be expected to give estimates that are, on average, 1 dB higher than those of the Full Threshold strategy, independent of sensitivity.[5] When averaged across the entire dynamic range, the sensitivity difference between SITA Standard and the Full Threshold algorithm (0.9 dB) was similar to the expected value, whereas the difference between SITA Fast and Full Threshold was larger. These results (Fig. 2) agree closely with those reported by others,[10] but the differences are smaller than those reported by Sharma et al.,[12] whose analysis, based on a comparison between one Full Threshold and one SITA Standard examination, may be confounded by a statistical artifact akin to a regression-to-the-mean effect. Locations with very low sensitivity in the first session tend to produce higher (i.e., more sensitive) estimates during the second examination, owing to test–retest variability and to the truncated range of the instrument.
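The regression-to-the-mean artifact described above can be illustrated with a minimal simulation. This is a sketch under stated assumptions, not a model of any of the cited data sets: the uniform distribution of true sensitivities, the Gaussian test–retest noise of 3 dB, the 0 to 40 dB instrument range, and the 5 dB selection cutoff are all hypothetical values chosen for illustration. Both simulated examinations use the same measurement process, so any systematic retest difference at the selected locations arises from the selection and truncation alone.

```python
import random
import statistics


def simulate_retest(n_locations=20000, sd_noise=3.0, floor=0.0,
                    ceiling=40.0, cutoff=5.0, seed=1):
    """Mean retest change at locations that measured low on the first test.

    All parameter values are illustrative assumptions, not values
    taken from the study.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_locations):
        # Hypothetical true sensitivity, identical for both examinations.
        true_db = rng.uniform(floor, ceiling)
        # Two noisy measurements, truncated to the instrument range.
        first = min(max(rng.gauss(true_db, sd_noise), floor), ceiling)
        second = min(max(rng.gauss(true_db, sd_noise), floor), ceiling)
        # Select locations only on the basis of the first examination.
        if first <= cutoff:
            diffs.append(second - first)
    return statistics.mean(diffs), len(diffs)


mean_diff, n_selected = simulate_retest()
print(f"{n_selected} locations selected, mean retest change {mean_diff:+.2f} dB")
```

In this toy model the second measurement is, on average, higher than the first at the selected locations, even though no true difference between the two examinations exists.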
To reduce the effect of such statistical artifacts, we compared single estimates of each strategy against the mean of three Full Threshold examinations (referred to as the best available estimate). Although the Full Threshold strategy is not an ideal gold standard, its properties have been thoroughly investigated, both from clinical data[3] and by computer simulation,[1,2,13] and may therefore be more fully understood than those of the SITA strategies. The staircases of the Full Threshold strategy, for example, commence at values determined from the sensitivity at neighboring locations, or from a normative database if estimates from neighboring locations are not yet available. Because of response variability, the resultant threshold estimates are biased toward the start value if it is remote from the true sensitivity at the given location.[1,13] Although a better estimate of sensitivity may be obtained from frequency-of-seeing (FOS) curves, the number of stimulus presentations required to estimate FOS curves accurately is too large to be practical in a clinical context. Computer simulations, in which the true sensitivity of the observer is known, are the method of choice for investigations relating to the accuracy of psychophysical measurements. However, such simulations require precise details of SITA’s visual field model that are not in the public domain.
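The start-value bias of a staircase with a last-seen estimate can be explored in a simulation in which the true sensitivity is known. The following is a simplified sketch and not the HFA implementation: it assumes a single 4-2 dB staircase that stops after two reversals, a hypothetical logistic FOS curve with a spread of 2 dB and no false-positive or false-negative responses, and a 0 to 40 dB range. The function names and all parameter values are illustrative.

```python
import math
import random
import statistics


def p_seen(level_db, true_db, spread_db=2.0):
    """Hypothetical logistic FOS curve; higher dB means a dimmer stimulus."""
    return 1.0 / (1.0 + math.exp((level_db - true_db) / spread_db))


def staircase_last_seen(true_db, start_db, rng, lo=0.0, hi=40.0):
    """Simplified 4-2 dB staircase; returns the last-seen level in dB."""
    level, step = start_db, 4.0
    reversals = 0
    last_seen = None
    prev_seen = None
    while True:
        seen = rng.random() < p_seen(level, true_db)
        if seen:
            last_seen = level
        # A change in response direction counts as a reversal.
        if prev_seen is not None and seen != prev_seen:
            reversals += 1
            step = 2.0
        prev_seen = seen
        if reversals == 2:
            break
        # Dimmer after "seen", brighter after "not seen", within range.
        next_level = min(max(level + step if seen else level - step, lo), hi)
        if next_level == level:  # stuck at the edge of the range
            break
        level = next_level
    return last_seen if last_seen is not None else lo


def mean_estimate(true_db, start_db, n=20000, seed=7):
    rng = random.Random(seed)
    return statistics.mean(
        staircase_last_seen(true_db, start_db, rng) for _ in range(n))


near = mean_estimate(true_db=10.0, start_db=10.0)
remote = mean_estimate(true_db=10.0, start_db=30.0)
print(f"start at 10 dB: {near:.2f} dB, start at 30 dB: {remote:.2f} dB")
```

Under these assumptions, a staircase that starts far above the true sensitivity returns, on average, a higher estimate than one that starts at the true sensitivity, which is the direction of bias described in the text.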
Bengtsson and Heijl[5] have hypothesized that the larger than expected differences in the sensitivity estimates with the briefer SITA examinations (compared with the Full Threshold strategy) are due to reduced fatigue. Reduced fatigue effects do not, however, explain the higher than expected sensitivity estimates that they reported from computer simulations of the SITA Fast strategy. Furthermore, our findings persisted when the analysis was repeated on a subset of visual field locations including only the primary seed points and their closest neighbors. These locations are examined early during the course of a Full Threshold test and would therefore be expected to show lower fatigue effects than other test points. These findings question the hypothesis that the differences between the strategies are solely due to reduction in fatigue with the briefer SITA examinations. It has been reported from computer simulations that biased threshold estimates may result from using the mode of the a posteriori probability density function (such as in the SITA strategies), whereas its mean provides a better estimator.[14] Because the magnitude of the bias is likely to be related to the size of the error-related factor (i.e., the permitted uncertainty about the threshold estimate), it may explain the differences between the estimates of SITA Standard and SITA Fast, as well as the results reported from computer simulations of the SITA Fast strategy by Bengtsson and Heijl.[5] This threshold estimation bias may also contribute to the paradoxical finding of lower between-subject variability with SITA Fast compared to SITA Standard.[15,16]
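The distinction between the mode and the mean of an a posteriori probability function can be made concrete with a small numerical example. The sketch below is not SITA's visual field model, whose details are not public: it assumes a flat prior over a 0 to 40 dB grid, a hypothetical logistic FOS curve with a 1 dB spread, false-positive and false-negative rates of 3% each, and an arbitrary sequence of four responses. It shows only that the two summaries of the same posterior can differ when the posterior is asymmetric; the size and direction of the difference depend on these assumptions.

```python
import math


def p_seen(stim_db, thr_db, spread_db=1.0, fp=0.03, fn=0.03):
    """Hypothetical FOS curve with lapse rates; higher dB is dimmer."""
    core = 1.0 / (1.0 + math.exp((stim_db - thr_db) / spread_db))
    return fp + (1.0 - fp - fn) * core


def posterior(responses, grid):
    """Normalized posterior over candidate thresholds under a flat prior."""
    weights = []
    for thr in grid:
        likelihood = 1.0
        for stim_db, seen in responses:
            p = p_seen(stim_db, thr)
            likelihood *= p if seen else 1.0 - p
        weights.append(likelihood)
    total = sum(weights)
    return [w / total for w in weights]


grid = list(range(0, 41))  # candidate thresholds in 1 dB steps
# Illustrative responses: (stimulus attenuation in dB, seen?)
responses = [(20, True), (24, True), (28, False), (26, True)]

post = posterior(responses, grid)
mode_db = grid[post.index(max(post))]
mean_db = sum(t * p for t, p in zip(grid, post))
print(f"posterior mode: {mode_db} dB, posterior mean: {mean_db:.2f} dB")
```

Stopping a test earlier, as a larger permitted uncertainty would do, leaves a broader posterior, and the gap between such summaries can then be larger.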
The global test–retest variability of SITA Standard was approximately 15% lower, and its retest intervals were generally smaller, compared with those of the Full Threshold strategy. The small systematic differences between the SITA strategies and the Full Threshold algorithm are unlikely to affect the detection of deep and localized defects that are regarded as the hallmark of glaucomatous field loss. Because global visual field indices, such as MD and pattern standard deviation (PSD), are calculated with reference to normative values of each strategy, good agreement between the indices of these strategies would be expected, and several previous reports have confirmed this.[4,7,8,9,10] It is difficult, however, to estimate how the SITA strategies may represent early diffuse losses of visual field sensitivity that commonly accompany focal glaucomatous defects.[17,18] Computer simulations and longitudinal clinical trials need to establish whether the small systematic differences between the Full Threshold strategy and SITA Standard, and the slightly higher reproducibility of the latter, affect the detection of visual field progression. When SITA Standard is substituted for the Full Threshold algorithm in the longitudinal follow-up of patients, it may be advantageous to establish new baseline measures.[19] SITA Fast showed higher reproducibility only for high sensitivities. Because its test–retest variability was higher than that of the Full Threshold strategy at test locations with sensitivity below 20 dB, SITA Fast is unlikely to be a good choice to monitor established visual field loss, in spite of its short test duration.