Purpose. To evaluate the impact of converting from Humphrey 24-2 full-threshold (FT) visual field (VF) testing to SITA-Standard (SS) VF testing during the follow-up phase of a clinical trial.

Methods. VF data were obtained from 243 patients in the Collaborative Initial Glaucoma Treatment Study (CIGTS) who had follow-up visits in 2004. FT and SS VF tests were performed in random order on the same day.

Results. The average duration of the SS test (6.3 minutes) was shorter (*P* < 0.0001, paired *t*-test) than that of the FT test (11.8 minutes). The mean deviation did not differ between SS and FT testing. A small difference was found in the pattern SD (PSD) (*P* = 0.02). The mean CIGTS score from the FT test (4.5) was significantly lower (*P* < 0.0001) than the mean CIGTS score from the SS test (6.0). Although the two tests yielded identical Glaucoma Hemifield Test (GHT) results in 179 patients (76%), 16 patients had a normal GHT result on FT testing and an SS test result that was outside normal limits. Six patients had the reverse finding. The most significant factor associated with an increased (positive) difference between the CIGTS VF scores generated from SS and FT testing was conducting the FT test first (*P* < 0.0001).

Conclusions. Although SS and FT testing yielded very similar mean deviation results, the CIGTS VF score and GHT differed between SS and FT tests. Changing the approach used to measure a study’s primary VF outcome should be accompanied by a critical evaluation of the change’s impact.

^{ 1 }visual field (VF) change over time is the primary outcome. Its measurement at the study onset (in the fall of 1993) was made with the Humphrey 24-2 full-threshold (FT) test (Carl Zeiss Meditec, Dublin, CA),^{ 2 } and that testing approach was uniformly used at the study’s 14 clinical centers through December 2003. At that time, study investigators approved converting to VF testing by the Swedish Interactive Threshold Algorithm (SITA) Standard test.^{ 3 } This decision was based on evidence that the SITA strategy produces results similar to those of the FT test,^{ 4 } is sensitive and specific for detecting glaucomatous VF defects,^{ 5, 6, 7 } and reduces testing time substantially.^{ 4, 5, 6, 7, 8, 9, 10 } To evaluate the impact of this conversion on the study’s primary outcome, a protocol was instituted in the CIGTS to determine whether the SITA-Standard results yielded similar or different scores from those produced by the FT test.

*n* = 243 with comparable VF test results.

^{ 1 }

^{ 11 }which assigns weights to points on the VF test’s total deviation probability plot according to the extent of departure from normal values, as expressed by point-specific probabilities. These probabilities are empirically derived percentiles from the distributions of values at each of the 52 points from age-specific sets of normal subjects collected by the manufacturer.^{ 2 } The proprietary distributions are built into the VF test software and are not available for inspection. The probability at each of the 52 points is reported as no defect or *P* ≤ 0.05, ≤ 0.02, ≤ 0.01, or ≤ 0.005, meaning that the measured value at that point was at or below the respective percentile of the age-specific empiric distribution at that position in the field for normal subjects. A point is called defective if its probability is 0.05 or less and it has at least two neighboring points with probabilities of 0.05 or less in the same vertical hemifield (superior or inferior). A weight is assigned depending on the minimum depth of the defect at the given point and the two most defective neighboring points. A minimum defect of 0.05, 0.02, 0.01, or 0.005 is given a weight of 1, 2, 3, or 4, respectively. A point that does not have at least two neighboring points depressed to at least *P* ≤ 0.05 is given a weight of 0. For example, a point at *P* ≤ 0.01 with only two neighboring points of defect, both at *P* ≤ 0.05, would receive a weight of 1. The weights for all 52 points in the field are summed, resulting in a value between 0 and 208 (52 × 4). The sum is then scaled to a range of 0 to 20 (by dividing by 10.4), resulting in a score that is a nearly continuous measure of VF loss. Other Humphrey VF test parameters that are common to both testing procedures were also recorded: test duration, pupil diameter, mean deviation (MD), pattern SD (PSD), and Glaucoma Hemifield Test (GHT) result.^{ 12 }
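The weighting rule described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the study's scoring software: the names `LEVEL_WEIGHT`, `point_weight`, and `cigts_vf_score` are hypothetical, and the caller is assumed to supply, for each point, the probability levels of its neighboring points in the same hemifield (the adjacency of the 24-2 grid is not reproduced here).

```python
# Weight for each reported probability level on the total deviation
# probability plot; None means no defect was flagged at that point.
LEVEL_WEIGHT = {None: 0, 0.05: 1, 0.02: 2, 0.01: 3, 0.005: 4}


def point_weight(p_level, neighbor_levels):
    """Weight (0-4) for one point.

    p_level: probability level at the point (None, 0.05, 0.02, 0.01, 0.005).
    neighbor_levels: levels of the neighboring points in the same hemifield
    (assumed to be supplied by the caller).
    """
    own = LEVEL_WEIGHT[p_level]
    neighbors = sorted((LEVEL_WEIGHT[q] for q in neighbor_levels), reverse=True)
    # A point counts as defective only if it and at least two neighbors
    # are depressed to P <= 0.05 or beyond.
    if own == 0 or len(neighbors) < 2 or neighbors[1] == 0:
        return 0
    # Minimum depth among the point and its two most defective neighbors.
    return min(own, neighbors[0], neighbors[1])


def cigts_vf_score(weights):
    """Sum the 52 point weights (0-208) and rescale to 0-20."""
    if len(weights) != 52:
        raise ValueError("expected one weight per point (52 points)")
    return sum(weights) / 10.4


# Example from the text: a point at P <= 0.01 whose only two defective
# neighbors are both at P <= 0.05 receives a weight of 1.
example_weight = point_weight(0.01, [0.05, 0.05, None, None])
```

Taking the minimum over the point and its two most defective neighbors means an isolated deep defect contributes little, so the score responds to clusters of depressed points and not to single outliers.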

*t*-tests and scatterplots for continuous variables. For categorical variables, we used the McNemar test for dichotomous variables and the Bowker test for symmetry^{ 13 } for more than two categories. Factors predictive of differences in test results (SITA minus FT) were evaluated by linear regression. Data analyses were performed on computer (SAS, ver. 9.1; SAS, Cary, NC).^{ 14 }
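As an illustration of the Bowker test for symmetry, the following Python sketch applies it to the GHT cross-tabulation in Table 2. The study's analyses were run in SAS; the function names here are hypothetical, and the closed-form tail probability is used only because a 3 × 3 table gives 3 degrees of freedom.

```python
import math


def bowker_statistic(table):
    """Bowker chi-square statistic and degrees of freedom for a square table."""
    k = len(table)
    stat = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            total = table[i][j] + table[j][i]
            if total > 0:  # pairs with no discordant counts contribute nothing
                stat += (table[i][j] - table[j][i]) ** 2 / total
    return stat, k * (k - 1) // 2


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# GHT cross-tabulation from Table 2 (rows: SITA-Standard; columns: FT),
# category order WNL, borderline, ONL.
ght = [[43, 7, 6],
       [11, 5, 9],
       [16, 9, 131]]

stat, df = bowker_statistic(ght)
p_value = chi2_sf_df3(stat)  # closed form is valid only because df == 3 here
```

With these counts the statistic is about 5.43 on 3 degrees of freedom, giving a P value of approximately 0.14.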

*P* < 0.0001) than the FT test’s (11.8 minutes). The MD did not significantly differ between SITA and FT testing (*P* = 0.29), whereas a small difference (0.1 units) was seen between PSD results (*P* = 0.02). The CIGTS VF score derived from these tests differed substantially (*P* < 0.0001), with the mean CIGTS VF score from the FT test (4.5) 1.5 units lower (indicating a better VF) than the mean CIGTS VF score computed from the SITA-Standard test (6.0).

*P* = 0.14, Bowker test for symmetry). Even so, the extent of intertest agreement was at best moderate (weighted κ = 0.61), reflecting the 22 (9%) patients for whom a normal versus ONL result was found, and the additional 36 patients (15%) whose intertest GHT results were off by one category.
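The agreement figures above can be checked against the Table 2 counts with a short Python sketch. The weighting scheme is not stated in the text, so linear weights are an assumption here, and `weighted_kappa` is a hypothetical helper, not the routine used in the study.

```python
def weighted_kappa(table):
    """Cohen's weighted kappa with linear agreement weights (assumed scheme)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = 0.0
    expected = 0.0
    for i in range(k):
        for j in range(k):
            # 1 on the diagonal, 0 at maximum disagreement
            w = 1 - abs(i - j) / (k - 1)
            observed += w * table[i][j] / n
            expected += w * row_tot[i] * col_tot[j] / n ** 2
    return (observed - expected) / (1 - expected)


# GHT cross-tabulation from Table 2 (rows: SITA-Standard; columns: FT),
# category order WNL, borderline, ONL.
ght = [[43, 7, 6],
       [11, 5, 9],
       [16, 9, 131]]

kappa = weighted_kappa(ght)
off_by_two = ght[0][2] + ght[2][0]                            # normal vs ONL
off_by_one = ght[0][1] + ght[1][0] + ght[1][2] + ght[2][1]    # adjacent categories
```

Under the linear-weight assumption the result is about 0.61, and the off-diagonal counts come to 22 and 36 patients, consistent with the figures quoted in the text.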

^{ 15 } (results not shown) of the difference between tests (*y*-axis) to the mean of the two tests (*x*-axis) for the three outcomes showed no pattern in the variation of the intertest difference across the mean result for the MD or PSD measure, whereas floor effects were shown for the CIGTS VF score.

*P* < 0.0001) and the presence of less visual field loss (*P* = 0.05). When the SITA test was performed first, the resultant MD was 0.44 units better than that of the FT test. If the FT test was performed first, the resultant MD from the SITA test was 0.47 units worse than that of the FT test; that is, no matter which test was conducted first, the MD resulting from the first test was better than that from the second test. The difference between these MDs, −0.91, is equal to the regression coefficient in Table 3. Patients with increasingly worse average MDs had SITA MDs that showed more loss than the FT MDs. This effect was more pronounced when the FT test was performed first versus when the SITA test was first.

*P* = 0.03) and the mean pupil diameter (*P* = 0.05). Higher PSDs and smaller pupils yielded larger positive PSD differences between SITA and FT VF tests.

*P* < 0.0001). Regardless of which VF test was first, the score was higher (worse) from the SITA than the FT test. When the SITA VF test was first, the resultant CIGTS VF score was 0.87 units higher than the FT test result; when the FT test was first, the resultant CIGTS VF score from the SITA test was 2.22 units higher than the FT test result. The difference between these two results, 1.35 score units, is equal to the regression coefficient in Table 3. Other significant factors that were associated with a SITA test’s showing more loss than the FT test (in CIGTS VF scores) included more VF loss (*P* = 0.003) and a smaller mean pupil diameter (*P* = 0.02).
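The link between the order-specific mean differences and the "FT test first" coefficients in Table 3 can be written out as a short worked calculation. The symbols Δ and β are notation introduced here for illustration, with each difference taken as SITA minus FT and each coefficient read as the change in that difference when the FT test, not the SITA test, is given first.

```latex
\begin{align*}
\beta_{\mathrm{MD}} &= \Delta^{\mathrm{MD}}_{\text{FT first}} - \Delta^{\mathrm{MD}}_{\text{SITA first}}
  = (-0.47) - (+0.44) = -0.91,\\
\beta_{\mathrm{CIGTS}} &= \Delta^{\mathrm{CIGTS}}_{\text{FT first}} - \Delta^{\mathrm{CIGTS}}_{\text{SITA first}}
  = 2.22 - 0.87 = 1.35.
\end{align*}
```

The signs differ because a higher MD indicates a better field, whereas a higher CIGTS VF score indicates a worse one.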

*P* < 0.0001, McNemar test).

^{ 6 } on 24-2 VF testing of both strategies, and similar reductions in test time have been observed when comparing the two tests by using 30-2 VF testing.^{ 3, 5, 8, 9, 10, 16, 17 }

^{ 18 } conducted a study of 330 normal subjects that indicated a 1.6-dB higher (better) MD from SITA-Standard testing than from FT testing. Their evaluation of 44 patients with glaucoma and 21 normal subjects^{ 19 } found almost identical average MD results from SITA-Standard and FT testing of the patients, but noted that the number of significantly depressed points was higher in SITA testing than in FT testing. Heijl et al.^{ 4 } tested 31 patients with glaucoma and found that MDs with SITA were on average approximately 1 dB less severe than the 30-2 FT values. They concluded that the SITA test yields results similar to those of the FT test. Sharma et al.^{ 6 } reported that the MDs and PSDs from SITA and FT 24-2 testing correlated highly (*r* = 0.92 and 0.93, respectively). In 82 patients with glaucoma, Budenz et al.^{ 7 } found better MDs derived from SITA-Standard testing than from FT testing, by 0.7 dB. Our lack of difference in MDs from the two testing approaches, although not substantially disparate from that found by others, may have been caused by differences in our patients’ distribution of VF loss, their relatively extensive experience with VF testing, or other factors.

^{ 18 }of the establishment of normal threshold limits for the SITA strategies. They found smaller intersubject variability in SITA-Standard testing (31% less) than in FT testing among 330 normal subjects who were tested, which resulted in normal limits for SITA that were “tightened” between 9% and 29%. Thereby, the statistical significance of a depressed point on SITA testing would be achieved more readily than on FT testing. Of course, this speculation can be critically evaluated only with knowledge of the distributions of normal values used by the SITA-Standard and FT testing software, which are proprietary.

^{ 20 }As pupil diameter increased, two trends were observed in the VF results. First, regardless of the order of testing, the average CIGTS VF score was smaller (indicating less VF loss) with increasing pupil diameter. Second, differences between SITA and FT outcomes for CIGTS VF scores lessened with increasing pupil diameter, although SITA outcomes were consistently greater than FT outcomes across the range of pupil diameters. These results probably relate to a more reliable VF assessment when the pupil diameter is sufficiently wide to allow for optimal light exposure.

^{ 6 }who found variation to be more likely in normal control subjects and patients with suspected or mild glaucoma.

Variable | n | Full-Threshold Test | SITA Test | P ^{*} |
---|---|---|---|---|
Test duration (min) | 243 | 11.8 (1.8); 5.8, 18.1 | 6.3 (1.3); 4.3, 12.9 | <0.0001 |
Pupil diameter | 198^{, †} | 4.2 (1.0); 1.9, 7.6 | 4.2 (0.9); 1.9, 6.6 | 0.67 |
Mean deviation | 243 | −5.5 (5.4); −28.1, 6.1 | −5.4 (5.6); −27.5, 14.2 | 0.29 |
PSD^{c} | 243 | 5.4 (3.7); 0.9, 17.3 | 5.3 (3.9); 1.0, 17.3 | 0.02 |
CIGTS VF score^{c} | 243 | 4.5 (5.1); 0.0, 19.9 | 6.0 (5.7); 0.0, 20.0 | <0.0001 |

SITA-Standard | Full-Threshold: WNL | Full-Threshold: Borderline | Full-Threshold: ONL | Total |
---|---|---|---|---|
WNL | 43 | 7 | 6 | 56 |
Borderline | 11 | 5 | 9 | 25 |
ONL | 16 | 9 | 131 | 156 |
Total^{*} | 70 | 21 | 146 | 237 |

*n* = 2): general reduction of sensitivity (GRS, *n* = 1); abnormally high sensitivity (AHS, *n* = 1); FT: Borderline/SITA: Other (*n* = 1): GRS; FT: Other/SITA: ONL (*n* = 1): GRS; FT: Other/SITA: Other (*n* = 2): both GRS (*n* = 1); both AHS (*n* = 1).

**Figure 1.**

**Figure 2.**

**Figure 3.**

Predictive Factors | Estimate (SE) | P |
---|---|---|
Dependent Variable: [(SITA MD) − (FT MD)]^{*} | | |
FT test first | −0.91 (0.22) | <0.0001 |
Mean MD^{, †} | 0.04 (0.02) | 0.05 |
Mean pupil diameter^{, †} | 0.09 (0.12) | 0.46 |
Dependent Variable: [(SITA PSD) − (FT PSD)]^{, ‡} | | |
FT test first | 0.10 (0.16) | 0.52 |
Mean PSD^{, †} | 0.05 (0.02) | 0.03 |
Mean pupil diameter^{, †} | −0.17 (0.09) | 0.05 |
Dependent Variable: [(SITA CIGTS VF Score) − (FT CIGTS VF Score)]^{, §} | | |
FT test first | 1.35 (0.33) | <0.0001 |
Mean CIGTS VF score^{, †} | 0.09 (0.03) | 0.003 |
Mean pupil diameter^{, †} | −0.43 (0.18) | 0.02 |

*Ver. 9*. 2002;SAS Institute, Inc. Cary, NC.