Abstract
Purpose.:
To determine the validity, test-retest reliability and repeatability of the UFOV test in healthy controls and glaucoma patients.
Methods.:
Three substudies with the UFOV test were conducted: (1) validity was evaluated in 77 older controls (mean age 64 [SD, 7] years) and 53 glaucoma patients (mean age 69 [SD, 8] years); (2) test-retest reliability was evaluated in 13 young controls (mean age 28 [SD, 4] years), 21 older controls (mean age 66 [SD, 9] years), and 22 glaucoma patients (mean age 68 [SD, 8] years) who performed the test twice within approximately two weeks; (3) repeatability was evaluated in 17 young controls (mean age 33 [SD, 8] years) who performed the test five times on the same day.
Results.:
In the validity substudy, mean total processing time was significantly shorter for older controls (358.3 ms [SD, 226.8 ms]) than for glaucoma patients (580.2 ms [SD, 324.5 ms]), with moderate correlations (|rho| ≥ 0.40) between total processing time and both age and visual field impairment. In the reliability substudy, mean total processing time was significantly less on retest (P ≤ 0.02), with glaucoma patients showing the largest mean test-retest difference (144.7 ms [SD, 168.9 ms]) compared with young (31.5 ms [SD, 43.7 ms]) and older controls (56.2 ms [SD, 74.8 ms]). The 95% limits of agreement were significantly wider for glaucoma patients (−186.3 and +475.7 ms) compared with young (−54.1 and +117.1 ms) and older controls (−90.5 and +202.9 ms), (P < 0.01). In the repeatability substudy, performance remained constant after the second of five tests (differences in mean total processing time <6 ms).
Conclusions.:
Measurement properties of the UFOV test are important for assessing functional performance, in particular, fitness to drive. Our results indicate moderate variability, greater for glaucoma patients than healthy controls, and a learning effect. Two consecutive tests are suggested to establish reliable baseline measures.
The Useful Field of View (UFOV) test is being increasingly recommended and used for screening, assessing, and training older drivers,
1,2 including those with slight to moderate vision impairment. This has important implications with regard to driver licensing, and consequently, maintenance of a person's independence, quality of life and wellbeing.
3–6
The term “useful field of view” was first used by Ball and colleagues,
7,8 and has since come to be most widely associated with a specific computer-based test, the UFOV test (Visual Awareness Research Group, Inc., Punta Gorda, FL).
9,10 The UFOV test was designed to capture the degree of difficulty experienced by older persons in everyday activities requiring the use of peripheral vision.
9 It measures the speed at which a person can cognitively process visual information within a visual field radius of 30°, in a single glance, under a variety of demanding attentional conditions.
11 Unlike clinical visual fields that require detection of threshold targets, the UFOV test requires both the identification and localization of suprathreshold targets through subtests that tap speed of visual information processing, ability to divide attention, and ability to ignore irrelevant information.
12 Thus, performance depends on both visual sensory function and higher-order cognitive abilities.
13
Performance on the UFOV test has been consistently associated with the driving performance of older persons, including driving cessation, simulator performance, on-road performance, and motor vehicle crashes (MVCs).
1,12,14–16 Such associations have also been found among patients with visual field loss from glaucoma.
17–19 Indeed, some studies have shown performance on the UFOV test is a better predictor of driving performance than standard perimetric visual field assessments among both older persons with normal vision,
14,20,21 and patients with glaucoma.
18 Hence, the UFOV test has been advocated as an assessment and driver screening test
10,12 —although some have questioned its suitability for persons with visual field loss.
22,23
Training on the UFOV test for several hours using preset criteria for success has been shown to improve not only UFOV test performance, but also the driving performance of older persons.
11,24,25 In a large multisite trial of 980 older drivers randomized to one of three cognitive interventions (memory, reasoning, or UFOV speed of processing training) or a control condition, participants who underwent up to 10 sessions of UFOV test training over a 5-week period had approximately a 50% lower rate of at-fault MVCs during the subsequent 6 years compared with participants in the control group, after adjustment for confounding factors,
25 hence its recommendation as a driver training intervention.
2,11,25
In spite of its expanding use for more than 30 years, there have been just a few investigations of the psychometric properties of the UFOV test. These have been limited to studies of community-dwelling and functionally independent older persons, in which evaluation of reliability was limited to test-retest correlation coefficients for each of the three UFOV subtests.
9,10 There are no reports of the reliability of the UFOV test using the more informative and preferred method of Bland and Altman,
26 and there are no reports of its psychometric properties among young persons for comparison or persons with visual field loss, such as from glaucoma.
The broad purpose of this study was to determine the validity, test-retest reliability, and repeatability of the UFOV test in young and older persons with normal vision, as well as patients with glaucoma. Specifically, in the absence of a gold standard for determining criterion validity, we sought to provide evidence of convergent validity by verifying an association between UFOV and age and visual field impairment, and evidence of discriminant validity by affirming the ability of the UFOV test to differentiate between older controls and glaucoma patients. In addition, our aim was to evaluate the test-retest reliability (variability of the measure over two occasions) using the method of Bland and Altman in these two groups and compare them with young controls. Our final specific aim was to assess the repeatability of the UFOV test (variability of a series of measures over a short duration or the effect of practice) in young controls, and in doing so determine the minimum number of tests required to obtain a stable and precise measure.
Validity Substudy.
A sample of 77 older controls with normal vision and 53 glaucoma patients participating in a prospective study of falls and MVCs was used to investigate convergent validity between age and UFOV, as well as the ability of the UFOV test to discriminate between persons who are visually healthy and those who are visually impaired. To be eligible, all participants were required to be aged over 50 years. Glaucoma patients were recruited from the Glaucoma Clinic of the Eye Care Centre, Queen Elizabeth II Health Sciences Centre, Halifax, Nova Scotia. Older controls were recruited by word of mouth and public notice within the Health Sciences Centre (including staff, students, family, friends, and visitors).
Older controls were required to have a normal ocular examination and visual acuity (VA) better than 0.30 LogMAR (20/40) in each eye. For the glaucoma group, the inclusion criteria were glaucomatous optic nerve damage (e.g., notching or progressive thinning of the neuroretinal rim) and corresponding visual field impairment defined as a Glaucoma Hemifield Test outside normal limits detected with standard automated perimetry (Humphrey Field Analyzer [HFA]; Carl Zeiss Meditec Inc., Dublin, CA), as diagnosed by an ophthalmologist. Exclusion criteria for both groups were cataract (worse than grade II, Lens Opacities Classification System II)
27 or other concomitant ocular disease, systemic disease or medication known to affect the visual field, cognitive impairment (>2 errors, Short Portable Mental Status Questionnaire),
28 residing in a nursing home, and use of a mobility device.
Test-Retest Reliability Substudy.
Repeatability Substudy.
Validity Substudy.
Test-Retest Reliability Substudy.
Repeatability Substudy.
In addition to the testing procedures described for the validity substudy, participants in the repeatability substudy performed the UFOV a total of five times on the same day, with 30-minute rest periods between testing.
The study adhered to the tenets of the Declaration of Helsinki and the design, recruitment, consent, and procedures were approved by the Capital Health Research Ethics Board.
Validity Substudy.
Data were analyzed using statistical software (SPSS 15.0 for Windows; SPSS Inc., Chicago, IL). All analyses were 2-tailed and P-values less than 0.05 were considered statistically significant. Differences between groups were analyzed using the t-test or ANOVA for ratio data with normal distribution, or an equivalent nonparametric test where appropriate. Associations between age and UFOV scores, and between visual fields and UFOV scores were determined using the Spearman rank correlation coefficient (rho). To examine the dependence of UFOV scores on age, ordinary least squares regression analysis was performed. Additionally, associations between HFA mean deviation (MD) and UFOV scores were further evaluated using stepwise multiple linear regression analysis to control for the confounding effect of age.
Test-Retest Substudy.
Repeatability Substudy.
Characteristics and UFOV scores for each group participating in the investigation of validity are given in
Table 1. Participants with glaucoma were older than those with normal vision (mean difference = 5 years;
t = −3.33,
P = 0.001) and as expected, performed worse on all measures of visual function (
P ≤ 0.03). Mean time since glaucoma diagnosis was 13 years (SD, 8 years). The ratio of females to males in the older control group was significantly greater than in the glaucoma group (
P = 0.04).
Table 1. Characteristics and UFOV Scores for Validity Substudy Sample
Variable | Older Controls, n = 77 | Glaucoma Patients, n = 53 | P |
Age, y |
Mean (SD) | 64 (7) | 69 (8) | 0.001 |
Sex |
Male:female | 25:52 | 27:26 | 0.04 |
VA better eye, LogMAR |
Mean (SD) | 0.04 (0.09) | 0.09 (0.12) | 0.03 |
HFA MD better eye, dB |
Mean (SD) | 0.05 (1.58) | −3.85 (5.12) | <0.001 |
HFA MD worse eye, dB |
Mean (SD) | −0.84 (1.62) | −10.26 (7.38) | <0.001 |
HFA binocular Esterman, % seen |
Median (interquartile range) | 100 (98–100) | 97 (90–99) | <0.001 |
UFOV subtest 1: central processing, ms* |
Median (interquartile range) | 16.7 (16.7–16.7) | 16.7 (16.7–33.4) | 0.01 |
UFOV subtest 2: divided attention, ms* |
Mean (SD) | 107.2 (116.5) | 214.4 (184.7) | <0.001 |
UFOV subtest 3: selective attention, ms* |
Mean (SD) | 227.0 (107.5) | 326.3 (133.8) | <0.001 |
UFOV total: sum of subtests, ms* |
Mean (SD) | 358.3 (226.8) | 580.2 (324.5) | <0.001 |
UFOV subtest 1 (central processing) results were highly positively skewed toward the best possible score (16.7 ms) for both controls and glaucoma patients. UFOV subtest 2 (divided attention); subtest 3 (selective attention); and total score were each appropriate for use with both groups, with no upper or lower end-of-scale limitations. Controls performed significantly better on all subtests compared to glaucoma patients (P ≤ 0.01). Mean total processing time was 358.3 ms (SD, 226.8 ms) and 580.2 ms (SD, 324.5 ms), for controls and glaucoma patients, respectively (mean difference = 222.0 ms; t = −4.31, P < 0.001).
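The group comparison of total processing time can be checked from the summary statistics alone. The following is a minimal sketch, assuming an unequal-variance (Welch) t-test; the text says only that a t-test was used, so the choice of Welch is an inference made here because it reproduces the reported statistic. The function name `welch_t` is illustrative and not from the original analysis.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom
    computed from group means, SDs, and sample sizes."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1 mean
    v2 = sd2 ** 2 / n2  # squared standard error of group 2 mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# UFOV total score (ms) from Table 1: older controls vs glaucoma patients.
t_stat, df = welch_t(358.3, 226.8, 77, 580.2, 324.5, 53)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

Under this assumption the statistic rounds to the reported t = −4.31; a pooled-variance test on the same summary values would give a larger magnitude.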
For controls, there were significant associations between age and UFOV scores, but not between sex and UFOV scores. The correlation coefficient was strongest for selective attention (rho = 0.52, P < 0.01), where processing time on this subtest increased with age (slope of fitted regression line = 7.61 ms per year; R² = 0.26, P < 0.001 [Note: Given the low R², this slope should not be used to predict processing time]). The correlation was moderate for divided attention (rho = 0.34, P < 0.01) and total processing time (rho = 0.48, P < 0.01).
For glaucoma patients, there were significant associations between age, visual field measures and UFOV scores (
Table 2). Correlation coefficients were stronger for age (UFOV total rho = 0.64) than for visual field measures (UFOV total |rho| ≤ 0.40). The strongest correlation with visual fields was obtained between HFA MD better eye and UFOV subtest 3, selective attention (rho = −0.42,
P < 0.01). This association remained significant after controlling for age. Age and HFA MD better eye in combination explained 46% of the variability in selective attention (standardized regression coefficients = 0.48 and −0.41 for age and HFA MD better eye, respectively;
P < 0.001).
Table 2. Spearman Correlation Coefficients for Age, Visual Fields, and UFOV among Glaucoma Patients, n = 53
UFOV Score | Age | Binocular Esterman | HFA MD Better Eye | HFA MD Worse Eye |
rho | P | rho | P | rho | P | rho | P |
UFOV subtest 1: central processing, ms | 0.31 | 0.03 | −0.16 | 0.26 | −0.17 | 0.22 | −0.18 | 0.19 |
UFOV subtest 2: divided attention, ms | 0.60 | <0.001 | −0.26 | 0.06 | −0.37 | 0.01 | −0.24 | 0.09 |
UFOV subtest 3: selective attention, ms | 0.58 | <0.001 | −0.28 | 0.04 | −0.42 | <0.01 | −0.31 | 0.03 |
UFOV total: sum of subtests, ms | 0.64 | <0.001 | −0.28 | 0.04 | −0.40 | <0.01 | −0.30 | 0.03 |
Characteristics of participants, mean UFOV scores, and indicators of test-retest reliability are given in
Table 3. There were significant differences in age between groups (ANOVA:
F = 142.12,
P < 0.001). As expected, Tamhane pairwise comparisons indicated the mean age of the young controls was significantly less than the older controls and the glaucoma patients (
P < 0.001); whereas the mean age of the older controls was not significantly different to the glaucoma patients (
P = 0.85). Mean time since glaucoma diagnosis was 13 years (SD, 6 years). UFOV processing times were shortest for the young controls and longest for the glaucoma patients (
Table 3). Overall group differences in UFOV subtest 2, subtest 3 and total score were significant (ANOVA:
F ≥ 6.56,
P < 0.01). Post-hoc Tamhane tests revealed that all pairwise group differences in selective attention and total processing time were significant (
P < 0.05).
Table 3. Characteristics, UFOV Scores, and Test-Retest Differences for Test-Retest Reliability Substudy Sample
Variable | Young Controls, n = 13 | Older Controls, n = 21 | Glaucoma Patients, n = 22 |
Age, y |
Mean (SD) | 28 (4) | 66 (9) | 68 (8) |
Sex |
Male:female | 6:7 | 8:13 | 9:13 |
VA better eye, LogMAR |
Mean (SD) | −0.08 (0.07) | 0.07 (0.11) | 0.06 (0.10) |
HFA MD better eye, dB |
Mean (SD) | −0.47 (0.72) | 0.03 (1.37) | −2.30 (3.41) |
HFA MD worse eye, dB |
Mean (SD) | −1.07 (1.18) | −0.77 (1.62) | −9.27 (6.50) |
HFA binocular Esterman, % seen |
Median (interquartile range) | 100 (all 100) | 100 (90–100) | 97 (82–100) |
UFOV subtest 1: central processing, ms |
Median (interquartile range)* | 16.7 (all 16.7) | 16.7 (16.7–65.0) | 17.5 (16.7–111.7) |
Mean test-retest diff. (SD)† | na | 6.0 (15.9) | −1.2 (29.2) |
Test-retest 95% LOA | na | −25.2, +37.2 | −58.4, +56.0 |
UFOV subtest 2: divided attention, ms |
Mean (SD)* | 21.2 (11.1) | 95.4 (109.5) | 168.4 (147.8) |
Mean test-retest diff. (SD)† | 9.0 (22.3) | 23.4 (66.5) | 104.8 (127.1)‡ |
Test-retest 95% LOA | −34.7, +52.7 | −107.0, +153.8 | −144.4, +354.0 |
UFOV subtest 3: selective attention, ms |
Mean (SD)* | 86.9 (27.8) | 217.7 (115.4) | 323.5 (114.7) |
Mean test-retest diff. (SD)† | 22.5 (30.0)‡ | 26.8 (74.8) | 41.0 (90.0)‡ |
Test-retest 95% LOA | −36.2, +81.2 | −119.7, +173.3 | −135.4, +217.4 |
UFOV total: sum of subtests, ms |
Mean (SD)* | 124.8 (31.0) | 334.3 (219.2) | 517.5 (253.7) |
Mean test-retest diff. (SD)† | 31.5 (43.7)‡ | 56.2 (74.8)‡ | 144.7 (168.9)‡ |
Test-retest 95% LOA | −54.1, +117.1 | −90.5, +202.9 | −186.3, +475.7 |
Within groups, mean test-retest differences indicated shorter processing times on retest compared with test (i.e. improved performance) for all subtests except central processing (
Table 3). Among young controls, the mean improvement in test-retest performance was significantly different to zero for subtest 3 selective attention (mean test-retest difference = 22.5 ms;
t = 2.71,
P = 0.02) and total score (mean test-retest difference = 31.5 ms;
t = 2.60,
P = 0.02). Among older controls, improvement was significant for total score (mean test-retest difference = 56.2 ms;
t = 3.44,
P < 0.01) and among glaucoma patients, improvement was significant for subtest 2 divided attention (mean test-retest difference = 104.8 ms;
t = 3.87,
P < 0.001), subtest 3 selective attention (mean test-retest difference = 41.0 ms;
t = 2.13,
P = 0.04) and total score (mean test-retest difference = 144.7 ms;
t = 4.02,
P < 0.001).
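The within-group tests above compare each mean test-retest difference against zero. As a rough check, and assuming these were one-sample (paired) t-tests on the differences, the statistic can be recomputed from the mean, SD, and n reported in Table 3. The helper name `paired_t` is illustrative only.

```python
import math

def paired_t(mean_diff, sd_diff, n):
    """One-sample t statistic for a mean paired difference against zero."""
    return mean_diff / (sd_diff / math.sqrt(n))

# Total score: mean test-retest difference (ms), SD (ms), n, from Table 3.
total_score = {
    "young controls": (31.5, 43.7, 13),
    "older controls": (56.2, 74.8, 21),
    "glaucoma patients": (144.7, 168.9, 22),
}

t_values = {g: paired_t(m, sd, n) for g, (m, sd, n) in total_score.items()}
for group, t in t_values.items():
    print(f"{group}: t = {t:.2f}")
```

With these inputs the three statistics round to 2.60, 3.44, and 4.02, matching the values reported for the total score.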
Between groups, mean test-retest differences were smallest for young controls and largest for glaucoma patients (
Table 3). Mean total score test-retest differences were 31.5 ms (SD 43.7 ms); 56.2 ms (SD 74.8 ms); and 144.7 ms (SD 168.9 ms) for the young controls, older controls and glaucoma patients, respectively (ANOVA:
F = 4.80,
P < 0.01). Post-hoc Tamhane tests revealed the improvement in total score was significantly larger only for glaucoma patients compared with young controls (
P = 0.02).
The 95% limits of agreement (LOA) widened progressively from young controls to older controls to glaucoma patients, indicating increasing test-retest variability (
Table 3 and
Fig. 1). Specifically, the Levene test for homogeneity of variance revealed subtest 2 divided attention 95% LOA were significantly wider for glaucoma patients compared with young controls (
F = 23.42,
P < 0.001); glaucoma patients compared with the older controls (
F = 10.65,
P < 0.01); and the older controls compared with young controls (
F = 8.47,
P = 0.01). Subtest 3 selective attention 95% LOA were significantly wider only for glaucoma patients compared with young controls (
F = 9.71,
P < 0.01). The total score 95% LOA were −54.1 and +117.1 ms for the young controls, −90.5 and +202.9 ms for the older controls, and −186.3 and +475.7 ms for the glaucoma patients, the 95% LOA being significantly wider for glaucoma patients compared with young controls (
F = 10.00,
P < 0.01), and glaucoma patients compared with older controls (
F = 12.33,
P < 0.01). For all groups, test-retest differences did not vary in a systematic manner over the range of processing speeds measured (
Fig. 1).
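For readers unfamiliar with the Bland and Altman method, the 95% limits of agreement are the mean test-retest difference plus and minus 1.96 times the SD of the differences. The sketch below recomputes them from the rounded summary values in Table 3; the function name `limits_of_agreement` is illustrative, and because the inputs are rounded the results agree with the tabulated limits only to within about 0.1 ms.

```python
def limits_of_agreement(mean_diff, sd_diff, z=1.96):
    """Lower and upper 95% limits of agreement: mean difference -/+ z * SD."""
    half_width = z * sd_diff
    return mean_diff - half_width, mean_diff + half_width

# Total score: mean test-retest difference and SD (ms), from Table 3.
total_score = {
    "young controls": (31.5, 43.7),
    "older controls": (56.2, 74.8),
    "glaucoma patients": (144.7, 168.9),
}

loa = {g: limits_of_agreement(m, sd) for g, (m, sd) in total_score.items()}
for group, (lower, upper) in loa.items():
    print(f"{group}: {lower:+.1f} to {upper:+.1f} ms (width {upper - lower:.1f} ms)")
```

The width of the interval depends only on the SD of the differences, which is why the comparison of LOA between groups reduces to a test of equal variances.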
Test-retest correlation coefficients for older controls were: 0.42 (P = 0.06), 0.54 (P = 0.01), 0.82 (P < 0.001), 0.90 (P < 0.001), for subtest 1, 2, 3, and total, respectively.
Psychometrically robust clinical tests that are predictive of and sensitive to changes in driving performance are required for screening older and vision impaired drivers and evaluating the outcomes of driver training interventions. The UFOV test has been increasingly used for these purposes in spite of little demonstration or knowledge of its psychometric properties. For the first time, we present data on the validity and reliability of the UFOV for young and older persons with normal vision compared with glaucoma patients, using current methods of analysis.
Convergent validity was demonstrated by a moderately strong correlation between age and total processing speed score among older participants with normal vision and glaucoma patients. This is consistent with several early studies,
14,20 and a large study of older persons (670 males and 2089 females; age 65–94 years) living independently in good functional and cognitive status, where a comparable correlation of 0.43 was obtained.
10 Convergent validity was also demonstrated by a moderate correlation between visual field impairment in the better eye measured using standard perimetry and total processing time among glaucoma patients.
As in previous studies,
9,10 performance on subtest 1, the simplest of the three, was constrained toward the best end of the range for both older controls and glaucoma patients. However, there were no upper or lower end-of-scale limitations observed for subtests 2 and 3, and total score, making them suitable for use with persons with either normal vision or glaucoma. Expected group differences in mean processing time provide evidence that the UFOV test has discriminant validity. Mean total processing time was significantly shorter for controls compared to glaucoma patients. The SDs indicate substantial individual variability, which, unsurprisingly, was greater for glaucoma patients than for older controls. However, it is possible that sex differences between groups contributed somewhat to the differences in UFOV test performance in this study. Compared with available normative data,
10 the variability we observed for controls is similar. However, mean total processing time was slightly shorter in our sample (358.3 ms [SD, 226.8 ms] vs. 481.9 ms [SD, 247.5 ms]), possibly because participants were on average younger (64 years [SD, 7 years] vs. 72 years [SD, 7 years]).
Mean total processing time was significantly less on retest compared with the first test for young controls, older controls, and glaucoma patients, with a similar pattern evident in a previous study of 66 independent, community-dwelling older persons.
9 Total score test-retest 95% LOA were significantly wider for glaucoma patients compared with young and older controls. Even for young controls who repeated the test five times, performance improved on the second test and thereafter remained constant. This suggests at least two tests are required to obtain a reliable score.
Among older persons, we did not find a strong test-retest correlation coefficient for subtest 1, most likely because of the ceiling effect noted above. However, the test-retest correlation coefficient for total performance score was high (0.90). Indeed, it was higher than the value of 0.81 obtained in the only other reliability study conducted to date, wherein 50 older persons repeated the UFOV test within a 3-week period.
9 It is noteworthy that in spite of the high correlation, Bland and Altman analysis indicated a significant learning effect. This highlights the problem of meaningful interpretation and difficulty comparing studies using the correlation coefficient with regard to reliability.
26
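The distinction between correlation and agreement can be illustrated with a toy example. The values below are hypothetical and are not data from this study: every participant is exactly 50 ms faster on retest, so the correlation is perfect even though the scores systematically disagree. Pearson's coefficient is used here purely for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical total processing times (ms); NOT data from this study.
test = [200.0, 300.0, 400.0, 500.0, 600.0]
retest = [t - 50.0 for t in test]  # uniform 50 ms improvement on retest

r = pearson_r(test, retest)
mean_diff = sum(a - b for a, b in zip(test, retest)) / len(test)
print(f"r = {r:.2f}, mean test-retest difference = {mean_diff:.1f} ms")
```

A correlation coefficient is unchanged by a constant shift between occasions, whereas the mean difference in a Bland and Altman analysis exposes it directly.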
These findings suggest that with practice, the UFOV test may have adequate validity and reliability for use with persons with normal vision and persons with visual field impairment. However, a limitation of this study was the moderate sample size. Our findings should be replicated in a larger sample, in particular, comprising persons with various types of vision impairment relevant to driving. Also, while we screened for cognitive impairment, we did not collect data on education, a factor that has been associated with UFOV processing time.
10 It is possible there were differences in education between groups that may have affected the results. In addition, repeatability findings were limited to younger persons. It may be useful to confirm the repeatability of scores after the second test and then to investigate the reliability of the second and third test scores in older persons and those with visual field impairment. Furthermore, the usefulness of the UFOV test for predicting driving performance and MVCs among persons with vision impairment, over and above standard visual field tests, should be demonstrated in prospective studies.
22,23 If it is to be used to determine fitness to drive, we suggest further investigation to establish a valid pass/fail criterion. Finally, if the UFOV test is to be used as an outcome measure, future studies should evaluate its responsiveness to change and the minimal clinically important difference.
In summary, the results indicate the UFOV test has moderate variability, greater for patients with glaucoma than persons of similar age with normal vision, and a learning effect. Although the UFOV testing procedure incorporates practice trials, based on a sample of young persons with normal vision, at least two consecutive full tests are suggested to establish reliable baseline measures. These findings and recommendations will be useful to clinicians, driving rehabilitation specialists, driving researchers, road traffic authorities, insurance company managers, and policymakers.