The DLTV has previously been shown to be a useful instrument in assessing the impact of visual impairment in older adults.2 In the present study, we extended our assessments of the DLTV to include the original version that uses a four-point scale and data obtained with a subset of 11 items with a response scale increased to five steps. We confirmed that the DLTV-11 and the remaining items of the overall DLTV-22 were measuring the same construct. We then sought to determine which scale had the better structure, obtain a linear scoring system, and refine internal consistency. We used Rasch analysis to achieve these objectives. Rasch analysis is particularly useful in providing true linear scoring and in deciding the appropriateness of the scoring scale. It combines the measure of the ability of the person with the measure of the difficulty of the item in question by subtracting one from the other. For a rating scale to function well, observations in higher categories must be produced by higher measures, meaning that the average measures by category should be evenly separated throughout the rating scale.
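The subtraction of item difficulty from person ability can be illustrated with a minimal sketch of a rating scale model in the Andrich form. This form is assumed here because the text refers to a Rasch rating scale; the function names and the threshold values are hypothetical and are not taken from the study.

```python
import math

def category_probabilities(person, item, thresholds):
    """Probability of each response category under an Andrich-type rating scale model.

    person and item are in logits; thresholds holds one step value per
    boundary between adjacent categories (three for a four-point scale).
    """
    # Cumulative sums of (person - item - threshold); category 0 has an empty sum.
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + (person - item - tau))
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def expected_score(person, item, thresholds):
    """Model-expected category (0-based) for one person on one item."""
    probs = category_probabilities(person, item, thresholds)
    return sum(k * p for k, p in enumerate(probs))

# Illustrative (not estimated) thresholds for a four-point scale.
thresholds = [-1.0, 0.0, 1.0]
for ability in (-2.0, 0.0, 2.0):
    print(ability, round(expected_score(ability, 0.0, thresholds), 2))
```

In this sketch only the difference between the person measure and the item difficulty enters the model, so a more able person always has a higher expected category on the same item.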
Rasch analysis demonstrated that the four-point ordinal scale of the DLTV-22 was clearly superior to the five-point scale. The intervals between categories of the five-point scale were uneven (shown in Fig. 3), with the least separation between categories 3 and 4, suggesting that these should be collapsed into one. On the basis of the results of this adjustment, we conclude that these two categories may be used interchangeably.
The Rasch analysis separation values (Table 6) also showed that the five-point scale was less precise than the four-point scale, implying that the extra response level distorts the sensitivity of the instrument rather than improving it. In this context, our findings support the observation of Stelmack et al.,8 who used Rasch analysis on the VA LV VFQ-48 and observed that the five-point scale was suboptimal. In that instrument, reducing the five-point scale to a four-point scale by combining categories 2 and 3 also yielded a better fit. The same investigators tested the validity of their approach by analyzing a further VA LV VFQ-48 dataset that used a four-point scale, and found that Guttman’s coherence was identical with that obtained after combination of categories 2 and 3.
Other investigators have demonstrated similar findings on other visual function instruments. Pesudovs et al.4 found that the five-point scale of the ADVS (Activities of Daily Vision Scale) was suboptimal. Velozo et al.,10 who examined the VF-14 (another visual function instrument), detected underutilization of the lowest response categories and recommended that its five-point scale be reduced to three. In addition, Velozo et al. reported that person separation increased from 2.37 with a five-point response scale to 2.53 with a three-point scale. In the present study in an AMD population, in which the frequency of bilateral visual loss is high, we determined person separation to be 2.95 (person reliability = 0.90) for the DLTV-22 (four-point scale) and 2.18 (person reliability = 0.83) for the DLTV-11 (five-point scale). Combining two of the response categories of the DLTV-11 to restore a four-point scale improved person separation to 2.40 (person reliability = 0.85), indicating improved instrument performance. However, person separation was still not as high as when the instrument was administered with a four-point scale, which emphasizes the importance of having the optimal scale in place at the time of administration. Although a three-point scale appears to yield an even better fit to the Rasch model, it is possible that the DLTV would become less sensitive to changes in visual function, and additional studies are needed to investigate this possibility.
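The person separation and person reliability values quoted above are two expressions of the same quantity. As a brief check, the sketch below applies the standard relation between the separation index G and reliability R, namely R = G²/(1 + G²); the function name and the labels are chosen here for illustration only.

```python
def person_reliability(separation):
    """Convert a Rasch person separation index G to reliability R = G^2 / (1 + G^2)."""
    g_squared = separation ** 2
    return g_squared / (1.0 + g_squared)

# Separation values as reported in the text for the present study.
reported = {
    "DLTV-22, four-point scale": 2.95,
    "DLTV-11, five-point scale": 2.18,
    "DLTV-11, collapsed to four points": 2.40,
}

for label, g in reported.items():
    print(f"{label}: separation {g:.2f} -> reliability {person_reliability(g):.2f}")
```

Rounded to two decimals, the three conversions give 0.90, 0.83, and 0.85, in agreement with the reliabilities stated in the text.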
An average raw score for each person across all items is computed on the basis of the Likert scoring system. The Rasch rating scale model transforms this into an interval scale, so that each person is given a person measure.6,7 However, on plotting the average person raw score against the person measure, we established that they are not linearly related for the DLTV. This finding strongly supports the recommendations of Massof,3 who questioned the use of the Likert scale in analyzing VFQ.11 We went on to find that a double asymptotic nonlinear regression is required to transform the average person raw score to a Rasch person measure, and this process allowed us to use the DLTV-22 with a Rasch-adjusted scale without having to perform Rasch analysis.11
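The shape of such a transformation can be sketched as follows. The functional form used in the study is not given in this section, so the logit-type curve below is only one plausible double asymptotic function. The raw score bounds (1 and 4, assuming items scored 1 to 4), the `offset` and `slope` coefficients, and the function name are all hypothetical; in practice the coefficients would be estimated by nonlinear regression against Rasch person measures.

```python
import math

def raw_to_measure(avg_raw, low=1.0, high=4.0, offset=0.0, slope=1.0):
    """Map an average raw score to a logit-like measure.

    The curve has vertical asymptotes at both ends of the raw score range
    (low and high), so equal raw score steps correspond to larger measure
    steps near the extremes than in the middle of the scale.
    All default parameter values are illustrative, not fitted.
    """
    if not low < avg_raw < high:
        raise ValueError("average raw score must lie strictly between low and high")
    return offset + slope * math.log((avg_raw - low) / (high - avg_raw))

scores = [1.5, 2.0, 2.5, 3.0, 3.5]
measures = [raw_to_measure(s) for s in scores]
for s, m in zip(scores, measures):
    print(f"average raw score {s:.1f} -> measure {m:+.2f} logits")
```

The unequal spacing of the printed measures for equally spaced raw scores illustrates why the raw score and the person measure are not linearly related.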
We tested content validity by generating person–item maps for the DLTV-22. As visual function can be excellent even if acuity is poor in one eye, we dichotomized the second dataset into two groups: group 1, who have AMD in both eyes but with VA ≥ 0.3 logMAR in the better-seeing eye, and group 2, who have AMD in one eye but with VA < 0.3 logMAR in the better-seeing eye. The person–item map for group 1 (Fig. 6) illustrates the absence of floor or ceiling effects, confirming that the items of the DLTV-22 were well targeted. Group 2 (Fig. 7), in comparison, exhibits ceiling effects for most items, indicating that the DLTV-22 is unlikely to discriminate between persons with excellent and good visual function. We further refined the content validity of the DLTV-22 through examination of infit and outfit statistics and the spacing between the items in relation to the distribution of persons in the model.11 As mean square statistics have been defined such that the uniform value of randomness is indicated by 1.0, Rasch analysis indicates whether it would be prudent to remove items or persons that do not fit the model, thus allowing an improved measurement without losing information.6,7,12,13 With this in mind, items with high outfit mean squares (Table 1) were removed, reducing the DLTV from 22 to 17 items. The outfit mean square for response category 3 was reduced to 1.20, leading us to conclude that the high outfit mean square observed in this category was due, in part at least, to items that were perceived as too easy.
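For readers unfamiliar with the fit statistics, the following sketch shows how outfit and infit mean squares are conventionally computed from observed responses, model-expected responses, and model variances. The function names and the two-response example are illustrative assumptions and do not use data from the study.

```python
def outfit_mean_square(observed, expected, variance):
    """Unweighted mean of squared standardized residuals (outlier-sensitive)."""
    z_squared = [(x - e) ** 2 / v for x, e, v in zip(observed, expected, variance)]
    return sum(z_squared) / len(z_squared)

def infit_mean_square(observed, expected, variance):
    """Information-weighted mean square: summed squared residuals over summed variance."""
    squared_residuals = [(x - e) ** 2 for x, e in zip(observed, expected)]
    return sum(squared_residuals) / sum(variance)

# Hypothetical dichotomous example: the second response is a surprising
# success on an item the model expected to be failed (expected = 0.1).
observed = [1, 1]
expected = [0.5, 0.1]
variance = [0.25, 0.09]  # p * (1 - p) for each expected value

print(outfit_mean_square(observed, expected, variance))
print(infit_mean_square(observed, expected, variance))
```

Both statistics have an expected value of 1.0 when the data fit the model. In this example the single unexpected response inflates the outfit value more than the infit value, which is why outfit is the statistic most affected by responses to items that are far too easy or too hard for a person.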
In our previous analyses, we used PCA on raw scores to assign items to domains and identified the presence of four domains in the DLTV-22.2,14 When the five items with high outfit Rasch statistics were excluded, the 17 remaining items were assigned to two domains. These items segregated into the two domains in a fashion almost identical with the domain structure that we had established, with two minor differences (see Table 7). First, “Reading newspaper headlines,” which was previously assigned to domain 2, moved to domain 1. Second, the item “Do you feel confident to walk around your own neighborhood?,” which was previously assigned to domain 3, was reallocated to the new domain 2. We believe it important to retain a domain structure, as previous studies have shown that domain 1 is sensitive to changes at the better extreme of the visual acuity scale, whereas domain 2 is sensitive to changes in the moderate range.14 With respect to the remaining five items of the DLTV, which did not fit into either domain 1 or domain 2, we recommend analyzing them as individual items if the full DLTV-22 is used. The implications of removing these five items have not yet been fully tested, and it is possible that they may be of value in assessing disease states other than AMD or when applied to younger populations. We also believe that longitudinal studies are needed to establish the sensitivity of the two domains to change in visual function.
With the establishment of the accuracy and precision of the measurement scale and with its present structure comprising 17 items within two domains, we contend that the DLTV constitutes an easily administrable, robust instrument for the assessment of self-reported functioning in patients with central visual loss. We confirm that a four-point ordinal scale constitutes an optimal scoring system for patients with AMD and that increasing the number of categories to five causes problems with assumptions of linearity. In summary, the use of Rasch analysis has confirmed the appropriateness of the rating scale, improved content validity and provided a linear scoring system for the DLTV.
The authors thank the SFRAD Study Group: Alan Bird, Ian Chisholm, and Gilbert Mackenzie.