Table 1 summarizes the analysis of five response categories for the difficulty ratings. For each difficulty rating, the “count” column shows how many times the rating was used across all items and subjects. The step measure is the value of person ability minus item difficulty at which the probability of responding with category x equals the probability of responding with category x − 1. The step measure is an important parameter in the Wright and Masters rating scale model,38 which is used by WinStep. There is no step measure for category 0, because there is no lower category. The step measures should increase monotonically with response category rank. However, there was a reversal from steps 2 to 3 (Table 1). Moreover, response 2 was not the most probable response at any value of the person-minus-item measure. Hence, rating categories 2 and 3 were combined and the Rasch analysis was reapplied, with 0 as “no difficulty” and 3 as “cannot manage.”
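The role of the step measures can be illustrated with a minimal sketch of the category probabilities under a rating scale model. The step values used here are hypothetical and chosen only for illustration; they are not the estimates in Table 1, and the function names are our own.

```python
import math

def category_probabilities(measure, thresholds):
    """Rating scale model probabilities P(x) for categories x = 0..m.

    measure    -- person ability minus item difficulty, in logits
    thresholds -- step measures for categories 1..m (category 0 has none)
    """
    # Log-numerator of category x is the sum over k <= x of (measure - step_k);
    # the empty sum for category 0 is 0.
    log_num = [0.0]
    for tau in thresholds:
        log_num.append(log_num[-1] + (measure - tau))
    denom = sum(math.exp(v) for v in log_num)
    return [math.exp(v) / denom for v in log_num]

def modal_categories(thresholds, lo=-6.0, hi=6.0, steps=241):
    """Categories that are the most probable response somewhere on a grid."""
    modal = set()
    for i in range(steps):
        measure = lo + (hi - lo) * i / (steps - 1)
        probs = category_probabilities(measure, thresholds)
        modal.add(probs.index(max(probs)))
    return modal

# Hypothetical step measures (assumed values, not taken from the study).
ordered = [-1.5, 0.0, 1.5]      # steps increase with category rank
disordered = [-1.5, 1.0, 0.5]   # reversal between steps 2 and 3

# At a measure equal to step 2, categories 1 and 2 are equally probable.
probs_at_step2 = category_probabilities(ordered[1], ordered)

print(sorted(modal_categories(ordered)))
print(sorted(modal_categories(disordered)))
```

With ordered steps, every category is the most probable response over some range of the measure. With the reversed pair, category 2 is never the most probable response, which is the pattern that motivates combining adjacent categories.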
Table 2 shows analyses of the four response categories for difficulty ratings. The table shows that the step measure increased monotonically with response category rank. The expected measure in each category is the average functional reserve for the extreme categories and the functional reserve for the peak of the probability function. In our sample, the expected measure showed a consistent increase with the order of the ratings.
Figure 1 is a histogram of the person measures. The person measure has the opposite sign of the person logit value. The person logit corresponds to the difference between each person’s perceived visual ability (αₙ) and the mean item measure p̄. If the person logit is positive, the person’s perceived visual ability is greater than the average required visual ability of the 16 visual function situations. If the person logit is negative, the person’s perceived visual ability is less than the average required visual ability. In our sample, estimates of the perceived visual ability (in logits) for visual function performance were normally distributed (P = 0.889, Kolmogorov-Smirnov Z-test). The mean of the distribution (in logits) was −0.87 ± 2.0 (SD).
The 16 visual function items in the questionnaire are listed in Table 3 in order of least to most visual ability required for functional vision performance according to the subjects’ difficulty ratings. The item number represents the order in which the functional vision questions were listed on the questionnaire. The values in the table are “item logits,” which correspond to the difference between the mean item measure for the 16 items (p̄) and the item measure for each item (ρᵢ). The item measure (ρᵢ) corresponds to the visual ability required for the performance of visual function. Here, the item measure has the opposite sign of the item logit value. If the item logit is positive, the required visual ability for the performance of visual function is less than the mean required visual ability of all the items, and if the item logit is negative, the required visual ability is greater than the mean required visual ability.
Table 3 shows that “reaching an object that is farther or closer than you thought,” “identifying colors,” and “recognizing the people near them” required the least visual ability, whereas “recognizing small objects,” “reading small print in the newspaper,” “recognizing people across the street,” and “recognizing the bus number” required the most visual ability. The items that required almost the same visual ability are “estimating the distance of a vehicle while crossing the road,” “noticing objects off to the side, when walking and looking straight ahead,” and “recognizing traffic signals/lights.”
Figure 2 shows a patient ability/item difficulty map determined by Rasch analysis for the items in the APEDS-VFQ. Patients (Xs on the left) appear in ascending order of ability from the bottom of the map to the top, and items (item names on the right) appear in ascending order of difficulty from the bottom to the top. On the whole, the item difficulties were well matched to the abilities of the persons: most of the Xs lie in the range where the items are located, and the means of the two distributions, denoted in Figure 2 by M, were close to each other.
Table 4 summarizes the global fit statistics for person ability and item difficulty parameters. Content validity is tested with the separation index, which is a measure of how broadly the person and item measures are distributed along the visual ability dimension and is simply the ratio of the estimated true SD to the SE of the estimate. Separation indices of 3.17 for person measures and 5.44 for item measures were observed in our study. Using these indices with the formula of Wright and Masters,38 we determined that our sample has five statistically distinct levels of person measures and seven statistically distinct levels of item measures. In addition to good separation indices, it is important to have high reliability. The reliability of the separation is the ratio of the adjusted (true) variance to the observed variance of the person or item measure distribution. The closer the reliability value is to 1.0, the less the variability in the measurement distribution can be attributed to measurement error. Table 4 reports high reliability values for the person (0.91) and item (0.97) parameters.
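The link between the separation index G and the reliability can be checked with a short calculation. The sketch below assumes the standard Rasch relation R = G² / (1 + G²) and, for the number of distinct levels, the commonly used strata expression (4G + 1) / 3. The text does not print the formula taken from Wright and Masters, so the strata expression is an assumption, and how fractional values were converted to whole numbers of levels is not stated.

```python
def reliability_from_separation(g):
    """Separation reliability implied by separation index g: g^2 / (1 + g^2)."""
    return g * g / (1.0 + g * g)

def strata(g):
    """Statistically distinct levels, assuming the (4g + 1) / 3 expression."""
    return (4.0 * g + 1.0) / 3.0

# Separation indices reported in the text.
separation = {"person": 3.17, "item": 5.44}

for label, g in separation.items():
    r = reliability_from_separation(g)
    h = strata(g)
    print(f"{label}: separation={g:.2f} reliability={r:.2f} strata={h:.2f}")
```

Under this relation, the reported separation indices reproduce the reliability values of 0.91 and 0.97 given in Table 4.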
Construct validity (how well the data fit the assumptions of the model) was evaluated by calculating “infit” and “outfit” statistics. The fit statistics are indices of measurement accuracy. The outlier-sensitive (outfit) statistic is sensitive to unexpected behavior by persons on items far from the subject’s ability level. Values close to 1.0 indicate that the variability in the responses is close to the variance expected by the model. Values greater than 1.0 indicate that the variability in the responses is greater than the model expects, and values below 1.0 suggest that the variability in responses is influenced by a covariance term. Because the outfit statistic is sensitive to misfitting persons or items, an information-weighted (infit) statistic also was calculated. This statistic more closely represents the variance of the responses of persons whose person measure is close to the item measure. If the data fit the model, the expected value is 1.0. The normalized infit and outfit mean squares (Z STD) have an expected mean of 0 and unit SD. Values that exceed ±2 indicate that the mean square exceeded the model’s expectations by more than 2 SD.
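The difference between the two mean squares can be made concrete with a small sketch. It uses the usual definitions (outfit as the unweighted mean of squared standardized residuals, infit as the sum of squared residuals divided by the sum of model variances) applied to hypothetical dichotomous responses. The numbers are invented for illustration, the function name is our own, and the normalization to Z STD is not shown.

```python
def fit_mean_squares(observed, expected, variance):
    """Return (infit, outfit) mean squares for one item or one person.

    observed -- observed responses
    expected -- model-expected responses
    variance -- model variances of the responses (the information weights)
    """
    sq_resid = [(x - e) ** 2 for x, e in zip(observed, expected)]
    # Outfit: every squared standardized residual counts equally, so one
    # surprising response far from the person's level can dominate.
    outfit = sum(r / w for r, w in zip(sq_resid, variance)) / len(sq_resid)
    # Infit: residuals are weighted by model variance, which is largest
    # when the person measure is close to the item measure.
    infit = sum(sq_resid) / sum(variance)
    return infit, outfit

# Hypothetical data: two well-targeted responses (expected 0.5), one
# expected success, and one unexpected success on a far-off-target item.
observed = [1, 0, 1, 1]
expected = [0.5, 0.5, 0.9, 0.1]
variance = [p * (1 - p) for p in expected]

infit, outfit = fit_mean_squares(observed, expected, variance)
print(f"infit={infit:.2f} outfit={outfit:.2f}")
```

In this example both statistics exceed 1.0, but the single off-target surprise inflates the outfit mean square more than the infit mean square, which is why both are reported.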
Table 3 reports the infit and outfit statistics for the 16 items. Six items in our sample were misfits. The most extreme misfit item was “recognizing small objects.” The mean squares (infit and outfit) of this item exceeded the model’s expectation by 3.2 SD. Other items (“reaching an object that is farther or closer than you thought,” “seeing objects in poorly lit surroundings,” “seeing objects in bright light due to glare,” and “climbing up or down the steps”) had infit and outfit values that exceeded the model’s expectation by more than 2 SD. Ambiguous wording (e.g., farther or closer than you thought) may have contributed to the high variability of responses to these items.
Figure 3 shows the person measures against the Z STD infit values for the 119 persons whose responses were included in the final analysis. Data points for the persons with the most visual ability for functional vision performance are located at the top of the graph and those for persons with the least visual ability are located at the bottom. The Z STD infit values of 19 (16%) persons exceeded 2, indicating that their mean squares exceeded the model’s expectations by more than 2 SD. A retrospective review showed that the persons (n = 6) whose Z STD infit values lay between 3 and 4 were visually impaired with glaucoma (n = 1), amblyopia (n = 2), or retinal disorders (n = 3). Only one person, a 75-year-old woman, had a Z STD infit value of more than 4 (actual, 5.5). This person was blind in one eye due to endophthalmitis and was moderately visually impaired in the second eye due to a retinal problem. She reported difficulty (cannot manage) with noticing objects off to the side while walking but not with the other mobility items, with which we might expect such a person to have a greater level of difficulty. Elimination of this misfitting subject did not influence the estimation of item or person measures.