Abstract
Purpose :
In previous research, the EyeQ item bank, which measures vision-related quality of life (Vr-QoL), was developed and calibrated for future use as a computer adaptive test (CAT). In theory, the CAT algorithm needs only a few items to obtain a reliable estimate of Vr-QoL (‘ability’, θ), which reduces patient burden. Before a CAT is used in clinical practice, however, preliminary analyses are essential to evaluate how well CAT-based ability estimates agree with estimates based on the full test length, and how much the test length is actually reduced. The aim of the current study was to optimize the performance of the CAT-EyeQ by customizing its stopping rules.
Methods :
Post-hoc CAT simulations were performed using real responses. Patients (N=704, mean age 76.2 years) with macular edema due to various exudative retinal diseases completed the EyeQ (46 items). θ values were estimated after fitting Samejima’s graded response model. Subsequently, plausible responses were imputed for missing items, and θ values based on the complete data were obtained. Pearson’s correlation was calculated between EyeQ θincom (incomplete data) and EyeQ θimput (plausible imputed data). Four CAT simulations were performed, each with a different combination of stopping rules, written as minimum length | maximum length | accuracy level (standard error): CATDefaultPROMIS used a minimum of 4 items, a maximum of 12 items, and an accuracy level of 0.32 (4|12|0.32); CATAlt1 used 2|15|0.32; and CATAlt2 used 4|12|0.25. CATBestHealth was defined as CATDefaultPROMIS plus an abort rule that stopped the test if the first four responses were all rated as ‘best possible health’. Mean test length, the percentage of unreliably estimated CAT θ values, and the mean standard error were evaluated. The conditional standard error across different levels of ability was examined graphically.
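The four stopping-rule configurations can be sketched in code as follows. This is a minimal illustration, not the software used in the study: the names `StoppingRule`, `should_stop`, `best_health_n`, and `approx_reliability`, and the coding of ‘best possible health’ as response category 0, are assumptions introduced here. The reliability helper uses the common approximation reliability ≈ 1 − SE², which assumes θ is scaled to unit variance.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class StoppingRule:
    """One min | max | SE configuration (hypothetical container)."""
    min_items: int
    max_items: int
    se_target: float
    best_health_n: int = 0  # 0 disables the 'best possible health' abort


def should_stop(rule: StoppingRule, n_administered: int, current_se: float,
                responses: Sequence[int] = (), best_category: int = 0) -> bool:
    """Return True if the CAT should stop after the items given so far.

    Assumption: `best_category` is the response code for
    'best possible health' (0 here, purely for illustration).
    """
    # Optional abort: the first N responses all indicate best possible health.
    n = rule.best_health_n
    if n > 0 and n_administered >= n and len(responses) >= n:
        if all(r == best_category for r in responses[:n]):
            return True
    # Never stop before the minimum test length.
    if n_administered < rule.min_items:
        return False
    # Always stop at the maximum test length.
    if n_administered >= rule.max_items:
        return True
    # In between, stop once the standard error reaches the target.
    return current_se <= rule.se_target


def approx_reliability(se: float) -> float:
    """Approximate reliability for a given SE, assuming unit-variance theta."""
    return 1.0 - se ** 2


# The four configurations described in the Methods.
CAT_DEFAULT_PROMIS = StoppingRule(4, 12, 0.32)
CAT_ALT1 = StoppingRule(2, 15, 0.32)
CAT_ALT2 = StoppingRule(4, 12, 0.25)
CAT_BEST_HEALTH = StoppingRule(4, 12, 0.32, best_health_n=4)
```

Under the unit-variance assumption, an accuracy level of 0.32 corresponds to a reliability of roughly 0.90, and the stricter level of 0.25 to roughly 0.94.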
Results :
Pearson’s correlation between EyeQ θincom and EyeQ θimput was 0.99. CATDefaultPROMIS required the lowest mean number of items (6.9), whereas CATAlt1 yielded the lowest percentage of unreliably estimated CAT scores (11.5%). CATAlt2 performed worst, with a mean of 9.7 items and 37% unreliable estimates. Outcomes of CATBestHealth were similar to those of CATDefaultPROMIS, but the mean test length was further reduced to 6 items. In all conditions, the conditional standard error was highest at lower levels of ability.
Conclusions :
This study shows that measuring Vr-QoL in clinical practice using the CAT-EyeQ with optimized stopping rules is useful: it increases measurement efficiency while still allowing reliable test outcomes.
This abstract was presented at the 2022 ARVO Annual Meeting, held in Denver, CO, May 1-4, 2022, and virtually.