Abstract
Purpose :
Accounting for missing data is a frequent obstacle when analyzing patient symptom diary data for dry eye studies. We performed a simulation study to compare statistical outcomes obtained with multiple imputation (MI) methods for handling missing data against those obtained with single imputation (SI) methods, as previously reported by Hsu et al. at the 2013 ARVO conference.
Methods :
Five thousand sets of diary data were randomly generated from a multivariate normal distribution for active and placebo treatment groups. For each simulation, two complete weeks of diary data were generated for 50 subjects per treatment, assuming a 0.6 treatment mean difference on a scale of 0-5 with a standard deviation of 1 and a correlation of 0.85 between diary days. Two percent of the data were randomly set as missing and ten percent of the subjects were randomly selected as early withdrawals. The Markov Chain Monte Carlo (MCMC), regression, and predictive mean matching (PMM) MI methods were used to impute missing data, each creating 20 imputed “complete” datasets. A mixed model with treatment and day as covariates (accounting for repeated measures within each subject) was used to estimate the difference in treatment means. Type I error, power, concordance (true positive rate + true negative rate), and discordance (false positive rate + false negative rate) are reported for each MI method and compared to previously reported results of the SI methods and the complete data.
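The data-generation step described above can be illustrated with a minimal Python sketch. It is not the authors' code. Several details are assumptions that the abstract does not state: an exchangeable (compound-symmetric) correlation of 0.85 between all pairs of days, a hypothetical placebo mean of 2.5, an active mean equal to placebo plus 0.6, independent 2% missingness per diary entry, withdrawals drawn within each treatment group at a uniformly random day, and no truncation to the 0-5 scale. The `locf` helper shows a simple last observation carried forward fill for comparison.

```python
import numpy as np


def simulate_diary(n_per_group=50, n_days=14, diff=0.6, sd=1.0, rho=0.85,
                   placebo_mean=2.5, p_missing=0.02, p_withdraw=0.10, seed=0):
    """Simulate one set of diary data (complete and with missing values).

    placebo_mean, the direction of the treatment effect, and the
    compound-symmetric covariance are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Assumed covariance: variance sd^2 on the diagonal, rho * sd^2 elsewhere.
    cov = sd ** 2 * (rho * np.ones((n_days, n_days))
                     + (1 - rho) * np.eye(n_days))
    groups = {}
    for name, mean in (("placebo", placebo_mean),
                       ("active", placebo_mean + diff)):
        complete = rng.multivariate_normal(np.full(n_days, mean), cov,
                                           size=n_per_group)
        observed = complete.copy()
        # Sporadic missingness: each entry is dropped with probability 2%.
        observed[rng.random(observed.shape) < p_missing] = np.nan
        # Early withdrawals: 10% of subjects lose every entry after a
        # randomly chosen day (at least one day is retained).
        n_out = int(round(p_withdraw * n_per_group))
        for i in rng.choice(n_per_group, size=n_out, replace=False):
            first_missing_day = rng.integers(1, n_days)
            observed[i, first_missing_day:] = np.nan
        groups[name] = (complete, observed)
    return groups


def locf(observed):
    """Carry the last observed value forward within each subject (row).

    A missing value on the first day has nothing to carry and stays NaN.
    """
    filled = observed.copy()
    for day in range(1, filled.shape[1]):
        gap = np.isnan(filled[:, day])
        filled[gap, day] = filled[gap, day - 1]
    return filled
```

Calling `simulate_diary(seed=k)` for 5000 different seeds would give one dataset per simulation. The MI step and the mixed model are not shown here.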
Results :
All three MI methods resulted in power within 0.28% of that obtained with the complete data, while the last observation carried forward (LOCF) SI method resulted in 0.6% less power than the complete data. The regression and PMM methods showed the lowest false negative rates (0.36% and 0.42%, respectively), compared to 2.00% for MCMC and 1.16% for LOCF. They also showed the highest concordance rates (99.00% and 98.96%, respectively), compared to 96.08% for MCMC and 98.28% for LOCF.
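To make the concordance and discordance figures concrete, the following Python sketch tabulates them from per-simulation significance decisions. It assumes, as an interpretation not spelled out in the abstract, that the complete-data result serves as the reference and that each rate is a proportion of all simulations. The function name and inputs are hypothetical.

```python
def agreement_rates(complete_sig, imputed_sig):
    """Compare imputed-data conclusions with complete-data conclusions.

    Each argument is a sequence of booleans, one per simulation, that is
    True when the treatment difference was statistically significant.
    """
    pairs = list(zip(complete_sig, imputed_sig))
    n = len(pairs)
    tp = sum(c and m for c, m in pairs) / n              # both significant
    tn = sum((not c) and (not m) for c, m in pairs) / n  # both non-significant
    fp = sum((not c) and m for c, m in pairs) / n        # imputed only
    fn = sum(c and (not m) for c, m in pairs) / n        # complete only
    return {
        "true_positive": tp,
        "true_negative": tn,
        "false_positive": fp,
        "false_negative": fn,
        "concordance": tp + tn,
        "discordance": fp + fn,
    }
```

Under this reading, concordance and discordance sum to 1, so a 99.00% concordance rate corresponds to a 1.00% discordance rate.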
Conclusions :
While the MI methods perform comparably to the LOCF method, the probability of falsely concluding no significant treatment difference is slightly lower with the regression and PMM methods than with the LOCF method. We recommend using a variety of imputation methods to handle missing data, including MI methods, as sensitivity analyses for clinical research.
This is an abstract that was submitted for the 2018 ARVO Annual Meeting, held in Honolulu, Hawaii, April 29 - May 3, 2018.