Abstract
Purpose:
Prospective clinical trials are used to make important clinical decisions that impact patient care. Published results are conventionally reported as not statistically significant when they yield a p value > 0.05. This conclusion may be inaccurate if a study is inadequately powered, resulting in a type II error. The purpose of this study was to examine the power of unpaired t-tests that failed to detect a statistically significant difference and to determine the frequency of type II errors in recently published prospective randomized trials from four major ophthalmology journals.
Methods:
We examined all prospective randomized trials published between 2010 and 2012 in four major ophthalmology journals (Archives of Ophthalmology, British Journal of Ophthalmology, Ophthalmology and American Journal of Ophthalmology). Studies that used unpaired t-tests were included. Power was calculated using the number of subjects in each group, the standard deviations and α = 0.05. The difference between control and experimental means was set to be (1) 20% and (2) 50% of the absolute value of the control’s initial conditions. Power and Precision version 4.0 software was used to carry out calculations. Finally, the proportion of articles with type II errors was calculated. β = 0.3 was set as the largest acceptable value for the probability of a type II error.
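As a rough illustration of the calculation described above, the following Python sketch approximates the power of a two-sided unpaired comparison from the group sizes, standard deviations, α and a difference expressed as a fraction of the control mean. It is not the Power and Precision implementation: it uses a normal approximation to the t-test, which tends to overstate power slightly for small samples. The function names and the example values are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(control_mean, fraction, sd_control, sd_treated,
                 n_control, n_treated, alpha=0.05):
    """Approximate power of a two-sided unpaired test.

    Assumption: normal approximation to the t distribution, so this is a
    sketch and not a reproduction of the software used in the study.
    """
    std_normal = NormalDist()
    # Difference to detect, as a fraction of the absolute control mean.
    delta = fraction * abs(control_mean)
    # Standard error of the difference between two independent means.
    se = sqrt(sd_control ** 2 / n_control + sd_treated ** 2 / n_treated)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = delta / se
    # Probability of rejecting in either tail under the alternative.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


def exceeds_beta_limit(power, beta_max=0.3):
    """True when beta = 1 - power is above the acceptable limit."""
    return (1 - power) > beta_max


# Hypothetical example: control mean 10, SD 5 in both groups, 30 per group.
power_20 = approx_power(10, 0.20, 5, 5, 30, 30)
power_50 = approx_power(10, 0.50, 5, 5, 30, 30)
print(f"power (20% difference): {power_20:.2f}, beta > 0.3: {exceeds_beta_limit(power_20)}")
print(f"power (50% difference): {power_50:.2f}, beta > 0.3: {exceeds_beta_limit(power_50)}")
```

In this hypothetical case the same sample is adequately powered for a 50% difference but not for a 20% difference, which mirrors the pattern the abstract reports.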
Results:
A total of 280 articles were screened. Final analysis included 50 randomized controlled trials using unpaired t-tests. The median power of tests to detect a 50% difference between means was 0.9 and was the same for all four journals. The median power of tests to detect a 20% difference between means ranged from 0.26 to 0.9 for the four journals. For tests that did not achieve statistical significance, the median power to detect a 50% and a 20% difference between means was 0.9 and 0.5, respectively. A total of 14% and 57% of articles with negative unpaired t-tests contained results with β > 0.3 when power was calculated for differences between means of 50% and 20%, respectively.
Conclusions:
A large proportion of studies demonstrate high probabilities of type II errors when detecting small differences between means. The power to detect small differences between means varies across journals. It is, therefore, worthwhile for authors to state the minimum clinically important difference for individual studies. Journals can consider publishing statistical guidelines for authors to use.
Keywords: 459 clinical (human) or epidemiologic studies: biostatistics/epidemiology methodology •
460 clinical (human) or epidemiologic studies: health care delivery/economics/manpower