Weekend statistics question (12): When the sample size falls short, how do type I and type II errors change?
Question
In a trial of a herbal treatment for symptoms of menopause the power calculation described in the protocol required 400 patients to be randomised. Unfortunately, recruitment was slow and the trial had to be stopped after 200 patients as funding was running out. Which if any of the following statements is true?
a) The possibility of a type one error is increased and the possibility of a type two error is unaltered.
b) The possibility of a type two error is increased and the possibility of a type one error is unaltered.
c) The possibility of both a type one error and a type two error is increased.
d) The possibility of both types of error is unaltered.
Answer
A type one error occurs when a study indicates an effect or an association that does not exist, the so-called “false positive result.” The rate of false positive results we are prepared to accept as the price for scientific enquiry is chosen by the investigator or the reader and is referred to as the level for statistical significance. Often this is set at 5% or one in 20. Because this rate is fixed by the chosen significance level, it does not change with sample size.
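The claim that the false positive rate does not depend on sample size can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: a two-arm trial with a continuous outcome, a known standard deviation of 1, an even split of patients between arms, and a two-sided z-test at 5%. None of these details come from the trial in the question, and the function name `false_positive_rate` is hypothetical.

```python
import random
from statistics import NormalDist, fmean

ALPHA = 0.05  # chosen significance level
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)  # two-sided critical value

def false_positive_rate(n_per_arm, n_trials=2000, seed=1):
    """Share of simulated null trials that reach significance.

    Assumption: both arms are drawn from the same N(0, 1) distribution,
    so there is no true effect and every rejection is a type one error.
    """
    rng = random.Random(seed)
    se = (2 / n_per_arm) ** 0.5  # SE of the difference in means, SD = 1
    rejections = 0
    for _ in range(n_trials):
        treated = [rng.gauss(0, 1) for _ in range(n_per_arm)]
        control = [rng.gauss(0, 1) for _ in range(n_per_arm)]
        z = (fmean(treated) - fmean(control)) / se
        if abs(z) > Z_CRIT:
            rejections += 1
    return rejections / n_trials

# 200 patients in total (100 per arm) versus the planned 400 (200 per arm)
rate_200 = false_positive_rate(100)
rate_400 = false_positive_rate(200)
print(f"false positive rate with 200 patients: {rate_200:.3f}")
print(f"false positive rate with 400 patients: {rate_400:.3f}")
```

Both rates should land near the chosen 5%, apart from simulation noise, whichever sample size is used.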
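A type two error occurs when a study fails to detect an effect or an association that does exist. For any given investigation, smaller studies have less ability to pick up an association or effect than do larger studies. Stopping the trial at 200 patients instead of the planned 400 therefore lowers its power and raises the possibility of a type two error, while the type one error rate stays at the chosen significance level, so statement b is true.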
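To give a rough sense of how much power is lost, the sketch below uses the normal approximation for a two-sided, two-sample comparison of means. The standardised effect size of 0.28 is a hypothetical value chosen so that 400 patients (200 per arm) give about 80% power. The protocol's real assumptions are not stated in the question, so the numbers are illustrative only.

```python
from statistics import NormalDist

ALPHA = 0.05          # two-sided significance level
EFFECT_SIZE = 0.28    # hypothetical standardised difference between arms

def power_two_sample(n_per_arm, effect_size=EFFECT_SIZE, alpha=ALPHA):
    """Approximate power of a two-sided z-test comparing two means.

    Assumptions: equal arm sizes, known common SD, normal outcomes.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the z statistic when the effect is real
    shift = effect_size * (n_per_arm / 2) ** 0.5
    # Probability of landing in either rejection region
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

power_planned = power_two_sample(200)  # 400 patients in total
power_actual = power_two_sample(100)   # 200 patients in total
print(f"power with 400 patients: {power_planned:.2f}")
print(f"power with 200 patients: {power_actual:.2f}")
print(f"type two error rate with 200 patients: {1 - power_actual:.2f}")
```

Under these assumptions, halving the sample size takes the trial from roughly four chances in five of detecting the effect to roughly an even chance.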
Small studies can be misleading in another way. As small studies are “imprecise” and have wide confidence intervals, it is only the ones with abnormally large effects that manage to achieve “statistical significance.” If the only studies conducted on, say, a new surgical technique are small, then by relying only on the “statistically significant” ones we are likely to overestimate the promise of the new treatment. So even though we might rely on the “significant” study to show that something is going on, it would be prudent to consider all the “non-significant” studies as well in estimating the likely size of the effect.
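The overestimation described above can be sketched with a simulation of many small studies of the same true effect. Everything numeric here is an assumption made for illustration: a true standardised effect of 0.28, 25 patients per arm, and estimates drawn directly from their normal sampling distribution rather than from simulated patient data.

```python
import random
from statistics import NormalDist, fmean

TRUE_EFFECT = 0.28   # hypothetical true standardised effect
N_PER_ARM = 25       # hypothetical size of each small study
N_STUDIES = 5000     # number of simulated studies
ALPHA = 0.05

rng = random.Random(42)
se = (2 / N_PER_ARM) ** 0.5                   # SE of each estimate, SD = 1
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)  # two-sided critical value

# Each study's estimate is the true effect plus sampling error
estimates = [rng.gauss(TRUE_EFFECT, se) for _ in range(N_STUDIES)]
# Keep only the studies that would be reported as statistically significant
significant = [est for est in estimates if abs(est) / se > z_crit]

mean_all = fmean(estimates)
mean_significant = fmean(significant)
print(f"true effect:                       {TRUE_EFFECT:.2f}")
print(f"mean estimate, all studies:        {mean_all:.2f}")
print(f"mean estimate, significant only:   {mean_significant:.2f}")
print(f"share of studies significant:      {len(significant) / N_STUDIES:.2f}")
```

Averaging over all studies recovers something close to the true effect, whereas averaging only the significant ones gives a much larger value, which is why the non-significant studies belong in any estimate of effect size.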
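Explanation (translated from the Chinese):
A type one error occurs when a study finds an effect or association that does not actually exist, the so-called “false positive result.” The type one error rate we are willing to accept is decided by the researcher or the reader and is called the level of statistical significance. It is usually set at 5%, or one in 20. This kind of error is unrelated to sample size.
A type two error occurs when a study fails to detect an effect or association that actually exists. For any given investigation, a study with a smaller sample has less power, that is, less ability to find a true positive, than a larger study.
Small studies can also mislead in another way. Because small studies are “imprecise” and have wide confidence intervals, only those with unusually large effects manage to reach “statistical significance.” If, say, the studies of a new surgical technique all have small samples, then relying only on the “statistically significant” ones is likely to make us overestimate the promise of the new treatment. So even though we may rely on a “statistically significant” study to show that something is going on, we should also take all the “non-significant” studies into account when estimating the likely size of the effect.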