Weekend Statistics Question (17): What Multiple Linear Regression Can Do (this one is a bit tricky)
Question
Which, if any, of the following statements are true of multiple linear regression?
a) It may be used to control confounding in cohort studies
b) It can only assess a linear (straight line) relationship between variables
c) It gives misleading results when two or more independent variables are highly correlated
d) It can be used with dichotomous or categorical independent variables
e) It can be used with dichotomous or categorical dependent variables
Answer
An example of a linear regression equation would be y = ax1 + bx2 + c. This type of equation can be used to express the relationship between the independent variables on the right-hand side and the one dependent variable on the left-hand side.
The dependent variable must be able to take on a wide range of values and ideally should be a continuous variable such as height or blood pressure, so statement (e) is false. The independent variables, by contrast, may be of almost any type including, for example, dichotomous (sex), categorical (eye colour) or continuous (height), so statement (d) is true.
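As a rough illustration of how dichotomous and categorical independent variables can enter the equation, the sketch below codes sex as one 0/1 indicator and eye colour as two indicators against a baseline category, then fits the coefficients by ordinary least squares. Everything in it is an assumption made for illustration: the subjects, the effect sizes, and the helper names `fit_ols` and `design_row` are invented, and the outcome is generated without noise so that the fit can be checked against the known values.

```python
# Illustrative sketch: every number below is made up, not study data.

def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

# Hypothetical subjects: (height in cm, sex, eye colour).
subjects = [
    (160, "F", "blue"), (170, "M", "blue"), (165, "F", "brown"),
    (180, "M", "brown"), (155, "F", "green"), (175, "M", "green"),
    (172, "F", "blue"), (168, "M", "brown"),
]

def design_row(height, sex, eye):
    """Intercept, continuous height, one dummy for sex, two for eye colour."""
    return [
        1.0,
        float(height),
        1.0 if sex == "M" else 0.0,      # baseline category: female
        1.0 if eye == "brown" else 0.0,  # baseline category: blue eyes
        1.0 if eye == "green" else 0.0,
    ]

X = [design_row(*s) for s in subjects]

# Noise-free continuous outcome built from assumed effects, so the fit is checkable.
true_coef = [50.0, 0.4, 6.0, 2.0, -3.0]
y = [sum(c * v for c, v in zip(true_coef, row)) for row in X]

coef = fit_ols(X, y)
names = ["intercept", "height", "male", "brown_eyes", "green_eyes"]
for name, value in zip(names, coef):
    print(f"{name:>10}: {value:.3f}")
```

A categorical variable with three levels needs only two indicator columns; the omitted level (blue eyes here) acts as the reference against which the other coefficients are read.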
Controlling for confounding is a major use of regression equations in medical statistics, so statement (a) is true. Statement (b) is false: whereas correlation assesses a straight-line relationship, regression may take on more complex curves by using polynomials or smoothed functions, for example.
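Both points can be sketched on toy data. In the first part, exposure is more common in older subjects and age also raises the outcome, so the crude exposure coefficient is inflated until age is added to the equation. In the second part, a curved relationship is fitted by adding a squared term, which works because the model stays linear in its coefficients. The data, the effect sizes, and the helper `fit_ols` are hypothetical and noise-free, chosen only so the arithmetic is easy to follow.

```python
# Illustrative sketch with invented, noise-free data.

def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

# --- Part 1: adjusting for a confounder (age) ---
age      = [30, 40, 50, 60, 70, 35, 45, 55]
exposure = [0, 0, 0, 1, 1, 0, 1, 1]          # exposed subjects tend to be older
# Assumed truth: exposure adds 5 units, each year of age adds 0.5.
outcome = [100 + 5 * e + 0.5 * a for e, a in zip(exposure, age)]

crude = fit_ols([[1.0, e] for e in exposure], outcome)
adjusted = fit_ols([[1.0, e, a] for e, a in zip(exposure, age)], outcome)

print(f"crude exposure effect:    {crude[1]:.3f}")     # mixes in the age effect
print(f"adjusted exposure effect: {adjusted[1]:.3f}")  # age held constant

# --- Part 2: a curve fitted with a polynomial term ---
x = [0, 1, 2, 3, 4, 5]
curve_y = [2 + 0.5 * v - 0.3 * v ** 2 for v in x]       # assumed quadratic truth
curve = fit_ols([[1.0, v, v ** 2] for v in x], curve_y)
print("quadratic coefficients:", [round(c, 3) for c in curve])
```

The crude estimate is larger than the adjusted one because the exposed group is on average 18.75 years older in this toy data set, and that age difference is attributed to exposure when age is left out.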
When two highly correlated independent variables are included in a regression equation, they compete for statistical significance and may not appear as independent predictors of outcome, so statement (c) is true. For example, in a cohort study of risk factors for osteoarthritis of the knee, including both the possession of running shoes and participation in regular exercise may give unreliable results from a regression analysis.
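One common way to quantify this problem, not mentioned in the answer itself, is the variance inflation factor (VIF). With exactly two predictors it reduces to 1 / (1 - r²), where r is their Pearson correlation, and it gives the factor by which the variance of each coefficient estimate is multiplied compared with uncorrelated predictors. The sketch below applies it to invented indicators for running-shoe ownership and regular exercise; the data and the helper names `pearson_r` and `vif_two_predictors` are assumptions for illustration only.

```python
# Illustrative sketch with invented data for 20 hypothetical subjects.
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Nearly everyone who owns running shoes also exercises regularly.
owns_running_shoes = [1] * 10 + [0] * 10
regular_exercise   = [1] * 9 + [0] * 11
# A predictor unrelated to shoe ownership, for contrast.
sex_male           = [1, 0] * 10

r_shoes_exercise = pearson_r(owns_running_shoes, regular_exercise)
vif_shoes_exercise = vif_two_predictors(owns_running_shoes, regular_exercise)
vif_shoes_sex = vif_two_predictors(owns_running_shoes, sex_male)

print(f"r (shoes, exercise):   {r_shoes_exercise:.3f}")
print(f"VIF (shoes, exercise): {vif_shoes_exercise:.2f}")
print(f"VIF (shoes, sex):      {vif_shoes_sex:.2f}")
```

A VIF close to 1 means the two predictors carry separate information; the larger it gets, the wider the confidence intervals become and the harder it is for either variable to show up as an independent predictor.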