Lecture 4: Properties of OLS

Economics 326 — Introduction to Econometrics II

Author

Vadim Marmer, UBC

Properties of Estimators

  1. Randomness
  2. Mean (unbiasedness)
  3. Variance
  4. Distribution

OLS Estimators as Random Variables

  • The model \begin{aligned} &Y_{i} = \alpha + \beta X_{i} + U_{i}, \\ &\E{U_{i} \mid X_1, \ldots, X_n} = 0. \end{aligned} Conditioning on X_1, \ldots, X_n allows us to treat all the X_i’s as fixed, but Y_i is still random.
  • To save on writing, we will use the notation \E{\cdot \mid \mathbf{X}} = \E{\cdot \mid X_1, \ldots, X_n}, where \mathbf{X}=(X_1, \ldots, X_n).
  • The estimators \hat{\beta} = \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) Y_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} \text{ and } \hat{\alpha} = \bar{Y}-\hat{\beta}\bar{X} are random because they are functions of the random Y_i’s even after conditioning on the X_i’s.
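A minimal simulation sketch of this point (Python/NumPy; the values \alpha = 1, \beta = 2 and the normal design are illustrative assumptions, not from the lecture): holding one draw of the X_i’s fixed and redrawing the U_i’s changes the estimates, which is exactly the sense in which \hat{\alpha} and \hat{\beta} are random conditional on \mathbf{X}.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
X = rng.normal(size=n)          # one draw of the regressors, held fixed below

def ols(X, Y):
    # OLS formulas from the slide
    beta_hat = np.sum((X - X.mean()) * Y) / np.sum((X - X.mean()) ** 2)
    alpha_hat = Y.mean() - beta_hat * X.mean()
    return alpha_hat, beta_hat

# Two samples with the same X's but different errors give different estimates
for _ in range(2):
    U = rng.normal(size=n)      # E[U | X] = 0 holds by construction
    Y = 1.0 + 2.0 * X + U       # alpha = 1, beta = 2: illustrative true values
    print(ols(X, Y))
```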

Linearity of Estimators

  • Since \hat{\beta} = \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) Y_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}, we can write \hat{\beta} = \sum_{i=1}^{n}w_{i}Y_{i}, where w_{i} = \frac{X_{i}-\bar{X}}{\sum_{l=1}^{n}\left( X_{l}-\bar{X}\right) ^{2}}. After conditioning on the X_i’s, the w_{i}’s are not random.
  • For \hat{\alpha}, \begin{aligned} \hat{\alpha} &= \bar{Y}-\hat{\beta}\bar{X} \\ &= \frac{1}{n}\sum_{i=1}^{n}Y_{i}-\left( \sum_{i=1}^{n}w_{i}Y_{i}\right) \bar{X} \\ &= \sum_{i=1}^{n}\left( \frac{1}{n}-\bar{X}w_{i}\right) Y_{i} \\ &= \sum_{i=1}^{n}\left( \frac{1}{n}-\bar{X}\frac{X_{i}-\bar{X}}{ \sum_{l=1}^{n}\left( X_{l}-\bar{X}\right) ^{2}}\right) Y_{i}. \end{aligned}
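A quick numerical check of both weighted-sum representations (same illustrative simulated design as in the sketch above):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
X = rng.normal(size=n)
Y = 1.0 + 2.0 * X + rng.normal(size=n)            # illustrative data

w = (X - X.mean()) / np.sum((X - X.mean()) ** 2)  # weights w_i: functions of the X's only

beta_hat = np.sum(w * Y)                          # sum_i w_i Y_i
alpha_hat = np.sum((1.0 / n - X.mean() * w) * Y)  # sum_i (1/n - X_bar * w_i) Y_i

# Both agree with the original formulas up to floating-point error
assert np.isclose(beta_hat, np.sum((X - X.mean()) * Y) / np.sum((X - X.mean()) ** 2))
assert np.isclose(alpha_hat, Y.mean() - beta_hat * X.mean())
```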

Unbiasedness

Definition and Claim

  • \hat{\beta} is called an unbiased estimator of \beta if \E{\hat{\beta}} = \beta.

  • Claim: Suppose that

    • Y_{i}=\alpha +\beta X_{i}+U_{i},
    • \E{U_{i} \mid \mathbf{X}} = 0.

    Then \E{\hat{\beta}}=\beta.

Proof Step 1: Decomposition into signal and noise

\begin{aligned} \hat{\beta} &= \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) Y_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} \\ &= \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) \left( \alpha +\beta X_{i}+U_{i}\right) }{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} \\ &= \alpha \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) }{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} + \beta \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) X_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} + \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} \\ &= \alpha \frac{0}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} + \beta \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}{ \sum_{i=1}^{n}\left(X_{i}-\bar{X}\right) ^{2}} + \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}, \end{aligned}

where the last step uses the identities \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) = 0 and \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) X_{i} = \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}. Therefore,

\hat{\beta}={\color{blue}\underbrace{\beta}_{\text{signal}}} +{\color{red}\underbrace{\frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}}_{\text{noise}}}
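In a simulation the true \beta and the U_i’s are known, so the signal-plus-noise identity can be verified directly (a sketch with illustrative parameter values):

```python
import numpy as np

rng = np.random.default_rng(1)
n, alpha, beta = 100, 1.0, 2.0        # illustrative true parameters
X = rng.normal(size=n)
U = rng.normal(size=n)
Y = alpha + beta * X + U

beta_hat = np.sum((X - X.mean()) * Y) / np.sum((X - X.mean()) ** 2)
noise = np.sum((X - X.mean()) * U) / np.sum((X - X.mean()) ** 2)

# The estimator equals signal (beta) plus noise, up to floating-point error
assert np.isclose(beta_hat, beta + noise)
```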

Proof Step 2: Conditioning on Regressors

  • Once we condition on \mathbf{X}, all the X_i’s in \hat{\beta}=\beta +\frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} can be treated as fixed.
  • Thus, \begin{aligned} \E{\hat{\beta} \mid \mathbf{X}} & = \E{\beta +\frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} \mid \mathbf{X}} \\ &= \beta + \E{\frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} \mid \mathbf{X}} \\ &= \beta + \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) \E{U_{i} \mid \mathbf{X}} }{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}. \end{aligned}

Proof Step 3

  • Thus, with \E{U_{i} \mid \mathbf{X}} = 0, we have \begin{aligned} \E{\hat{\beta} \mid \mathbf{X}} &= \beta +\frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) \E{U_{i} \mid \mathbf{X}}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} \\ &= \beta +\frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) \cdot 0}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} = \beta. \end{aligned}
  • By the law of iterated expectations (LIE), \E{\hat{\beta}} = \E{\E{\hat{\beta} \mid \mathbf{X}}} = \E{\beta} = \beta.
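A Monte Carlo sketch of the claim (all parameter values illustrative; X is drawn once and held fixed across replications, mimicking conditioning on \mathbf{X}):

```python
import numpy as np

rng = np.random.default_rng(2)
n, beta, reps = 50, 2.0, 10_000
X = rng.normal(size=n)            # one fixed draw of X: we work conditionally on X

draws = np.empty(reps)
for r in range(reps):
    U = rng.normal(size=n)        # E[U | X] = 0 holds by construction
    Y = 1.0 + beta * X + U
    draws[r] = np.sum((X - X.mean()) * Y) / np.sum((X - X.mean()) ** 2)

print(draws.mean())               # close to beta = 2, consistent with unbiasedness
```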

Strong Exogeneity of Regressors

  • \mathbf{X}=(X_1,\ldots,X_n) are strongly exogenous if \E{U_{i} \mid \mathbf{X}} = 0 for all i = 1, \ldots, n.
  • Alternatively, we can assume that \E{U_{i} \mid X_{i}} = 0 and that the observations are independent, in which case \begin{aligned} \E{U_{1} \mid \mathbf{X}} &= \E{U_{1} \mid X_{1}} = 0, \\ \E{U_{2} \mid \mathbf{X}} &= \E{U_{2} \mid X_{2}} = 0, \text{ and so on.} \end{aligned}
  • The OLS estimator is in general biased if the strong exogeneity assumption is violated.
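The bias is easy to see in a simulation where the errors depend on the regressors (the coefficient 0.5 below is an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(3)
n, beta, reps = 50, 2.0, 10_000

draws = np.empty(reps)
for r in range(reps):
    X = rng.normal(size=n)
    U = 0.5 * X + rng.normal(size=n)   # E[U | X] = 0.5 X != 0: strong exogeneity fails
    Y = 1.0 + beta * X + U
    draws[r] = np.sum((X - X.mean()) * Y) / np.sum((X - X.mean()) ** 2)

print(draws.mean())                    # roughly beta + 0.5 = 2.5 rather than beta = 2
```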

Variance of the Slope Estimator

Variance Formula and Homoskedasticity

  • If Y_{i}=\alpha +\beta X_{i}+U_{i}, \E{U_{i} \mid \mathbf{X}} = 0, \E{U_{i}^{2} \mid \mathbf{X}} = \sigma^{2} = \text{constant}, and \E{U_{i}U_{j} \mid \mathbf{X}} = 0 for i \neq j, then \Var{\hat{\beta} \mid \mathbf{X}} = \frac{\sigma^{2}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}.
  • The assumption \E{U_{i}^{2} \mid \mathbf{X}} = \sigma^{2} = \text{constant} is called (conditional) homoskedasticity.
  • The assumption \E{U_{i}U_{j} \mid \mathbf{X}} = 0 for i \neq j can be replaced by the assumption that the observations are independent.

Determinants of Variance

\Var{\hat{\beta} \mid \mathbf{X}} = \frac{\sigma^{2}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}.

  • The variance of \hat{\beta} is positively related to the variance of the errors \sigma^{2} = \Var{U_{i}}.
  • The variance of \hat{\beta} is smaller when the X_i’s are more dispersed.
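Both comparative statics can be read off the formula; for example, doubling the spread of the X_i’s multiplies \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right)^{2} by four and so cuts the variance to a quarter. A sketch with an illustrative \sigma^{2}:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=50)
sigma2 = 1.0                                  # illustrative error variance

def var_beta_hat(x):
    # Conditional variance formula from the slide
    return sigma2 / np.sum((x - x.mean()) ** 2)

print(var_beta_hat(X), var_beta_hat(2 * X))   # doubling the spread quarters the variance
```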

Derivation of Variance

  • We have \hat{\beta}=\beta +\frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X} \right) U_{i}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} and \E{\hat{\beta} \mid \mathbf{X}}=\beta. \begin{aligned} \Var{\hat{\beta} \mid \mathbf{X}} & = \E{\left( \hat{\beta}-\E{\hat{\beta} \mid \mathbf{X}}\right) ^{2} \mid \mathbf{X}} \\ &= \E{\left( \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}\right) ^{2} \mid \mathbf{X}} \\ &= \left( \frac{1}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}\right) ^{2} \E{\left( \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}\right) ^{2} \mid \mathbf{X}}. \end{aligned}

  • Expanding the square, \begin{aligned} \left( \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}\right) ^{2} &= \sum_{i=1}^{n}\sum_{j=1}^{n}\left( X_{i}-\bar{X}\right) \left( X_{j}-\bar{X}\right) U_{i}U_{j} \\ &= \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}U_{i}^{2}\\ &\quad + \sum_{i=1}^{n}\sum_{j\neq i}\left( X_{i}-\bar{X}\right) \left( X_{j}-\bar{X}\right) U_{i}U_{j}. \end{aligned}

  • Since \E{U_{i}U_{j} \mid \mathbf{X}} = 0 for i \neq j, \begin{aligned} \E{\left( \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}\right) ^{2} \mid \mathbf{X}} &= \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}\E{U_{i}^{2} \mid \mathbf{X}} + 0 \\ &= \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}\sigma^{2}. \end{aligned}

  • We have \begin{aligned} &\Var{\hat{\beta} \mid \mathbf{X}} = \left( \frac{1}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}\right) ^{2} \E{\left( \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}\right) ^{2} \mid \mathbf{X}}, \\ &\E{\left( \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}\right) ^{2} \mid \mathbf{X}} = \sigma^{2}\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}, \end{aligned} and therefore, \begin{aligned} \Var{\hat{\beta} \mid \mathbf{X}} &= \left( \frac{1}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}\right) ^{2} \sigma^{2}\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2} \\ &= \left( \frac{1}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}\right) \sigma^{2}. \end{aligned}
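A simulation check of the derivation (illustrative \sigma = 1.5; X is held fixed across replications so that the Monte Carlo variance approximates the conditional variance):

```python
import numpy as np

rng = np.random.default_rng(5)
n, sigma, reps = 50, 1.5, 20_000
X = rng.normal(size=n)                 # fixed across replications

draws = np.empty(reps)
for r in range(reps):
    U = sigma * rng.normal(size=n)     # homoskedastic, uncorrelated errors
    Y = 1.0 + 2.0 * X + U
    draws[r] = np.sum((X - X.mean()) * Y) / np.sum((X - X.mean()) ** 2)

print(draws.var())                               # Monte Carlo variance of beta_hat
print(sigma**2 / np.sum((X - X.mean()) ** 2))    # formula from the derivation
```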

Distribution of the Slope Estimator

Normality of the OLS Estimator

  • Assume that the U_{i}’s are jointly normally distributed conditional on the X_i’s.
  • Then Y_{i}=\alpha +\beta X_{i}+U_{i}, i = 1, \ldots, n, are also jointly normally distributed conditional on the X_i’s.
  • Since \hat{\beta}=\sum_{i=1}^{n}w_{i}Y_{i}, where the weights w_{i}=\frac{X_{i}-\bar{X}}{\sum_{l=1}^{n}\left( X_{l}-\bar{X}\right) ^{2}} depend only on the X_i’s, \hat{\beta} is a linear combination of jointly normal random variables and is therefore also normally distributed conditional on the X_i’s.
  • Conditional on \mathbf{X}, \begin{aligned} \hat{\beta} \mid \mathbf{X}&\sim N\left( \E{\hat{\beta} \mid \mathbf{X}}, \Var{\hat{\beta} \mid \mathbf{X}} \right) \\ &\sim N\left( \beta, \frac{\sigma^{2}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}\right). \end{aligned}
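A simulation sketch of this result (\sigma = 1 and the remaining values are illustrative assumptions): standardizing \hat{\beta} by the conditional mean and standard deviation above should produce draws that behave like a standard normal, e.g. about 95% of them should fall within \pm 1.96.

```python
import numpy as np

rng = np.random.default_rng(6)
n, beta, reps = 30, 2.0, 20_000
X = rng.normal(size=n)                              # fixed regressors
se = 1.0 / np.sqrt(np.sum((X - X.mean()) ** 2))     # sd of beta_hat when sigma = 1

z = np.empty(reps)
for r in range(reps):
    U = rng.normal(size=n)                          # normal errors with sigma = 1
    Y = 1.0 + beta * X + U
    b = np.sum((X - X.mean()) * Y) / np.sum((X - X.mean()) ** 2)
    z[r] = (b - beta) / se                          # standardize using the slide's formula

print(np.mean(np.abs(z) <= 1.96))                   # close to 0.95 if z is standard normal
```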