Lecture 6: Estimating the variance of errors

Economics 326 — Introduction to Econometrics II

Author

Vadim Marmer, UBC

The importance of \sigma^2

We use the notation \mathrm{E}\left[\cdot \mid \mathbf{X}\right] = \mathrm{E}\left[\cdot \mid X_1, \ldots, X_n\right].

  • The variance of \hat{\beta} depends on the unknown \sigma^{2} = \mathrm{E}\left[U_i^{2}\right]: \mathrm{Var}\left(\hat{\beta} \mid \mathbf{X}\right) = \frac{\sigma^{2}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right)^{2}}.
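  • If the errors U_{i} were observable, we could estimate \sigma^{2} by \frac{1}{n}\sum_{i=1}^{n}U_{i}^{2}, which is unbiased but infeasible because the U_{i} are never observed.
  • Replacing the errors with the residuals \hat{U}_{i}=Y_{i}-\hat{\alpha}-\hat{\beta}X_{i} gives a feasible estimator \hat{\sigma}^{2}=\frac{1}{n}\sum_{i=1}^{n}\hat{U}_{i}^{2}, but \hat{\sigma}^{2} is biased downward: \mathrm{E}\left[\hat{\sigma}^{2}\mid \mathbf{X}\right] =\frac{n-2}{n}\sigma^{2}.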

An unbiased estimator of \sigma^2

  • An unbiased estimator of \sigma^{2} is s^{2}=\frac{1}{n-2}\sum_{i=1}^{n}\hat{U}_{i}^{2}.
  • Assumptions:
    1. Y_{i}=\alpha +\beta X_{i}+U_{i}.
    2. \mathrm{E}\left[U_{i}\mid \mathbf{X}\right] =0 for all i.
    3. \mathrm{E}\left[U_{i}^{2}\mid \mathbf{X}\right] =\sigma ^{2} for all i.
    4. \mathrm{E}\left[U_{i}U_{j}\mid \mathbf{X}\right] =0 for all i\neq j.
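  • The residuals \hat{U}_{i}=Y_{i}-\hat{\alpha}-\hat{\beta}X_{i} depend on the two estimated parameters \hat{\alpha} and \hat{\beta}; dividing by n-2 instead of n corrects for the two degrees of freedom used in estimating \alpha and \beta.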
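The difference between dividing by n and by n-2 can be illustrated with a small Monte Carlo experiment. The sketch below is not part of the lecture: the true coefficients, the normal errors, the sample size, and the helper names are all assumptions chosen for illustration. It holds the regressors fixed, redraws the errors many times, and averages both estimators.

```python
import numpy as np

def ols_residuals(x, y):
    """Residuals from the simple regression of y on a constant and x."""
    x_dev = x - x.mean()
    beta_hat = np.sum(x_dev * (y - y.mean())) / np.sum(x_dev ** 2)
    alpha_hat = y.mean() - beta_hat * x.mean()
    return y - alpha_hat - beta_hat * x

def simulate_variance_estimators(n=10, sigma2=4.0, reps=20000, seed=0):
    """Average the n-divisor and (n-2)-divisor estimators over many samples."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)        # regressors drawn once, then held fixed
    alpha, beta = 1.0, 2.0        # hypothetical true coefficients
    sigma2_hat = np.empty(reps)
    s2 = np.empty(reps)
    for r in range(reps):
        u = rng.normal(scale=np.sqrt(sigma2), size=n)  # homoskedastic errors
        y = alpha + beta * x + u
        ssr = np.sum(ols_residuals(x, y) ** 2)
        sigma2_hat[r] = ssr / n          # biased estimator
        s2[r] = ssr / (n - 2)            # unbiased estimator
    return sigma2_hat.mean(), s2.mean()

mean_sigma2_hat, mean_s2 = simulate_variance_estimators()
print(mean_sigma2_hat, mean_s2)
```

With n=10 and \sigma^{2}=4, the average of \hat{\sigma}^{2} should settle near \frac{n-2}{n}\sigma^{2}=3.2, while the average of s^{2} should settle near 4, up to simulation noise.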

Expressing \hat{U}_i in terms of U_i

From \begin{align*} \hat{U}_{i} &= \left( Y_{i}-\bar{Y}\right) -\hat{\beta}\left( X_{i}-\bar{X}\right), \\ Y_{i}-\bar{Y} &= \beta \left( X_{i}-\bar{X}\right) +U_{i}-\bar{U}, \end{align*} we get \hat{U}_{i}=\left( U_{i}-\bar{U}\right) -\left( \hat{\beta}-\beta \right)\left( X_{i}-\bar{X}\right).
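The identity above can be checked numerically on simulated data, where the errors are known by construction. This is only an illustrative sketch: the true coefficients, the sample size, and the variable names are assumptions, not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
alpha, beta = 0.5, -1.5          # hypothetical true coefficients
x = rng.normal(size=n)
u = rng.normal(size=n)           # errors, observable only in a simulation
y = alpha + beta * x + u

# OLS estimates for the simple regression of y on a constant and x
x_dev = x - x.mean()
beta_hat = np.sum(x_dev * (y - y.mean())) / np.sum(x_dev ** 2)
alpha_hat = y.mean() - beta_hat * x.mean()

# Left-hand side: residuals computed from the data
residuals = y - alpha_hat - beta_hat * x
# Right-hand side: (U_i - U_bar) - (beta_hat - beta)(X_i - X_bar)
identity_rhs = (u - u.mean()) - (beta_hat - beta) * x_dev

max_gap = np.max(np.abs(residuals - identity_rhs))
print(max_gap)
```

The printed gap should be zero up to floating-point rounding, for any seed or choice of coefficients.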

Expanding \sum_{i=1}^{n}\hat{U}_i^2

\hat{U}_{i}^{2}=\left( U_{i}-\bar{U}\right) ^{2}+\left( \hat{\beta}-\beta \right)^{2}\left( X_{i}-\bar{X}\right)^{2}-2\left( \hat{\beta}-\beta \right) \left( X_{i}-\bar{X}\right) \left( U_{i}-\bar{U}\right). Thus, \sum_{i=1}^{n}\hat{U}_{i}^{2} =\sum_{i=1}^{n}\left( U_{i}-\bar{U}\right) ^{2}+\left( \hat{\beta}-\beta \right) ^{2}\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}-2\left( \hat{\beta}-\beta \right) \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) \left( U_{i}-\bar{U}\right).

To show \mathrm{E}\left[\sum_{i=1}^{n}\hat{U}_{i}^{2}\right]=\left( n-2\right) \sigma ^{2}, we verify the three expectations below.

Expectation of \sum_{i=1}^{n} \left(U_i-\bar{U}\right)^2

\begin{align*} \sum_{i=1}^{n}\left( U_{i}-\bar{U}\right) ^{2} &= \sum_{i=1}^{n}U_{i}^{2}-\frac{1}{n}\left( \sum_{i=1}^{n}U_{i}\right) ^{2} \\ &= \sum_{i=1}^{n}U_{i}^{2}-\frac{1}{n}\left( \sum_{i=1}^{n}U_{i}^{2}+\sum_{i=1}^{n}\sum_{j\neq i}U_{i}U_{j}\right). \end{align*} Taking expectations, Assumption 3 gives \mathrm{E}\left[U_{i}^{2}\right]=\sigma^{2} and Assumption 4 gives \mathrm{E}\left[U_{i}U_{j}\right]=0 for i\neq j, so the cross-product terms vanish and \mathrm{E}\left[\sum_{i=1}^{n}\left( U_{i}-\bar{U}\right) ^{2}\right]=n\sigma ^{2}-\frac{1}{n}n\sigma ^{2}=\left( n-1\right) \sigma ^{2}.

Expectation of \left( \hat{\beta}-\beta \right)^{2}\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right)^{2}

Because \mathrm{E}\left[\hat{\beta}\right]=\beta (conditionally on \mathbf{X}), \mathrm{E}\left[\left( \hat{\beta}-\beta \right) ^{2}\right]=\mathrm{Var}\left(\hat{\beta}\right)=\frac{\sigma ^{2}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}. Since \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2} is fixed once we condition on \mathbf{X}, it can be taken outside the expectation. Hence, \mathrm{E}\left[\left( \hat{\beta}-\beta \right) ^{2}\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}\right]=\sigma^{2}.

Expectation of \left( \hat{\beta} - \beta\right) \sum_{i=1}^n \left(X_i - \bar{X}\right)\left(U_i-\bar{U}\right)

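Note that \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) \left( U_{i}-\bar{U}\right) =\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}, because \bar{U}\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) =0, and \hat{\beta}-\beta =\frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}. Therefore, \begin{align*} \left( \hat{\beta}-\beta \right) \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right)\left( U_{i}-\bar{U}\right) &=\frac{1}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}\left( \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}\right) ^{2}. \end{align*} Expanding the square and applying Assumptions 3 and 4, \mathrm{E}\left[\left( \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}\right) ^{2}\mid \mathbf{X}\right] =\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}\mathrm{E}\left[U_{i}^{2}\mid \mathbf{X}\right] +\sum_{i=1}^{n}\sum_{j\neq i}\left( X_{i}-\bar{X}\right) \left( X_{j}-\bar{X}\right) \mathrm{E}\left[U_{i}U_{j}\mid \mathbf{X}\right] =\sigma ^{2}\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}. Hence, conditionally on \mathbf{X}, \mathrm{E}\left[\left( \hat{\beta}-\beta \right) \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) \left( U_{i}-\bar{U}\right)\right] = \frac{1}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}\left( \sigma ^{2}\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}\right)=\sigma ^{2}.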

Putting it together

Using the three expectations above, \mathrm{E}\left[\sum_{i=1}^{n}\hat{U}_{i}^{2}\right]=\left( n-1\right) \sigma ^{2}+\sigma^{2}-2\sigma ^{2}=\left( n-2\right) \sigma ^{2}, so s^{2} is unbiased for \sigma^{2}.
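As a sanity check on the three expectations, one can simulate the three terms of the expansion directly. The following is a rough sketch under assumed settings (normal errors, n=6, \sigma^{2}=2, fixed regressors); none of these choices or variable names come from the lecture.

```python
import numpy as np

rng = np.random.default_rng(3)
n, sigma2, reps = 6, 2.0, 200000
x = rng.normal(size=n)                  # regressors held fixed across replications
x_dev = x - x.mean()
sxx = np.sum(x_dev ** 2)

# Each row of u is one sample of n homoskedastic, uncorrelated errors
u = rng.normal(scale=np.sqrt(sigma2), size=(reps, n))
u_dev = u - u.mean(axis=1, keepdims=True)
beta_err = (u @ x_dev) / sxx            # beta_hat - beta in each replication

term1 = np.sum(u_dev ** 2, axis=1)      # sum of (U_i - U_bar)^2
term2 = beta_err ** 2 * sxx             # (beta_hat - beta)^2 * sum of (X_i - X_bar)^2
term3 = beta_err * (u_dev @ x_dev)      # cross term, before the factor -2
ssr = term1 + term2 - 2 * term3         # sum of squared residuals via the expansion

mean_term1, mean_term2, mean_term3 = term1.mean(), term2.mean(), term3.mean()
mean_ssr = ssr.mean()
print(mean_term1, mean_term2, mean_term3, mean_ssr)
```

Under these settings the averages should be close to \left( n-1\right) \sigma^{2}=10, \sigma^{2}=2, \sigma^{2}=2, and \left( n-2\right) \sigma^{2}=8 respectively, up to simulation noise.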

Estimating the variance of \hat{\beta}

  • Variance of \hat{\beta} (conditional on \mathbf{X}): \mathrm{Var}\left(\hat{\beta}\right) =\frac{\sigma ^{2}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}.
  • Estimator of \sigma ^{2}: s^{2}=\frac{1}{n-2}\sum_{i=1}^{n}\hat{U}_{i}^{2}=\frac{1}{n-2}\sum_{i=1}^{n}\left( Y_{i}-\hat{\alpha}-\hat{\beta}X_{i}\right) ^{2}.
  • Estimator of the variance of \hat{\beta}: \widehat{\mathrm{Var}}\left( \hat{\beta}\right) =\frac{s^{2}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}.
  • Standard error of \hat{\beta}: \mathrm{SE}\left( \hat{\beta}\right) =\sqrt{\widehat{\mathrm{Var}}\left( \hat{\beta}\right)}=\sqrt{\frac{s^{2}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}}.
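The formulas in this list translate into a few lines of code. The function below is a minimal sketch; the function name and the four data points are made up for illustration and do not come from the lecture.

```python
import numpy as np

def slope_standard_error(x, y):
    """Return (beta_hat, s2, var_hat, se) for the regression of y on a constant and x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    x_dev = x - x.mean()
    sxx = np.sum(x_dev ** 2)                      # sum of (X_i - X_bar)^2
    beta_hat = np.sum(x_dev * (y - y.mean())) / sxx
    alpha_hat = y.mean() - beta_hat * x.mean()
    residuals = y - alpha_hat - beta_hat * x
    s2 = np.sum(residuals ** 2) / (n - 2)         # unbiased estimator of sigma^2
    var_hat = s2 / sxx                            # estimated variance of beta_hat
    return beta_hat, s2, var_hat, np.sqrt(var_hat)

# Hypothetical data, small enough to verify by hand
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 2.0, 4.0]
beta_hat, s2, var_hat, se = slope_standard_error(x, y)
print(beta_hat, s2, var_hat, se)
```

For these four points, \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right)^{2}=5, the residuals are -0.3, 0.9, -0.9, 0.3, so s^{2}=1.8/2=0.9, \widehat{\mathrm{Var}}\left( \hat{\beta}\right)=0.18, and \mathrm{SE}\left( \hat{\beta}\right) is approximately 0.424.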