The variance of \hat{\beta} depends on the unknown \sigma^{2} = \mathrm{E}\left[U_i^{2} \mid \mathbf{X}\right]:
\mathrm{Var}\left(\hat{\beta} \mid \mathbf{X}\right) = \frac{\sigma^{2}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right)^{2}}.
If the errors U_i were observable, we could estimate \sigma^{2} by \frac{1}{n}\sum_{i=1}^{n}U_{i}^{2}, which is unbiased. This is infeasible, however, because the U_i are unobservable.
Using sample residuals instead, \hat{U}_{i}=Y_{i}-\hat{\alpha}-\hat{\beta}X_{i}, gives a feasible estimator:
\hat{\sigma}^{2}=\frac{1}{n}\sum_{i=1}^{n}\hat{U}_{i}^{2},
However, \hat{\sigma}^{2} is biased: it underestimates \sigma^{2} on average.
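As a sanity check, the conditional-variance formula above can be verified by simulation. The following base-R sketch (made-up data, not the course dataset) holds the regressor fixed and draws fresh errors each replication:

```r
# Simulation sketch: Var(beta-hat | X) vs. sigma^2 / sum((X - Xbar)^2)
set.seed(3)
n <- 25
sigma <- 2
x <- rnorm(n)                              # regressor, held fixed across draws
bhat <- replicate(5000, {
  y <- 1 + 0.5 * x + rnorm(n, sd = sigma)  # fresh errors each replication
  coef(lm(y ~ x))[2]
})
c(simulated = var(bhat), formula = sigma^2 / sum((x - mean(x))^2))
```

The two numbers agree up to simulation noise.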
An unbiased estimator of \sigma^2
An unbiased estimator of \sigma^{2} is
s^{2}=\frac{1}{n-2}\sum_{i=1}^{n}\hat{U}_{i}^{2}.
Assumptions:
Y_{i}=\alpha +\beta X_{i}+U_{i}.
\mathrm{E}\left[U_{i}\mid \mathbf{X}\right] =0 for all i.
\mathrm{E}\left[U_{i}^{2}\mid \mathbf{X}\right] =\sigma ^{2} for all i.
\mathrm{E}\left[U_{i}U_{j}\mid \mathbf{X}\right] =0 for all i\neq j.
Since \hat{U}_{i}=Y_{i}-\hat{\alpha}-\hat{\beta}X_{i}, dividing by n-2 adjusts for estimating two parameters: \alpha and \beta.
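The degrees-of-freedom correction can be seen in a quick Monte Carlo (a sketch on simulated data; n is kept small so the bias is visible):

```r
# Simulation sketch: the 1/n estimator is biased down, the 1/(n-2) one is not
set.seed(1)
n <- 20
sigma2 <- 4
x <- rnorm(n)                         # regressor, held fixed across draws
est <- replicate(5000, {
  u <- rnorm(n, sd = sqrt(sigma2))
  y <- 1 + 2 * x + u
  uhat <- resid(lm(y ~ x))            # sample residuals
  c(sigma2_hat = sum(uhat^2) / n, s2 = sum(uhat^2) / (n - 2))
})
rowMeans(est)                         # sigma2_hat below 4, s2 close to 4
```

The mean of the 1/n version sits near (n-2)/n times \sigma^{2}, while the 1/(n-2) version centers on \sigma^{2} itself.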
Expressing \hat{U}_i in terms of U_i
\hat{U}_{i}=Y_{i}-\hat{\alpha}-\hat{\beta}X_{i}
\hat{\alpha}=\bar{Y}-\hat{\beta}\bar{X}, so
\hat{U}_{i} = \left( Y_{i}-\bar{Y}\right) -\hat{\beta}\left( X_{i}-\bar{X}\right)
Also, averaging Y_{i}=\alpha +\beta X_{i}+U_{i} over i and subtracting gives
Y_{i}-\bar{Y} = \beta \left( X_{i}-\bar{X}\right) +U_{i}-\bar{U}.
Combining the two displays, we have the following relationship between \hat{U}_i and U_i:
\hat{U}_{i}=U_{i}-\bar{U} -\left( \hat{\beta}-\beta \right)\left( X_{i}-\bar{X}\right).
\hat{U}_i is related to U_i, but it is contaminated by the estimation errors.
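The centered form of the residual is easy to confirm numerically (a sketch on arbitrary simulated data):

```r
# Check: (Y_i - Ybar) - bhat * (X_i - Xbar) reproduces lm()'s residuals
set.seed(7)
n <- 30
x <- runif(n)
y <- 2 + 3 * x + rnorm(n)
fit <- lm(y ~ x)
bhat <- coef(fit)["x"]
uhat_centered <- (y - mean(y)) - bhat * (x - mean(x))
all.equal(unname(uhat_centered), unname(resid(fit)))   # TRUE
```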
Estimator of the variance of \hat{\beta}:
\widehat{\mathrm{Var}}\left(\hat{\beta}\right) =\frac{s^{2}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}.
Standard error of \hat{\beta}: \begin{align*}
\mathrm{se}\left(\hat{\beta}\right) &= \sqrt{\widehat{\mathrm{Var}}\left(\hat{\beta}\right)} \\
&= \sqrt{\frac{s^{2}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}}.
\end{align*}
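Putting the two formulas together, the standard error reported by lm() can be reproduced by hand (a sketch on simulated data):

```r
# Check: se(beta-hat) = sqrt(s^2 / sum((x - xbar)^2)) matches summary(lm())
set.seed(42)
n <- 50
x <- rnorm(n)
y <- 1 + 0.5 * x + rnorm(n)
fit <- lm(y ~ x)
s2 <- sum(resid(fit)^2) / (n - 2)          # unbiased estimator of sigma^2
se_by_hand <- sqrt(s2 / sum((x - mean(x))^2))
c(by_hand = se_by_hand,
  from_lm = coef(summary(fit))["x", "Std. Error"])
```

The two entries coincide, since this is exactly how lm() computes the standard error under homoskedasticity.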
Example in R
Regress hourly wage on years of education using the wage1 dataset from Wooldridge:
library(wooldridge)
data("wage1")
fit <- lm(wage ~ educ, data = wage1)
summary(fit)
Call:
lm(formula = wage ~ educ, data = wage1)
Residuals:
Min 1Q Median 3Q Max
-5.3396 -2.1501 -0.9674 1.1921 16.6085
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.90485 0.68497 -1.321 0.187
educ 0.54136 0.05325 10.167 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 3.378 on 524 degrees of freedom
Multiple R-squared: 0.1648, Adjusted R-squared: 0.1632
F-statistic: 103.4 on 1 and 524 DF, p-value: < 2.2e-16
Std. Error column: the standard errors \mathrm{se}\left(\hat{\alpha}\right) and \mathrm{se}\left(\hat{\beta}\right).
Residual standard error: s, the square root of s^2; here s = 3.378, so s^2 = 3.378^2 \approx 11.41.
The estimate s^2 can also be extracted directly via summary(fit)$sigma^2.
Expanding \sum \hat{U}_i^2
Since \hat{U}_{i}=U_{i}-\bar{U}-\left( \hat{\beta}-\beta \right) \left( X_{i}-\bar{X}\right), the squared residual is \begin{align*}
\hat{U}_{i}^{2} &= \left( U_{i}-\bar{U}\right) ^{2}+\left( \hat{\beta}-\beta \right)^{2}\left( X_{i}-\bar{X}\right)^{2} \\
&\quad -2\left( \hat{\beta}-\beta \right) \left( X_{i}-\bar{X}\right) \left( U_{i}-\bar{U}\right).
\end{align*}
Thus, \begin{align*}
\sum_{i=1}^{n}\hat{U}_{i}^{2} &=\sum_{i=1}^{n}\left( U_{i}-\bar{U}\right) ^{2}\\
&\quad+\left( \hat{\beta}-\beta \right) ^{2}\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2} \\
&\quad-2\left( \hat{\beta}-\beta \right) \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) \left( U_{i}-\bar{U}\right).
\end{align*}
To show \mathrm{E}\left[\sum_{i=1}^{n}\hat{U}_{i}^{2}\mid \mathbf{X}\right]=\left( n-2\right) \sigma ^{2}, we compute the conditional expectation of each term on the RHS.
Claim 1:
\mathrm{E}\left[\sum_{i=1}^{n}\left( U_{i}-\bar{U}\right) ^{2}\mid \mathbf{X}\right]=\left( n-1\right) \sigma ^{2}.
Claim 2:
\mathrm{E}\left[\left( \hat{\beta}-\beta \right) ^{2}\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}\mid \mathbf{X}\right] =\sigma ^{2}.
Claim 3:
-2\,\mathrm{E}\left[\left( \hat{\beta}-\beta \right) \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) \left( U_{i}-\bar{U}\right)\mid \mathbf{X}\right] =-2\sigma ^{2}.
Proof of Claim 1
\begin{align*}
\sum_{i=1}^{n}\left( U_{i}-\bar{U}\right) ^{2}
&= \sum_{i=1}^{n}U_{i}^{2}-\frac{1}{n}\left( \sum_{i=1}^{n}U_{i}\right) ^{2} \\
&= \sum_{i=1}^{n}U_{i}^{2}-\frac{1}{n}\left( \sum_{i=1}^{n}U_{i}^{2}+\sum_{i=1}^{n}\sum_{j\neq i}U_{i}U_{j}\right).
\end{align*}
Taking conditional expectations and using the assumptions,
\begin{align*}
\mathrm{E}\left[\sum_{i=1}^{n}\left( U_{i}-\bar{U}\right) ^{2}\mid \mathbf{X}\right]
&= n\sigma ^{2}-\frac{1}{n}\cdot n\sigma ^{2} \\
&= \left( n-1\right) \sigma ^{2}.
\end{align*}
Proof of Claim 2
Because \mathrm{E}\left[\hat{\beta}\mid \mathbf{X}\right]=\beta,
\begin{align*}
\mathrm{E}\left[\left( \hat{\beta}-\beta \right) ^{2}\mid \mathbf{X}\right]
&= \mathrm{Var}\left(\hat{\beta}\mid \mathbf{X}\right) \\
&= \frac{\sigma ^{2}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}.
\end{align*}
Hence,
\mathrm{E}\left[\left( \hat{\beta}-\beta \right) ^{2}\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}\mid \mathbf{X}\right] =\sigma^{2}.
Proof of Claim 3
Note that
\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) \left( U_{i}-\bar{U}\right) =\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i},
and
\hat{\beta}-\beta =\frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}.
Therefore,
\left( \hat{\beta}-\beta \right) \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right)\left( U_{i}-\bar{U}\right) =\frac{\left( \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}\right) ^{2}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}.
Conditionally on \mathbf{X}, assumptions 3 and 4 give \mathrm{E}\left[\left( \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}\right)^{2}\mid \mathbf{X}\right]=\sigma^{2}\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right)^{2}, so
\begin{align*}
\mathrm{E}\left[\left( \hat{\beta}-\beta \right) \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) \left( U_{i}-\bar{U}\right)\mid \mathbf{X}\right]
&= \frac{\sigma ^{2}\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} \\
&= \sigma ^{2}.
\end{align*}
Putting it together
Using the three claims, \begin{align*}
\mathrm{E}\left[\sum_{i=1}^{n}\hat{U}_{i}^{2}\mid \mathbf{X}\right]
&= \left( n-1\right) \sigma ^{2}+\sigma^{2}-2\sigma ^{2} \\
&= \left( n-2\right) \sigma ^{2},
\end{align*} so s^{2} is unbiased for \sigma^{2}.
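The intermediate result that E[\sum(U_i - \bar{U})^2] = (n-1)\sigma^2 lends itself to a direct simulation check, since it involves only the errors themselves (a sketch with simulated errors):

```r
# Check: E[ sum (U_i - Ubar)^2 ] = (n - 1) * sigma^2
set.seed(9)
n <- 12
sigma2 <- 9
vals <- replicate(20000, {
  u <- rnorm(n, sd = sqrt(sigma2))
  sum((u - mean(u))^2)
})
c(simulated = mean(vals), theory = (n - 1) * sigma2)   # theory = 99
```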