Lecture 4: Properties of OLS
Economics 326 — Introduction to Econometrics II
Properties of Estimators
OLS Estimators as Random Variables
- The model \begin{aligned} Y_{i} &= \alpha + \beta X_{i} + U_{i}, \\ E\left( U_{i} \mid X_{1}, \ldots, X_{n} \right) &= 0. \end{aligned} Conditioning on the X’s in E\left( U_{i} \mid X_{1}, \ldots, X_{n} \right) = 0 allows us to treat them as fixed; the Y’s remain random.
- The estimators \hat{\beta} = \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) Y_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} \text{ and } \hat{\alpha} = \bar{Y}-\hat{\beta}\bar{X} are random because they are functions of random data.
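To make the randomness concrete, here is a minimal simulation sketch (hypothetical parameter values \alpha = 1, \beta = 2, \sigma = 1, with numpy as the only dependency): the X’s are held fixed while repeated error draws produce different estimates each time.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta, sigma, n = 1.0, 2.0, 1.0, 50  # hypothetical true values
X = rng.uniform(0, 10, n)                  # regressors held fixed across samples

for rep in range(3):
    U = rng.normal(0, sigma, n)            # a fresh error draw for each sample
    Y = alpha + beta * X + U               # Y is random because U is random
    beta_hat = np.sum((X - X.mean()) * Y) / np.sum((X - X.mean()) ** 2)
    alpha_hat = Y.mean() - beta_hat * X.mean()
    print(f"sample {rep}: alpha_hat = {alpha_hat:.3f}, beta_hat = {beta_hat:.3f}")
```

Each pass through the loop yields different values of \hat{\alpha} and \hat{\beta}, even though the X’s never change.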
Linearity of Estimators
- Since \hat{\beta} = \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) Y_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}, we can write \hat{\beta} = \sum_{i=1}^{n}w_{i}Y_{i}, where w_{i} = \frac{X_{i}-\bar{X}}{\sum_{l=1}^{n}\left( X_{l}-\bar{X}\right) ^{2}}. After conditioning on X’s, w_{i}’s are not random.
- For \hat{\alpha}, \begin{aligned} \hat{\alpha} &= \bar{Y}-\hat{\beta}\bar{X} \\ &= \frac{1}{n}\sum_{i=1}^{n}Y_{i}-\left( \sum_{i=1}^{n}w_{i}Y_{i}\right) \bar{X} \\ &= \sum_{i=1}^{n}\left( \frac{1}{n}-\bar{X}w_{i}\right) Y_{i} \\ &= \sum_{i=1}^{n}\left( \frac{1}{n}-\bar{X}\frac{X_{i}-\bar{X}}{ \sum_{l=1}^{n}\left( X_{l}-\bar{X}\right) ^{2}}\right) Y_{i}. \end{aligned}
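As a quick numerical check of these linear-in-Y representations (a sketch with made-up data, not from the lecture itself), both weighted sums reproduce the usual OLS formulas exactly:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20
X = rng.normal(size=n)
Y = 1.0 + 2.0 * X + rng.normal(size=n)             # arbitrary illustrative data

w = (X - X.mean()) / np.sum((X - X.mean()) ** 2)   # the weights w_i

# Standard OLS formulas
beta_hat = np.sum((X - X.mean()) * Y) / np.sum((X - X.mean()) ** 2)
alpha_hat = Y.mean() - beta_hat * X.mean()

# Linear-in-Y representations from above
beta_lin = np.sum(w * Y)
alpha_lin = np.sum((1 / n - X.mean() * w) * Y)

print(np.allclose(beta_hat, beta_lin))    # True
print(np.allclose(alpha_hat, alpha_lin))  # True
```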
Unbiasedness
Definition of Unbiasedness
- \hat{\beta} is called an unbiased estimator if E\hat{\beta} = \beta.
- Suppose that Y_{i}=\alpha +\beta X_{i}+U_{i}, E\left( U_{i} \mid X_{1}, \ldots, X_{n}\right) = 0. Then E\hat{\beta}=\beta. \begin{aligned} \hat{\beta} &= \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) Y_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} \\ &= \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) \left( \alpha +\beta X_{i}+U_{i}\right) }{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} \\ &= \alpha \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) }{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} + \beta \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) X_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} + \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} \\ &= \alpha \frac{0}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} + \beta \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}{ \sum_{i=1}^{n}\left(X_{i}-\bar{X}\right) ^{2}} + \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}. \end{aligned}
- Equivalently, \hat{\beta}=\beta +\frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}.
Conditioning on Regressors
- Once we condition on X_{1}, \ldots, X_{n}, all X’s in \hat{\beta}=\beta +\frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} can be treated as fixed.
- Thus, \begin{aligned} E\left( \hat{\beta} \mid X_{1}, \ldots, X_{n}\right) & = E\left( \beta +\frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} \mid X_{1}, \ldots, X_{n}\right) \\ &= \beta + E\left( \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} \mid X_{1}, \ldots, X_{n}\right) \\ &= \beta + \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) E\left( U_{i} \mid X_{1}, \ldots, X_{n}\right) }{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}. \end{aligned}
Proof of Unbiasedness
- Thus, with E\left( U_{i} \mid X_{1}, \ldots, X_{n}\right) = 0, we have \begin{aligned} E\left( \hat{\beta} \mid X_{1}, \ldots, X_{n}\right) &= \beta +\frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) E\left( U_{i} \mid X_{1}, \ldots, X_{n}\right)}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} \\ &= \beta +\frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) \cdot 0}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} = \beta. \end{aligned}
- By the law of iterated expectations (LIE), E\hat{\beta} = E\left[ E\left( \hat{\beta} \mid X_{1}, \ldots, X_{n}\right) \right] = E\left[ \beta \right] = \beta.
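A Monte Carlo sketch of unbiasedness (hypothetical parameters; the X’s are drawn once and then conditioned on): the average of \hat{\beta} across many replications should be close to the true \beta.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, beta, sigma, n, reps = 1.0, 2.0, 1.0, 50, 100_000
X = rng.uniform(0, 10, n)              # drawn once, then treated as fixed
Sxx = np.sum((X - X.mean()) ** 2)

U = rng.normal(0, sigma, (reps, n))    # E(U_i | X) = 0 holds by construction
Y = alpha + beta * X + U               # one row of Y per replication
beta_hats = ((X - X.mean()) * Y).sum(axis=1) / Sxx

print(beta_hats.mean())                # close to the true beta = 2.0
```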
Strong Exogeneity of Regressors
- The regressor X is strongly exogenous if E\left( U_{i} \mid X_{1}, \ldots, X_{n}\right) = 0.
- Alternatively, we can assume that E\left( U_{i} \mid X_{i}\right) = 0 and that the observations are independent across i. Independence implies \begin{aligned} E\left( U_{1} \mid X_{1}, \ldots, X_{n}\right) &= E\left( U_{1} \mid X_{1}\right), \\ E\left( U_{2} \mid X_{1}, \ldots, X_{n}\right) &= E\left( U_{2} \mid X_{2}\right), \text{ and so on.} \end{aligned}
- The OLS estimator is, in general, biased if the strong exogeneity assumption is violated, as the sketch below illustrates.
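The sketch below (a hypothetical design, not from the lecture) builds errors that are correlated with X, so E\left( U_{i} \mid X_{i}\right) \neq 0. With U_{i} = 0.8 X_{i} + V_{i} and Var\left( X_{i}\right) = 1, the expectation of \hat{\beta} is \beta + 0.8, and the Monte Carlo average reflects that bias.

```python
import numpy as np

rng = np.random.default_rng(3)
beta, n, reps = 2.0, 50, 20_000

beta_hats = np.empty(reps)
for r in range(reps):
    X = rng.normal(size=n)
    V = rng.normal(size=n)
    U = 0.8 * X + V                    # U correlated with X: exogeneity fails
    Y = 1.0 + beta * X + U
    beta_hats[r] = np.sum((X - X.mean()) * Y) / np.sum((X - X.mean()) ** 2)

print(beta_hats.mean())                # roughly 2.8, not the true beta = 2.0
```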
Variance of the Slope Estimator
Variance Formula and Homoskedasticity
- If Y_{i}=\alpha +\beta X_{i}+U_{i} with E\left( U_{i} \mid X_{1}, \ldots, X_{n}\right) = 0, E\left( U_{i}^{2} \mid X_{1}, \ldots, X_{n}\right) = \sigma^{2} = \text{constant}, and E\left( U_{i}U_{j} \mid X_{1}, \ldots, X_{n}\right) = 0 for i \neq j, then Var\left( \hat{\beta} \mid X_{1}, \ldots, X_{n}\right) = \frac{\sigma^{2}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}.
- The assumption E\left( U_{i}^{2} \mid X_{1}, \ldots, X_{n}\right) = \sigma^{2} = \text{constant} is called (conditional) homoskedasticity.
- The assumption E\left( U_{i}U_{j} \mid X_{1}, \ldots, X_{n}\right) = 0 for i \neq j can be replaced by the assumption that the observations are independent.
Determinants of Variance
Var\left( \hat{\beta} \mid X_{1}, \ldots, X_{n}\right) = \frac{\sigma^{2}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}.
- The variance of \hat{\beta} is positively related to the variance of the errors \sigma^{2} = Var\left( U_{i}\right).
- The variance of \hat{\beta} is smaller when the X’s are more dispersed, i.e., when \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2} is larger, as the sketch below shows.
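The dispersion effect is easy to see numerically. In this sketch (hypothetical numbers), the same error variance \sigma^{2} = 1 is paired with a tightly clustered design and a dispersed one, and the theoretical variance formula is evaluated for each:

```python
import numpy as np

rng = np.random.default_rng(4)
sigma, n = 1.0, 50

X_tight = rng.uniform(4.5, 5.5, n)   # low dispersion in X
X_wide = rng.uniform(0.0, 10.0, n)   # high dispersion in X

for name, X in [("tight", X_tight), ("wide", X_wide)]:
    var_beta = sigma**2 / np.sum((X - X.mean()) ** 2)
    print(f"{name}: Var(beta_hat | X) = {var_beta:.5f}")
# The dispersed design yields a much smaller variance for beta_hat.
```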
Derivation of Variance: Setup
- We condition on the X’s and treat them as constants; all expectations below are implicitly conditional on the X’s.
- We have \hat{\beta}=\beta +\frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X} \right) U_{i}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}} and E\hat{\beta}=\beta. \begin{aligned} Var\left( \hat{\beta}\right) & = E\left[ \left( \hat{\beta}-E\hat{\beta}\right) ^{2}\right] \\ &= E\left[ \left( \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}}{ \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}\right) ^{2}\right] \\ &= \left( \frac{1}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}\right) ^{2} E\left[ \left( \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}\right) ^{2}\right]. \end{aligned}
Derivation of Variance: Expansion
- Expanding the square, \begin{aligned} \left( \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}\right) ^{2} &= \sum_{i=1}^{n}\sum_{j=1}^{n}\left( X_{i}-\bar{X}\right) \left( X_{j}-\bar{X}\right) U_{i}U_{j} \\ &= \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}U_{i}^{2} + \sum_{i=1}^{n}\sum_{j\neq i}\left( X_{i}-\bar{X}\right) \left( X_{j}-\bar{X}\right) U_{i}U_{j}. \end{aligned}
- Since E\left( U_{i}U_{j}\right) = 0 for i \neq j, \begin{aligned} E\left[ \left( \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}\right) ^{2}\right] &= \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}E U_{i}^{2} + 0 \\ &= \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}\sigma^{2}. \end{aligned}
Derivation of Variance: Final Step
We have \begin{aligned} Var\left( \hat{\beta}\right) &= \left( \frac{1}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}\right) ^{2} E\left[ \left( \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}\right) ^{2}\right], \\ E\left[ \left( \sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) U_{i}\right) ^{2}\right] &= \sigma^{2}\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}, \end{aligned} and therefore, \begin{aligned} Var\left( \hat{\beta}\right) &= \left( \frac{1}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}\right) ^{2} \sigma^{2}\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2} \\ &= \left( \frac{1}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}\right) \sigma^{2}. \end{aligned}
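As a check on this derivation (again a sketch with hypothetical parameters), the empirical variance of \hat{\beta} across simulated samples should match \sigma^{2}/\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}:

```python
import numpy as np

rng = np.random.default_rng(5)
alpha, beta, sigma, n, reps = 1.0, 2.0, 1.5, 50, 100_000
X = rng.uniform(0, 10, n)              # fixed across replications
Sxx = np.sum((X - X.mean()) ** 2)

U = rng.normal(0, sigma, (reps, n))    # homoskedastic, uncorrelated errors
Y = alpha + beta * X + U
beta_hats = ((X - X.mean()) * Y).sum(axis=1) / Sxx

print(beta_hats.var())                 # empirical variance across replications
print(sigma**2 / Sxx)                  # theoretical variance; should be close
```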
Distribution of the Slope Estimator
Normality of the OLS Estimator
- Assume that U_{i}’s are jointly normally distributed conditional on X’s.
- Then the Y_{i}=\alpha +\beta X_{i}+U_{i} are also jointly normally distributed conditional on the X’s.
- Since \hat{\beta}=\sum_{i=1}^{n}w_{i}Y_{i}, where the weights w_{i}=\frac{X_{i}-\bar{X}}{\sum_{l=1}^{n}\left( X_{l}-\bar{X}\right) ^{2}} depend only on the X’s, \hat{\beta} is also normally distributed conditional on the X’s.
- Conditional on X_{1}, \ldots, X_{n}, \begin{aligned} \hat{\beta} &\sim N\left( E\left( \hat{\beta} \mid X_{1}, \ldots, X_{n}\right), Var\left( \hat{\beta} \mid X_{1}, \ldots, X_{n}\right) \right) \\ &= N\left( \beta, \frac{\sigma^{2}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}\right). \end{aligned}
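A final sketch (hypothetical parameters once more): standardizing the simulated \hat{\beta}’s by the mean and standard deviation given above should produce draws that behave like N(0, 1), which can be checked against the standard normal quantiles.

```python
import numpy as np

rng = np.random.default_rng(6)
alpha, beta, sigma, n, reps = 1.0, 2.0, 1.0, 50, 100_000
X = rng.uniform(0, 10, n)
Sxx = np.sum((X - X.mean()) ** 2)

U = rng.normal(0, sigma, (reps, n))            # normal errors, conditional on X
Y = alpha + beta * X + U
beta_hats = ((X - X.mean()) * Y).sum(axis=1) / Sxx

Z = (beta_hats - beta) / np.sqrt(sigma**2 / Sxx)   # standardized slope estimates
# Empirical 5%, 50%, 95% quantiles; N(0,1) gives roughly -1.645, 0, 1.645
print(np.quantile(Z, [0.05, 0.5, 0.95]))
```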