Point estimation
Our model:
1. Y_{i}=\beta _{0}+\beta _{1}X_{i}+U_{i},\qquad i=1,\ldots ,n.
2. \mathrm{E}\left[U_{i}\mid \mathbf{X}\right] =0 for all i’s.
3. \mathrm{E}\left[U_{i}^{2}\mid \mathbf{X}\right] =\sigma ^{2} for all i’s.
4. \mathrm{E}\left[U_{i}U_{j}\mid \mathbf{X}\right] =0 for all i\neq j.
5. U’s are jointly normally distributed conditional on \mathbf{X}.
The OLS estimator \hat{\beta}_{1} is a point estimator of \beta _{1}.
With probability one, we have that \hat{\beta}_{1}\neq \beta _{1}.
To construct interval estimators, we need to know the distribution of \hat{\beta}_{1}.
Normal distribution
A normal rv is a continuous rv that can take on any real value. The PDF of a normal rv X is
f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right), \text{ where}
\mu = \mathrm{E}\left[X\right] \text{ and } \sigma^2 = \mathrm{Var}\left(X\right).
We usually write X \sim N(\mu, \sigma^2).
If X \sim N(\mu, \sigma^2), then a + bX \sim N(a + b\mu, b^2\sigma^2).
Standard normal distribution
Standard normal rv has \mu = 0 and \sigma^2 = 1. Its PDF is \phi(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right).
Symmetric around zero (mean): if Z \sim N(0, 1), P(Z > z) = P(Z < -z).
Thin tails: P(-1.96 \leq Z \leq 1.96) = 0.95.
If X \sim N(\mu, \sigma^2), then (X - \mu)/\sigma \sim N(0, 1).
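These facts are easy to check numerically; a quick sketch in base R (the seed and simulation size below are arbitrary choices, not from the lecture):

```r
# Tail symmetry: P(Z > z) = P(Z < -z)
pnorm(1.96, lower.tail = FALSE)  # about 0.025
pnorm(-1.96)                     # about 0.025

# Central mass: P(-1.96 <= Z <= 1.96)
pnorm(1.96) - pnorm(-1.96)       # about 0.95

# Standardization: if X ~ N(mu, sigma^2), then (X - mu)/sigma ~ N(0, 1)
set.seed(326)
x <- rnorm(1e5, mean = 10, sd = 3)
z <- (x - 10) / 3
c(mean(z), sd(z))                # close to 0 and 1
```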
Bivariate normal distribution
X and Y have a bivariate normal distribution if their joint PDF is given by:
f(x, y) = \frac{1}{2\pi\sqrt{(1-\rho^2) \sigma_X^2 \sigma_Y^2}} \exp\left[-\frac{Q}{2(1-\rho^2)}\right],
where Q = \frac{(x-\mu_X)^2}{\sigma_X^2} + \frac{(y-\mu_Y)^2}{\sigma_Y^2} - 2\rho\frac{(x-\mu_X)(y-\mu_Y)}{\sigma_X\sigma_Y},
\mu_X = \mathrm{E}\left[X\right], \mu_Y = \mathrm{E}\left[Y\right], \sigma_X^2 = \mathrm{Var}\left(X\right), \sigma_Y^2 = \mathrm{Var}\left(Y\right), and \rho = \mathrm{Corr}(X, Y).
Properties of bivariate normal
If X and Y have a bivariate normal distribution:
a + bX + cY \sim N(\mu^*, (\sigma^*)^2), where
\mu^* = a + b\mu_X + c\mu_Y, \quad (\sigma^*)^2 = b^2\sigma_X^2 + c^2\sigma_Y^2 + 2bc\rho\sigma_X\sigma_Y.
\mathrm{Cov}\left(X, Y\right) = 0 \Longrightarrow X and Y are independent.
Can be generalized to more than 2 variables (multivariate normal).
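A minimal simulation sketch of the linear-combination property, assuming the MASS package is available for mvrnorm(); all parameter values are illustrative:

```r
library(MASS)  # for mvrnorm()

mu_X <- 1; mu_Y <- 2; s_X <- 1.5; s_Y <- 0.5; rho <- 0.6
Sigma <- matrix(c(s_X^2,           rho * s_X * s_Y,
                  rho * s_X * s_Y, s_Y^2), nrow = 2)

set.seed(326)
xy <- mvrnorm(n = 1e5, mu = c(mu_X, mu_Y), Sigma = Sigma)
w <- 3 + 2 * xy[, 1] - xy[, 2]  # a + bX + cY with a = 3, b = 2, c = -1

# Theoretical mean and variance of the linear combination
mu_star  <- 3 + 2 * mu_X - mu_Y
var_star <- 2^2 * s_X^2 + (-1)^2 * s_Y^2 + 2 * 2 * (-1) * rho * s_X * s_Y
rbind(theory = c(mu_star, var_star), simulated = c(mean(w), var(w)))
```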
Normality of the OLS estimator
Assume that U_{i}’s are jointly normally distributed conditional on \mathbf{X} (Assumption 5).
Then Y_{i}=\beta _{0}+\beta _{1}X_{i}+U_{i} are also jointly normally distributed conditional on \mathbf{X}.
Since \hat{\beta}_{1}=\sum_{i=1}^{n}w_{i}Y_{i}, where w_{i}=\frac{X_{i}-\bar{X}}{\sum_{l=1}^{n}\left( X_{l}-\bar{X}\right) ^{2}} depend only on \mathbf{X}, \hat{\beta}_{1} is also normally distributed conditional on \mathbf{X}.
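A Monte Carlo sketch of this result: fix the regressors, redraw normal errors many times, and compare the simulated distribution of \hat{\beta}_{1} with the theoretical conditional variance \sigma^{2}/\sum_{i}(X_{i}-\bar{X})^{2}. The design (n, coefficients, \sigma) is illustrative, not from the lecture:

```r
set.seed(326)
n <- 50; beta0 <- 1; beta1 <- 2; sigma <- 3
x <- runif(n, 0, 10)  # fixed across replications: "conditional on X"

b1_draws <- replicate(5000, {
  u <- rnorm(n, sd = sigma)  # Assumption 5: normal errors
  y <- beta0 + beta1 * x + u
  coef(lm(y ~ x))[2]
})

# Theory: beta1_hat | X ~ N(beta1, sigma^2 / sum((x - mean(x))^2))
var_theory <- sigma^2 / sum((x - mean(x))^2)
c(mean_sim = mean(b1_draws), var_sim = var(b1_draws), var_theory = var_theory)
```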
Interval estimation problem
We want to construct an interval estimator for \beta _{1}:
The interval estimator is called a confidence interval (CI).
A CI contains the true value \beta _{1} with some pre-specified probability 1-\alpha, where \alpha is a small probability of error.
For example, if \alpha =0.05, then the random CI will contain \beta _{1} with probability 0.95.
1-\alpha is called the coverage probability.
Confidence interval: CI_{1-\alpha }=[LB_{1-\alpha },UB_{1-\alpha }]. The lower bound (LB) and upper bound (UB) should depend on the coverage probability 1-\alpha.
The formal definition of a CI: it is a random interval CI_{1-\alpha} such that, conditionally on \mathbf{X},
P\left( \beta _{1}\in CI_{1-\alpha } \mid \mathbf{X}\right) =1-\alpha .
Note that the random element is CI_{1-\alpha}.
Sometimes, a CI is instead defined by the weaker requirement P\left( \beta _{1}\in CI_{1-\alpha}\right) \geq 1-\alpha .
Symmetric CIs
One approach to constructing CIs is to consider a symmetric interval around the estimator \hat{\beta}_{1}:
CI_{1-\alpha }=\left[ \hat{\beta}_{1}-c_{1-\alpha },\hat{\beta}_{1}+c_{1-\alpha }\right]
The problem is choosing c_{1-\alpha } such that P\left( \beta_{1}\in CI_{1-\alpha } \mid \mathbf{X}\right) =1-\alpha .
In choosing c_{1-\alpha } we will be relying on the fact that given our assumptions and conditionally on \mathbf{X}: \begin{align*}
&\hat{\beta}_{1} \mid \mathbf{X} \sim N\left( \beta _{1},\mathrm{Var}\left(\hat{\beta}_{1} \mid \mathbf{X}\right)\right), \\
&\mathrm{Var}\left(\hat{\beta}_{1} \mid \mathbf{X}\right) =\frac{\sigma ^{2}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}.
\end{align*}
Note that conditionally on \mathbf{X}:
\frac{\hat{\beta}_{1}-\beta _{1}}{\sqrt{\mathrm{Var}\left(\hat{\beta}_{1} \mid \mathbf{X}\right) }}\sim N\left( 0,1\right) .
Standard normal quantiles
Let Z\sim N\left( 0,1\right) . The \tau-th quantile (percentile) of the standard normal distribution is z_{\tau } such that
P\left( Z\leq z_{\tau }\right) =\tau .
Median: \tau =0.5 and z_{0.5}=0. (P\left( Z\leq 0\right) =0.5).
If \tau =0.975 then z_{0.975}=1.96. Due to symmetry, if \tau =0.025 then z_{0.025}=-1.96.
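In R, these quantiles come straight from qnorm(), with pnorm() as the inverse check:

```r
qnorm(0.5)           # 0: the median
qnorm(0.975)         # 1.959964: z_{0.975}
qnorm(0.025)         # -1.959964: symmetry gives z_{0.025} = -z_{0.975}
pnorm(qnorm(0.975))  # 0.975: pnorm() inverts qnorm()
```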
\sigma^2 is known (infeasible CIs)
Suppose (for a moment) that \sigma ^{2} is known, so that we can compute exactly the variance of \hat{\beta}_{1}:
\mathrm{Var}\left(\hat{\beta}_{1} \mid \mathbf{X}\right) =\frac{\sigma ^{2}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}.
Consider the following CI: \begin{align*}
CI_{1-\alpha } = \Big[ &\hat{\beta}_{1}-z_{1-\alpha /2}\sqrt{\mathrm{Var}\left(\hat{\beta}_{1} \mid \mathbf{X}\right) }, \\
&\hat{\beta}_{1}+z_{1-\alpha /2}\sqrt{\mathrm{Var}\left(\hat{\beta}_{1} \mid \mathbf{X}\right) }\Big] .
\end{align*}
For example, if 1-\alpha =0.95, then \alpha =0.05 and z_{1-\alpha /2}=z_{0.975}=1.96.
This CI is valid: since Z=\frac{\hat{\beta}_{1}-\beta _{1}}{\sqrt{\mathrm{Var}\left(\hat{\beta}_{1} \mid \mathbf{X}\right) }}\sim N\left( 0,1\right) conditionally on \mathbf{X}, rearranging the inequalities shows that \beta _{1}\in CI_{1-\alpha } \Longleftrightarrow -z_{1-\alpha /2}\leq Z\leq z_{1-\alpha /2}, and
P\left( -z_{1-\alpha /2}\leq Z\leq z_{1-\alpha /2} \mid \mathbf{X}\right) =1-\alpha /2-\alpha /2=1-\alpha .
This interval is infeasible because in practice \sigma ^{2} is unknown.
Feasible CIs (\sigma^2 unknown)
Since \sigma ^{2} is unknown, we must estimate it from the data:
s^{2}=\frac{1}{n-2}\sum_{i=1}^{n}\hat{U}_{i}^{2}=\frac{1}{n-2}\sum_{i=1}^{n}\left( Y_{i}-\hat{\beta}_{0}-\hat{\beta}_{1}X_{i}\right) ^{2}.
We can replace \sigma ^{2} by s^{2}; however, the result does not have a normal distribution anymore: \begin{align*}
&\frac{\hat{\beta}_{1}-\beta _{1}}{\sqrt{\widehat{\mathrm{Var}}\left( \hat{\beta}_{1}\right) }}\sim t_{n-2}, \\
&\text{where } \widehat{\mathrm{Var}}\left( \hat{\beta}_{1}\right) =\frac{s^{2}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}.
\end{align*} Here t_{n-2} denotes the t-distribution with n-2 degrees of freedom.
The degrees of freedom depend on
the sample size (n),
and the number of parameters one has to estimate to compute s^{2} (two in this case, \beta _{0} and \beta _{1}); see the simulation sketch below.
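As a check on the t_{n-2} result, here is a simulation sketch with an illustrative design (nothing below comes from the lecture data); the studentized slope should track the quantiles of t_{n-2}, which are heavier-tailed than those of N(0,1):

```r
set.seed(326)
n <- 20; beta0 <- 1; beta1 <- 2; sigma <- 3
x <- runif(n, 0, 10)

t_draws <- replicate(5000, {
  y <- beta0 + beta1 * x + rnorm(n, sd = sigma)
  fit <- lm(y ~ x)
  # Studentized slope: (beta1_hat - beta1) / SE(beta1_hat)
  (coef(fit)[2] - beta1) / summary(fit)$coefficients[2, "Std. Error"]
})

probs <- c(0.90, 0.95, 0.975)
rbind(simulated = quantile(t_draws, probs),
      t_quant   = qt(probs, df = n - 2),
      z_quant   = qnorm(probs))
```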
Let t_{df,\tau } be the \tau-th quantile of the t-distribution with df degrees of freedom: if T\sim t_{df}, then
P\left( T\leq t_{df,\tau }\right) =\tau .
Like the standard normal distribution, the t-distribution is centered at and symmetric around zero, so t_{n-2,1-\alpha /2}=-t_{n-2,\alpha/2}.
We can now construct a feasible confidence interval with 1-\alpha coverage as: \begin{align*}
CI_{1-\alpha } = \Big[ &\hat{\beta}_{1}-t_{n-2,1-\alpha /2}\sqrt{\widehat{\mathrm{Var}}\left( \hat{\beta}_{1}\right) }, \\
&\hat{\beta}_{1}+t_{n-2,1-\alpha /2}\sqrt{\widehat{\mathrm{Var}}\left( \hat{\beta}_{1}\right) }\Big], \\
\text{where } &\widehat{\mathrm{Var}}\left( \hat{\beta}_{1}\right) =\frac{s^{2}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}}.
\end{align*}
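The coverage property of this construction can be verified by simulation: across many redrawn samples (with X held fixed), the nominal 95% feasible CI should contain the true \beta _{1} in roughly 95% of replications. A sketch with illustrative parameters:

```r
set.seed(326)
n <- 30; beta0 <- 1; beta1 <- 2; sigma <- 3
x <- runif(n, 0, 10)
t_crit <- qt(0.975, df = n - 2)

covered <- replicate(5000, {
  y <- beta0 + beta1 * x + rnorm(n, sd = sigma)
  fit <- lm(y ~ x)
  b  <- coef(fit)[2]
  se <- summary(fit)$coefficients[2, "Std. Error"]
  (b - t_crit * se <= beta1) && (beta1 <= b + t_crit * se)
})

mean(covered)  # should be close to the nominal 0.95
```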
Example: Data
Data: rental from the wooldridge R package. 64 US cities in 1990.
rent: average monthly rent ($).
avginc: per capita income ($).
Model: Rent_{i}=\beta _{0}+\beta _{1}AvgInc_{i}+U_{i}.
```r
library(wooldridge)  # provides the rental data set
data("rental")
rental90 <- subset(rental, y90 == 1)  # keep the 1990 cross-section
head(rental90[, c("city", "rent", "avginc")])
```
Example: OLS regression
```r
reg <- lm(rent ~ avginc, data = rental90)
summary(reg)
```
Call:
lm(formula = rent ~ avginc, data = rental90)
Residuals:
Min 1Q Median 3Q Max
-94.67 -47.27 -13.68 25.65 228.46
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.488e+02 3.210e+01 4.635 1.89e-05 ***
avginc 1.158e-02 1.308e-03 8.851 1.34e-12 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 66.56 on 62 degrees of freedom
Multiple R-squared: 0.5582, Adjusted R-squared: 0.5511
F-statistic: 78.34 on 1 and 62 DF, p-value: 1.341e-12
Example: Extracting key values
```r
# Estimated slope and its standard error
beta1_hat <- coef(reg)["avginc"]
se_beta1 <- summary(reg)$coefficients["avginc", "Std. Error"]
cat("beta_1_hat =", round(beta1_hat, 4), "\n")
cat("SE(beta_1_hat) =", round(se_beta1, 4), "\n")
```
beta_1_hat = 0.0116
SE(beta_1_hat) = 0.0013
```r
# Degrees of freedom: n - 2
n <- nrow(rental90)
df <- n - 2
cat("n =", n, ", df =", df, "\n")
```
n = 64 , df = 62
Example: 95% confidence interval
```r
# Critical value
t_95 <- qt(0.975, df)
cat("t_{62, 0.975} =", round(t_95, 3), "\n")

# 95% CI: beta_1_hat +/- t * SE
CI_95 <- c(beta1_hat - t_95 * se_beta1, beta1_hat + t_95 * se_beta1)
round(CI_95, 4)

# Check with confint()
confint(reg, "avginc", level = 0.95)
```
Example: 90% confidence interval
```r
# Critical value
t_90 <- qt(0.95, df)
cat("t_{62, 0.95} =", round(t_90, 3), "\n")

# 90% CI: beta_1_hat +/- t * SE
CI_90 <- c(beta1_hat - t_90 * se_beta1, beta1_hat + t_90 * se_beta1)
round(CI_90, 4)

# Check with confint()
confint(reg, "avginc", level = 0.90)
```
5 % 95 %
avginc 0.009395296 0.01376472
The effect of estimating \sigma^2
The t-distribution has heavier tails than the normal.
t_{df,1-\alpha /2}>z_{1-\alpha /2}, but as df increases t_{df,1-\alpha /2}\rightarrow z_{1-\alpha /2}.
When the sample size n is large, t_{n-2,1-\alpha /2} can be replaced with z_{1-\alpha /2}.
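The convergence of the t critical values to the normal one is easy to see numerically (the df values below are chosen for illustration):

```r
df <- c(5, 10, 30, 62, 100, 1000)
data.frame(df     = df,
           t_crit = round(qt(0.975, df), 4),  # shrinks toward the normal value
           z_crit = round(qnorm(0.975), 4))   # about 1.96
```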
Interpretation of confidence intervals
The confidence interval CI_{1-\alpha } is a function of the sample \left\{ \left( Y_{i},X_{i}\right) :i=1,\ldots ,n\right\}, and is therefore random. This allows us to talk about the probability of CI_{1-\alpha } containing the true value of \beta _{1}.
Once the confidence interval is computed from the data, we have a single realization of it. The realized (computed) confidence interval is not random, so it no longer makes sense to talk about the probability that it includes the true \beta _{1}.
Once the confidence interval is computed, it either contains the true value \beta _{1} or it does not.