Economics 326 — Introduction to Econometrics II
The OLS estimator \(\hat{\beta}\) has desirable properties:
\(\hat{\beta}\) is unbiased if the errors are strongly exogenous: \(\mathrm{E}\left[U_i \mid \mathbf{X}\right] =0.\)
If in addition the errors are homoskedastic, then \(\widehat{\mathrm{Var}}\left(\hat{\beta}\right)=s^{2}/\sum_{i=1}^{n}\left( X_{i}-\bar{X}\right) ^{2}\) is an unbiased estimator of the conditional variance of \(\hat{\beta}\).
If in addition the errors are normally distributed (given \(\mathbf{X}\)), then \(T=\left( \hat{\beta}-\beta \right) /\sqrt{\widehat{\mathrm{Var}}\left(\hat{\beta}\right)}\) has a \(t\) distribution which can be used for hypothesis testing.
If the errors are only weakly exogenous: \[ \mathrm{E}\left[X_{i}U_{i}\right] =0, \] the OLS estimator is in general biased in finite samples (although, as shown below, it remains consistent).
If the errors are heteroskedastic: \[ \mathrm{E}\left[U_{i}^{2} \mid X_{i}\right] =h\left( X_{i}\right) \] for some non-constant function \(h\), the “usual” variance formula is invalid; we also do not have an unbiased estimator for the variance in this case.
If the errors are not normally distributed conditional on \(\mathbf{X}\), then \(T\)- and \(F\)-statistics do not have \(t\) and \(F\) distributions under the null hypothesis.
Asymptotic (large-sample) theory allows us to derive approximate properties and distributions of estimators and test statistics by assuming that the sample size \(n\) is very large.
A sequence of real numbers \(a_{1}, a_{2}, \ldots\) converges to \(a\) if for every \(\varepsilon > 0\) there exists \(N\) such that \(|a_{n} - a| < \varepsilon\) for all \(n \geq N\). We write \(a_{n} \to a\).
For two tolerances \(\varepsilon_{1} > \varepsilon_{2}\), the \(\varepsilon_{2}\)-band around \(a\) is narrower, so the sequence may need more terms before it stays inside it: \(N_{2} \geq N_{1}\). Smaller \(\varepsilon\) generally requires larger \(N\).
A sequence that does not converge: \(a_{n} = a + c\sin(n)\) oscillates indefinitely around \(a\).
For \(\varepsilon_{1} > c\), all terms lie within the \(\varepsilon_{1}\)-band. But for \(\varepsilon_{2} < c\), terms keep falling outside the \(\varepsilon_{2}\)-band no matter how far along the sequence we go. Convergence requires the condition to hold for all \(\varepsilon > 0\), so the sequence does not converge.
Our estimator \(\hat{\beta}_{n}\) is random: its value changes with each sample. To apply the concept of convergence, we need to convert it into a non-random sequence indexed by \(n\).
We take \(a_{n} = P\left(\left\vert \hat{\beta}_{n}-\beta \right\vert \geq \varepsilon \right)\), which is a non-random number for each \(n\). We say \(\hat{\beta}_{n}\) converges in probability to \(\beta\) if \(a_{n} \to 0\) for all \(\varepsilon > 0\).
More generally, let \(\theta _{n}\) be a sequence of random variables indexed by the sample size \(n.\) We say that \(\theta _{n}\) converges in probability to \(\theta\) if \[ \lim_{n\rightarrow \infty }P\left( \left\vert \theta _{n}-\theta \right\vert \geq \varepsilon \right) =0\text{ for all }\varepsilon >0. \]
We denote this as \(\theta _{n}\rightarrow _{p}\theta\) or \(p\lim \theta _{n}=\theta.\)
An example of convergence in probability is a Law of Large Numbers (LLN):
Let \(X_{1},X_{2},\ldots ,X_{n}\) be a random sample such that \(\mathrm{E}\left[X_{i}\right] =\mu\) for all \(i=1,\ldots ,n,\) and define \(\bar{X}_{n}=\frac{1}{n}\sum_{i=1}^{n}X_{i}.\) Then, under certain conditions, \[ \bar{X}_{n}\rightarrow _{p}\mu. \]
Let \(X_{1},\ldots ,X_{n}\) be a sample of independent identically distributed (iid) random variables. Let \(\mathrm{E}\left[X_{i}\right]=\mu\). If \(\mathrm{Var}\left(X_{i}\right)=\sigma ^{2}<\infty\), then \[ \bar{X}_{n}\rightarrow _{p}\mu. \]
In fact when the data are iid, the LLN holds if \[ \mathrm{E}\left[\left\vert X_{i}\right\vert\right] <\infty, \] but we prove the result under a stronger assumption that \(\mathrm{Var}\left(X_{i}\right)<\infty.\)
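The LLN can be illustrated with a short simulation: the probability that the sample mean misses \(\mu\) by more than \(\varepsilon\) shrinks as \(n\) grows. A minimal sketch in Python (the exponential distribution and all constants are illustrative choices):

```python
import random

random.seed(0)
mu, eps, reps = 2.0, 0.1, 2000

def deviation_prob(n):
    """Monte Carlo estimate of P(|X_bar_n - mu| >= eps) for Exponential draws with mean mu."""
    count = 0
    for _ in range(reps):
        xbar = sum(random.expovariate(1 / mu) for _ in range(n)) / n
        if abs(xbar - mu) >= eps:
            count += 1
    return count / reps

p_small, p_large = deviation_prob(20), deviation_prob(2000)
print(p_small, p_large)  # the deviation probability falls as n grows
```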
Markov’s inequality. Let \(W\) be a random variable. For \(\varepsilon >0\) and \(r>0\), \[ P\left( \left\vert W\right\vert \geq \varepsilon \right) \leq \frac{\mathrm{E}\left[\left\vert W\right\vert ^{r}\right]}{\varepsilon ^{r}}. \]
With \(r=2,\) we have Chebyshev’s inequality. Suppose that \(\mathrm{E}\left[X\right]=\mu.\) Take \(W\equiv X-\mu\) and apply Markov’s inequality with \(r=2\). For \(\varepsilon >0,\)
\[ \begin{aligned} P\left( \left\vert X-\mu \right\vert \geq \varepsilon \right) &\leq \frac{\mathrm{E}\left[\left\vert X-\mu \right\vert ^{2}\right]}{\varepsilon ^{2}} \\ &= \frac{\mathrm{Var}\left(X\right)}{\varepsilon ^{2}}. \end{aligned} \]
The probability of observing an outlier (a large deviation of \(X\) from its mean \(\mu\)) can be bounded by the variance.
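Chebyshev's bound can be checked numerically: the empirical frequency of a large deviation never exceeds \(\mathrm{Var}(X)/\varepsilon^2\). A sketch (the uniform distribution is an arbitrary choice):

```python
import random

random.seed(1)
draws = [random.uniform(0, 1) for _ in range(100_000)]
mu, var = 0.5, 1 / 12  # mean and variance of Uniform(0, 1)

# Empirical frequency of |X - mu| >= eps vs. the Chebyshev bound Var(X)/eps^2
for eps in (0.2, 0.3, 0.4):
    freq = sum(abs(x - mu) >= eps for x in draws) / len(draws)
    print(eps, freq, var / eps**2)  # the frequency is always below the bound
```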
For any event \(A\), the expectation of its indicator equals the probability of the event: \[ \mathrm{E}\left[\mathbf{1}(A)\right] = 1 \cdot P(A) + 0 \cdot P(A^c) = P(A). \]
Define the indicator \(\mathbf{1}\left(\left\vert W \right\vert \geq \varepsilon\right)\), which equals \(1\) when \(\left\vert W \right\vert \geq \varepsilon\) and \(0\) otherwise. Then:
\[ \begin{aligned} \mathbf{1}\left(\left\vert W \right\vert \geq \varepsilon\right) &= \mathbf{1}\left(\left\vert W \right\vert^r \geq \varepsilon^r\right) \\ &= \mathbf{1}\!\left(\frac{\left\vert W \right\vert^r}{\varepsilon^r} \geq 1\right) \\ &\leq \frac{\left\vert W \right\vert^r}{\varepsilon^r}. \end{aligned} \] Taking expectations on both sides, \[ P\left(\left\vert W \right\vert \geq \varepsilon\right) = \mathrm{E}\left[\mathbf{1}\left(\left\vert W \right\vert \geq \varepsilon\right)\right] \leq \frac{\mathrm{E}\left[\left\vert W \right\vert^r\right]}{\varepsilon^r}. \]
\[ \begin{aligned} P\left( \left\vert \bar{X}_{n}-\mu \right\vert \geq \varepsilon \right) &= P\left( \left\vert \frac{1}{n}\sum_{i=1}^{n}X_{i}-\mu \right\vert \geq \varepsilon \right) \\ &= P\left( \left\vert \frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\mu \right) \right\vert \geq \varepsilon \right) \\ &\leq \frac{\mathrm{E}\left[\left( \frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\mu \right) \right) ^{2}\right]}{\varepsilon ^{2}} \\ &= \frac{1}{n^{2}\varepsilon ^{2}}\left( \sum_{i=1}^{n}\mathrm{E}\left[\left( X_{i}-\mu \right) ^{2}\right]+\sum_{i=1}^{n}\sum_{j\neq i}\mathrm{E}\left[\left( X_{i}-\mu \right) \left( X_{j}-\mu \right)\right] \right) \\ &= \frac{1}{n^{2}\varepsilon ^{2}}\left( \sum_{i=1}^{n}\mathrm{Var}\left(X_{i}\right)+\sum_{i=1}^{n}\sum_{j\neq i}\mathrm{Cov}\left(X_{i},X_{j}\right)\right) \\ &= \frac{n\sigma ^{2}}{n^{2}\varepsilon ^{2}} = \frac{\sigma ^{2}}{n\varepsilon ^{2}} \rightarrow 0 \text{ as }n\rightarrow \infty \text{ for all }\varepsilon >0, \end{aligned} \] where the covariance terms vanish because the observations are independent.
Let \(X_{1},\ldots ,X_{n}\) be a sample and suppose that
\[ \begin{aligned} \mathrm{E}\left[X_{i}\right] &= \mu \text{ for all }i=1,\ldots ,n, \\ \mathrm{Var}\left(X_{i}\right) &= \sigma ^{2}\text{ for all }i=1,\ldots ,n, \\ \mathrm{Cov}\left(X_{i},X_{j}\right) &= 0\text{ for all }j\neq i. \end{aligned} \]
The mean of the sample average:
\[ \begin{aligned} \mathrm{E}\left[\bar{X}_{n}\right] &= \mathrm{E}\left[\frac{1}{n}\sum_{i=1}^{n}X_{i}\right] \\ &= \frac{1}{n}\sum_{i=1}^{n}\mathrm{E}\left[X_{i}\right] \\ &= \frac{1}{n}\sum_{i=1}^{n}\mu = \frac{1}{n}n\mu =\mu. \end{aligned} \]
The variance of the sample average:
\[ \begin{aligned} \mathrm{Var}\left(\bar{X}_{n}\right) &= \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n}X_{i}\right) \\ &= \frac{1}{n^{2}}\mathrm{Var}\left(\sum_{i=1}^{n}X_{i}\right) \\ &= \frac{1}{n^{2}}\left( \sum_{i=1}^{n}\mathrm{Var}\left(X_{i}\right)+\sum_{i=1}^{n}\sum_{j\neq i}\mathrm{Cov}\left(X_{i},X_{j}\right)\right) \\ &= \frac{1}{n^{2}}\left( \sum_{i=1}^{n}\sigma ^{2}+\sum_{i=1}^{n}\sum_{j\neq i}0\right) \\ &= \frac{1}{n^{2}}n\sigma ^{2}=\frac{\sigma ^{2}}{n}. \end{aligned} \]
The variance of the average approaches zero as \(n\rightarrow \infty\) if the observations are uncorrelated.
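The result \(\mathrm{Var}(\bar{X}_{n})=\sigma^{2}/n\) can be confirmed by simulation: compute the sample mean in many replications and compare its variance with \(\sigma^{2}/n\). A sketch (normal draws and all constants are arbitrary choices):

```python
import random, statistics

random.seed(2)
sigma2, n, reps = 4.0, 50, 20_000

# Variance of the sample mean across many replications vs. sigma^2 / n
means = []
for _ in range(reps):
    sample = [random.gauss(0, sigma2 ** 0.5) for _ in range(n)]
    means.append(sum(sample) / n)

var_of_mean = statistics.variance(means)
print(var_of_mean, sigma2 / n)  # the two numbers should be close
```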
Slutsky’s Lemma. Suppose that \(\theta _{n}\rightarrow _{p}\theta,\) and let \(g\) be a function continuous at \(\theta.\) Then, \[ g\left( \theta _{n}\right) \rightarrow _{p}g\left( \theta \right). \]
If \(\theta _{n}\rightarrow _{p}\theta,\) then \(\theta _{n}^{2}\rightarrow _{p}\theta ^{2}.\)
If \(\theta _{n}\rightarrow _{p}\theta\) and \(\theta \neq 0,\) then \(1/\theta _{n}\rightarrow _{p}1/\theta.\)
Suppose that \(\theta _{n}\rightarrow _{p}\theta\) and \(\lambda _{n}\rightarrow _{p}\lambda.\) Then,
\(\theta _{n}+\lambda _{n}\rightarrow _{p}\theta +\lambda.\)
\(\theta _{n}\lambda _{n}\rightarrow _{p}\theta \lambda.\)
\(\theta _{n}/\lambda _{n}\rightarrow _{p}\theta /\lambda\) provided that \(\lambda \neq 0.\)
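These rules can be checked numerically: take \(\theta_{n}=\bar{X}_{n}\) for iid Exponential(1) draws, so \(\theta=1\); then \(\theta_{n}^{2}\) and \(1/\theta_{n}\) should also be close to 1 for large \(n\). A sketch with arbitrary constants:

```python
import random

random.seed(3)
n = 200_000

# theta_n = sample mean of Exponential(1) draws, so theta = E[X_i] = 1
theta_n = sum(random.expovariate(1.0) for _ in range(n)) / n

# By Slutsky's Lemma, continuous transformations preserve the limit
print(theta_n, theta_n ** 2, 1 / theta_n)  # all close to 1
```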
Let \(\hat{\beta}_{n}\) be an estimator of \(\beta\) based on a sample of size \(n.\)
We say that \(\hat{\beta}_{n}\) is a consistent estimator of \(\beta\) if as \(n\rightarrow \infty,\) \[ \hat{\beta}_{n}\rightarrow _{p}\beta. \]
Consistency means that the probability of the event that the distance between \(\hat{\beta}_{n}\) and \(\beta\) exceeds \(\varepsilon >0\) can be made arbitrarily small by increasing the sample size.
Suppose that:
The data \(\left\{ \left( Y_{i},X_{i}\right) :i=1,\ldots ,n\right\}\) are iid.
\(Y_{i}=\beta _{0}+\beta _{1}X_{i}+U_{i},\) where \(\mathrm{E}\left[U_{i}\right] =0.\)
\(\mathrm{E}\left[X_{i}U_{i}\right] =0.\)
\(0<\mathrm{Var}\left(X_{i}\right)<\infty.\)
Let \(\hat{\beta}_{0,n}\) and \(\hat{\beta}_{1,n}\) be the OLS estimators of \(\beta _{0}\) and \(\beta _{1}\) based on a sample of size \(n\). Under Assumptions 1–4, \[ \begin{aligned} \hat{\beta}_{0,n} &\rightarrow _{p}\beta _{0}, \\ \hat{\beta}_{1,n} &\rightarrow _{p}\beta _{1}. \end{aligned} \]
The key identifying assumption is Assumption 3, which (together with \(\mathrm{E}\left[U_{i}\right]=0\)) is equivalent to \(\mathrm{Cov}\left(X_{i},U_{i}\right)=0.\)
Write
\[ \begin{aligned} \hat{\beta}_{1,n} = \frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) Y_{i}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) ^{2}} &= \beta _{1}+\frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) U_{i}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) ^{2}} \\ &= \beta _{1}+\frac{\frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) U_{i}}{\frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) ^{2}}. \end{aligned} \]
We will show that \[ \begin{aligned} \frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) U_{i} &\rightarrow _{p}0, \\ \frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) ^{2} &\rightarrow _{p}\mathrm{Var}\left(X_{i}\right). \end{aligned} \]
Since \(\mathrm{Var}\left(X_{i}\right)\neq 0,\) \[ \hat{\beta}_{1,n} = \beta _{1}+\frac{\frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) U_{i}}{\frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) ^{2}} \rightarrow _{p} \beta _{1}+\frac{0}{\mathrm{Var}\left(X_{i}\right)}= \beta _{1}. \]
\[ \frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) U_{i} = \frac{1}{n}\sum_{i=1}^{n}X_{i}U_{i}-\bar{X}_{n}\left( \frac{1}{n}\sum_{i=1}^{n}U_{i}\right). \]
By the LLN,
\[ \begin{aligned} \frac{1}{n}\sum_{i=1}^{n}X_{i}U_{i} &\rightarrow _{p}\mathrm{E}\left[X_{i}U_{i}\right] =0, \\ \bar{X}_{n} &\rightarrow _{p}\mathrm{E}\left[X_{i}\right], \\ \frac{1}{n}\sum_{i=1}^{n}U_{i} &\rightarrow _{p}\mathrm{E}\left[U_{i}\right] =0. \end{aligned} \]
Hence,
\[ \begin{aligned} \frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) U_{i} &= \frac{1}{n}\sum_{i=1}^{n}X_{i}U_{i}-\bar{X}_{n}\left( \frac{1}{n}\sum_{i=1}^{n}U_{i}\right) \\ &\rightarrow _{p}0-\mathrm{E}\left[X_{i}\right] \cdot 0 = 0. \end{aligned} \]
The sample variance can be written as \[ \frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) ^{2} = \frac{1}{n}\sum_{i=1}^{n}X_{i}^{2}-\bar{X}_{n}^{2}. \]
By the LLN, \(\frac{1}{n}\sum_{i=1}^{n}X_{i}^{2}\rightarrow _{p}\mathrm{E}\left[X_{i}^{2}\right]\) and \(\bar{X}_{n}\rightarrow _{p}\mathrm{E}\left[X_{i}\right].\)
By Slutsky’s Lemma, \(\bar{X}_{n}^{2}\rightarrow _{p}\left( \mathrm{E}\left[X_{i}\right]\right) ^{2}.\)
Thus, \[ \frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) ^{2}=\frac{1}{n}\sum_{i=1}^{n}X_{i}^{2}-\bar{X}_{n}^{2}\rightarrow _{p}\mathrm{E}\left[X_{i}^{2}\right] -\left( \mathrm{E}\left[X_{i}\right]\right) ^{2}=\mathrm{Var}\left(X_{i}\right). \]
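The consistency result can be illustrated by simulation: the OLS slope computed from a single large sample is close to \(\beta_{1}\) even with non-normal errors, as long as \(\mathrm{E}\left[U_{i}\right]=\mathrm{E}\left[X_{i}U_{i}\right]=0\). A sketch (all constants are arbitrary choices):

```python
import random

random.seed(4)
beta0, beta1 = 1.0, 2.0

def ols_slope(n):
    # X ~ N(0,1); U = centered Exponential(1): non-normal, mean zero, independent of X
    x = [random.gauss(0, 1) for _ in range(n)]
    u = [random.expovariate(1.0) - 1.0 for _ in range(n)]
    y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]
    xbar = sum(x) / n
    num = sum((xi - xbar) * yi for xi, yi in zip(x, y))
    den = sum((xi - xbar) ** 2 for xi in x)
    return num / den

b_small, b_large = ols_slope(50), ols_slope(100_000)
print(b_small, b_large)  # b_large should be very close to beta1 = 2
```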
Under similar conditions to 1–4, one can establish consistency of OLS for the multiple linear regression model: \[ Y_{i}=\beta _{0}+\beta _{1}X_{1,i}+\ldots +\beta _{k}X_{k,i}+U_{i}, \] where \(\mathrm{E}\left[U_{i}\right]=0.\)
The key assumption is that the errors and regressors are uncorrelated: \[ \mathrm{E}\left[X_{1,i}U_{i}\right] =\ldots =\mathrm{E}\left[X_{k,i}U_{i}\right] =0. \]
Suppose that the true model has two regressors: \[ \begin{aligned} &Y_{i}=\beta _{0}+\beta _{1}X_{1,i}+\beta _{2}X_{2,i}+U_{i}, \\ &\mathrm{E}\left[X_{1,i}U_{i}\right] =\mathrm{E}\left[X_{2,i}U_{i}\right] =0. \end{aligned} \]
Suppose that the econometrician includes only \(X_{1}\) in the regression when estimating \(\beta _{1}\):
\[ \begin{aligned} \tilde{\beta}_{1,n} &= \frac{\sum_{i=1}^{n}\left( X_{1,i}-\bar{X}_{1,n}\right) Y_{i}}{\sum_{i=1}^{n}\left( X_{1,i}-\bar{X}_{1,n}\right) ^{2}} \\ &= \frac{\sum_{i=1}^{n}\left( X_{1,i}-\bar{X}_{1,n}\right) \left( \beta _{0}+\beta _{1}X_{1,i}+\beta _{2}X_{2,i}+U_{i}\right) }{\sum_{i=1}^{n}\left( X_{1,i}-\bar{X}_{1,n}\right) ^{2}} \\ &= \beta _{1}+\beta _{2}\frac{\sum_{i=1}^{n}\left( X_{1,i}-\bar{X}_{1,n}\right) X_{2,i}}{\sum_{i=1}^{n}\left( X_{1,i}-\bar{X}_{1,n}\right) ^{2}} \\ &\quad +\frac{\sum_{i=1}^{n}\left( X_{1,i}-\bar{X}_{1,n}\right) U_{i}}{\sum_{i=1}^{n}\left( X_{1,i}-\bar{X}_{1,n}\right) ^{2}}. \end{aligned} \]
Dividing numerator and denominator by \(n\) and applying the LLN as before:
The noise term vanishes: \[ \frac{\frac{1}{n}\sum_{i=1}^{n}\left( X_{1,i}-\bar{X}_{1,n}\right) U_{i}}{\frac{1}{n}\sum_{i=1}^{n}\left( X_{1,i}-\bar{X}_{1,n}\right) ^{2}} \rightarrow _{p} \frac{\mathrm{Cov}\left(X_{1,i},U_{i}\right)}{\mathrm{Var}\left(X_{1,i}\right)} = 0. \]
The bias term converges: \[ \frac{\frac{1}{n}\sum_{i=1}^{n}\left( X_{1,i}-\bar{X}_{1,n}\right) X_{2,i}}{\frac{1}{n}\sum_{i=1}^{n}\left( X_{1,i}-\bar{X}_{1,n}\right) ^{2}} \rightarrow _{p} \frac{\mathrm{Cov}\left(X_{1,i},X_{2,i}\right)}{\mathrm{Var}\left(X_{1,i}\right)}. \]
Therefore, \[ \tilde{\beta}_{1,n} \rightarrow _{p} \beta _{1}+\beta _{2}\frac{\mathrm{Cov}\left(X_{1,i},X_{2,i}\right)}{\mathrm{Var}\left(X_{1,i}\right)}. \]
\(\tilde{\beta}_{1,n}\) is inconsistent unless:
\(\beta _{2}=0\) (the model is correctly specified).
\(\mathrm{Cov}\left(X_{1,i},X_{2,i}\right)=0\) (the omitted variable is uncorrelated with the included regressor).
In this example, the model contains two regressors: \[ \begin{aligned} &Y_{i}=\beta _{0}+\beta _{1}X_{1,i}+\beta _{2}X_{2,i}+U_{i}, \\ &\mathrm{E}\left[X_{1,i}U_{i}\right] =\mathrm{E}\left[X_{2,i}U_{i}\right] =0. \end{aligned} \]
However, since \(X_{2}\) is not controlled for, it goes into the error term: \[ \begin{aligned} Y_{i} &= \beta _{0}+\beta _{1}X_{1,i}+V_{i},\text{ where} \\ V_{i} &= \beta _{2}X_{2,i}+U_{i}. \end{aligned} \]
For consistency of \(\tilde{\beta}_{1,n}\) we need \(\mathrm{Cov}\left(X_{1,i},V_{i}\right) = 0\); however,
\[ \begin{aligned} \mathrm{Cov}\left(X_{1,i},V_{i}\right) &= \mathrm{Cov}\left(X_{1,i},\beta _{2}X_{2,i}+U_{i}\right) \\ &= \mathrm{Cov}\left(X_{1,i},\beta _{2}X_{2,i}\right)+\mathrm{Cov}\left(X_{1,i},U_{i}\right) \\ &= \beta _{2}\mathrm{Cov}\left(X_{1,i},X_{2,i}\right)+0 \\ &\neq 0\text{, unless }\beta _{2}=0\text{ or }\mathrm{Cov}\left(X_{1,i},X_{2,i}\right)=0. \end{aligned} \]
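The omitted-variable result can be verified numerically. In the design below, \(\mathrm{Var}\left(X_{1,i}\right)=1\) and \(\mathrm{Cov}\left(X_{1,i},X_{2,i}\right)=0.5\), so the short-regression slope should converge to \(\beta_{1}+0.5\,\beta_{2}\). A sketch (all constants are arbitrary choices):

```python
import random

random.seed(5)
n = 200_000
beta0, beta1, beta2 = 0.0, 1.0, 3.0

x1 = [random.gauss(0, 1) for _ in range(n)]
# X2 = 0.5*X1 + noise, so Cov(X1, X2) = 0.5 while Var(X1) = 1
x2 = [0.5 * a + random.gauss(0, 1) for a in x1]
u = [random.gauss(0, 1) for _ in range(n)]
y = [beta0 + beta1 * a + beta2 * b + e for a, b, e in zip(x1, x2, u)]

# Short regression of Y on X1 only (X2 is omitted)
xbar = sum(x1) / n
slope_short = (sum((a - xbar) * yi for a, yi in zip(x1, y))
               / sum((a - xbar) ** 2 for a in x1))

print(slope_short, beta1 + beta2 * 0.5)  # both approximately 2.5
```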
In the previous lectures, we showed that the OLS estimator has an exact normal distribution when the errors are normally distributed.
In this lecture, we argue that even when the errors are not normally distributed, the OLS estimator has an approximately normal distribution in large samples, provided that some additional conditions hold.
Let \(W_{n}\) be a sequence of random variables indexed by the sample size \(n.\)
We say that \(W_{n}\) has an asymptotically normal distribution if its CDF converges to a normal CDF.
Let \(W\) be any random variable with a normal \(N\left( 0,\sigma^{2}\right)\) distribution and let \(F\) denote its CDF. We say that \(W_{n}\) has an asymptotically normal distribution if for all \(x\in \mathbb{R}\):
\[ F_{n}\left( x\right) =P\left( W_{n}\leq x\right) \rightarrow P\left( W\leq x\right) =F\left( x\right) \text{ as }n\rightarrow \infty . \]
Asymptotic normality is an example of convergence in distribution.
We say that a sequence of random variables \(W_{n}\) converges in distribution to \(W\) (denoted as \(W_{n}\rightarrow _{d}W\)) if the CDF of \(W_{n}\) converges to the CDF of \(W\) at all points where the CDF of \(W\) is continuous.
Convergence in distribution is convergence of the CDFs.
An example of convergence in distribution is a CLT.
Let \(X_{1},\ldots ,X_{n}\) be a sample of iid random variables such that \(\mathrm{E}\left[X_{i}\right] =0\) and \(\mathrm{Var}\left(X_{i}\right) =\sigma ^{2}>0\) (finite). Then, as \(n\rightarrow \infty,\)
\[ \frac{1}{\sqrt{n}}\sum_{i=1}^{n}X_{i}\rightarrow _{d}N\left( 0,\sigma^{2}\right) . \]
\(\rightarrow_{d}\) means that the CDF of the scaled sum converges to the normal CDF; equivalently, for the standardized sum, for every \(x\), \[ P\left(\frac{1}{\sigma\sqrt{n}}\sum_{i=1}^{n}X_{i} \leq x\right) \rightarrow \Phi(x) \text{ as } n \rightarrow \infty, \] where \(\Phi\) is the standard normal CDF. For large \(n\), the distribution of the scaled sum is approximately normal.
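A simulation of the CLT: standardized sums of iid Bernoulli(0.5) draws, although built from a decidedly non-normal distribution, have an empirical CDF close to \(\Phi\). A sketch with arbitrary constants:

```python
import random, math

random.seed(6)
n, reps = 500, 5_000
p, sigma = 0.5, 0.5  # mean and standard deviation of Bernoulli(0.5)

def std_normal_cdf(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

# Standardized sums: (S_n - n*p) / (sigma * sqrt(n))
zs = []
for _ in range(reps):
    s = sum(random.random() < p for _ in range(n))
    zs.append((s - n * p) / (sigma * math.sqrt(n)))

for x in (-1.0, 0.0, 1.0):
    emp = sum(z <= x for z in zs) / reps
    print(x, emp, std_normal_cdf(x))  # empirical CDF is close to Phi(x)
```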
For the CLT we impose three assumptions: (1) iid draws; (2) mean zero; (3) finite, nonzero variance.
If \(X_{1},\ldots ,X_{n}\) are iid but \(\mathrm{E}\left[X_{i}\right] =\mu \neq 0,\) then consider \(X_{i}-\mu.\) Since \(\mathrm{E}\left[X_{i}-\mu\right] =0,\) we have
\[ \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left( X_{i}-\mu \right) \rightarrow_{d}N\left( 0,\mathrm{Var}\left(X_{i}\right) \right) . \]
Then
\[ \begin{aligned} \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left( X_{i}-\mu \right) &= \sqrt{n}\frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\mu \right) \\ &= \sqrt{n}\left( \frac{1}{n}\sum_{i=1}^{n}X_{i}-\frac{1}{n}\sum_{i=1}^{n}\mu \right) \\ &= \sqrt{n}\left( \bar{X}_{n}-\mu \right) . \end{aligned} \]
From the previous slide: \[ \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left( X_{i}-\mu \right) = \sqrt{n}\left( \bar{X}_{n}-\mu \right) . \]
Thus, the CLT can be stated as
\[ \sqrt{n}\left( \bar{X}_{n}-\mu \right) \rightarrow _{d}N\left( 0,\mathrm{Var}\left(X_{i}\right) \right) . \]
By the LLN,
\[ \bar{X}_{n}-\mu \rightarrow _{p}0, \]
and
\[ \mathrm{Var}\left(\sqrt{n}\left( \bar{X}_{n}-\mu \right)\right) = n\mathrm{Var}\left(\bar{X}_{n}\right) = n\frac{\mathrm{Var}\left(X_{i}\right)}{n} = \mathrm{Var}\left(X_{i}\right). \]
Suppose that \(W_{n}\rightarrow _{d}N\left( 0,\sigma ^{2}\right)\) and \(\theta _{n}\rightarrow _{p}\theta.\) Then,
\[ \theta _{n}W_{n}\rightarrow _{d}\theta N\left( 0,\sigma ^{2}\right) \stackrel{d}{=} N\left( 0,\theta ^{2}\sigma ^{2}\right) , \]
and
\[ \theta _{n}+W_{n}\rightarrow _{d}\theta +N\left( 0,\sigma ^{2}\right) \stackrel{d}{=} N\left( \theta ,\sigma ^{2}\right) . \]
Suppose that \(Z_{n}\rightarrow _{d}Z\sim N\left( 0,1\right).\) Then,
\[ Z_{n}^{2}\rightarrow _{d}Z^{2}\equiv \chi _{1}^{2}. \]
If \(W_{n}\rightarrow _{d}c=\) constant, then \(W_{n}\rightarrow _{p}c.\)
Suppose that Assumptions 1–4 hold and, in addition, \(0<\mathrm{E}\left[U_{i}^{2}\right]<\infty\) and \(0<\mathrm{E}\left[\left( X_{i}-\mathrm{E}\left[X_{i}\right]\right) ^{2}U_{i}^{2}\right] <\infty.\)
Let \(\hat{\beta}_{1,n}\) be the OLS estimator of \(\beta _{1}.\) Then,
\[ \sqrt{n}\left( \hat{\beta}_{1,n}-\beta _{1}\right) \rightarrow _{d}N\left( 0,\frac{\mathrm{E}\left[\left( X_{i}-\mathrm{E}\left[X_{i}\right]\right) ^{2}U_{i}^{2}\right]}{\left(\mathrm{Var}\left(X_{i}\right) \right) ^{2}}\right). \]
\(V=\dfrac{\mathrm{E}\left[\left( X_{i}-\mathrm{E}\left[X_{i}\right]\right) ^{2}U_{i}^{2}\right]}{\left(\mathrm{Var}\left(X_{i}\right) \right) ^{2}}\) is called the asymptotic variance of \(\hat{\beta}_{1,n}.\)
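The formula for \(V\) can be checked by simulation. With \(X_{i}\sim N(0,1)\) and heteroskedastic errors \(U_{i}=X_{i}Z_{i}\), where \(Z_{i}\sim N(0,1)\) is independent of \(X_{i}\), we get \(\mathrm{E}\left[\left(X_{i}-\mathrm{E}\left[X_{i}\right]\right)^{2}U_{i}^{2}\right]=\mathrm{E}\left[X_{i}^{4}\right]=3\) and \(\mathrm{Var}\left(X_{i}\right)=1\), so \(V=3\). The variance of \(\sqrt{n}\left(\hat{\beta}_{1,n}-\beta_{1}\right)\) across replications should be near 3 (a sketch; the design is an arbitrary illustrative choice):

```python
import random, statistics, math

random.seed(7)
n, reps, beta1 = 400, 4_000, 1.0

scaled = []
for _ in range(reps):
    x = [random.gauss(0, 1) for _ in range(n)]
    u = [xi * random.gauss(0, 1) for xi in x]  # E[U|X] = 0, E[U^2|X] = X^2
    y = [beta1 * xi + ui for xi, ui in zip(x, u)]
    xbar = sum(x) / n
    num = sum((xi - xbar) * yi for xi, yi in zip(x, y))
    den = sum((xi - xbar) ** 2 for xi in x)
    scaled.append(math.sqrt(n) * (num / den - beta1))

print(statistics.variance(scaled))  # close to V = 3
```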
Let \(\overset{a}{\sim}\) denote “approximately in large samples.”
The asymptotic normality
\[ \sqrt{n}\left( \hat{\beta}_{1,n}-\beta _{1}\right) \rightarrow _{d}N\left(0,V\right) \]
can be viewed as the following large-sample approximation:
\[ \sqrt{n}\left( \hat{\beta}_{1,n}-\beta _{1}\right) \overset{a}{\sim} N\left(0,V\right) , \]
or
\[ \hat{\beta}_{1,n}\overset{a}{\sim} N\left( \beta _{1},V/n\right) . \]
Write
\[ \hat{\beta}_{1,n}=\beta _{1}+\frac{\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) U_{i}}{\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) ^{2}}. \]
Now
\[ \hat{\beta}_{1,n}-\beta _{1}=\frac{\frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) U_{i}}{\frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) ^{2}}, \]
and
\[ \sqrt{n}\left( \hat{\beta}_{1,n}-\beta _{1}\right) =\frac{\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) U_{i}}{\frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) ^{2}}. \]
From the previous slide: \[ \sqrt{n}\left( \hat{\beta}_{1,n}-\beta _{1}\right) =\frac{\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) U_{i}}{\frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) ^{2}}. \]
In Part I, we established
\[ \frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) ^{2}\rightarrow_{p}\mathrm{Var}\left(X_{i}\right). \]
We will show that
\[ \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) U_{i}\rightarrow _{d}N\left( 0,\mathrm{E}\left[\left( X_{i}-\mathrm{E}\left[X_{i}\right]\right)^{2}U_{i}^{2}\right] \right), \]
so that
\[\begin{align*} \sqrt{n}\left( \hat{\beta}_{1,n}-\beta _{1}\right) &= \frac{\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) U_{i}}{\frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) ^{2}} \\ &\rightarrow _{d}\frac{N\left( 0,\mathrm{E}\left[\left( X_{i}-\mathrm{E}\left[X_{i}\right]\right) ^{2}U_{i}^{2}\right] \right)}{\mathrm{Var}\left(X_{i}\right)} \\ &\stackrel{d}{=} N\left( 0,\frac{\mathrm{E}\left[\left( X_{i}-\mathrm{E}\left[X_{i}\right]\right)^{2}U_{i}^{2}\right]}{\left(\mathrm{Var}\left(X_{i}\right) \right) ^{2}}\right). \end{align*}\]
\[ \begin{aligned} \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left( X_{i}-\bar{X}_{n}\right) U_{i} &= \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left( X_{i}-\mathrm{E}\left[X_{i}\right]+\mathrm{E}\left[X_{i}\right]-\bar{X}_{n}\right) U_{i} \\ &= \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left( X_{i}-\mathrm{E}\left[X_{i}\right]\right) U_{i}+\left( \mathrm{E}\left[X_{i}\right]-\bar{X}_{n}\right) \frac{1}{\sqrt{n}}\sum_{i=1}^{n}U_{i}. \end{aligned} \]
We have
\[ \mathrm{E}\left[\left( X_{i}-\mathrm{E}\left[X_{i}\right]\right) U_{i}\right] = \mathrm{E}\left[X_{i}U_{i}\right]-\mathrm{E}\left[X_{i}\right]\mathrm{E}\left[U_{i}\right]=0, \]
and \(0<\mathrm{E}\left[\left( X_{i}-\mathrm{E}\left[X_{i}\right]\right) ^{2}U_{i}^{2}\right] <\infty,\) so by the CLT,
\[ \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left( X_{i}-\mathrm{E}\left[X_{i}\right]\right) U_{i}\rightarrow_{d}N\left( 0,\mathrm{E}\left[\left( X_{i}-\mathrm{E}\left[X_{i}\right]\right) ^{2}U_{i}^{2}\right]\right). \]
It remains to show that
\[ \left( \mathrm{E}\left[X_{i}\right]-\bar{X}_{n}\right) \frac{1}{\sqrt{n}}\sum_{i=1}^{n}U_{i}\rightarrow _{p}0. \]
We have \(\mathrm{E}\left[U_{i}\right]=0\) and \(0<\mathrm{E}\left[U_{i}^{2}\right]<\infty.\) By the CLT,
\[ \frac{1}{\sqrt{n}}\sum_{i=1}^{n}U_{i}\rightarrow _{d}N\left(0,\mathrm{E}\left[U_{i}^{2}\right]\right). \]
By the LLN,
\[ \mathrm{E}\left[X_{i}\right]-\bar{X}_{n}\rightarrow _{p}0. \]
Hence, by the product rule above, \(\left( \mathrm{E}\left[X_{i}\right]-\bar{X}_{n}\right) \frac{1}{\sqrt{n}}\sum_{i=1}^{n}U_{i}\rightarrow _{d}0\cdot N\left(0,\mathrm{E}\left[U_{i}^{2}\right]\right)=0,\) and since convergence in distribution to a constant implies convergence in probability, the result follows.
In Part II, we showed that when the data are iid and the regressors are exogenous, \[ \begin{aligned} Y_{i} &= \beta_{0} + \beta_{1}X_{i} + U_{i}, \\ \mathrm{E}\left[U_{i}\right] &= \mathrm{E}\left[X_{i}U_{i}\right] = 0, \end{aligned} \] the OLS estimator of \(\beta_{1}\) is asymptotically normal: \[ \begin{aligned} \sqrt{n}\left(\hat{\beta}_{1,n} - \beta_{1}\right) &\rightarrow_{d} N\left(0, V\right), \\ V &= \frac{\mathrm{E}\left[\left(X_{i} - \mathrm{E}\left[X_{i}\right]\right)^{2}U_{i}^{2}\right]}{\left(\mathrm{Var}\left(X_{i}\right)\right)^{2}}. \end{aligned} \]
For hypothesis testing, we need a consistent estimator of the asymptotic variance \(V\): \[ \hat{V}_{n} \rightarrow_{p} V. \]
Assume that the errors are homoskedastic: \[ \mathrm{E}\left[U_{i}^{2} \mid X_{i}\right] = \sigma^{2}, \] a constant that does not depend on \(X_{i}.\)
In this case, the asymptotic variance can be simplified using the Law of Iterated Expectation: \[ \begin{aligned} \mathrm{E}\left[\left(X_{i} - \mathrm{E}\left[X_{i}\right]\right)^{2}U_{i}^{2}\right] &= \mathrm{E}\left[\mathrm{E}\left[\left(X_{i} - \mathrm{E}\left[X_{i}\right]\right)^{2}U_{i}^{2} \mid X_{i}\right]\right] \\ &= \mathrm{E}\left[\left(X_{i} - \mathrm{E}\left[X_{i}\right]\right)^{2} \mathrm{E}\left[U_{i}^{2} \mid X_{i}\right]\right] \\ &= \mathrm{E}\left[\left(X_{i} - \mathrm{E}\left[X_{i}\right]\right)^{2} \sigma^{2}\right] \\ &= \sigma^{2}\,\mathrm{E}\left[\left(X_{i} - \mathrm{E}\left[X_{i}\right]\right)^{2}\right] = \sigma^{2}\mathrm{Var}\left(X_{i}\right). \end{aligned} \]
Thus, when the errors are homoskedastic with \(\mathrm{E}\left[U_{i}^{2}\right] = \sigma^{2},\) \[ V = \frac{\mathrm{E}\left[\left(X_{i} - \mathrm{E}\left[X_{i}\right]\right)^{2}U_{i}^{2}\right]}{\left(\mathrm{Var}\left(X_{i}\right)\right)^{2}} = \frac{\sigma^{2}\mathrm{Var}\left(X_{i}\right)}{\left(\mathrm{Var}\left(X_{i}\right)\right)^{2}} = \frac{\sigma^{2}}{\mathrm{Var}\left(X_{i}\right)}. \]
Let \(\hat{U}_{i} = Y_{i} - \hat{\beta}_{0,n} - \hat{\beta}_{1,n}X_{i}\), where \(\hat{\beta}_{0,n}\) and \(\hat{\beta}_{1,n}\) are the OLS estimators of \(\beta_{0}\) and \(\beta_{1}.\)
A consistent estimator for the asymptotic variance can be constructed using the Method of Moments: \[ \begin{aligned} \hat{\sigma}_{n}^{2} &= \frac{1}{n}\sum_{i=1}^{n}\hat{U}_{i}^{2}, \\ \widehat{\mathrm{Var}}\left(X_{i}\right) &= \frac{1}{n}\sum_{i=1}^{n}\left(X_{i} - \bar{X}_{n}\right)^{2}, \text{ and} \\ \hat{V}_{n} &= \frac{\hat{\sigma}_{n}^{2}}{\frac{1}{n}\sum_{i=1}^{n}\left(X_{i} - \bar{X}_{n}\right)^{2}}. \end{aligned} \]
From the previous slide: \[ \begin{aligned} \hat{V}_{n} &= \frac{\hat{\sigma}_{n}^{2}}{\frac{1}{n}\sum_{i=1}^{n}\left(X_{i} - \bar{X}_{n}\right)^{2}}, \\ \hat{\sigma}_{n}^{2} &= \frac{1}{n}\sum_{i=1}^{n}\hat{U}_{i}^{2}, \\ \hat{U}_{i} &= Y_{i} - \hat{\beta}_{0,n} - \hat{\beta}_{1,n}X_{i}. \end{aligned} \]
When proving the consistency of OLS (Part I), we showed that \[ \frac{1}{n}\sum_{i=1}^{n}\left(X_{i} - \bar{X}_{n}\right)^{2} \rightarrow_{p} \mathrm{Var}\left(X_{i}\right), \] and to establish \(\hat{V}_{n} \rightarrow_{p} V,\) we need to show that \(\hat{\sigma}_{n}^{2} \rightarrow_{p} \sigma^{2}.\)
The LLN cannot be applied directly to \[ \frac{1}{n}\sum_{i=1}^{n}\hat{U}_{i}^{2} \] because the \(\hat{U}_{i}\)’s are not iid: they are dependent through \(\hat{\beta}_{0,n}\) and \(\hat{\beta}_{1,n}.\)
First, write \[ \begin{aligned} \hat{U}_{i} &= Y_{i} - \hat{\beta}_{0,n} - \hat{\beta}_{1,n}X_{i} \\ &= \left(\beta_{0} + \beta_{1}X_{i} + U_{i}\right) - \hat{\beta}_{0,n} - \hat{\beta}_{1,n}X_{i} \\ &= U_{i} - \left(\hat{\beta}_{0,n} - \beta_{0}\right) - \left(\hat{\beta}_{1,n} - \beta_{1}\right)X_{i}. \end{aligned} \]
Now, \[ \hat{\sigma}_{n}^{2} = \frac{1}{n}\sum_{i=1}^{n}\hat{U}_{i}^{2} = \frac{1}{n}\sum_{i=1}^{n}\left(U_{i} - \left(\hat{\beta}_{0,n} - \beta_{0}\right) - \left(\hat{\beta}_{1,n} - \beta_{1}\right)X_{i}\right)^{2}. \]
We have \[ \begin{aligned} \hat{\sigma}_{n}^{2} &= \frac{1}{n}\sum_{i=1}^{n}\left(U_{i} - \left(\hat{\beta}_{0,n} - \beta_{0}\right) - \left(\hat{\beta}_{1,n} - \beta_{1}\right)X_{i}\right)^{2} \\ &= \frac{1}{n}\sum_{i=1}^{n}U_{i}^{2} + \left(\hat{\beta}_{0,n} - \beta_{0}\right)^{2} + \left(\hat{\beta}_{1,n} - \beta_{1}\right)^{2}\frac{1}{n}\sum_{i=1}^{n}X_{i}^{2} \\ &\quad -2\left(\hat{\beta}_{0,n} - \beta_{0}\right)\frac{1}{n}\sum_{i=1}^{n}U_{i} - 2\left(\hat{\beta}_{1,n} - \beta_{1}\right)\frac{1}{n}\sum_{i=1}^{n}U_{i}X_{i} \\ &\quad +2\left(\hat{\beta}_{0,n} - \beta_{0}\right)\left(\hat{\beta}_{1,n} - \beta_{1}\right)\frac{1}{n}\sum_{i=1}^{n}X_{i}. \end{aligned} \]
By the LLN, \[ \frac{1}{n}\sum_{i=1}^{n}U_{i}^{2} \rightarrow_{p} \mathrm{E}\left[U_{i}^{2}\right] = \sigma^{2}. \]
Because \(\hat{\beta}_{0,n}\) and \(\hat{\beta}_{1,n}\) are consistent, \[ \hat{\beta}_{0,n} - \beta_{0} \rightarrow_{p} 0 \text{ and } \hat{\beta}_{1,n} - \beta_{1} \rightarrow_{p} 0, \] while, by the LLN, \(\frac{1}{n}\sum_{i=1}^{n}X_{i},\) \(\frac{1}{n}\sum_{i=1}^{n}X_{i}^{2},\) \(\frac{1}{n}\sum_{i=1}^{n}U_{i},\) and \(\frac{1}{n}\sum_{i=1}^{n}U_{i}X_{i}\) all converge in probability to finite limits. Hence every term except \(\frac{1}{n}\sum_{i=1}^{n}U_{i}^{2}\) vanishes in probability, and \(\hat{\sigma}_{n}^{2} \rightarrow_{p} \sigma^{2}.\)
Thus, when the errors are homoskedastic, \[ \hat{V}_{n} = \frac{\hat{\sigma}_{n}^{2}}{\frac{1}{n}\sum_{i=1}^{n}\left(X_{i} - \bar{X}_{n}\right)^{2}}, \text{ with } \hat{\sigma}_{n}^{2} = \frac{1}{n}\sum_{i=1}^{n}\hat{U}_{i}^{2}, \] is a consistent estimator of \(V = \frac{\sigma^{2}}{\mathrm{Var}\left(X_{i}\right)}.\)
Similarly, \[ s^{2} = \frac{1}{n-2}\sum_{i=1}^{n}\hat{U}_{i}^{2} \rightarrow_{p} \sigma^{2}, \] and therefore \[ \hat{V}_{n} = \frac{s^{2}}{\frac{1}{n}\sum_{i=1}^{n}\left(X_{i} - \bar{X}_{n}\right)^{2}} \] is also a consistent estimator of \(V = \frac{\sigma^{2}}{\mathrm{Var}\left(X_{i}\right)}.\)
This version has an advantage over the one with \(\hat{\sigma}_{n}^{2}\): in addition to being consistent, \(s^{2}\) is also an unbiased estimator of \(\sigma^{2}\) if the regressors are strongly exogenous.
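A numerical check of this estimator: in a homoskedastic design with \(\sigma^{2}=4\) and \(\mathrm{Var}\left(X_{i}\right)=1\), \(\hat{V}_{n}\) computed from one large sample should be close to \(V=\sigma^{2}/\mathrm{Var}\left(X_{i}\right)=4\). A sketch with arbitrary constants:

```python
import random

random.seed(8)
n, beta0, beta1, sigma2 = 100_000, 1.0, 2.0, 4.0

x = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, sigma2 ** 0.5) for _ in range(n)]  # homoskedastic errors
y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]

# OLS estimates
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar

# Method-of-moments variance estimator
sigma2_hat = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / n
V_hat = sigma2_hat / (sxx / n)

print(V_hat)  # close to sigma^2 / Var(X) = 4
```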
The result \(\sqrt{n}\left(\hat{\beta}_{1,n} - \beta_{1}\right) \rightarrow_{d} N\left(0, V\right)\) is used as the following approximation: \[ \hat{\beta}_{1,n} \overset{a}{\sim} N\left(\beta_{1}, \frac{V}{n}\right), \] where \(\overset{a}{\sim}\) denotes approximately in large samples. Thus, the variance of \(\hat{\beta}_{1,n}\) can be taken as approximately \(V/n.\)
With \(\hat{V}_{n} = \frac{s^{2}}{\frac{1}{n}\sum_{i=1}^{n}\left(X_{i} - \bar{X}_{n}\right)^{2}}\) we have \[ \frac{\hat{V}_{n}}{n} = \frac{s^{2}}{\frac{1}{n}\sum_{i=1}^{n}\left(X_{i} - \bar{X}_{n}\right)^{2}} \cdot \frac{1}{n} = \frac{s^{2}}{\sum_{i=1}^{n}\left(X_{i} - \bar{X}_{n}\right)^{2}}. \]
From the previous slide: \[ \frac{\hat{V}_{n}}{n} = \frac{s^{2}}{\sum_{i=1}^{n}\left(X_{i} - \bar{X}_{n}\right)^{2}} \]
Thus, in the case of homoskedastic errors we have the following asymptotic approximation: \[ \hat{\beta}_{1,n} \overset{a}{\sim} N\left(\beta_{1}, \frac{s^{2}}{\sum_{i=1}^{n}\left(X_{i} - \bar{X}_{n}\right)^{2}}\right). \]
In finite samples, the corresponding result holds exactly when the regressors are strongly exogenous and the errors are normal: conditional on \(\mathbf{X}\), \(\hat{\beta}_{1,n}\) is exactly normal, and replacing \(\sigma^{2}\) with \(s^{2}\) yields the \(t_{n-2}\) distribution for the studentized statistic.
Consider testing \(H_{0}: \beta_{1} = \beta_{1,0}\) vs \(H_{1}: \beta_{1} \neq \beta_{1,0}.\)
Consider the behavior of the \(T\) statistic under \(H_{0}: \beta_{1} = \beta_{1,0}\). Since \[ \sqrt{n}\left(\hat{\beta}_{1,n} - \beta_{1}\right) \rightarrow_{d} N\left(0, V\right) \text{ and } \hat{V}_{n} \rightarrow_{p} V, \] we have \[ \begin{aligned} T = \frac{\hat{\beta}_{1,n} - \beta_{1,0}}{\sqrt{\hat{V}_{n}/n}} &= \frac{\sqrt{n}\left(\hat{\beta}_{1,n} - \beta_{1,0}\right)}{\sqrt{\hat{V}_{n}}} \\ &\overset{H_{0}}{=} \frac{\sqrt{n}\left(\hat{\beta}_{1,n} - \beta_{1}\right)}{\sqrt{\hat{V}_{n}}} \\ &\rightarrow_{d} \frac{N\left(0, V\right)}{\sqrt{V}} \stackrel{d}{=} N\left(0, 1\right). \end{aligned} \]
Under \(H_{0}: \beta_{1} = \beta_{1,0},\) \[ T = \frac{\hat{\beta}_{1,n} - \beta_{1,0}}{\sqrt{\hat{V}_{n}/n}} \rightarrow_{d} N\left(0, 1\right), \] provided that \(\hat{V}_{n} \rightarrow_{p} V\) (the asymptotic variance of \(\hat{\beta}_{1,n}\)).
An asymptotic size \(\alpha\) test rejects \(H_{0}: \beta_{1} = \beta_{1,0}\) against \(H_{1}: \beta_{1} \neq \beta_{1,0}\) when \[ \left|T\right| > z_{1-\alpha/2}, \] where \(z_{1-\alpha/2}\) is the \(1-\alpha/2\) quantile of the standard normal distribution (e.g. \(z_{0.975} \approx 1.96\) for \(\alpha = 0.05\)).
Asymptotically, we can treat the variance of the OLS estimator as known: replacing the unknown \(V\) with a consistent estimator \(\hat{V}_{n}\) does not change the limiting distribution.
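The asymptotic size of the test can be verified by simulation: under \(H_{0}\), even with non-normal errors, the rejection frequency of \(\left|T\right|>1.96\) should be close to 5%. A sketch (all constants are arbitrary choices):

```python
import random, math

random.seed(9)
n, reps, beta1 = 300, 4_000, 1.0
rejections = 0

for _ in range(reps):
    x = [random.gauss(0, 1) for _ in range(n)]
    u = [random.expovariate(1.0) - 1.0 for _ in range(n)]  # non-normal, mean zero
    y = [beta1 * xi + ui for xi, ui in zip(x, u)]
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    s2 = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    T = (b1 - beta1) / math.sqrt(s2 / sxx)  # test of the true H0: beta1 = 1
    rejections += abs(T) > 1.96

print(rejections / reps)  # close to the nominal 0.05
```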
In general, the errors are heteroskedastic: \(\mathrm{E}\left[U_{i}^{2} \mid X_{i}\right]\) is not constant and changes with \(X_{i}.\)
In this case, \(\hat{V}_{n} = \frac{s^{2}}{\frac{1}{n}\sum_{i=1}^{n}\left(X_{i} - \bar{X}_{n}\right)^{2}}\) is not a consistent estimator of the asymptotic variance \(V = \frac{\mathrm{E}\left[\left(X_{i} - \mathrm{E}\left[X_{i}\right]\right)^{2}U_{i}^{2}\right]}{\left(\mathrm{Var}\left(X_{i}\right)\right)^{2}}\): \[ \begin{aligned} \frac{s^{2}}{\frac{1}{n}\sum_{i=1}^{n}\left(X_{i} - \bar{X}_{n}\right)^{2}} &\rightarrow_{p} \frac{\mathrm{E}\left[U_{i}^{2}\right]}{\mathrm{Var}\left(X_{i}\right)} \\ &= \frac{\mathrm{Var}\left(X_{i}\right)\cdot\mathrm{E}\left[U_{i}^{2}\right]}{\left(\mathrm{Var}\left(X_{i}\right)\right)^{2}} \\ &\neq \frac{\mathrm{E}\left[\left(X_{i} - \mathrm{E}\left[X_{i}\right]\right)^{2}U_{i}^{2}\right]}{\left(\mathrm{Var}\left(X_{i}\right)\right)^{2}}. \end{aligned} \]
In the case of heteroskedastic errors, a consistent estimator of \(V = \frac{\mathrm{E}\left[\left(X_{i} - \mathrm{E}\left[X_{i}\right]\right)^{2}U_{i}^{2}\right]}{\left(\mathrm{Var}\left(X_{i}\right)\right)^{2}}\) can be constructed as follows: \[ \hat{V}_{n}^{HC} = \frac{\frac{1}{n}\sum_{i=1}^{n}\left(X_{i} - \bar{X}_{n}\right)^{2}\hat{U}_{i}^{2}}{\left(\frac{1}{n}\sum_{i=1}^{n}\left(X_{i} - \bar{X}_{n}\right)^{2}\right)^{2}}. \]
One can show that \(\hat{V}_{n}^{HC} \rightarrow_{p} V\) whether the errors are heteroskedastic or homoskedastic.
We have the following asymptotic approximation: \[ \hat{\beta}_{1,n} \overset{a}{\sim} N\left(\beta_{1}, \frac{\hat{V}_{n}^{HC}}{n}\right), \] and the standard errors can be computed as \(\mathrm{se}\left(\hat{\beta}_{1,n}\right) = \sqrt{\hat{V}_{n}^{HC}/n}.\)
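A pure-Python sketch comparing the two variance estimators under heteroskedasticity. In the design below (\(X_{i}\sim N(0,1)\), \(U_{i}=X_{i}Z_{i}\) with \(Z_{i}\sim N(0,1)\) independent of \(X_{i}\)), the true asymptotic variance is \(V=\mathrm{E}\left[X_{i}^{4}\right]=3\), while the homoskedastic-only formula converges to \(\mathrm{E}\left[U_{i}^{2}\right]/\mathrm{Var}\left(X_{i}\right)=1\); the constants and design are arbitrary illustrative choices:

```python
import random

random.seed(10)
n, beta1 = 100_000, 1.0

x = [random.gauss(0, 1) for _ in range(n)]
u = [xi * random.gauss(0, 1) for xi in x]  # heteroskedastic: E[U^2|X] = X^2
y = [beta1 * xi + ui for xi, ui in zip(x, u)]

xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
uhat2 = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]

varx_hat = sxx / n
V_homo = (sum(uhat2) / n) / varx_hat  # homoskedastic formula: inconsistent here
V_hc = (sum((xi - xbar) ** 2 * e2 for xi, e2 in zip(x, uhat2)) / n) / varx_hat ** 2

print(V_homo, V_hc)  # approximately 1 and approximately 3 (the correct V)
```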
In R, heteroskedasticity-consistent (HC) standard errors can be obtained from the sandwich package, for example by passing vcovHC() to coeftest() from the lmtest package for a fitted lm model:
Standard (homoskedastic) standard errors:
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.872735 0.728964 -3.9408 9.225e-05 ***
educ 0.598965 0.051284 11.6795 < 2.2e-16 ***
exper 0.022340 0.012057 1.8528 0.06447 .
tenure 0.169269 0.021645 7.8204 2.935e-14 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
HC (robust) standard errors:
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.872735 0.807415 -3.5579 0.0004078 ***
educ 0.598965 0.061014 9.8169 < 2.2e-16 ***
exper 0.022340 0.010555 2.1165 0.0347731 *
tenure 0.169269 0.029278 5.7814 1.277e-08 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1