Economics 326 — Introduction to Econometrics II
Terminology for Simple Regression
| y | x |
|---|---|
| Dependent variable | Independent variable |
| Explained variable | Explanatory variable |
| Response variable | Control variable |
| Predicted variable | Predictor variable |
| Regressand | Regressor |
The econometrician observes random data:
| observation | dependent variable | regressor |
|---|---|---|
| 1 | Y_{1} | X_{1} |
| 2 | Y_{2} | X_{2} |
| \vdots | \vdots | \vdots |
| n | Y_{n} | X_{n} |
A pair \left( X_{i},Y_{i}\right) is called an observation.
Sample: \left\{ \left( X_{i},Y_{i}\right) : i=1,\ldots,n\right\}.
The population is the joint distribution of \left( X_{i},Y_{i}\right) from which the sample is drawn.
We model the relationship between Y and X using the conditional expectation: \mathrm{E}\left[Y_{i}|X_{i}\right] = \alpha + \beta X_{i}.
Intercept: \alpha = \mathrm{E}\left[Y_{i}|X_{i}=0\right].
Slope: \beta measures the effect of a unit change in X on Y: \begin{aligned} \mathrm{E}\left[Y_{i}|X_{i}=x+1\right] - \mathrm{E}\left[Y_{i}|X_{i}=x\right] &= \left[ \alpha + \beta (x+1)\right] - \left[ \alpha + \beta x\right] \\ &= \beta. \end{aligned}
Marginal effect of X on Y: \beta = \frac{d\mathrm{E}\left[Y_{i}|X_{i}\right]}{dX_{i}}.
The effect is the same for all x.
\alpha and \beta in \mathrm{E}\left[Y_{i}|X_{i}\right] = \alpha + \beta X_{i} are unknown.
Error term (disturbance): U_{i} = Y_{i} - \mathrm{E}\left[Y_{i}|X_{i}\right] = Y_{i} - \left( \alpha + \beta X_{i}\right). The U_{i}'s are unobservable.
The model: \begin{aligned} Y_{i} &= \alpha + \beta X_{i} + U_{i}, \\ \mathrm{E}\left[U_{i}|X_{i}\right] &= 0. \end{aligned}
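A minimal simulation sketch of this model (with assumed illustrative values \alpha = 1 and \beta = 0.5): since \mathrm{E}\left[U_{i}|X_{i}\right] = 0, group averages of Y at adjacent values of X differ by roughly \beta.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta = 1.0, 0.5               # illustrative true parameters (assumed)
n = 100_000

# X takes integer values so conditional means can be estimated by group averages
X = rng.integers(0, 5, size=n).astype(float)
U = rng.normal(0.0, 1.0, size=n)     # drawn independently of X, so E[U | X] = 0
Y = alpha + beta * X + U

# E[Y | X = x+1] - E[Y | X = x] should be close to beta for every x
for x in range(4):
    diff = Y[X == x + 1].mean() - Y[X == x].mean()
    print(f"x = {x}: estimated unit effect = {diff:.3f} (true beta = {beta})")
```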
Consider the following model: \log Y_{i} = \alpha + \beta X_{i} + U_{i}.
In this case, \begin{aligned} \beta &= \frac{d\left( \log Y_{i}\right)}{dX_{i}} \\ &= \frac{dY_{i}/Y_{i}}{dX_{i}} = \frac{dY_{i}/dX_{i}}{Y_{i}}. \end{aligned}
\beta measures the proportional change in Y in response to a unit change in X; 100 \times \beta is the percentage change.
In this model, it is assumed that this proportional change in Y is the same for all values of X (constant).
In \log \left( \text{Wage}_{i}\right) = \alpha + \beta \times \text{Education}_{i} + U_{i}, \beta measures the return to education.
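A quick numeric check of this interpretation, with an assumed return \beta = 0.08 (8% per year of schooling): one more year multiplies the wage by e^{\beta}, and for small \beta the exact proportional change e^{\beta} - 1 is close to \beta itself.

```python
import numpy as np

beta = 0.08   # assumed slope of log(Wage) per year of education

# One more year of education multiplies the wage by exp(beta);
# beta itself is the usual first-order approximation to the change.
exact = np.exp(beta) - 1.0
print(f"approximation beta:       {beta:.4f} ({100 * beta:.2f}%)")
print(f"exact change exp(beta)-1: {exact:.4f} ({100 * exact:.2f}%)")
```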
Consider the following model: \log Y_{i} = \alpha + \beta \log X_{i} + U_{i}.
In this model, \begin{aligned} \beta &= \frac{d\log Y_{i}}{d\log X_{i}} \\ &= \frac{dY_{i}/Y_{i}}{dX_{i}/X_{i}} = \frac{dY_{i}}{dX_{i}}\frac{X_{i}}{Y_{i}}. \end{aligned}
\beta measures elasticity: the percentage change in Y as a response to 1% change in X.
Here, the elasticity is assumed to be the same for all values of X.
Example: Cobb-Douglas production function: Y=\alpha K^{\beta_{1}}L^{\beta_{2}} \Longrightarrow \log Y=\log \alpha + \beta_{1}\log K + \beta_{2}\log L (two regressors: log of capital and log of labour).
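A sketch of how taking logs linearizes the Cobb-Douglas function, with assumed parameters \alpha = 2, \beta_{1} = 0.3, \beta_{2} = 0.7; an ordinary least-squares fit in the logs (via np.linalg.lstsq) recovers them from simulated data.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, b1, b2 = 2.0, 0.3, 0.7                    # assumed technology parameters
n = 5_000

K = rng.lognormal(mean=1.0, sigma=0.5, size=n)   # capital
L = rng.lognormal(mean=1.0, sigma=0.5, size=n)   # labour
U = rng.normal(0.0, 0.1, size=n)                 # error, multiplicative as exp(U)
Y = alpha * K**b1 * L**b2 * np.exp(U)

# log Y = log(alpha) + b1*log(K) + b2*log(L) + U: linear in the logs
Z = np.column_stack([np.ones(n), np.log(K), np.log(L)])
coef, *_ = np.linalg.lstsq(Z, np.log(Y), rcond=None)
print("estimates [log(alpha), beta1, beta2]:", np.round(coef, 3))
print("true values:                         ", np.round([np.log(alpha), b1, b2], 3))
```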
The model: Y_{i} = \alpha + \beta X_{i} + U_{i}.
We assume that \mathrm{E}\left[U_{i}|X_{i}\right] = 0.
First implication: \mathrm{E}\left[U_{i}\right] = 0, since \mathrm{E}\left[U_{i}\right] \overset{\text{Law of Iterated Expectation}}{=} \mathrm{E}\left[\mathrm{E}\left[U_{i}|X_{i}\right]\right] = \mathrm{E}\left[0\right] = 0.
Second implication: \mathrm{Cov}\left(X_{i},U_{i}\right) = \mathrm{E}\left[X_{i}U_{i}\right] = 0, where the first equality uses \mathrm{E}\left[U_{i}\right] = 0: \begin{aligned} \mathrm{E}\left[X_{i}U_{i}\right] &\overset{\text{Law of Iterated Expectation}}{=} \mathrm{E}\left[\mathrm{E}\left[X_{i}U_{i}|X_{i}\right]\right] \\ &= \mathrm{E}\left[X_{i} \mathrm{E}\left[U_{i}|X_{i}\right]\right] = \mathrm{E}\left[X_{i} \cdot 0\right] = 0. \end{aligned}
Y_{i} = \underbrace{\alpha + \beta X_{i}}_{ \text{Predicted by } X} + \underbrace{U_{i}}_{ \text{"Orthogonal to" } X}
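The two implications above are easy to verify by simulation; a sketch (with assumed values \alpha = 1, \beta = 0.5) showing that the sample analogues of \mathrm{E}\left[U_{i}\right] and \mathrm{E}\left[X_{i}U_{i}\right] are close to zero.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, beta, n = 1.0, 0.5, 1_000_000   # illustrative values (assumed)

X = rng.normal(2.0, 1.0, size=n)
U = rng.normal(0.0, 1.0, size=n)       # independent of X, so E[U | X] = 0
Y = alpha + beta * X + U               # the model itself

# Sample analogues of E[U] and E[X U]; both should be near zero
print("mean of U:  ", U.mean())
print("mean of X*U:", (X * U).mean())
```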
Problem: estimate the unknown parameters \alpha and \beta using the data (n observations) on Y and X.
Our assumptions imply: \begin{aligned} \mathrm{E}\left[U_{i}\right] &= \mathrm{E}\left[Y_{i} - \alpha - \beta X_{i}\right] = 0. \\ \mathrm{E}\left[X_{i}U_{i}\right] &= \mathrm{E}\left[X_{i}\left( Y_{i} - \alpha - \beta X_{i}\right)\right] = 0. \end{aligned}
An estimator is a function of the observable data; it can depend only on observable X and Y. Let \hat{\alpha} and \hat{\beta} denote the estimators of \alpha and \beta.
Method of moments: replace expectations with averages. Normal equations: \begin{aligned} \frac{1}{n}\sum_{i=1}^{n}\left( Y_{i} - \hat{\alpha} - \hat{\beta} X_{i}\right) &= 0. \\ \frac{1}{n}\sum_{i=1}^{n}X_{i}\left( Y_{i} - \hat{\alpha} - \hat{\beta} X_{i}\right) &= 0. \end{aligned}
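Dividing each normal equation by n and rearranging gives a 2 \times 2 linear system in \left( \hat{\alpha},\hat{\beta}\right): \hat{\alpha} + \hat{\beta}\bar{X} = \bar{Y} and \hat{\alpha}\bar{X} + \hat{\beta}\,\overline{X^{2}} = \overline{XY}. A sketch solving this system directly on simulated data (assumed true values \alpha = 1, \beta = 0.5):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000
X = rng.normal(1.0, 2.0, size=n)
Y = 1.0 + 0.5 * X + rng.normal(size=n)   # assumed true alpha = 1, beta = 0.5

# Normal equations divided by n:
#   alpha_hat           + beta_hat * mean(X)   = mean(Y)
#   alpha_hat * mean(X) + beta_hat * mean(X^2) = mean(X*Y)
A = np.array([[1.0,       X.mean()],
              [X.mean(), (X**2).mean()]])
b = np.array([Y.mean(), (X * Y).mean()])
alpha_hat, beta_hat = np.linalg.solve(A, b)
print("alpha_hat, beta_hat:", round(alpha_hat, 3), round(beta_hat, 3))
```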
Let \bar{Y}=\frac{1}{n}\sum_{i=1}^{n}Y_{i} and \bar{X}=\frac{1}{n}\sum_{i=1}^{n}X_{i} (averages).
\frac{1}{n}\sum_{i=1}^{n}\left( Y_{i} - \hat{\alpha} - \hat{\beta}X_{i}\right) = 0 implies \begin{aligned} \frac{1}{n}\sum_{i=1}^{n}Y_{i} - \frac{1}{n}\sum_{i=1}^{n}\hat{\alpha} - \hat{\beta}\frac{1}{n}\sum_{i=1}^{n}X_{i} &= 0 \text{ or} \\ \bar{Y} - \hat{\alpha} - \hat{\beta}\bar{X} &= 0. \end{aligned}
The fitted regression line passes through the point of averages \left( \bar{X},\bar{Y}\right).
\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X}.
Substituting \hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X} into the second normal equation gives 0 = \sum_{i=1}^{n}X_{i}\left( Y_{i} - \bar{Y}\right) - \hat{\beta}\sum_{i=1}^{n}X_{i}\left( X_{i} - \bar{X}\right) \text{ or } \hat{\beta} = \frac{\sum_{i=1}^{n}X_{i}\left( Y_{i} - \bar{Y}\right)}{\sum_{i=1}^{n}X_{i}\left( X_{i} - \bar{X}\right)}.
Since \begin{aligned} \sum_{i=1}^{n}X_{i}\left( Y_{i} - \bar{Y}\right) &= \sum_{i=1}^{n}\left( X_{i} - \bar{X}\right) \left( Y_{i} - \bar{Y}\right) = \sum_{i=1}^{n}\left( X_{i} - \bar{X}\right) Y_{i} \\ \sum_{i=1}^{n}X_{i}\left( X_{i} - \bar{X}\right) &= \sum_{i=1}^{n}\left( X_{i} - \bar{X}\right)\left( X_{i} - \bar{X}\right) = \sum_{i=1}^{n}\left( X_{i} - \bar{X}\right)^{2} \end{aligned} we can also write \hat{\beta} = \frac{\sum_{i=1}^{n}\left( X_{i} - \bar{X}\right) Y_{i}}{\sum_{i=1}^{n}\left( X_{i} - \bar{X}\right)^{2}}.
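A direct implementation sketch of these closed-form estimators (deviation form for \hat{\beta}, then \hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X}), with a check that the fitted line passes through \left( \bar{X},\bar{Y}\right):

```python
import numpy as np

def ols_simple(X, Y):
    """Method-of-moments / least-squares estimates for Y = alpha + beta*X + U."""
    Xbar, Ybar = X.mean(), Y.mean()
    beta_hat = np.sum((X - Xbar) * Y) / np.sum((X - Xbar) ** 2)
    alpha_hat = Ybar - beta_hat * Xbar
    return alpha_hat, beta_hat

rng = np.random.default_rng(4)
X = rng.normal(2.0, 1.0, size=1_000)
Y = 1.0 + 0.5 * X + rng.normal(size=1_000)   # assumed true alpha = 1, beta = 0.5

a_hat, b_hat = ols_simple(X, Y)
print("alpha_hat, beta_hat:", round(a_hat, 3), round(b_hat, 3))
# The fitted line passes through the averages: Ybar = alpha_hat + beta_hat * Xbar
print("Ybar - (alpha_hat + beta_hat * Xbar):", Y.mean() - (a_hat + b_hat * X.mean()))
```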
Least squares: minimize the sum of squared deviations Q\left( a,b\right) = \sum_{i=1}^{n}\left( Y_{i} - a - bX_{i}\right)^{2} with respect to a and b.
Partial derivatives: \begin{aligned} \frac{\partial Q\left( a,b\right)}{\partial a} &= -2\sum_{i=1}^{n}\left( Y_{i} - a - bX_{i}\right). \\ \frac{\partial Q\left( a,b\right)}{\partial b} &= -2\sum_{i=1}^{n}\left( Y_{i} - a - bX_{i}\right) X_{i}. \end{aligned}
First-order conditions (with residuals \hat{U}_{i} = Y_{i} - \hat{\alpha} - \hat{\beta}X_{i}): \begin{aligned} 0 &= \sum_{i=1}^{n}\left( Y_{i} - \hat{\alpha} - \hat{\beta}X_{i}\right) = \sum_{i=1}^{n}\hat{U}_{i}. \\ 0 &= \sum_{i=1}^{n}\left( Y_{i} - \hat{\alpha} - \hat{\beta}X_{i}\right) X_{i} = \sum_{i=1}^{n}\hat{U}_{i}X_{i}. \end{aligned}
Method of moments = OLS: \hat{\beta} = \frac{\sum_{i=1}^{n}\left( X_{i} - \bar{X}\right) Y_{i}}{\sum_{i=1}^{n}\left( X_{i} - \bar{X}\right)^{2}} \quad \text{and} \quad \hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X}.
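As a cross-check (a sketch on simulated data), the closed-form estimates agree with a library fit such as np.polyfit, and the residuals satisfy both first-order conditions \sum_{i}\hat{U}_{i} = 0 and \sum_{i}\hat{U}_{i}X_{i} = 0 up to floating-point error:

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(0.0, 1.0, size=500)
Y = 1.0 + 0.5 * X + rng.normal(size=500)   # assumed true alpha = 1, beta = 0.5

# Closed form: method of moments = OLS
b_hat = np.sum((X - X.mean()) * Y) / np.sum((X - X.mean()) ** 2)
a_hat = Y.mean() - b_hat * X.mean()

# Library fit for comparison; np.polyfit returns [slope, intercept] for deg=1
b_lib, a_lib = np.polyfit(X, Y, deg=1)
print("closed form:", a_hat, b_hat)
print("np.polyfit :", a_lib, b_lib)

# First-order conditions hold for the residuals
U_hat = Y - a_hat - b_hat * X
print("sum of residuals:    ", U_hat.sum())
print("sum of X * residuals:", (U_hat * X).sum())
```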