Lecture 12: Hypothesis testing in multiple regression

Economics 326 — Introduction to Econometrics II

Author

Vadim Marmer, UBC

The model

  • We consider the classical normal linear regression model:

    1. Y_{i}=\beta_{0}+\beta_{1}X_{1,i}+\ldots+\beta_{k}X_{k,i}+U_{i}.

    2. Conditional on \mathbf{X}, \mathrm{E}\left[U_{i} \mid \mathbf{X}\right]=0 for all i’s.

    3. Conditional on \mathbf{X}, \mathrm{E}\left[U_{i}^{2} \mid \mathbf{X}\right]=\sigma^{2} for all i’s.

    4. Conditional on \mathbf{X}, \mathrm{E}\left[U_{i}U_{j} \mid \mathbf{X}\right]=0 for all i\neq j.

    5. Conditional on \mathbf{X}, U_{i}’s are jointly normally distributed.

  • We also continue to assume no perfect multicollinearity: the k regressors and the constant term do not form a perfect linear combination, i.e., there are no constants c_{1},\ldots,c_{k},c_{k+1} (not all equal to zero) such that for all i’s:

    c_{1}X_{1,i}+\ldots+c_{k}X_{k,i}+c_{k+1}=0.

Testing a single coefficient

  • Take the j-th coefficient \beta_{j}, j\in\left\{0,1,\ldots,k\right\}.

  • Under our assumptions, conditional on \mathbf{X}, the OLS estimator \hat{\beta}_{j} satisfies \hat{\beta}_{j}\sim N\left(\beta_{j},\mathrm{Var}\left(\hat{\beta}_{j} \mid \mathbf{X}\right)\right), where \mathrm{Var}\left(\hat{\beta}_{j} \mid \mathbf{X}\right)=\sigma^{2}/\sum_{i=1}^{n}\tilde{X}_{j,i}^{2} (see Lecture 11).

  • Therefore, \left(\hat{\beta}_{j}-\beta_{j}\right)/\sqrt{\mathrm{Var}\left(\hat{\beta}_{j} \mid \mathbf{X}\right)}\sim N\left(0,1\right).

  • The conditional variance \mathrm{Var}\left(\hat{\beta}_{j} \mid \mathbf{X}\right) is unknown because \sigma^{2} is unknown. The estimator for \mathrm{Var}\left(\hat{\beta}_{j} \mid \mathbf{X}\right) is

    \widehat{\mathrm{Var}}\left(\hat{\beta}_{j}\right)=\frac{s^{2}}{\sum_{i=1}^{n}\tilde{X}_{j,i}^{2}},

    where s^{2}=\sum_{i=1}^{n}\hat{U}_{i}^{2}/\left(n-k-1\right) (see Lecture 10).

Testing a single coefficient

  • We have that conditional on \mathbf{X},

    \frac{\hat{\beta}_{j}-\beta_{j}}{\sqrt{\widehat{\mathrm{Var}}\left(\hat{\beta}_{j}\right)}}\sim t_{n-k-1}.
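  • Where the t distribution comes from (a standard argument, sketched here for completeness): replacing \sigma^{2} with s^{2} divides a standard normal by the square root of an independent \chi^{2} variable over its degrees of freedom,

    \frac{\hat{\beta}_{j}-\beta_{j}}{\sqrt{\widehat{\mathrm{Var}}\left(\hat{\beta}_{j}\right)}}=\frac{\left(\hat{\beta}_{j}-\beta_{j}\right)/\sqrt{\mathrm{Var}\left(\hat{\beta}_{j} \mid \mathbf{X}\right)}}{\sqrt{s^{2}/\sigma^{2}}}\sim\frac{N\left(0,1\right)}{\sqrt{\chi_{n-k-1}^{2}/\left(n-k-1\right)}}=t_{n-k-1},

    since, conditional on \mathbf{X}, \left(n-k-1\right)s^{2}/\sigma^{2}\sim\chi_{n-k-1}^{2} independently of \hat{\beta}_{j}.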

  • Standard error: \mathrm{se}\left(\hat{\beta}_{j}\right)=\sqrt{\widehat{\mathrm{Var}}\left(\hat{\beta}_{j}\right)}=\sqrt{s^{2}/\sum_{i=1}^{n}\tilde{X}_{j,i}^{2}}.
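  • The standard error formula can be verified numerically. A minimal sketch (illustrative only: it uses the built-in mtcars data and a hypothetical regression mpg ~ wt + hp, and it assumes \tilde{X}_{j,i} denotes the residual from regressing X_{j} on the remaining regressors and a constant, as is standard):

    ```r
    # Sketch: verify se(beta_hat_j) = sqrt(s^2 / sum(xtilde_j^2)).
    # Data and regression are illustrative (built-in mtcars), not the lecture's.
    fit <- lm(mpg ~ wt + hp, data = mtcars)
    # xtilde: residuals from regressing wt on the other regressor and a constant
    xt <- resid(lm(wt ~ hp, data = mtcars))
    s2 <- sum(resid(fit)^2) / fit$df.residual   # s^2 = RSS / (n - k - 1)
    se_manual <- sqrt(s2 / sum(xt^2))
    all.equal(se_manual, coef(summary(fit))["wt", "Std. Error"])  # TRUE
    ```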

Testing a single coefficient: two-sided

  • Consider testing H_{0}:\beta_{j}=\beta_{j,0} against H_{1}:\beta_{j}\neq\beta_{j,0}.

  • Under H_{0}, we have that

    T=\frac{\hat{\beta}_{j}-\beta_{j,0}}{\sqrt{\widehat{\mathrm{Var}}\left(\hat{\beta}_{j}\right)}}\sim t_{n-k-1}.

  • Let t_{df,\tau} be the \tau-th quantile of the t_{df} distribution.

  • Test: Reject H_{0} when \left\vert T\right\vert>t_{n-k-1,1-\alpha/2}.

  • P-value: \text{p-value}=2\left(1-F_{t_{n-k-1}}\!\left(\left\vert T\right\vert\right)\right), where F_{t_{n-k-1}} is the CDF of the t_{n-k-1} distribution.
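  • The test and p-value are easy to compute by hand in R. A minimal sketch (illustrative data: built-in mtcars, hypothetical regression mpg ~ wt + hp, testing H_{0}:\beta_{wt}=0 at \alpha=0.05):

    ```r
    # Sketch of the two-sided t-test for a single coefficient, by hand.
    fit <- lm(mpg ~ wt + hp, data = mtcars)
    est <- coef(summary(fit))["wt", "Estimate"]
    se  <- coef(summary(fit))["wt", "Std. Error"]
    df  <- fit$df.residual                     # n - k - 1
    T_stat <- (est - 0) / se                   # H0: beta_wt = 0
    pval   <- 2 * (1 - pt(abs(T_stat), df))    # two-sided p-value
    reject <- abs(T_stat) > qt(1 - 0.05 / 2, df)
    # T_stat and pval match the "t value" and "Pr(>|t|)" columns of summary(fit)
    ```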

Testing a linear combination of coefficients

  • Let c_{0},c_{1},\ldots,c_{k},r be some constants. Consider testing

    \begin{aligned} H_{0} &: c_{0}\beta_{0}+c_{1}\beta_{1}+\ldots+c_{k}\beta_{k}=r \text{ against} \\ H_{1} &: c_{0}\beta_{0}+c_{1}\beta_{1}+\ldots+c_{k}\beta_{k}\neq r. \end{aligned}

  • Example 1: Consider \ln Y_{i}=\beta_{0}+\beta_{1}\ln L_{i}+\beta_{2}\ln K_{i}+U_{i}. To test for constant returns to scale, H_{0}:\beta_{1}+\beta_{2}=1, set c_{0}=0, c_{1}=1, c_{2}=1, r=1.

  • Example 2: Consider \ln\left(\mathit{Wage}_{i}\right) = \beta_{0}+\beta_{1}\mathit{Experience}_{i}+\beta_{2}\mathit{PrevExperience}_{i}+\ldots+U_{i}. To test that the two experience variables have the same effect on wage, H_{0}:\beta_{1}-\beta_{2}=0, set c_{0}=0, c_{1}=1, c_{2}=-1, c_{3}=\ldots=c_{k}=0, r=0.

  • Example 3: Consider \ln\left(\mathit{Wage}_{i}\right)=\beta_{0}+\beta_{1}\mathit{Exper}_{i}+\beta_{2}\mathit{Exper}_{i}^{2}+\ldots+U_{i}. The marginal effect of experience is \beta_{1}+2\beta_{2}\mathit{Exper}_{i}. If the wage-experience profile is concave (\beta_{2}<0), the marginal effect is smallest at the highest experience level. To test whether the marginal effect equals zero at \mathit{Exper}=20: H_{0}:\beta_{1}+40\beta_{2}=0, with c_{1}=1, c_{2}=40, r=0.

Testing a linear combination of coefficients

  • We have that under H_{0}: c_{0}\beta_{0}+c_{1}\beta_{1}+\ldots+c_{k}\beta_{k}=r,

    \begin{aligned} & \frac{c_{0}\hat{\beta}_{0}+c_{1}\hat{\beta}_{1}+\ldots+c_{k}\hat{\beta}_{k}-r}{\sqrt{\mathrm{Var}\left(c_{0}\hat{\beta}_{0}+c_{1}\hat{\beta}_{1}+\ldots+c_{k}\hat{\beta}_{k} \mid \mathbf{X}\right)}} \\ &= \frac{c_{0}\hat{\beta}_{0}+c_{1}\hat{\beta}_{1}+\ldots+c_{k}\hat{\beta}_{k}-\left(c_{0}\beta_{0}+c_{1}\beta_{1}+\ldots+c_{k}\beta_{k}\right)}{\sqrt{\mathrm{Var}\left(c_{0}\hat{\beta}_{0}+c_{1}\hat{\beta}_{1}+\ldots+c_{k}\hat{\beta}_{k} \mid \mathbf{X}\right)}} \\ & \sim N\left(0,1\right). \end{aligned}

  • The variance of the linear combination is

    \begin{align*} &\mathrm{Var}\left(c_{0}\hat{\beta}_{0}+c_{1}\hat{\beta}_{1}+\ldots+c_{k}\hat{\beta}_{k} \mid \mathbf{X}\right) \\ &= \sum_{j=0}^{k}c_{j}^{2}\mathrm{Var}\left(\hat{\beta}_{j} \mid \mathbf{X}\right) + \sum_{j=0}^{k}\sum_{l\neq j}c_{j}c_{l}\mathrm{Cov}\left(\hat{\beta}_{j},\hat{\beta}_{l} \mid \mathbf{X}\right). \end{align*}
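  • This double-sum is the quadratic form c^{\prime}Vc, where V is the variance-covariance matrix of the estimates. A minimal sketch (illustrative mtcars regression; the combination \beta_{wt}-\beta_{hp} is hypothetical):

    ```r
    # Sketch: Var(c0*b0 + ... + ck*bk | X) as the quadratic form c' V c.
    fit  <- lm(mpg ~ wt + hp, data = mtcars)
    cvec <- c(0, 1, -1)                  # c0 = 0, c_wt = 1, c_hp = -1
    V    <- vcov(fit)                    # estimated variance-covariance matrix
    var_qf <- as.numeric(t(cvec) %*% V %*% cvec)
    # The same number from the sum-of-variances-plus-cross-terms formula:
    var_sum <- V["wt", "wt"] + V["hp", "hp"] - 2 * V["wt", "hp"]
    all.equal(var_qf, var_sum)           # TRUE
    ```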

Testing a linear combination of coefficients

  • Consider

    T=\frac{c_{0}\hat{\beta}_{0}+c_{1}\hat{\beta}_{1}+\ldots+c_{k}\hat{\beta}_{k}-r}{\sqrt{\widehat{\mathrm{Var}}\left(c_{0}\hat{\beta}_{0}+c_{1}\hat{\beta}_{1}+\ldots+c_{k}\hat{\beta}_{k}\right)}}.

  • Under H_{0}: c_{0}\beta_{0}+c_{1}\beta_{1}+\ldots+c_{k}\beta_{k}=r,

    T\sim t_{n-k-1}.

  • Two-sided test: Reject H_{0} when \left\vert T\right\vert>t_{n-k-1,1-\alpha/2}.

CRS test: details

  • Consider the model \ln Y_{i}=\beta_{0}+\beta_{1}\ln L_{i}+\beta_{2}\ln K_{i}+U_{i}.

  • We want to test for constant returns to scale: H_{0}:\beta_{1}+\beta_{2}=1.

  • The test statistic: T=\dfrac{\hat{\beta}_{1}+\hat{\beta}_{2}-1}{\sqrt{\widehat{\mathrm{Var}}\left(\hat{\beta}_{1}+\hat{\beta}_{2}\right)}}.

  • The estimated variance:

    \widehat{\mathrm{Var}}\left(\hat{\beta}_{1}+\hat{\beta}_{2}\right)=\widehat{\mathrm{Var}}\left(\hat{\beta}_{1}\right)+\widehat{\mathrm{Var}}\left(\hat{\beta}_{2}\right)+2\widehat{\mathrm{Cov}}\left(\hat{\beta}_{1},\hat{\beta}_{2}\right).

    • \widehat{\mathrm{Var}}\left(\hat{\beta}_{1}\right) and \widehat{\mathrm{Var}}\left(\hat{\beta}_{2}\right) can be computed by squaring the corresponding standard errors reported by R.

    • In R, \widehat{\mathrm{Cov}}\left(\hat{\beta}_{1},\hat{\beta}_{2}\right) can be obtained (together with the variances) by using the command vcov(fit) after running a regression.

  • Reject H_{0}:\beta_{1}+\beta_{2}=1 if \left\vert T\right\vert>t_{n-3,1-\alpha/2}.

Example

  • 1000 observations were generated using the following model:

    \begin{aligned} &\left.\begin{array}{l} L_{i}=e^{l_{i}} \\ K_{i}=e^{k_{i}} \end{array}\right\} \text{ where the pairs } \left(l_{i},k_{i}\right) \text{ are iid across } i \text{, jointly normal with } N\left(0,1\right) \text{ marginals and } \mathrm{Cov}\left(l_{i},k_{i}\right)=0.5, \\ &U_{i}\sim \text{iid } N\left(0,1\right), \text{ independent of } l_{i},k_{i}, \\ &Y_{i}=L_{i}^{0.35}K_{i}^{0.52}e^{U_{i}}. \end{aligned}

  • The following equation was estimated:

    \ln Y_{i}=\beta_{0}+\beta_{1}\ln L_{i}+\beta_{2}\ln K_{i}+U_{i}.

  • We test H_{0}:\beta_{1}+\beta_{2}=1 against H_{1}:\beta_{1}+\beta_{2}\neq 1 at the 5% significance level.

    set.seed(123)
    n <- 1000
    lnL <- rnorm(n)
    lnK <- 0.5 * lnL + sqrt(1 - 0.5^2) * rnorm(n)
    U <- rnorm(n)
    lnY <- 0.35 * lnL + 0.52 * lnK + U

Example: regression output

  • Regression output:

    fit <- lm(lnY ~ lnL + lnK)
    summary(fit)
    
    Call:
    lm(formula = lnY ~ lnL + lnK)
    
    Residuals:
        Min      1Q  Median      3Q     Max 
    -2.8360 -0.6277 -0.0370  0.6538  3.3787 
    
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
    (Intercept) -0.02093    0.03098  -0.676    0.499    
    lnL          0.31263    0.03735   8.371   <2e-16 ***
    lnK          0.55176    0.03555  15.522   <2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 0.9788 on 997 degrees of freedom
    Multiple R-squared:  0.3942,  Adjusted R-squared:  0.393 
    F-statistic: 324.4 on 2 and 997 DF,  p-value: < 2.2e-16
  • The variance-covariance matrix of the coefficient estimates:

    vcov(fit)
                  (Intercept)           lnL           lnK
    (Intercept)  9.598283e-04  1.015829e-05 -4.491794e-05
    lnL          1.015829e-05  1.394680e-03 -7.281792e-04
    lnK         -4.491794e-05 -7.281792e-04  1.263649e-03
  • The critical value t_{n-3,0.975}:

    qt(1 - 0.025, df = fit$df.residual)
    [1] 1.962346

Example: manual calculation

  • From the regression output:

    b1 <- coef(fit)["lnL"]
    b2 <- coef(fit)["lnK"]
    V <- vcov(fit)
    
    cat("b1 =", b1, "\n")
    b1 = 0.3126275 
    cat("b2 =", b2, "\n")
    b2 = 0.5517621 
    cat("Var(b1) =", V["lnL", "lnL"], "\n")
    Var(b1) = 0.00139468 
    cat("Var(b2) =", V["lnK", "lnK"], "\n")
    Var(b2) = 0.001263649 
    cat("Cov(b1, b2) =", V["lnL", "lnK"], "\n")
    Cov(b1, b2) = -0.0007281792 
  • The standard error of \hat{\beta}_{1}+\hat{\beta}_{2}:

    se_sum <- sqrt(V["lnL", "lnL"] + V["lnK", "lnK"] + 2 * V["lnL", "lnK"])
    cat("se(b1 + b2) =", se_sum, "\n")
    se(b1 + b2) = 0.03466944 
  • The test statistic:

    T_stat <- (b1 + b2 - 1) / se_sum
    cat("T =", T_stat, "\n")
    T = -3.911526 
  • The critical value:

    cv <- qt(1 - 0.025, df = fit$df.residual)
    cat("|T| =", abs(T_stat), ", critical value =", cv, "\n")
    |T| = 3.911526 , critical value = 1.962346 
  • Since \left\vert T\right\vert > t_{997,0.975}, we reject H_{0}.

  • Ignoring the covariance gives a wrong standard error and hence a wrong test statistic (here we would still reject at the 5% level, but in general the conclusion can change):

    se_wrong <- sqrt(V["lnL", "lnL"] + V["lnK", "lnK"])
    T_wrong <- (b1 + b2 - 1) / se_wrong
    cat("T (ignoring covariance) =", T_wrong, "\n")
    T (ignoring covariance) = -2.6302 

Reparameterization approach

  • We want to test \beta_{1}+\beta_{2}=1 in \ln Y_{i}=\beta_{0}+\beta_{1}\ln L_{i}+\beta_{2}\ln K_{i}+U_{i}.

  • Define \delta=\beta_{1}+\beta_{2}, or \beta_{2}=\delta-\beta_{1}, so that

    \begin{aligned} \ln Y_{i} &= \beta_{0}+\beta_{1}\ln L_{i}+\beta_{2}\ln K_{i}+U_{i} \\ &= \beta_{0}+\beta_{1}\ln L_{i}+\left(\delta-\beta_{1}\right)\ln K_{i}+U_{i} \\ &= \beta_{0}+\beta_{1}\left(\ln L_{i}-\ln K_{i}\right)+\delta\ln K_{i}+U_{i}. \end{aligned}

  • Generate a new variable D_{i}=\ln L_{i}-\ln K_{i}.

  • Estimate \ln Y_{i}=\beta_{0}+\beta_{1}D_{i}+\delta\ln K_{i}+U_{i}.

  • Test H_{0}:\delta=1 against H_{1}:\delta\neq 1.

Example: reparameterization

  • Reparameterized regression output:

    D <- lnL - lnK
    fit2 <- lm(lnY ~ D + lnK)
    summary(fit2)
    
    Call:
    lm(formula = lnY ~ D + lnK)
    
    Residuals:
        Min      1Q  Median      3Q     Max 
    -2.8360 -0.6277 -0.0370  0.6538  3.3787 
    
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
    (Intercept) -0.02093    0.03098  -0.676    0.499    
    D            0.31263    0.03735   8.371   <2e-16 ***
    lnK          0.86439    0.03467  24.932   <2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 0.9788 on 997 degrees of freedom
    Multiple R-squared:  0.3942,  Adjusted R-squared:  0.393 
    F-statistic: 324.4 on 2 and 997 DF,  p-value: < 2.2e-16
  • The 95% confidence interval for the coefficient on \ln K:

    confint(fit2, "lnK")
            2.5 %   97.5 %
    lnK 0.7963561 0.932423
  • The interval does not include 1, so we reject H_{0}.

  • In the original equation, \hat{\beta}_{1}+\hat{\beta}_{2} equals the coefficient on \ln K in the reparameterized regression, and \mathrm{se}\left(\hat{\beta}_{1}+\hat{\beta}_{2}\right) equals its standard error.

Testing with linearHypothesis() in R

  • The car package provides linearHypothesis(), which directly tests linear restrictions on regression coefficients.

  • Testing for constant returns to scale (\beta_{1}+\beta_{2}=1):

    library(car)
    linearHypothesis(fit, "lnL + lnK = 1")
    
    Linear hypothesis test:
    lnL  + lnK = 1
    
    Model 1: restricted model
    Model 2: lnY ~ lnL + lnK
    
      Res.Df    RSS Df Sum of Sq    F    Pr(>F)    
    1    998 969.76                                
    2    997 955.10  1    14.657 15.3 9.793e-05 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  • linearHypothesis() reports an F-statistic. If T \sim t_{n-k-1}, then F = T^{2} \sim F_{1, n-k-1}.

  • For a single linear restriction, the F-test and the two-sided t-test are equivalent: F = T^{2} and the p-values are identical.
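  • The equivalence can be checked without the car package by fitting the restricted model directly. A sketch reusing the example's simulated data (the restricted regression below, which imposes \beta_{1}+\beta_{2}=1 by moving \ln K to the left-hand side, is my own construction, not from the slides):

    ```r
    # Check that the RSS-based F statistic equals T^2 for the CRS restriction.
    set.seed(123)
    n <- 1000
    lnL <- rnorm(n)
    lnK <- 0.5 * lnL + sqrt(1 - 0.5^2) * rnorm(n)
    lnY <- 0.35 * lnL + 0.52 * lnK + rnorm(n)
    fit_u <- lm(lnY ~ lnL + lnK)                 # unrestricted model
    fit_r <- lm(I(lnY - lnK) ~ I(lnL - lnK))     # imposes beta1 + beta2 = 1
    F_stat <- (sum(resid(fit_r)^2) - sum(resid(fit_u)^2)) /
              (sum(resid(fit_u)^2) / fit_u$df.residual)
    V <- vcov(fit_u)
    T_stat <- (coef(fit_u)["lnL"] + coef(fit_u)["lnK"] - 1) /
              sqrt(V["lnL", "lnL"] + V["lnK", "lnK"] + 2 * V["lnL", "lnK"])
    all.equal(F_stat, as.numeric(T_stat^2))      # TRUE
    ```

    With the example's seed, F_stat reproduces (up to rounding) the F statistic of about 15.3 reported by linearHypothesis() above.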

  • Testing for equal effects (\beta_{1}=\beta_{2}):

    linearHypothesis(fit, "lnL = lnK")
    
    Linear hypothesis test:
    lnL - lnK = 0
    
    Model 1: restricted model
    Model 2: lnY ~ lnL + lnK
    
      Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
    1    998 968.42                                  
    2    997 955.10  1    13.314 13.898 0.0002039 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
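  • The same check works for the equal-effects restriction: imposing \beta_{1}=\beta_{2} amounts to regressing \ln Y on the single regressor \ln L+\ln K. A sketch reusing the example's simulated data (my construction, shown to connect this F statistic to the t-statistic formula with its -2\,\mathrm{Cov} term):

    ```r
    # Equal-effects test (beta1 = beta2) by hand: RSS-based F vs. T^2.
    set.seed(123)
    n <- 1000
    lnL <- rnorm(n)
    lnK <- 0.5 * lnL + sqrt(1 - 0.5^2) * rnorm(n)
    lnY <- 0.35 * lnL + 0.52 * lnK + rnorm(n)
    fit_u <- lm(lnY ~ lnL + lnK)                 # unrestricted model
    fit_r <- lm(lnY ~ I(lnL + lnK))              # imposes beta1 = beta2
    F_stat <- (sum(resid(fit_r)^2) - sum(resid(fit_u)^2)) /
              (sum(resid(fit_u)^2) / fit_u$df.residual)
    V <- vcov(fit_u)
    T_stat <- (coef(fit_u)["lnL"] - coef(fit_u)["lnK"]) /
              sqrt(V["lnL", "lnL"] + V["lnK", "lnK"] - 2 * V["lnL", "lnK"])
    all.equal(F_stat, as.numeric(T_stat^2))      # TRUE
    ```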