1. Consider the following regression, where airq is an air quality index (lower is better) for a metropolitan area in California, dens1000 is population density in thousands of people per square mile, coas indicates whether the metro area is on the coast, and medi1000 is the median income in the metro area (in thousands of dollars).

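    # Load the data and the package used to display the regression results
    data("Airq", package="Ecdat")
    library(modelsummary)
    # Recode coas as a 0/1 indicator and rescale density and income to thousands
    Airq$coas <- 1*(Airq$coas=="yes")
    Airq$dens1000 <- Airq$dens/1000
    Airq$medi1000 <- Airq$medi/1000
    # dens1000*coas adds the interaction term (the main effects are already listed)
    reg1 <- lm(airq ~ dens1000 + coas + dens1000*coas + medi1000, data=Airq)
    # Report estimates to one decimal place; gof_omit="." hides the goodness-of-fit rows
    modelsummary(reg1, fmt=1, gof_omit=".")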

                         (1)
    (Intercept)        120.6
                       (9.5)
    dens1000            −0.3
                       (2.8)
    coas               −31.2
                      (11.3)
    medi1000             0.8
                       (0.4)
    dens1000 × coas     −1.2
                       (3.4)

    1. Which regressors are statistically significant in this regression?

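      Answer: The intercept and coas are statistically significant at the 5% level (their t-statistics are about 12.7 and 2.8 in absolute value, both above 1.96). medi1000 is marginally statistically significant: using the rounded values in the table, its t-statistic is 0.8/0.4 = 2. Neither dens1000 nor the interaction term dens1000 × coas is statistically significant.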
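
      As a quick check, the t-statistics can be recomputed from the table by dividing each estimate by its standard error. This is only a sketch: the vectors `est` and `se` are hypothetical names, and their values are copied by hand from the rounded table rather than extracted from `reg1`.

      ```r
      # Estimates and standard errors copied by hand from the rounded table above
      est <- c("(Intercept)" = 120.6, dens1000 = -0.3, coas = -31.2,
               medi1000 = 0.8, "dens1000:coas" = -1.2)
      se  <- c("(Intercept)" = 9.5, dens1000 = 2.8, coas = 11.3,
               medi1000 = 0.4, "dens1000:coas" = 3.4)

      # t-statistic for the null hypothesis that each coefficient equals 0
      tstat <- est / se
      print(round(tstat, 2))

      # TRUE for coefficients that are significant at the 5% level
      print(abs(tstat) > 1.96)
      ```

      Because the table is rounded to one decimal place, these t-statistics can differ slightly from the ones computed from the unrounded regression output.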

    2. What is the predicted value of the air quality index for a metro area that has 1,000 people per square mile, is not located on the coast, and has a median income of $50,000?

      Answer: Plugging dens1000 = 1, coas = 0, and medi1000 = 50 into the estimated regression, the predicted value is:

      120.6 - 0.3(1) - 31.2(0) + 0.8(50) - 1.2(1)(0) = 160.3
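
      The same arithmetic can be sketched in R as the sum of each coefficient times its regressor value. The vectors `b` and `x` are hypothetical names, with coefficients copied from the rounded table.

      ```r
      # Coefficients copied by hand from the rounded table above
      b <- c(intercept = 120.6, dens1000 = -0.3, coas = -31.2,
             medi1000 = 0.8, dens1000_coas = -1.2)

      # Regressor values: 1,000 people per square mile, not coastal, $50,000 median income
      dens1000 <- 1
      coas     <- 0
      medi1000 <- 50
      x <- c(1, dens1000, coas, medi1000, dens1000 * coas)

      # The predicted value is the sum of coefficient times regressor
      pred <- sum(b * x)
      print(pred)

      # With the fitted model available, predict() should give a similar number
      # (not identical, since the table is rounded):
      # predict(reg1, newdata = data.frame(dens1000 = 1, coas = 0, medi1000 = 50))
      ```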




  1. Let \(Y\) denote the age of a person in the United States. Suppose your theory is that \(\mathbb{E}[Y] = 35\). You collect a random sample of 100 observations and, using these data, calculate \(\bar{Y} = 37\) and \(\widehat{\mathrm{var}}(Y) = 6\).

    1. Calculate a t-statistic for testing the null hypothesis that \(\mathbb{E}[Y]=35\). Do you reject the null hypothesis here? Explain.

      Answer:

      \[ \begin{aligned} t &= \frac{\sqrt{n}(\bar{Y} - \mu_0)}{\sqrt{\widehat{\mathrm{var}}(Y)}} \\ &= \frac{10(37-35)}{\sqrt{6}} \\ &= \frac{20}{\sqrt{6}} \\ &= 8.16 \end{aligned} \]

      Since \(|t| = 8.16 > 1.96\), we reject the null hypothesis at the 5% significance level. In other words, if the null hypothesis were true, there would be less than a 5% chance of getting a t-statistic this large (in absolute value).
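
      A minimal R sketch of this calculation, using the numbers given in the problem. The variable names are illustrative, and `qnorm(0.975)` supplies the 1.96 critical value.

      ```r
      # Sample information from the problem statement
      n      <- 100   # sample size
      ybar   <- 37    # sample mean
      mu0    <- 35    # hypothesized mean under H0
      varhat <- 6     # estimated variance of Y

      # t-statistic for H0: E[Y] = mu0
      tstat <- sqrt(n) * (ybar - mu0) / sqrt(varhat)
      print(round(tstat, 2))

      # Reject H0 at the 5% level when |t| exceeds the normal critical value
      reject <- abs(tstat) > qnorm(0.975)
      print(reject)
      ```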

    2. What is the standard error of \(\bar{Y}\)?

      Answer:

      \[ \begin{aligned} \textrm{s.e.}(\bar{Y}) &= \frac{\sqrt{\widehat{\mathrm{var}}(Y)}}{\sqrt{n}} \\ &= \frac{\sqrt{6}}{\sqrt{100}} \\ &= 0.245 \end{aligned} \]

    3. Calculate a p-value for the null hypothesis that \(\mathbb{E}[Y]=35\). How do you interpret it?

      Answer:

      \[ \begin{aligned} \textrm{p-value} &= 2 \Phi(-|t|) \\ &= 2 \Phi(-8.16) \\ &= 3\times 10^{-16} \approx 0 \end{aligned} \]
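
      In R, \(\Phi\) is the standard normal cdf `pnorm`, so the two-sided p-value can be sketched as follows (variable names are illustrative):

      ```r
      # t-statistic from the sample information in the problem
      tstat <- sqrt(100) * (37 - 35) / sqrt(6)

      # Two-sided p-value based on the standard normal distribution
      pval <- 2 * pnorm(-abs(tstat))
      print(pval)
      ```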

      The p-value is the probability, if the null hypothesis were true, of getting a t-statistic at least as large in absolute value as the one we calculated. Here it is essentially zero, so we have very strong evidence against \(H_0\).

    4. Calculate a 95% confidence interval for \(\mathbb{E}[Y]\). How do you interpret it?

      Answer:

      \[ \begin{aligned} CI &= [\bar{Y} - 1.96 \textrm{s.e.}(\bar{Y}), \bar{Y} + 1.96 \textrm{s.e.}(\bar{Y})] \\ &= [37 - 1.96 \cdot 0.245, 37 + 1.96 \cdot 0.245] \\ &= [36.52, 37.48] \end{aligned} \]

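      The interval \([36.52, 37.48]\) comes from a procedure that, across repeated random samples, produces intervals containing the true value of \(\mathbb{E}[Y]\) 95% of the time. Note that the hypothesized value of 35 lies outside the interval, which is consistent with rejecting \(H_0\) at the 5% level.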
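
      The standard error and the interval can be checked with a short R sketch (variable names are illustrative). It uses `qnorm(0.975)` rather than the rounded 1.96, so the endpoints agree with the answer above after rounding to two decimal places.

      ```r
      # Sample information from the problem statement
      n      <- 100
      ybar   <- 37
      varhat <- 6

      # Standard error of the sample mean
      se <- sqrt(varhat) / sqrt(n)
      print(round(se, 3))

      # 95% confidence interval: ybar plus or minus the critical value times the standard error
      ci <- ybar + c(-1, 1) * qnorm(0.975) * se
      print(round(ci, 2))
      ```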



  1. Consider the following conditional expectation using country-level data, where \(pcGDP\) is a country’s per capita GDP (in thousands of dollars), \(Inflation\) is the country’s current inflation rate, \(Europe\) is a binary variable indicating whether the country is located in Europe, and where \(Democracy\) is a binary variable indicating whether a country has democratic institutions.

    \[\mathbb{E}[pcGDP|Inflation, Europe, Democracy] = \beta_0 + \beta_1 Inflation + \beta_2 Inflation \cdot Europe + \beta_3 Inflation^2 + \beta_4 Democracy\]

    Further suppose that \(\beta_0 = 45\), \(\beta_1=-1\), \(\beta_2=-2\), \(\beta_3=-0.1\), and \(\beta_4=8\).

    1. What is the expected value of per capita GDP for a European country with democratic institutions whose inflation rate is equal to 4?

      Answer:

      \[ \begin{aligned} \mathbb{E}[pcGDP|Inflation=4, Europe=1, Democracy=1] &= \beta_0 + \beta_1 (4) + \beta_2 (4)(1) + \beta_3 (4)^2 + \beta_4(1) \\ &= 45 + (-1)(4) + (-2)(4)(1) + (-0.1)(16) + 8(1) \\ &= 39.4 \end{aligned} \]

    2. What is the expected value of per capita GDP for a European country with democratic institutions whose inflation rate is equal to 5?

      Answer:

      \[ \begin{aligned} \mathbb{E}[pcGDP|Inflation=5, Europe=1, Democracy=1] &= \beta_0 + \beta_1 (5) + \beta_2 (5)(1) + \beta_3 (5)^2 + \beta_4(1) \\ &= 45 + (-1)(5) + (-2)(5)(1) + (-0.1)(25) + 8(1) \\ &= 35.5 \end{aligned} \]

    3. What is the expected value of per capita GDP for a non-European country with democratic institutions whose inflation rate is equal to 4?

      Answer:

      \[ \begin{aligned} \mathbb{E}[pcGDP|Inflation=4, Europe=0, Democracy=1] &= \beta_0 + \beta_1 (4) + \beta_2 (4)(0) + \beta_3 (4)^2 + \beta_4(1) \\ &= 45 + (-1)(4) + (-2)(4)(0) + (-0.1)(16) + 8(1) \\ &= 47.4 \end{aligned} \]
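
      The three calculations above all evaluate the same formula at different values, so they can be sketched with one small R function. The function `cond_mean` is a hypothetical helper written for this illustration; its default `beta` holds the parameter values assumed in the problem.

      ```r
      # Conditional expectation of pcGDP given Inflation, Europe, and Democracy,
      # with beta = (beta_0, beta_1, beta_2, beta_3, beta_4) from the problem
      cond_mean <- function(inflation, europe, democracy,
                            beta = c(45, -1, -2, -0.1, 8)) {
        beta[1] +
          beta[2] * inflation +
          beta[3] * inflation * europe +
          beta[4] * inflation^2 +
          beta[5] * democracy
      }

      e4 <- cond_mean(inflation = 4, europe = 1, democracy = 1)  # part 1
      e5 <- cond_mean(inflation = 5, europe = 1, democracy = 1)  # part 2
      n4 <- cond_mean(inflation = 4, europe = 0, democracy = 1)  # part 3
      print(c(e4, e5, n4))

      # Change in expected pcGDP when inflation rises from 4 to 5 for a
      # European democracy
      print(e5 - e4)
      ```

      Comparing parts 1 and 3 isolates the interaction term: at an inflation rate of 4, the European and non-European predictions differ by \(\beta_2 \cdot 4 = -8\).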