Consider the following regression, where `airq` is an indicator of air quality (lower is better) for a particular metropolitan area in California, `dens1000` is the number of thousands of people per square mile, `coas` indicates whether or not the metro area is on the coast, and `medi1000` is the median income in the metro area (in thousands of dollars).

    data("Airq", package="Ecdat")
    library(modelsummary)

    # recode coas as a 0/1 indicator; rescale density and income to thousands
    Airq$coas <- 1*(Airq$coas=="yes")
    Airq$dens1000 <- Airq$dens/1000
    Airq$medi1000 <- Airq$medi/1000

    # regress air quality on density, coast, their interaction, and income
    reg1 <- lm(airq ~ dens1000 + coas + dens1000*coas + medi1000, data=Airq)
    modelsummary(reg1, fmt=1, gof_omit=".")

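|                 | Model 1 |
|:----------------|--------:|
| (Intercept)     | 120.6   |
|                 | (9.5)   |
| dens1000        | −0.3    |
|                 | (2.8)   |
| coas            | −31.2   |
|                 | (11.3)  |
| medi1000        | 0.8     |
|                 | (0.4)   |
| dens1000 × coas | −1.2    |
|                 | (3.4)   |

Standard errors are in parentheses.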

Which regressors are statistically significant in this regression?

The intercept and `coas` are statistically significant: their t-statistics (120.6/9.5 ≈ 12.7 and −31.2/11.3 ≈ −2.8) exceed 1.96 in absolute value. `medi1000` is marginally statistically significant (using the rounded values reported, its t-statistic is 0.8/0.4 = 2). Neither `dens1000` nor the interaction `dens1000 × coas` is statistically significant.
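These significance checks can be reproduced directly from the reported output. The following is a minimal sketch in base R; the vector names `est` and `se` are introduced here for illustration, and because they hold the rounded values from the output, the resulting t-statistics are only approximate.

```r
# Rounded estimates and standard errors copied from the regression output
est <- c("(Intercept)" = 120.6, dens1000 = -0.3, coas = -31.2,
         medi1000 = 0.8, "dens1000:coas" = -1.2)
se  <- c("(Intercept)" = 9.5, dens1000 = 2.8, coas = 11.3,
         medi1000 = 0.4, "dens1000:coas" = 3.4)

# t-statistic for the null hypothesis that each coefficient equals 0
tstat <- est / se

# Two-sided test at the 5% level using the normal critical value (about 1.96)
significant <- abs(tstat) > qnorm(0.975)

round(tstat, 2)
significant
```

With the unrounded estimates from `reg1`, the t-statistic for `medi1000` could fall on either side of the critical value, which is why it is described as only marginally significant.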

What is the predicted value for the air quality index for a metro area with 1000 people per square mile, that is not located on the coast, and with median income equal to $50,000?

The predicted value is given by:

\[ \begin{aligned} \widehat{airq} &= 120.6 - 0.3(1) - 31.2(0) + 0.8(50) - 1.2(1)(0) \\ &= 160.3 \end{aligned} \]
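The same arithmetic can be checked in R. This sketch assumes the rounded coefficients from the output; the vector `b` and its element names are hypothetical names chosen here for illustration.

```r
# Rounded coefficients from the regression output (names are illustrative)
b <- c(intercept = 120.6, dens1000 = -0.3, coas = -31.2,
       medi1000 = 0.8, dens1000_coas = -1.2)

# Metro area: 1000 people per square mile, not coastal, median income $50,000
dens1000 <- 1
coas     <- 0
medi1000 <- 50

# Plug the regressor values into the estimated regression function
pred <- b[["intercept"]] + b[["dens1000"]] * dens1000 + b[["coas"]] * coas +
  b[["medi1000"]] * medi1000 + b[["dens1000_coas"]] * dens1000 * coas
pred
```

If the fitted model `reg1` is available, `predict(reg1, newdata = data.frame(dens1000 = 1, coas = 0, medi1000 = 50))` should give a similar number based on the unrounded coefficients.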

Let \(Y\) denote a person’s age in the United States. Suppose that you have the theory that \(\mathbb{E}[Y] = 35\). You are able to collect a random sample of 100 observations. Using these data, you calculate \(\bar{Y} = 37\) and \(\widehat{\mathrm{var}}(Y) = 6\).

Calculate a t-statistic for testing the null hypothesis that \(\mathbb{E}[Y]=35\). Do you reject the null hypothesis here? Explain.

\[ \begin{aligned} t &= \frac{\sqrt{n}(\bar{Y} - \mu_0)}{\sqrt{\widehat{\mathrm{var}}(Y)}} \\ &= \frac{10(37-35)}{\sqrt{6}} \\ &= \frac{20}{\sqrt{6}} \\ &= 8.16 \end{aligned} \]

Since \(|t| > 1.96\), you reject the null hypothesis here. In other words, if the null hypothesis were true, there is less than a 5% chance that we would get a t-statistic this large (in absolute value).
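As a quick numerical check of the t-statistic, here is a short sketch in base R. The variable names (`n`, `ybar`, `mu0`, `varY`) are chosen here for illustration and hold the values given in the problem.

```r
n    <- 100  # sample size
ybar <- 37   # sample mean
mu0  <- 35   # value of E[Y] under the null hypothesis
varY <- 6    # estimated variance of Y

# t-statistic: sqrt(n) * (sample mean - hypothesized mean) / estimated sd
tstat <- sqrt(n) * (ybar - mu0) / sqrt(varY)
tstat

# Reject at the 5% level if |t| exceeds 1.96
reject <- abs(tstat) > 1.96
reject
```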

What is the standard error of \(\bar{Y}\)?

\[ \begin{aligned} \textrm{s.e.}(\bar{Y}) &= \frac{\sqrt{\widehat{\mathrm{var}}(Y)}}{\sqrt{n}} \\ &= \frac{\sqrt{6}}{\sqrt{100}} \\ &= 0.245 \end{aligned} \]

Calculate a p-value for the null hypothesis that \(\mathbb{E}[Y]=35\). How do you interpret it?

\[ \begin{aligned} \textrm{p-value} &= 2 \Phi(-|t|) \\ &= 2 \Phi(-8.16) \\ &= 3\times 10^{-16} \approx 0 \end{aligned} \]

This p-value indicates that, if the null hypothesis were true, the probability of getting a t-statistic at least as large in absolute value as the one we observed would be essentially zero. In other words, we have very strong evidence against \(H_0\) here.
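The p-value can be computed in R with `pnorm`, the standard normal cdf \(\Phi\) in base R. This is a minimal sketch that recomputes the t-statistic from the values given in the problem, so the result may differ slightly from one based on the rounded value 8.16.

```r
# t-statistic from the problem: n = 100, sample mean 37, null value 35, variance 6
tstat <- sqrt(100) * (37 - 35) / sqrt(6)

# Two-sided p-value: 2 * Phi(-|t|)
pval <- 2 * pnorm(-abs(tstat))
pval
```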

Calculate a 95% confidence interval for \(\mathbb{E}[Y]\). How do you interpret it?

\[ \begin{aligned} CI &= [\bar{Y} - 1.96 \textrm{s.e.}(\bar{Y}), \bar{Y} + 1.96 \textrm{s.e.}(\bar{Y})] \\ &= [37 - 1.96 \cdot 0.245, 37 + 1.96 \cdot 0.245] \\ &= [36.52, 37.48] \end{aligned} \]

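If we repeatedly drew new samples and constructed a confidence interval this way each time (the repeated sampling thought experiment), 95% of those intervals would contain the true value of \(\mathbb{E}[Y]\). Note that 35 lies outside this interval, which is consistent with rejecting the null hypothesis at the 5% level.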
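The standard error and the interval endpoints can be verified with a few lines of base R. The variable names are illustrative, and 1.96 is used as the critical value to match the calculation above.

```r
n    <- 100  # sample size
ybar <- 37   # sample mean
varY <- 6    # estimated variance of Y

# Standard error of the sample mean
se <- sqrt(varY) / sqrt(n)
se

# 95% confidence interval: sample mean plus/minus 1.96 standard errors
ci <- ybar + c(-1, 1) * 1.96 * se
round(ci, 2)
```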

Consider the following conditional expectation using country-level data, where \(pcGDP\) is a country’s per capita GDP (in thousands of dollars), \(Inflation\) is the country’s current inflation rate, \(Europe\) is a binary variable indicating whether the country is located in Europe, and where \(Democracy\) is a binary variable indicating whether a country has democratic institutions.

\[\mathbb{E}[pcGDP|Inflation, Europe, Democracy] = \beta_0 + \beta_1 Inflation + \beta_2 Inflation \cdot Europe + \beta_3 Inflation^2 + \beta_4 Democracy\]

Further suppose that \(\beta_0 = 45\), \(\beta_1=-1\), \(\beta_2=-2\), \(\beta_3=-0.1\), and \(\beta_4=10\).

What is the expected value of per capita GDP for a European country with democratic institutions whose inflation rate is equal to 4?

\[ \begin{aligned} \mathbb{E}[pcGDP|Inflation=4, Europe=1, Democracy=1] &= \beta_0 + \beta_1 (4) + \beta_2 (4)(1) + \beta_3 (4)^2 + \beta_4(1) \\ &= 45 + (-1)(4) + (-2)(4)(1) + (-0.1)(16) + 10(1) \\ &= 41.4 \end{aligned} \]

What is the expected value of per capita GDP for a European country with democratic institutions whose inflation rate is equal to 5?

\[ \begin{aligned} \mathbb{E}[pcGDP|Inflation=5, Europe=1, Democracy=1] &= \beta_0 + \beta_1 (5) + \beta_2 (5)(1) + \beta_3 (5)^2 + \beta_4(1) \\ &= 45 + (-1)(5) + (-2)(5)(1) + (-0.1)(25) + 10(1) \\ &= 37.5 \end{aligned} \]

What is the expected value of per capita GDP for a non-European country with democratic institutions whose inflation rate is equal to 4?

\[ \begin{aligned} \mathbb{E}[pcGDP|Inflation=4, Europe=0, Democracy=1] &= \beta_0 + \beta_1 (4) + \beta_2 (4)(0) + \beta_3 (4)^2 + \beta_4(1) \\ &= 45 + (-1)(4) + (-2)(4)(0) + (-0.1)(16) + 10(1) \\ &= 49.4 \end{aligned} \]
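The three calculations above all plug different values into the same conditional expectation, so they can be checked with one small function. This is a sketch in base R; the function name `expected_pcgdp` and its argument names are hypothetical, and the parameter values are the ones stated in the problem.

```r
# Conditional expectation of per capita GDP (in thousands of dollars)
# using beta0 = 45, beta1 = -1, beta2 = -2, beta3 = -0.1, beta4 = 10
expected_pcgdp <- function(inflation, europe, democracy) {
  45 + (-1) * inflation + (-2) * inflation * europe +
    (-0.1) * inflation^2 + 10 * democracy
}

e_europe_4     <- expected_pcgdp(inflation = 4, europe = 1, democracy = 1)
e_europe_5     <- expected_pcgdp(inflation = 5, europe = 1, democracy = 1)
e_non_europe_4 <- expected_pcgdp(inflation = 4, europe = 0, democracy = 1)

c(e_europe_4, e_europe_5, e_non_europe_4)
```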