Consider the following regression, where `airq` is an indicator of air quality (lower is better) for a particular metropolitan area in California, `dens1000` is the number of thousands of people per square mile, `coas` indicates whether or not the metro area is on the coast, and `medi1000` is the median income in the metro area (in thousands of dollars).

```r
data("Airq", package="Ecdat")
library(modelsummary)
Airq$coas <- 1*(Airq$coas=="yes")
Airq$dens1000 <- Airq$dens/1000
Airq$medi1000 <- Airq$medi/1000
reg1 <- lm(airq ~ dens1000 + coas + dens1000*coas + medi1000, data=Airq)
modelsummary(reg1, fmt=1, gof_omit=".")
```

|                 | Model 1 |
|-----------------|---------|
| (Intercept)     | 120.6   |
|                 | (9.5)   |
| dens1000        | −0.3    |
|                 | (2.8)   |
| coas            | −31.2   |
|                 | (11.3)  |
| medi1000        | 0.8     |
|                 | (0.4)   |
| dens1000 × coas | −1.2    |
|                 | (3.4)   |

Which regressors are statistically significant in this regression?
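As a starting point for this question, the sketch below computes t-statistics by dividing each estimate in the table above by its standard error. The numbers are copied from the rounded table, and the cutoff of 1.96 is an assumption (the large-sample two-sided 5% critical value), so a borderline case such as `medi1000` should be rechecked against the unrounded output.

```r
# Estimates and standard errors copied from the (rounded) table above
est <- c("(Intercept)" = 120.6, dens1000 = -0.3, coas = -31.2,
         medi1000 = 0.8, "dens1000:coas" = -1.2)
se <- c(9.5, 2.8, 11.3, 0.4, 3.4)

# t-statistic for H0: coefficient = 0
t_stat <- est / se

# Assumed rule of thumb: reject at the 5% level when |t| > 1.96
significant <- abs(t_stat) > 1.96

print(round(t_stat, 2))
print(significant)
```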

What is the predicted value of the air quality index for a metro area with 1,000 people per square mile that is not located on the coast and has a median income of $50,000?
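One way to set up this calculation is to multiply each coefficient by the corresponding regressor value and add up the terms. The sketch below uses the rounded coefficients from the table, so the result is only approximate. The vector names `b` and `x` are illustrative.

```r
# Rounded coefficients from the table above
b <- c(intercept = 120.6, dens1000 = -0.3, coas = -31.2,
       medi1000 = 0.8, dens1000_coas = -1.2)

# Regressor values: 1,000 people per square mile (dens1000 = 1),
# not on the coast (coas = 0), median income of $50,000 (medi1000 = 50)
x <- c(intercept = 1, dens1000 = 1, coas = 0,
       medi1000 = 50, dens1000_coas = 1 * 0)

# Predicted air quality index: sum of coefficient times regressor value
pred <- sum(b * x)
print(pred)
```

With the fitted model `reg1` available, `predict(reg1, newdata = data.frame(dens1000 = 1, coas = 0, medi1000 = 50))` performs the same calculation with the unrounded coefficients.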

Consider the following regression, where `child_fincome` is the child's family income, `parent_fincome` is the parents' family income, `sex` is a binary variable indicating whether the child is male, `yearborn` is the year in which the child was born, and `education` is the child's years of education.

```r
load("../Detailed Course Notes/data/intergenerational_mobility.RData")
reg2 <- lm(log(child_fincome) ~ log(parent_fincome) + sex + yearborn + education, data=intergenerational_mobility)
summary(reg2)
```

```
## 
## Call:
## lm(formula = log(child_fincome) ~ log(parent_fincome) + sex + 
##     yearborn + education, data = intergenerational_mobility)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -3.11404 -0.32489  0.04514  0.36940  2.70867 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         21.3037430  1.9719502  10.803  < 2e-16 ***
## log(parent_fincome)  0.5964735  0.0198679  30.022  < 2e-16 ***
## sex                  0.0318506  0.0194484   1.638 0.101572    
## yearborn            -0.0085957  0.0009896  -8.686  < 2e-16 ***
## education            0.0012618  0.0003437   3.672 0.000244 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5834 on 3625 degrees of freedom
## Multiple R-squared:  0.2221, Adjusted R-squared:  0.2212 
## F-statistic: 258.8 on 4 and 3625 DF,  p-value: < 2.2e-16
```

How do you interpret the coefficient on `log(parent_fincome)` in this model?
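Because both the outcome and this regressor are in logs, one common reading of the coefficient is as an elasticity. The sketch below, which takes the estimate from the output above and uses a 10% increase in parents' income purely as an illustrative choice, compares the usual approximation with the exact implied percentage change, holding the other regressors fixed.

```r
# Coefficient on log(parent_fincome) from the regression output above
b <- 0.5964735

# Illustrative (assumed) change: parents' family income is 10% higher
pct_change_parent <- 0.10

# Elasticity approximation: % change in child income is about b * % change
approx_change <- b * pct_change_parent

# Exact change implied by the log-log specification
exact_change <- (1 + pct_change_parent)^b - 1

print(c(approx = approx_change, exact = exact_change))
```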

Let \(Y\) denote the age of a person in the United States. Suppose your theory is that \(\mathbb{E}[Y] = 35\). You collect a random sample of 100 observations. Using these data, you calculate \(\bar{Y} = 37\) and \(\widehat{\mathrm{var}}(Y) = 6\).

Calculate a t-statistic for testing the null hypothesis that \(\mathbb{E}[Y]=35\). Do you reject the null hypothesis here? Explain.

What is the standard error of \(\bar{Y}\)?

Calculate a p-value for the null hypothesis that \(\mathbb{E}[Y]=35\). How do you interpret it?

Calculate a 95% confidence interval for \(\mathbb{E}[Y]\). How do you interpret it?
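The four calculations above can be sketched in a few lines of R. This is a minimal sketch that assumes the large-sample normal approximation, which is why it uses `pnorm` and the critical value 1.96. The variable names are illustrative.

```r
# Quantities given in the problem
n       <- 100   # sample size
ybar    <- 37    # sample mean
var_hat <- 6     # estimated variance of Y
mu0     <- 35    # hypothesized value of E[Y]

# Standard error of the sample mean
se <- sqrt(var_hat / n)

# t-statistic for H0: E[Y] = 35
t_stat <- (ybar - mu0) / se

# Two-sided p-value under the normal approximation (assumed)
p_value <- 2 * pnorm(-abs(t_stat))

# 95% confidence interval using the critical value 1.96
ci <- ybar + c(-1, 1) * 1.96 * se

print(c(se = se, t = t_stat, p = p_value))
print(ci)
```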

Consider the following regression using country-level data, where \(GDP\) is a country’s GDP, \(Inflation\) is the country’s current inflation rate, \(Europe\) is a binary variable indicating whether the country is located in Europe, and where \(Democracy\) is a binary variable indicating whether a country has democratic institutions.

\[GDP = \beta_0 + \beta_1 Inflation + \beta_2 Inflation \cdot Europe + \beta_3 Inflation^2 + \beta_4 Democracy + U\]

What is the partial effect of Inflation in this model?

What is the average partial effect of Inflation in this model?

Given relevant data, how would you estimate the average partial effect of Inflation?
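One possible approach is sketched below on simulated data, since no country-level dataset is provided here. Differentiating the model with respect to \(Inflation\) gives the partial effect \(\beta_1 + \beta_2 Europe + 2\beta_3 Inflation\). The sketch estimates the regression, evaluates this expression for every observation, and averages. The data frame `df`, the helper columns `InflEurope` and `Infl2`, and the coefficient values used to generate the data are all hypothetical.

```r
set.seed(1)
n <- 500

# Simulated (hypothetical) country-level data
df <- data.frame(
  Inflation = runif(n, 0, 10),
  Europe    = rbinom(n, 1, 0.3),
  Democracy = rbinom(n, 1, 0.6)
)
df$InflEurope <- df$Inflation * df$Europe   # interaction term
df$Infl2      <- df$Inflation^2             # squared term
df$GDP <- 10 + 1.5 * df$Inflation - 0.5 * df$InflEurope -
  0.1 * df$Infl2 + 2 * df$Democracy + rnorm(n)

# Estimate the model from the text
reg <- lm(GDP ~ Inflation + InflEurope + Infl2 + Democracy, data = df)
b <- coef(reg)

# Estimated partial effect of Inflation for each observation
pe <- b[["Inflation"]] + b[["InflEurope"]] * df$Europe +
  2 * b[["Infl2"]] * df$Inflation

# Average partial effect: the sample average of the partial effects
ape <- mean(pe)
print(ape)
```

Because the partial effect is linear in \(Europe\) and \(Inflation\), averaging it over the sample gives the same number as plugging the sample means of those two variables into the expression.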