\[\newcommand{\E}{\mathbb{E}}\]
For this problem, we’ll use the mtcars data. Use the first 20 rows of mtcars to estimate the following regression:
\[ mpg = \beta_0 + \beta_1 hp + \beta_2 wt + \beta_3 gear + U \]
Using these regression results, predict mpg for the remaining 12 observations in mtcars. Which of these cars has the largest prediction error in absolute value?
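To make the target concrete, the prediction error for each held-out car can be written as follows. The hats and the index \(i\) are notation introduced here for illustration (rows 21 through 32 are the 12 remaining observations); the coefficient estimates are those obtained from the first 20 rows.
\[ \hat{U}_i = mpg_i - \left( \hat{\beta}_0 + \hat{\beta}_1 hp_i + \hat{\beta}_2 wt_i + \hat{\beta}_3 gear_i \right), \qquad i = 21, \dots, 32 \]
The question then asks for the car \(i\) that maximizes \(|\hat{U}_i|\).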
Rules:
To win
Raise your hand, and tell me which car has the largest prediction error in absolute value, and your team will advance to the championship code challenge.
Solution below…
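# Split the data: first 20 rows for estimation, remaining 12 for prediction
train_data <- mtcars[1:20, ]
test_data <- mtcars[21:nrow(mtcars), ]
# Estimate the regression on the training rows only
train_reg <- lm(mpg ~ hp + wt + gear, data = train_data)
# Out-of-sample predictions and prediction errors for the held-out cars
Yhat <- predict(train_reg, newdata = test_data)
uhat <- test_data$mpg - Yhat
uhat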
##    Toyota Corona Dodge Challenger      AMC Javelin       Camaro Z28
##       -2.1227485       -2.7993972       -3.3712111       -0.4814222
## Pontiac Firebird        Fiat X1-9    Porsche 914-2     Lotus Europa
##        2.8595428       -0.1779368       -0.9227150        2.2815510
##   Ford Pantera L     Ferrari Dino    Maserati Bora       Volvo 142E
##       -1.4649819       -2.1180654        1.6259525       -1.7939863
max(abs(uhat))
## [1] 3.371211
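This is the absolute value of the prediction error for the AMC Javelin (-3.3712111 in the output above), so in that output the AMC Javelin has the largest prediction error in absolute value.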
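Rather than scanning the printed errors by eye, the car's name can be pulled out directly. The following is a minimal sketch, assuming the model is fit on the first 20 rows as the problem states; the names `abs_err` and `worst_car` are illustrative and not part of the original solution.

```r
# Sketch only: abs_err and worst_car are hypothetical names introduced here
train_data <- mtcars[1:20, ]
test_data  <- mtcars[21:nrow(mtcars), ]

# Fit on the training rows, then predict the held-out rows
train_reg <- lm(mpg ~ hp + wt + gear, data = train_data)
abs_err   <- abs(test_data$mpg - predict(train_reg, newdata = test_data))

# predict() keeps the row names, so which.max() returns a named index
worst_car <- names(which.max(abs_err))
worst_car
abs_err[worst_car]
```

Because `which.max()` returns the first maximum, ties (unlikely with continuous errors) would resolve to the earliest row.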