The main material for the final exam comes from Sections 4.10-6.6 in the Course Notes, excluding Section 4.11 and all of Section 5.4. For Section 4.11, you still have to know how to interpret standard errors, p-values, etc. from a regression, but you do not need to know all of the details. The exam is cumulative, but I think that you should expect about 75% of the material to come from these new sections. The more challenging material on the exam will also come from the new sections.

As I see it, some of the main topics from the first part of the course include:

Being able to write/interpret R code

Being able to work with/manipulate expressions involving expectations

Hypothesis testing, being able to interpret standard errors, p-values, confidence intervals, etc.

Being able to interpret regression results

Besides these, two earlier topics connect directly to the newer material. The tradeoff between bias and variance that we talked about earlier in the semester is related to the prediction problems that we have covered more recently, and omitted variable bias is closely related to the causal inference topics that we have been covering recently.
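As a quick reminder of what these two connections look like in symbols, here is a sketch in generic notation. The symbols (an estimator \hat{\theta} of \theta, a regressor of interest D, an omitted variable W) are assumptions made for this summary and may not match the notation in the Course Notes, so treat it as a memory aid rather than the official statement.

```latex
% Bias-variance tradeoff: the mean squared error of an estimator
% splits into squared bias plus variance.
\[
  \mathrm{MSE}(\hat{\theta})
  = E\big[(\hat{\theta} - \theta)^2\big]
  = \big(E[\hat{\theta}] - \theta\big)^2 + \mathrm{Var}(\hat{\theta})
\]

% Omitted variable bias: suppose the "long" regression is
%   Y = \beta_0 + \beta_1 D + \beta_2 W + U,  with  E[U \mid D, W] = 0,
% but we run the "short" regression of Y on D only.
% The slope from the short regression then converges to
\[
  \beta_1 + \beta_2 \, \frac{\mathrm{Cov}(D, W)}{\mathrm{Var}(D)}
\]
```

The second term in the last expression is the bias. It is zero only if W has no effect on Y (\beta_2 = 0) or if D and W are uncorrelated, which is why omitted variables matter for interpreting a regression coefficient causally.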

My general advice for studying is to (i) study the notes that you have taken in class, (ii) study the Course Notes, and (iii) for topics where you still have any doubts, follow the cross-references from the Course Notes to the textbook for an additional reference.

R questions are fair game for the exam. As on previous exams, you should expect some coding question(s), but a larger portion of the exam, as well as the more challenging questions, will concern the prediction/causal inference parts of the class.

I anticipate that the exam will take about 1.5 hours to complete, but you will have the full 3 hours to take it.

Finally, I have provided some extra questions here. My recommendation is to study some before you try to answer these questions. We will cover (some of) these questions in class on Monday.