Detailed Course Notes for ECON 4750
1 Introduction
These are a set of detailed course notes for ECON 4750 at UGA. They largely come from my personal notes that I have used to teach this course in previous years, which, in turn, are largely based on the Stock and Watson textbook, which serves as the other main reference for the course.
These notes have improved drastically over the course of several years. However, they are a work in progress and have not been professionally edited or anything like that…so it is possible that there are some mistakes (if you find any, please let me know). That said, I hope you will find this to be a useful resource this semester.
1.1 What is this?
In previous years, I used to provide these notes as supplementary material for the textbook. Now, I think that there is enough detail here that it can be used as a main reference for the course. And, for example, there are several topics that we will cover in class in substantially more detail than in the textbook.
The material provided here can also be useful as a quasi-study guide. In particular, the notes cover pretty much exactly what we will be able to cover in a one-semester course. In some ways, covering less material relative to the textbook can (in my view) make things easier for students. The notes also provide cross-references to the corresponding section in the textbook. For some topics here, I provide substantially less detail than the book; this may help you when it comes to studying as it may give you a sense of the material I see as being most important. In addition, there are practice questions (some with answers) provided at the end of each section.
1.2 What is this not?
There are a couple of things that I want to explicitly say that this material is not. These include:
A substitute for coming to class — I will take attendance anyway, but not attending class and relying on this material is not a good plan. Please do not do this.
A full substitute for the textbook — In a number of places, the textbook contains substantially more details than I am able to provide here. In my experience, it is also useful to have different “voices” that you can refer to (e.g., you may find my explanation unclear but may find the textbook much clearer). The textbook also contains substantially more material than what I provide in these notes. If there are additional topics in the textbook that we do not cover that you would like to learn about, these should all be understandable for you by the end of this semester.
Sufficient for making good grades on exams — I don’t suspect that it will be a good strategy to rely exclusively on the material provided here in order to do well on the exams. I think this material should be helpful and perhaps even a good starting point when it comes to studying, but it is not sufficient for making a good grade in the class.
1.3 Why did I write this?
I have a strong opinion about the best order in which to teach the material in ECON 4750. Although there are a number of advanced undergraduate textbooks in Econometrics that I like and reference as I teach the course, none of them goes in the order that I would like to teach. Therefore, I think one way that I can go in the order that I want to during the semester without confusing you (the student) too much is to provide a detailed set of references to the material that we are covering throughout the semester.
The book that I strongly suggest for the course is Stock and Watson. By the end of the semester, we will have covered in much detail the first 14 chapters of the textbook (and some topics we will have covered in substantially more detail than in the textbook). If I had more time, the next topic that I would cover in this course would be Time Series Econometrics. I used to cover this material in the current course, but I have found that I am happier with the tradeoff of understanding slightly fewer topics better relative to covering more topics but at a faster pace. Time series is especially important for students who are interested in macroeconomics or finance. Fortunately, we offer a time series econometrics course, and, if you take it, you should be well prepared for it coming out of this course.
1.4 Additional References
These are all free to download; they are not main textbooks, but I sometimes consult them for the class, and they could be useful for you to consult in the future:
For R programming: Introduction to Econometrics with R, by Christoph Hanck, Martin Arnold, Alexander Gerber, and Martin Schmelzer
For prediction/machine learning: An Introduction to Statistical Learning, by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani
For causal inference: Causal Inference: The Mixtape, by Scott Cunningham
Additional R References:
There are tons of free R resources available online. Here are some that seem particularly useful to me.
Manageable Introduction: Introduction to R and RStudio, by Stephanie Spielman
Full length book: Introduction to Data Science: Data Analysis and Prediction Algorithms with R, by Rafael Irizarry (this is way more than you will need for this course, but I suggest checking out Chapters 1, 2, 3, and 5, and there’s plenty more that you might find interesting).
Full length book: STAT 545: Data Wrangling, exploration, and analysis with R, by Jenny Bryan
1.5 Goals for the Course
I have three high-level goals: by the end of the semester, students should be able to
Run regressions and interpret them (this includes even complex regressions), and think through which regression you ought to run
Use data to predict outcomes of interest
Think clearly about when statistical results can be interpreted as causal effects
In order to make progress towards these three goals, we also will need to learn about two additional topics:
Statistical Programming
Probability and Statistics
We’ll start the course off talking about these two topics. Perhaps this will be review material for some of you, but I have found that it is worth it to spend several weeks getting everyone on the same page with respect to these topics.
1.6 Studying for the Class
Students ask me all the time “How should I study for your class?” My advice (and I think this applies to most classes, not just my class) is for you to start by studying the notes from class. The things that I have discussed in class are the things that I think are most important for you to learn in this sort of class and are the material that will be covered on the exam. That said, it may sometimes be the case that you do not fully understand a lecture or the notes that you took from a lecture when you are studying (this certainly applied to me when I was a student). If there are places where you do not understand what the notes mean, then I think that is the time when you should find the relevant portion of the textbook (or supplementary notes provided here) in order to “supplement” what the notes say.
1.7 Data Used in the Course
The following provides the full list of data that we will use this semester. Some of the links below are to data posted on my website; others point to data hosted externally (if you notice any broken links, please let me know).
Note: some of the datasets below require installing an R package. For example, to access the Airq data, you need to install the Ecdat package first (i.e., by running install.packages("Ecdat")) and then run data(Airq, package="Ecdat").
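The install-then-load steps in the note above can be wrapped into one small function. This is only a sketch: load_pkg_data is a hypothetical helper name (it is not part of the course materials or of any package), and the runnable example uses mtcars from R's built-in datasets package so that it works without installing anything.

```r
# Hypothetical helper (an illustration, not part of the course materials):
# load a dataset that ships inside an R package and return it.
load_pkg_data <- function(name, pkg) {
  # Install the package only if it is not already available
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Load the dataset into a temporary environment, then pull it out by name
  env <- new.env()
  data(list = name, package = pkg, envir = env)
  get(name, envir = env)
}

# Runnable example using a dataset from the built-in `datasets` package
cars_df <- load_pkg_data("mtcars", "datasets")
head(cars_df)

# The same pattern for the course data (requires access to the Ecdat package):
# Airq <- load_pkg_data("Airq", "Ecdat")
```

Calling data(Airq, package="Ecdat") directly, as in the note, does the same job; the helper just saves repeating the install check for each package.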
acs
Data from the 2019 American Community Survey about Education and Earnings
Access: Course Website
Description: Course Website
Airq
Data about air quality in California counties in 1972.
Access:
data(Airq, package="Ecdat")
Description:
?Ecdat::Airq
AJR
Data about GDP, institutions, and settler mortality
Access:
data(AJR, package="hdm")
Description:
?hdm::AJR
banks
Data about bank closures in Mississippi during the Great Depression
Access: Course Website
Description: Course Website
Birthweight_Smoking
Data about infant birthweights and mother’s smoking from PA 1989
Access: Course Website
Description: Mark Watson’s website
Caschool
School-level test score data from California in 1998-1999
Access:
data(Caschool, package="Ecdat")
Description:
?Ecdat::Caschool
diamond_train
Data about diamond prices. I got the full version of this data from Kaggle and then split it into training and testing data.
Access: Course Website
Description: A description of each column in the data is available under the Description tab on Kaggle
diamond_test
Out of sample version of diamond_train data
Access: Course Website
Description: A description of each column in the data is available under the Description tab on Kaggle
Fair
Individual-level data about affairs in the United States
Access:
data(Fair, package="Ecdat")
Description:
?Ecdat::Fair
Fatalities
State-level panel data about drunk driving laws and traffic fatalities
Access:
data(Fatalities, package="AER")
Description:
?AER::Fatalities
fertilizer_2000
Country-level data about fertilizer and crop yields from the year 2000. See description of fertilizer_panel below for more details.
Access: Course Website
Description: Course Website
fertilizer_panel
Country-level panel data from 1965-2000 (every 5 years) about fertilizer and crop yields for 68 developing countries. This data is a smaller version of the data used in McArthur, John W., and Gordon C. McCord. “Fertilizing growth: Agricultural inputs and their effects in economic development.” Journal of Development Economics 127 (2017): 133-152. url: https://doi.org/10.1016/j.jdeveco.2017.02.007.
Access: Course Website
Description: Course Website
house
U.S. House of Representatives elections data from 1946-1998
Access: Course Website
Description: Course Website
intergenerational_mobility
Intergenerational mobility data from PSID
Access: Course Website
Description: Course Website
Lead_Mortality
Infant mortality and lead pipes in 1900
Access: Course Website
Description: Mark Watson’s website
mlda
Car accident deaths by age group
Access: Course Website
Description: Course Website
mroz
Labor force participation of married women
Access:
data(mroz, package="wooldridge")
Description:
?wooldridge::mroz
mutual_funds
Mutual fund performance data
Access: Course Website
Description: Course Website
rand_hie
RAND health insurance experiment
Access: Course Website
Description: Course Website
Star
Data from Project STAR that randomly assigned some students to smaller class sizes
Access:
data(Star, package="Ecdat")
Description:
?Ecdat::Star
titanic_training
Passenger-level data on surviving the Titanic. This data is a slightly adapted version of the titanic data on Kaggle.
Access: Course Website
Description: Kaggle
titanic_testing
Out of sample version of titanic_training data
Access: Course Website
Description: Kaggle
us_data
Data from the 2019 American Community Survey via IPUMS
Access: Course Website
Description: Course Website
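Several datasets in the list above come in training/testing pairs (diamond_train and diamond_test, titanic_training and titanic_testing). As a rough sketch of how such a split can be made, the code below randomly divides a data frame into two non-overlapping parts. The 80/20 proportion, the seed, and the use of the built-in mtcars data are all assumptions for illustration; they are not the settings used to create the course files.

```r
# Assumptions for illustration only: 80/20 split, arbitrary seed,
# and the built-in mtcars data standing in for a course dataset
set.seed(4750)
df <- mtcars
n <- nrow(df)

# Randomly choose which rows go into the training data
train_idx <- sample(seq_len(n), size = floor(0.8 * n))

# Training data: the sampled rows; testing data: all remaining rows
train <- df[train_idx, ]
test <- df[-train_idx, ]

nrow(train)
nrow(test)
```

The testing rows are held back so that predictions can be checked on data that was not used to estimate the model.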
1.8 First Week of Class
Related Reading: SW All of Chapter 1
In the first few classes, we will talk at a very high level about the objectives of Econometrics.