Topic 0: Introduction to R Programming

Material Covered in Class

There are a number of useful resources for R programming. I pointed out quite a few in the course syllabus. The material for this section mainly comes from Introduction to Data Science: Data Wrangling and Visualization with R by Rafael Irizarry. I’ll cover some introductory topics that I think are most useful.

The discussion in class will follow chapters 2 and 3 of the notes for my undergraduate course, which are available here.

Additional Material

Most Important Readings: Chapters 1 (Introduction), 2 (R Basics), 3 (Programming Basics), 6 (Importing Data), 20 (Reproducible Research)

Secondary Readings: (please read as you have time) Chapters 4-5 (The tidyverse and data.table), 7-10 (Data Visualization)

The remaining chapters of this book are all useful, but you can read them over the course of the semester as you have time.

List of useful R packages

  • AER — package containing data from Applied Econometrics with R

  • wooldridge — package containing data from Wooldridge’s textbook

  • ggplot2 — package to produce sophisticated-looking plots

  • dplyr — package containing tools to manipulate data

  • haven — package for loading different types of data files

  • fixest — package for working with panel data

  • ivreg — package for IV regressions, diagnostics, etc.

  • estimatr — package for running regressions with standard errors that economists often prefer to R’s defaults

  • modelsummary — package for producing nice output of more than one regression and summary statistics
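As a rough sketch of how you might check which of the packages listed above are already on your machine, the code below uses only base R. The vector name course_pkgs is my own label for illustration, and the install and library calls are left commented out so that nothing is downloaded unless you choose to run them.

```r
# Packages named in the list above (course_pkgs is a hypothetical name)
course_pkgs <- c("AER", "wooldridge", "ggplot2", "dplyr", "haven",
                 "fixest", "ivreg", "estimatr", "modelsummary")

# TRUE for each package that is already installed
installed <- vapply(course_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- course_pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Not yet installed: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment to install from CRAN (needs an internet connection):
  # install.packages(missing_pkgs)
}

# Once a package is installed, load it at the start of each session, e.g.:
# library(ggplot2)
```

Installing a package is a one-time step, while library() has to be called in every new R session in which you want to use the package.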

Practice loading data

Version: [csv] [RData] [dta]

If, for some reason, this doesn’t work, you can use the following code to reproduce the data:

firm_data <- data.frame(name=c("ABC Manufacturing",
                               "Martin's Muffins",
                               "Down Home Appliances",
                               "Classic City Widgets",
                               "Watkinsville Diner"),
                        industry=c("Manufacturing",
                                   "Food Services",
                                   "Manufacturing",
                                   "Manufacturing",
                                   "Food Services"),
                        county=c("Clarke",
                                 "Oconee",
                                 "Clarke",
                                 "Clarke",
                                 "Oconee"),
                        employees=c(531, 6, 15, 211, 25))
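To practice loading each of the three file types, one option is to write the data frame out to disk and read it back in. The sketch below does this in a temporary directory; the file names and the objects firm_csv, firm_rdata, and firm_dta are hypothetical names chosen for illustration, and the Stata step assumes the haven package is installed (it is skipped otherwise). With the downloaded files, you would replace these paths with wherever you saved them.

```r
# Same data as above, repeated so this block runs on its own
firm_data <- data.frame(
  name = c("ABC Manufacturing", "Martin's Muffins", "Down Home Appliances",
           "Classic City Widgets", "Watkinsville Diner"),
  industry = c("Manufacturing", "Food Services", "Manufacturing",
               "Manufacturing", "Food Services"),
  county = c("Clarke", "Oconee", "Clarke", "Clarke", "Oconee"),
  employees = c(531, 6, 15, 211, 25)
)

# Hypothetical file names in a temporary directory
csv_path <- file.path(tempdir(), "firm_data.csv")
rdata_path <- file.path(tempdir(), "firm_data.RData")

# csv: write the file, then read it back with base R
write.csv(firm_data, csv_path, row.names = FALSE)
firm_csv <- read.csv(csv_path)

# RData: save() stores the object under its own name; load() restores it.
# Loading into a separate environment avoids overwriting firm_data.
save(firm_data, file = rdata_path)
loaded_env <- new.env()
loaded_names <- load(rdata_path, envir = loaded_env)
firm_rdata <- loaded_env[[loaded_names[1]]]

# dta (Stata format): requires the haven package
if (requireNamespace("haven", quietly = TRUE)) {
  dta_path <- file.path(tempdir(), "firm_data.dta")
  haven::write_dta(firm_data, dta_path)
  firm_dta <- haven::read_dta(dta_path)
}

# Inspect the structure of what was read from the csv file
str(firm_csv)
```

Note that load() returns the names of the objects it restored rather than the data itself, which is a common source of confusion when first working with RData files.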