There are a number of useful resources for R programming. I pointed out quite a few in the course syllabus and in the Introduction to ECON 8080 slides. The notes for today’s class mainly come from *Introduction to Data Science* by Rafael Irizarry. I’ll cover the introductory topics that I think are most useful.

**Most Important Readings:** Chapters 1 (Introduction), 2 (R Basics), 3 (Programming Basics), and 5 (Importing Data)

**Secondary Readings:** (please read as you have time) Chapters 4 (The tidyverse), 7 (Introduction to data visualization), 8 (ggplot2)

The remaining chapters below are just in case you are particularly interested in some topic (these are likely more than you need to know for our course):

- Data visualization - Chapters 9-12
- Data wrangling - Chapters 21-27
- GitHub - Chapter 40
- Reproducible Research - Chapter 41

I think you can safely ignore all other chapters.

I’m not sure if it is helpful or not, but here are the notes to myself that I used to teach our two review sessions on R.

- `AER` - package containing data from *Applied Econometrics with R*
- `wooldridge` - package containing data from Wooldridge’s textbook
- `ggplot2` - package to produce sophisticated-looking plots
- `dplyr` - package containing tools to manipulate data
- `haven` - package for loading different types of data files
- `plm` - package for working with panel data
- `fixest` - another package for working with panel data
- `ivreg` - package for IV regressions, diagnostics, etc.
- `estimatr` - package that runs regressions but with standard errors that economists often like more than the default options in `R`
- `modelsummary` - package for producing nice output of more than one regression and summary statistics
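To see which of the packages above are already available on your computer, something like the following should work. This is only a sketch: the names `pkgs` and `is_installed` are made up for illustration, and the install line is left commented out because it needs an internet connection and only has to be run once.

```r
# Names of the packages listed above
pkgs <- c("AER", "wooldridge", "ggplot2", "dplyr", "haven", "plm",
          "fixest", "ivreg", "estimatr", "modelsummary")

# TRUE for each package that is already installed, FALSE otherwise
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(is_installed)

# One-time step: uncomment to install whatever is missing
# install.packages(pkgs[!is_installed])

# In each new R session, load a package before using it, e.g.
# library(ggplot2)
```

Installing happens once per computer, while `library()` has to be called in every new session.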

If, for some reason, this doesn’t work, you can use the following code to reproduce the data:

```
# Build the firm-level data frame by hand
firm_data <- data.frame(
  name = c("ABC Manufacturing", "Martin's Muffins", "Down Home Appliances",
           "Classic City Widgets", "Watkinsville Diner"),
  industry = c("Manufacturing", "Food Services", "Manufacturing", "Manufacturing", "Food Services"),
  county = c("Clarke", "Oconee", "Clarke", "Clarke", "Oconee"),
  employees = c(531, 6, 15, 211, 25)
)
```
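Once `firm_data` exists, a few base R calls are enough to check that it looks right and to pull out pieces of it. The sketch below re-creates the data frame so that it runs on its own. The names `big_firms`, `clarke_big`, and `by_size` are made up for illustration, and the 20-employee cutoff is an arbitrary choice.

```r
# Re-create the data frame so this sketch is self-contained
firm_data <- data.frame(
  name = c("ABC Manufacturing", "Martin's Muffins", "Down Home Appliances",
           "Classic City Widgets", "Watkinsville Diner"),
  industry = c("Manufacturing", "Food Services", "Manufacturing",
               "Manufacturing", "Food Services"),
  county = c("Clarke", "Oconee", "Clarke", "Clarke", "Oconee"),
  employees = c(531, 6, 15, 211, 25)
)

nrow(firm_data)    # 5 rows, one per firm
ncol(firm_data)    # 4 columns
names(firm_data)   # "name" "industry" "county" "employees"

# Keep only rows that satisfy a condition (note the comma before "]")
big_firms <- firm_data[firm_data$employees >= 20, ]

# Combine conditions with & (and); use | for "or"
clarke_big <- firm_data[firm_data$county == "Clarke" & firm_data$employees >= 20, ]

# Sort rows from most to fewest employees
by_size <- firm_data[order(firm_data$employees, decreasing = TRUE), ]
print(by_size)
```

The same things can be done with `dplyr` (for example, `filter` and `arrange`), but the base R version avoids loading any packages.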

**Note:** We’ll try to do these on our own, but if you get stuck, the solutions are here.

1. Create two vectors as follows: `x <- seq(2,10,by=2)` and `y <- c(3,5,7,11,13)`. Add `x` and `y`, subtract `y` from `x`, multiply `x` and `y`, and divide `x` by `y`, and report your results.

2. The geometric mean of a set of numbers is an alternative measure of central tendency to the more common “arithmetic mean” (this is the mean that we are used to). For a set of \(J\) numbers, \(x_1,x_2,\ldots,x_J\), the geometric mean is defined as

   \[ (x_1 \cdot x_2 \cdot \cdots \cdot x_J)^{1/J} \]

   Write a function called `geometric_mean` that takes in a vector of numbers and computes their geometric mean. Compute the geometric mean of `c(10,8,13)`.

3. Use the `lubridate` package to figure out how many days there were between Jan. 1, 1981 and Jan. 10, 2022.

4. `mtcars` is one of the data frames that comes packaged with base R.

   - How many observations does `mtcars` have?
   - How many columns does `mtcars` have?
   - What are the names of the columns of `mtcars`?
   - Print only the rows of `mtcars` for cars that get at least 20 mpg.
   - Print only the rows of `mtcars` that get at least 20 mpg and have at least 100 horsepower (it is in the column called `hp`).
   - Print only the rows of `mtcars` that have 6 or more cylinders (it is in the column labeled `cyl`) or at least 100 horsepower.
   - Recover the 10th row of `mtcars`.
   - Sort the rows of `mtcars` by mpg (from highest to lowest).
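As a warm-up for the vector arithmetic and `geometric_mean` exercises above (and not a solution to either), here is a small sketch of element-wise arithmetic and of how a function is defined in R. The vectors `a` and `b` and the function `arithmetic_mean` are made up for illustration; in practice you would just call the built-in `mean()`.

```r
# Arithmetic on vectors of equal length works element by element
a <- c(1, 2, 3)
b <- c(4, 5, 6)
a + b   # 5 7 9
a * b   # 4 10 18

# A hand-written arithmetic mean: add up the values, divide by how many there are
arithmetic_mean <- function(v) {
  sum(v) / length(v)
}

arithmetic_mean(c(2, 4, 9))   # 5, the same as mean(c(2, 4, 9))
```

A `geometric_mean` function has the same shape; only the body changes, following the formula in the exercise.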