2.2 Installing R Packages

Related Reading: IDS 1.5

When you download R, you get “base” R. Base R contains basic functions that most R users rely on. For example, base R lets you add, subtract, multiply, or divide numbers, and it lets you calculate the mean (with the function mean) or the standard deviation (with the function sd) of a vector of numbers.
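As a quick illustration of the base R functions mentioned above, here is a minimal sketch. The vector x and its values are made up for this example.

```r
# a small, made-up vector of numbers (illustrative only)
x <- c(2, 4, 6, 8)

# basic arithmetic is applied to each element of the vector
x + 1
#> [1] 3 5 7 9
x * 2
#> [1]  4  8 12 16

# mean and (sample) standard deviation of the vector
mean(x)
#> [1] 5
sd(x)
#> [1] 2.581989
```

None of this requires installing or loading anything beyond R itself.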

Base R is quite powerful, and the majority of the code you write in R will probably involve only base R.

That being said, there are many cases where it is useful to expand the base functionality of R. This is done through packages. Because R is open source, these packages are contributed by users.

It also typically wouldn’t make sense for someone to install every available R package. For example, a geographer might want a very different set of packages than an economist. Therefore, we typically install only the additional functionality that we specifically want.

Example 2.1 In this example, we’ll install the dslabs package (which is from the IDS book) and the lubridate package (which is a package for working with dates in R).

# install dslabs package
install.packages("dslabs")

# install lubridate package
install.packages("lubridate")

Installing a package is only the first step to using it. You can think of installing a package as downloading it. To actually use a package, you need to either load it into memory (i.e., “attach” it) or state explicitly which package the function you are calling comes from.
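Because installing only needs to happen once, it can be convenient to check whether a package is already available before installing it again. The sketch below is one possible way to do this; install_if_missing is a hypothetical helper name introduced here for illustration, built on the base R functions requireNamespace and install.packages.

```r
# Hypothetical helper (not part of R or any package):
# install a package only if it is not already available.
install_if_missing <- function(pkg) {
  # requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise;
  # it does not attach the package the way library() does
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # report (invisibly) whether the package is available now
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# "stats" ships with R, so nothing is downloaded here
install_if_missing("stats")
```

Calling install_if_missing("lubridate") would, under the same assumptions, download lubridate only on a machine where it is not yet installed.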

Example 2.2 Dates can be tricky to work with in R (and in programming languages generally). For example, they are not exactly numbers, but they also have more structure than just a character string. The lubridate package contains functions for converting numbers/strings into dates.

bday <- "07-15-1985"
class(bday) # R doesn't know this is actually a date yet
#> [1] "character"

# load the package
library(lubridate)
# mdy stands for "month, day, year"
# if the date were in a different format, you could use ymd, dmy, etc.
date_bday <- mdy(bday)
date_bday
#> [1] "1985-07-15"
# now R knows this is a date
class(date_bday)
#> [1] "Date"
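To illustrate the comment about other formats, here is a short sketch that parses the same date written in three different orders. It assumes that lubridate is installed, and the date strings are made up for the example.

```r
library(lubridate)

# the same date written in three different orders (illustrative strings)
d1 <- mdy("07-15-1985") # month, day, year
d2 <- ymd("1985-07-15") # year, month, day
d3 <- dmy("15-07-1985") # day, month, year

# all three are parsed to the same Date
d1 == d2
#> [1] TRUE
d2 == d3
#> [1] TRUE
```

The function name just spells out the order in which the month, day, and year appear in the string.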

Another (and perhaps better) way to call a function from a package is to use the :: syntax. In this case, you do not need the call to library from above. Instead, you can write

lubridate::mdy(bday)
#> [1] "1985-07-15"

This does exactly the same thing as the code before. The advantage of this version is that it is easier to tell that the mdy function comes from the lubridate package.
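The :: syntax is not specific to lubridate. As a small sketch, the functions mean and sd from earlier in this section live in the base and stats packages, which ship with R, so they can be called in the same way. The vector y is made up for this example.

```r
# a made-up vector (illustrative only)
y <- c(1, 2, 3, 4, 5)

# mean comes from the base package, sd from the stats package
base::mean(y)
#> [1] 3
stats::sd(y)
#> [1] 1.581139

# the prefixed and unprefixed calls give the same result
identical(stats::sd(y), sd(y))
#> [1] TRUE
```

For these packages the prefix is optional because R attaches them automatically at startup.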

2.2.1 A list of useful R packages

  • AER — package containing data from Applied Econometrics with R

  • wooldridge — package containing data from Wooldridge’s textbook

  • ggplot2 — package to produce sophisticated-looking plots

  • dplyr — package containing tools to manipulate data

  • haven — package for loading different types of data files

  • plm — package for working with panel data

  • fixest — another package for working with panel data

  • ivreg — package for IV regressions, diagnostics, etc.

  • estimatr — package that runs regressions but with standard errors that economists often like more than the default options in R

  • modelsummary — package for producing nice output of more than one regression and summary statistics

As of this writing, there are 18,004 R packages available on CRAN (R’s main repository for contributed packages).