2.1 Setting up R

This section covers how to install R and RStudio, and then walks through what RStudio looks like when you first open it.

2.1.1 What is R?

Related Reading: IDS 1.1

R is a statistical programming language. This is important for two reasons:

  • It looks like a “real” programming language. In my view, this is a big advantage, and many of the programming skills that we will learn in this class will be transferable. What I mean is that, if you one day want to switch to writing code in Stata or Python, the switch should not be too painful, because learning new “syntax” (things like where to put the semi-colons) is usually relatively easy compared to learning the “way of thinking” about how to write code. Some other statistical programming languages are more “canned” than R. In some sense, this makes them easier to learn, but it also comes with the drawback that whatever skills you learn are quite specific to that one language.

  • Even though R is a real programming language, it is geared towards statistics. Compared to, say, Matlab, a lot of common statistical procedures (e.g., running a regression) will be quite easy for you.
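To give a rough sense of what “quite easy” means here, the sketch below runs a regression in two lines. It is only an illustration, not part of the course materials: it assumes the mtcars example dataset that ships with R, and the variable name reg is arbitrary.

```r
# mtcars is a small example dataset that comes with R.
# Regress fuel efficiency (mpg) on car weight (wt).
reg <- lm(mpg ~ wt, data = mtcars)

# Show the estimated intercept and slope
coef(reg)

# Show a fuller table with standard errors and t-statistics
summary(reg)
```

Don't worry about the details of lm for now; the point is just that a common statistical procedure takes very little code.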

R is very popular among statisticians, computer scientists, and economists.

It is easy to share code across platforms: Linux, Windows, Mac. Besides that, it is easy to write and contribute extensions. I have 10+ R packages that you can easily download and immediately use.

There is a large community and lots of helpful resources available.

  • First place to look if you don’t know how to do something: DuckDuckGo (or…err…Google)!

  • StackOverflow

2.1.2 Downloading R

We will use R (https://www.r-project.org/) to analyze data. R is free and available across platforms. You should go ahead and download R for your personal computer as soon as possible; this should be relatively straightforward. It is also available in most computer labs on campus.
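Once R is installed, one quick way to check that it works is to type something like the following into R's console (the console is described in Section 2.1.4). This is just a suggested sanity check; the exact text that is printed will depend on your version of R and your operating system.

```r
# Print the version of R that is currently running
R.version.string

# Print the general type of operating system ("unix" or "windows")
.Platform$OS.type
```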

2.1.3 RStudio

Base R comes with a lightweight development environment (i.e., a place to write and execute code), but most folks prefer RStudio as it has more features. You can download it here: https://www.rstudio.com/products/rstudio/download/#download; choose the free version based on your operating system (Linux, Windows, Mac, etc.).

2.1.4 RStudio Development Environment

Related Reading: IDS 1.4

When you first open RStudio, it will look something like this

Typically, we will write scripts, basically just as a way to save the code that we have written. Go to File -> New File -> R Script. This will open up a new pane, and your screen should look something like this

Let’s look around here. The top left pane is called the “Source Pane”. It is where you can write an R script. Try typing

1+1

in that pane. This is a very simple R program. Now, press Ctrl+S (Cmd+S on a Mac) to save the script. This will likely prompt you to provide a name for the script. You can call it first_script.R or something like that. The only thing that really matters is that the file name ends in “.R” (although you should at least give the file a reasonably descriptive name).

Now let’s move to the bottom left pane. This is called the “Console Pane”. It is where the actual computations happen in R. (Notice that, although we have already saved our first script, we haven’t actually run any code.) Beside the blue arrow (the “>” prompt) in that pane, try typing

2+2

and then press ENTER. This time you should actually see the answer.
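For reference, the result in the console should look roughly like the sketch below. The line starting with #> is a comment showing the printed output, not something you type.

```r
2 + 2
#> [1] 4
```

The [1] at the start of the output just means that the line begins with the first element of the result; here the result has only one element, the number 4.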

Now, let’s go back to the Source pane. Often, it is convenient to run R programs line by line (mainly so that the results are easy for you to digest). You can do this by placing the cursor on a line in your script and pressing Ctrl+ENTER (Cmd+RETURN on a Mac) to run that line. Try this on the first line of your script file where we previously typed 1+1. This code should now run, and you should be able to see the result down in the bottom left Console pane.
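To practice running code line by line, you could extend your script with a few more lines, as in the sketch below. Only the first line comes from the text above; the variable x and the other lines are made up for illustration.

```r
# first_script.R (hypothetical contents, extended for illustration)
1 + 1              # the line we typed earlier

x <- c(2, 4, 6)    # store three numbers in a variable called x
mean(x)            # compute the average of those numbers
```

Run each line in turn from the Source pane. Notice that the line that creates x does not print anything in the Console pane; it only stores the numbers so that the next line can use them.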

We will ignore the two panes on the right for now and come back to them once we get a little more experience programming in R.