3.3 Lab 2: Monte Carlo Simulations

In this lab, we will study the theoretical properties of the estimators that we have been discussing in this chapter.

Monte Carlo simulations are a useful way to study the properties of an estimation procedure. The basic idea is that, instead of using real data, we use simulated data where we control the data generating process. This is useful for two reasons. First, we know what the truth is, so we can compare the results of our estimation procedure to the truth. Second, because we are simulating data, we can actually carry out our thought experiment of repeatedly drawing a sample of some particular size.

For this lab, we are going to simulate coin flips.

  1. Write a function called flip that takes an argument p, where p is the probability of flipping heads (code heads as 1 and tails as 0), and outputs either 1 or 0. Run the code

    flip(0.5)

    Hint: It may be helpful to use the R function sample.
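A minimal sketch of one possible implementation, using sample as the hint suggests (this is not the only correct answer):

```r
# A sketch of one way to write flip(): draw a single value that is
# 1 (heads) with probability p and 0 (tails) with probability 1 - p.
flip <- function(p) {
  sample(c(1, 0), size = 1, prob = c(p, 1 - p))
}

flip(0.5)
```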

  2. Write a function called generate_sample that takes arguments n and p and generates a sample of n coin flips where the probability of flipping heads is p. Run the code

    generate_sample(10,0.5)
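One possible sketch, again using sample (calling the flip function from part 1 in a loop, or rbinom(n, 1, p), would also work):

```r
# Sketch: draw n coin flips at once.
# sample() with replace = TRUE repeats the single-flip draw n times.
generate_sample <- function(n, p) {
  sample(c(1, 0), size = n, replace = TRUE, prob = c(p, 1 - p))
}

generate_sample(10, 0.5)
```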
  3. Next, over 1000 Monte Carlo simulations (i.e., do the following 1000 times),

    1. generate a new sample with 10 observations

    2. calculate an estimate of \(p\)

    (Hint: you can estimate \(p\) by calculating the fraction of heads flipped in a particular simulation, i.e., the sample mean of the 0/1 flips)

    3. calculate a t-statistic for the null hypothesis that \(p=0.5\)

    4. record whether or not you reject the null hypothesis that \(p=0.5\) in that simulation

Then, using all 1000 Monte Carlo simulations, report (i) an estimate of the bias of your estimator, (ii) an estimate of the variance of your estimator, (iii) an estimate of the mean squared error of your estimator, (iv) a histogram of the t-statistics across simulations, and (v) the fraction of times that you reject \(H_0\).
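One way to organize this is sketched below, for n = 10, p = 0.5, and \(H_0:p=0.5\). The variable names are illustrative. The t-statistic here uses the standard error implied by the null, sqrt(p0(1-p0)/n); using the estimated standard error sqrt(phat(1-phat)/n) is also common, but it produces a zero denominator in small samples where every flip comes up the same.

```r
set.seed(1234)  # for reproducibility

n_sims <- 1000  # number of Monte Carlo simulations
n <- 10         # observations per simulation
p <- 0.5        # true probability of heads
p0 <- 0.5       # null hypothesis value

phat <- numeric(n_sims)
for (i in 1:n_sims) {
  # (a) generate a new sample of n coin flips
  s <- sample(c(1, 0), size = n, replace = TRUE, prob = c(p, 1 - p))
  # (b) estimate p by the fraction of heads
  phat[i] <- mean(s)
}
# (c) t-statistics, using the standard error under the null
tstat <- (phat - p0) / sqrt(p0 * (1 - p0) / n)
# (d) reject H0 at the 5% level when |t| > 1.96
reject <- abs(tstat) > 1.96

mean(phat) - p       # (i) estimated bias
var(phat)            # (ii) estimated variance
mean((phat - p)^2)   # (iii) estimated mean squared error
hist(tstat)          # (iv) histogram of t-statistics
mean(reject)         # (v) fraction of rejections
```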

  4. Same as #3, but with 50 observations in each simulation. What differences do you notice?

  5. Same as #3, but with 50 observations and test \(H_0:p=0.6\). What differences do you notice?

  6. Same as #3, but with 50 observations and test \(H_0:p=0.9\). What differences do you notice?

  7. Same as #3, but with 1000 observations and test \(H_0:p=0.6\). What differences do you notice?

  8. Same as #3, but now set \(p=0.95\) (so that this is an unfair coin that flips heads 95% of the time) and with 10 observations and test \(H_0:p=0.95\). What differences do you notice?

  9. Same as #8, but with 50 observations. What differences do you notice?

  10. Same as #8, but with 1000 observations. What differences do you notice?

Hint: Since problems 3-10 ask you to do roughly the same thing over and over, it is probably useful to write a single function that handles all of them, with arguments that let you change the number of observations per simulation, the true value of \(p\), and the null hypothesis that you are testing.
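One possible shape for such a function (the name run_mc and its return values are illustrative, not part of the lab):

```r
# Sketch of a reusable simulation function: n is the number of observations
# per simulation, p the true probability of heads, p0 the null being tested.
run_mc <- function(n, p, p0, n_sims = 1000) {
  # estimate of p in each simulated sample (fraction of heads)
  phat <- replicate(n_sims,
                    mean(sample(c(1, 0), size = n, replace = TRUE,
                                prob = c(p, 1 - p))))
  # t-statistics, using the standard error under the null
  tstat <- (phat - p0) / sqrt(p0 * (1 - p0) / n)
  list(bias = mean(phat) - p,
       variance = var(phat),
       mse = mean((phat - p)^2),
       tstat = tstat,
       reject_rate = mean(abs(tstat) > 1.96))
}

# e.g., problem 8: an unfair coin with a small sample
res <- run_mc(n = 10, p = 0.95, p0 = 0.95)
res$reject_rate
```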