Solutions to Midterm 1 Extra Questions

Ch. 8, Extra Question 1

Consistency is a large-sample property of an estimator. It says that, as the sample size grows, the estimator gets arbitrarily close (in probability) to the population quantity that we are trying to estimate.

Unbiasedness is a property that holds at any sample size. Suppose we could repeatedly draw samples of size $n$ ( $n$ could be large or small here) and re-estimate the parameter of interest each time. An estimator is unbiased if, on average across samples in this repeated sampling thought experiment, our estimate is equal to the population parameter that we are trying to estimate.

Ch. 8, Extra Question 2

No, an unbiased estimator does not necessarily have to be consistent. Consider estimating $E [Y]$ by $Y_{1}$ (i.e., using the value of $Y$ for the first observation in the data). This estimate of $E [Y]$ is unbiased (since $E [Y_{1}] = E [Y]$ ), but it is not consistent: it does not depend on the sample size at all, so its distribution never concentrates around $E [Y]$ no matter how large $n$ gets.
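
The contrast between $Y_{1}$ and $\bar{Y}$ can be seen in a small simulation. This is only an illustrative sketch: the population ( $Y \sim N(2, 1)$ ), the sample sizes, and the number of repetitions are all assumptions chosen for the example, not part of the question.

```r
set.seed(1)  # fixed seed so the sketch is reproducible

# Assumed population for illustration: Y ~ N(2, 1), so E[Y] = 2
n_sims <- 2000

# For each sample size, draw many samples and record both estimators
spread <- sapply(c(10, 1000), function(n) {
  draws <- replicate(n_sims, {
    y <- rnorm(n, mean = 2, sd = 1)
    c(first_obs = y[1], sample_mean = mean(y))
  })
  # Standard deviation of each estimator across the repeated samples
  apply(draws, 1, sd)
})
colnames(spread) <- c("n = 10", "n = 1000")
spread
```

Under these assumptions, the spread of the first observation should stay near 1 at both sample sizes, while the spread of the sample mean should shrink roughly like $1 / \sqrt{n}$ .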

Ch. 8, Extra Question 3

No, consistent estimators can be biased. Consider estimating $E [Y]$ by $\bar{Y} + \frac{c}{n}$ where $c$ is a nonzero constant. This is consistent because $\bar{Y} \to E [Y]$ as $n \to \infty$ (by the law of large numbers) and $\frac{c}{n} \to 0$ as $n \to \infty$ (because it is a constant divided by a number that is going to infinity). However, notice that

$\begin{aligned} E \left[\bar{Y} + \frac{c}{n}\right] & = E [\bar{Y}] + \frac{c}{n} \\ & = E [Y] + \frac{c}{n} \\ & \neq E [Y] \end{aligned}$ so, for any fixed sample size $n$ , $\bar{Y} + \frac{c}{n}$ is biased for $E [Y]$ .
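
A short simulation can illustrate that the bias of $\bar{Y} + \frac{c}{n}$ is about $\frac{c}{n}$ and vanishes as $n$ grows. The value $c = 10$ , the standard normal population, and the helper name `sim_bias` are assumptions made for this sketch.

```r
set.seed(2)

c_const   <- 10    # the constant c (assumed value for illustration)
true_mean <- 0     # assumed population: Y ~ N(0, 1), so E[Y] = 0
n_sims    <- 5000  # number of repeated samples

# Average of the estimator across repeated samples, minus the truth
sim_bias <- function(n) {
  est <- replicate(n_sims, mean(rnorm(n, mean = true_mean)) + c_const / n)
  mean(est) - true_mean
}

bias_small <- sim_bias(5)    # theory: c / n = 2
bias_large <- sim_bias(500)  # theory: c / n = 0.02
c(bias_small, bias_large)
```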

Ch. 8, Extra Question 4

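By the central limit theorem, $\sqrt{n} (\frac{1}{n} \sum_{i = 1}^{n} (Y_{i} - E [Y]))$ converges in distribution to a normal random variable with mean 0 and variance $var (Y)$ , so $\sqrt{n}$ is exactly the scaling that keeps this term from either collapsing to 0 or exploding.

Since $n$ grows faster than $\sqrt{n}$ , $n (\frac{1}{n} \sum_{i = 1}^{n} (Y_{i} - E [Y]))$ diverges (i.e., its absolute value goes to infinity as $n \to \infty$ ).

Since $n^{1 / 3}$ grows slower than $\sqrt{n}$ , $n^{1 / 3} (\frac{1}{n} \sum_{i = 1}^{n} (Y_{i} - E [Y]))$ converges to 0 (in probability) as $n \to \infty$ .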
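
The three scaling rates can be compared numerically. The sketch below assumes a standard normal population (so $E [Y] = 0$ and $var (Y) = 1$ ) and uses a hypothetical helper `scaled_sd`; in theory the standard deviation of $n^{a} (\bar{Y} - E [Y])$ is $n^{a - 1 / 2}$ in this setup.

```r
set.seed(3)
n_sims <- 500  # number of repeated samples per sample size

# Spread of n^a * (Ybar - E[Y]) for a = 1/3, 1/2, 1, with Y ~ N(0, 1)
scaled_sd <- function(n) {
  centered_mean <- replicate(n_sims, mean(rnorm(n)))  # E[Y] = 0
  c(cube_root   = sd(n^(1 / 3) * centered_mean),
    square_root = sd(sqrt(n) * centered_mean),
    linear      = sd(n * centered_mean))
}

rates <- cbind("n = 100" = scaled_sd(100), "n = 10000" = scaled_sd(10000))
rates
```

Moving from $n = 100$ to $n = 10000$ , the `cube_root` row should shrink, the `square_root` row should stay near 1, and the `linear` row should grow by roughly a factor of 10.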

Solutions to Additional Questions

1. What would be the result from running the following code?
```
all( c(1,2,3,4,5) > 0)
```
  Answer: TRUE. This checks whether all elements of the vector 1,2,3,4,5 are greater than 0. Since they all are, it returns TRUE.
2. Consider the following function
```
a_function <- function(n) {
  out <- 0
  for (i in 1:n) {
    out <- out + i^2
  }
  out
}
```
  If you run the following code, what will it output?
```
a_function(5)
```
  Answer: It will output 55, which comes from adding up $1^{2} + 2^{2} + 3^{2} + 4^{2} + 5^{2} = 1 + 4 + 9 + 16 + 25$ .
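
As a cross-check on the loop, the same sum can be computed in vectorized form and with the closed-form formula $n (n + 1) (2 n + 1) / 6$ . The names `sum_squares` and `closed_form` are introduced here for illustration only; the loop version is repeated so the block runs on its own.

```r
# Loop version, copied from the question so this block is self-contained
a_function <- function(n) {
  out <- 0
  for (i in 1:n) {
    out <- out + i^2
  }
  out
}

# Vectorized equivalent: square each of 1, ..., n and add them up
sum_squares <- function(n) sum(seq_len(n)^2)

# Closed-form expression for the sum of the first n squares
closed_form <- function(n) n * (n + 1) * (2 * n + 1) / 6

c(a_function(5), sum_squares(5), closed_form(5))  # all three equal 55
```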
Suppose there are two random variables $X$ and $Y$ .
1. If you know that $X$ and $Y$ are independent, do you know what their covariance is equal to? Explain. If yes, what is the covariance equal to?
  
  Answer: Yes, their covariance is 0. Independence implies $E [X Y] = E [X] E [Y]$ , so $cov (X, Y) = E [X Y] - E [X] E [Y] = 0$ .
2. If you know that $cov (X, Y) = 0$ , are $X$ and $Y$ independent? Explain.
  
  Answer: Not necessarily. Covariance only measures linear dependence, so random variables can have 0 covariance without being independent.
3. If you know that $cov (X, Y) = 1$ , are $X$ and $Y$ independent? Explain.
  
  Answer: $X$ and $Y$ are not independent in this case. If they were, their covariance would be equal to 0.
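
A standard example of zero covariance without independence (chosen here for illustration, not taken from the question) is $X$ uniform on $\{-1, 0, 1\}$ with $Y = X^{2}$ . The sketch below computes the covariance exactly from the probabilities rather than by simulation.

```r
# Assumed example: X takes the values -1, 0, 1 with probability 1/3 each
x <- c(-1, 0, 1)
p <- rep(1 / 3, 3)
y <- x^2  # Y is a deterministic function of X, so they are dependent

# Population moments computed directly from the distribution
e_x  <- sum(p * x)
e_y  <- sum(p * y)
e_xy <- sum(p * x * y)

cov_xy <- e_xy - e_x * e_y  # equals 0

# Dependence: P(Y = 1) is 2/3, but P(Y = 1 | X = 1) is 1
p_y1 <- sum(p[y == 1])
c(cov_xy = cov_xy, p_y1 = p_y1)
```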
Suppose that $X_{1}$ and $X_{2}$ are two random variables such that $E [X_{1}] = 0$ , $E [X_{2}] = 5$ , $var (X_{1}) = 1$ , $var (X_{2}) = 10$ and $cov (X_{1}, X_{2}) = - 1$ . Suppose that $Y = X_{1} + X_{2}$ .
1. What is $E [Y]$ ?
  
  Answer:
  
  $\begin{aligned} E [Y] & = E [X_{1} + X_{2}] \\ & = E [X_{1}] + E [X_{2}] \\ & = 0 + 5 \\ & = 5 \end{aligned}$ where the second equality holds because expectations can pass through sums.
2. What is $var (Y)$ ?
  
  Answer:
  
  $\begin{aligned} var (Y) & = var (X_{1} + X_{2}) \\ & = var (X_{1}) + var (X_{2}) + 2 cov (X_{1}, X_{2}) \\ & = 1 + 10 + 2 (- 1) \\ & = 9 \end{aligned}$
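
Both calculations are special cases of the general rules $E [a' X] = a' \mu$ and $var (a' X) = a' \Sigma a$ for a linear combination of random variables. The sketch below applies them with weights $a = (1, 1)$ ; the names `mu`, `sigma`, and `a` are notation introduced for this example.

```r
# Means and covariance matrix of (X1, X2) as given in the question
mu    <- c(0, 5)
sigma <- matrix(c( 1, -1,
                  -1, 10), nrow = 2, byrow = TRUE)

# Y = X1 + X2 corresponds to weights (1, 1)
a <- c(1, 1)

e_y   <- sum(a * mu)                         # 0 + 5 = 5
var_y <- as.numeric(t(a) %*% sigma %*% a)    # 1 + 10 + 2 * (-1) = 9
c(e_y = e_y, var_y = var_y)
```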
Consider a random variable $Y$ that is equal to a firm’s profits (in thousands of dollars) and another random variable $X$ that is equal to the firm’s number of employees. Suppose you know that $E [Y | X = x] = 50 + 10 x$ .
1. Explain how to interpret $E [Y | X = x]$ .
  
  Answer: This is the conditional expectation of $Y$ given that $X$ takes the particular value $x$ . In the context of the problem, it is the mean profit (in thousands of dollars) of firms that have $x$ employees. The value can change for different numbers of employees.
2. What is $E [Y | X = 10]$ ?
  
  Answer: $E [Y | X = 10] = 50 + 10 (10) = 150$ , i.e., firms with 10 employees have mean profits of 150 thousand dollars.
3. Suppose that $var (Y) = 40$ , $E [X] = 30$ , and $var (Y) = 20$ , calculate $E [Y]$ .
  
  Answer:
  
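  $\begin{aligned} E [Y] & = E \big[ E [Y | X] \big] \\ & = E [50 + 10 X] \\ & = 50 + 10 E [X] \\ & = 50 + 10 (30) \\ & = 350 \end{aligned}$ where the first equality holds by the law of iterated expectations. The variance information is not needed for this calculation.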
Suppose that we have a random sample of $n$ observations of $X$ and $Y$ .
1. Suppose that you want to estimate the covariance between $X$ and $Y$ using the data that we have. Propose an estimator for the covariance. Hint: Try using the analogy principle and the expression $cov (X, Y) = E [X Y] - E [X] E [Y]$ .
  
  Answer:
  
  $\widehat{cov} (X, Y) = \frac{1}{n} \sum_{i = 1}^{n} X_{i} Y_{i} - \left( \frac{1}{n} \sum_{i = 1}^{n} X_{i} \right) \left( \frac{1}{n} \sum_{i = 1}^{n} Y_{i} \right)$
2. Alternatively, the definition of covariance is $cov (X, Y) = E [(X - E [X]) (Y - E [Y])]$ . Propose an estimator for the covariance based on this expression. Would you expect this to give you the same estimate of the covariance as in part 1?
  
  Answer: To conserve on notation, let $\bar{X}$ and $\bar{Y}$ denote the sample averages of $X$ and $Y$ , respectively. Then,
  
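  $\widehat{cov} (X, Y) = \frac{1}{n} \sum_{i = 1}^{n} (X_{i} - \bar{X}) (Y_{i} - \bar{Y})$
  
  Yes, this gives exactly the same estimate as the estimator based on $E [X Y] - E [X] E [Y]$ . Expanding the product,
  
  $\begin{aligned} \frac{1}{n} \sum_{i = 1}^{n} (X_{i} - \bar{X}) (Y_{i} - \bar{Y}) & = \frac{1}{n} \sum_{i = 1}^{n} X_{i} Y_{i} - \bar{X} \bar{Y} - \bar{X} \bar{Y} + \bar{X} \bar{Y} \\ & = \frac{1}{n} \sum_{i = 1}^{n} X_{i} Y_{i} - \bar{X} \bar{Y} \end{aligned}$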
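
The two analogy-principle estimators of the covariance are algebraically identical, which can be checked numerically. The data-generating process below is an arbitrary assumption for illustration. Note that R's built-in `cov()` divides by $n - 1$ rather than $n$ , so it differs from both by the factor $n / (n - 1)$ .

```r
set.seed(4)

# Assumed data for illustration: Y depends linearly on X plus noise
n <- 50
x <- rnorm(n)
y <- 2 * x + rnorm(n)

# Estimator based on cov(X, Y) = E[XY] - E[X]E[Y]
cov_moments <- mean(x * y) - mean(x) * mean(y)

# Estimator based on cov(X, Y) = E[(X - E[X])(Y - E[Y])]
cov_centered <- mean((x - mean(x)) * (y - mean(y)))

# Built-in sample covariance, which uses the n - 1 denominator
cov_builtin <- cov(x, y)

all.equal(cov_moments, cov_centered)               # same up to rounding
all.equal(cov_centered * n / (n - 1), cov_builtin) # differ only by n / (n - 1)
```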