Week 5: Inference and Introduction to Gauss-Markov Theorem (02/08/2024)
Ozlem Tuncel
otuncelgurlek1@gsu.edu
⚠️ CAUTION: DO NOT SOLELY RELY ON MY NOTES. THERE MIGHT BE TYPOS AND MISTAKES. ALWAYS TAKE YOUR NOTES!
✔️ The goal of this week is to learn matrix notation for OLS regression and get familiar with the Gauss-Markov theorem.
Here are some key points:
- This is a math-heavy week, so prepare yourself. For this week, we are mostly interested in the Gauss-Markov theorem itself, and we will go into the details of its assumptions next week. For now, we have to take some assumptions on faith.
- Our estimators $\hat{\beta_0}$ and $\hat{\beta_1}$ are random variables.
- Conflict data example: you have the whole population in Correlates of War (you know every conflict out there), so why are you using inferential statistics given that you already have the whole population? Quick answer: the parallel universe argument. Confused? See here and here.
- Goal of applied social science: explain some outcome with some set of variables. Can we really do that? NO. Humans are unpredictable. That’s why the things that we cannot explain systematically go into the error term.
- Alone, point estimates are useless. That’s why we are mostly interested in the variance. We hope $E(\bar{x}) = \mu$. Thus, we are mostly interested in the precision of our estimates, which helps us make meaningful inferences. And we express uncertainty using confidence intervals.
- The Gauss-Markov theorem is the most important thing for this class: it tells us when we can use OLS regression and when we cannot.
- Key assumptions we make:
- X is fixed or non-stochastic/non-random (we will explain this later in detail and discuss how X can be fixed and random at the same time)
- Y has both systematic and random/stochastic variation (randomness goes into the error term, and we assume the error terms are i.i.d. and normally distributed).
- Two types of variation in Y: random and systematic – one is perfectly fine (random) and the other one is worse (systematic).
- i.i.d. = independently and identically distributed
- N means normally distributed
- $u_i \sim i.i.d. N(0, \sigma^2)$
- $Var(Y | X\beta) = \sigma^2$ - stochastic (random) variation in Y
- Remember that $\sigma^2$ is the population variance of the error term ($\sigma$ is the standard deviation)
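- Simply having more variation in our independent variable is going to lead to smaller variance in our estimators, since $Var(\hat{\beta_1}) = \sigma^2 / \sum_i (x_i - \bar{x})^2$. More variation in the error term (a larger $\sigma^2$) is what leads to larger variance in our estimators.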
- The z score is the coefficient estimate divided by its standard error => the larger the z score in absolute value, the lower the p-value
- If you have enough observations, almost any nonzero effect will come out statistically significant. So, a small sample is bad, and with a very large sample statistical significance on its own is not very meaningful.
- Best (the B in BLUE) means minimum variance among all linear unbiased estimators.
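
To make some of the points above concrete, here is a minimal simulation sketch (not part of the lecture; every number in it is an assumption chosen for illustration). It holds X fixed, redraws $u_i \sim i.i.d. N(0, \sigma^2)$ many times, and re-estimates the slope each time, so you can see that $\hat{\beta_1}$ is a random variable and how its spread depends on the spread of X. The true values `beta0=1.0`, `beta1=2.0`, `sigma=1.0`, the two X grids, and the helper names are all hypothetical. It uses only the Python standard library.

```python
import math
import random
import statistics


def ols_slope(x, y):
    """Closed-form OLS slope for one regressor: Sxy / Sxx."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def simulate_slopes(x, beta0=1.0, beta1=2.0, sigma=1.0, reps=2000, seed=0):
    """Keep X fixed, redraw the errors, and re-estimate the slope each time."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        # Y = systematic part + random part (u_i ~ N(0, sigma^2))
        y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
        slopes.append(ols_slope(x, y))
    return slopes


def theoretical_sd(x, sigma=1.0):
    """Standard deviation of the slope estimator: sigma / sqrt(Sxx)."""
    x_bar = statistics.fmean(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sigma / math.sqrt(sxx)


def z_and_p(estimate, std_error):
    """z score and two-sided p-value under a standard normal reference."""
    z = estimate / std_error
    p = 2.0 * (1.0 - statistics.NormalDist().cdf(abs(z)))
    return z, p


# Two fixed designs with the same n but different spread in X (assumed values).
x_narrow = [i / 10 for i in range(50)]  # X between 0 and 4.9
x_wide = [float(i) for i in range(50)]  # X between 0 and 49

slopes_narrow = simulate_slopes(x_narrow)
slopes_wide = simulate_slopes(x_wide)

mean_narrow = statistics.fmean(slopes_narrow)  # should sit close to beta1
sd_narrow = statistics.stdev(slopes_narrow)
sd_wide = statistics.stdev(slopes_wide)

print("mean of slope estimates (narrow X):", round(mean_narrow, 3))
print("sd of slope estimates, narrow X:", round(sd_narrow, 4),
      "theory:", round(theoretical_sd(x_narrow), 4))
print("sd of slope estimates, wide X:", round(sd_wide, 4),
      "theory:", round(theoretical_sd(x_wide), 4))

# Same estimate, smaller standard error (as you would get with a larger sample).
z_small_n, p_small_n = z_and_p(0.05, 0.05)
z_large_n, p_large_n = z_and_p(0.05, 0.005)
print("z and p with the larger standard error:", z_small_n, p_small_n)
print("z and p with the smaller standard error:", z_large_n, p_large_n)
```

Things to look for when you run it: the slope estimates vary from draw to draw but center on the true slope, the design with more spread in X gives a tighter distribution of estimates, and the same tiny estimate moves from "not significant" to "significant" once the standard error shrinks.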
Some suggestions: