Ozlem Tuncel
otuncelgurlek1@gsu.edu
⚠️ CAUTION: DO NOT SOLELY RELY ON MY NOTES. THERE MIGHT BE TYPOS AND MISTAKES.
✔️ The goal of this week is to learn about binary independent variables and data transformations.
Here are some key points:
- Substantive meaning: clearly describe the effect that you observe in your regression results.
- We can use interval and ratio variables as the DV in OLS (an ordinal variable can also be used if it has enough categories).
- We can use factors, ordinal, interval, and ratio variables as independent variables. But when using factors (binary variables), be careful about interpretation.
- Dichotomous, dummy, factor, binary, indicator - these are all the same thing!
- Use 0 and 1 for binary variables, with 1 representing the category of theoretical interest. Make sure to name your variables meaningfully.
- Notationally, $D$ is a vector of binary variables.
- “The expected value of Y for D (e.g., male) is …” This is how we interpret dichotomous variables.
- Reference category = baseline category
- Since the $D_l$ are mutually exclusive and exhaustive, we need to omit either the “reference category” or the intercept $\beta_0$ (including all of them causes perfect multicollinearity).
- If we omit the reference category, it becomes the baseline, and we interpret each remaining coefficient in comparison with that baseline.
- How to choose a reference category? Either there is a natural neutral category in your variable (e.g., Independents, Neutral), or your theory gives you a meaningful category to pick as the reference. If you do not have any guidance from theory, use the modal category (the mode of the variable).
- Higher-order transformations (e.g., squares) inflate large values and compress small values; lower-order transformations (e.g., logs) compress large values and inflate smaller ones.
- $\forall$ means “for all” and $s.t.$ means “such that”
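
To make the interpretation of a binary independent variable concrete, here is a minimal Python sketch. The data are made up and the helper name `ols_simple` is my own, not something from the course materials. It fits OLS with a single 0/1 regressor and checks that the intercept equals the mean of Y in the reference category (D = 0), and that the intercept plus the slope equals the mean of Y in the category of interest (D = 1).

```python
from statistics import mean


def ols_simple(x, y):
    """Return (intercept, slope) for the OLS fit of y on a single regressor x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up illustrative data (an assumption, not from the notes):
# d = 1 marks the category of theoretical interest, d = 0 the reference category.
d = [0, 0, 0, 1, 1, 1]
y = [10.0, 12.0, 14.0, 15.0, 17.0, 19.0]

b0, b1 = ols_simple(d, y)

# Group means of Y, computed directly for comparison with the coefficients.
mean_ref = mean(yi for di, yi in zip(d, y) if di == 0)
mean_int = mean(yi for di, yi in zip(d, y) if di == 1)

print("intercept (expected Y when D = 0):", b0)
print("slope (difference between groups):", b1)
print("group means (D = 0, D = 1):", mean_ref, mean_int)
```

So the slope on a dummy is read as the difference in the expected value of Y between the category coded 1 and the baseline, not as the effect of a "one-unit increase" in the usual continuous sense.
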
Some suggestions: