Errata

In the first iteration of Causal Inference: The Mixtape, I inadvertently failed to cite material influenced by the work of Marcelo Perraillon. The online version has been updated to correct this oversight, and the correction has been incorporated into the next printing of the paperback edition.

There was also an inadvertent oversight in the probability and regression chapter: I forgot to cite material influenced by the work of Valerio Filoso, whose 2013 Stata Journal article on the regression anatomy theorem package, -reganat-, provided derivations and proofs for the Frisch-Waugh-Lovell theorem.

Acknowledgments

Introduction

Probability and Regression

  • Fixed \(\Pr(\text{Fail} \mid \text{Pass})= 0.45/0.75 = 0.6\) on page 20

  • Fixed set notation (\(\cap\) is intersection, \(\cup\) is union) around page 22

  • Fixed notation in contingency table on page 23

  • Added missing \(y_i\) in \(\sum_{i=1}^n(ax_i+by_i) =a\sum_{i=1}^n x_i + b\sum_{i=1}^n y_i\) on page 33

  • Fixed equation on page 36: Corr(W,Z) should be Cov(W,Z)

  • Fixed sentence to read “The second one, though, means that the mean value of the error term does not change with different slices of \(x\).” on page 39

  • SST on page 51 should have a + instead of a - between the two terms in \((y_i - \hat{y}_i) - (\hat{y}_i - \bar{y})\).

  • \(C(x,u)=0\) does not imply \(x\) and \(u\) are independent, on page 60

  • I wrote “I have found Filoso (2013) very helpful in explaining this with proofs and algebraic derivations, so I will use this notation and set of steps here as well.” to note that the derivations and proofs of the Frisch-Waugh-Lovell theorem were influential in this section.

  • Added “* -reganat- is a user-created package by Filoso (2013).” to the first line of reganat.do

  • Clarified algebra, on page 61

  • Added \(g \in 1, \dots, G\), on page 93

  • Missing ’ in \(E[u_{ig}u_{jg'}']\), on page 93
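Several of the corrections in this chapter can be checked numerically. The following is a minimal Python sketch, not code from the book. The values 0.45 and 0.75 come from the page 20 item above; the small data vectors, the coefficients `a` and `b`, and the choice \(u = x^2\) are illustrative assumptions made up for this check.

```python
# Illustrative checks only: the vectors below are made-up data, not from the book.

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Population covariance of two equal-length lists."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)

# 1. Conditional probability: a joint probability divided by a marginal.
pr_cond = 0.45 / 0.75

# 2. Linearity of summation: sum(a*x_i + b*y_i) = a*sum(x_i) + b*sum(y_i).
a, b = 2.0, -3.0
x = [1.0, 2.0, 3.0, 4.0]
y = [0.5, -1.0, 2.0, 1.5]
lhs = sum(a * xi + b * yi for xi, yi in zip(x, y))
rhs = a * sum(x) + b * sum(y)

# 3. SST: each deviation y_i - y_bar equals (y_i - y_hat_i) + (y_hat_i - y_bar),
#    with a plus sign, where y_hat comes from a bivariate OLS fit.
slope = cov(x, y) / cov(x, x)
intercept = mean(y) - slope * mean(x)
y_hat = [intercept + slope * xi for xi in x]
y_bar = mean(y)
termwise_plus = all(
    abs((yi - y_bar) - ((yi - yh) + (yh - y_bar))) < 1e-12
    for yi, yh in zip(y, y_hat)
)
sst = sum((yi - y_bar) ** 2 for yi in y)
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)

# 4. Zero covariance without independence: u is a deterministic function of x,
#    yet cov(x, u) is zero because x is symmetric around zero.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
us = [xi ** 2 for xi in xs]
cov_xu = cov(xs, us)

print(pr_cond, lhs, rhs, termwise_plus, sst, sse + ssr, cov_xu)
```

In the last check, knowing \(x = 2\) pins down \(u = 4\), which differs from the unconditional mean of \(u\), so the two variables are clearly dependent even though their covariance is zero.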

Directed Acyclic Graphs

  • Removed doubled “they” on page 102

Potential Outcomes

  • Fixed typo in definition of ATT on page 128

  • Fixed typo: “population” instead of “popular” on page 148

  • Clarified notation of p-value in randomization inference on page 157

Matching and Subclassification

  • A typo on page 180. Should read “I wouldn’t be surprised if more people believe in a flat Earth than that smoking doesn’t cause lung cancer.”

  • Titanic_subclassification script was incorrectly calculating strata weights.

Regression Discontinuity

  • Updated lmb_5.R and lmb_6.R (#12)

  • In nonparametric kernels subsection, the equation is missing argmin.

  • On page 257, the last paragraph should read: “Sometimes these abstract ideas become much easier to understand with data. Health economist Marcelo Perraillon uses simulated data to teach about the estimation challenges with nonlinearities. I will use a similar approach using Stata and R in the following examples.”

  • On page 257, line 6 of rdd_simulate1.do, write “* Generate running variable. Stata code attributed to Marcelo Perraillon.”

  • On page 259, change line 1 of rdd_simulate2.do to say “* Stata code attributed to Marcelo Perraillon.”

  • On page 261, change line 1 of rdd_simulate3.do to say “* Stata code attributed to Marcelo Perraillon.”

  • On page 265, line 1 of rdd_simulate4.do, replace the early line reading “* Polynomial modeling” with “* Stata code attributed to Marcelo Perraillon.”

  • Page 281, fixed powers on polynomial for the equation following “The estimated first stage would be:” and removed \(X_i\) from the equation following “The reduced form for this regression is:”

  • On page 294, line 2 of lmb_1.do, write “* Stata code attributed to Marcelo Perraillon.”

  • On page 295, change line 1 of lmb_2.do to “* Stata code attributed to Marcelo Perraillon.”

  • On page 296, change line 1 of lmb_3.do to “* Re-center the running variable. Stata code attributed to Marcelo Perraillon.”

  • On page 298, change line 1 of lmb_4.do to “* Stata code attributed to Marcelo Perraillon.”

  • On page 298, change line 1 of lmb_5.do to “* Stata code attributed to Marcelo Perraillon.”

  • On page 300, change line 1 of lmb_6.do to “* Use 5 points from cutoff. Stata code attributed to Marcelo Perraillon.”

  • On page 303, change line 1 of lmb_7.do to “* Stata code attributed to Marcelo Perraillon.”

  • On page 307, change line 1 of lmb_8.do to “* Stata code attributed to Marcelo Perraillon.”

  • On page 309, change line 1 of lmb_9.do to “* Stata code attributed to Marcelo Perraillon.”

  • On page 311, change line 1 of lmb_10.do to “* McCrary density test. Stata code attributed to Marcelo Perraillon.”

  • Figures 25, 26, 27, 28 have captions that should end with “Figures attributed to Marcelo Perraillon.”

  • In Acknowledgements, page x, insert this paragraph after mentioning Young MC but before “Finally…”: “I would like to thank the health economist, Marcelo Perraillon, whose insightful approach to using simulated variables and figures to illustrate estimation challenges with regression discontinuity shaped my own pedagogy. I also acknowledge him for introducing me to the “cmogram” command in Stata.”

  • Page 242, dropped “which we discuss in later detail” following Angrist and Lavy (1999).

  • Insert arg min in front of equation at the end of section 6.2.5 (prior to discussing Card, et al. (2008) Medicare paper)

Instrumental Variables

  • Missing + on page 325

  • Introduced Stevenson’s data so that readers would know the dataset is from her article. Wrote “Let’s now look at data from Stevenson (2018) itself.”

Panel Data

  • Fixed delta hat equation on page 391

Difference in Differences

  • Updated abortion_dd.R script (#11, #17)

  • On page 465, the original had the treatment shares backwards. It should read: “Let’s say there are 5 periods, and \(k\) is treated in period 2. Then it spends 80% of its time under treatment, or 0.8. But let’s say \(l\) is treated in period 4. Then it spends 40% of its time treated, or 0.4.”
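The arithmetic behind the corrected passage can be sketched as follows. This is an illustrative Python snippet, not code from the book. The function name `share_treated` is hypothetical, and it assumes periods are numbered from 1, that the first treated period counts as treated, and that a unit stays treated through the final period.

```python
def share_treated(first_treated_period: int, n_periods: int) -> float:
    """Fraction of periods a unit spends under treatment.

    Assumes 1-indexed periods and that treatment, once switched on,
    stays on through the last period (an assumption of this sketch).
    """
    if not 1 <= first_treated_period <= n_periods:
        raise ValueError("first_treated_period must lie in 1..n_periods")
    return (n_periods - first_treated_period + 1) / n_periods

# Unit k is treated from period 2 of 5: periods 2, 3, 4, 5 are treated.
share_k = share_treated(2, 5)

# Unit l is treated from period 4 of 5: periods 4, 5 are treated.
share_l = share_treated(4, 5)

print(share_k, share_l)
```

Under these assumptions the earlier-treated unit \(k\) has the larger share, which is the ordering the corrected text states.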

Synthetic Control