Background
Understanding causal relationships is critical for researchers. Although randomized controlled trials are the best way to establish causal relationships, it is often not feasible or ethical to randomly assign a treatment. Unfortunately, results from observational analyses are prone to bias, especially when the primary right-hand-side variable (i.e., the treatment) is correlated with factors that affect the outcome but are not included in the analysis; this is often referred to as endogeneity. Why is endogeneity a problem? Regression models assume that all right-hand-side variables are exogenous, that is, uncorrelated with the error term, which is why they are often referred to as independent variables. When a variable is endogenous (correlated with unobserved variables that also affect the outcome), it violates this assumption, and the resulting regression coefficients are biased.
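The bias can be illustrated with a small simulation. The sketch below is hypothetical: the data-generating process, the variable names, and the true treatment effect of 1.0 are assumptions chosen for illustration and are not drawn from VA data. An unobserved factor `u` drives both the treatment and the outcome, and a regression that omits `u` overstates the effect.

```python
import random


def ols_slope(x, y):
    """Slope from a one-regressor least-squares fit of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


rng = random.Random(0)  # fixed seed so the illustration is reproducible
n = 20000
true_effect = 1.0       # assumed for illustration

# u is unobserved by the analyst but affects both treatment and outcome.
u = [rng.gauss(0, 1) for _ in range(n)]
treatment = [ui + rng.gauss(0, 1) for ui in u]
outcome = [true_effect * t + 2.0 * ui + rng.gauss(0, 1)
           for t, ui in zip(treatment, u)]

# Regressing the outcome on treatment alone leaves u in the error term.
# Under this setup the slope approaches 1 + 2 * cov(t, u) / var(t) = 2.0
# in large samples, twice the true effect.
naive_estimate = ols_slope(treatment, outcome)
print(f"true effect: {true_effect:.2f}, naive estimate: {naive_estimate:.2f}")
```

The size and direction of the bias depend entirely on the assumed coefficients; with real data neither is known, which is what makes endogeneity difficult to diagnose.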
Instrumental Variables
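Instrumental variables (IV) estimation is a statistical modeling technique that corrects for endogeneity and can potentially yield causal estimates from observational data. However, valid instruments can be difficult to identify, and they must satisfy key assumptions: the instrument must be correlated with the treatment, and it must affect the outcome only through the treatment.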
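As a rough illustration of how an instrument helps, the sketch below simulates one endogenous treatment and one instrument `z`. Everything in it is an assumption made for the example: the simulated data, the variable names, and the true effect of 1.0. With a single instrument and a single treatment, the two-stage least squares estimate reduces to the ratio of two covariances, which is what the code computes.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


rng = random.Random(1)  # fixed seed so the illustration is reproducible
n = 20000

z = [rng.gauss(0, 1) for _ in range(n)]  # instrument, independent of u
u = [rng.gauss(0, 1) for _ in range(n)]  # unobserved confounder
treatment = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
outcome = [1.0 * t + 2.0 * ui + rng.gauss(0, 1)
           for t, ui in zip(treatment, u)]

# Naive regression slope: biased because treatment is correlated with u.
naive_estimate = covariance(treatment, outcome) / covariance(treatment, treatment)

# IV estimate for one instrument and one treatment: cov(z, y) / cov(z, x).
iv_estimate = covariance(z, outcome) / covariance(z, treatment)

# First-stage slope: a value near zero would signal a weak instrument.
first_stage_slope = covariance(z, treatment) / covariance(z, z)

print(f"naive: {naive_estimate:.2f}, IV: {iv_estimate:.2f}, "
      f"first stage: {first_stage_slope:.2f}")
```

The IV estimate recovers the assumed effect here only because `z` was constructed to be independent of `u` and strongly related to the treatment. In real data the first condition cannot be tested directly and has to be argued on substantive grounds.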
Technical Report #29 describes the use of IVs in VA data. Section 2 provides background on IVs and how to use them, section 3 reviews common examples of IVs in VA data and their pitfalls, and the final section summarizes our discussion.
Learn More
Alternatives to Instrumental Variables
Good instrumental variables are rare, and a candidate instrument often fails because it is too weak (only weakly correlated with the treatment) or not plausibly exogenous. If the instrument fails, it is best either to find an alternative instrument or not to use this method. Researchers who cannot find a good instrumental variable may turn to propensity scores. A propensity score is the predicted probability of being sorted into one of two treatments, typically estimated with a logistic regression model; it condenses observed information into a single score that can be used to adjust for observed confounders. The challenge with multivariate techniques, including propensity scores, is that they rely only on observables and do not adjust for unobserved characteristics. Propensity scores therefore cannot yield causal estimates when unobserved confounding is present.
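To make the idea of a propensity score concrete, the sketch below estimates one from simulated data using a hand-written logistic regression. It is a minimal illustration under stated assumptions: a single observed covariate `x`, a hypothetical treatment indicator `d`, and a Newton-Raphson fit written out so that the example needs only the standard library. In practice a statistical package would do the fitting.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def fit_logistic(x, d, iterations=25):
    """Fit P(d = 1 | x) = sigmoid(b0 + b1 * x) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, di in zip(x, d):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += di - p            # gradient, intercept
            g1 += (di - p) * xi     # gradient, slope
            h00 += w                # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


rng = random.Random(2)  # fixed seed so the illustration is reproducible
n = 5000

# x is an observed confounder; treatment is more likely when x is high.
x = [rng.gauss(0, 1) for _ in range(n)]
d = [1 if rng.random() < sigmoid(xi) else 0 for xi in x]

b0, b1 = fit_logistic(x, d)

# The propensity score is each subject's predicted probability of treatment.
propensity = [sigmoid(b0 + b1 * xi) for xi in x]
print(f"intercept: {b0:.2f}, slope: {b1:.2f}")
```

The score summarizes only what is in `x`. Any factor that influences treatment but is absent from the model is absent from the score as well, which is the limitation described above.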
Love plots provide a graphical representation of covariate balance before and after propensity score adjustment. Code for creating Love plots in Stata was created by Rebecca Raciborski, PhD.
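The quantity usually plotted on a Love plot is the standardized mean difference of each covariate between treatment groups. The sketch below computes it for one simulated covariate before and after inverse probability weighting. It is not the Stata code referenced above. The data are hypothetical, the true propensity score is used in place of an estimated one to keep the example short, and the denominator follows one common convention (the pooled unweighted standard deviation).

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def standardized_mean_difference(x, d, weights):
    """Weighted difference in group means divided by the pooled unweighted SD."""
    def weighted_mean(group):
        total = sum(w for xi, di, w in zip(x, d, weights) if di == group)
        return sum(xi * w for xi, di, w in zip(x, d, weights) if di == group) / total

    def unweighted_variance(group):
        values = [xi for xi, di in zip(x, d) if di == group]
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

    pooled_sd = math.sqrt((unweighted_variance(1) + unweighted_variance(0)) / 2)
    return (weighted_mean(1) - weighted_mean(0)) / pooled_sd


rng = random.Random(3)  # fixed seed so the illustration is reproducible
n = 10000

x = [rng.gauss(0, 1) for _ in range(n)]
score = [sigmoid(xi) for xi in x]  # true propensity score, assumed known here
d = [1 if rng.random() < p else 0 for p in score]

# Before adjustment every subject has weight 1.
smd_before = standardized_mean_difference(x, d, [1.0] * n)

# Inverse probability weights: 1/p for treated, 1/(1 - p) for controls.
ipw = [1.0 / p if di == 1 else 1.0 / (1.0 - p) for p, di in zip(score, d)]
smd_after = standardized_mean_difference(x, d, ipw)

print(f"SMD before: {smd_before:.2f}, after weighting: {smd_after:.2f}")
```

A Love plot shows these two values side by side for every covariate in the model, so that remaining imbalance on any observed variable is easy to spot.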
Learn More
Last updated: April 23, 2023