Problems with Instrumental Variables Estimation When the Correlation between the Instruments and the Endogenous Explanatory Variables is Weak

We draw attention to two problems associated with the use of instrumental variables (IV), the importance of which for empirical work has not been fully appreciated. First, the use of instruments that explain little of the variation in the endogenous explanatory variables can lead to large inconsistencies in the IV estimates even if only a weak relationship exists between the instruments and the error in the structural equation. Second, in finite samples, IV estimates are biased in the same direction as ordinary least squares (OLS) estimates. The magnitude of the bias of IV estimates approaches that of OLS estimates as the R² between the instruments and the endogenous explanatory variable approaches 0. To illustrate these problems, we reexamine the results of a recent paper by Angrist and Krueger, who used large samples from the U.S. Census to estimate wage equations in which quarter of birth is used as an instrument for educational attainment. We find evidence that, despite huge sample sizes, their IV estimates may suffer from finite-sample bias and may be inconsistent as well. These findings suggest that valid instruments may be more difficult to find than previously imagined. They also indicate that the use of large data sets does not necessarily insulate researchers from quantitatively important finite-sample biases. We suggest that the partial R² and the F statistic of the identifying instruments in the first-stage estimation are useful indicators of the quality of the IV estimates and should be routinely reported.
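The finite-sample point can be illustrated with a small Monte Carlo sketch. Everything in it is an assumption made for illustration and is not taken from the paper or from the Angrist and Krueger data: the data-generating process, the function names (first_stage_stats, simulate), and the parameter values (n, k, pi, rho, beta) are all hypothetical. The design has k valid instruments, each with first-stage coefficient pi, and a structural error correlated with the first-stage error through rho. Because the design has no exogenous covariates, the first-stage R² coincides with the partial R² of the instruments.

```python
import numpy as np


def first_stage_stats(x, Z):
    """Regress x on the instruments Z (all variables demeaned).

    Returns the first-stage F statistic, the first-stage R^2, and the
    demeaned fitted values. With no other covariates in the model, this
    R^2 equals the partial R^2 of the instruments.
    """
    n, k = Z.shape
    Zc = Z - Z.mean(axis=0)
    xc = x - x.mean()
    coef, *_ = np.linalg.lstsq(Zc, xc, rcond=None)
    fitted = Zc @ coef
    r2 = 1.0 - np.sum((xc - fitted) ** 2) / np.sum(xc ** 2)
    f_stat = (r2 / k) / ((1.0 - r2) / (n - k - 1))
    return f_stat, r2, fitted


def simulate(n=1000, k=20, pi=0.01, rho=0.8, beta=1.0, reps=200, seed=0):
    """Monte Carlo comparison of OLS and 2SLS under a hypothetical design.

    x = Z @ (pi, ..., pi) + v,  y = beta * x + u,  corr(u, v) = rho.
    The instruments are valid (independent of u); only their strength,
    controlled by pi, varies.
    """
    rng = np.random.default_rng(seed)
    ols, iv, f_stats, r2s = [], [], [], []
    for _ in range(reps):
        Z = rng.standard_normal((n, k))
        u = rng.standard_normal(n)
        v = rho * u + np.sqrt(1.0 - rho ** 2) * rng.standard_normal(n)
        x = Z @ np.full(k, pi) + v
        y = beta * x + u

        f_stat, r2, x_hat = first_stage_stats(x, Z)
        xc, yc = x - x.mean(), y - y.mean()
        ols.append((xc @ yc) / (xc @ xc))        # OLS slope
        iv.append((x_hat @ yc) / (x_hat @ xc))   # 2SLS slope
        f_stats.append(f_stat)
        r2s.append(r2)

    return {
        "ols_bias": float(np.median(ols) - beta),
        "iv_bias": float(np.median(iv) - beta),
        "median_F": float(np.median(f_stats)),
        "median_R2": float(np.median(r2s)),
    }


if __name__ == "__main__":
    for label, pi in [("weak", 0.01), ("strong", 0.3)]:
        print(label, simulate(pi=pi))
```

Under these assumed settings, the weak-instrument run should show a first-stage F statistic near 1 and a median 2SLS bias close to the OLS bias, while the strong-instrument run should show a large F statistic and a 2SLS bias near zero. Medians are reported rather than means so that occasional extreme 2SLS draws do not dominate the summary.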