Publications

Latent class analysis of response inconsistencies across modes of data collection

Latent class analysis (LCA) has been hailed as a promising technique for studying measurement errors in surveys, because the models produce estimates of the error rates associated with a given question. Still, the question remains how accurate these error estimates are and under what circumstances they can be relied on. Skeptics argue that latent class models can understate the true error rates, and at least one paper (Kreuter et al., 2008) demonstrates such underestimation empirically. We applied latent class models to data from two waves of the National Survey of Family Growth (NSFG), focusing on a pair of similar items about abortion that are administered under different modes of data collection. The first item is administered by computer-assisted personal interviewing (CAPI); the second, by audio computer-assisted self-interviewing (ACASI). Evidence shows that abortions are underreported in the NSFG, and the conventional wisdom is that the ACASI item yields fewer false negatives than the CAPI item. To evaluate these items, we made assumptions about the error rates within various subgroups of the population; these assumptions were needed to achieve an identifiable LCA model. Because external data are available on the actual prevalence of abortion (by subgroup), we were able to form subgroups for which the identifying restrictions were likely to be (approximately) met and other subgroups for which the assumptions were likely to be violated. We also ran more complex models that took potential heterogeneity within subgroups into account. Most of the models yielded implausibly low error rates, supporting the argument that, under specific conditions, LCA models underestimate the error rates.
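The identification strategy described above can be illustrated with a small simulation. The sketch below is not the paper's model or data; it is a minimal, hypothetical example of the general idea: with only two binary indicators (think of a CAPI-like and an ACASI-like item), a two-class model is not identified on its own, but restricting the error rates to be equal across subgroups while letting prevalence vary by subgroup yields enough degrees of freedom to estimate them via EM. All rates, group counts, and variable names here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Simulate data (all parameter values are assumptions for illustration) ---
G = 3                                  # number of subgroups
n_per_group = 20000
prev = np.array([0.10, 0.30, 0.50])    # true prevalence by subgroup
fn = np.array([0.30, 0.15])            # false-negative rates, items A and B
fp = np.array([0.02, 0.02])            # false-positive rates, items A and B

groups = np.repeat(np.arange(G), n_per_group)
x = rng.random(groups.size) < prev[groups]       # latent true status
p_yes = np.where(x[:, None], 1 - fn, fp)         # P(report "yes") per item
y = rng.random((groups.size, 2)) < p_yes         # observed responses

# --- EM for the restricted latent class model ---
# Restriction: error rates are shared across groups; prevalence is group-specific.
pi = np.full(G, 0.5)                   # prevalence estimates per group
fn_hat = np.array([0.20, 0.20])        # starting values
fp_hat = np.array([0.10, 0.10])
for _ in range(500):
    # E-step: posterior probability that latent status X = 1 given responses
    like1 = np.prod(np.where(y, 1 - fn_hat, fn_hat), axis=1)   # P(y | X=1)
    like0 = np.prod(np.where(y, fp_hat, 1 - fp_hat), axis=1)   # P(y | X=0)
    post = pi[groups] * like1 / (pi[groups] * like1 + (1 - pi[groups]) * like0)
    # M-step: update group prevalences and the shared error rates
    for g in range(G):
        pi[g] = post[groups == g].mean()
    fn_hat = ((1 - y) * post[:, None]).sum(0) / post.sum()     # misses "yes"
    fp_hat = (y * (1 - post[:, None])).sum(0) / (1 - post).sum()

print("prevalence:", np.round(pi, 3))
print("false-negative rates:", np.round(fn_hat, 3))
print("false-positive rates:", np.round(fp_hat, 3))
```

When the restriction actually holds, as in this simulation, the EM estimates land near the generating values; the paper's point is that when the restriction is violated within the subgroups used to fit the model, the same machinery can return error rates that are implausibly low.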