The Utility of Alternative Commercial Data Sources for Survey Operations and Estimation: Evidence from the National Survey of Family Growth

Declining response rates and rising data collection costs have led to increased use of responsive survey design strategies and to the consideration of alternative auxiliary data sources for improving the efficiency of survey data collection and post-survey nonresponse adjustments. Among the many sources of auxiliary data that might be appended to a sampling frame to support these efforts, commercial databases maintained by consumer marketing or credit organizations have recently received some research attention because of the wealth of information that they make available to survey researchers. Unfortunately, the existing literature raises concerns about the quality of the information in these databases, and few studies to date have considered whether these auxiliary data sources can effectively predict survey outcomes, including household eligibility for a given survey, propensity to respond, and key survey measures of interest. Such studies are important for justifying the cost of purchasing these commercial data for survey production purposes. In this study, we examine the predictive ability of two alternative commercial databases that offer different sets of auxiliary variables to survey researchers. Analyzing survey outcomes from the National Survey of Family Growth (NSFG), we find that these commercial variables improve the fit of models predicting survey eligibility and selected NSFG variables, but that neither data source substantially improves models of response propensity that already include selected NSFG paradata. Consequently, including the commercial variables in nonresponse adjustments does not substantially shift selected NSFG estimates. These results suggest that commercial data can be useful for selected survey operations (e.g., identification of eligible households during screening), but may not be useful for post-survey nonresponse adjustment.
We conclude with suggestions for practice and directions for future research.