Sample surveys are a critical resource for health research. Large-scale national health surveys, such as the National Health Interview Survey (NHIS), and a multitude of smaller-scale surveys funded by NIH are essential for understanding the health of the population. However, sample surveys of all sizes increasingly face the dual pressures of rising nonresponse and shrinking budgets. These pressures threaten the quality of survey estimates.
Survey methodologists have responded to these pressures by proposing two new classes of designs aimed at increasing quality or controlling costs. The first class, known as responsive survey design, uses incoming data from the field to trigger changes in the design. In effect, these responsive designs identify cases that are not responding well under the current protocol and offer them a new protocol that is more likely to induce response. The second class is known as adaptive survey design. These designs attempt to identify subgroups in the population for whom different designs may be more effective. The goal is to identify the optimal design, with respect to a stated objective, that is tailored to individuals, i.e., one that assigns different designs to different subgroups.
Both classes of designs rely upon inputs for decision-making. Often, these inputs are in the form of model predictions about the probability of response. Unfortunately, the quality of those inputs has not been evaluated in either of these new classes of survey designs. We propose to evaluate the quality of these inputs. In particular, we will evaluate the impact of model selection procedures on the effectiveness of responsive and adaptive survey designs.
We will use data from the largest ongoing survey in the U.S., the American Community Survey (ACS), to accelerate progress on evaluating different approaches to informing the data collection design. The ACS is a mandatory survey with a 95% response rate that also uses a phased design with multiple protocols, making it ideal for this study. We will vary both the information used to direct data collection at the sample-case level and the primary objective that the targeted use of more costly methods is intended to serve.
Our objectives are two-fold. First, we will evaluate how model selection affects the usefulness of the resulting predictions for informing design decisions aimed at different survey objectives. Second, we will evaluate the ability to achieve different survey objectives (bias reduction, mean squared error (MSE) minimization, and response rate maximization) via reallocation of effort.
The results will help accelerate research that informs future study designs, with the aim of improving survey estimates within limited resources.