Penalized Spline Model-Based Estimation of the Finite Populations Total from Probability-Proportional-to-Size Samples

The Horvitz-Thompson (HT) estimator is a simple design-unbiased estimator of the finite population total for sample designs with unequal probabilities of inclusion. Viewed from a modeling perspective, the HT estimator performs well when the ratios yi/πi of the outcome values yi to the selection probabilities πi are approximately exchangeable. When this assumption is far from met, the HT estimator can be very inefficient. We consider alternatives to the HT estimator that posit a smoothly varying relationship between yi (or a function of yi) and the inclusion probability πi (or a function of πi), and that model this relationship using penalized splines (p-splines). The methods are intended for situations with probability-proportional-to-size (PPS) sampling and continuous survey outcomes. Simulation studies are conducted to compare the spline-based predictive estimators and parametric alternatives with the HT estimator and extensions such as the generalized regression (GR) estimator. These studies show that the p-spline model-based estimators are generally more efficient than the HT and GR estimators in terms of root mean squared error. In situations that most favor the HT or GR estimators, the p-spline model-based estimators have comparable efficiency.
The p-spline model-based estimators and the Horvitz-Thompson estimator are compared on a Block Statistics data set from a U.S. census.
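
As a rough illustration of the HT estimator discussed above, the following Python sketch computes the estimated total as the sum of yi/πi over the sampled units. It is not taken from the paper: the function names, the systematic PPS selection scheme, and the simulated size measure are all assumptions made for this example. When yi is exactly proportional to πi, the ratios yi/πi are constant and the estimate reproduces the population total; adding an intercept breaks that proportionality, and the estimate then varies from sample to sample.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def inclusion_probabilities(sizes, n):
    """Inclusion probabilities pi_i = n * x_i / sum(x) for a fixed-size PPS design."""
    total = sum(sizes)
    probs = [n * x / total for x in sizes]
    if max(probs) > 1:
        raise ValueError("a unit has pi_i > 1; reduce n or treat it as a certainty unit")
    return probs


def systematic_pps_sample(probs, rng):
    """Systematic PPS selection (an assumed scheme, chosen here for simplicity).

    Lays the pi_i end to end on [0, n) and picks the units hit by the
    points start, start + 1, ..., start + n - 1 with start uniform on [0, 1).
    """
    n = round(sum(probs))
    cumulative = list(accumulate(probs))
    start = rng.random()
    hits = [bisect_right(cumulative, start + k) for k in range(n)]
    # Guard against floating-point round-off at the upper end of the range.
    return [min(i, len(probs) - 1) for i in hits]


def ht_total(y_sample, pi_sample):
    """Horvitz-Thompson estimate of the population total: sum of y_i / pi_i."""
    return sum(y / p for y, p in zip(y_sample, pi_sample))


rng = random.Random(0)
sizes = [rng.uniform(1, 10) for _ in range(200)]  # hypothetical size measure x_i
pi = inclusion_probabilities(sizes, n=20)
sample = systematic_pps_sample(pi, rng)

# Case 1: y proportional to the size measure, so y_i / pi_i is constant.
y_prop = [3.0 * x for x in sizes]
est_prop = ht_total([y_prop[i] for i in sample], [pi[i] for i in sample])

# Case 2: an intercept makes y_i / pi_i vary across units.
y_shift = [50.0 + 3.0 * x for x in sizes]
est_shift = ht_total([y_shift[i] for i in sample], [pi[i] for i in sample])

print("proportional outcome: estimate", est_prop, "true total", sum(y_prop))
print("shifted outcome:      estimate", est_shift, "true total", sum(y_shift))
```

The second case is the kind of setting the abstract has in mind: the HT estimate remains design-unbiased, but its sampling variability grows because the ratios yi/πi are no longer close to exchangeable.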