Estimation of Underreporting in Diary Surveys: An Application using the National Household Food Acquisition and Purchase Survey

Diary surveys are used to collect data on a variety of topics, including health, time use, nutrition, and expenditures. The US National Household Food Acquisition and Purchase Survey (FoodAPS) is a nationally representative diary survey that provides an important data source for decision-makers designing policies and programs to promote healthy lifestyles. Unfortunately, a multiday diary survey like FoodAPS can be subject to various survey errors, especially item nonresponse error occurring at the day level. The FoodAPS public-use data set provides survey weights that adjust only for unit nonresponse. Because there are no day-level weights (which could adjust for the item nonresponse that arises from refusals on particular days), the unit nonresponse adjustments are unlikely to correct any bias in estimates arising from households that initially agree to participate in FoodAPS but then fail to report on particular days. This article develops a general methodology for estimating the extent of underreporting due to this type of item nonresponse error in diary surveys, using FoodAPS as a case study. The methodology combines bootstrap replicate sampling for complex samples with imputation based on a Heckman selection model to predict food expenditures for person-days with missing expenditures. We estimate the item nonresponse error by comparing weighted estimates based on reported expenditures only with weighted estimates based on both reported expenditures and predictions for the missing values. Results indicate that ignoring the missing data would lead to consistent overestimation of the mean expenditures and events per person per day and underestimation of the total expenditures and events. Our study suggests that the household-level weights, which generally account for unit nonresponse, may not be entirely sufficient for addressing nonresponse occurring at the day level in diary surveys, and that proper imputation methods are important for estimating the size of the underreporting.
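
The comparison of weighted estimates described above can be illustrated with a minimal sketch. It is not the article's procedure: the Heckman selection model and the bootstrap replicates are omitted, the predictions are simply given, and every number, name, and function below is hypothetical and invented for illustration (none of it comes from FoodAPS). The sketch assumes that person-days with missing reports have lower predicted expenditures than reported person-days, which is one way a mean can be overstated while a total is understated.

```python
# Toy person-day records: (survey weight, reported expenditure or None, prediction).
# All values are invented for illustration; they are NOT FoodAPS data, and the
# predictions stand in for the output of an imputation model that is not shown.
person_days = [
    (100.0, 12.0, 11.0),
    (150.0, 8.0, 9.0),
    (120.0, None, 5.0),   # missing report, filled by the assumed prediction
    (130.0, 10.0, 10.0),
    (110.0, None, 4.0),   # missing report, filled by the assumed prediction
]


def weighted_total(values, weights):
    """Weighted total: sum of weight * value."""
    return sum(w * v for v, w in zip(values, weights))


def weighted_mean(values, weights):
    """Weighted mean: weighted total divided by the sum of weights."""
    return weighted_total(values, weights) / sum(weights)


def reported_only(records):
    """Estimates that drop person-days with missing expenditures."""
    kept = [(w, y) for w, y, _ in records if y is not None]
    weights = [w for w, _ in kept]
    values = [y for _, y in kept]
    return weighted_mean(values, weights), weighted_total(values, weights)


def with_predictions(records):
    """Estimates that use the prediction wherever the report is missing."""
    weights = [w for w, _, _ in records]
    values = [y if y is not None else pred for _, y, pred in records]
    return weighted_mean(values, weights), weighted_total(values, weights)


mean_rep, total_rep = reported_only(person_days)
mean_all, total_all = with_predictions(person_days)

# Differences between the two sets of estimates (reported-only minus completed).
mean_gap = mean_rep - mean_all      # positive here: the mean is overstated
total_gap = total_rep - total_all   # negative here: the total is understated
print(f"mean:  reported-only={mean_rep:.2f}, with predictions={mean_all:.2f}")
print(f"total: reported-only={total_rep:.0f}, with predictions={total_all:.0f}")
```

In this toy setup the reported-only mean exceeds the completed-data mean while the reported-only total falls short of the completed-data total. The direction of the mean gap depends entirely on the assumed predictions; the total gap is negative whenever the predicted expenditures are positive.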