Publications

Conditions for Ignoring the Missing-Data Mechanism in Likelihood Inferences for Parameter Subsets

ABSTRACTFor likelihood-based inferences from data with missing values, models are generally needed for both the data and the missing-data mechanism. However, modeling the mechanism is challenging, and parameters are often poorly identified. Rubin (1976) showed that for likelihood and Bayesian inference, sufficient conditions for ignoring the missing data mechanism are (a) the missing data are missing at random (MAR), in the sense that missingness does not depend on the missing values after conditioning on the observed data, and (b) the parameters of the data model and the missing-data mechanism are distinct; that is, there are no a priori ties, via parameter space restrictions or prior distributions, between these two sets of parameters. These conditions are sufficient but not always necessary, and they relate to the full vector of parameters of the model. We propose definitions of partially MAR and ignorability for a subvector of the parameters of particular substantive interest, for direct likelihood/Bayesian and frequentist likelihood-based inference. We apply these definitions to a variety of examples. We also discuss conditioning on the pattern of missingness, as an alternative strategy for avoiding the need to model the missing data mechanism.; ABSTRACTFor likelihood-based inferences from data with missing values, models are generally needed for both the data and the missing-data mechanism. However, modeling the mechanism is challenging, and parameters are often poorly identified. Rubin (1976) showed that for likelihood and Bayesian inference, sufficient conditions for ignoring the missing data mechanism are (a) the missing data are missing at random (MAR), in the sense that missingness does not depend on the missing values after conditioning on the observed data, and (b) the parameters of the data model and the missing-data mechanism are distinct; that is, there are no a priori ties, via parameter space restrictions or prior distributions, between these two sets of parameters. These conditions are sufficient but not always necessary, and they relate to the full vector of parameters of the model. We propose definitions of partially MAR and ignorability for a subvector of the parameters of particular substantive interest, for direct likelihood/Bayesian and frequentist likelihood-based inference. We apply these definitions to a variety of examples. We also discuss conditioning on the pattern of missingness, as an alternative strategy for avoiding the need to model the missing data mechanism.