A Class of Pattern-Mixture Models for Normal Incomplete Data

Likelihood-based methods are developed for analyzing a random sample on two continuous variables when values of one of the variables are missing. Normal maximum likelihood estimates when values are missing completely at random were derived by Anderson (1957). They are also maximum likelihood provided the missing-data mechanism is ignorable, in Rubin's (1976) sense that the mechanism depends only on observed data. A new class of pattern-mixture models (Little, 1993) is described for the situation where missingness is assumed to depend on an arbitrary unspecified function of a linear combination of the two variables. Maximum likelihood for models in this class is straightforward, and yields the estimates of Anderson (1957) when missingness depends solely on the completely observed variable, and the estimates of Brown (1990) when missingness depends solely on the incompletely observed variable. Another choice of linear combination yields estimates from complete-case analysis. Large-sample and Bayesian methods are described for this model. The data do not supply information about the ratio of the coefficients of the linear combination that controls missingness. If this ratio is not well determined by prior knowledge, a prior distribution can be specified, and Bayesian inference is then readily accomplished. Alternatively, sensitivity of inferences can be displayed for a variety of choices of the ratio.
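
As a rough illustration of the sensitivity analysis mentioned above, the following Python sketch computes an estimate of the mean of the incompletely observed variable for several values of the ratio, here called lam. The closed form used for the slope, (lam*s22 + s12)/(lam*s12 + s11), is an assumption and is not stated in this abstract. It is used because it reproduces the three special cases described: the regression estimate of Anderson (1957) at lam = 0, the limit s22/s12 as lam grows without bound (taken here to correspond to missingness depending solely on the incompletely observed variable), and the complete-case mean at lam = -s12/s22. The function name, argument names, and toy data are hypothetical.

```python
import numpy as np


def mu2_estimate(y1, y2, lam):
    """Estimate the mean of y2 when missingness is assumed to depend on y1 + lam * y2.

    y1  : fully observed variable
    y2  : variable with missing values coded as NaN
    lam : assumed ratio of coefficients (not estimable from the data);
          use np.inf for missingness depending on y2 alone.
    NOTE: the slope formula below is an assumed closed form, not taken from the abstract.
    """
    y1 = np.asarray(y1, dtype=float)
    y2 = np.asarray(y2, dtype=float)
    complete = ~np.isnan(y2)          # cases with both variables observed
    y1c, y2c = y1[complete], y2[complete]

    # Complete-case means and (divide-by-r) variances and covariance.
    ybar1, ybar2 = y1c.mean(), y2c.mean()
    s11 = np.mean((y1c - ybar1) ** 2)
    s22 = np.mean((y2c - ybar2) ** 2)
    s12 = np.mean((y1c - ybar1) * (y2c - ybar2))

    mu1_hat = y1.mean()               # mean of y1 over all cases

    if np.isinf(lam):
        slope = s22 / s12             # limiting case: depends on y2 only
    else:
        slope = (lam * s22 + s12) / (lam * s12 + s11)

    # Adjust the complete-case mean of y2 by the shift in the mean of y1.
    return ybar2 + slope * (mu1_hat - ybar1)


# Toy data (hypothetical): y2 is missing for the last two cases.
y1 = [1, 2, 3, 4, 5, 6]
y2 = [2, 1, 4, 3, np.nan, np.nan]

# Display how the estimate changes across assumed values of the ratio.
for lam in [0.0, 0.5, 1.0, 2.0, np.inf]:
    print(f"lam = {lam}: estimated mean of y2 = {mu2_estimate(y1, y2, lam):.4f}")
```

Because the data carry no information about lam, a table of this kind shows how far the conclusions depend on the assumed missing-data mechanism. It does not provide a way to choose among the values.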