Missing Data in Principal Surrogacy Settings

When the outcome of interest in a clinical trial is late-occurring or difficult to obtain, a good surrogate marker can provide reliable information about the effect of treatment on that outcome. Because surrogate measures are obtained post-randomization, the surrogate-outcome relationship may be subject to unmeasured confounding. Frangakis and Rubin (Biometrics 58:21-29, 2002) therefore suggested assessing the causal effect of treatment within “principal strata” defined by the joint counterfactual values of the surrogate marker under the treatment arms. Li et al. (Biometrics 66:523-531, 2010) elaborated this suggestion for binary markers and outcomes, developing surrogacy measures that have causal interpretations and using a Bayesian approach to accommodate non-identifiability of the model parameters. Here we extend this work to accommodate missing data under ignorable and non-ignorable settings, focusing on latent ignorability assumptions (Frangakis and Rubin, Biometrika 86:365-379, 1999; Peng et al., Biometrics 60:598-607, 2004; Taylor and Zhou, Biometrics 65:88-95, 2009). We also allow for the possibility that missingness has a counterfactual component, one that might differ between the treatment and control arms due to differential dropout, a feature that previous literature has not addressed.