Proportional hazards regression with missing covariates

Nonparametric maximum likelihood (NPML) is used to estimate the regression parameters in a proportional hazards regression model with missing covariates. The NPML estimator is shown to be consistent and asymptotically normally distributed under some conditions. EM-type algorithms are applied to solve the maximization problem. Variance estimates of the regression parameters are obtained by a profile likelihood approach that uses EM-aided numerical differentiation. Simulation results indicate that the NPML estimates of the regression parameters are more efficient than the approximate partial likelihood estimates and the estimates from complete-case analysis when the covariates are missing completely at random, and that the proposed method corrects for bias when the covariates are missing at random.

KEY WORDS: EM algorithm; Missing data; Nonparametric maximum likelihood; Proportional hazards model.
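
For orientation, the setting described above can be written out as follows. This is only a sketch in assumed notation, and the paper's own formulation may differ. The symbols are illustrative and do not come from the abstract: T is the failure time, C the censoring time, X the covariates subject to missingness, W the always-observed covariates, R the indicator that X is observed, G the conditional distribution of X given W (with density g), and \Lambda_0 the cumulative baseline hazard. The displayed likelihood additionally assumes noninformative censoring and that the missingness mechanism is ignorable.

```latex
% Proportional hazards model for the failure time T given covariates (X, W)
\lambda(t \mid X, W) = \lambda_0(t)\,\exp\!\left(\beta_X^{\top} X + \beta_W^{\top} W\right)

% Observed data for subject i: Y_i = \min(T_i, C_i), \delta_i = 1(T_i \le C_i),
% W_i, R_i, and X_i only when R_i = 1.
% Contribution of the (possibly censored) failure time given full covariates:
f(y, \delta \mid x, w)
  = \left\{\lambda_0(y)\, e^{\beta_X^{\top} x + \beta_W^{\top} w}\right\}^{\delta}
    \exp\!\left\{-\Lambda_0(y)\, e^{\beta_X^{\top} x + \beta_W^{\top} w}\right\}

% Observed-data likelihood: complete cases contribute the joint density,
% incomplete cases contribute the density with X integrated out.
L(\beta, \Lambda_0, G)
  = \prod_{i=1}^{n}
    \left\{ f(Y_i, \delta_i \mid X_i, W_i)\, g(X_i \mid W_i) \right\}^{R_i}
    \left\{ \int f(Y_i, \delta_i \mid x, W_i)\, dG(x \mid W_i) \right\}^{1 - R_i}
```

In an NPML treatment of this kind, \Lambda_0 is typically taken to be a step function with jumps at the observed failure times, so the integral over the missing X is what an EM-type algorithm would handle in its E-step. Complete-case analysis, by contrast, keeps only the factors with R_i = 1.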