The Prevention and Treatment of Missing Data in Clinical Trials

Missing data have seriously compromised inferences from clinical trials, yet the topic has received little attention in the clinical-trial community.1 Existing regulatory guidances2-4 on the design, conduct, and analysis of clinical trials have little specific advice on how to address the problem of missing data. A recent National Research Council (NRC) report5 on the topic seeks to address this gap, and this article summarizes some of the main findings and recommendations of that report. The authors of this article served on the panel that prepared the report.

Missing data have seriously compromised inferences from clinical trials.1 For example, editorials in the Journal have noted how missing data have limited the ability to draw definitive conclusions from weight-loss trials6 or could lead to incorrect inferences about drug safety.7 High rates of missing data that can affect conclusions occur in trials of treatments for many diseases.8-13

Since existing regulatory guidances2-4 lack specificity, in 2008 the Food and Drug Administration (FDA) requested that the NRC convene an expert panel to prepare “a report with recommendations that would be useful for FDA's development of guidance for clinical trials on appropriate study designs and follow-up methods to reduce missing data and on appropriate statistical methods to address missing data for analysis of results.” This article summarizes some of the main findings and recommendations of the report5 of that panel. More details are provided elsewhere.14

The report focused primarily on phase 3 confirmatory clinical trials for assessing the safety and efficacy of drugs, biologic products, and some medical devices, for which the bar of scientific rigor is set high. The use of randomized study-group assignments predominates in such studies, since this design feature ensures comparability of study groups and allows assessment of causation. However, many of the recommendations are applicable to early-phase randomized trials and epidemiologic studies in general.

Missing data are defined as values that are not available and that would be meaningful for analysis if they were observed. For example, measures of quality of life are usually not meaningful for patients who have died and hence would not be considered as missing data under this definition. We focus on missing outcome data here, though analysis methods have also been developed to handle missing covariates and auxiliary data.