Group comparisons involving missing data in clinical trials: a comparison of estimates and power (size) for some simple approaches

M E Miller; T M Morgan; M A Espeland; S S Emerson

doi:10.1002/sim.904

Stat Med. 2001 Aug 30;20(16):2383-97. doi: 10.1002/sim.904.

Authors

M E Miller¹, T M Morgan, M A Espeland, S S Emerson

Affiliation

¹ Section on Biostatistics, Department of Public Health Sciences, Wake Forest University School of Medicine, Medical Center Blvd., Winston-Salem, NC 27157-1063, USA. mmiller@wfubmc.edu

PMID: 11512129

Abstract

When using 'intent-to-treat' approaches to compare outcomes between groups in clinical trials, analysts face a decision regarding how to account for missing observations. Most model-based approaches can be summarized as a process whereby the analyst makes assumptions about the distribution of the missing data in an attempt to obtain unbiased estimates that are based on functions of the observed data. An alternative approach that continues to appear in the applied literature, although Rubin has pointed out that it often leads to biased estimates of variances, is fixed-value imputation of means for missing observations. The purpose of this paper is to provide illustrations of how several fixed-value mean imputation schemes can be formulated in terms of general linear models that characterize the means of distributions of missing observations in terms of the means of the distributions of observed data. We show that several fixed-value imputation strategies will result in estimated intervention effects that correspond to maximum likelihood estimates obtained under analogous assumptions. If the missing data process has been correctly characterized, hypothesis tests based on variances estimated using maximum likelihood techniques asymptotically have the correct size. In contrast, hypothesis tests performed using the uncorrected variance, obtained by applying standard complete-data formulae to singly imputed data, can provide either conservative or anticonservative results. Surprisingly, under several non-ignorable non-response scenarios, maximum likelihood based analyses can yield hypothesis tests equivalent to those obtained when analysing only the observed data.
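As a rough illustration of the variance problem described in the abstract, the sketch below fills missing values in a single group with that group's observed mean and then applies the usual complete-data standard error formula. It is not the authors' analysis: the data values, the number of missing observations, and the helper names `se_of_mean` and `impute_group_mean` are all made up for this example, and it shows only one imputation scheme and only the anticonservative direction.

```python
import math
import statistics


def se_of_mean(values):
    """Standard complete-data standard error of a sample mean."""
    return statistics.stdev(values) / math.sqrt(len(values))


def impute_group_mean(observed, n_missing):
    """Fixed-value imputation: fill each missing slot with the observed mean."""
    fill = statistics.mean(observed)
    return list(observed) + [fill] * n_missing


# Hypothetical data: 6 observed outcomes and 4 missing ones in one trial arm.
observed = [4.0, 6.0, 5.0, 7.0, 3.0, 5.0]
n_missing = 4
completed = impute_group_mean(observed, n_missing)

mean_observed = statistics.mean(observed)
mean_imputed = statistics.mean(completed)
se_observed = se_of_mean(observed)   # uses only the 6 observed values
se_naive = se_of_mean(completed)     # treats the 4 imputed values as real data

print(f"mean (observed only): {mean_observed:.3f}")
print(f"mean (after imputation): {mean_imputed:.3f}")
print(f"SE (observed only): {se_observed:.3f}")
print(f"SE (naive, singly imputed): {se_naive:.3f}")
```

In this toy case the point estimate is unchanged by the imputation, while the naive standard error shrinks because the imputed values add no spread but are counted in the sample size. Other fixed-value schemes, such as imputing a value far from the observed mean, could move the naive variance in the opposite direction.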

Publication types

Comparative Study
Research Support, U.S. Gov't, P.H.S.

MeSH terms

Aged
Analysis of Variance
Bias
Data Interpretation, Statistical*
Exercise Therapy / methods
Exercise Therapy / standards
Follow-Up Studies
Humans
Likelihood Functions
Linear Models*
Osteoarthritis, Knee / rehabilitation
Randomized Controlled Trials as Topic*
Research Design / standards*
Sample Size*
Treatment Outcome
