Review Article
Propensity score methods gave similar results to traditional regression modeling in observational studies: a systematic review
Introduction
In observational studies, patient assignment to the exposure of interest is not under the investigators' control. Therefore, there are likely to be important differences in confounding factors between the exposure groups, so any differences in outcome may be caused by the exposure itself, by differences in the measured and unmeasured confounders, or by both.
Multivariate regression is often used to lessen the bias caused by measured confounders, although it cannot adjust for unmeasured confounders. However, investigators frequently seek parsimonious regression models that use as few covariates as possible to predict the outcome, and they rarely add interaction or nonlinear terms. The best possible adjustment for bias may thus be sacrificed to make the model easier to understand. Furthermore, regression modeling may not alert investigators to situations where the confounders do not adequately overlap between exposure groups, which threatens the validity of conclusions drawn from the data. This problem can be exacerbated when small differences in each of a large number of confounders produce marked separation between the exposure groups, and hence irresolvable selection bias.
Trying to circumvent these difficulties, Rosenbaum and Rubin [1] proposed “propensity scores” in 1983 as a method of controlling for confounding in observational studies. An individual's propensity score is defined as his or her conditional probability of a particular exposure versus another, given the observed confounders. It can be estimated with logistic regression, modeling the exposure as the dependent variable and the potential confounders as the independent variables. Because the model itself is not the focus of the study, it need not be parsimonious and easy to understand, so it can include numerous covariates (including those with statistically insignificant coefficients) and interactions and nonlinear terms. Two patients with the same propensity score have an equal estimated probability of exposure. If one was exposed and the other unexposed, the exposure allocation could be considered random, conditional on the observed confounders. Therefore, akin to a randomized trial, there is balance of the confounders between exposure groups after adjusting for the propensity score. Such balance can be assessed by comparing the distribution of confounders between exposure groups within propensity score strata or within cohorts matched on propensity score. The final propensity score model selected should maximize confounder balance between the groups. Inability to balance important confounders alerts investigators that the exposure groups are inadequately overlapping and that there is selection bias that cannot be resolved. Like regression modeling, propensity score methods cannot control for unknown confounders, but the sensitivity of the model to unknown confounders can be estimated [2].
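The estimation and balance check described above can be sketched in a few lines. This is a minimal illustration on simulated data, not the procedure used in any of the reviewed studies: the cohort, the two confounders, their coefficients, the learning rate, and the function names (`fit_propensity_model`, `propensity_scores`, `standardized_difference`) are all assumptions made for the example. A plain gradient-ascent fit stands in for a statistical package's logistic regression routine.

```python
import math
import random
import statistics

random.seed(1)

def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def fit_propensity_model(X, z, lr=0.5, epochs=500):
    """Logistic regression of exposure z (0/1) on confounders X, fitted by gradient ascent."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)  # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, zi in zip(X, z):
            err = zi - sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w

def propensity_scores(X, w):
    """Estimated probability of exposure for each subject."""
    return [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) for xi in X]

def standardized_difference(values, exposure):
    """Difference in group means divided by the pooled standard deviation."""
    g1 = [v for v, e in zip(values, exposure) if e == 1]
    g0 = [v for v, e in zip(values, exposure) if e == 0]
    if len(g1) < 2 or len(g0) < 2:
        return None  # too few subjects in one group to compare
    pooled_sd = math.sqrt((statistics.variance(g1) + statistics.variance(g0)) / 2)
    return (statistics.mean(g1) - statistics.mean(g0)) / pooled_sd

# Simulated cohort (hypothetical): two standardized confounders drive exposure.
X, z = [], []
for _ in range(300):
    x1, x2 = random.gauss(0, 1), random.gauss(0, 1)
    X.append([x1, x2])
    z.append(1 if random.random() < sigmoid(-0.2 + 0.8 * x1 - 0.5 * x2) else 0)

w = fit_propensity_model(X, z)
ps = propensity_scores(X, w)

# Balance of confounder x1: crude comparison versus within propensity score quintiles.
x1_values = [row[0] for row in X]
crude = standardized_difference(x1_values, z)
order = sorted(range(len(ps)), key=lambda i: ps[i])
bounds = [k * len(order) // 5 for k in range(6)]
within = []
for k in range(5):
    idx = order[bounds[k]:bounds[k + 1]]
    d = standardized_difference([x1_values[i] for i in idx], [z[i] for i in idx])
    if d is not None:
        within.append(abs(d))

print("crude standardized difference in x1:", round(crude, 3))
if within:
    print("mean absolute within-stratum difference:", round(sum(within) / len(within), 3))
```

If the within-stratum differences remain large for an important confounder, that is the signal described above that the exposure groups do not overlap adequately; in practice one would revise the propensity model (for example, by adding interactions or nonlinear terms) and check balance again.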
Propensity scores can be applied in several ways. Exposed and unexposed cohorts matched on the propensity score can be formed, and the outcomes can be compared between them. Alternatively, patients can be stratified by the propensity score, and pooled stratum-specific estimates of the outcome can be computed. The former method results in well-balanced but smaller groups for comparison; the latter method retains a larger sample size, but the exposure groups are more heterogeneous within each stratum. In yet another application, the propensity score itself, representing a summary of all the other potential confounders, can be included with exposure as a covariate in a multivariate regression model predicting outcome, with or without inclusion of other potential confounders as additional covariates.
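The matching and stratification approaches above can be sketched as follows. This is an illustrative example under stated assumptions, not the method of any reviewed study: the cohort is simulated, the true propensity score is used in place of a fitted one (possible only because the data are simulated), the outcome is binary and summarized as a risk difference, and the quintile count, the 0.05 caliper, and the function names (`risk_difference`, `stratified_estimate`, `greedy_match`) are choices made for the example.

```python
import math
import random

random.seed(0)

# Simulated cohort (hypothetical): one confounder x drives both exposure and outcome.
cohort = []
for _ in range(1000):
    x = random.gauss(0.0, 1.0)
    ps = 1.0 / (1.0 + math.exp(-x))  # true propensity score, known only by simulation
    exposed = 1 if random.random() < ps else 0
    risk = min(max(0.2 + 0.1 * x - 0.05 * exposed, 0.0), 1.0)
    outcome = 1 if random.random() < risk else 0
    cohort.append({"ps": ps, "exposed": exposed, "outcome": outcome})

def risk_difference(rows):
    """Outcome risk in exposed minus risk in unexposed; None if either group is empty."""
    exp = [r["outcome"] for r in rows if r["exposed"] == 1]
    unexp = [r["outcome"] for r in rows if r["exposed"] == 0]
    if not exp or not unexp:
        return None
    return sum(exp) / len(exp) - sum(unexp) / len(unexp)

def stratified_estimate(rows, n_strata=5):
    """Pool stratum-specific risk differences, weighting each stratum by its size."""
    ordered = sorted(rows, key=lambda r: r["ps"])
    bounds = [k * len(ordered) // n_strata for k in range(n_strata + 1)]
    strata = [ordered[bounds[k]:bounds[k + 1]] for k in range(n_strata)]
    usable = [(len(s), risk_difference(s)) for s in strata]
    usable = [(n, rd) for n, rd in usable if rd is not None]
    total = sum(n for n, _ in usable)
    return sum(n * rd for n, rd in usable) / total, strata

def greedy_match(rows, caliper=0.05):
    """1:1 nearest-neighbour matching on the propensity score, without replacement."""
    exposed = [r for r in rows if r["exposed"] == 1]
    pool = [r for r in rows if r["exposed"] == 0]
    pairs = []
    for e in exposed:
        if not pool:
            break
        best_i = min(range(len(pool)), key=lambda i: abs(pool[i]["ps"] - e["ps"]))
        if abs(pool[best_i]["ps"] - e["ps"]) <= caliper:
            pairs.append((e, pool.pop(best_i)))  # each unexposed subject is used once
    return pairs

crude = risk_difference(cohort)
stratified, strata = stratified_estimate(cohort)
pairs = greedy_match(cohort)
matched = sum(e["outcome"] - u["outcome"] for e, u in pairs) / len(pairs)

print("crude risk difference:     ", round(crude, 3))
print("stratified risk difference:", round(stratified, 3))
print("matched risk difference:   ", round(matched, 3), "from", len(pairs), "pairs")
```

The sketch mirrors the trade-off described above: matching discards subjects without a close counterpart and so compares fewer, better-balanced subjects, whereas stratification keeps the whole cohort but leaves some residual spread of propensity scores within each stratum.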
There has been an explosion of observational studies using propensity scores; however, whether this method yields different results from traditional regression modeling has not been determined.
Searching
We performed a systematic review of published observational studies that used both traditional regression and propensity score methodology to control for confounding. Citations indexed up to June 2003 were sought from both Medline and Embase. The search strategy selected articles containing “propensity scor$” as a textword, or those containing “propensity” as a textword and also indexed with the exploded MeSH subject headings “regression analysis” or “multivariate analysis.”
Selection
To be selected for
Trial flow
The search strategy (Fig. 1) identified 536 potentially relevant citations. Of the 130 retrieved for further consideration, 87 did not meet the selection criteria, leaving 43 studies for inclusion.
Study characteristics
Forty-three articles that reported both traditional regression and propensity score methods to control for confounding in the same exposure–outcome association were selected for review. The majority of studies were from the cardiology and cardiac surgery literature. Nearly two thirds were published in
Discussion
Braitman and Rosenbaum [46] discussed two strategies to adjust for overt biases in observational studies. One focuses on the relationship between prognostic variables and outcomes and models the response directly, using traditional regression methods. The other focuses on the relationship between prognostic variables and exposures without any consideration of outcome and uses propensity scores to emulate randomization. Propensity score methods offer theoretical advantages over traditional
Acknowledgments
We are grateful to Drs. Thérèse Stukel and Muhammad Mamdani for their helpful comments on this manuscript. Dr. Shah is supported by a Clinician-Scientist award and Dr. Austin by a New Investigator award from the Canadian Institutes of Health Research (CIHR). Dr. Laupacis is a Senior Scientist of the CIHR.
References (50)
- et al. Effect of lipid-lowering therapy on early mortality after acute coronary syndromes: an observational study. Lancet (2001)
- et al. Splenectomy and risk of blast transformation in myelofibrosis with myeloid metaplasia. Blood (1998)
- et al. Mortality benefit of beta-blockade after successful elective percutaneous coronary intervention. J Am Coll Cardiol (2002)
- et al. Primary angioplasty and selection bias in patients presenting late (>12 h) after onset of chest pain and ST elevation myocardial infarction. J Am Coll Cardiol (2002)
- et al. Internal thoracic artery grafting in the elderly patient undergoing coronary artery bypass grafting: room for process improvement? J Thorac Cardiovasc Surg (2002)
- et al. Continuous retrograde blood cardioplegia is associated with lower hospital mortality after heart valve surgery. J Thorac Cardiovasc Surg (2003)
- et al. A propensity analysis of cigarette smoking and mortality with consideration of the effects of alcohol. Am J Cardiol (2001)
- et al. Propensity score analysis of stroke after off-pump coronary artery bypass grafting. Ann Thorac Surg (2002)
- et al. Predicting incidence of some critical events by sun signs: The PISCES study. ACC Curr J Rev (2003)
- et al. Early mortality and morbidity of bilateral versus single internal thoracic artery revascularization: propensity and risk modeling. J Am Coll Cardiol (2001)