See companion article.1 The world cannot revolve and medicine cannot advance using randomised controlled trial (RCT) data alone. Increasing weight is, quite appropriately, being given to observational research, both in answering important clinical questions and in developing guidelines and policy. With that in mind, the quality of these observational studies is paramount, and quality needs to be appreciated at several levels, including methodology, interpretation and data transparency. There are plenty of examples of observational studies and RCTs failing to reach the same conclusions.2–4 Based on multifaculty input from 47 editors of the main respiratory, critical care and sleep journals, Lederer and colleagues have published, in a recent consensus statement in the Annals of the American Thoracic Society, guidance on how to control for confounders and report results in observational studies.5
Investigation of the causal effect of an exposure (eg, risk factor) on a health outcome in observational studies is not straightforward. Despite a very rich epidemiological and statistical literature covering all issues and methods available to address them, this literature can be a difficult read for clinical researchers—and help from a methodologist is not always at hand. By simplifying complex statistical concepts using accessible language and simple graphical displays, articles like the review by Lederer and colleagues play an important role in making clinical researchers aware of the main issues surrounding causal inference in observational studies, and in providing them with some practical guidance.
So, what is all the fuss about? Let’s start with the basics. The aim of an observational study is very often to evaluate the causal effect of an exposure on an outcome. Unlike in an RCT, where confounding is dealt with by design through randomisation so that everything should be the same in the two groups apart from the intervention (exposure) of interest, the problem of confounding needs to be carefully considered and addressed in an observational study. Taking the principles of the review and following a simple example, let’s ask the question ‘Does reading Lederer et al make me more likely to undertake a high-quality observational study?’ In this example, our exposure is reading Lederer et al and our outcome high-quality observational research.
But it’s never as simple as this. Maybe a statistician is more likely to read the review, and a statistician is also more likely to perform studies with robust methodology. An observed association between reading the review and performing high-quality observational studies could therefore be (completely or partly) due to the presence of statisticians among the readers, rather than the fact that readers have learnt from the review. Being a statistician would therefore be a confounder. This is not in itself a problem, as long as it has been recorded: it could easily be dealt with by adjusting for it in the analysis, for example, by using multiple regression.
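The confounding described above can be illustrated with a small simulation. This is a minimal sketch, not an analysis from the review: all probabilities are invented for illustration, reading the review is given no effect at all on study quality, and adjustment is done by stratification rather than multiple regression so that only the Python standard library is needed.

```python
import random

def risk(rows, exposed):
    """Proportion of high-quality studies among rows with the given reading status."""
    outcomes = [quality for read, quality in rows if read == exposed]
    return sum(outcomes) / len(outcomes)

def simulate(n=100_000, seed=1):
    """Hypothetical researchers; every probability here is an assumption."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        statistician = rng.random() < 0.3
        # Statisticians are more likely to read the review...
        read = rng.random() < (0.8 if statistician else 0.3)
        # ...and more likely to do high-quality work. Reading has NO effect.
        quality = rng.random() < (0.7 if statistician else 0.2)
        data.append((statistician, read, quality))
    return data

data = simulate()

# Crude comparison: readers versus non-readers, ignoring the confounder.
everyone = [(read, quality) for _, read, quality in data]
crude = risk(everyone, True) - risk(everyone, False)

# Adjusted comparison: risk difference within each stratum of the confounder,
# averaged with weights proportional to stratum size.
adjusted = 0.0
for level in (True, False):
    stratum = [(read, quality) for stat, read, quality in data if stat == level]
    weight = len(stratum) / len(data)
    adjusted += weight * (risk(stratum, True) - risk(stratum, False))

print(f"crude risk difference:    {crude:.3f}")
print(f"adjusted risk difference: {adjusted:.3f}")
```

Under these assumptions the crude difference is clearly positive even though reading does nothing, while the stratified estimate sits close to zero.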
To get rid of confounding, we might be tempted to adjust the analysis for any factor possibly associated with both exposure and outcome, without thinking about the nature and direction of these associations. Unfortunately, this does not work since inappropriate adjustment can itself introduce bias. For example, completion of a short course on epidemiology may look like a confounder similar to being a statistician, but what if it were a consequence of increased interest in the topic after reading the review, and one of the mechanisms underlying the review's effect on performing a high-quality study? In this case, completion of the course would act like a mediator, and adjusting for it would mask a true effect of reading the review on performing a high-quality study (bias towards the null).
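The mediator problem can be sketched in the same way. In this hypothetical simulation (all probabilities are assumptions chosen for illustration), reading the review improves study quality only by making researchers more likely to complete a short course. Stratifying on course completion then hides an effect that is real.

```python
import random

def risk(rows, exposed):
    """Proportion of high-quality studies among rows with the given reading status."""
    outcomes = [quality for read, quality in rows if read == exposed]
    return sum(outcomes) / len(outcomes)

rng = random.Random(2)
data = []
for _ in range(100_000):
    read = rng.random() < 0.5
    # Reading the review makes course completion more likely (assumed values).
    course = rng.random() < (0.7 if read else 0.1)
    # Quality depends on the course only, so the whole effect is mediated.
    quality = rng.random() < (0.6 if course else 0.2)
    data.append((read, course, quality))

# Total effect of reading: compare readers with non-readers, no adjustment.
everyone = [(read, quality) for read, _, quality in data]
total_effect = risk(everyone, True) - risk(everyone, False)

# Inappropriate adjustment: stratify on the mediator and average the strata.
adjusted = 0.0
for level in (True, False):
    stratum = [(read, quality) for read, course, quality in data if course == level]
    weight = len(stratum) / len(data)
    adjusted += weight * (risk(stratum, True) - risk(stratum, False))

print(f"total effect of reading:      {total_effect:.3f}")
print(f"after adjusting for mediator: {adjusted:.3f}")
```

Here the unadjusted comparison recovers a sizeable effect, whereas the estimate adjusted for the mediator is pulled to roughly zero, which is the bias towards the null described above.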
Inappropriate adjustment may even produce a spurious association (bias away from the null) if the factor we adjust for is a collider, that is, something that represents a consequence of both the exposure and the outcome. For example, the authors of the review recommend using directed acyclic graphs (DAGs) to visually represent causal models and refer the reader to DAGitty.net for a simple interface.6 Since DAGs are more likely to be used by investigators interested in methodological aspects, visiting DAGitty.net might be a result of reading the review as well as a result of being interested in designing a high-quality study; adjusting for it could introduce a false link between our exposure and outcome of interest. So, how do we take this all into consideration?
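A short simulation shows how conditioning on a collider can manufacture an association. This is a sketch under invented assumptions: reading the review and producing a high-quality study are generated independently, and each separately raises the chance of visiting DAGitty.net.

```python
import random

def risk(rows, exposed):
    """Proportion of high-quality studies among rows with the given reading status."""
    outcomes = [quality for read, quality in rows if read == exposed]
    return sum(outcomes) / len(outcomes)

rng = random.Random(3)
data = []
for _ in range(100_000):
    # Exposure and outcome are independent: there is no true effect.
    read = rng.random() < 0.5
    quality = rng.random() < 0.3
    # The collider: both reading and quality make a visit more likely (assumed values).
    visit = rng.random() < 0.1 + 0.4 * read + 0.4 * quality
    data.append((read, quality, visit))

# Unadjusted comparison across everyone.
everyone = [(read, quality) for read, quality, _ in data]
crude = risk(everyone, True) - risk(everyone, False)

# Conditioning on the collider: look only within visitors, then only within non-visitors.
visitors = [(read, quality) for read, quality, visit in data if visit]
non_visitors = [(read, quality) for read, quality, visit in data if not visit]
among_visitors = risk(visitors, True) - risk(visitors, False)
among_non_visitors = risk(non_visitors, True) - risk(non_visitors, False)

print(f"crude risk difference:        {crude:.3f}")
print(f"among DAGitty visitors:       {among_visitors:.3f}")
print(f"among non-visitors:           {among_non_visitors:.3f}")
```

With these numbers the crude difference is near zero, as it should be, but within each stratum of the collider readers appear less likely to produce high-quality work. The association is created purely by the adjustment.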
The suggestion in this review is to approach identification of confounders from several angles and visually represent the possible relationships: thinking about what is already known in the field about the relationship between exposure and outcome, analysing each variable and how it may affect this relationship and, perhaps most importantly, thinking about how other variables, which may act as mediators or colliders, should be handled.
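To make the idea of writing down a causal model concrete, the running example can be encoded as a small directed graph and each variable classified by where it sits relative to exposure and outcome. This is a simplified sketch: the graph, the variable names and the `classify` rules are illustrative assumptions, not the algorithm used by DAGitty.net or recommended in the review, and real causal models need more careful path analysis.

```python
# Hypothetical causal model: each key is assumed to cause the variables in its list.
DAG = {
    "statistician": ["read_review", "high_quality"],
    "read_review": ["short_course", "visit_dagitty", "high_quality"],
    "short_course": ["high_quality"],
    "high_quality": ["visit_dagitty"],
    "visit_dagitty": [],
}

def descendants(graph, node, blocked=frozenset()):
    """All variables reachable from node along arrows, never passing through blocked nodes."""
    seen = set()
    stack = [child for child in graph.get(node, []) if child not in blocked]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(child for child in graph.get(current, []) if child not in blocked)
    return seen

def classify(graph, variable, exposure, outcome):
    """Rough role of a variable relative to an exposure and an outcome."""
    # Caused by the exposure through a path that does not run via the outcome.
    after_exposure = variable in descendants(graph, exposure, blocked={outcome})
    after_outcome = variable in descendants(graph, outcome)
    if after_exposure and after_outcome:
        return "collider"      # consequence of both: do not adjust
    if after_exposure and outcome in descendants(graph, variable):
        return "mediator"      # on the causal pathway: adjusting masks the effect
    causes_exposure = exposure in descendants(graph, variable)
    # Must reach the outcome by a route that does not go through the exposure.
    causes_outcome = outcome in descendants(graph, variable, blocked={exposure})
    if causes_exposure and causes_outcome:
        return "confounder"    # common cause: adjust
    return "other"

for name in ("statistician", "short_course", "visit_dagitty"):
    print(name, "->", classify(DAG, name, "read_review", "high_quality"))
```

Writing the arrows down first, and only then deciding what to adjust for, is the point of the exercise: the same three variables would be handled differently if the arrows were drawn differently.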
The most important thing to remember about a confounder is that you must have thought about it and measured it to be able to adjust for it. No amount of statistical jiggery-pokery can control for something that is not measured. The authors go on to stress the importance of how variables for a model are selected, and how residual confounding, if not properly thought about, may lead one to misinterpret the true effect of the exposure on the outcome.
Another important aspect highlighted in this review is the interpretation of p values, given that they do not actually help in determining the clinical importance of an association. P values measure the statistical strength of the evidence, but are two papers reporting the association between the same exposure and outcome with p values of 0.048 and 0.052 really different in their conclusion about the likely presence of an association? And most importantly, p values do not provide any information on the magnitude of the association. Arguably, the confidence interval (CI), a measure of uncertainty around the size of the association estimated in the study, is far more informative and clinically useful.
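The point about p values of 0.048 and 0.052 can be made numerically. The sketch below assumes, purely for illustration, that both papers estimated the same risk difference of 0.10 and used a two-sided normal (Wald) test; the helper `interval_from_p` is hypothetical and simply back-calculates the standard error implied by each p value before forming a 95% CI.

```python
from statistics import NormalDist

def interval_from_p(estimate, p_value, level=0.95):
    """CI implied by an estimate and its two-sided p value under a normal approximation."""
    std_normal = NormalDist()
    # z statistic that would give this two-sided p value.
    z_observed = std_normal.inv_cdf(1 - p_value / 2)
    standard_error = abs(estimate) / z_observed
    # Critical value for the requested confidence level (about 1.96 for 95%).
    z_critical = std_normal.inv_cdf(1 - (1 - level) / 2)
    margin = z_critical * standard_error
    return estimate - margin, estimate + margin

for p in (0.048, 0.052):
    low, high = interval_from_p(0.10, p)
    print(f"p = {p}: 95% CI {low:.4f} to {high:.4f}")
```

The two intervals are almost indistinguishable: one just excludes zero and the other just includes it. Read as intervals, the two hypothetical papers tell essentially the same story about the size and uncertainty of the association.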
This leads nicely on to the importance of clinical versus statistical significance and the importance of both when thinking about causal diagrams and what variables belong in a model. Just because age does not happen to be ‘statistically significant’ in a model looking at the association between cardiovascular disease and death, it does not mean we would not include it, as clinically it is important and is certainly a confounder.
The final point the authors discuss is transparency. This is crucial in helping readers to interpret results and put findings into context.
So, our advice? Read the review and dive into this further by learning about specific issues and methods relevant to your research, so that we all design, analyse and interpret observational studies better.7
Contributors JKQ wrote the first draft; both authors contributed to subsequent drafts.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Patient consent Not required.
Provenance and peer review Commissioned; internally peer reviewed.