Background: Clinical trials measure exacerbations of chronic obstructive pulmonary disease (COPD) inconsistently. A study was undertaken to determine if different methods for ascertaining and analysing COPD exacerbations lead to biased estimates of treatment effects.
Methods: Information on the methods used to count, analyse and report COPD exacerbation rates was abstracted from clinical trials of long-acting bronchodilators or long-acting bronchodilator/inhaled steroid combination products published between 2000 and 2006. Data from the Canadian Optimal Therapy of COPD Trial were used to illustrate how different analytical approaches can affect the estimate of exacerbation rates and their confidence intervals.
Results: 22 trials (17 156 patients) met the inclusion criteria and were reviewed. None of the trials adjudicated exacerbations or determined independence of events. 14/22 studies (64%) introduced selection bias by not analysing outcome data for subjects who prematurely stopped study medications. Only 31% of trials used time-weighted analyses to calculate the mean number of exacerbations/patient-year and only 15% accounted for between-subject variation. In the Canadian Optimal Therapy of COPD Trial the rate ratio for exacerbations/patient-year was 0.85 when all data were included in a time-weighted analysis. The treatment benefit was overestimated (rate ratio 0.79) when data for those who prematurely stopped study medications were excluded, and was further overestimated (rate ratio 0.46) when, in addition, a time-weighted analysis was not conducted; p values ranged from 0.03 to 0.24 depending on how exacerbations were determined and analysed.
Conclusions: Clinical trials have used widely different methods to define and analyse COPD exacerbations and this can lead to biased estimates of treatment effects. Future trials should strive to include blinded adjudication and assessment of the independence of exacerbation events, and trials should report time-weighted intention-to-treat analyses with adjustments for between-subject variation in COPD exacerbations.
Patients with chronic obstructive pulmonary disease (COPD) exhibit slow progressive deterioration in airflow and respiratory status that can be punctuated by acute episodes of clinical deterioration known as COPD exacerbations. Acute exacerbations of COPD are characterised clinically by acute or subacute worsening of respiratory symptoms and may include abrupt increases in cough, sputum production, sputum purulence and breathlessness.1
COPD exacerbations have an important negative impact on health-related quality of life2–4 and generate considerable economic costs.5 The prevention of exacerbations is now recognised as a primary goal of COPD therapy.6 Earlier trials of COPD therapy considered lung function as the primary outcome and analysed exacerbations as secondary outcomes.7 8 More recently, many clinical trials of maintenance medications for COPD have evaluated COPD exacerbation rates as a primary outcome. Unfortunately, clinical trials have not been consistent in how they count, record or analyse COPD exacerbation rates, and methodological errors in the assessment of COPD exacerbations may lead to biased or spurious results.
The objective of this study was twofold. The first objective was to perform a systematic review of clinical trials published since 2000 to document potential inconsistencies in how COPD exacerbations were counted, analysed and reported among published studies. The second objective was to use data from a clinical trial to illustrate how differences in methodology and analysis affect the accuracy and precision of the results, and to determine if improper methods for ascertaining and analysing COPD exacerbation rates lead to biased estimates of treatment effects. It is hoped that this study will provide valuable information to help investigators design future intervention studies that evaluate acute exacerbations of COPD.
We performed a systematic literature search of the MEDLINE and Cochrane Clinical Trials Registry Databases to identify randomised controlled trials, published in print or on the internet between 2000 and November 2006, of patients with COPD who were treated with long-acting β agonist bronchodilators, long-acting anticholinergic bronchodilators or long-acting β agonist/inhaled corticosteroid combination products. Studies published in any language were included if they were randomised controlled trials that reported COPD exacerbation rates as a primary or secondary outcome. The search was performed using the search terms: COPD, chronic obstructive pulmonary disease, or obstructive airway disease; and adrenergic β-agonist, long-acting β-agonists, administration/inhalation, formoterol, salmeterol, anticholinergic, cholinergic antagonist, tiotropium, inhaled corticosteroid, fluticasone-salmeterol, salmeterol-fluticasone, budesonide-formoterol or formoterol-budesonide. In addition, relevant systematic reviews and meta-analyses were reviewed and all references of identified trials were retrieved.
Information was extracted on the methods that each study used to define, count, record and analyse acute exacerbations of COPD. Specific issues assessed were:
Counting exacerbation events
Analysing exacerbation events
Reporting exacerbation events
Counting exacerbation events
Did the study use a symptom-based or an event-based definition of an acute exacerbation of COPD? A symptom-based definition uses a complex of worsening respiratory symptoms to define an acute exacerbation of COPD whereas an event-based definition requires a therapeutic intervention such as a change in COPD medications or a change in healthcare utilisation to define an event.6
How did investigators distinguish between new exacerbations and slow-to-resolve exacerbations or relapse of previous exacerbations? Was independence of individual events assured?
Was there blinded adjudication of exacerbation events to ensure consistency with the study definition?
Were patients maintained in the study regardless of whether they prematurely discontinued study treatments?
Analysing exacerbation events
Patients may drop out early from clinical trials. An unweighted statistical approach does not adjust for time spent in the trial, and it therefore produces a biased estimate because it gives too much weight to exacerbations that occur in patients who drop out early. In contrast, a weighted statistical approach adjusts for asymmetry in follow-up times by accounting for each patient’s time spent in the trial, and this approach produces an unbiased estimate.9 A second issue is that, in most parallel group clinical trials, variation between subjects in the effect of treatment results in overdispersion of residuals in a standard parametric analysis. This leads to inappropriate narrowing of the confidence intervals around the estimate. This can be corrected either by using a Poisson distribution with adjustment for the estimated overdispersion parameter or by using a negative binomial error distribution.9 10
We therefore assessed whether studies used time-weighted analyses to calculate exacerbation rates, and whether they accounted for between-subject variation by correcting for overdispersion.
Reporting exacerbation events
Did studies report the proportion of subjects who experienced an exacerbation in addition to the mean exacerbation events/patient-year?
Finally, data from the Canadian Optimal Therapy of COPD Trial11 were analysed to determine how the rate of exacerbations/patient-year and the resultant rate ratio were affected by:
Use of blinded adjudication and assessment of independence of exacerbation events.
Exclusion of patients when they prematurely discontinued study medications.
Use of time-weighted compared with unweighted mean rates.
Use of overdispersion corrections to assess the statistical significance of the results.
Results of the systematic review
A total of 339 potentially relevant citations were retrieved, from which 35 published clinical trials were identified which potentially fulfilled the inclusion criteria. Of these, 13 articles were excluded (10 because exacerbations were only identified as adverse events and were not identified as a primary or secondary outcome12–20 24 and 3 because the results overlapped with previously published trials21–23).
Of the 22 trials included, 7 evaluated long-acting β agonists (LABAs),25–31 8 studied LABA/inhaled steroid combination products (all of these studies also included a LABA arm),32–39 and 7 studies evaluated a long-acting anticholinergic bronchodilator (2 of these studies also included a LABA arm).40–46 The characteristics of the 22 trials included in the systematic review are shown in the table in the online Appendix.
Definition of COPD exacerbation used
Seventeen of the 22 studies (77%) used an event-based definition of COPD exacerbations (see online table). Of the 17 studies, 11 limited their definition to exacerbation episodes that required new treatment with antibiotics and/or systemic corticosteroids and/or hospitalisation to count as an event-based outcome. Six studies (27%) also counted “mild exacerbations”, defined as days requiring increased use of as-needed inhalations of reliever medication above the usual daily use.
Four of the 22 studies (18%) used a symptom-based definition of acute exacerbations of COPD, defined as a complex of worsening respiratory symptoms lasting at least 3 days which was not necessarily associated with a therapeutic intervention. One study used a symptom-based definition to identify mild exacerbations and an event-based definition to identify moderate or severe exacerbations.29
Methods used to count exacerbations and enhance the quality of exacerbation measurements
Independence of events
None of the 22 trials reported whether they determined independence of individual exacerbation events, and none reported whether they used criteria to distinguish a new exacerbation event from a relapse of the original exacerbation. One study did state criteria for how they determined the end point of a mild exacerbation, but did not state equivalent criteria by which they determined the end point of a moderate or severe exacerbation.38
Adjudication of exacerbation events
None of the 22 trials reported whether they obtained medical records from the patient or healthcare provider in order to adjudicate suspected exacerbation events. None of the studies described quality control measures to ensure that events counted as exacerbations were consistent with the study definition of exacerbation.
Premature withdrawal of patients
Only one study explicitly stated that attempts were made to follow all patients for the full duration of the study and record all exacerbation events, regardless of whether patients continued on study medications.43 Fourteen of the 22 studies (64%) automatically withdrew patients from the study if they stopped study drugs and these studies did not continue to monitor these patients or record any subsequent exacerbation events.
Methods used to analyse exacerbation rates statistically
Of the 13 trials that reported the mean number of exacerbations/patient-year, only 4 (31%) used analyses which weighted each patient’s individual exacerbation rate by their follow-up time.33–35 39 Of these 13 trials, only 2 (15%) accounted for the effect of between-subject variation on precision of the estimates by incorporating an overdispersion parameter in the analysis.33 35
Methods used to report exacerbation rates
Nine of the 22 trials reported only the proportion of patients experiencing an exacerbation in each treatment group, 7 reported only the mean number of exacerbations/patient-year and 6 reported both the proportions experiencing an exacerbation and the mean number of exacerbations/patient-year.
Effects of alternative methods for counting and analysing acute exacerbations of COPD on a real clinical trial data set
Data from the Canadian Optimal Therapy of COPD Trial were analysed to determine how the rate of exacerbations per patient per year and the resultant rate ratio and confidence intervals were affected by alternative methods of counting and analysing COPD exacerbations.
The Optimal Trial randomised 449 patients with moderate or severe COPD to 1 year of treatment with tiotropium + placebo or tiotropium + salmeterol or tiotropium + fluticasone/salmeterol.11 The primary outcome was the proportion of patients in each treatment group who experienced a COPD exacerbation requiring treatment with oral or intravenous steroids and/or antibiotics within 52 weeks of randomisation. Patients were followed for the full 52-week duration of the trial and primary and secondary outcomes were recorded throughout the 1-year period regardless of whether patients had experienced an exacerbation or discontinued study medications. A patient was considered to have experienced a new COPD exacerbation if they had been off oral steroids and antibiotics for at least 14 days following their previous exacerbation. For every suspected exacerbation a full report was prepared which included a patient symptom questionnaire as well as physician, emergency department and hospital records describing the circumstances of each suspected exacerbation. The assembled data from the suspected exacerbation visit were presented to a blinded Adjudication Committee which confirmed whether the event met the study definition of a COPD exacerbation, and also whether the event met the study criteria for a new exacerbation rather than a relapse or continuation of a previously recorded exacerbation.
Although the Optimal Trial had three treatment arms, for the purposes of illustration only two treatment arms (tiotropium + placebo and tiotropium + fluticasone/salmeterol) are presented (table 1). Patients randomised to receive tiotropium + placebo experienced 222 exacerbations/138 patient-years of follow-up = 1.61 exacerbations/patient-year compared with 188 exacerbations/137 patient-years of follow-up = 1.37 exacerbations/patient-year in those randomised to receive tiotropium + fluticasone/salmeterol. The weighted rate ratio is simply calculated as 1.37/1.61 = 0.85, and the relative risk reduction is therefore equal to 15%.
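The arithmetic in this paragraph can be reproduced in a few lines. The following Python sketch uses only the event counts and patient-years quoted above; the function and variable names are illustrative and are not taken from the trial's analysis code.

```python
def rate_per_patient_year(events: int, patient_years: float) -> float:
    """Time-weighted rate: total events divided by total follow-up time."""
    return events / patient_years


# Event counts and follow-up quoted in the text for the two arms presented.
placebo_rate = rate_per_patient_year(222, 138)  # tiotropium + placebo
combo_rate = rate_per_patient_year(188, 137)    # tiotropium + fluticasone/salmeterol

rate_ratio = combo_rate / placebo_rate
relative_risk_reduction = 1 - rate_ratio

print(f"placebo rate:     {placebo_rate:.2f} exacerbations/patient-year")
print(f"combination rate: {combo_rate:.2f} exacerbations/patient-year")
print(f"rate ratio:       {rate_ratio:.2f}")
print(f"relative risk reduction: {relative_risk_reduction:.0%}")
```

Because both rates divide by the total follow-up time of the arm, this is the time-weighted calculation rather than an average of per-patient rates.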
Table 2 shows the effects of failure to determine the independence and validity of reported possible exacerbation events through adjudication. The Optimal Trial considered patients to have experienced a new COPD exacerbation if they had been off oral steroids and antibiotics for at least 14 days following their previous exacerbation; 66/288 possible exacerbation events (23%) in the tiotropium + placebo group and 62/250 possible events (25%) in the tiotropium + fluticasone/salmeterol group were adjudicated and judged not to be true exacerbation events, either because they did not meet the study definition of exacerbation (eg, patient received antibiotics for sinusitis rather than COPD exacerbation) or because they were not independent events (eg, patient presented for COPD exacerbation on two occasions within 1 week). If these suspected events had not been excluded by adjudication, this would have artificially inflated the rate of exacerbations in each treatment group producing a small change in the rate ratio (table 3).
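The size of this effect can be approximated from the counts quoted above. The sketch below is illustrative only: it assumes that the follow-up reported for each arm (138 and 137 patient-years) applies unchanged to the unadjudicated counts, which is our simplification and not a figure taken from the tables.

```python
# Reported possible events, events rejected by adjudication, and follow-up
# for each arm. The patient-years are ASSUMED to be the same with and
# without adjudication.
arms = {
    "tiotropium + placebo": {"reported": 288, "rejected": 66, "patient_years": 138},
    "tiotropium + fluticasone/salmeterol": {"reported": 250, "rejected": 62, "patient_years": 137},
}


def rates(arm):
    """Return (unadjudicated rate, adjudicated rate) per patient-year."""
    confirmed = arm["reported"] - arm["rejected"]
    return arm["reported"] / arm["patient_years"], confirmed / arm["patient_years"]


placebo_raw, placebo_adj = rates(arms["tiotropium + placebo"])
combo_raw, combo_adj = rates(arms["tiotropium + fluticasone/salmeterol"])

raw_ratio = combo_raw / placebo_raw
adjudicated_ratio = combo_adj / placebo_adj

print(f"unadjudicated rates: {placebo_raw:.2f} vs {combo_raw:.2f}, ratio {raw_ratio:.2f}")
print(f"adjudicated rates:   {placebo_adj:.2f} vs {combo_adj:.2f}, ratio {adjudicated_ratio:.2f}")
```

Under that assumption both rates rise by roughly a quarter when unadjudicated events are counted, while the rate ratio moves only from about 0.85 to about 0.87, because a similar fraction of events was rejected in each arm.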
Table 3 shows the effects of using an unweighted mean approach to determine rates of exacerbation. This approach divides each patient’s number of exacerbations by the length of time each patient was followed. The mean rate for the group is then estimated using the average of these individual patient rates. In contrast, a weighted approach divides the total number of exacerbations in a treatment group by the total duration of follow-up time of the group. As shown in table 3, use of an unweighted approach produces a biased estimate of the mean rates and consequently the rate ratio, since individuals who drop out of a study early after having had one or more exacerbations will contribute proportionally more to the mean rate than if they were analysed using unbiased time-weighted methods. In the dataset in the Canadian Optimal Therapy of COPD Trial, the rate ratio changes from 0.85 to 0.74 when an unweighted approach is used. This has the effect of exaggerating the benefits of treatment and inflating the relative risk reduction from 15% to 26%.
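The difference between the two approaches is easiest to see on a small example. The data below are hypothetical and were chosen only to show the direction of the bias; they are not drawn from the Optimal Trial, and the function names are ours.

```python
# Hypothetical (events, years of follow-up) pairs; NOT trial data.
# Two control patients drop out early after a single exacerbation.
control = [(2, 1.0), (1, 1.0), (2, 1.0), (1, 0.25), (1, 0.2)]
treatment = [(1, 1.0), (2, 1.0), (1, 1.0), (2, 1.0), (0, 0.5)]


def unweighted_rate(patients):
    """Average of each patient's own rate (events / years)."""
    return sum(events / years for events, years in patients) / len(patients)


def weighted_rate(patients):
    """Total events in the group divided by total follow-up time."""
    total_events = sum(events for events, _ in patients)
    total_years = sum(years for _, years in patients)
    return total_events / total_years


unweighted_rr = unweighted_rate(treatment) / unweighted_rate(control)
weighted_rr = weighted_rate(treatment) / weighted_rate(control)

print(f"unweighted rate ratio: {unweighted_rr:.2f}")
print(f"weighted rate ratio:   {weighted_rr:.2f}")
```

In this toy dataset the two early drop-outs contribute individual rates of 4 and 5 exacerbations/year to the unweighted control mean, so the unweighted rate ratio (about 0.43) is well below the weighted one (about 0.66) and the apparent benefit of treatment is exaggerated.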
Table 3 also shows the effects of excluding outcome data for patients after they prematurely stop study medications. The effect is to exaggerate the effects of treatment, such that in the Optimal dataset the rate ratio drops from 0.85 when patients are followed until termination of the one-year study period, down to 0.79 when patients are excluded at the point when they prematurely stop study medications. These effects are further compounded if an unweighted approach is used together with premature exclusion of patients, in which case the rate ratio is even further underestimated at 0.46. This has the effect of grossly exaggerating the benefits of treatment, and inflating the relative risk reduction from 15% to 54%.
The statistical significance of the weighted rate ratios for exacerbation was assessed by the p value and the precision of the estimates is presented as confidence intervals (table 4). The least biased p value is produced when the intention-to-treat dataset is analysed using either a Poisson regression analysis with adjustment for overdispersion (p = 0.24) or a negative binomial analysis which contains a term that accounts for the degree of overdispersion (p = 0.23). Table 4 shows that the effect of excluding data from patients after they prematurely discontinue study medications is to narrow the confidence intervals around the estimate. Thus, p values are systematically smaller and hence “more significant” when patients are prematurely excluded from the analysis. As shown in table 4, p values can vary from 0.24 down to 0.03 depending on which statistical approach is used. Thus, results can easily cross the traditional threshold of statistical significance (p = 0.05) depending on how the data are analysed.
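To show why an overdispersion correction changes the p value, the sketch below applies a simple Wald test to the log rate ratio. It is a minimal approximation, not the regression analysis reported in table 4: it uses only the arm-level counts quoted earlier (222 events in 138 patient-years and 188 events in 137 patient-years), and the dispersion value of 2.0 is purely hypothetical rather than the trial's estimate.

```python
import math


def rate_ratio_wald(events_a, years_a, events_b, years_b, dispersion=1.0):
    """Wald test for the rate ratio of arm B relative to arm A.

    With dispersion=1.0 the counts are treated as Poisson. A dispersion
    above 1 scales the variance of the log rate ratio, as in a
    quasi-Poisson (overdispersion-adjusted) analysis.
    """
    log_rr = math.log((events_b / years_b) / (events_a / years_a))
    se = math.sqrt(dispersion * (1 / events_a + 1 / events_b))
    z = log_rr / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
    ci = (math.exp(log_rr - 1.96 * se), math.exp(log_rr + 1.96 * se))
    return math.exp(log_rr), ci, p_value


# Arm-level counts from the text; dispersion=2.0 is a HYPOTHETICAL value.
for phi in (1.0, 2.0):
    rr, (low, high), p = rate_ratio_wald(222, 138, 188, 137, dispersion=phi)
    print(f"dispersion {phi}: RR {rr:.2f}, 95% CI {low:.2f} to {high:.2f}, p = {p:.2f}")
```

The point estimate is the same in both runs; only the confidence interval and the p value change. Ignoring overdispersion therefore cannot bias the rate ratio itself, but it can make the result look more precise than it is.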
Our systematic review has shown that clinical trials published between 2000 and 2006 used widely varying definitions of COPD exacerbations. Even for those studies that uniformly used an event-based definition, the criteria for defining an exacerbation were highly variable. Thus, some studies included and counted “mild exacerbations” which were defined as days requiring increased use of an as-needed reliever medication, whereas others only counted exacerbations that were treated with systemic corticosteroids or antibiotics or hospitalisation. Without a consistent and standardised definition of an outcome, it is impossible to compare one trial with another—or even one medication against another—to determine the relative efficacy of different therapies in reducing the rate of COPD exacerbations.
Our review uncovered other methodological inconsistencies in how trials count and analyse COPD exacerbations. Exclusion of patients from the study analysis after they prematurely stopped study medications was common and occurred in 64% of the reviewed trials. Premature exclusion of patients may be inappropriate since it precludes an effectiveness analysis of the medication in question—ie, how the drug will act in “real-world” circumstances when some patients are non-compliant. In addition, early exclusion of patients can introduce bias because the factors which determined whether a patient might be excluded may often also be related to the outcome. For instance, some patients may prematurely discontinue a study medication because they are doing poorly and about to have an exacerbation in the near future. Premature exclusion of these patients after they stop study drugs introduces bias since the subsequent exacerbation is not counted and attributed to the study drug in question. In order to be consistent with CONSORT guidelines,47 patients who prematurely stop a study medication should not be considered “drop-outs” unless they absolutely refuse permission for the study to continue to follow them. Ideally, these patients should be retained in the study for its duration and any subsequent COPD exacerbations should be attributed to their randomised group.
It should be acknowledged that the intention-to-treat approach to analysis described above is correct, but it is also conservative. In some clinical trials of COPD, proportionately more patients randomised to the placebo limb have exited the study early. Many of these patients subsequently used active open-label therapies for the duration of the study. If such therapies are effective at reducing exacerbations, then an intention-to-treat analysis might reduce the possibility of a difference being found between the placebo group and the active arms in these instances. Since it is impossible to know a priori the direction and magnitude of the effect of patient non-compliance, and because of the potential biases involved in premature exclusion of patients, it is preferable for investigators to report two separate analyses: a true intention-to-treat effectiveness analysis as well as a secondary efficacy analysis that excludes patients when they stop study medications. If the results of both analyses are reported, then readers can judge for themselves the effectiveness of the intervention in question.
Our review of the published literature revealed other methodological issues confounding contemporary COPD clinical trials. None of the trials reported whether or not they determined the independence of individual exacerbation events. The problem is that patients may present to healthcare providers recurrently with symptoms of an acute exacerbation over short periods of time. For instance, a patient may present with symptoms of cough, dyspnoea and sputum to a physician on 1 January and be given an antibiotic, then present again on 7 January for identical symptoms and be given a second antibiotic, then present again on 14 January with the same symptoms and be treated with oral steroids. The question is: are these truly independent events or are these latter two events simply relapses or continuations of the original exacerbation? The negative binomial or Poisson distribution assumes that individual events will be independent of prior events. This assumption can be satisfied if it is clear that the patient had reverted to his/her baseline between events.
The Canadian Optimal Therapy of COPD Trial considered patients to have experienced a new COPD exacerbation if they had been off oral steroids and antibiotics for at least 14 days following their previous exacerbation.11 Other options to determine independence could include an assessment of patient symptoms using symptom diaries with a reversion of symptoms to baseline before a new event can be said to occur.48 49
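A rule of this kind is straightforward to apply programmatically. The sketch below is one possible reading of the 14-day criterion, applied to the hypothetical patient described earlier who presents on 1, 7 and 14 January. The treatment end dates, the year, the later March episode and the handling of the exact 14-day boundary are all our assumptions for illustration.

```python
from datetime import date, timedelta

WASHOUT = timedelta(days=14)  # time off steroids and antibiotics required by the rule


def independent_events(episodes):
    """Return the episodes that count as new exacerbations.

    episodes: list of (treatment_start, treatment_end) date pairs.
    An episode is new only if it starts at least WASHOUT after the end
    of all earlier treatment; otherwise it is treated as a relapse or
    continuation of the previous exacerbation.
    """
    new_events = []
    last_end = None
    for start, end in sorted(episodes):
        if last_end is None or start - last_end >= WASHOUT:
            new_events.append((start, end))
        # Relapses still extend the period during which the patient is on treatment.
        last_end = end if last_end is None else max(last_end, end)
    return new_events


# Hypothetical treatment courses for the patient in the example above.
episodes = [
    (date(2007, 1, 1), date(2007, 1, 7)),    # first antibiotic
    (date(2007, 1, 7), date(2007, 1, 13)),   # second antibiotic
    (date(2007, 1, 14), date(2007, 1, 18)),  # oral steroids
    (date(2007, 3, 1), date(2007, 3, 7)),    # later, separate episode
]

events = independent_events(episodes)
print(f"{len(episodes)} presentations, {len(events)} independent exacerbations")
```

Under these assumptions the three January presentations count as a single exacerbation and the March episode as a second one, so four healthcare contacts yield two events for analysis.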
None of the 22 trials included in the systematic review employed blinded adjudication of exacerbation events. This is problematic, since trials are thus reliant on the individual investigator to assign an outcome. Problems arise with diagnostic exchange; for example, should a respiratory event be classified as a COPD exacerbation or an upper respiratory tract infection or pneumonia? Adjudication committees can review the assembled clinical and radiographic data to determine if adverse events such as pneumonia had occurred. Adjudicated acute exacerbations of COPD can also be potentially further validated against daily diary card-defined exacerbations. Use of a blinded adjudication committee to review assembled data to ensure that the event met the pre-stated study definition of a COPD exacerbation can thus help avoid mistakes, inconsistencies and diagnostic exchange.
An analysis of the 13 trials that reported the mean number of exacerbations/patient-year revealed that only 4 used analyses which weighted each patient’s individual exacerbation rate by their follow-up time. Suissa has shown in a previous “simulated trial” that using unweighted analyses underestimates the rate ratio and thus overestimates the apparent effectiveness of the treatment at preventing exacerbations.9 Our analysis of the Canadian Optimal Therapy of COPD Trial used a real-life clinical trial dataset and confirmed Suissa’s observations.
Clinicians who treat COPD are aware that there is considerable between-subject variability in COPD exacerbations; two patients with the same degree of lung dysfunction may have markedly different rates of exacerbation. The Poisson regression technique assumes that the variance of the number of exacerbations is equal to the mean,10 but in COPD this assumption rarely holds because the variance usually exceeds the mean (overdispersion). Only 2 of 13 trials published since 2000 correctly accounted for between-subject variation by incorporating an overdispersion parameter into their analysis of the mean number of exacerbations/patient-year. Unless between-subject variability is accounted for by incorporating an overdispersion correction into the Poisson distribution or by using a negative binomial model, then statistical significance may be assumed inappropriately.
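One simple way to check for overdispersion is the Pearson dispersion statistic, which compares the observed spread of exacerbation counts with the spread a Poisson model would predict. The sketch below computes it for a single group with a common rate; the counts are hypothetical, and this is a diagnostic illustration rather than the regression model used in any of the reviewed trials.

```python
def pearson_dispersion(counts, years):
    """Pearson chi-square divided by (n - 1) for a single common event rate.

    A value near 1 is consistent with a Poisson model; a value well above 1
    suggests overdispersion (more between-patient variation than Poisson allows).
    """
    rate = sum(counts) / sum(years)  # time-weighted group rate
    chi_square = sum(
        (observed - rate * t) ** 2 / (rate * t)
        for observed, t in zip(counts, years)
    )
    return chi_square / (len(counts) - 1)


# Hypothetical exacerbation counts for 10 patients, each followed for 1 year.
counts = [0, 0, 0, 1, 1, 1, 2, 2, 3, 6]
years = [1.0] * len(counts)

phi = pearson_dispersion(counts, years)
print(f"estimated dispersion: {phi:.2f}")
```

Here a single patient with six exacerbations is enough to push the statistic to roughly 2, meaning that Poisson-based standard errors would be too small by a factor of about the square root of that value.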
The TORCH study was published in early 2007 after completion of our review. This study did use weighted statistical analyses and accounted for between-subject variation when analysing COPD exacerbation rates.50 However, COPD exacerbations were not adjudicated in the TORCH trial, and those COPD exacerbations that occurred after patients prematurely discontinued their study medications were not included in the analysis of exacerbation outcomes.
Our systematic review has revealed inconsistencies in how exacerbation rates are reported. Seven of the 22 trials did not report the proportion of patients who experienced at least one exacerbation over the trial period; rather, these studies only reported the mean number of exacerbations per patient-year. Both methods of reporting COPD exacerbations have their merits and disadvantages. The mean number of exacerbations/patient-year captures patients with multiple exacerbations which may be clinically and economically important. However, measurement of the mean number of exacerbations/patient-year can be heavily influenced by a small minority of patients who experience multiple exacerbation events, and it cannot yield a number-needed-to-treat since this can only be derived from the absolute difference in the proportion of patients who experience at least one exacerbation.51 52 Conversely, the proportion of patients who experience at least one exacerbation is not always an ideal measurement since it is heavily influenced by the duration of the trial; for instance, if the study continues for an extended time period, then most/all patients will eventually experience an exacerbation.
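Both points can be illustrated numerically. In the sketch below the patient counts used for the number-needed-to-treat are hypothetical, and the duration calculation assumes that every patient has the same constant exacerbation rate, which is a deliberate simplification given the between-subject variability discussed above. The rate of 1.61 exacerbations/patient-year is the placebo-arm figure quoted earlier.

```python
import math
from fractions import Fraction


def number_needed_to_treat(control_events, control_n, treated_events, treated_n):
    """NNT = 1 / absolute difference in the proportion with at least one event."""
    risk_difference = Fraction(control_events, control_n) - Fraction(treated_events, treated_n)
    if risk_difference <= 0:
        raise ValueError("no absolute risk reduction; NNT is undefined")
    return math.ceil(1 / risk_difference)  # round up to whole patients


def proportion_with_event(rate_per_year, years):
    """Expected proportion with at least one event, ASSUMING a common constant rate."""
    return 1 - math.exp(-rate_per_year * years)


# Hypothetical trial: 60/100 control and 50/100 treated patients exacerbate.
nnt = number_needed_to_treat(60, 100, 50, 100)
print(f"number needed to treat: {nnt}")

for duration in (0.5, 1, 3):
    share = proportion_with_event(1.61, duration)
    print(f"{duration} year(s) of follow-up: {share:.0%} with at least one exacerbation")
```

Under the constant-rate assumption the proportion with at least one exacerbation approaches 100% as follow-up lengthens, which leaves little room for any treatment to show a difference on that measure in a long trial.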
We would suggest that trials be designed, and sample sizes calculated, using the mean number of exacerbations per patient-year as the primary outcome. However, it is also important for studies to report the proportion of patients who experienced at least one exacerbation over the trial period as a secondary outcome in order to determine both whether treatment will prevent an individual patient from having an exacerbation and also whether treatment may prevent some patients from having multiple exacerbations.
An analysis of actual clinical trial data from the Canadian Optimal Therapy of COPD Trial has shown that different methods for counting and analysing COPD exacerbations can result in major differences in the magnitude of the treatment effect. Results can go from statistically non-significant to statistically significant depending on how exacerbation events are counted, analysed and reported.
We would suggest that clinical trials adopt a standard consensus definition for COPD exacerbations and that studies should strive to incorporate parameters in their definition that assure independence of events and use blinded adjudication committees to ensure that suspected COPD exacerbations meet study definitions. Additionally, it would be ideal if trials could use intention-to-treat approaches to discourage premature exclusion of patients from the study analysis after they stop study medications. Correct statistical analysis using weighted mean rates and employing statistical corrections for between-patient variability should be obligatory. Use of standardised measures for defining, counting and analysing COPD exacerbations should help ensure comparability of clinical trial results.
Funding: Funded by the Canadian Institutes of Health Research (Grant no. MCT-63139) and the Ontario Thoracic Society.
Competing interests: None.