The global evolution of airway clearance techniques (ACTs) for cystic fibrosis (CF) and other respiratory disorders, with corresponding research, spans more than four decades. Over this era, different ACTs have been invented, modified, retained or rejected. Some involve airway oscillation, some are independently performed and others require electricity or physical assistance. Some have strong geographical dominance, often more closely related to the origin of the technique and the strength of marketing than to best evidence.
There has also been dramatic progress in the design, quality and rigour of research conducted to identify best practice. The best evidence from a plethora of early, underpowered, short-term studies (1–14 days in length), typically cross-over design, has been synthesised in five Cochrane reviews related to ACTs for CF, published between 2000 and 2011.1–6 All conclude that there is currently insufficient evidence to suggest superiority of any one technique. The calls for properly conducted, long-term, randomised controlled trials (RCTs) with standardised and meaningful outcome measures have intensified.
This study by McIlwaine et al answers the call elegantly.7 It is a well-designed, properly funded, long-term RCT. Results suggest that patients on high-frequency chest wall oscillation (HFCWO) therapy had more exacerbations than those on positive expiratory pressure (PEP) mask therapy, and they had them sooner. Given the substantial cost and marketing behind HFCWO (over 200 times more expensive than PEP mask therapy), these results give clinicians important information for making cost-effective and best practice decisions about airway clearance for their patients.7 In a landscape of ACT comparison literature, which invariably concludes that one technique or device is ‘equivalent’ to another, this is a noteworthy finding.1–6
Apart from confirming the difficulty of reaching recruitment targets, however, this study also illuminates two other challenges facing airway clearance research, even in near-perfect conditions: the use of forced expiratory volume in one second (FEV1) as a gold-standard outcome measure, and participant dropouts related to preference.7 Both contributed to the derailment of two recent long-term ACT studies and will jeopardise similar future studies unless they are carefully examined and addressed.8,9
The primary outcome measure in McIlwaine's study provides a relevant and thoughtful antidote (long overdue) to FEV1 in airway clearance studies. The ‘number of pulmonary exacerbations, with symptoms lasting longer than 3 days, requiring the use of an antibiotic’ is meaningful to patients, clinically important and potentially expensive.7 McIlwaine's is the latest of the four recent long-term ACT RCTs, which have exposed the shortcomings of FEV1. Over the last two decades, with overall improvements in CF care, the magnitude and trajectory of annual decline in FEV1 have become inconveniently unpredictable and notoriously difficult to interpret. The anticipated annual change in FEV1 has crept towards zero from a decline exceeding 2%. It is now very difficult to calculate statistical power accurately on the basis of FEV1 or define a reference against which a clinically important change can be established.
Three long-term ACT studies published in 2010 were powered on the basis of a predicted decline in FEV1 between −2% and −2.3% per annum, which failed to materialise, undermining analysis. Authors in all three papers argued that FEV1, despite its wide recognition as a gold-standard measure, was not a sensitive clinical trial or healthcare indicator.8–10
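The effect of a shrinking FEV1 decline on trial size can be illustrated with a rough sketch. It uses the standard normal-approximation formula for comparing two means and assumes that the largest between-group difference a trial can hope to detect is bounded by the anticipated annual decline. The standard deviation of 10 percentage points, the function name and the candidate differences are illustrative assumptions, not values taken from any of the cited studies.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Participants needed per arm to detect a mean difference `delta`
    (two-sided test, two parallel groups, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# Hypothetical SD of annual change in FEV1 (% predicted); an assumption only.
ASSUMED_SD = 10.0

# Detectable differences (percentage points per year) shrinking towards zero.
for delta in (2.0, 1.0, 0.5):
    print(f"difference {delta:>4}: {n_per_group(delta, ASSUMED_SD)} per arm")
```

Because the required sample size scales with the inverse square of the detectable difference, halving that difference roughly quadruples the number of participants needed. A trial powered on a decline that does not materialise is therefore underpowered by a wide margin.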
If the current study had, like so many others, used FEV1 as a primary outcome, its findings would have joined the growing collection of ‘one ACT is as good as the other’ literature. Average FEV1 was within the normal range at baseline, and both groups improved over 12 months, with no difference between them. Instead, results were able to identify important differences in the number of, and time to, pulmonary exacerbations, with meaningful consequences for patients and healthcare providers. These overall differences in health profiles during the year were not detected or reflected in the FEV1 data.7
The second challenge relates to patient preference and dropouts. In McIlwaine's study, 16 participants dropped out at, or just before, randomisation, with a further three dropping out during the trial (18% overall). Most dropouts were not related to any experiential problems with either technique, as the large majority dropped out before they had started using either technique. Most of these participants were quite transparent about the fact that they were worried about, or preferred allocation to, a specific treatment arm.7
Similarly, Pryor's study (2010), comparing five different ACTs over 1 year, suffered from substantial dropouts (29% of 75 participants), with more than half of these patients admitting that they did not like the regimen to which they had been randomised.10 This also resonates with Sontag et al.'s three-arm parallel RCT, in which 34% of the 166 enrolled participants dropped out, a third of these at or around randomisation and the remainder during the trial.9 The number of dropouts in the different groups was so disproportionate (35, 16 and 5) that the analysis was unfeasible and the study was terminated prematurely. The pattern of early and late dropouts strongly suggested preference for an alternative treatment, as there were no other variables between the groups. The authors reported that ‘dissatisfaction with the therapy’ was an independent predictor of withdrawing.9
The dropout rate appeared to increase with age, suggesting that preferences became more entrenched with maturity, along with the confidence to withdraw from a treatment that was not preferred. The HFCWO treatment, with the lowest dropout rate, was the one most favoured by patients in terms of perceived efficacy. As in this study, however, it fared least well, showing a significantly faster decline in FEF25–75 (forced expiratory flow occurring in the middle 50% of exhaled volume) compared with the other techniques.9 Unsurprisingly, preference may improve compliance, but does not guarantee efficacy.
The previous cross-over trial by McIlwaine et al was also terminated prematurely following an intractable problem with dropouts that manifested in an unforeseen way.8 More than half of the participants (58%) in one group refused to cross over to the other arm after 1 year of study. Those who were willing to cross over began to incorporate the first technique into their second-year treatments, invalidating progression of the study. Once again, the issue of preference was so important that it destabilised and unhinged a well-designed and well-conducted research study.
This pattern of dropouts in all four recent, well-designed, international long-term RCTs is too important to ignore. Dropouts, largely as a result of strong preferences by patients for a particular ACT, entirely derailed two of these studies and compromised the remaining two.8–10
This issue is gaining important recognition in specific areas of clinical research.11–15 RCTs are arguably the most rigorous way of deciding if one intervention is better than the other. However, the success of the RCT is underpinned by two key principles, which are essentially incompatible with physiotherapy ACT research questions. These are blinding and random allocation.
Physiotherapy treatments are notoriously difficult to ‘blind’ by concealment or placebo. They are physical, may involve hands or devices, special breathing, hard exercise, vibrations or oscillations, and they are certainly time-consuming, effortful and demanding. An inability to conceal treatment from patients undermines a fundamental strength of the RCT design and opens the door to an uncontrollable post-randomisation preference bias.
In addition, while random allocation almost entirely eliminates selection bias and other uncontrolled variables in treatment assignment, it is really only palatable for patients if there is genuine equipoise about the relative benefits, AND they do not know or care what they are getting OR if they are only required to endure the relative burden of treatment for a short time.11 Neither of the latter conditions is met in long-term ACT research.
Patients with CF will quite naturally develop and modify their strong preferences about ACTs and how they participate in managing their chronic lung disease. These preferences matter a great deal because people's own beliefs about treatments are known to be the most important determinant of whether and how they are taken.16 In practical terms, these effects can manifest in non-compliance or dropouts.
Seasoned CF trial participants know that continued participation in any trial cannot be enforced. They may sign up because they have an interest in a particular treatment, but drop out immediately if they do not get the preferred arm. If this happens for one arm more than the other, as in the Sontag study, the bias cannot be corrected.9 If this happens halfway through a cross-over study, as happened in the trial by McIlwaine et al,8 the trial is equally doomed.
It is now recognised that the problematic effects of strong preference are most apparent in (1) unblinded trials in which patients are aware of the treatment they are receiving and (2) trials in which participants are required to sustain an effortful and demanding role for an extended period.15,17 There are few circumstances in which these two arguments against a standard RCT apply more strongly than in trials of ACTs for CF. These long-term clinical trials need to be considered and conducted differently.11 The Rucker design, first described in 1989, and later variations on it offer solutions for identifying, quantifying and managing the separate effects of preference, selection and treatment.18–21
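To make the idea of a preference-aware design concrete, the following is a minimal sketch of the allocation step in a two-stage design of the Rucker type, as it is commonly described: participants are first randomised to a 'choice' group or a 'random' group; in the choice group, those with a stated preference receive it and the rest are randomised, while in the random group everyone is randomised regardless of preference. The treatment labels, class and function names are hypothetical, and the sketch covers allocation only, not the analysis described in the cited papers.

```python
import random
from dataclasses import dataclass
from typing import List, Optional

TREATMENTS = ("PEP", "HFCWO")  # illustrative treatment labels


@dataclass
class Allocation:
    participant_id: int
    preference: Optional[str]  # None means no strong preference
    stage_one_group: str       # "choice" or "random"
    treatment: str


def allocate_two_stage(preferences: List[Optional[str]], seed: int = 0) -> List[Allocation]:
    """Allocate participants using a two-stage, preference-aware scheme."""
    rng = random.Random(seed)  # seeded so the allocation is reproducible
    allocations = []
    for pid, pref in enumerate(preferences):
        # Stage 1: randomise every participant to the choice or random group.
        group = rng.choice(("choice", "random"))
        # Stage 2: honour a stated preference only in the choice group;
        # everyone else is randomised between the treatments.
        if group == "choice" and pref is not None:
            treatment = pref
        else:
            treatment = rng.choice(TREATMENTS)
        allocations.append(Allocation(pid, pref, group, treatment))
    return allocations


example = allocate_two_stage(["PEP", None, "HFCWO", "HFCWO", None, "PEP"], seed=1)
for a in example:
    print(a)
```

Because preferences are recorded for everyone but honoured only in one group, outcomes can be compared between participants who chose a treatment and participants with the same preference who were randomised, which is what allows preference and selection effects to be separated from the treatment effect.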
This is a rare area of clinical research in which the ‘perfect storm’ of strong preference, lack of blinding and the requirement for effortful and demanding participation over long intervals will continue to threaten any serious effort to find the best ACT for patients with CF. Further trials derailed in this way would be a spectacular waste of time and money and a terrible disservice to patients with CF. In future, two things appear certain: preference needs to be accounted for in long-term physiotherapy airway clearance studies, and FEV1 can no longer be the gold standard outcome measure in these trials.
Competing interests None.
Provenance and peer review Commissioned; internally peer reviewed.