Health status measurement in chronic obstructive pulmonary disease
- Department of Respiratory Medicine, Division of Physiological Medicine, St George's Hospital Medical School, London SW17 0RE, UK
- Professor P W Jones
- Received 2 December 2000
- Revision requested 7 March 2001
- Revised 12 July 2001
- Accepted 12 July 2001
Health status measurement is a common feature of studies in chronic obstructive pulmonary disease (COPD). This review assesses recent evidence for the validity of these measurements and their role as measures of the overall impact of the disease on the patient's daily life and wellbeing. It reviews the most widely used COPD specific questionnaires and examines the contribution that they make to an assessment of the overall effect of treatment. Finally, it addresses the question of how symptomatic benefit may be assessed in individual patients in routine practice.
Health status measurement (or “health related quality of life” measurement) has become a central feature of studies in chronic obstructive pulmonary disease (COPD). One driving force for this is the recognition that treatments for this condition, other than smoking cessation, are largely symptomatic. Another factor is the requirement in Europe that clinical trials of new drugs for COPD should incorporate a symptomatic measure, such as a health status questionnaire, as a co-primary end point along with a measure such as the forced expiratory volume in 1 second (FEV1). This review aims to assemble evidence for the validity of health status measurements, highlight new insights that they are providing about COPD and its treatment, and discuss implications for routine patient care. The review will concentrate solely on disease specific health status questionnaires. Generic questionnaires do have a place in COPD studies but are relatively insensitive to worthwhile treatment effects in this disease,1 2 although the generic SF-36 has shown responsiveness in pulmonary rehabilitation.3
Purpose of health status measurement in COPD
Health status measurement is a means of quantifying, in a standardised and objective manner, the impact of disease on patients' daily life, health, and wellbeing. It is a process that is essentially similar to a highly structured clinical history, although the end product is not a clinical impression but an objective measurement that can be used for scientific purposes. It is no more “soft” or “touchy-feely” than any well taken clinical history. Health status questionnaires usually address emotional and psychological effects of the illness as well as the physical, but the bulk of their items usually concern practical aspects of disturbance to daily life. Some questionnaires are even called functional performance questionnaires.4 Regardless of terminology and precise content, the purpose of these questionnaires is to address a wide range of effects of the disease and, if possible, to summarise these in one overall score.
COPD: a multisystem disorder
What does health status measurement have to offer in COPD, when the FEV1 has served well for four decades? In addressing this question it should be appreciated that, while COPD has its primary effect in the lungs, structural and functional changes also take place in other organs. Even in the lungs there are a number of different pathophysiological processes, each of which may be present to a varying degree. Furthermore, there are important effects of COPD such as fatigue that have no immediate relationship to expiratory airflow limitation. The following paragraphs illustrate the multi-causal nature of COPD symptoms, which justifies the development of health status scales that sum the effects of these multiple processes.
Breathlessness is the characteristic symptom of COPD. It has a complex aetiology that is linked to the process of breathing.5 With increasing lung volume a greater respiratory effort is needed to maintain tidal breathing. There is a curvilinear relationship between the FEV1 and residual volume, but the two correlate only moderately well.6 Since breathlessness is largely associated with inspiration, interest is turning to measurements made during the inspiratory portion of the respiratory cycle. For example, bronchodilator induced reductions in breathlessness at rest have been shown to correlate better with changes in forced inspiratory flow than with changes in FEV1.7 Static lung volumes are increased in COPD,6 but there is also a further rise in functional residual capacity at exercise onset, otherwise known as dynamic hyperinflation. The reduction in breathlessness during exercise that occurs with bronchodilators has been shown to correlate better with improvement in inspiratory capacity during exercise (due to a reduction in dynamic hyperinflation) than improvement in FEV1.8 9 In addition to disturbances of lung mechanics, a further cause of dyspnoea in COPD is the increased respiratory drive that occurs in the presence of exercise induced arterial desaturation. Finally, it should be appreciated that, in addition to the multiplicity of causal mechanisms, there are also large interindividual differences in the perception of breathlessness for a given level of ventilation, even in healthy people with no lung disease.10 As a result, for any given level of work, breathlessness will be the result of an interaction between a number of different mechanisms and patient specific characteristics that are unrelated to the underlying disease.
FATIGUE AND MUSCLE WASTING
Leg fatigue has been shown to be as important as breathlessness in limiting peak exercise performance.11 In one study patients rated it to be a more important problem than breathlessness, although not unless asked directly about it.12 Muscle weakness is a feature of COPD, particularly of the legs13 but also of the arms.14 This may not be due entirely to disuse atrophy since nutritional depletion occurs15 and there is evidence of circulating inflammatory cytokines in COPD.16
SLEEP AND MOOD
COPD causes disturbances other than impaired exercise tolerance and decreased mobility. Disturbed sleep is a common feature. A recent survey of patients with COPD in the Breathe Easy Club, carried out by the British Lung Foundation, found that half of the respondents had regular sleep disturbance.17 Disorders of mood state also occur.18 This may be confined to subgroups within a COPD population19 since depression scores are not uniformly increased, even in patients with moderate to severe COPD.20 While mood state may be impaired in a minority of patients, a loss of sense of control or “mastery” over their condition is a common feature of patients with COPD—so much so that it forms an important part of one well established health status questionnaire, the Chronic Respiratory Questionnaire.12 21
The frequency of reported COPD exacerbations increases with disease severity.22 In patients with moderate to severe COPD, prospective data collection using diary cards revealed that patients report only 50% of the exacerbations that they experience and have a median exacerbation rate of 3 per year with a range of 1–8.23 Lung function can take several weeks to recover,24 so exacerbation frequency is clearly an important factor.
Need for an overall summary of the effects of COPD
It is clear that there are multiple consequences of COPD. Even in the lungs there is no single or composite summary measure of impaired lung function. In many circumstances it would be valuable to have an estimate of the overall effect of the disease and the overall impact of treatment. There is a need for a measure that can aggregate into a single score the summed effect of the multiple pathophysiological processes that involve different organs and systems. This is the role of health status measurement—to provide a comprehensive estimate of the primary and secondary effects of the disease.
Health status questionnaires
There are a number of instruments that may be described as COPD specific health status questionnaires including the Chronic Respiratory Questionnaire (CRQ),21 the St George's Respiratory Questionnaire (SGRQ) which is for both asthma and COPD,25 the Breathing Problems Questionnaire (BPQ),26 27 and the QOL-RIQ.28 These questionnaires tend to have a degree of complexity that makes them unsuitable for routine use, which led to the development of the AQ20, a 20-item instrument that takes 2–3 minutes to complete and score.29 This questionnaire is suitable for both asthma29 and COPD.30 There are also two function limitation questionnaires that are similar in many respects to health status instruments: the modified Pulmonary Functional Status and Dyspnea Questionnaire (PFSDQ-M)4 and the Pulmonary Functional Status Scale (PFSS).31 These two questionnaires are in wide use in pulmonary rehabilitation programmes in the USA. Activity of daily living scales are similar to function limitation questionnaires but address more severe levels of disability. The Nottingham Extended Activity of Daily Living Scale has been shown to have some validity in COPD32 but it was not developed for this disease. A COPD specific activity of daily living scale has recently been described and validated.33
Health status questionnaires are made up of items selected because they are relevant to patients with COPD in terms of frequency and importance. Their content is generally similar, but they differ in terms of their underlying structure. There are relatively few direct comparisons between them. A recent comparison of the CRQ and SGRQ did not clearly favour one over the other.34 Another study reported unfavourable completion rates with the SGRQ compared with the CRQ,35 but this was because the SGRQ was not administered according to the developer's guidelines (patients were often given it to take home). The CRQ, which must be interviewer administered, was completed rather better. Two rehabilitation studies have compared the responsiveness of the CRQ and SGRQ directly. One found that the changes in CRQ score were slightly higher in relative terms than those for the SGRQ,36 whereas another larger study showed that the SGRQ was more responsive.3 Cross sectional correlations between SGRQ and CRQ scores have been reported to be greater than r=0.7.37 However, a comparison between CRQ Dyspnea and SGRQ Activity scores obtained in a recent rehabilitation study36 found no significant correlation between these related scales (r=0.15, n=123; Bestall and Jones, unpublished). This may reflect the fact that SGRQ items are entirely standardised, but in the CRQ Dyspnea scale the patients choose the activities that are important to them, so this scale may not be suitable for cross sectional comparisons.
Concerning the shorter questionnaires, a UK study concluded that the BPQ provided more valid assessments of health status than the CRQ,38 although a Japanese group reached the opposite conclusion—namely, that the CRQ (and SGRQ) discriminated between patients with different degrees of severity better than the BPQ.37 In terms of responsiveness, there is one report that the BPQ was not as sensitive as the CRQ in detecting change following a pulmonary rehabilitation programme.39 The other short questionnaire (the AQ20) appeared to discriminate between patients as well as the CRQ and SGRQ and also to be responsive to changes following pulmonary rehabilitation.30
Validation of health status questionnaires
This topic is complicated by the use of a number of terms to describe different aspects of the validation process. In essence, its purpose is to test whether a questionnaire measures what its authors claim. There is no single step or test that will validate a questionnaire; indeed, it is quite the reverse. Many different hypotheses have to be tested to build up a picture that will allow an overall judgement as to whether the questionnaire is behaving in the manner expected of an instrument designed to measure impaired health.
Questionnaires are often described as having two broad types of property: discriminative—that is, the ability to distinguish between different levels of impaired health between patients; and evaluative—that is, the ability to detect changes with disease progression or treatment. For a number of practical reasons there are more published data concerning discriminative properties than evaluative properties. Of the most widely documented questionnaires, the CRQ was designed specifically as an evaluative instrument whereas the SGRQ has both discriminative and evaluative properties. Tests of the cross sectional validity of the CRQ against exercise capacity (an important determinant of health status) have produced inconsistent results,37 40 so evidence for the validity of health status measurement using the SGRQ is summarised in table 1. Correlations between changes in health status score and changes in other measures of disease activity are weaker than the corresponding cross sectional comparisons because the range of scores is smaller, but the pattern of between-patient and within-patient correlations is still very similar.25 There is also clear evidence for the validity of within-patient changes in health status measured using the CRQ.2
DISEASE FACTORS ASSOCIATED WITH IMPAIRED HEALTH
The studies summarised in table 1 show that health status scores are significantly associated with abnormalities in a wide range of markers of impaired health. It would be inappropriate to expect high correlations with any specific aspect of COPD, since the questionnaires are designed to address a wide range of different effects of the disease. However, some patterns do emerge from the data. Impaired exercise performance and functional capacity (as measured by the MRC Dyspnoea Scale, for example) are quite strongly associated with poorer health status. The presence of daily symptoms and a high exacerbation frequency are other important factors. Emotional factors have been shown to be important and common in patients with COPD,12 so it is not surprising that anxiety and depression are quite consistent correlates of impaired health. A number of factors may have interactive effects on health status. For example, COPD patients with a low mean body mass had much worse SGRQ scores than those in whom mean body mass was normal.41 This association appeared to be attributable to an increase in breathlessness; that is, patients with a low mean body mass have worse health because of higher levels of dyspnoea. This conclusion has been challenged recently since, in another study, impaired health status in patients with low fat free mass could not be explained solely in terms of breathlessness.42
There is a degree of intercorrelation between factors that determine health status impairment. Within the limits of what is measurable in the same population of patients, multivariate analysis has shown that 50% of the variance in SGRQ Total score could be attributed to a combination of cough, wheeze, MRC dyspnoea grade, 6 minute walking distance, and anxiety score (each as statistically significant covariates).25 Thus, it appears that health status questionnaires can bring together a range of effects of COPD into one summary measure of the overall impact of the disease, which is their primary purpose.
FEV1 AND HEALTH STATUS
The reported correlations between FEV1 and health status in COPD are never very high (table 1). Quite a wide range of r values is found in the literature, but this may just reflect sampling factors in data derived from relatively small study populations. Figure 1 contains what is perhaps the definitive description of this relationship, being obtained from nearly 800 patients measured at baseline in the ISOLDE study.22 43 Lower FEV1 is associated with worse health, but the correlation is weak. At a population level, a clearer association between FEV1 and health status may be seen when mean data from different patient populations are plotted against each other.44 45 Returning to individual patients, the most important inference to be drawn from this weak relationship is that some patients may have very poor health despite mild spirometric impairment (although, on the other hand, there are also patients with severe airways obstruction who appear to have little disturbance to their daily lives).
While the correlation between health status and FEV1 is low, this is not entirely surprising in view of the effects of COPD on health that are not mediated through expiratory flow limitation. For example, health status tends to correlate better with exercise performance than with FEV1 (table 1). It will be interesting to see whether peak inspiratory flow or inspiratory capacity provide better spirometric correlates of health impairment than the FEV1, as appears to be the case with dyspnoea during exercise.8 9
What can be learned from health status measurement?
Health status questionnaires have found their widest application in clinical trials where they are used to provide a measure of the overall symptomatic benefit from the treatment, together with an index of whether the effect was worthwhile. There is no universally agreed definition of worthwhile benefit in chronic disease, but a common view is that, if a patient can detect a definite reduction in symptoms or the impact of the disease on their daily life, that is clinically significant. The issue of clinically noticeable differences and thresholds for clinical significance is a complex topic that is discussed in depth elsewhere.46-49 Suffice it to say here that the suggested thresholds of 0.5 per domain for the CRQ50 and 4 units for the SGRQ51 appear to be reliable.
A good example of the contribution that health status measurement can make to the evaluation of a treatment is contained in a meta-analysis of the effect of pulmonary rehabilitation.52 Health status data from that review are shown in fig 2. Across 4–6 trials, all domains of the CRQ improved by a statistically significant amount. The mean improvements were greater than the minimum clinically important difference (MCID). Even more strikingly, for the Dyspnoea and Mastery components of the CRQ, the lower limit of the 95% confidence interval (CI) of the treatment effect lay above the MCID; that is, pulmonary rehabilitation produced an effect that was significantly greater than that needed for a minimum worthwhile benefit. Few treatments can claim such a result in any disease. The same magnitude of effect from pulmonary rehabilitation (one that was significantly greater than the minimum clinically significant change) has now been reported from a single large study in which such benefits were seen with both the CRQ and the SGRQ.3
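The distinction drawn above, between a mean improvement that exceeds the MCID and one whose lower confidence limit also exceeds it, can be made explicit in a few lines of Python. This is an illustrative sketch only: the function name, the category labels, and the numbers in the usage note are hypothetical and are not taken from the meta-analysis. The only value drawn from the text is the suggested CRQ threshold of 0.5 per domain.

```python
CRQ_MCID = 0.5  # suggested threshold per CRQ domain, as quoted in the text


def classify_effect(mean_change, ci_lower, ci_upper, mcid=CRQ_MCID):
    """Classify a treatment effect on a scale where positive means improvement.

    mean_change, ci_lower and ci_upper are the point estimate and the limits
    of its 95% confidence interval. The labels returned are hypothetical
    names chosen for this sketch.
    """
    if not ci_lower <= mean_change <= ci_upper:
        raise ValueError("point estimate must lie inside its confidence interval")
    if ci_lower > mcid:
        # The whole interval lies above the MCID.
        return "significantly greater than MCID"
    if ci_lower > 0 and mean_change > mcid:
        # Interval excludes zero and the mean exceeds the MCID.
        return "significant, mean exceeds MCID"
    if ci_lower > 0:
        # Interval excludes zero but the mean falls short of the MCID.
        return "significant, mean below MCID"
    return "not statistically significant"
```

With made-up inputs, `classify_effect(1.0, 0.6, 1.4)` falls in the first category, which is the pattern described for the Dyspnoea and Mastery domains, whereas `classify_effect(0.7, 0.2, 1.2)` falls in the second. For the SGRQ, where a lower score indicates better health, the sign of the change would need to be reversed before using this function.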
In the context of pharmacological studies, health status measurements may provide another particularly important contribution as illustrated by a 16 week study that compared two doses of salmeterol with placebo in COPD.1 The improvement in FEV1 was similar with both doses of the drug (110–120 ml). This magnitude of improvement is at the lower end of those typical of COPD bronchodilator studies. In patients given salmeterol 50 μg twice daily, the SGRQ score improved by over 5 units (a clinically and statistically significant amount) compared with the changes seen with placebo. In contrast, patients given salmeterol 100 μg twice daily experienced neither a clinically nor a statistically significant improvement in SGRQ, despite having an improvement in FEV1 of the same magnitude as that obtained with the lower dose. This lack of symptomatic benefit appears to have been due to side effects.1 Use of health status measurement in this study has shown that salmeterol can, in the right dose, produce worthwhile benefits that patients do notice. Physicians are often tempted to increase drug doses in the face of a poor response or to get an even better effect. Use of direct measurements of health status has shown that this may lead to a loss of any benefit, rather than additional gain.
Longitudinal trends in health status
The accelerated decline in FEV1 that occurs in smokers with COPD is familiar to all chest physicians since the work of Fletcher and Peto. Only recently, following the ISOLDE study, has the accompanying decline in health status been documented.22 43 The existence of this decline was predictable from clinical experience and cross sectional data such as those in fig 1 in which lower FEV1 was associated with worse health. However, the rate of decline was not predicted. In patients with a mean post-bronchodilator FEV1 of 50% predicted and treated with bronchodilators alone, the SGRQ score declines approximately 3 units per year. Thus, on average, patients reach a clinically significant level of deterioration of 4 units every 15 months. This is much faster than the age related worsening in the SGRQ score of 0.12 units per year observed in healthy subjects without COPD.53 The mechanisms underlying this decline have yet to be fully established, although the rate of decline in FEV1 is a factor,43 as is the rate of exacerbation.54 Clearly there will be “fast” and “slow” decliners in health status, and the challenge will be to identify “fast” decliners early on in their disease and develop appropriate interventions.
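The arithmetic linking a rate of decline to the time taken to reach the 4 unit threshold can be sketched as follows. This is a minimal illustration that assumes a constant, linear rate of change; the function name is hypothetical. A rate of exactly 3 units per year gives 16 months, so the figure of 15 months quoted above would correspond to a rate slightly above 3 (about 3.2 units per year). That rate is inferred here from the arithmetic and is not stated in this review.

```python
def months_to_threshold(decline_per_year, threshold=4.0):
    """Months for a score to change by `threshold` units.

    Assumes a constant (linear) rate of change, which is a simplification
    made for this sketch.
    """
    if decline_per_year <= 0:
        raise ValueError("rate of decline must be positive")
    return 12.0 * threshold / decline_per_year


copd_months = months_to_threshold(3.0)      # approximate rate quoted in the text
healthy_months = months_to_threshold(0.12)  # age related change in healthy subjects
```

On the same assumption, the age related rate of 0.12 units per year would take about 400 months (roughly 33 years) to produce a 4 unit change.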
Demonstration of a measurable decline in health status has important implications for the design and interpretation of long term clinical trials in COPD, and for the management of patients in routine practice. Progressive worsening of patients' health over time will appear to erode earlier therapeutic gains. This may not mean that the treatment effect has worn off—it is just a reflection of the fact that COPD is a relentlessly progressive disease. In this respect, the most encouraging finding from the ISOLDE study was that fluticasone reduced the rate of decline in SGRQ score by nearly 40% and that the difference between steroid and placebo treated groups widened progressively with time.22 43
Limitations of health status measurement
Health status instruments are not perfect. Their developers attempt to make the best instrument that they can, but limitations emerge with use. These may be related to the administration or scoring of the questionnaire, but it should be appreciated that comprehensive measurements require sophisticated instruments. Improvement in signal/noise ratio of the scores is possible by elimination of less reliable and less discriminatory items, but consistency between studies requires that scores from any “improved” version should be directly compatible with those obtained with the earlier version. Nevertheless, these objectives may be achievable, although radical reductions in the number of items are unlikely.
Scores for the existing COPD questionnaires are usually normally distributed with little evidence of so-called “floor” and “ceiling” effects, although one comparison of the BPQ, CRQ, and SGRQ reported BPQ scores to be distributed towards the low end.37 One unresolved issue concerns the application of these instruments to the most severe patients. These questionnaires were developed in patients who were largely not housebound, so they may not be appropriate for patients with end stage disease, although in COPD patients with hypoxia55 and hypoxia plus hypercapnia56 SGRQ scores were high but still normally distributed. Questionnaires designed for the most severe patients have been developed, including the London Chest Activity of Daily Living scale (LCADL)33 and a respiratory failure specific instrument, the MRF-28.57 Studies that compare these new instruments with existing questionnaires will serve as a test of the adequacy of the latter in the most severe subgroup.
Health status measurement in routine practice
Health status questionnaires were developed and validated in populations of patients as research tools that would allow standardised assessments. Disease specific questionnaires such as the CRQ and SGRQ are composed of “lowest common denominator” items that are applicable to most patients with COPD. When used in clinical trials, they indicate the average response to treatment. In routine practice, however, clinicians treat individuals, not the average. This presents the challenge of assessing whether an individual patient has had a worthwhile improvement. Physiological improvement does not appear to be an adequate surrogate for symptomatic or health status improvement. In pulmonary rehabilitation the correlation between improvement in health status and improvement in exercise performance is generally weak.2 58-60 Similarly, with long acting bronchodilators the correlation between changes in FEV1 and health status is weak (fig 3).1 The correlation between the change in FEV1 and the change in SGRQ score is statistically significant, but there is much scatter around the regression line. Some patients, as exemplified by patient A, had a measurable improvement in FEV1 but no improvement in SGRQ. Others such as patient B showed very large improvements in health status but no detectable change in FEV1. Use of spirometric testing as the sole method of assessing benefit would deny such patients a worthwhile treatment. Bronchodilators may produce symptomatic benefit, not only by reducing expiratory airflow limitation but also by improving inspiratory flow rates,7 minimising the effects of dynamic hyperinflation,8 9 and by improving sleep.
The absence of a strong correlation between symptomatic and spirometric gain is not surprising, given the many factors that influence the development and perception of respiratory symptoms and the ensuing disability, but it does show that symptomatic gain in individual patients in routine practice cannot be inferred reliably from spirometric changes. This poses the question as to how symptomatic and health status gain should be assessed in routine practice. Unfortunately, use of standardised questionnaires such as the CRQ, SGRQ, and even the short and simple AQ20 may not be the answer. These questionnaires complement baseline spirometry to provide a more complete picture of the patient's disease severity, but they have limitations which restrict their usefulness when assessing an individual patient's response to a specific treatment. Any questionnaire short enough for routine use will contain only a small number of items that have been carefully selected to be relevant to all patients with COPD. These items give very little opportunity for an individual patient to indicate how they experience personal benefit from treatment. There are also statistical issues. In a population of patients with stable COPD the short term repeatability of these questionnaires is good. For example, the correlation between SGRQ measurements made 2 weeks apart is 0.92,25 but the correlation coefficient does not give the full picture since the standard deviation for the difference between the two measurements is ±9 units. Approximately half of the patients will show a change in SGRQ score, in one direction or the other, that exceeds the 4 unit threshold for a clinically significant change, whether or not there has been a real change in their state. Equally, in other patients who have a “true” worthwhile benefit, the health status score may change by less than the clinically significant threshold. This problem applies also to the CRQ and, indeed, is not unique to health status measurement. It also arises when assessing an individual patient's spirometric response to a long acting bronchodilator. The mean change in FEV1 (typically 100–200 ml) lies within the limits of reproducibility of the measurement.
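The repeatability argument above can be illustrated with a short calculation. The sketch below rests on assumptions that are not stated in the text: that the differences between repeated measurements in stable patients are normally distributed around a mean of zero, with the standard deviation of 9 units quoted above. The function names are hypothetical. Under these assumptions the calculation gives a proportion of the order of one half to two thirds, which is of the same order as the “approximately half” quoted above. It is an illustration and not a result from the cited study.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fraction_beyond_threshold(sd_difference, threshold=4.0):
    """Fraction of stable patients whose score changes by more than
    `threshold` units in either direction between two measurements.

    Assumes normally distributed differences with mean zero (an assumption
    made for this sketch, not a finding reported in the text).
    """
    if sd_difference <= 0:
        raise ValueError("standard deviation must be positive")
    one_tail = 1.0 - normal_cdf(threshold / sd_difference)
    return 2.0 * one_tail  # both directions, by symmetry


fraction = fraction_beyond_threshold(9.0)  # SD of 9 units, threshold of 4 units
```

A smaller standard deviation of the difference would reduce this fraction, which is why repeatability, and not the correlation coefficient alone, determines how useful a score is in an individual patient.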
Measurement of individual patient benefit
It is only worth continuing to prescribe symptomatic treatments if the patient can report benefit, but how can that benefit be assessed if health status questionnaires have insufficient reliability within an individual patient? The answer is to draw upon data from health status research to inform clinical history taking. It has been shown that patients' global retrospective assessment of the effect of a treatment, reported using a 4 point scale (ineffective, satisfactory, effective, very effective), correlated well with the improvement in SGRQ score.1 A 4 unit change in SGRQ score was associated with the patients' overall assessment that the treatment was “effective”. Physicians can also identify, with some confidence, patients who have had a worthwhile response to treatment (fig 4). However, many clinicians may be concerned about the simple acceptance of the patient's self-report that the treatment was effective and look for more confirmation that a significant change had occurred. Further analysis of health status data through “back calculating” changes in the health status score has enabled clinical scenarios to be developed that illustrate the type of changes that may occur following effective treatment. For example, a 4 unit change in SGRQ score (the threshold for a clinically significant change) corresponds to a patient who returns a few weeks after the prescription of a new treatment to report that he or she no longer takes so long to wash or dress, can now walk up stairs without stopping, and is now able to leave the house for shopping or entertainment. (Note: a 4 unit improvement in the SGRQ score would only occur if the patient reported all three improvements.) Examples of similar scenarios may be found elsewhere.49 These benefits are not marginal and can be readily identified by patients and reported to their clinician. A simple scheme of clinical assessment, based on results from health status and quality of life research, is illustrated in box 1. If the clinician is convinced by the patient's responses that a beneficial change has taken place, and the patient considers the change to be worthwhile, the treatment should be judged clinically effective.
Assessment of individual patient benefit from symptomatic treatment
Has your treatment made a difference to you?
Is your breathing easier in any way?
Can you do some things now that you couldn't do at all before, or do the same things but faster?
Can you do the same things as before but are now less breathless when you do them?
Has your sleep improved?
Please give me an example.
Box 1 Components of a clinical assessment of the response to symptomatic treatment for COPD. Patients should be able to provide examples of improvements that they have noticed and think are worthwhile.
Health status questionnaires provide a valid and standardised estimate of the overall impact of COPD and can complement spirometric measurements in the baseline assessment of patients in routine practice. Simple questionnaires are now available for this purpose. In a clinical trial, health status scores provide a measure of the overall level of symptomatic benefit obtained with the treatment under study. In the individual patient, assessment of symptomatic benefit and quality of life improvement requires that a careful clinical history be taken.