Background The National Early Warning Score (NEWS), proposed as a standardised track and trigger system, may perform less well in acute exacerbation of COPD (AECOPD). This study externally validated NEWS and modifications (Chronic Respiratory Early Warning Score (CREWS) and Salford-NEWS) in AECOPD.
Methods An observational cohort study (2012–2014, two UK acute medical units (AMUs)), compared AECOPD (2361 admissions, 942 individuals, International Statistical Classification of Diseases and Related Health Problems-10 J40–J44 codes) with AMU patients (37 109 admissions, 20 415 individuals).
Outcome Prediction of in-hospital mortality by admission NEWS, CREWS and Salford-NEWS was assessed by discrimination (area under receiver operating characteristic curves (AUROCs)) and calibration (plots and Hosmer-Lemeshow (H-L) goodness-of-fit).
Results Median admission NEWS in AECOPD was 4 (IQR 2–6) versus 1 (0–3) in AMUs (p≤0.001), despite mortality of 4.5% in both. AECOPD AUROCs were NEWS 0.74 (95% CI 0.66 to 0.82), CREWS 0.72 (0.63 to 0.80) and Salford-NEWS 0.62 (0.53 to 0.70). AMU NEWS AUROC was 0.77 (0.75 to 0.78). At threshold NEWS=5 for AECOPD (44% of admissions), positive predictive value (PPV) of death was 8% (5 to 11) and negative predictive value (NPV) was 98% (97 to 99) versus AMU patients PPV of 17% (16 to 19) and NPV of 97% (97 to 97). For NEWS in AECOPD H-L p value=0.202.
Conclusion This first validation of the NEWS in AECOPD found modest discrimination to predict mortality. Lower specificity of NEWS in patients with AECOPD versus other AMU patients reflects acute and chronic respiratory physiological disturbance (including hypoxia), with resultant low PPV at NEWS=5. CREWS and Salford-NEWS, adjusting for chronic hypoxia, increased the specificity and PPV but there was no gain in discrimination.
What is the key question?
How does the National Early Warning Score (NEWS) perform in predicting mortality for acute exacerbation of COPD (AECOPD)?
What is the bottom line?
The NEWS shows acceptable discrimination in AECOPD and at a cut-off of 5 points has high negative predictive value and low positive predictive value and therefore can predict survival but not mortality.
Why read on?
This large dual-centre cohort study is the first validation of the NEWS in AECOPD.
Hospitalised patients are frequently exposed to avoidable harm.1 Adverse trends in clinical observations are often missed or misinterpreted, while shortcomings in management reflect poor organisation, appreciation of urgency or lack of supervision.1 ,2 One response to this knowledge has been the introduction of early warning scores (EWSs). Incorporating observations, with points aggregated in a weighted manner, depending on the degree of abnormality, EWSs are examples of prognostic prediction models. Derivation studies of EWSs, usually predicting inpatient mortality, have used observations at admission3 or at 24 hours prior to the outcome.4 By 2008, over 30 EWSs for acute hospital admissions had been published, with variable implementation.
The Royal College of Physicians (RCP) published a National Early Warning Score (NEWS) in 2012,5 aiming to standardise practice (see online supplementary figure A1 for further details). The NEWS is based on the validated ViEWS score derived at a single UK hospital (35 585 general acute medical unit (AMU) admissions, median age 73, mortality 5.6%),4 reporting an area under the receiver operating characteristic curve (AUROC) of 0.89 for predicting in-hospital mortality within 24 hours of the observation set.4 The RCP report suggested a range of thresholds, to standardise the frequency of observations, responder personnel and timing of senior review. For example, at a NEWS of 5 points, monitoring is suggested at least hourly, along with an urgent clinical review. The report noted that patients with COPD may have chronically disturbed physiology, potentially altering NEWS performance, but did not quantify this or suggest a way this should be compensated for. Two groups have proposed adjusted NEWS for patients with chronic respiratory disease, addressing this issue.6 ,7 First, the Chronic Respiratory Early Warning Score (CREWS)6 assigns risk points at a lower oxygen saturation threshold compared with patients without COPD. Second, Salford-NEWS combines this lower threshold with risk points for the use of supplemental oxygen in the context of higher oxygen saturation levels, reflecting a concern of hyperoxia-induced hypercapnic respiratory failure7 (see online supplementary figure A2).
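As a rough illustration of how an aggregate weighted score of this kind is assembled, the sketch below sums points across seven observations. The point bands are an assumption: they are recalled from the published 2012 RCP chart rather than stated in this article, and should be checked against that chart (or online supplementary figure A1) before any use. The function names and the example patient are hypothetical.

```python
import math

# ASSUMPTION: bands recalled from the 2012 RCP NEWS chart, not given in this
# article. Each entry is (inclusive upper bound, points), in ascending order.
NEWS_BANDS = {
    "resp_rate": [(8, 3), (11, 1), (20, 0), (24, 2), (math.inf, 3)],
    "spo2": [(91, 3), (93, 2), (95, 1), (math.inf, 0)],
    "temp_c": [(35.0, 3), (36.0, 1), (38.0, 0), (39.0, 1), (math.inf, 2)],
    "systolic_bp": [(90, 3), (100, 2), (110, 1), (219, 0), (math.inf, 3)],
    "heart_rate": [(40, 3), (50, 1), (90, 0), (110, 1), (130, 2), (math.inf, 3)],
}


def band_points(value, bands):
    """Return the points of the first band whose upper bound covers the value."""
    for upper, points in bands:
        if value <= upper:
            return points
    raise ValueError(f"value {value!r} not covered by any band")


def news_score(obs):
    """Sum the weighted points over all seven observations."""
    total = sum(band_points(obs[name], bands) for name, bands in NEWS_BANDS.items())
    total += 2 if obs["on_oxygen"] else 0        # any supplemental oxygen
    total += 0 if obs["avpu"] == "A" else 3      # anything other than Alert
    return total


# Hypothetical patient with chronic hypoxia who is otherwise stable.
stable_copd = {
    "resp_rate": 20, "spo2": 90, "on_oxygen": True, "temp_c": 37.0,
    "systolic_bp": 120, "heart_rate": 95, "avpu": "A",
}
print(news_score(stable_copd))
```

In this hypothetical case the saturation and oxygen components alone contribute 5 points, which is the mechanism behind the concern that chronically disturbed physiology may alter NEWS performance.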
Over 900 000 people have been diagnosed with COPD in the UK8 and around 110 000 emergency hospital admissions in England each year are due to acute exacerbation of COPD (AECOPD), a frequency second only to pneumonia.9 Concern has been expressed that the NEWS lacks specificity in AECOPD, over-alerting relatively stable patients, especially due to the weighting of ‘chronic’ hypoxia, with the potential for inappropriate diversion of resources and potentially encouraging (inappropriate) oxygen therapy.6 ,7 The study, performed in two adult AMUs, has two aims:
1. To externally validate the performance of the NEWS in terms of inpatient death among:
a. patients admitted to hospital with an AECOPD (‘AECOPD cohort’) for their first admission during the study period;
b. unselected patients admitted to hospital (‘AMU cohort’) for their first admission during the study period;
c. the AECOPD cohort for all inpatient episodes during the study period and
d. the AMU cohort for all inpatient episodes during the study period.
2. To externally validate alternative EWSs (CREWS and Salford-NEWS).6 ,7
Clauses 1a and 1b represent the primary analysis and clauses 1c and 1d form the sensitivity analysis. Published guidance for reporting was followed.10
An observational retrospective cohort external validation study of the NEWS was performed in the adult AMUs of Worthing Hospital and St Richard's Hospital sites of Western Sussex Hospitals NHS Foundation Trust (WSHFT), for the period from March 2012 to February 2014. WSHFT is an 870-bed Trust on the South Coast of England with a combined annual emergency department attendance over 150 000 and 50–60 acute medical admissions per 24-hour period. There is a separate admissions unit for complex elderly patients on the Worthing site (not included in the analysis). Ethical approval was given by NHS Research Ethics Committee London—South East (REC reference 13/LO/0884).
At admission, all inpatients have physiological observations measured and entered via handheld systems into the clinical data software system (Patientrack Sydney, New South Wales, Australia), with the NEWS automatically calculated. Criteria for the AECOPD cohort were as follows: age over 40 years, admitted to one of the AMUs, staying for at least one night over the 24-month period (2012–2014), as identified by a primary diagnosis from the International Statistical Classification of Diseases and Related Health Problems-10 (ICD-10) classification J40–J44 (88% coded J44.0 or J44.1).11 For comparison, over the same period, data were extracted on all other patients aged ≥18 years admitted for at least one night, through the two AMUs (AMU cohort). Exclusion criteria were as follows: patients moved directly from the accident and emergency (A&E) department to the Intensive Care Unit (ICU) (as neither area uses the Patientrack data system), aged <18 years or those discharged without spending a night in hospital.
Patients were followed up until discharge from hospital, or death, during the 24-month observation period. The primary analysis was performed for the first inpatient admission. A sensitivity analysis was performed, including all episodes during the study period, in an attempt to account for countervailing prognostic factors such as survivor bias and the effects of repeated admissions. Analysing all episodes also aids generalisability of results, as in clinical practice the NEWS is applied whatever the number of prior admissions. Furthermore, the data were analysed ‘per admission’ by creating a multilevel (hierarchical) multiple logistic regression model in which we adjusted the main (fixed) effect of the NEWS for the number of admissions (level 1) per patient (level 2). Twenty-five second-level clusters or “types of COPD patients” were formed. The main effect of the NEWS as predictor was adjusted for its random slope at level 1 (as nested within 25 clusters at level 2) and the number of admissions as a random intercept. Finally, a multiple regression model was run to test whether any difference seen in NEWS between AECOPD and AMU cohorts might have been due to the older age of the AECOPD cohort. As all patients had a NEWS calculated automatically by the Patientrack system before any outcome had occurred, there were no missing data at admission. The outcome predicted by admission NEWS was inpatient mortality. None of the researchers involved in analysis of the data were involved in the management of the patients. Data for the CREWS and Salford-NEWS prediction scores were collected and processed in the same way. The research team members responsible for data analysis had access only to the fully anonymised individual-level data and were blinded to any other patient data, as well as to the components of the calculated scores in the hospital information system.
Since there is no consensus on how to determine what counts as an adequate sample size in such studies,10 all available 39 470 hospital episodes for the period 2012–2014 were included in the analysis.
Performance of the score as predictor is assessed by discrimination and calibration.10 Discrimination is demonstrated by the AUROC, representing how well a model separates patients who experienced the outcome (in this case mortality) from those who did not. Calibration describes how well predicted results from a logistic regression model agree with the observed results. Over the entire range of prediction, this is referred to as goodness-of-fit. The Hosmer-Lemeshow (H-L) test is the most commonly used statistic in this field.12 The H-L test-associated p value, when significant (<0.05), may indicate poor fit.12 ,13 It is also recommended to graphically plot predicted against observed outcomes, for example, with a calibration slope.10 The agreement between the predicted probabilities and the observed frequencies for calibration was evaluated graphically by plotting the predicted probabilities (x-axis) against the observed event rate (y-axis) of the outcome (at each level of the score). The association between predicted probabilities and observed event rate can be described by a line with an intercept and a slope. An intercept of zero and a slope of one indicate perfect calibration. Predictive values were also calculated at suggested NEWS call-out thresholds, to further inform on the way model performance could impact on clinical workload. Following extraction, all data were fully anonymised in Microsoft Excel and analyses were performed in SPSS (V.22) and STATA SE (V.14).
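The two performance measures described above can be sketched in a few lines. This is an illustration only, not the analysis code used in the study (which was run in SPSS and STATA): the AUROC is computed in its pairwise (Mann-Whitney) form, and the calibration line is fitted by ordinary least squares as a simplifying assumption. All function names and the toy data are hypothetical.

```python
def auroc(scores, died):
    """Probability that a randomly chosen death scored higher than a randomly
    chosen survivor, counting ties as one half (Mann-Whitney form)."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def observed_rate_by_score(scores, died):
    """Observed event rate at each score level (y-axis of a calibration plot)."""
    totals, deaths = {}, {}
    for s, d in zip(scores, died):
        totals[s] = totals.get(s, 0) + 1
        deaths[s] = deaths.get(s, 0) + int(d)
    return {s: deaths[s] / totals[s] for s in sorted(totals)}


def calibration_line(predicted, observed):
    """Least-squares (intercept, slope) of observed rate on predicted probability.
    An intercept of 0 and a slope of 1 indicate perfect calibration."""
    n = len(predicted)
    mean_x = sum(predicted) / n
    mean_y = sum(observed) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, observed))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Toy data (made up for illustration): admission scores and in-hospital death.
toy_scores = [0, 1, 2, 3, 5, 6, 7, 9]
toy_died = [0, 0, 0, 1, 0, 0, 1, 1]
print(auroc(toy_scores, toy_died))
print(observed_rate_by_score(toy_scores, toy_died))
print(calibration_line([0.1, 0.2, 0.3], [0.1, 0.2, 0.3]))
```

The pairwise form makes explicit that discrimination depends only on how cases are ordered, whereas the calibration line compares the predicted probabilities themselves with what was observed.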
Over the 2-year study period, there were 2361 AECOPD inpatient episodes (123 inpatient deaths, median of 3 admissions (IQR 2–5)) and 37 109 non-COPD AMU episodes (1911 deaths). For the primary analysis (first admission), there were 942 patients in the AECOPD cohort and 20 415 patients in the AMU cohort. The AECOPD cohort had a median age of 74 (67–82) versus 71 (55–82) in the AMU cohort (p<0.001). Median admission NEWS was significantly different—AECOPD 4 points (2–6) versus AMU 1 point (0–3) (p≤0.001). Inpatient mortality for first admission did not differ (4.5% in both cohorts). Table 1 summarises admission clinical demographic variables.
The spread of scores for the AECOPD cohort can be seen to be bell-shaped, in contrast with the AMU cohort, where the data are right-skewed (figure 1). In the AECOPD cohort, 44% had a score of ≥5 points on admission, compared with only 11% in the AMU cohort. Using a NEWS threshold of 5 points to predict inpatient mortality in the AECOPD cohort, sensitivity was 76% (95% CI 61% to 88%), specificity was 57% (54% to 61%), positive predictive value (PPV) was 8% (5% to 11%) and negative predictive value (NPV) was 98% (97% to 99%). In contrast, in the AMU cohort, sensitivity was 43% (40% to 46%), specificity was 90% (90% to 91%), PPV was 17% (16% to 19%) and NPV was 97% (97% to 97%). (See table 2, which includes threshold of NEWS of 7 points.)
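The relationship between these figures can be checked with Bayes' rule: at the same mortality (4.5%), the lower specificity in the AECOPD cohort is what drives the lower PPV. The sketch below uses the rounded sensitivities, specificities and mortality reported above as inputs, so its outputs are approximations rather than a reproduction of the study's calculation; the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and outcome prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Inputs are the rounded values reported in the text for NEWS >= 5 points.
aecopd_ppv, aecopd_npv = predictive_values(0.76, 0.57, 0.045)
amu_ppv, amu_npv = predictive_values(0.43, 0.90, 0.045)

print(f"AECOPD: PPV {aecopd_ppv:.0%}, NPV {aecopd_npv:.0%}")
print(f"AMU:    PPV {amu_ppv:.0%}, NPV {amu_npv:.0%}")
```

With these rounded inputs the calculation gives values that round to the reported PPV and NPV in both cohorts.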
In the AECOPD cohort, for their first admission, the AUROCs for predicting inpatient mortality for the three prediction scores were as follows: NEWS 0.74 (95% CI 0.66 to 0.82), CREWS 0.72 (0.63 to 0.80) and Salford-NEWS 0.62 (0.53 to 0.70). In the AMU cohort, for their first admission, the AUROC for the NEWS was 0.77 (0.75 to 0.78) (figure 2). In the AECOPD cohort, the H-L test p value was 0.202 for NEWS, 0.399 for CREWS and 0.08 for Salford-NEWS (see online supplementary table A1 for H-L observed and expected results for NEWS). Calibration plots (shown in figure 3) suggest no improvement in calibration with the alternative scores, which both underpredicted mortality, though the number of deaths was small. In lower-risk groups, NEWS in the AMU cohort also underpredicted observed mortality. By assigning fewer points for hypoxaemia, CREWS and Salford-NEWS increased specificity, with an accompanying decrease in sensitivity in the AECOPD cohort. For example, at a call-out threshold of 5 points, sensitivity (to predict mortality) was 76% for NEWS, 48% for CREWS and 24% for Salford-NEWS, respectively, while specificity was 57%, 88% and 91%. CREWS and Salford-NEWS in the AECOPD cohort performed similarly to the NEWS in the AMU cohort.
In the AECOPD cohort, for all inpatient episodes over the 2-year study period (n=2361), AUROCs for predicting inpatient mortality for the three prediction scores were as follows: NEWS 0.69 (0.64 to 0.75), CREWS 0.70 (0.64 to 0.75) and Salford-NEWS 0.67 (0.61 to 0.72). In the AMU cohort, for all inpatient episodes (n=37 109) using NEWS the AUROC was 0.75 (0.74 to 0.76) (see online supplementary table A2 and figures A3–4 for further details on all episodes). After adjusting for the number of admissions per patient (using ‘admissions’ as a random intercept), all p values for the main effect remained significant at p<0.05. After further adjusting the main effect (admission NEWS), as a random slope (at first level) as nested within the number of admissions (at second level), the main (fixed) effect of the NEWS as a predictor remained significant at p<0.05 (see online supplementary file). A further multiple regression model revealed that both AECOPD and age were independent predictors of NEWS, suggesting that the increased age in the AECOPD cohort did not account for the increase in NEWS seen in this cohort.
Statement of principal findings
This is the first validation study of the NEWS, CREWS and Salford-NEWS in AECOPD admissions. To predict inpatient mortality, admission NEWS in an AECOPD cohort demonstrated similar discrimination to an AMU cohort (AUROC 0.74 (0.66 to 0.82) vs 0.77 (0.75 to 0.78)). However, at suggested RCP cut-offs of 5 and 7 points (to predict mortality) in the AECOPD cohort, specificity and PPV values of the NEWS were lower compared with the AMU cohort, though sensitivity at the same cut-offs was higher. Modified scores have been suggested to account for chronically altered physiology in AECOPD.6 ,7 However, this goes against the premise that a universal scoring system (with potential significant advantages) should be employed throughout NHS hospitals. Furthermore, patients with COPD were included in the original derivation cohort for the NEWS. Assigning lower oxygen saturation thresholds for scoring could result in patients at high risk of death being categorised into a lower-risk group, thereby missing opportunities to intervene early. As predictors of mortality on admission (assessed by respective AUROCs), CREWS (0.72) and Salford-NEWS (0.62) did not improve discrimination compared with NEWS (0.74). At a threshold of 5 points, both alternatives improved specificity and PPV though sensitivity was reduced.
In a large dual-centre adult AMU cohort, to predict inpatient mortality, admission NEWS discriminated satisfactorily (AUROC 0.77). A trade-off between sensitivity and specificity must be noted; for example, at a cut-off NEWS of 7 points, sensitivity was only 25% for inpatient mortality. The AUROC for the NEWS is similar to a previously described admission prediction model by Duckitt et al (AUROC 0.74).3 The higher AUROC in the original derivation study for NEWS (0.89) is explained by prediction time frame (mortality 24 hours from observation set versus admission score),4 and derivation studies usually perform better than validation studies, making the latter crucial to perform.10 Admission NEWS was analysed here, as it can be used to triage the patient and facilitate early physician review in higher-risk patients. This study complements three external validations of the ViEWS, on which the NEWS is based. One Canadian study found that an abbreviated ViEWS gave an AUROC of 0.81 (0.80 to 0.82) to predict 30-day mortality;14 a US study reported an AUROC of 0.86 (timing of outcome not reported)15 and a Ugandan study reported an AUROC of 0.89 (0.83 to 0.95) to predict mortality within 24 hours of admission.16 Overall mortality was lower than in the original ViEWS derivation study, reflecting a decrease across the NHS for emergency admissions, previously reported.17 ,18
Strengths of the study
This large observational cohort validation study using automatic (electronic) collection of observation information provides novel insights into the performance of the NEWS, CREWS and Salford-NEWS in a specialist group of patients, who nevertheless account for a large number of acute hospital admissions. Population characteristics, inpatient mortality, duration of stay and number of readmissions closely resemble the most recent UK British Thoracic Society (BTS) Audit.19 The study also provides the first UK external validation of the NEWS on admission. The additional analyses, adjusting for number of admissions, using multilevel modelling strengthen the findings of the primary analysis that NEWS is a strong predictor of mortality in this group.
Limitations of the study
The study relied on the ICD-10 coding of diagnosis, which has potential shortcomings,20 and using a primary diagnosis could have missed or misclassified episodes. As there is no single diagnostic test for COPD, clinical judgement based on history, physical examination and confirmation of airflow obstruction on spirometry is used. It follows that any study on an often heterogeneous group of patients will be open to criticism of inclusion criteria. Indeed, current COPD guidelines may overdiagnose older men and underdiagnose young women.21 In one recent study, using a diagnostic code alone, PPV for COPD was 87% (78% to 92%), while adding spirometry plus specific medication only marginally increased the PPV to 89% (81% to 95%).22 Second, as only two hospital sites (including one with a separate admissions unit for complex elderly patients) were used, with a largely white, elderly demographic, in one South-East England county, cautions over generalisability must be noted. Third, the number of outcome events in the AECOPD cohort for the primary (first admission) analysis was limited (n=42) due to the relatively short observation period, though a sensitivity analysis on all admissions over the study period (n=123 deaths) produced similar results. There is little empirical evidence to guide sample size in validation studies and though 100 events have been suggested, this is based on limited simulation studies10 and the presented overall sample size is large. Fourth, no A&E observations were available, as the electronic system is only for patients admitted to a hospital ward (including AMU), so for the proportion of patients admitted via A&E, initial observations (up to four hours) would not be reflected in the data. Patients directly admitted to ICU were not included either, though in both hospitals these account for <1% of all admissions; such patients, by definition, have already been promptly recognised as critically unwell.
Finally, using in-hospital mortality as the outcome measure could potentially miss a patient who subsequently died in the community, though a minority of patients with respiratory disease die at home.23
Prediction models: statistical and clinical relevance and limitations
Although the AUROC is used widely in predictive models, it has well-documented shortcomings,24–27 in that it reflects only a model's ability to rank order cases rather than being a function of the actual predicted probabilities, and it does not inform on the consequences of using a model. Calibration is often overlooked26 and its most commonly used statistic (the H-L test) has shortcomings depending on sample size.12 ,13 Groups should also be visualised to interpret calibration (plotting predicted versus observed outcomes).10 For clinical use, a strong model should lead to a wide range of predicted values and accurately stratify individuals into higher or lower risk categories. The calibration plots in the presented study for the AECOPD cohort suggest an overestimation of risk with NEWS but a potential underestimation with suggested alternatives (and in the AMU cohort using NEWS).
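The rank-order limitation can be made concrete with a small sketch: halving every predicted probability leaves the ordering of cases, and therefore the AUROC, unchanged, even though the predicted risks now sit much further from the observed event rate. The data and names below are invented purely for illustration and are not drawn from the study.

```python
def auroc(predicted, outcome):
    """Pairwise (Mann-Whitney) AUROC, counting ties as one half."""
    pos = [p for p, y in zip(predicted, outcome) if y]
    neg = [p for p, y in zip(predicted, outcome) if not y]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Made-up outcomes and predicted risks for ten patients.
outcome = [0, 0, 0, 0, 1, 0, 0, 1, 0, 1]
risk_a = [0.01, 0.02, 0.03, 0.05, 0.08, 0.10, 0.15, 0.25, 0.30, 0.50]
risk_b = [r / 2 for r in risk_a]  # same ranking, every risk halved

observed_rate = sum(outcome) / len(outcome)
mean_a = sum(risk_a) / len(risk_a)
mean_b = sum(risk_b) / len(risk_b)

print(auroc(risk_a, outcome), auroc(risk_b, outcome))  # identical discrimination
print(observed_rate, mean_a, mean_b)                   # different mean predicted risk
```

This is why discrimination and calibration need to be reported together: a model can rank patients equally well under both sets of predictions while systematically misstating absolute risk in one of them.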
The NEWS is proposed to highlight those at highest risk to enable early, appropriate stratification of resources (ie, frequency of performed observations and senior input). Almost half of the AECOPD cohort had observations that would prompt hourly monitoring and urgent senior input (NEWS ≥5 points). As AECOPDs account for a relatively large number of admissions (6% of the 39 470 admissions in the study period), the low PPV has the potential for alert fatigue that could in turn lead to a failure to act when urgent attention is required for patients with and without AECOPD. In the AMU cohort, NEWS of ≥5 points at admission has high specificity and PPV (90% and 17%, respectively) for mortality, making the requirement for senior input entirely appropriate. The study, however, highlights how concentrating just on such patients would miss the majority (57%) of inpatient deaths.
Models to predict mortality or adverse events in AECOPD
As the NEWS has limitations in adequately risk-stratifying patients with AECOPD, investigating more specific prediction models using electronic hospital records is of interest. A systematic review in 201328 noted 10 prediction model studies in AECOPD, though only 4 were new models to predict inpatient mortality in general cohorts.29–32 Table 3 summarises these studies and a further four studies were recently published.33–36 Three have external validation evidence29 ,31 ,32 with discrimination for the outcome studied, as assessed by the AUROC, ranging from 0.72 to 0.86. Automatic computer-based provision of recommendations as part of clinician workflow, at the time and location of decision-making, predicts successful implementation of prediction models.37 A number of the models include subjective variables (cyanosis, use of accessory muscles)29 or variables that would require manual input,29 ,32 ,36 making immediate electronic automation a challenge. The externally validated Dyspnoea, Eosinopenia, Consolidation, Acidaemia and Atrial Fibrillation (DECAF) score has been recommended by the most recent BTS COPD Audit, which also recommended that the dyspnoea scale (included in the DECAF) should routinely be filled out at admission in AECOPD.19 Unfortunately, as with other areas of prediction model studies, there are as yet no published impact analysis studies,10 though, of interest, the DECAF score is undergoing analysis as a triage tool for hospital admission in A&E as part of a randomised controlled trial (ISRCTN29082260) comparing home to standard inpatient management of patients with AECOPD.
NEWS and AECOPD
We propose, at present, that the NEWS should remain unadjusted as it provides standardisation as an aid to clinical staff. However, as with any observation(s), its utility is limited by the individual interpreting it. Any EWS must be supported by ongoing education and training, with the understanding that certain patient groups are best managed on specialist wards (or with specialist input elsewhere). In a patient with chronically disturbed respiratory physiology, when a specialist has indicated a lower target oxygen saturation (eg, the BTS recommend 88%–92%)42 for supplemental oxygen delivery, it should be taken into account when interpreting the NEWS. One strategy for the increasing number of hospitals with electronic observations would be to use previous discharge oxygen saturations as a guide to a patient’s baseline function, to individually titrate treatment and observation frequency, during subsequent admissions. This study demonstrates that recommended RCP call-out thresholds in AECOPD lead to hourly or continuous monitoring of a large number of relatively stable patients, diverting resources and producing alarm fatigue. Thus, local practices could be adjusted. For example, 48 hours after a patient’s admission, observation frequencies could appropriately be reduced, on a respiratory ward, if observations are close to their baseline state. Alternatively, an adjustment could be made to the score to account for chronic hypoxia in the presence of a normal respiratory rate. Ongoing education in oxygen prescribing and delivery is essential; for example, a patient achieving their target range for oxygen saturation may trigger a higher NEWS, but in this case escalation of oxygen supplementation would be inappropriate.
Meaning of the study
At admission to hospital, the NEWS discriminates for risk of inpatient mortality in patients with AECOPD similarly to general medical patients. The AECOPD group have a significantly higher NEWS, despite the same inpatient mortality. Call-out thresholds for patients with chronic respiratory disease may need reviewing to avoid alert fatigue, though applying clinical acumen, training and guidelines should aid recognition of the high risk or deteriorating patient. It follows that patients with COPD are likely to be best served in a location where input from specialist staff is available. An avenue of future investigation to improve performance of the NEWS in this cohort could be to incorporate blood test parameters and previous (coded) history into an electronic prediction model. The results of the DECAF impact analysis study will also be awaited with interest.
For patients with AECOPD, the NEWS as a mortality prediction score on admission to hospital discriminates similarly to general medical patients. Adjustment of the NEWS is not without risks—for example, inappropriately assigning lower risk to hypoxia could potentially attenuate the benefits that NEWS standardisation can bring. Improving education, including utility of the NEWS, recognition of the deteriorating patient, oxygen prescribing and ensuring that such patients are managed with input by respiratory clinicians is recommended.19
Contributors LEH: study design, data collection, data analysis and writing up of the paper. BDD, JC, RV and LGF: study design, data analysis and writing up of the paper. PJR: study design and writing up of the paper.
Competing interests None declared.
Ethics approval A favourable ethical opinion for the study was given by National Health Service Research Ethics Committee London—South East (REC reference 13/LO/0884).
Provenance and peer review Not commissioned; externally peer reviewed.