Abstract
Introduction Air pollution can exacerbate respiratory disease. However, when authoritative bodies (e.g. the Met Office) issue air quality warnings, media coverage may be disproportionate. We used non-linear predictive models to explore whether media coverage of pollution is associated with respiratory admissions, and whether it could help to predict them.
Methods The association was examined as follows:
Baseline regression models were generated to predict daily respiratory admission episodes from 1st January 2009 to 9th April 2014. Predictors consisted of daily logs of PM10 and PM2.5 particulate matter, nitric oxide, nitrogen dioxide, ozone, black carbon, mean temperature, precipitation (obtained from National Oceanic and Atmospheric Administration data) and the DAQI Air Pollution Index. Models were optimised via cross-validation against daily respiratory admission levels sourced from Nottingham University Hospitals Trust data (ICD-10 codes J39 – J9999).
Time series of media coverage levels were generated by applying kernel density estimation (using linear and exponential kernels at bandwidths of 1, 10, 25, 50 and 100 days) to daily counts of online news articles featuring pollution and air quality issues over the period 01.01.2013 – 09.04.2014.
Predictive model accuracies were compared following integration of these time series of media coverage levels as an additional predictor.
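The smoothing of daily article counts described above could be sketched roughly as follows. This is an illustration only, not the authors' implementation: the function names are hypothetical, the kernel shapes (a triangular "linear" kernel and a two-sided exponential decay) are assumed, and so are the normalisation by total weight and the use of both past and future days around each date.

```python
import math


def linear_kernel(u):
    """Assumed 'linear' kernel: weight 1 at distance 0, falling to 0 at |u| = 1."""
    return max(0.0, 1.0 - abs(u))


def exponential_kernel(u):
    """Assumed exponential kernel: weight decays as exp(-|u|)."""
    return math.exp(-abs(u))


def media_coverage_series(daily_counts, bandwidth, kernel):
    """Kernel-weighted average of daily article counts around each day.

    daily_counts: article counts, one per consecutive day.
    bandwidth:    kernel width in days (the abstract uses 1, 10, 25, 50, 100).
    kernel:       function mapping a scaled distance to a weight.
    """
    n = len(daily_counts)
    series = []
    for t in range(n):
        weighted_sum = 0.0
        total_weight = 0.0
        for s in range(n):
            w = kernel((t - s) / bandwidth)
            weighted_sum += w * daily_counts[s]
            total_weight += w
        # total_weight >= 1 because the day itself always has weight 1.
        series.append(weighted_sum / total_weight)
    return series


# Hypothetical counts: a single burst of five articles on the third day.
example_counts = [0, 0, 5, 0, 0]
narrow = media_coverage_series(example_counts, 1, linear_kernel)
wide = media_coverage_series(example_counts, 10, exponential_kernel)
```

With a bandwidth of 1 day the linear kernel leaves the counts unchanged, while wider bandwidths spread a burst of coverage over neighbouring days. A forecasting setup would probably restrict the window to past days only; the abstract does not say which was used.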
Results Of the predictive models tested, random forests, once parameterized, provided the best results for the air-quality predictors. When predicting daily respiratory admissions, the model’s root mean square error (RMSE) averaged 7.5031, which was 19.90% better than simply predicting mean daily admissions. On introduction of the media-coverage variable, RMSE fell to 7.3210, a 21.85% improvement over mean prediction. While this reflected a slight improvement in admissions forecasting, a corrected t-test suggested that the difference was not statistically significant (p = 0.0633).
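The quantities reported above could be computed along the following lines. This is a hedged sketch under stated assumptions: "x% better than mean prediction" is taken to mean a relative reduction in RMSE, and the "corrected t-test" is assumed to be the Nadeau-Bengio corrected resampled t-test, which the abstract does not confirm. All function names are hypothetical.

```python
import math
import statistics


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def improvement_over_mean(train_actual, test_actual, test_predicted):
    """Fractional RMSE reduction relative to always predicting the training mean."""
    mean_admissions = sum(train_actual) / len(train_actual)
    baseline = rmse(test_actual, [mean_admissions] * len(test_actual))
    return 1.0 - rmse(test_actual, test_predicted) / baseline


def corrected_t_statistic(score_differences, n_train, n_test):
    """Nadeau-Bengio corrected resampled t statistic (assumed variant).

    score_differences: per-resample difference in error between two models.
    n_train, n_test:   number of training and test instances in each resample.
    """
    k = len(score_differences)
    mean_diff = statistics.mean(score_differences)
    variance = statistics.variance(score_differences)  # sample variance (k - 1)
    # The n_test / n_train term inflates the variance to account for the
    # overlap between training sets across resamples.
    return mean_diff / math.sqrt((1.0 / k + n_test / n_train) * variance)


# Consistency check on the reported figures, under the relative-RMSE assumption.
implied_mean_rmse = 7.5031 / (1.0 - 0.1990)
implied_media_improvement = 1.0 - 7.3210 / implied_mean_rmse
```

Under that assumption the two reported RMSE values and the two percentages agree to within rounding. The statistic would be compared against a t distribution with k - 1 degrees of freedom to obtain a p-value.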
Conclusion Initial results indicate that consideration of media coverage may offer minor improvements in predicting respiratory admissions, but this effect was not statistically significant. While such a relationship requires further investigation, models informed by media coverage cannot currently be considered accurate enough for use in a practical setting. Better media data collection may improve prediction accuracy.