Comparison of performance of land use regression models derived for Catalunya, Spain
Introduction
Near-road traffic-related air pollution has been associated in many epidemiologic studies with a multitude of health effects ranging from reproductive outcomes to all-cause mortality (HEI, 2010). While these studies indicate that living close to busy roads is associated with adverse health effects, the findings are not easily translated into policies, and generalization from one city to another is limited because “proximity” does not necessarily reflect the same types and levels of pollution. Thus, alternative objective measures of local traffic-related pollution are warranted. Land Use Regression (LUR) modelling is often the chosen exposure assessment methodology to capture small-scale differences in air pollution concentrations, particularly from traffic sources, with medium implementation costs (Jerrett et al., 2005). Numerous epidemiological studies use LUR models to investigate the health effects of air pollution. Nitrogen dioxide (NO2) has most frequently been used as a marker of near-road traffic-related pollutants. A landmark activity of the ESCAPE project (European Study of Cohorts for Air Pollution Effects) is the adoption of uniform air pollution exposure assessment methodologies to assess the spatial variability of traffic-related air pollutants (Beelen et al., 2013, Cyrys et al., 2012). The same measurement and modelling methods were applied to 36 study areas across Europe, at sites where pre-existing cohort studies were available to provide health outcome data. NO2 was one of the pollutants chosen as an indicator of traffic-related exposures; LUR estimates were developed for it and then used to assign exposures to the participants of local studies.
A major challenge in the ESCAPE LUR development was the selection of the feasible number of measurement sites for a large number of European study areas within a limited budget. As previously shown, the number of measurement sites used to develop LUR models also impacts the performance of those models (Basagana et al., 2012, Wang et al., 2013). These studies have shown that for models developed on a small number of sites, the model and cross-validation R2 overestimate predictive ability at independent test sites. This highlights the importance of validating models with independent datasets. A second – and partly related – challenge was the ambitious spatial coverage demanded all across Europe for regions of heterogeneous sizes, namely varying from intra-city to large regional domains. One example of a single large but geographically diverse region is Catalunya, Spain, where ESCAPE provided a LUR model for several cohort studies from different towns and cities. For two cohorts – namely REGICOR (Girona Heart Registry) in the region of Girona (Rivera et al., 2012), and INMA (Environment and Childhood) in the city of Sabadell (Aguilera et al., 2008) – independent exposure assessment studies were undertaken to build NO2 LUR models for both study areas prior to the start of ESCAPE. The existence of parallel measurement campaigns and LUR models offers an interesting opportunity to compare the performance of the regional ESCAPE LUR model for NO2 with LUR models derived locally for each of these cohorts based on a higher density of measurement sites.
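The distinction between model R2, leave-one-out cross-validation R2, and R2 at independent test sites can be illustrated with a minimal Python sketch. This is not the ESCAPE procedure: the data are synthetic, the predictors are fixed rather than selected from a candidate pool, and all function names (fit_ols, predict, r2, loocv_r2) are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ intercept + X."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, X):
    """Predictions from intercept-plus-slopes coefficients."""
    return np.column_stack([np.ones(len(X)), X]) @ beta

def r2(obs, pred):
    """1 - SSE/SST, relative to the mean of the observations."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

def loocv_r2(X, y):
    """Refit with each site left out in turn, predict that site, then score."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        preds[i] = predict(fit_ols(X[keep], y[keep]), X[i:i + 1])[0]
    return r2(y, preds)

# Synthetic "sites" (assumption): 20 used for model building, 200 held out.
rng = np.random.default_rng(0)
n_train, n_test = 20, 200
X = rng.normal(size=(n_train + n_test, 3))
y = 30.0 + X @ np.array([5.0, 3.0, 0.0]) + rng.normal(scale=4.0, size=len(X))
X_tr, y_tr = X[:n_train], y[:n_train]
X_te, y_te = X[n_train:], y[n_train:]

beta = fit_ols(X_tr, y_tr)
model_r2 = r2(y_tr, predict(beta, X_tr))      # fit at the development sites
cv_r2 = loocv_r2(X_tr, y_tr)                  # leave-one-out cross-validation
holdout_r2 = r2(y_te, predict(beta, X_te))    # independent test sites
print(f"model R2={model_r2:.2f}  LOOCV R2={cv_r2:.2f}  hold-out R2={holdout_r2:.2f}")
```

For ordinary least squares with fixed predictors, the in-sample R2 is never below the leave-one-out value; how the hold-out R2 compares depends on the data and on how the predictors were chosen.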
Comparisons between LUR models developed for the same area are scant. Dijkema et al. (2011) compared two LUR models developed at different scales (large area and city-specific, encompassing the same core-area of Amsterdam) and using different monitoring campaigns. They found in both cases a drop in model performance in terms of adjusted R2 when applying the model to the other model's monitoring sites, and highlighted in their conclusions the importance of a sampling location strategy purposefully designed to reflect locations where models are to be applied.
The first objective of our study was to evaluate the performance of the LUR model by comparing predictions with observed values at locations that were not used for model development.
Because the ultimate goal of LUR modelling in epidemiology is to assign estimates of air pollution exposure to participants of health studies, our second objective was to compare the different model predictions at the residential addresses of the two local cohort studies.
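A comparison of two models' predictions at the same addresses could be summarized as in the following Python sketch. The text shown here does not state which agreement statistics were used, so the Pearson correlation and the mean and standard deviation of the differences are assumptions, as are the function name compare_predictions and the example values.

```python
import numpy as np

def compare_predictions(pred_a, pred_b):
    """Agreement summary for two models' predictions at the same addresses."""
    a = np.asarray(pred_a, dtype=float)
    b = np.asarray(pred_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("both models must be evaluated at the same addresses")
    diff = a - b
    return {
        "pearson_r": float(np.corrcoef(a, b)[0, 1]),  # linear agreement
        "mean_diff": float(diff.mean()),              # systematic offset (a minus b)
        "sd_diff": float(diff.std(ddof=1)),           # spread of the differences
    }

# Hypothetical NO2 predictions (ug/m3) at four addresses, for illustration only.
regional = [40.0, 35.0, 50.0, 45.0]
local = [38.0, 36.0, 47.0, 43.0]
summary = compare_predictions(regional, local)
print(summary)
```

A high correlation with a non-zero mean difference would indicate that the two models rank addresses similarly while differing in absolute level.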
Methods
Three distinct LUR models were developed within the region of Catalunya for three different epidemiological studies: ESCAPE (Cyrys et al., 2012), INMA-Sabadell (Aguilera et al., 2008), and REGICOR-Girona (Rivera et al., 2012). The model domain and the methodology and concepts of model development varied for each of the study models. As shown in Fig. 1, the ESCAPE domain encompasses the domain of the other two studies, as it was developed to assess exposures for participants of three European
Comparison of model performances at measurement sites (analyses 1 and 2)
The ESCAPE model performed well, explaining 69% of the variability in NO2 concentration (R2 = 0.69) at ESCAPE sites in Sabadell (Fig. 2a). The R2 for the ESCAPE model dropped to 0.53 when it was applied to INMA sites in Sabadell (Fig. 2b). The ESCAPE model performed less well in Girona province, with an R2 of 0.51 for ESCAPE sites in the province and 0.36 for REGICOR sites (Fig. 2c and d). The R2 values obtained by applying the ESCAPE model at independent INMA and REGICOR sites in Sabadell and Girona were
Summary and comparison with other studies
We evaluated the performance of the LUR model developed for the Catalunya region in Spain for the ESCAPE project and two other LUR models based on independent monitoring campaigns derived locally for the sub-regions of the INMA-Sabadell and REGICOR-Girona cohort studies (both included in the ESCAPE domain). The ESCAPE model employed relatively few sites for a large spatial area in comparison to more dense monitoring campaigns and different site selection protocols for INMA-Sabadell and
Conclusion
Three land use regression models, developed at different scales and with different philosophies for overlapping regions, were evaluated by comparing predictions with measurements from their own and from the other models' sampling datasets. For all three models, the validation R2 derived from the independent dataset was lower than the leave-one-out cross-validation R2, but the models still provided reasonable predictions. The three models still explained a substantial fraction of the variation at independent sites,
References
- et al. (2012). Effect of the number of measurement sites on land use regression models in estimating local air pollution. Atmospheric Environment.
- et al. (2013). Development of NO2 and NOx land use regression models for estimating air pollution exposure in 36 study areas in Europe – the ESCAPE project. Atmospheric Environment.
- et al. (2010). Comparison of the performances of land use regression modelling and dispersion modelling in estimating small-scale variations in long-term air pollution concentrations in a Dutch urban area. Atmospheric Environment.
- et al. (2000). A regression-based method for mapping traffic-related air pollution: application and testing in four contrasting urban environments. The Science of The Total Environment.
- et al. (2012). Variation of NO2 and NOx concentrations between and within 36 European study areas: results from the ESCAPE study. Atmospheric Environment.
- et al. (2010). Evaluation of land-use regression models used to predict air quality concentrations in an urban area. Atmospheric Environment.
- et al. (2008). Within-urban variability in ambient air pollution: comparison of estimation methods. Atmospheric Environment.
- et al. (2012). Spatial distribution of ultrafine particles in urban settings: a land use regression model. Atmospheric Environment.
- et al. (1997). New high resolution maps of estimated background ambient NOx and NO2 concentrations in the U.K. Atmospheric Environment.
- et al. (2013). Temporal stability of land use regression models for traffic-related air pollution. Atmospheric Environment.
- Estimation of outdoor NO(x), NO(2), and BTEX exposure in a cohort of pregnant women using land use regression modeling. Environmental Science & Technology.