Jain, Sakshi and Presto, Albert A. and Zimmerman, Naomi
Environmental Science & Technology: in press.
Publication year: 2021

Abstract: Previous studies have characterized spatial patterns of pollution with land use regression (LUR) models from distributed passive or filter samplers at low temporal resolution. Large-scale deployment of low-cost sensors (LCS), which typically sample in real time, may enable time-resolved or real-time modeling of concentration surfaces. The aim of this study was to develop spatiotemporal models of PM2.5, NO2, and CO using an LCS network in Pittsburgh, Pennsylvania. We modeled daily average concentrations from August 2016 to December 2017 across 50 sites. Land use variables included 13 time-independent (e.g., elevation) and time-dependent (e.g., temperature) predictors. We examined two models: LUR and a machine-learning-enabled land use model (land use random forest, LURF). The LURF models outperformed the LUR models, increasing the average externally cross-validated R2 by 0.10–0.19. Using wavelet decomposition to separate short-lived events from the regional background, we also created time-decomposed LUR and LURF models. Compared to the standard models, time decomposition improved R2 by up to 0.14. The time-decomposed models were more influenced by spatial parameters. Mapping our models across Allegheny County, we observed that time-decomposed LURF models created robust PM2.5 predictions, suggesting that this approach may improve our ability to map air pollutants at high spatiotemporal resolution.
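
Illustrative note (not part of the published abstract): the idea of separating short-lived events from a slowly varying background with a wavelet decomposition can be sketched as below. The abstract does not state which wavelet family or how many decomposition levels the authors used, so the Haar wavelet, the two-level depth, the function names, and the example series are all assumptions chosen for brevity. This is a minimal sketch of the general technique, not the authors' method.

```python
import math


def haar_step(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert one Haar level, interleaving the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def split_background(series, levels):
    """Split a series into a slow background and a short-lived residual.

    The background is the reconstruction that keeps only the coarsest
    approximation coefficients (all detail coefficients set to zero).
    The residual is whatever remains, so background + residual == series.
    """
    if levels < 1 or len(series) % (2 ** levels) != 0:
        raise ValueError("len(series) must be a multiple of 2**levels")
    approx = list(series)
    for _ in range(levels):
        approx, _unused_detail = haar_step(approx)
    background = approx
    for _ in range(levels):
        background = haar_inverse_step(background, [0.0] * len(background))
    short_lived = [v - b for v, b in zip(series, background)]
    return background, short_lived


# Hypothetical concentration series: flat baseline with one brief spike.
example_series = [10.0] * 5 + [18.0] + [10.0] * 10
background, short_lived = split_background(example_series, levels=2)
```

With a Haar wavelet, the background at level k is simply the mean over blocks of 2**k samples, so part of a spike leaks into its own block's background. Smoother wavelets and deeper decompositions reduce this effect. In a time-decomposed model, each component could then be fitted separately against the land use predictors.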