Abstract
Air pollution is one of the major concerns in global urbanization. Data science can help explain the dynamics of air pollution and build reliable statistical models to forecast pollution levels. Achieving these goals requires learning statistical models that capture the dynamics in historical data and predict future air pollution. Furthermore, the large size and heterogeneity of today’s big urban data pose significant challenges to the scalability and flexibility of such models. In this work, we present a scalable belief updating framework that produces reliable predictions from millions of historical hourly air pollutant and meteorology records. We also present a non-parametric approach to learning the statistical model, which reveals interesting periodic dynamics and correlations in the dataset. Building on the scalable belief updating framework and the non-parametric model learning approach, we propose an iterative update algorithm to accelerate Gaussian processes, which are notorious for their prohibitive computational cost on large input data. Finally, we demonstrate how to integrate information from heterogeneous data by treating the beliefs produced by other models as informative priors. Numerical examples and experimental results are presented to validate the proposed method.
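As a rough illustration of the belief updating idea described above, the following minimal sketch uses a textbook conjugate Gaussian update with a known noise variance. It is not the algorithm proposed in the paper. The names `GaussianBelief`, `update`, and `update_in_batches`, and all numbers, are hypothetical. The sketch shows two points from the abstract in their simplest form: a belief can be refined batch by batch without revisiting earlier data, and a belief produced by another model can serve as the informative prior.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass(frozen=True)
class GaussianBelief:
    """Belief about an unknown mean pollutant level (hypothetical helper)."""
    mean: float
    variance: float


def update(belief: GaussianBelief,
           observations: Sequence[float],
           noise_variance: float) -> GaussianBelief:
    """Conjugate Gaussian update, assuming a known observation noise variance."""
    n = len(observations)
    if n == 0:
        return belief
    prior_precision = 1.0 / belief.variance
    posterior_precision = prior_precision + n / noise_variance
    posterior_mean = (
        prior_precision * belief.mean + sum(observations) / noise_variance
    ) / posterior_precision
    return GaussianBelief(posterior_mean, 1.0 / posterior_precision)


def update_in_batches(belief: GaussianBelief,
                      batches: Iterable[Sequence[float]],
                      noise_variance: float) -> GaussianBelief:
    """Fold in one batch at a time; each posterior becomes the next prior."""
    for batch in batches:
        belief = update(belief, batch, noise_variance)
    return belief


if __name__ == "__main__":
    # Assumed informative prior, e.g. a forecast taken from some other model.
    prior = GaussianBelief(mean=40.0, variance=100.0)
    hourly = [[52.0, 48.0], [50.0, 55.0], [45.0, 50.0]]  # made-up readings
    print(update_in_batches(prior, hourly, noise_variance=25.0))
```

In this toy setting the batch-wise result matches a single update on all six readings, so memory use is bounded by the batch size rather than by the length of the history.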
Scalable Belief Updating for Urban Air Quality Modeling and Prediction