
Comparing the Performance of Statistical Models for Predicting PM10 Concentrations

Category: Articles

Volume: 14 | Issue: 3 | Pages: 653-665
DOI: 10.4209/aaqr.2013.07.0259

Arwa S. Sayegh1, Said Munir2, Turki M. Habeebullah2

  • 1 SETS International, Beirut, Lebanon
  • 2 The Custodian of the Two Holy Mosques Institute for Hajj and Umrah Research, Umm Al Qura University, Makkah, Saudi Arabia


The ability to accurately model and predict the ambient concentration of Particulate Matter (PM) is essential for effective air quality management and policy development. Various statistical approaches exist for modelling air pollutant levels. In this paper, several approaches, including linear, non-linear, and machine learning methods, are evaluated for the prediction of urban PM10 concentrations in the City of Makkah, Saudi Arabia. The models employed are the Multiple Linear Regression Model (MLRM), Quantile Regression Model (QRM), Generalised Additive Model (GAM), and Boosted Regression Trees, 1-way (BRT1) and 2-way (BRT2). Several meteorological parameters and chemical species measured during 2012 are used as covariates in the models. Various statistical metrics, including the Mean Bias Error (MBE), Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), the fraction of predictions within a Factor of Two (FACT2), the correlation coefficient (R), and the Index of Agreement (IA), are calculated to compare the predictive performance of the models. Results show that both MLRM and QRM captured the mean PM10 levels. However, QRM outperformed the other models in capturing the variations in PM10 concentrations. Based on the values of the error indices, QRM also showed better performance in predicting hourly PM10 concentrations. Its superiority over the other models is explained by its ability to model the contribution of covariates at different quantiles of the modelled variable (here PM10). In this way, QRM provides a better approximation procedure than the other modelling approaches, which consider a single central tendency response to a set of independent variables. Numerous recent studies have used these modelling approaches; however, this is the first study that compares their performance for predicting PM10 concentrations.
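The abstract names six evaluation metrics without giving their formulas. As a rough illustration only, the sketch below computes them from paired observed and predicted values. It assumes the commonly used textbook definitions: bias taken as predicted minus observed, FACT2 as the share of predictions with 0.5 <= predicted/observed <= 2, Pearson's correlation for R, and Willmott's original index for IA. The paper may define some of these differently. The function name `evaluation_metrics` and the sample values are hypothetical and are not taken from the study.

```python
import math


def evaluation_metrics(observed, predicted):
    """Return MBE, MAE, RMSE, FACT2, R and IA for paired series.

    Assumed definitions (not confirmed by the abstract):
      MBE   = mean(predicted - observed)
      FACT2 = fraction of pairs with 0.5 <= predicted/observed <= 2
      R     = Pearson correlation coefficient
      IA    = Willmott's index of agreement (original form)
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")

    n = len(observed)
    pairs = list(zip(observed, predicted))
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    errors = [p - o for o, p in pairs]
    sum_sq_err = sum(e * e for e in errors)

    mbe = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum_sq_err / n)

    # Pairs with a non-positive observation cannot form a ratio and count as misses.
    fact2 = sum(1 for o, p in pairs if o > 0 and 0.5 <= p / o <= 2.0) / n

    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in pairs)
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    # R is undefined when either series is constant.
    r = cov / math.sqrt(var_obs * var_pred) if var_obs > 0 and var_pred > 0 else float("nan")

    ia_denominator = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2 for o, p in pairs)
    ia = 1.0 - sum_sq_err / ia_denominator if ia_denominator > 0 else float("nan")

    return {"MBE": mbe, "MAE": mae, "RMSE": rmse, "FACT2": fact2, "R": r, "IA": ia}


if __name__ == "__main__":
    # Made-up hourly PM10 values, for illustration only.
    observed_pm10 = [10.0, 20.0, 30.0, 40.0]
    predicted_pm10 = [12.0, 18.0, 33.0, 39.0]
    for name, value in evaluation_metrics(observed_pm10, predicted_pm10).items():
        print(f"{name}: {value:.3f}")
```

Under these definitions, MBE, MAE, and RMSE are in the units of the modelled variable and are better when closer to zero, while FACT2, R, and IA are dimensionless and are better when closer to one.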


Performance evaluation; Multiple linear regression; Quantile regression model; Generalised additive model; Boosted regression trees

Related Articles

Critical Emissions from the Largest On-Road Transport Network in South Asia

Saroj Kumar Sahu, Gufran Beig, Neha Parkhi
Volume: 14 | Issue: 1 | Pages: 135-144
DOI: 10.4209/aaqr.2013.04.0137

Ambient Air Quality during Diwali Festival over Kolkata – A Mega-City in India

A. Chatterjee, C. Sarkar, A. Adak, U. Mukherjee, S.K. Ghosh, S. Raha
Volume: 13 | Issue: 3 | Pages: 1133-1144
DOI: 10.4209/aaqr.2012.03.0062

Exploring the Variation between EC and BC in a Variety of Locations

Gbenga Oladoyin Salako, Philip K. Hopke, David D. Cohen, Bilkis A. Begum, Swapan K. Biswas, Gauri Girish Pandit, Yong-Sam Chung, Shamsiah Abd Rahman, Mohd Suhaimi Hamzah, Perry Davy, Andreas Markwitz, Dagva Shagjjamba, Sereeter Lodoysamba, Wanna Wimolwattanapun, Supamatthree Bunprapob
Volume: 12 | Issue: 1 | Pages: 1-7
DOI: 10.4209/aaqr.2011.09.0150