Volume 14, No. 3, April 2014, Pages 653-665
Comparing the Performance of Statistical Models for Predicting PM10 Concentrations
Arwa S. Sayegh¹, Said Munir², Turki M. Habeebullah²
¹ SETS International, Beirut, Lebanon
² The Custodian of the Two Holy Mosques Institute for Hajj and Umrah Research, Umm Al Qura University, Makkah, Saudi Arabia
The ability to accurately model and predict the ambient concentration of Particulate Matter (PM) is essential for effective air quality management and policy development. Various statistical approaches exist for modelling air pollutant levels. In this paper, several approaches, including linear, non-linear, and machine learning methods, are evaluated for the prediction of urban PM10 concentrations in the City of Makkah, Saudi Arabia. The models employed are the Multiple Linear Regression Model (MLRM), the Quantile Regression Model (QRM), the Generalised Additive Model (GAM), and Boosted Regression Trees 1-way (BRT1) and 2-way (BRT2). Several meteorological parameters and chemical species measured during 2012 are used as covariates in the models. Various statistical metrics, including the Mean Bias Error (MBE), Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), the fraction of predictions within a Factor of Two (FACT2), the correlation coefficient (R), and the Index of Agreement (IA), are calculated to compare the predictive performance of the models. Results show that both MLRM and QRM captured the mean PM10 levels. However, QRM outperformed the other models in capturing the variations in PM10 concentrations. Based on the values of the error indices, QRM showed better performance in predicting hourly PM10 concentrations. Its superiority over the other models is explained by the ability of QRM to model the contribution of covariates at different quantiles of the modelled variable (here PM10). In this way, QRM provides a better approximation procedure than the other modelling approaches, which consider a single central tendency response to a set of independent variables. Numerous recent studies have used these modelling approaches; however, this is the first study that compares their performance for predicting PM10 concentrations.
Performance evaluation; Multiple linear regression; Quantile regression model; Generalised additive model; Boosted regression trees.
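
The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes the common textbook definitions (bias taken as prediction minus observation, Pearson's R, and Willmott's original Index of Agreement), which may differ from the exact variants used in the paper. The function name `evaluate` and the sample values are hypothetical and are not data from the study.

```python
import math


def evaluate(obs, pred):
    """Compute MBE, MAE, RMSE, FACT2, R and IA for paired observations/predictions.

    Assumed definitions (not confirmed by the paper):
      MBE   = mean(pred - obs)
      FACT2 = share of pairs with 0.5 <= pred/obs <= 2 (pairs with obs <= 0 skipped)
      IA    = Willmott's original index of agreement
    """
    if len(obs) != len(pred) or len(obs) == 0:
        raise ValueError("obs and pred must be non-empty and of equal length")

    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    errors = [p - o for o, p in zip(obs, pred)]
    sq_err = sum(e * e for e in errors)

    mbe = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sq_err / n)

    # Fraction of predictions within a factor of two of the observation.
    ratios = [p / o for o, p in zip(obs, pred) if o > 0]
    fact2 = (sum(1 for r in ratios if 0.5 <= r <= 2.0) / len(ratios)
             if ratios else float("nan"))

    # Pearson correlation coefficient; undefined if either series is constant.
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    r = cov / math.sqrt(var_o * var_p) if var_o > 0 and var_p > 0 else float("nan")

    # Willmott's index of agreement: 1 means perfect agreement.
    denom = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2 for o, p in zip(obs, pred))
    ia = 1.0 - sq_err / denom if denom > 0 else float("nan")

    return {"MBE": mbe, "MAE": mae, "RMSE": rmse, "FACT2": fact2, "R": r, "IA": ia}


# Made-up hourly PM10 values, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 100.0]
metrics = evaluate(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions, lower MAE and RMSE and an MBE closer to zero indicate smaller errors, while FACT2, R, and IA closer to 1 indicate closer agreement between predicted and observed concentrations.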