A New Approach for Estimation of Fine Particulate Concentrations Using Satellite Aerosol Optical Depth and Binning of Meteorological Variables

Fine particulate matter (PM2.5) has recently gained attention worldwide as a cause of severe respiratory and cardiovascular diseases, but point-based ground monitoring stations are inadequate for understanding the spatial distribution of PM2.5 over complex urban surfaces. In this study, a new approach is introduced for prediction of PM2.5 which uses satellite aerosol optical depth (AOD) and binning of meteorological variables. AOD from the MODerate resolution Imaging Spectroradiometer (MODIS) Collection 6 (C006) aerosol products (MOD04_3K Dark-Target (DT) at 3 km, MOD04 DT at 10 km, and MOD04 Deep-Blue (DB) at 10 km spatial resolution) and from the Simplified Aerosol Retrieval Algorithm (SARA) at 500 m resolution was obtained for Hong Kong and the industrialized Pearl River Delta (PRD) region. The SARA AOD at 500 m alone achieved a higher correlation (R = 0.72) with PM2.5 concentrations than the MODIS C6 DT AOD at 3 km (R = 0.60), the DT AOD at 10 km (R = 0.61), and the DB AOD at 10 km (R = 0.51). The SARA binning model ([PM2.5] = 110.5 [AOD] + 12.56) was developed using SARA AOD and binning of surface pressure (996–1010 hPa). This model exhibits good correlation, an accurate slope, a low intercept, and low errors, and it accurately represents the spatial distribution of PM2.5 at 500 m resolution over urban areas. Overall, the prediction power of the SARA binning model is much better than that of previous models reported for Hong Kong and East Asia, which indicates the potential value of applying meteorologically specific empirical models and incorporating boundary layer height in operational PM2.5 forecasting from satellite AOD retrievals.


INTRODUCTION
Fine particulate matter (PM2.5) has been identified as a severe health hazard (Ward and Ayres, 2004; Bell et al., 2007; Pope et al., 2009). Studies have reported the association of PM2.5 with respiratory (Kappos et al., 2004; Götschi et al., 2008), mutagenic (Fang et al., 2002), and cardiorespiratory disease (Englert, 2004), and with mortality (Dominici et al., 2006; Gent et al., 2009). For example, an increase of 10 µg m⁻³ in PM2.5 increases the rates of cardiopulmonary diseases, lung cancer, and mortality by 4%, 6%, and 8%, respectively (Pope et al., 2009). PM2.5 can be emitted directly from natural and anthropogenic sources (El-Fadel and Hashisho, 2001; Dubovik et al., 2002; Wallace and Hobbs, 2006) or formed from gaseous precursors. In urban areas, where higher levels are recorded, PM2.5 is often associated with local emissions. Nevertheless, the PM2.5 recorded at general stations is 3 to 4 times, and at roadside stations 3 to 5 times, higher than the WHO annual air quality standard (AQS).
Compliance with health-based ambient standards requires continuous monitoring of aerosol concentrations and characteristics over both time and space. Ground-based air quality stations have been established in most large cities worldwide for measurement of PM2.5 mass with high temporal frequency (Gomišček et al., 2004; Al-Saadi et al., 2005). However, it is difficult to obtain spatial information from traditional in-situ measurements of PM2.5, especially at the fine scales required to assess variability in densely populated megacities. Advances in satellite remote sensing during the last decade have the potential to overcome such limitations, as satellite measurements of aerosols are spatially denser and have been used to estimate PM2.5 for areas with no ground-based measurements (Engel-Cox et al., 2004; Al-Saadi et al., 2005; Gupta et al., 2006). Spectral aerosol optical depth (AOD) is the most accessible, and thus the most frequently used, parameter (Clarke et al., 2001; Holben et al., 2001) in statistical models to predict PM2.5 mass concentrations. However, since AOD represents aerosol concentrations throughout a vertical column of atmosphere, its relationship to ground-level PM2.5 is affected by variability of surface contributions to vertically integrated extinction (Snider et al., 2014).
Numerous studies have shown the potential of satellite-derived AOD to represent the spatial distribution of PM2.5 at ground level, especially for annual average concentrations (Chu et al., 2003; Wang and Christopher, 2003; Engel-Cox et al., 2004; Hutchison et al., 2005; Gupta et al., 2006; Koelemeijer et al., 2006; Gupta et al., 2007; Kumar et al., 2007; Liu et al., 2007; Gupta and Christopher, 2009; Tian and Chen, 2010; Ma et al., 2014; Zou et al., 2015a, b). Recent studies have retrieved AOD from both passive (i.e., MODIS, MISR, and SeaWiFS) and active (i.e., CALIPSO) sensors, and simulated AOD from GEOS-Chem, for estimation of regional and global PM2.5 concentrations using empirical linear regression models and the Geographically Weighted Regression (GWR) model (Saunders et al., 2014; Toth et al., 2014; Xin et al., 2014; Geng et al., 2015; van Donkelaar et al., 2016; Ma et al., 2016; Zou et al., 2016). However, the statistical relationship between AOD and PM2.5 appears to vary with respect to land cover type, season, the AOD retrieval algorithm used, and its spatial resolution. For example, Gupta et al. (2006) reported a correlation of R = 0.60 in New York, 0.14 in Switzerland, 0.40 in Hong Kong, 0.41 in Delhi, and 0.35 in Sydney using the same AOD retrieval algorithm. Kumar et al. (2007) increased the correlation between AOD and PM2.5 from 0.67 to 0.87 by improving the spatial resolution from 10 km to 5 km. The AOD-PM2.5 relationship can also be influenced by local meteorological variables, and these can be used as additional predictors (Gupta and Christopher, 2009; Tian and Chen, 2010). For example, Koelemeijer et al. (2006) and Tsai et al. (2011) found that the relationship between AOD and PM2.5 was significantly improved when AOD was divided by the mixing layer height, since it is assumed that higher particulate concentrations will be found near ground level.
Monitoring and understanding the temporal variability of atmospheric aerosols at local scales over complex urban terrain such as Hong Kong requires aerosol retrieval algorithms that support high spatial resolution. Recently, a Simplified Aerosol Retrieval Algorithm (SARA) was developed to retrieve AOD from MODerate resolution Imaging Spectroradiometer (MODIS) swath data products at 500 m spatial resolution without constructing a comprehensive look-up-table (LUT) (Bilal et al., 2013; Bilal and Nichol, 2015). SARA has been tested over the complex, hilly surfaces of Hong Kong (Bilal et al., 2013), a coastal sub-tropical city with mixed aerosol types and moderate aerosol loadings, and over Beijing (Bilal et al., 2014; Bilal and Nichol, 2015), a city with sparse vegetation cover and high aerosol loadings with a greater influence of dust storms. In Hong Kong, PM2.5 has also been estimated from surface meteorological variables alone (Shi et al., 2012), or by using only AOD from MODIS at 10 km (Gupta et al., 2006) or 500 m (Wong et al., 2011) resolution. These studies are limited, as the influence of meteorological variables on the relationship between AOD and PM2.5 was not investigated. This study aims to (i) develop and evaluate a new statistical approach for detailed monitoring of PM2.5 in Hong Kong and the Pearl River Delta (PRD) region based on the SARA AOD at 500 m resolution along with bins of meteorological variables, and (ii) compare the SARA binning model with existing PM2.5 prediction models in Hong Kong.

STUDY AREA AND DATASETS
The Pearl River Delta (PRD) is one of the world's fastest developing regions, located in southern China and covering a land area of 42794 km² with a population of over 40 million. The PRD is facing serious air pollution problems due to increases in anthropogenic emission activities (Cao et al., 2003; Cao et al., 2004; Ansmann et al., 2005; Hagler et al., 2006), including manufacturing, power plants, and shipping (Streets et al., 2006). The PRD covers the major urban areas of Guangdong Province, as well as the Special Administrative Regions of Macau and Hong Kong. Hong Kong (Fig. 1) is situated on complex and hilly terrain on the coast of southeast China, with an area of 1104 km² and a highest elevation of 957 m above sea level. It has a humid subtropical climate with mean annual rainfall from 1400 mm to 3000 mm. Hong Kong and the PRD mega-region have been experiencing visibility and air quality problems due to PM2.5, as have many other Asian cities (Chan and Yao, 2008).
In this study, hourly concentrations of surface-level PM2.5 (µg m⁻³) were obtained from the Hong Kong Environmental Protection Department for five general air quality stations (Fig. 1): Central (Central Business District), Tsuen Wan (urban commercial and residential area), Tung Chung (new commuter town), Yuen Long (new commuter town), and Tap Mun (remote rural area), for the years 2007 to 2009. To identify the influence of meteorology on PM2.5, ground-based hourly meteorological variables including surface temperature (STEMP), surface relative humidity (SRH), surface wind direction (SWD), and surface wind speed (SWS) were collected from the Hong Kong Observatory's five Automatic Weather Stations (AWS) nearest to the PM2.5 ground stations. These are the Hong Kong Observatory (HKO) at 2.8 km from Central, Tai Mo Shan (TMS) at 5.2 km from Tsuen Wan, Sha Lo Wan (SLW) at 4.1 km from Tung Chung, Wetland Park (WLP) at 3.3 km from Yuen Long, and Tap Mun (TM) at 0.0 km (Fig. 1). Hourly meteorological variables including 2 m temperature (WTEMP), 2 m relative humidity (WRH), 2 m specific humidity (WSH), 10 m wind speed (WWS), surface pressure (WPSFC), and planetary boundary layer height (WPBLH) were output from the Weather Research and Forecasting (WRF) model on the same 500 m resolution grid as the SARA AOD retrieval. WRF was configured for this high-resolution application with 35 vertical levels, Lin et al.'s (1983) microphysics, the Noah land surface model (Ek et al., 2003), and the Yonsei University PBL scheme (Hong, 2010), with results extracted from the finest of four nested regional grids. Global Forecast System (GFS) data were used for WRF boundary conditions and as the initialization dataset for the WRF Preprocessing System (WPS). For detailed spatial monitoring of PM2.5 at 500 m resolution, AOD at 500 m resolution from the SARA algorithm (Bilal et al., 2013; Bilal et al., 2014; Bilal and Nichol, 2015) was retrieved over Hong Kong and the PRD region. The SARA algorithm performs AOD retrievals over land under sunny skies based on three assumptions: (i) the surface is Lambertian, (ii) the single scattering approximation holds, and (iii) the single scattering albedo (ω₀) and asymmetry parameter (g) remain spatially constant for a day of retrieval. Its limitation is the requirement of Sun photometer AOD as an input to perform the retrieval. For comparison purposes, the operational MODIS level 2 aerosol products, namely the Collection 6 (C6) Dark-Target (DT) AOD at 3 km (Remer et al., 2013), the C6 DT AOD at 10 km (Levy et al., 2013), and the C6 Deep-Blue (DB) AOD at 10 km resolution (Hsu et al., 2013), were obtained from the MODIS Level 1 and Atmosphere Archive and Distribution System (http://ladsweb.nascom.nasa.gov).

RESEARCH METHODOLOGY
This study develops a new approach for modeling of PM2.5 using SARA-retrieved AOD at 500 m spatial resolution and binning of meteorological variables obtained from HKO and the WRF model at 500 m spatial resolution. The prediction model is developed as shown in Fig. 2. In Step 1, the relationship between PM2.5 from ground stations and four aerosol products (the SARA-retrieved AOD at 500 m, MOD04_3K C6 DT AOD at 3 km, MOD04 C6 DT AOD at 10 km, and MOD04 C6 DB AOD at 10 km) was investigated for the autumn and winter seasons of 2007 and 2008 (Fig. 3) to identify the best satellite aerosol product for reliable prediction of PM2.5 over urban and rural areas of Hong Kong. Autumn and winter were used because higher pollution occurs and more collocations were available during these months, whereas fewer collocations were available during spring and summer due to cloud contamination. Satellite AOD observations are available around 10:30 a.m. local time, while PM2.5 measurements are obtained throughout the day. In order to understand the variability of the AOD-PM2.5 relationship, the PM2.5 data were averaged over the time window 09-12 hr, which encompasses the time at which MODIS Terra passes over Hong Kong. In order to increase the number of statistical samples and also to account for the spatial variability imposed by local atmospheric dispersion, the satellite AOD observations were extracted from a spatial subset region of 3 × 3 pixels (average of 9 pixels) centered on the air quality station, and the relationship with PM2.5 was examined using orthogonal (Deming) regression (Deming, 1943).
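The collocation and regression in Step 1 can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the use of `None` for missing pixels, and the choice `delta = 1.0` (equal error variances, i.e. orthogonal regression) are assumptions made here for the example.

```python
import math

def window_mean(grid, row, col, half=1):
    """Mean of the valid (non-None) AOD pixels in a (2*half+1) x (2*half+1)
    window centred on (row, col); half=1 gives the 3 x 3 subset.
    Returns None when no valid pixel is found."""
    vals = []
    for r in range(row - half, row + half + 1):
        for c in range(col - half, col + half + 1):
            inside = 0 <= r < len(grid) and 0 <= c < len(grid[0])
            if inside and grid[r][c] is not None:
                vals.append(grid[r][c])
    return sum(vals) / len(vals) if vals else None

def deming_fit(x, y, delta=1.0):
    """Deming regression y = slope * x + intercept.
    delta is the assumed ratio of the y-error variance to the x-error
    variance; delta = 1.0 corresponds to orthogonal regression.
    Requires a non-zero sample covariance between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    d = syy - delta * sxx
    slope = (d + math.sqrt(d * d + 4.0 * delta * sxy * sxy)) / (2.0 * sxy)
    return slope, my - slope * mx

# Hypothetical 3 x 3 AOD subset around a station, one cloudy pixel (None).
aod_subset = [[0.1, 0.2, 0.3],
              [0.4, None, 0.6],
              [0.7, 0.8, 0.9]]
station_aod = window_mean(aod_subset, 1, 1)
```

Because AOD and PM2.5 have very different numerical ranges, the fitted slope depends on the assumed `delta`; the value used in the study is not stated in the text.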
In Step 2, the binning approach was used to achieve a higher correlation between PM2.5 and AOD and for accurate estimation of PM2.5. In order to develop PM2.5 prediction models based on the SARA AOD and bins of meteorological variables, ten bins out of seventy-six (Appendix-A) were selected, namely those for which a "significant correlation coefficient" between SARA AOD and PM2.5, a "sufficient number of observations", and "sufficient PM2.5 measurements above Hong Kong's proposed 24-hr AQO" (to include higher PM2.5 concentrations in the dataset associated with a particular bin) were available. These PM2.5 prediction models (Eqs. (1)-(10)) were developed for specific meteorological conditions using continuous datasets from autumn and winter of 2007 and 2008, but can be applied to both specific and all available meteorological conditions.
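The three selection criteria in Step 2 can be expressed as a simple screening function. The sketch below is illustrative only: the text does not state numerical thresholds for "significant" correlation or "sufficient" sample counts, so `min_r`, `min_n`, and `min_exceed` are hypothetical values, as are the record field names. The 75 µg m⁻³ default is the 24-hr objective quoted elsewhere in the paper.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient, or None when it is undefined."""
    n = len(x)
    if n < 2:
        return None
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def screen_bin(records, var, lo, hi,
               min_r=0.7, min_n=30, aqo=75.0, min_exceed=3):
    """Evaluate one bin lo <= var <= hi of a meteorological variable.
    records: list of dicts with keys 'aod', 'pm25' and the variable name.
    min_r, min_n and min_exceed are assumed thresholds, not from the paper."""
    subset = [rec for rec in records if lo <= rec[var] <= hi]
    r = pearson_r([rec["aod"] for rec in subset],
                  [rec["pm25"] for rec in subset])
    exceed = sum(1 for rec in subset if rec["pm25"] > aqo)
    keep = (r is not None and r >= min_r
            and len(subset) >= min_n and exceed >= min_exceed)
    return {"n": len(subset), "r": r, "exceed": exceed, "keep": keep}
```

Running `screen_bin` over every candidate bin and keeping those with `keep == True` mirrors the reduction from seventy-six candidate bins to ten described above, under whatever thresholds are actually chosen.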
In Step 3, the prediction models from Step 2 based on SARA-retrieved AOD at 500 m spatial resolution were used to predict PM2.5 under all meteorological conditions for the full year 2009, and validation was conducted using the ground-based observed PM2.5 (Fig. 4). The best PM2.5 prediction model was selected based on statistical parameters such as "slope", "root mean square error (RMSE)", "mean absolute error (MAE)", "mean", and "standard deviation (StDev)". Data from the Tap Mun air quality station were not considered in development of the PM2.5 prediction model due to unavailability of corresponding SRH data. Therefore, the Tap Mun PM2.5 station was used for validation only. As a comparison, a previous PM2.5 prediction model ([PM2.5] = 63.66 [AOD] + 26.56) developed for Hong Kong by Wong et al. (2011) was tested and validated using the Tap Mun data (Table 1).

Relationship between PM2.5 and Satellite Aerosol Products
The number of satellite AOD observations varies from one air quality station to another in Hong Kong, and fewer were available during spring and summer due to cloud cover. Therefore, all AOD products (SARA, MOD04_3K DT, MOD04 DT, and MOD04 DB) were tested against PM2.5 measurements from the five air quality stations for autumn and winter of 2007 and 2008 (Fig. 3). A stronger relationship was found with SARA AOD (R = 0.72) than with the DT (R = 0.60 at 3 km and R = 0.61 at 10 km) and DB (R = 0.51) AOD. The substantially higher correlation for SARA than for MOD04 C6 (Fig. 3) suggests that the SARA AOD is better able to monitor PM2.5 concentrations over the mixed surfaces of Hong Kong. Due to the better relationship between SARA AOD and PM2.5, only SARA was used in the subsequent analyses.

Binning Approach for Accurate Estimation of PM2.5 over Urban Areas
A binning approach based on bins of meteorological variables can be used to achieve a higher correlation between AOD and PM2.5 through meteorologically specific refinement of the regression coefficients of the PM2.5 prediction models. For the binning approach, equations are developed for bins of each meteorological variable as listed in Appendix-A. Bins are determined for specific ranges of each meteorological variable that have a sufficient number of PM2.5 and AOD samples and in which at least a few PM2.5 measurements exceed Hong Kong's 24-hr AQ objective (75 µg m⁻³). Thus multiple equations based on each meteorological variable can be tested for accurate prediction of PM2.5 concentrations within or outside the AQ objective. Eqs. (1)-(10), one for each selected bin, were validated against ground-based observations (Fig. 4). Binning of multiple variables may also improve the correlations. For example, the AOD-PM2.5 correlation increased from 0.88 to 0.99 when Eq. (10) was sorted for both SRH = 47-79% and SWS = 2.53-3.40 m s⁻¹ (only 10 such measurements were available in the dataset, therefore the equation is not given).

Fig. 4. Validation of PM2.5 predicted by (a) Eq. (1), (b) Eq. (2), (c) Eq. (3), (d) Eq. (4), (e) Eq. (5), (f) Eq. (6), (g) Eq. (7), (h) Eq. (8), (i) Eq. (9), and (j) Eq. (10) using ground-based observed PM2.5 at four urban/suburban air quality stations in Hong Kong for the year 2009.
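Binning on two variables at once, as in the SRH and SWS example in this section, amounts to intersecting the range filters. The following sketch assumes records stored as dictionaries with hypothetical keys `srh` and `sws`; the minimum sample count `min_n` is likewise an assumed value, included only to show why a bin with 10 samples might not be reported as an equation.

```python
def select(records, bins):
    """Keep the records that fall inside every inclusive (lo, hi) range.
    bins maps a variable name to its (lo, hi) range."""
    return [rec for rec in records
            if all(lo <= rec[name] <= hi for name, (lo, hi) in bins.items())]

def enough_samples(subset, min_n=30):
    """True when the subset is large enough to fit a model (assumed cutoff)."""
    return len(subset) >= min_n

# Ranges quoted in the text for the two-variable example.
two_variable_bin = {"srh": (47.0, 79.0), "sws": (2.53, 3.40)}

# Hypothetical collocated records (only the binning variables are shown).
records = [
    {"srh": 50.0, "sws": 3.00},   # inside both ranges
    {"srh": 80.0, "sws": 3.00},   # humidity too high
    {"srh": 60.0, "sws": 2.00},   # wind speed too low
    {"srh": 47.0, "sws": 3.40},   # on the inclusive boundaries
]
subset = select(records, two_variable_bin)
```

Each added variable shrinks the subset, so the gain in correlation has to be weighed against the loss of samples.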
The best PM2.5 prediction model has accurate regression coefficients (slope and intercept) and can be applied to specific as well as all meteorological conditions. Eq. (8) ([PM2.5] = 110.5 [SARA AOD] + 12.56), based on the bin of low surface pressure (WPSFC = 996-1010 hPa), is statistically better (i.e., it has an acceptable slope and a smaller intercept) than the other equations. Generally, low surface pressure is associated with moist, warm air rising slowly above the surface along with pollutants, which gives a degree of vertical mixing of pollutants throughout the atmospheric column. Since PM2.5 is measured near the surface, whereas AOD represents the whole-column distribution of aerosols, the increased vertical distribution of pollutants under low surface pressure has the potential to increase the correlation between PM2.5 and AOD. In this study, an increase from 0.72 (Fig. 3(a)) to 0.86 (Eq. (8)) was observed.
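As a worked illustration of Eq. (8), the sketch below applies the published coefficients to single AOD values and to a small grid. The coefficients and the pressure range come from the text; the function names and the use of `None` for pixels without a retrieval are assumptions for this example.

```python
SLOPE = 110.5                        # µg m^-3 per unit AOD, Eq. (8)
INTERCEPT = 12.56                    # µg m^-3, Eq. (8)
PRESSURE_BIN_HPA = (996.0, 1010.0)   # WPSFC bin used to derive Eq. (8)

def predict_pm25(aod):
    """PM2.5 (µg m^-3) predicted from SARA AOD with Eq. (8)."""
    return SLOPE * aod + INTERCEPT

def aod_for_pm25(pm25):
    """Invert Eq. (8): the AOD at which a given PM2.5 level is predicted."""
    return (pm25 - INTERCEPT) / SLOPE

def in_pressure_bin(p_hpa):
    """True when surface pressure lies in the bin used to derive Eq. (8)."""
    lo, hi = PRESSURE_BIN_HPA
    return lo <= p_hpa <= hi

def predict_grid(aod_grid):
    """Apply Eq. (8) pixel by pixel; missing pixels (None) stay missing."""
    return [[None if a is None else predict_pm25(a) for a in row]
            for row in aod_grid]

pm_at_half = predict_pm25(0.5)       # 110.5 * 0.5 + 12.56, about 67.8
aod_at_aqo = aod_for_pm25(75.0)      # about 0.565
```

Under this model, an AOD of roughly 0.57 corresponds to the 75 µg m⁻³ objective, and the prediction can never fall below the intercept of 12.56 µg m⁻³.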

Validation of PM2.5 Predicted by the SARA Binning Model
Validation of the PM2.5 prediction models (Fig. 4) using data from 2009 shows a good correlation (R) between predicted and observed PM2.5 mass concentrations, but often with large under- or overestimations. As expected, the SARA binning model (Eq. (8)) was the best of the 10 models for accurate prediction of PM2.5, having an accurate regression slope (1.01) against ground-based measurements, low RMSE (12.14 µg m⁻³) and MAE (9.44 µg m⁻³), and comparable descriptive statistics. The results also suggest that the bin method can improve the predictive power of a regression model based on satellite-retrieved AOD for accurate prediction of PM2.5. Since low surface pressure, on which the SARA binning model is based, is accompanied by a wide range of other meteorological conditions, this may explain the significant improvement in the correlation when the SARA binning model is validated with data representing all meteorological conditions.
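The error and descriptive statistics named in this validation can be computed as in the sketch below. It is a generic illustration with hypothetical function and key names; it uses the sample standard deviation, which is an assumption, and it omits the regression slope.

```python
import math
import statistics

def validation_stats(observed, predicted):
    """RMSE, MAE and descriptive statistics for paired PM2.5 series.
    Errors are defined as predicted minus observed."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    errors = [p - o for o, p in zip(observed, predicted)]
    n = len(errors)
    return {
        "n": n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "mae": sum(abs(e) for e in errors) / n,
        "mean_obs": statistics.mean(observed),
        "mean_pred": statistics.mean(predicted),
        "stdev_obs": statistics.stdev(observed),    # sample standard deviation
        "stdev_pred": statistics.stdev(predicted),
    }

# Hypothetical values in µg m^-3, for illustration only.
stats = validation_stats([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
```

RMSE weights large misses more heavily than MAE, so reporting both, as the paper does, helps separate a few large errors from a uniform offset.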

Spatial Distribution of PM2.5 over Hong Kong and the PRD Region: An Example of a High Pollution Episode
In order to illustrate the applicability of the SARA binning model to the Hong Kong and PRD region, PM2.5 was retrieved at 500 m resolution for a high pollution episode on 4th December 2007. Fig. 5(a) shows higher PM2.5 concentrations over the eastern PRD region as well as over Hong Kong International Airport (HKIA) and the dense urban areas. Fig. 5(b) suggests that higher PM2.5 concentrations in Hong Kong are present over lower elevations (< 100 m) covering all types of land use, including urban, suburban, and rural regions. Since this distribution is independent of land use, it suggests a major contribution from non-local sources of fine particulates in this high pollution event. A maximum PM2.5 concentration of 92 µg m⁻³ was observed over the urban areas and the HKIA, which is higher than Hong Kong's air quality standard (75 µg m⁻³) and almost four times the WHO standard (25 µg m⁻³). Table 2 compares SARA-predicted PM2.5 concentrations with actual measurements from the five PM2.5 stations. The average difference in concentration is 12.88 µg m⁻³, which is almost the same as the intercept of the SARA binning model.

Comparison between the SARA Binning Model and the Existing PM2.5 Model over Hong Kong
The PM2.5 concentrations predicted using the SARA binning model agree better with ground-based PM2.5 concentrations (R = 0.78, slope = 0.87, and N = 109), with a regression line closer to the 1:1 line (Fig. 6), than those from the model developed by Wong et al. (2011) (R = 0.78, slope = 0.50, and N = 109). The SARA-predicted PM2.5 is comparable to the observed PM2.5 at the Tap Mun air quality station (Table 1), and the SARA binning model has a much lower intercept (12.56) than Wong et al.'s model, enabling accurate prediction of lower PM2.5 concentrations (Fig. 6(a)). However, the lower intercept is still a source of error, as the SARA binning model cannot predict PM2.5 concentrations lower than 12.56 µg m⁻³; even if there are no aerosols in the atmosphere, the model will still show an error of 12.56 µg m⁻³. Wong et al.'s model, on the other hand, overestimates at lower concentrations (Fig. 6(b)), as seen by the large intercept (26.56). Similarly, the SARA binning model can predict high PM2.5 concentrations more accurately than Wong et al.'s model because its slope is closer to unity. The results demonstrate that the SARA binning model is efficient and more capable than Wong et al.'s model for accurate prediction of PM2.5 concentrations greater than 12.56 µg m⁻³ over the urban and rural surfaces of Hong Kong.

SUMMARY AND CONCLUSIONS
The primary objective of this study was to devise a new approach for predicting and monitoring fine particulates (PM2.5) in Hong Kong and the Pearl River Delta (PRD) region using high resolution aerosol optical depth (AOD) and binning of meteorological variables. AOD from the Simplified Aerosol Retrieval Algorithm (SARA), a high resolution (500 m) MODIS retrieval, was used with Deming regression to develop PM2.5 prediction models at 500 m resolution at four urban/suburban air quality stations in Hong Kong for spatio-temporal monitoring of PM2.5. The results indicated that the SARA-retrieved AOD at 500 m resolution has a better correlation (R = 0.72) with the PM2.5 concentrations at five air quality stations located in different land cover types in Hong Kong than the MODIS C6 DB AOD at 10 km resolution (R = 0.51) and the DT AOD at 3 km (R = 0.60) and 10 km resolution (R = 0.61). The C6 algorithms are unable to retrieve accurate AOD over mixed surfaces due to errors in surface reflectance estimation and the aerosol model used. The correlation observed for MOD04 C6 in this study is similar to that for several previous studies based on MOD04 C5 but significantly lower than for the SARA 500 m AOD algorithm. The correlation between satellite AOD and PM2.5 appears to depend on the AOD retrieval algorithm and also on its spatial resolution, which supports the findings of Kumar et al. (2007). In this study, the correlation achieved between the SARA AOD at 500 m resolution and PM2.5 is 15% to 29% higher than for the MODIS DT and DB aerosol products, respectively. The SARA binning model based on the SARA AOD and the bin of low surface pressure (WPSFC = 996-1010 hPa) was developed for accurate prediction of PM2.5. The correlation achieved for the SARA binning model is significantly higher than results for other studies in Hong Kong (R = 0.40, Gupta et al. (2006)) and the USA (R = 0.63, Engel-Cox et al. (2004); R = 0.70, Wang and Christopher (2003); R = 0.62, Gupta and Christopher (2008)). This indicates that the SARA binning model is more reliable under all climatic conditions than other satellite-based PM2.5 retrieval methods.
This study indicates that binning of the meteorological variables is able to further refine the coefficients of the AOD regression model for greater accuracy, and a maximum correlation of 0.99 between AOD and PM2.5 was obtained for defined conditions of boundary layer height, relative humidity, and wind speed. The results demonstrate that the SARA binning model is better than existing AOD models for detailed monitoring of PM2.5 concentrations over the dense and congested areas of Hong Kong as well as over the industrialized regions of the PRD, and is robust under low to high PM2.5 concentrations. The significant improvement observed with binning of meteorological variables may be applied using look-up-tables (Appendix-A) of SARA AOD and the meteorological conditions prevalent at the image time, for higher accuracy of the retrievals. The SARA binning model can be applied to regions other than Hong Kong if they have similar air quality conditions. For those regions with different air quality conditions, the demonstrated binning approach can be used to develop a region-specific model for accurate prediction of PM2.5.

Fig. 1. Study area and locations of ground-based air quality stations (triangles) and meteorological stations (rounded squares) in the complex terrain of Hong Kong.

Fig. 2. Development of the SARA binning model for prediction of PM2.5.

Fig. 5. Spatial distribution of SARA-predicted PM2.5 during a high pollution episode (4th December 2007) over (a) the Pearl River Delta (PRD) region and (b) Hong Kong.

Fig. 6. Validation of (a) the SARA binning model and (b) the Wong et al. (2011) PM2.5 prediction model at Tap Mun air quality station in Hong Kong for 2007 to 2009.

Table 1. Statistics of SARA and Wong et al. (2011) predicted PM2.5 concentrations at Tap Mun air quality station for 2007 to 2009.

Table 2. Comparison between SARA-predicted PM2.5 and observed PM2.5 at five air quality stations during the high pollution episode (4th December 2007).