Statistical Downscaling of Air Dispersion Model Using Neural Network for Delhi

Statistical downscaling methods are used to extract high-resolution information from coarse-resolution models. The accuracy of a modelling system in analyzing continuous or accidental releases into the atmosphere is important, especially when adverse health effects are expected. Air quality forecasting is commonly performed with either deterministic or statistical models. In this study, a statistical downscaling approach is investigated for hourly PM10 (particulate matter with aerodynamic diameter < 10 μm) concentrations in Delhi. Statistical downscaling is applied to an air dispersion model using a neural network technique. The air dispersion model is based on an analytical solution of the advection-diffusion equation with Neumann boundary conditions for a bounded domain. Power laws are assumed for the height-dependent wind speed, and the downwind and vertical eddy diffusivities are treated as explicit functions of downwind distance and vertical height. The concentration predicted by the dispersion model, together with meteorological variables, is used as input to the neural network. The performance of both the air dispersion model and the "pure" statistical model is found to be inferior to that of the statistical downscaled model. In particular, statistical downscaling reduces the root mean square error (RMSE) of the deterministic model by at least 35% and 45% for hourly and rush-hour particulate matter concentrations, respectively. In addition, the forecast errors of the statistical downscaled model are reduced by at least 30% for stable and unstable-neutral atmospheric conditions.


INTRODUCTION
The atmospheric diffusion equation (Seinfeld, 1986) has long been used to describe the dispersion of airborne pollutants in a turbulent atmosphere. Analytical solutions of this equation were the first, and remain a convenient, way of modelling air pollution (Demuth, 1978). Air dispersion models based on analytical solutions possess several advantages over numerical models, because all the influencing parameters are explicitly expressed in a mathematically closed form. Analytical models are also useful in examining the accuracy and performance of numerical models. In practice, most of the operational models for estimating the dispersion of gases and particles in the atmospheric boundary layer are based on the Gaussian approach, which assumes that wind speed and turbulent eddies are constant with height. However, a non-Gaussian model agreed better with observed data than a Gaussian model (Hinrichsen, 1986).
† Now at University of California, Davis, USA. ‡ Now at The Northcap University, Gurgaon, Haryana, India.
Several efforts have also been made to develop non-Gaussian models for point and line sources, since observational studies show that wind speed and eddy diffusivity vary with height above the ground (Stull, 1988). Analytical solutions of the advection-diffusion equation, with wind speed and vertical eddy diffusivity both as power functions of vertical height, are well known for point and line sources bounded by the atmospheric boundary layer (ABL) (Seinfeld, 1986; Lin and Hildemann, 1996). Taylor's (1921) analysis and statistical theory suggest that the eddy diffusivity depends on the downwind distance from the source (Arya, 1995). The advection-diffusion equation has also been solved analytically with wind speed as a function of height and eddy diffusivity as a function of downwind distance from the source (Sharan and Modani, 2006). In general, therefore, the eddy diffusivity should be a function of both the height above the ground and the downwind distance from the source (Mooney and Wilson, 1993). More recently, Sharan and Kumar (2009) formulated the advection-diffusion equation with wind speed as a function of vertical height and vertical eddy diffusivity as a function of both vertical height and downwind distance, applicable only to a point source release with a reflecting boundary condition. However, downwind eddy diffusivity, which is important in low wind conditions, was not considered.
Statistical techniques are also used in air quality modeling studies in different parts of the world (McCollister and Willson, 1975; Aron and Aron, 1978; Lin, 1982; Aron, 1984; Katsoulis, 1988; Robeson and Steyn, 1990; Milionis and Davies, 1994; Perez, 2001; Chelani et al., 2002; Nunnari et al., 2004; Kurt and Oktay, 2010). The results of statistical models improve further when they are combined with principal component analysis (PCA) (Kumar and Goyal, 2011). During the last decade, neural network based models have also been applied to predict pollutant concentrations (Gardner and Dorling, 1999; Kolemainen et al., 2001; Chattopadhyay and Chattopadhyay, 2011). Models based on neural networks have been found to perform well in comparison with traditional statistical models such as regression models. They provide a better alternative to statistical models because of their computational efficiency and generalization ability, and they can handle high-dimensional data. Kumar and Goyal (2013) further improved the neural network model by using the PCA technique, which also reduces the dimension of the predictors.
In general, deterministic and statistical models are applied separately in the processing of environmental data. A major advantage of forecasts based on deterministic models is uniform spatial coverage, while statistical models are most representative at a given measurement site. Different downscaling techniques have been developed as tools for interpolating large-scale information into local or regional variables. Statistical downscaling is a popular approach that identifies a stable relationship between one or several large-scale meteorological variables (predictors) and local variables (Liu and Fan, 2013). Downscaling is a technique used to extract high-resolution information from large- or regional-scale variables produced by numerical models. Many downscaling methods are available, but they pertain mostly to the climate modeling community. Statistical downscaling is based on developing a statistical relationship between observed small-scale variables (predictands) and large-scale variables (predictors) from a numerical model. The main advantage of statistical downscaling methods is that they are computationally inexpensive and therefore appropriate when computational resources are limited (Wilby et al., 2004). There is an advantage in applying statistical downscaling to coarse-resolution air quality models instead of using high-resolution chemical transport models (CTMs): downscaling combines the benefits of both statistical methods and CTMs. Statistical downscaling provides point-specific forecasts that account for physical and chemical processes at a scale unavailable even to a fine-scale CTM because of the local characteristics of the station, whereas CTMs show skill in the highly non-linear modelling of chemistry and transport at a more regional scale (Alkuwari et al., 2013). Downscaling methods can be applied in air quality management. Indeed, one can downscale CTM forecasts to predict local air quality by using the relationship between the model and observations estimated from historical data. Few studies on downscaling air quality models have been published in the literature (Guillas et al., 2008; Berrocal et al., 2012).
The goal of this study is to investigate the prospects of the combined use of deterministic and statistical models. The idea of such a combination is to use a statistical model to correct the predictions made by a deterministic model. Pelliccioni and Tirabassi (2006) successfully coupled an air dispersion model with a neural network, but only for experimental data. A deterministic chemical transport model (CTM), CHIMERE, was also combined with a regression model for PM10 forecasting in Europe (Konovalov et al., 2009). The advantages of the combined use of deterministic and statistical methods for air quality forecasting have not yet been sufficiently investigated in India. In this study, the analytical dispersion model and a neural network (NN) model are coupled and applied to forecast hourly PM10 (respirable suspended particulate matter (RSPM)) one day ahead for Delhi. There are a number of studies related to air pollution over Delhi (Mohan and Payra, 2006; Mohan and Payra, 2014). A new methodology is adopted for training the NN: the concentration levels predicted by the analytical model are included, along with meteorological variables, as inputs to the PCA-NN. The concentration levels result from an analytical model for the dispersion of air pollutants released from point, line and area sources, which considers total reflection at the ground and at the top of the ABL (Neumann boundary condition). The analytical solution of the advection-diffusion equation for this boundary condition is derived using the separation of variables technique, with wind speed as a power-law profile of height above the ground. The downwind and vertical eddy diffusivities are treated as explicit functions of vertical height and downwind distance from the source. Including the downwind diffusivity extends the applicability of the model to low wind conditions. The emission rate of PM from various sources, namely vehicles, domestic sources, industries and power plants, has been estimated using primary and secondary data. Meteorological variables have been simulated using the Weather Research and Forecasting (WRF) model. The statistical downscaled model (air dispersion model and PCA-NN model) is evaluated at different locations against observed PM data obtained from CPCB (http://164.100.43.188/cpcbnew/movie.html).
This paper presents a discussion on the statistical downscaling of an air dispersion model using an NN model for Delhi. The second section describes the air quality modelling study of Delhi city. The methodology for coupling the dispersion model and PCA-NN is discussed in section 3. The comparison between the concentrations simulated by the statistical downscaled model and the observed concentrations of PM10 is presented in section 4. Section 5 summarizes the conclusions.

AIR QUALITY MODELLING STUDY OF DELHI CITY
Delhi is situated in the northern part of India, with the river Yamuna along the eastern boundary of the city. Delhi has a semi-arid climate with high variation between summer and winter temperatures. It lies between the Great Indian Desert (Thar Desert) of Rajasthan to the west, the central hot plains to the south and the cooler hilly region to the north and east. Because of Delhi's proximity to the Himalayas, cold waves from the Himalayan region lower temperatures across the city. The average annual rainfall is approximately 714 mm, most of which falls in the monsoon season (June, July, and August) (Economy survey of Delhi, 2008-09). The most important season in Delhi is winter, which starts in December and ends in February. This period is dominated by cold, dry air and ground-based inversions with low wind conditions (u ≤ 1 m s⁻¹), which occur frequently and increase the concentration of pollutants (Anfossi et al., 1990). The summer (March, April and May) is governed by high temperatures and high winds, while the post-monsoon season (September, October, November) has moderate temperatures and moderate wind conditions. Delhi is the capital city of India; it had 16.9 million inhabitants in 2007-08, spread over 1483 km². Due to the presence of a large number of industries and the migration of people from nearby states, nearly 5.63 million vehicles were plying Delhi's roads in 2007-08 (Economy survey of Delhi, 2008-09). At 1749 km of road length per 100 km², Delhi has one of the highest road densities in India. Delhi's high population growth, coupled with high economic growth, has resulted in ever-increasing demand for transportation and has put excessive pressure on the city's existing transport infrastructure. Like many other cities in the developing world, it faces acute transport management problems leading to air pollution, congestion, and a resultant loss of productivity.

METHODOLOGY
To understand the methodology of the proposed approach, it is useful to first give a brief description of the analytical air dispersion models used for the concentration predictions.

Background of Analytical Dispersion Models
In this study, air dispersion models have been developed for point, line, and area sources. We consider the steady-state advection-diffusion equation, Eq. (1), for the dispersion of a non-reactive contaminant released from a continuous source. In Eq. (1), x, y, and z are the coordinates in the along-wind, crosswind, and vertical directions, respectively; C is the mean concentration of the pollutant; and $U(z)$ is the mean wind speed in the downwind direction. $K_x(x, z)$, $K_y(x, z)$ and $K_z(x, z)$ are the eddy diffusivities of the pollutant in the along-wind, crosswind and vertical directions, respectively. The boundary conditions, Eqs. (2)-(6), include the Neumann (total reflection) conditions, in which h is the top of the inversion/mixed layer, and the source condition: the pollutant is released from an elevated point source located at $(0, y_s, z_s)$ with strength $Q_p$, expressed through the Dirac delta function δ.
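For reference, the governing equation described above can be written out explicitly. The block below is a sketch that assumes the standard steady-state form built from the symbols defined in the text; the total-reflection statement is the usual form of such a condition and is an assumption here, not a transcription of the authors' Eqs. (2)-(6).

```latex
% Steady-state advection-diffusion balance (assumed standard form)
U(z)\,\frac{\partial C}{\partial x}
  = \frac{\partial}{\partial x}\!\left(K_x(x,z)\,\frac{\partial C}{\partial x}\right)
  + \frac{\partial}{\partial y}\!\left(K_y(x,z)\,\frac{\partial C}{\partial y}\right)
  + \frac{\partial}{\partial z}\!\left(K_z(x,z)\,\frac{\partial C}{\partial z}\right)

% Total reflection (Neumann) at the ground and at the top of the mixed layer
% (assumed form)
K_z(x,z)\,\frac{\partial C}{\partial z} = 0
  \quad \text{at } z = 0 \text{ and } z = h
```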
The transport of a contaminant emitted from a source depends primarily on the wind speed U. Commonly used dispersion models assume the wind speed to be constant. However, it is well known that wind speed increases with height in the lower part of the atmospheric boundary layer (Arya, 1999). The height-dependent wind speed is expressed as the power law $U(z) = U(z_r)(z/z_r)^p$ (Eq. (7)), where $U(z_r)$ is the wind speed at the reference height $z_r$ and p depends on atmospheric stability. Generally, $K_z$ is parameterized as a function of the height z above the ground only (Lin and Hildemann, 1996; Park and Baik, 2008). However, Taylor's analysis and statistical theory show that the eddy diffusivity also depends on the downwind distance x from the source (Arya, 1995). Mooney and Wilson (1993) suggest a separable form, $K_z(x, z) = K'_z(z) f(x)$. Here f(x) is a dimensionless integrable function of x that corrects $K'_z(z)$ for near-source dispersion, and $K'_z(z)$ is parameterized as a power-law profile in z, $K'_z(z) = b z^{\beta}$ with $b = K'_z(z_r) z_r^{-\beta}$. The correction function is taken from Mooney and Wilson (1993) as $f(x) = x/L_1$, where $L_1$ is the along-wind length scale across which the diffusivity achieves its "far-field" value; the Lagrangian time scale $\tau_z$ at the source height $z_s$ is parameterized following Mooney and Wilson (1993) in terms of $\sigma_w$, the vertical turbulent intensity. Similarly, the eddy diffusivity in the x direction is written as $K_x(x, z) = K'_x(z) g(x)$, where g(x) is a dimensionless integrable function of x that corrects $K'_x(z)$ for near-source dispersion. $K'_x(z)$ is parameterized as a power-law profile in z, $K'_x(z) = c z^{p}$ with $c = K'_x(z_r) z_r^{-p}$. The correction function is taken as $g(x) = x/L_2$, where $L_2 = U(z_s)\tau_x(z_s)$ and the time scale $\tau_x$ at the source height is parameterized in terms of $\sigma_u$, the downwind turbulent intensity.
The forms of $K_z(x, z)$ and $K_x(x, z)$ are therefore $K_z(x, z) = K'_z(z) f(x)$ and $K_x(x, z) = K'_x(z) g(x)$, where $K'_z(z)$ and $K'_x(z)$ are the height-dependent eddy diffusivities, parameterized as power-law profiles of z: $K'_z(z) = K'_z(z_r)(z/z_r)^{\beta}$ and $K'_x(z) = K'_x(z_r)(z/z_r)^{p}$. Here $K'_z(z_r)$ and $K'_x(z_r)$ are the values of $K'_z$ and $K'_x$ at height $z = z_r$, and $f(x) = x/L_1$ and $g(x) = x/L_2$ are functions of x. The parameters p, β, $\sigma_u$ and $\sigma_w$ depend on atmospheric stability (Hanna et al., 1982). Using Taylor's hypothesis, the lateral eddy diffusivity is expressed in terms of $\sigma_y$ (Huang, 1979; Brown et al., 1997), where $\sigma_y$ is the standard deviation of the concentration distribution in the crosswind (lateral) direction. The analytical solution of Eq. (1) for the wind speed profile (Eq. (7)) and the eddy diffusivity profiles (Eqs. (8)-(12)), with the boundary conditions of Eqs. (2)-(6), is obtained using the separation of variables technique (Kumar and Goyal, 2014) and is given as Eq. (13); it involves the Bessel function of order −μ and the modified Bessel function of the second kind of order r, $K_r$. Eq. (13) gives the concentration due to an elevated point source when the wind speed is parameterized as a power-law profile of z and the downwind and vertical eddy diffusivities are functions of both x and z. The $\gamma_n$ are the zeros of an associated equation.
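The profile parameterizations above are simple enough to sketch in code. The following is a minimal illustration, assuming the power-law forms and the linear near-source corrections $f(x) = x/L_1$ and $g(x) = x/L_2$ stated in the text; the function names and all numerical values are hypothetical and are not taken from the study.

```python
def wind_speed(z, u_ref, z_ref, p):
    """Power-law wind profile: U(z) = U(z_ref) * (z / z_ref) ** p."""
    return u_ref * (z / z_ref) ** p


def eddy_diffusivity(x, z, k_ref, z_ref, exponent, length_scale):
    """Separable diffusivity K(x, z) = K'(z) * (x / L).

    K'(z) = k_ref * (z / z_ref) ** exponent is the power-law height profile,
    and x / length_scale is the linear near-source correction.
    """
    height_profile = k_ref * (z / z_ref) ** exponent
    return height_profile * (x / length_scale)


# Hypothetical illustration values (not from the study)
u_10 = wind_speed(z=10.0, u_ref=2.0, z_ref=10.0, p=0.25)
u_40 = wind_speed(z=40.0, u_ref=2.0, z_ref=10.0, p=0.25)
k_z = eddy_diffusivity(x=500.0, z=10.0, k_ref=1.5, z_ref=10.0,
                       exponent=0.5, length_scale=1000.0)
```

The same `eddy_diffusivity` helper serves for both $K_z$ (exponent β, length scale $L_1$) and $K_x$ (exponent p, length scale $L_2$), since the two share the same separable structure.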

Line Source Model is Extended from Point Source Solution
A line source can be considered as a superposition of point sources. The solution for a finite line source has been obtained by integrating the point source solution from $y_s = y_1$ to $y_2$ with unit source strength. The resulting expression involves the error function erf, defined by $\mathrm{erf}(a) = \frac{2}{\sqrt{\pi}} \int_0^a e^{-t^2}\,dt$.
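To see why the error function appears, the crosswind integration can be checked numerically. The sketch below assumes that the crosswind factor of the point source solution is Gaussian with standard deviation $\sigma_y$; this is an assumption made for illustration, and the function names and values are hypothetical.

```python
import math


def gaussian_crosswind(y, y_s, sigma_y):
    """Crosswind factor of a point source at y_s (assumed Gaussian)."""
    return math.exp(-((y - y_s) ** 2) / (2.0 * sigma_y ** 2)) / (
        math.sqrt(2.0 * math.pi) * sigma_y)


def line_crosswind_closed_form(y, y1, y2, sigma_y):
    """Integral of the Gaussian factor over y_s in [y1, y2], via erf."""
    scale = math.sqrt(2.0) * sigma_y
    return 0.5 * (math.erf((y - y1) / scale) - math.erf((y - y2) / scale))


def line_crosswind_numeric(y, y1, y2, sigma_y, n=2000):
    """Same integral by the trapezoidal rule, as a cross-check."""
    step = (y2 - y1) / n
    total = 0.5 * (gaussian_crosswind(y, y1, sigma_y)
                   + gaussian_crosswind(y, y2, sigma_y))
    for i in range(1, n):
        total += gaussian_crosswind(y, y1 + i * step, sigma_y)
    return total * step


# Hypothetical geometry: a 100 m line centred on the origin, receptor at y = 10 m
closed = line_crosswind_closed_form(y=10.0, y1=-50.0, y2=50.0, sigma_y=20.0)
numeric = line_crosswind_numeric(y=10.0, y1=-50.0, y2=50.0, sigma_y=20.0)
```

The two values agree to within the quadrature error, which is the sense in which the finite line source solution reduces to a difference of two error functions.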

Area Source Model is Extended from the Line Source Solution
For a finite area source extending from $y_1$ to $y_2$ in the crosswind direction and from $x_1$ to $x_2$ in the downwind direction, the concentration at (x, y, z) is calculated as a superposition of line sources. For a finite area source with uniform strength $Q_a$ per unit area, the solution has therefore been obtained by integrating the finite line source solution from $x_s = x_1$ to $x_2$ (Eq. (16)), where $C(x - x_s, y, z)$ is the finite line source solution obtained in section 3.1.1. The line source strength is replaced by $Q_a$ in the area source model. Eq. (16) represents the finite area source solution and has been evaluated by numerical integration. The different types of air pollution sources, namely domestic sources, industries, power plants and vehicles, are treated as point, area and line sources. The concentration of PM10 has been obtained for Dec 2008 at two locations in Delhi, Delhi College of Engineering (DCE) and the Income Tax Office (ITO), as shown in Fig. 1.
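The numerical integration over the downwind extent of the area source can be sketched as follows. This is an illustration only: `line_solution` is a hypothetical callable standing in for the unit-strength finite line source solution, the midpoint rule is one possible quadrature choice, and the sketch assumes that strips downwind of the receptor do not contribute.

```python
def area_source_concentration(line_solution, x, y, z, x1, x2, q_area, n=200):
    """Integrate a unit-strength line-source solution over x_s in [x1, x2].

    line_solution(dx, y, z) is a hypothetical callable returning the
    concentration at downwind distance dx from a unit-strength line source.
    Strips downwind of the receptor (x_s >= x) are assumed not to contribute.
    """
    upper = min(x2, x)
    if upper <= x1:
        return 0.0
    step = (upper - x1) / n
    total = 0.0
    # Midpoint rule: avoids evaluating the kernel at dx = 0.
    for i in range(n):
        x_s = x1 + (i + 0.5) * step
        total += line_solution(x - x_s, y, z)
    return q_area * total * step
```

With a constant kernel the result is simply the source strength times the upwind fetch, which gives a quick sanity check before substituting a real line source solution.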

Brief Description of the Emission and Meteorological Data
The emission rate of PM10 from domestic sources, industries, power plants and vehicles has been estimated for Delhi for the year 2008-09 over an area of 26 km × 30 km with a 2 km × 2 km grid size, using primary and secondary data. The emission of PM10 from each category of vehicle was estimated using inputs including the number of vehicles (monitored by the Central Road Research Institute (CRRI)) and the emission factor (estimated by ARAI, 2007).
From an atmospheric pollution perspective, the most important season in Delhi is winter, lasting from December to February. This period is dominated by cold, dry air and ground-based inversions with low wind conditions (< 1 m s⁻¹), which increase the ground-level concentration of pollutants. For practical reasons, December is used as representative of the winter season. The Advanced Research WRF (ARW) modelling system, version 3.1.1, is adopted in this study for simulating the hourly meteorological parameters from 1 Dec to 30 Dec 2008, with a 24 h slice interval and the two-way nesting option. ARW solves the fully compressible, Euler non-hydrostatic equations, with a run-time hydrostatic option available. The model uses a terrain-following hydrostatic pressure coordinate system with vertical grid stretching permitted (Laprise, 1992). Arakawa-C grid staggering is used for horizontal discretization. The model equations are conservative for scalar variables. A detailed description of WRF is presented in Wang et al. (2004). Computational domains of 70 × 70 × 51, 91 × 91 × 51 and 55 × 52 × 51 grid points, with horizontal resolutions of 27, 9 and 3 km, respectively, have been chosen in this study. The model is initialized by real boundary conditions using NCAR-NCEP's Final Analysis (FNL) data (NCEP-DSSI, 2005), which have a resolution of 1° × 1° (~111 km × 111 km). A ratio of 1:3 is maintained between the resolutions of the outer domain and the FNL data to ensure reliable boundary conditions for the model. The model physics options are presented in Table 1; they have been selected following Kumar and Goyal (2014). The microphysical sub-grid processes are represented by the Purdue Lin scheme (Lin et al., 1983). The Kain-Fritsch scheme (Kain and Fritsch, 1993) is used for cumulus parameterization. The Rapid Radiative Transfer Model (RRTM) long-wave radiation parameterization (Mlawer et al., 1997) and the short-wave radiation parameterization of Dudhia (1989) are used to represent the long-wave and short-wave radiation processes, respectively. The ACM2 (Pleim) PBL parameterization (Pleim, 2007) is used to represent the PBL over the domain. The land surface process is represented by the thermal diffusion scheme, and the surface layer is based on similarity theory (Janjic, 2002). Atmospheric stability, which ranges from unstable to stable, has been classified according to the temperature gradient (Tirabassi, 2010). The dispersion parameters have been calculated using the different stability parameters.

Neural Network Model
Neural network (NN) models can handle multivariable problems. As reported in the literature, NNs are an effective alternative to more traditional statistical techniques for forecasting air pollutants. An NN can be trained to approximate virtually any smooth, measurable and highly nonlinear function between input and output, and it requires no prior knowledge of the nature of this relationship (Gardner and Dorling, 1998). The most popular tool for forecasting with neural networks is the multilayer perceptron (MLP) with an error-backpropagation supervised learning rule (Rojas, 1996). It is also called the back-propagation neural network (BPNN).
A 3-layer perceptron model is used as the network architecture in the present study. The first layer contains the input variables of the network, namely the atmospheric meteorological variables: wind speed, wind direction index as defined by Lalas et al. (1982), sea level pressure, dry bulb temperature, dew point temperature, relative humidity and visibility. The second layer consists of the neurons of the hidden layer. The third layer is the output layer, and the observed concentration of PM10 is the target of the forecasting model. Neural networks can be constructed with multiple hidden layers (more than a 3-layer perceptron model), although there are usually no advantages to doing so (Tu et al., 1996). A 3-layer perceptron (one input layer, one hidden layer and one output layer) has therefore been used in the present study. The meteorological variables are used as inputs because they have sufficient correlation with the observed concentrations of PM10 at the 95% significance level. Sufficient correlation here means that the correlation between the observed concentration of PM10 and a meteorological variable exceeds the critical correlation value at the 95% significance level. The number of neurons in the hidden layer is one of the parameters to be chosen in the perceptron model; it was estimated by varying the number of neurons during the training period.
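The forward pass of such a 3-layer perceptron can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the tanh hidden activation and linear output mirror the transfer functions reported later in the results, while the function name, weights and layer sizes in the example are hypothetical.

```python
import math


def mlp_forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass of a 3-layer perceptron with one scalar output.

    Hidden units use tanh; the output unit is linear.
    """
    hidden = []
    for weights, bias in zip(hidden_weights, hidden_biases):
        activation = sum(w * v for w, v in zip(weights, inputs)) + bias
        hidden.append(math.tanh(activation))
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Hypothetical network: 7 inputs, 4 hidden neurons, 1 output
example_output = mlp_forward(
    inputs=[0.0] * 7,
    hidden_weights=[[0.1] * 7 for _ in range(4)],
    hidden_biases=[0.0] * 4,
    output_weights=[1.0] * 4,
    output_bias=5.0,
)
```

Training (for example by backpropagation) would adjust the weights and biases so that the output tracks the observed PM10 concentration; only the prediction step is shown here.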

Statistical Downscaling Modelling Approach
The real value of the proposed methodology lies in the choice of the dispersion model: including the model-predicted concentration among the inputs of the network serves a twofold purpose. The first is to start from a situation close to reality. The second, and conceptually more important, is linked to the fact that models perform well under certain hypotheses, while their performance degrades systematically when reality departs from the idealized situation. Kumar and Goyal (2013) show that the neural network is improved by including the PCA technique. In this case, the PCA-NN functions as a filter of the model, correcting it so that it gives the best reproduction of the real situation. Principal component analysis (PCA) is a statistical technique that reduces a data set containing a large number of variables to a data set containing fewer variables. These new variables are linear combinations of the original ones.

Emission Rate of PM10 Pollutant and Validation of WRF Outputs
In the present study, the emission inventory of PM10 emitted from different types of sources, viz. domestic, industrial, power plant and vehicular, has been developed for Delhi for the year 2008-09 over an area of 26 km × 30 km with a 2 km × 2 km grid resolution. The total emission rates of PM10 at DCE and ITO are 3.03 g s⁻¹ and 8.02 g s⁻¹, respectively. Road dust contributes 79.95% and 72.54% of the total emission at DCE and ITO, respectively. Vehicular sources contribute 5.63% and 26.30% of the total emission at DCE and ITO, respectively. The remaining 14.42% and 1.16% of the total emission at DCE and ITO, respectively, is contributed by domestic sources. The contribution of domestic sources is larger at DCE than at ITO because DCE is a residential area, while ITO is the major traffic intersection in Delhi. There are no power plants or industries in either of the grids.
The scatter diagram of the hourly observed surface temperature at Safdarjung Airport against the WRF-simulated temperature at 2 m is shown in Fig. A1. The computed and observed surface temperatures show a similar trend, with a correlation coefficient of about 0.89 for Dec 2008. The deviation of the simulated temperature from the observed temperature ranges between -7.12°C (10 UTC of 18 Dec) and +7.86°C (05 UTC of 30 Dec). The RMSE between the observed and simulated temperatures is 2.46°C for Dec 2008. The distribution of the temperature deviations shows that about 31%, 57%, 75% and 95% of the deviations lie within ±1°C, ±2°C, ±3°C and ±4°C, respectively, for December. A positive or negative temperature deviation can affect the stability in the analytical model, but stability is estimated from the temperature differences between levels. The resulting deviation, if any, in the predicted concentrations will be minor and limited to that particular hour of the day. Such minor deviations in hourly concentrations are not likely to affect the 24-hourly averaged concentrations in the final results significantly. It can thus be concluded that the WRF model simulates point surface temperatures over Delhi well enough for use in simulating concentrations with the analytical model.
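The validation statistics quoted above (RMSE, correlation coefficient, and the share of deviations within a tolerance) can be computed as in the following sketch. The function names are hypothetical and the sample temperatures are made up for illustration; they are not the Safdarjung or WRF data.

```python
import math


def rmse(observed, simulated):
    """Root mean square error between paired series."""
    n = len(observed)
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(observed, simulated)) / n)


def pearson_r(observed, simulated):
    """Pearson correlation coefficient between paired series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov / math.sqrt(var_o * var_s)


def fraction_within(observed, simulated, tolerance):
    """Share of records whose absolute deviation is at most `tolerance`."""
    hits = sum(1 for o, s in zip(observed, simulated) if abs(s - o) <= tolerance)
    return hits / len(observed)


# Made-up temperatures in °C, for illustration only
obs = [10.0, 12.0, 15.0, 18.0, 14.0]
sim = [11.0, 11.5, 17.5, 18.0, 10.0]
temperature_rmse = rmse(obs, sim)
share_within_1c = fraction_within(obs, sim, 1.0)
```

Calling `fraction_within` with tolerances of 1, 2, 3 and 4 °C reproduces the kind of cumulative deviation summary reported in the text.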
The hourly average angular distributions of the observed wind directions at Safdarjung Airport and the WRF-simulated wind directions at 10 m are shown in Figs. A2(a) and A2(b), respectively, for Dec 2008. The observed and WRF-simulated wind directions are shifted by 22.5 degrees relative to each other. This shift may arise from the different frequencies of the wind directions monitored by IMD and those simulated by the WRF model. The discrepancy could also be attributed to poor initial and boundary conditions, as the model is initialized by real boundary conditions using NCAR-NCEP's Final Analysis (FNL) data (NCEP-DSSI, 2005), which have a resolution of 1° × 1° (~111 km × 111 km). Out of 720 observational records of wind direction, 2.01%, 8.65%, 41.65% and 47.69% show winds prevailing from the northeast, southeast, southwest and northwest, respectively. Similarly, 2.29%, 5.35%, 12.78% and 79.58% of the simulated wind directions are from the northeast, southeast, southwest and northwest, respectively. Both the observed and the simulated outputs show the maximum percentage in the northwesterly direction and the minimum percentage in the northeasterly direction. For wind speed, the model appears to under-predict (average speed 1.57 m s⁻¹) compared with the observations (average 1.86 m s⁻¹). In addition, the model is able to capture calm winds (wind speed < 1 m s⁻¹): it shows calm wind conditions about 24% of the time, compared with 31% in the observations for December.
Overall, it can be concluded that the winds are represented well enough by WRF for use in simulating concentrations with the analytical model.
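The directional and calm-wind percentages above amount to simple binning of the hourly records. The sketch below shows one way to do it; the quadrant boundaries (0-90° as northeast, and so on) and the function names are assumptions for illustration, since the text does not state how the sectors were delimited.

```python
def wind_quadrant(direction_deg):
    """Map a wind direction (degrees clockwise from north) to a quadrant.

    Quadrant boundaries at 0, 90, 180 and 270 degrees are an assumption.
    """
    d = direction_deg % 360.0
    if d < 90.0:
        return "NE"
    if d < 180.0:
        return "SE"
    if d < 270.0:
        return "SW"
    return "NW"


def quadrant_percentages(directions_deg):
    """Percentage of records falling in each quadrant."""
    counts = {"NE": 0, "SE": 0, "SW": 0, "NW": 0}
    for d in directions_deg:
        counts[wind_quadrant(d)] += 1
    n = len(directions_deg)
    return {name: 100.0 * c / n for name, c in counts.items()}


def calm_percentage(speeds, threshold=1.0):
    """Percentage of records with wind speed below the calm threshold (m/s)."""
    return 100.0 * sum(1 for s in speeds if s < threshold) / len(speeds)
```

Applying the same functions to the observed and simulated series gives directly comparable tables of the kind summarized in the paragraph above.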

Statistical Downscaled Model Results
To assess performance, the concentrations from the analytical dispersion model, the PCA-NN model and the statistical downscaled model are evaluated at two locations, DCE and ITO. The emission rates and the WRF-simulated meteorological variables are used to estimate the concentration from the analytical dispersion model. The only inputs to the PCA-NN model are the observed meteorological variables, whereas in the statistical downscaling modelling system the concentration output by the dispersion model is used, together with the meteorological variables, as input to the PCA-NN model.
The primary aim of the present study is to forecast hourly PM10 one day ahead for Delhi using the statistical downscaled model. The hourly data of the first 25 days of Dec 2008 (600 hours, roughly 80% of the total hours) are used for training, and the hourly data of the last 5 days (120 hours, roughly 20% of the total hours) are used for evaluation. The proportions of the training and evaluation periods influence the prediction results, as shown by Pelliccioni and Tirabassi (2006). The flow of data in the statistical downscaled model is shown schematically in Fig. A3. The concentration output by the dispersion model is used, together with the meteorological variables, as input to the PCA-NN model in the statistical downscaled modelling system. PCA is a statistical technique that reduces a data set containing a large number of variables to a data set containing fewer variables. These new variables are linear combinations of the original ones, chosen to represent the maximum possible fraction of the variability contained in the original data. The difference between NN and PCA-NN is that in the NN model the input variables are used directly, whereas in the PCA-NN model the input variables are pre-processed through PCA and the resulting orthogonal variables (PCs) are used as input to the NN.
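Choosing how many principal components to keep is commonly done from the cumulative explained variance. The sketch below illustrates that selection step only, assuming the eigenvalues of the input correlation matrix are already available; the function name, the 90% default threshold and the eigenvalues shown are hypothetical illustration values, not those of Table A1.

```python
def select_components(eigenvalues, threshold=0.90):
    """Return how many leading PCs reach the cumulative variance threshold.

    Also returns the cumulative variance fraction actually reached.
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return count, cumulative / total
    return len(ordered), 1.0


# Hypothetical eigenvalues for 8 standardized input variables
n_components, explained = select_components(
    [3.2, 1.8, 1.1, 0.7, 0.5, 0.4, 0.2, 0.1])
```

The retained components, rather than the original variables, would then be passed to the network as inputs.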
After sensitivity experiments with different activation functions (tan-sigmoid, logistic and pure linear, i.e., identity), the Levenberg-Marquardt backpropagation learning algorithm with a tan-sigmoid function in the hidden layer and a linear transfer function in the output layer was found to be satisfactory. The number of neurons in the hidden layer is one of the parameters to be chosen in the perceptron model and was estimated by varying the number of neurons during training. The hourly concentration from the dispersion model and 7 meteorological variables of the previous day are used as input to PCA, which pre-processes them for the statistical downscaled model. The principal components whose cumulative variance is approximately 90% are presented in Table A1 with their eigenvalues and amounts of variance; the remaining components are ignored. Only 5 PCs are needed to reach this cumulative variance at the ITO and DCE stations, and these are used as input instead of the 8 original input variables. In the present simulation, 4 neurons in a single hidden layer were found to yield the best architecture at both the ITO and DCE stations. The hourly one-day-ahead forecast concentrations of PM10 from all three models are compared with observed values as scatter plots for the training and evaluation periods at DCE in Figs. 2(a) and 2(b), respectively. CPCB monitors PM2.5 at the ITO station, whereas the analytical dispersion model simulates PM10; PM2.5 has therefore been taken as a fraction of 0.8 of PM10 (Sengupta, 2008). The simulated concentrations of PM2.5 from all three models are compared with observed values as scatter plots for the training and evaluation periods at ITO in Figs. 3(a) and 3(b), respectively. Good agreement is found between the observed concentrations and those simulated by model 3 at both locations. For an unpaired analysis, a quantile-quantile (Q-Q) plot (Fig. 4(a) for DCE and Fig. 4(b) for ITO) is also drawn by arranging the observed and predicted concentrations separately in increasing order of magnitude. The Q-Q plot shows that the predicted concentrations from the statistical downscaled model are closer to the one-to-one line than those of the other two models at both locations.
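The unpaired comparison and the PM10-to-PM2.5 conversion described above can be sketched in a few lines. The function names and the sample concentrations are hypothetical and serve only as an illustration; the fraction 0.8 is the value quoted in the text.

```python
def pm25_from_pm10(pm10, fraction=0.8):
    """Scale modelled PM10 to PM2.5 using a constant mass fraction."""
    return [fraction * c for c in pm10]


def qq_pairs(observed, predicted):
    """Pair the i-th smallest observation with the i-th smallest prediction.
    Each series is sorted independently, so the pairing in time is dropped."""
    if len(observed) != len(predicted):
        raise ValueError("both series must have the same length")
    return list(zip(sorted(observed), sorted(predicted)))


# Hypothetical hourly values, for illustration only.
observed_pm25 = [120.0, 310.0, 95.0, 210.0]
modelled_pm10 = [400.0, 150.0, 250.0, 100.0]

modelled_pm25 = pm25_from_pm10(modelled_pm10)
pairs = qq_pairs(observed_pm25, modelled_pm25)
for obs, pred in pairs:
    print(f"{obs:7.1f} {pred:7.1f}")
```

Points of such a plot that lie near the one-to-one line indicate that the model reproduces the distribution of concentrations, even if individual hours are mismatched.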
To assess the capabilities of the proposed methodology, three models are run: model 1 uses the analytical dispersion model alone, model 2 uses only the PCA-NN model, and model 3 couples the dispersion model with PCA-NN. The means and standard deviations are (161.94, 135.05), (366.20, 121.80) and (351.24, 113.05) for models 1, 2 and 3 at DCE. Similarly, the means and standard deviations are (223.51, 120.38), (201.14, 75.72) and (206.05, 74.56) for models 1, 2 and 3 at ITO. The correlation coefficients of models 1, 2 and 3 in the evaluation period are 0.12, 0.42 and 0.48 at DCE and 0.14, 0.52 and 0.61 at ITO. A comparison of the performance of all three models is shown in Table 2, where C_o and C_p are the observed and predicted concentrations, respectively, and σ is the standard deviation. Table 2 presents the following statistical indices for the evaluation period: RMSE, normalised mean square error (NMSE), correlation coefficient (R), factor of two (FAC2), fractional bias (FB) and index of agreement (IA). The improvement of the statistical downscaled model in all statistical indices is evident at both locations. In particular, the coupled methodology reduces the error between the calculated and measured values: the RMSE of models 1, 2 and 3 at DCE is 249.61, 161.95 and 148.00, and the NMSE is 1.18, 0.22 and 0.19, respectively. The same trend in the errors is followed at ITO. The FAC2 of models 1, 2 and 3 is 35%, 72% and 75% at DCE and 74%, 85% and 89% at ITO. The FB suggests that the analytical model under-predicts at both locations, whereas models 2 and 3 under-predict at ITO and over-predict at DCE. Table 2 also shows that the IA of models 1, 2 and 3 is 0.43, 0.55 and 0.59 at DCE and 0.47, 0.61 and 0.67 at ITO. This statistical index also reveals that model 3 performs better than models 1 and 2. Based on the above statistical analysis, we conclude that the performance of both the air dispersion model and the "pure" statistical model is inferior to that of the statistical downscaled model. In particular, the root mean square error of the deterministic forecasts is reduced, on average, by at least 35 percent.
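For readers who want to reproduce indices of this kind, the sketch below computes them with definitions that are common in air quality model evaluation. The exact formulas of Table 2 are not reproduced in this section, so the definitions here are assumptions: in particular, FB is taken as 2(mean C_o − mean C_p)/(mean C_o + mean C_p), so that a positive value indicates under-prediction, and IA follows Willmott's index of agreement. All function names and sample values are hypothetical.

```python
import math


def evaluation_indices(observed, predicted):
    """Return RMSE, NMSE, R, FAC2, FB and IA for paired concentrations.
    Definitions are assumed (commonly used forms), not taken from Table 2."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    sq_err = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    mse = sum(sq_err) / n

    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)

    # Fraction of predictions within a factor of two of the observations.
    within = sum(1 for o, p in zip(observed, predicted)
                 if o > 0 and 0.5 <= p / o <= 2.0)

    # Willmott's index of agreement.
    ia_den = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2
                 for o, p in zip(observed, predicted))

    return {
        "RMSE": math.sqrt(mse),
        "NMSE": mse / (mean_o * mean_p),
        "R": cov / math.sqrt(var_o * var_p),
        "FAC2": within / n,
        "FB": 2.0 * (mean_o - mean_p) / (mean_o + mean_p),
        "IA": 1.0 - sum(sq_err) / ia_den,
    }


def rmse_reduction_percent(rmse_reference, rmse_improved):
    """Relative RMSE reduction of an improved model, in percent."""
    return 100.0 * (rmse_reference - rmse_improved) / rmse_reference


# Hypothetical series: the prediction is exactly half of the observation.
obs = [100.0, 200.0, 300.0, 400.0]
half = [50.0, 100.0, 150.0, 200.0]
indices = evaluation_indices(obs, half)
print({name: round(value, 3) for name, value in indices.items()})

# RMSE values quoted in the text for models 1 and 3 at DCE.
dce_reduction = rmse_reduction_percent(249.61, 148.00)
print(round(dce_reduction, 1))
```

Applied to the DCE values quoted above, the reduction from model 1 to model 3 is about 40.7 percent, which agrees with the roughly 40 percent reported for DCE in the conclusions.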
The performance of these models is also tested for different atmospheric conditions and for rush hours in the evaluation period. The atmospheric conditions are divided into two categories: stable, and unstable to neutral. In stable conditions, the (RMSE, IA) pairs of models 1, 2 and 3 are (233.38, 0.47), (202.10, 0.46) and (159.27, 0.62) at DCE and (154.90, 0.46), (104.65, 0.64) and (98.78, 0.71) at ITO. Similarly, in unstable to neutral conditions, they are (304.34, 0.37), (153.36, 0.52) and (166.74, 0.45) at DCE and (151.97, 0.42), (142.17, 0.49) and (126.67, 0.53) at ITO. Thus, the models generally perform better in stable conditions than in unstable to neutral conditions. In addition, the models are tested for rush hours, selected as the morning office opening hours (08:00 AM to 10:00 AM) and the evening office closing hours (05:00 PM to 07:00 PM). In these hours, the (RMSE, IA) pairs of models 1, 2 and 3 are (205.28, 0.47), (123.13, 0.72) and (107.35, 0.76) at DCE and (149.97, 0.20), (91.13, 0.69) and (83.94, 0.72) at ITO. The statistical downscaled model thus performs better than the analytical and PCA-NN models in rush hours.
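A subset evaluation of this kind can be sketched as below. The sketch assumes that each hourly record is labelled by the hour at which it starts, so the two rush-hour windows map to hours 8, 9, 17 and 18; this labelling convention, the function names and the sample data are assumptions, not details given in the study.

```python
import math

# Assumption: records are labelled by their starting hour, so 08:00-10:00
# covers hours 8 and 9, and 17:00-19:00 covers hours 17 and 18.
RUSH_HOURS = {8, 9, 17, 18}


def rmse(observed, predicted):
    """Root mean square error of paired values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def rmse_for_hours(hours, observed, predicted, selected_hours):
    """RMSE restricted to records whose hour of day is in `selected_hours`."""
    subset = [(o, p) for h, o, p in zip(hours, observed, predicted)
              if h in selected_hours]
    if not subset:
        raise ValueError("no records fall in the selected hours")
    obs, pred = zip(*subset)
    return rmse(obs, pred)


# Hypothetical single day: the model is 10 units too high in rush hours only.
hours = list(range(24))
observed = [100.0] * 24
predicted = [110.0 if h in RUSH_HOURS else 100.0 for h in hours]

rush_rmse = rmse_for_hours(hours, observed, predicted, RUSH_HOURS)
all_rmse = rmse(observed, predicted)
print(round(rush_rmse, 2), round(all_rmse, 2))
```

The same selection function can be reused for the stability classes by passing a per-record class label in place of the hour of day.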

CONCLUSIONS
In this paper, we presented the application of a statistical downscaling approach for forecasting the hourly PM10 concentration for Delhi one day ahead. Statistical downscaling works by statistical post-processing of the analytical model's output data with a neural network. The neural network technique is used to compute the weights in the training period, and the same weights are used for forecasting at the stations. The proposed procedure enables significant improvements over the forecasts of both the deterministic PM model and the "pure" statistical model. Specifically, the average reduction of RMSE for the deterministic forecasts reaches 40 and 35 percent for DCE and ITO, respectively. In addition, the RMSE of the deterministic model is reduced by at least 45% for rush-hour particulate matter concentrations using statistical downscaling. The results also reveal that the statistical downscaled method improves the forecasts by at least 30% for stable and unstable-neutral atmospheric conditions. Future work should address threshold exceedance forecasting and improve the spatial distribution of the aggregated forecasts. The study would have been more general had it been carried out at more locations in Delhi; however, owing to the unavailability of continuous air quality data at other locations, it was carried out at only two. Improvements in the accuracy of the pollutant emission rate and the use of meteorological data assimilation for updating the initial and boundary conditions in the WRF model will be required to improve the predictability of pollutant dispersion in the future. The proposed methodology can be improved further by including an improved air dispersion model, since the present model neglects dry deposition, wet deposition and atmospheric chemistry. The PCA-NN technique is used as statistical

Fig. 2. Scatter plots between hourly observed and model-predicted PM10 concentrations from all three models in (a) the training period and (b) the evaluation period of Dec 2008 at DCE. The middle solid line is the one-to-one line between observed and simulated concentrations, whereas the dotted lines correspond to a factor of two.

Fig. 3. Scatter plots between hourly observed and model-predicted PM2.5 concentrations from all three models in (a) the training period and (b) the evaluation period of Dec 2008 at ITO. The middle solid line is the one-to-one line between observed and simulated concentrations, whereas the dotted lines correspond to a factor of two.

Fig. 4. Q-Q plots between hourly observed and model-predicted concentrations from all three models at (a) DCE for PM10 and (b) ITO for PM2.5 in the evaluation period of Dec 2008. The middle solid line is the one-to-one line between observed and simulated concentrations, whereas the dotted lines correspond to a factor of two.

Table 1. WRF model physics options.

The second layer consists of the neurons of the hidden layer. The third layer is the output layer, which is the observed concentration of PM10 in the present study. First, the weights of the hidden layer are estimated by minimizing the error between the observed PM10 concentration and the PM10 concentration predicted by the NN model in training. These weights are then used unchanged to validate the model. Thus, the PM10 concentration from the analytical model is the input of the statistical downscaled model, while the observed PM10 concentration is the output, or target, of the forecasting model in the training period.
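The layer structure described here can be illustrated with a forward pass through a network with one hidden layer. The sketch assumes 5 inputs (the retained PCs), 4 hidden neurons with a tan-sigmoid (hyperbolic tangent) activation and a linear output, as reported in the study; the weights below are hypothetical placeholders, and the Levenberg-Marquardt training that would estimate them is not shown.

```python
import math


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One-hidden-layer perceptron: tanh hidden units, linear output unit."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Hypothetical weights for 5 inputs and 4 hidden neurons (not fitted values).
hidden_weights = [
    [0.1] * 5,
    [-0.2] * 5,
    [0.05] * 5,
    [0.0] * 5,
]
hidden_biases = [0.0, 0.0, 0.0, 0.0]
output_weights = [1.0, 1.0, 1.0, 1.0]
output_bias = 0.5

# Once the weights are fixed after training, the same call is used unchanged
# for the evaluation period.
prediction = forward([1.0] * 5, hidden_weights, hidden_biases,
                     output_weights, output_bias)
print(round(prediction, 4))
```

In practice the inputs and the target would be scaled before training and the output rescaled to concentration units; that step is omitted here for brevity.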

Table 2. Statistical error analysis of the analytical, PCA-NN and statistical downscaled models at ITO and DCE in the evaluation period.

against observed and predicted PM concentration can be further improved after inclusion of the genetic algorithm and fuzzy logic with neural network.