Ozone in China: Spatial Distribution and Leading Meteorological Factors Controlling O3 in 16 Chinese Cities

Tropospheric ozone (O3) is one of the major air pollutants in China. This paper examined the O3 concentration in 16 important Chinese cities including 7 megacities and developed a statistical model named Generalized Additive Model (GAM) as a function of different factors to estimate the maximum daily 8 h (MDA8) O3 during 2014–2016 and how the leading factors impacts O3. We found that: (1) Three seasonal patterns of O3 have been summarized in the spatial-temporal analysis and summer is the highest season in most of the cities. (2) GAM performs very well that it can capture 43–90% of daily O3 variations. (3) DOY (day of year) and 6 meteorological factors of daily average relative humidity at 1000 mb, daily maximum temperature at 2 m, daily average zonal wind speed at 700 mb, distance of trajectory back 12-hour, surface pressure and geopotential height at 500 mb are sensitive for all 16 cities. The sequence of the leading factors is the same in each group respectively (3 group categories: Beijing, Shijiazhuang and Kunming; Harbin, Hohhot and Dalian; Chengdu and Wuhan). The other 8 cities have different leading factor combination. (4) HYSPLIT back trajectory data can help us to know the importance of transport direction for O3 concentration in Beijing and other three coastal cities Dalian, Shanghai and Guangzhou. (5) During the Beijing “Parade Blue” period in the summer of 2015, NO2 was reduced by 44.6% but O3 was only reduced by 15.7%. Most of these O3 changes can be explained by meteorological variations such as wind direction and air temperature.


INTRODUCTION
Ozone (O 3 ) is an important greenhouse gas contributing to global warming (Johnson et al., 1992;Berntsen et al., 1997;Sitch et al., 2007).Tropospheric O 3 is a secondary air pollutant and it is generated by a series of photochemical reactions with precursors including oxides of nitrogen (NO x ) and volatile organic compounds (VOCs) (Tu et al., 2007) and has a direct adverse impact on ecosystem and human health (Solomon et al., 2000;Silva et al., 2013;Lelieveld et al., 2015).Meteorological factors play an important role in the formation of O 3 such as dispersion, transport and dilution (Seaman, 2000;Tu et al., 2007;Jin et al., 2013;Cheng et al., 2015).Jin et al. (2013) analyzed the role of meteorology on O 3 sensitivities and precursors using the CMAQ model in California's San Joaquin Valley.Jaffe and Zhang (2017) showed how temperature, wind speed and cloud fraction anomaly led to elevated O 3 in the western U.S. in 2015.Lou et al. (2015) used the GEOS-Chem model to investigate the impact of changes in meteorological parameters and anthropogenic activity on O 3 variations and processes in China.
Background tropospheric ozone levels have been continuously rising due to urban extension and industrial development (Vingarzan, 2004).Starting in 1980, emissions began to shift equatorward as China and low-latitude nations became more industrialized (Zhang et al., 2016).In China, the population and personal vehicles have increasing rapidly and this resulted in increasing emission of air pollutants (Chan and Yao, 2008;Zhu et al., 2011).O 3 precursors such as VOCs and NO x have dramatically increased in the past decade (Akimoto, 2003;Vingarzan, 2004;Richter et al., 2005;Monks et al., 2009) and will continue to increase (Ou et al., 2010;Wang et al., 2013;Yang et al., 2015;Wu et al., 2017) for transportation is one of the major anthropogenic emissions of NO x (Zhang et al., 2009;Liu et al., 2015Liu et al., , 2016a;;Li et al., 2017;Saikawa et al., 2017).Although ozone is a key air pollutant, it was not regularly reported by environmental agencies until 2012 when the new Chinese O 3 standard, 8-hour maximum values of 160 µg m -3 (81 ppbv at 1 atmosphere and 25°C), went into effect (http://www.mep.gov.cn).
In a recent evaluation of Chinese O 3 data, Wang et al. (2017b) found that Beijing, Chengdu, Guangzhou, and Shanghai had O 3 that exceeded the WHO standard of 100 µg m -3 on more than 30% of the days in 2013-2015.Wang et al. (2017a) examined O 3 and its precursors on China's major urban centers Jing-jin-ji, Yangtze River Delta and Pearl River Delta and pointed out serious concerns with high levels of O 3 in these regions and reviewed key mechanisms, the role of regional transport and impacts on human health and crops.Shan et al. (2009) combined HYSPLIT back-trajectories and surface meteorological data to identify the meteorological conditions associated with an ozone episode (> 100 ppb) and non-episode days.Additionally, Chen et al. (2017) showed that health effect from O 3 was a widespread problem in China, even at concentrations below the current Chinese air quality standard.
Although many papers have discussed spatial-temporal distribution of O 3 and the interaction between O 3 and meteorological factors in China, there are still some problems.First, most of the spatial-temporal distributions of O 3 focused on the developed urban areas such as Jing-jin-ji, Yangtze River Delta and Pearl River Delta and the seasonal variation patterns were not summarized (Ding et al., 2013;Wang et al., 2017a).Second, most of them qualitatively described the relationships between O 3 and meteorological factors but the leading meteorological factors were not identified in different Chinese cities (Tu et al., 2007;Shan et al., 2009).
In this paper, we explored the spatial and temporal distributions of O 3 and used GAM to examine the important impact factors that affect O 3 concentration.GAM is a flexible regression model that can incorporate linear or non-linear relationships with numerical and categorical variables (Hastie and Tibshirani, 1990;Wood, 2006).The GAM is an effective tool to handle the complex nonlinearities of air pollutants (Carslaw et al., 2007;Belušić et al., 2015).Many previous studies have successfully used GAM to explore the relationship between air pollutants (such as PM 2.5 , O 3 , SO 2 and CO) and meteorological factors (Aldrin and Haff, 2005;Pearce et al., 2011;Belušić et al., 2015;Cai et al., 2016;Gong et al., 2017).So our goals for this study are: 1. Describe the spatial and temporal patterns for MDA8 O 3 in 16 important Chinese cities. 2. Use the GAM approach to identify the leading impact factors that control daily O 3 variation in each city and compare our result with the result of Random Forest Model (Zhan et al., 2018).

Examine impacts from short-term reductions in O 3
precursor emissions during the "Parade Blue" period in Beijing in August-September 2015.

Study Area
We examined O 3 for January to December, 2014-2016, for 16 Chinese cities spread across the mainland of China.
The 16 cities are characterized by diversified natural or human environments and they are located in China's seven main areas-North China, South China, East China, Central China, Southwest China, Northwest China and Northeast China.There are 7 megacities included in this study, with a population more than 5 million in the urban district.The specific information about the 16 cities can be found in Fig. 1 and Table S1.

Variables and Data Sources
The historic daily maximum 8 h average (MDA8) O 3 values are from the Chinese air quality on-line monitoring platform (https://www.aqistudy.cn/)which is a public platform that collects the air pollution data from MEPA of China and it started in 2014.The values of MDA8 O 3 for each city are the mean values of all the monitoring values in that city.The hourly average O 3 values were collected from the website (http://www.pm25.in/)and we used self-compiled programs to collect the real time data every hour for the 2 years (2015)(2016).The hourly values were used only to examine diurnal variations.The meteorological values were from the NCEP/NCAR reanalysis dataset (https://www.esrl.noaa.gov/psd/data/timeseries/daily/).The Hybrid Single Particle Lagrangian Integrated Trajectory Model (HYSPLIT) was run for each day to calculate 12-hour backward trajectories from each city.Each trajectory was used to calculate the 12-hour direct backward distance and transport quadrant, relative to the starting point.A list of the cities and the meteorological variables used for each model are given in the Supplemental Information, Tables S1 and S2.

GAM Method and Model Development
In this paper, we modeled the MDA8 O 3 of each city separately using GAM in R software with the "mgcv package".The equation is as follows: where i indicates the ith day's meteorological observation.f j (x) are smooth functions of the meteorological data.The element g(μ i ) is the "link" function, which specifies the relationship between the linear formulation on the right side of Eq. (1) and the response μ i .x i θ represents a categorical relationship for predictors which are not subject to non-linear transformations.Non-linear functions f j (x) are used to represent the complex relationship between ozone and meteorological parameters.ε i is the residual (Thompson et al., 2001;Wood, 2006).The link function can use either a log link with Gammar distributions or identity link with Gaussian distributions and both have of them been applied to the ozone data in previous studies.For example, Camalier applied the Generalized Linear Model to explore meteorological impacts on ozone using log functions and Gaussian distributions (Camalier et al., 2007) while others used log link (Aldrin and Haff, 2005;Pearce et al., 2011).
In this study, we used Gamma distribution and log link function out of mathematical rationality because the frequency distributions of ozone are skewed in most of the cities (see examples in Fig. S1).Penalized Cubic Regression Splines (CRS) were used for the smoothing functions f j (x) to allow a non-linear response between ozone and each meteorological parameter.The smoothness of each function f in the equation is controlled by the number of knots or the effective number of degrees of freedom (e.d.f).The model fits the observed ozone concentration better as the number of knots increases but the model becomes less smooth and effectively "overfit" the data.The default method of crossvalidation was used to choose the degree of smoothing (penalization) of k = 10 dimensional regression splines in the full 3-year and summer analyses and k = 3 for "Parade Blue" analysis.We use the GAM approach to fit observed O 3 MDA8 in these 16 cities for the full three-year period and for summer only (June 1-August 31) period.For Beijing, we also ran the model for a "Parade Blue" analysis (August 5-September 18, 2015) for three years (2014)(2015)(2016) to compare variations between years.
We can identify important predictors using F statistic in GAM (Wood, 2006;Camalier et al., 2007).Camalier et al. (2007) applied this method to find the dominant meteorological factors for 39 cities in the USA.F statistic is an effective tool to judge which parameter is the most important and it includes the comprehensive consideration of degrees of freedom (e.d.f.) and P value for each parameter.The larger the F statistic, the more important is the parameter.F statistics value will be used in the leading meteorological factors section.Partial response curve from the GAM can explain the non-linear/linear relationship between O 3 and individual factor and it is a good way to look into each variable's effect and these will be explained in effects of individual factors on O 3 part.In order to see the difference of wind directions between coastal and inland cities, we selected 5 cities of Beijing, Wuhan, Dalian, Shanghai and Guangzhou to build the GAM by using summer only (June 1-August 31) data for 2014-2016 and this will discuss in the importance of wind direction part.

Model Validation and Quality Control
There are several methods to examine the underlying assumptions of homogeneity, normality and independence (Wood, 2006;Zuur et al., 2009).To examine the quality of the model, we look at several factors including the residuals (observed values minus fit values).In this study, we used the following methods to examine the model quality: 1. Examine the autocorrelation of both the original O 3 data and residuals (Fig. S2). 2. Plot the fit ozone against observed ozone and model residuals (Figs. 4 and S3). 3. Use the gam.check code in R software to run QQ plots (sample quantiles against theoretical quantiles), scatterplots (residuals against linear predictor), histogram of the residuals and scatter plots (response against fitted values) (Fig. S4).Fig. S2 shows the autocorrelation in the original MDA8 data and the model residuals for Beijing.The original MDA8 data has a significant autocorrelation out to at least 15 days but the autocorrelation in the residuals drops to a small value within a few days.This reflects the fact that the GAM can correct the seasonal correlative series information and remove the strong seasonality in the original MDA8 data.

Spatial-Temporal Distribution of O 3 in 16 Chinese Cities Overview
Table 1 shows the basic statistical parameters for MDA8 O 3 in 2014-2016 for all 16 cities.Beijing (294 µg m -3 ), Chengdu (293 µg m -3 ) and Guangzhou (275 µg m -3 ) had the highest values for maximum MDA8 O 3 .Lhasa (103.77µg m -3 ), Shanghai (102.54 µg m -3 ) and Qingdao (99.38 µg m -3 ) had the highest annual mean MDA8 O 3 of the 16 cities.All of the cities exceeded the Chinese Ambient Air Quality Standards (CAAQS) (GB3095-2012) Grade II standards of 160 µg m -3 for MDA8 from 2014 to 2016 on at least one day.Beijing, Chengdu and Hangzhou had the greatest number of nonattainment days (219, 132 and 132, respectively) during the three years.

Seasonal Variations of MDA8 O 3
Most of the cities displayed clear seasonal variations.We calculated the mean value for each season during 2014-2016 in the 16 cities.This analysis identified three seasonal patterns for the MDA8 O 3 distribution.The first pattern, seen in 3/4 cities, shows the low values of ozone in the winter and the highest values of ozone in the warmer periods especially summer (first 12 cities listed in Table 2).The second pattern shows a maximum mean value in the spring, as seen in the higher elevation cities of Kunming (1930 m), Lhasa (3660 m) but also Qingdao.The spring peak likely reflects a dominance from global background O 3 , which typically peaks in spring at remote sites in the Northern Hemisphere (Zhang and Jaffe, 2017).The third pattern shows a more uniform distribution throughout the year, as seen in Xiamen (Examples are shown in Fig. S5).

Diurnal Variations of O 3
Fig. 2(a) shows the diurnal variations of average hourly O 3 concentrations in the summer and the whole year for Beijing, 2015-2016.Hourly data was not available for 2014.The summer and annual mean values show the same pattern: highest concentration in the afternoon and lowest concentration in the late night and early morning (Shan et al., 2008;Tang et al., 2012).The yearly mean diurnal curve shows that O 3 increased in the morning around 8:00, reached the maximum around 16:00 and then decreased in the evening.

Spatial Distribution of O 3
Fig. 3 shows the spatial variation of average MDA8 O 3 for all of the 16 cities for 2014-2016.Shanghai and Lhasa have the highest annual average O 3 with average MDA8 concentrations above 100 µg m -3 .High O 3 in Shanghai is a result of large industrial emissions, and high O 3 in Lhasa reflects higher free troposphere O 3 due to a combination of local photochemical production and its greater exposure to the free troposphere due to its high elevation (Newchurch, 2003;Bian et al., 2012;Ran et al., 2014).All of the cities with annual mean MDA8 O 3 concentrations above 90 µg m -3 are distributed between 29.65N-39.91Nand all the megacities, including Beijing, Chengdu, Wuhan, Hangzhou and Shanghai, show many high O 3 days.Urumchi, Harbin, Kunming and Xiamen have the lowest yearly mean MDA8 O 3 concentrations and the fewest non-attainment days.

Summary of GAM Result for Each City
The GAM was applied to the MDA8 O 3 values for each city using the meteorological data shown in Table S2.The fitting power of GAM is measured by the adjusted R 2 .Table 3 shows the adjusted R 2 and standard deviation of the residuals for the 16 cities.The GAM model captured most of the variations in MDA8, ranging from 0.43 (Xiamen) to 0.90 (Urumchi).The residuals are normally distributed, and the mean residuals of all cities are near 0. The standard deviations of the daily residuals are between 10.9 (Urumchi) and 33.02 (Guangzhou).Fig. 4 shows observed MDA8 versus model fit MDA8 for Beijing using the full 3 years data.Table 4 shows the adjusted R 2 and standard deviation of the residuals for the 5 cities for summer only period (June 1-August 31).Comparing the Table 3, most of the adjusted R 2 in Table 4 drop slightly except Dalian and the standard deviation of the residuals are higher.

The Leading Meteorological Factors and the Effects on Ozone The Leading Meteorological Factors
Table 5 displays the F statistic value for the 3 most important variables for each city.For example, DOY is the most important factor in Urumchi and Lhasa because the F statistic value is as high as 116 and 111 respectively, and these values are much higher than the second factors in these cities.In fact, DOY is a comprehensive factor which includes some meteorological information such as temperature and humidity.DOY and 6 meteorological factors of daily average relative humidity at 1000 mb, daily maximum temperature at 2 m, daily average zonal wind speed at 700 mb, distance of trajectory back 12-hour, surface pressure and geopotential height at 500 mb are sensitive for all 16 cities.The sequence of the leading factors is the same in each group respectively (3 group categories: Beijing, Shijiazhuang and Kunming; Harbin, Hohhot and Dalian; Chengdu and Wuhan).DOY, daily average relative humidity at 1000 mb and geopotential height at 500 mb are top 3 leading factors for Beijing, Shijiazhuang and Kunming.DOY, daily maximum temperature at 2 m and distance of trajectory back 12-hour are top 3 leading   We have developed another GAM model for Beijing that considers the 2015 Parade Blue period (August 20 th -September 3 rd ) plus 15 days on either side of this period (August 5 th -September 18 th ).The result is also in Table 5 and shows daily maximum temperature at 2 m is the top 1 leading factors in the period.We will discuss more in detail below.

Effects of Individual Factors on O 3
Each of the factors included in the GAM shows some degree of predictive power on the daily variations in the MDA8.Some effects occur for obvious reasons while others are more subtle and may not indicate a direct causation.For example, it is well known that O 3 increases with temperature due to photochemical reactions.Temperature can also be a surrogate for other factors such as high pressure, stagnation and low wind speeds.O 3 generally decreases with wind speed due to the dilution effect.However, there may be exceptions and most of the meteorological effects are non-linear.Fig. 5 shows how DOY, distance of trajectory back 12-hour, daily average relative humidity at 1000 mb and daily maximum temperature at 2 m impact the ozone in Urumchi, Guangzhou, Xiamen and Beijing "Parade Blue" period respectively.The X axes indicate the individual factors, and the Y axes indicate the f(x).
In Table 5, we see that DOY is a very important variable and 13 out of 16 cities have identified it as the most important predictor for the annual data.Fig. 5(a) shows the partial response of ozone to the DOY factor in Urumchi.We see that the smooth function f fits the DOY very well with a smooth curve and the 95% confidence bounds for the response is very small.This reflects the general seasonality of O 3 (either spring or summer peak), as discussed previously.For Urumchi, the f(DOY) peaks in summer (DOY:152-243).S2. the X axes and f(x) are clearly in an inversely proportional relationship and indicate that f(trajectory distance) and f(relative humidity) decrease with trajectory distance and relative humidity increase.This also shows that O 3 decreases with increasing trajectory distance and relative humidity.This result is similar to results shown in previous publications (Camalier et al., 2007).The degrees of freedom are 2.1 and 2.24 separately and they are nearly close to a linear function.Therefore, trajectory distance back 12-hour and relative humidity at 1000 mb is the top predictor for Guangzhou and Xiamen, respectively.Fig. 5(d) shows f(x) for the maximum temperature at 2 m in Beijing for the "Parade Blue" analysis.We see that f(x) increase with temperatures above 290 K (17°C).This finding is in agreement with previous studies by Pearce et al. (2011) andCamalier et al. (2007) who found O 3 will increase with daily maximum temperature above 17°C and 22°C respectively.However, above 305 K, there is a tendency towards a slightly lower response function.The influence of temperature on O 3 is statistically robust, as indicated by the fact that the 95% confidence interval for f(x) does not overlap with 0.

The Importance of Wind Direction
Back trajectories show where air mass came from.We use back trajectories to identify the 12-hour endpoints by quadrant (NE, SE, SW and NW).Table 6 shows MDA8 O 3 and GAM residuals for continental and marine cities as a function of 12-hour back-trajectory quadrant.Shanghai, Dalian and Guangzhou are marine cities. Shanghai is located in the Yangtze River Delta in the east of China and it is the largest metropolitan area in this area.We see that MDA8 O 3 levels differ by about 31-56 µg m -3 from the westerly versus easterly directions in Shanghai.This result is reasonable because the air mass coming from the west bring substantially more O 3 to Shanghai due to domestic emissions.The same phenomenon can be also seen for Dalian and Guangzhou.We see that observed ozone is very low from the southeast in Dalian, which is from the marine areas.Guangzhou, located in southern China, has the highest MDA8 O 3 from the north and lower values from the south For Beijing, the southerly direction has much higher mean MDA8 O 3 compared to northerly directions because there are large industrial areas to the south of Beijing and the research shows that O 3 precursor NO 2 pollution in Beijing is mainly affected by the regional transport from the southern surrounding cities around Beijing such as Baoding, Langfang and Tianjin (Liu et al., 2016b).Wuhan is a continental city in center China, and there is no obvious difference in MDA8 O 3 from any directions.Note that because the GAM used trajectory quadrant as a predictive variable, so we would expect the residuals for each quadrant to be very small.

O 3 Variations in Beijing "Parade Blue" in 2015
September 3, 2015 was China's 70 th anniversary of victory in World War II (V-Day).For the anniversary celebration, China adopted strict controls on emissions in Beijing and surrounding areas for the period of August 20 to September 3, 2015.In order to improve air quality in Beijing during the anniversary parade, the number of motor vehicles on the streets was reduced by half.And 10,000 companies were ordered to limit production or close construction activities (Li et al., 2016).Li et al. (2016) discussed the observed SO 2 , NO x , and PM data and found that concentrations were reduced by 61-77%, 48-57% and 64% for SO 2 , NO x and PM 1 respectively, during the V-Day parade.For NO 2 , concentrations were reduced by an average of 44.6% compared to the same time period the year before and the year afterwards.Since the emission reductions decreased PM concentrations, many studies call the V-Day parade "Parade Blue" in Beijing (Xu et al., 2017).
We used the GAM results to help understand how O 3 was impacted during the "Parade Blue" time period.Table 7 shows average NO 2 , MDA8 O 3 , GAM fit MDA8 O 3 and the GAM residual for a 45 day period each year in 2014-2016, centered on the 15-day "Parade Blue" time period.The data clearly show that emissions were reduced only for the period of August 20-September 3, 2015.Fig. 6 shows a time series of the daily values for this period in 2015, when emissions were reduced.During the 2015 "Parade Blue" period, the model captured the variations very well, with a relatively small standard deviation of the daily observations (21.5 µg m -3 ).Table 7 confirms that NO 2 decreased by 44.6%, similar to the decrease that was seen during the Olympic Games in 2008 (Chou et al., 2011).In contrast to the substantial decrease of the primary air pollutants, the MDA8 O 3 value only decreased 15.7% compared with same time period in 2014 and 2016.
Table 7 also shows the average 2-meter daily maximum temperature and number of trajectories from the more polluted SE and SW directions.The 2015 "Parade Blue" period was marked by much lower MDA8 O 3 values, especially compared to 2014, and these differences between years were very well fit by the GAM.The data also show that the 2015 period was much cooler, with an average 2-meter maximum temperature that was 4.5 K below the 2014 and 2016 average for the same time period.The 2015 "Parade Blue" period also had very few trajectories from the more polluted regions south of Beijing, compared to the same time period in the 2014 or 2016.Given that the residuals are relatively small for all of three time periods (-5.7, +1.8, -3.9 µg m -3 ) the GAM demonstrates that the largest contribution to the O 3 reduction in 2015 was due to meteorological variations.Since the GAM incorporates meteorological factors, but not emission changes, systematic changes in the residuals would suggest a change in chemical conditions or some other non-meteorological factors that controls O 3 .So even with a 44.6% reduction in NO 2 concentrations, O 3 was largely unchanged, except for the meteorological variations.This suggests that O 3 maxima in Beijing is largely controlled by production outside of the controlled region and/or that the Beijing area is NO x rich so that O 3 production is more controlled by VOCs.This result is consistent with other studies that have identified VOCs as an important control on O 3 , particularly in the city center for major Chinese cities (Geng et al., 2008;Wang et al., 2009;Li et al., 2013;Liu et al., 2013;You et al., 2017).Liu et al. (2016b) also examined the "Parade Blue" period using observations and back-trajectories.They found that transport direction was an important control on O 3 and that transport from the regions to the south brought higher O 3 to Beijing.Our results are comparable, but in particular our model allows us to directly quantify these effects.For example in Table 6, we show that transport from the SW direction is associated with an average enhancement in the MDA8 of more than 50 µg m -3 .Table 7 shows that for the 2015 "Parade Blue" period, temperatures were significantly  cooler and transport was dominated by flow from the cleaner northerly regions.So we find that most of the reduction in O 3 seen during the 2015 "Parade Blue" period is due to changes in meteorology, not emissions, and the GAM approach can be used to quantitatively assess this influence.

DISCUSSION AND CONCLUSIONS
In this paper, we examined the spatial-temporal distribution of O 3 in 16 Chinese cities including 7 megacities.Beijing, Chengdu and Hangzhou had the highest number of days that exceeded the Chinese MDA8 standard of 160 µg m -3 during 2014-2016.We summarized 3 seasonal patterns of O 3 for 16 cities.O 3 is highest in summer for most of the cities.Only Qingdao, Kunming and Lhasa have a spring maximum in seasonal mean MDA8.Xiamen has a more uniform distribution throughout the year.The diurnal variation of O 3 shows highest O 3 concentration is in the afternoon and lowest values in the late night and early morning.
GAM is an effective tool to explain the relationship between MDA8 O 3 and 9 impact factors (DOY, DOW and 7 meteorological factors).The model can capture between 43%-90% of the variance in daily MDA8 values, with a value greater than 50% for all but one city (Xiamen).Thus it follows that the variations in meteorology are important variables to explain the daily variations of O 3 .We use partial response curve to show how DOY, distance of trajectory back 12-hour, daily average relative humidity at 1000 mb and daily maximum temperature at 2 m impact the ozone in Urumchi, Guangzhou, Xiamen and Beijing "Parade Blue" period respectively.Transport pathway calculated by HYSPLIT model is a key factor for Beijing and coastal cities of Dalian, Shanghai and Guangzhou because the background concentration of O 3 traveling from different directions can affect the local O 3 concentration.For Beijing, the MDA8 O 3 concentration from southerly directions has more than 50 µg m -3 higher values compared to the north because there are large industrial areas to the south of Beijing and O 3 precursor NO 2 pollution in Beijing is mainly affected by the southern regional transport.The westerly air mass brings much higher MDA8 O 3 of 31-56 µg m -3 than easterly directions in Shanghai due to substantially westerly domestic emissions.The same phenomenon can be also seen for coastal cities Dalian and Guangzhou.
In comparing our results to Zhan et al.'s paper (Zhan et al., 2018), we see some significant similarities.Both analyses applied a statistical method to examine daily O 3 variations.We applied the GAM, whereas Zhan et al. (2018) developed a Random Forest Model.Zhan et al.'s model used 33 variables, many of which are not readily obtained from standard meteorological analysis (See Zhan SI).But the GAM approach described here is more intuitive and simpler, requiring only 9 variables that are easily obtained from website or HYSPLIT model.Results from the two models were similar, with the R 2 values from the GAM approach of 0.43-0.90(depending on city), while Zhan et al. (2018) obtained an R 2 for all sites of 0.69.The GAM model identified DOY, relative humidity at 1000 mb, daily maximum temperature at 2 m are one of the leading factors in all the 16 cities, whereas Zhan et al. (2018) identified evaporation rate, temperature, relative humidity and DOY are the top 4 important predictor.Thus both approaches identified these variables as key predictor for O 3 .
Finally, for the Beijing "Parade Blue" case study, we found that atmospheric concentration of NO 2 reduced by 44.6%, but the MDA8 O 3 only reduced by 15.7%.The results shows that O 3 maxima in Beijing is largely controlled by production outside of the parade controlled region, and/or that the Beijing area is NO x rich so that O 3 production is more controlled by VOCs and this consistent with previous studies that VOCs as an important control on O 3 , particularly in the city center for major Chinese cities.The GAM residuals suggest that most of O 3 reduction was caused by meteorological factors such as cooler temperatures and transport from the north.All the results are consistent with previous studies that demonstrate the complexity of O 3 control in an urban area (Geng et al., 2008;Wang et al., 2009;Li et al., 2013;Liu et al., 2013;Liu et al., 2016b;You et al., 2017).

Fig. 1 .
Fig. 1.The 16 Chinese cities considered in this study.
Fig. 2(b) shows the diurnal cycle in O 3 for Lhasa.Although the seasonal patterns are somewhat different, both Beijing and Lhasa show a similar diurnal pattern: highest concentrations in the afternoon and lowest values in the late night or early morning.These diurnal variations are controlled both by local photochemical production and also changes in boundary layer mixing.

Fig. 2 .
Fig. 2. Diurnal variations of O 3 concentrations in summer and whole year for 2015-2016 for (a) Beijing and (b) Lhasa.
factors for Harbin, Hohhot and Dalian.DOY, daily average relative humidity at 1000 mb and daily maximum temperature at 2 m are top 3 leading factors for Chengdu and Wuhan.The other 8 cities have different leading factor combination.Daily average relative humidity at 1000 mb, DOY and distance of trajectory back 12-hour are top 3 leading factors for Hangzhou.DOY, distance of trajectory back 12-hour and daily maximum temperature at 2 m are top 3 leading factors for Shanghai.DOY, distance of trajectory back 12-hour and daily average relative humidity at 1000 mb are top 3 leading factors for Lanzhou.Distance of trajectory back 12-hour, daily average relative humidity at 1000 mb and DOY are top 3 leading factors for Guangzhou.DOY, daily maximum temperature at 2 m and geopotential height at 500 mb are top 3 leading factors for Qingdao.Daily average relative humidity at 1000 mb, DOY and Zonal wind at 700 mb are top 3 leading factors for Xiamen.
Fig. 5(b) shows f(x) for the variable "trajectory distance back 12-hour" in Guangzhou.Fig. 5(c) is the partial plot of relative humidity in Xiamen.Figs.5(b) and 5(c) show that Table 5.The leading meteorological factors for all cities using the full-year model (2014-2016), except for Beijing-Parade Blue analysis, which used data for August 5Blue Period (August 20-September 3, 2014-2016) is the model using the data from August 5-September 18 for all three years.The description of each parameter is shown in Table

Fig. 5 .
Fig. 5. Partial response of ozone to individual parameters -(a) Day of year for Urumchi, (b) 12-hour backward trajectory distance (km) for Guangzhou and, (c) Relative humidity (%) at 1000 mb in Xiamen; (d) Maximum temperature (k) at 2 m in Beijing during "Parade Blue" analysis.Spline smoothing function f(x) is on the vertical axis, with labels including the degrees of freedom for the determined nonlinear smoothing.Dashed lines are 95% confidence bounds for the response and the lines on the X axes show the distribution of data points.

Fig. 6 .
Fig. 6.Time-series plot showing observed O 3 , fit O 3 , residuals and maximum temperature at 2 m for the period of August 5-September 18, 2015.
F statistic values have been calculated from the model to identify top 3 leading factors in each city.DOY and 6 meteorological factors of daily average relative humidity at 1000 mb, daily maximum temperature at 2 m, daily average zonal wind speed at 700 mb, distance of trajectory back 12-hour, surface pressure and geopotential height at 500 mb are sensitive for all 16 cities.The sequence of the leading factors is the same in 3 group cities respectively.DOY, daily average relative humidity at 1000 mb and geopotential height at 500 mb are top 3 leading factors for Beijing, Shijiazhuang and Kunming.DOY, daily maximum temperature at 2 m and distance of trajectory back 12-hour are top 3 leading factors for Harbin, Hohhot and Dalian.DOY, daily average relative humidity at 1000 mb and daily maximum temperature at 2 m are top 3 leading factors for Chengdu and Wuhan.The other 8 cities have different leading factor combination.DOY, distance of trajectory back 12-hour and daily maximum temperature at 2 m are top 3 leading factors for Shanghai.Distance of trajectory back 12-hour, daily average relative humidity at 1000 mb and DOY are top 3 leading factors for Guangzhou.Partial response curve from the GAM can explain the nonlinear/linear relationship between O 3 and individual factor.

Table 3 .
Adjusted R 2 values using the whole year data from GAM, 2014-2016.

Table 4 .
Adjusted R 2 values for 5 cities using the summer only data from the GAM, 2014-2016.

Table 7 .
Mean values of NO 2 , MDA8 O 3 , the GAM fit MDA8 O 3 , residuals in µg m -3 , average 2 meter daily maximum temperature and number of trajectories from SE or SW quadrants for August 5-September 18, 2014-2016.